Re: [DOCS] synchronize_seqscans' description is a bit misleading
From: Tom Lane
Subject: Re: [DOCS] synchronize_seqscans' description is a bit misleading
Msg-id: 18280.1365649805@sss.pgh.pa.us
In response to: synchronize_seqscans' description is a bit misleading (Gurjeet Singh <gurjeet@singh.im>)
Responses: Re: [DOCS] synchronize_seqscans' description is a bit misleading
List: pgsql-hackers
Gurjeet Singh <gurjeet@singh.im> writes:
> If I'm reading the code right [1], this GUC does not actually *synchronize*
> the scans, but instead just makes sure that a new scan starts from a block
> that was reported by some other backend performing a scan on the same
> relation.

Well, that's the only *direct* effect, but ...

> Since the backends scanning the relation may be processing the relation at
> different speeds, even though each one took the hint when starting the
> scan, they may end up being out of sync with each other.

The point you're missing is that the synchronization is self-enforcing:
whichever backend gets ahead of the others will be the one forced to request
(and wait for) the next physical I/O.  This will naturally slow down the
lower-CPU-cost-per-page scans, and the other ones tend to catch up during
the I/O operation.

The feature is not terribly useful unless I/O costs are high compared to the
CPU cost per page.  But when that is true, it's actually rather robust.
Backends don't have to have exactly the same per-page processing cost,
because pages stay in shared buffers for a while after the current scan
leader reads them.

> Imagining that all scans on a table are always synchronized may make some
> wrongly believe that adding more backends scanning the same table will not
> incur any extra I/O; that is, only one stream of blocks will be read from
> disk no matter how many backends you add to the mix.  I noticed this when I
> was creating partition tables, and each of those was a CREATE TABLE AS
> SELECT FROM original_table (to avoid WAL generation), and running more than
> 3 such transactions caused the disk read throughput to behave unpredictably,
> sometimes even dipping below 1 MB/s for a few seconds at a stretch.

It's not really the scans that are causing that to be unpredictable; it's
the write I/O from the output side, which is forcing highly nonsequential
behavior (or at least I suspect so ... how many disk units were involved in
this test?)
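The self-enforcing dynamic described above can be illustrated with a toy event-driven simulation. This is NOT PostgreSQL's actual implementation, and all costs here (IO_COST, CPU_COSTS, page counts) are hypothetical; the sketch only assumes the mechanism Tom describes: the backend whose next page is not yet buffered pays the full I/O wait, while the others read already-buffered pages at CPU cost alone.

```python
# Toy model of self-enforcing scan synchronization (hypothetical costs,
# not PostgreSQL internals): the scan "leader" pays the I/O wait; during
# that wait, the slower backend processes pages already in shared buffers.
NUM_PAGES = 1000
IO_COST = 10.0            # simulated time to read one page from disk
CPU_COSTS = [1.0, 2.0]    # per-page CPU cost; backend 1 is twice as slow

positions = [0, 0]        # next page each backend will scan
clocks = [0.0, 0.0]       # simulated elapsed time per backend
cached = -1               # highest page already pulled into shared buffers
max_gap = 0               # widest divergence between the two scan positions

while min(positions) < NUM_PAGES:
    # advance whichever unfinished backend is earliest in simulated time
    b = min((i for i in range(2) if positions[i] < NUM_PAGES),
            key=lambda i: clocks[i])
    cost = CPU_COSTS[b]
    if positions[b] > cached:      # this backend is currently the leader:
        cost += IO_COST            # it must wait for the physical read
        cached = positions[b]
    clocks[b] += cost
    positions[b] += 1
    max_gap = max(max_gap, abs(positions[0] - positions[1]))

print("max divergence in pages:", max_gap)
print("elapsed times:", clocks)
```

Even though one backend's per-page CPU cost is double the other's, the scan positions never drift more than a page or two apart: whenever the cheaper backend pulls ahead, it becomes the one stalled on I/O, and the slower one catches up out of shared buffers. The positions diverge badly only once I/O cost stops dominating CPU cost, matching the caveat above.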
regards, tom lane