Re: old synchronized scan patch
From | Jeff Davis |
---|---|
Subject | Re: old synchronized scan patch |
Date | |
Msg-id | 1165339167.4302.67.camel@dogma.v10.wvs |
In reply to | Re: old synchronized scan patch (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Tue, 2006-12-05 at 10:49 -0500, Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
> > Hannu Krosing wrote:
> >> The worst that can happen, is a hash collision, in which case you lose
> >> the benefits of sync scans, but you wont degrade compared to non-sync
> >> scans
>
> > But it could cause "mysterious" performance regressions, no?
>
> There are other issues for the "no lock" approach that Jeff proposes.
> Suppose that we have three or four processes that are actually doing
> synchronized scans of the same table. They will have current block
> numbers that are similar but probably not identical. They will all be
> scribbling on the same hashtable location. So if another process comes
> along to join the "pack", it might get the highest active block number,
> or the lowest, or something in between. Even discounting the possibility
> that it gets random bits because it managed to read the value
> non-atomically, how well is that really going to work?
>

That's an empirical question. I expect it will work, since any active
scan will have a significant cache trail behind it.

> Another thing that we have to consider is that the actual block read
> requests will likely be distributed among the "pack leaders", rather
> than all being issued by one process. AFAIK this will destroy the
> kernel's ability to do read-ahead, because it will fail to recognize
> that sequential reads are being issued --- any single process is *not*
> reading sequentially, and I think that read-ahead scheduling is
> generally driven off single-process behavior rather than the emergent
> behavior of the whole system. (Feel free to contradict me if you've
> actually read any kernel code that does this.) It might still be better
> than unsynchronized reads, but it'd be leaving a lot on the table.
>

That's a very interesting point. I had assumed read-ahead was at the
kernel level without really caring what processes issued the requests,
but it may be system-dependent. I think that's what the elevator (or I/O
scheduler, or whatever it's called) is supposed to do. I'll see if I can
find some relevant source code in a Linux or FreeBSD box. The controller
certainly wouldn't care about process IDs, however.

Regards,
	Jeff Davis
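
For readers following the "no lock" discussion above, here is a minimal sketch of what a lock-free scan-position hint table could look like. It is not the code from the patch under discussion. Every name in it (`hint_table`, `report_scan_position`, `choose_start_block`, the table size) is hypothetical, and packing the relation id and block number into one 64-bit atomic word is an assumption made here so that a reader can never observe a torn value and can detect a hash collision.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sizing: 2^6 = 64 slots. Not taken from the patch. */
#define HINT_BITS  6
#define HINT_SLOTS (1u << HINT_BITS)

/*
 * Each slot packs (relation id << 32) | block number into one atomic word.
 * Static storage is zero-initialized, so an untouched slot holds relation
 * id 0, which this sketch assumes is never a valid relation id.
 */
static _Atomic uint64_t hint_table[HINT_SLOTS];

/* Multiplicative hash; keep the top HINT_BITS bits of the 32-bit product. */
static unsigned slot_for(uint32_t relid)
{
    uint32_t h = (uint32_t) (relid * UINT32_C(2654435761));
    return h >> (32 - HINT_BITS);
}

/* Called by a running scan: overwrite the slot with its current block. */
void report_scan_position(uint32_t relid, uint32_t block)
{
    uint64_t packed = ((uint64_t) relid << 32) | block;
    atomic_store_explicit(&hint_table[slot_for(relid)], packed,
                          memory_order_relaxed);
}

/*
 * Called by a scan that is about to start: return the last reported block
 * for this relation, or 0 if the slot belongs to another relation (hash
 * collision), was never written, or holds a block past the end of the table.
 */
uint32_t choose_start_block(uint32_t relid, uint32_t nblocks)
{
    assert(nblocks > 0);

    uint64_t packed = atomic_load_explicit(&hint_table[slot_for(relid)],
                                           memory_order_relaxed);
    if ((uint32_t) (packed >> 32) != relid)
        return 0;               /* collision or empty slot: plain scan */

    uint32_t block = (uint32_t) packed;
    return block < nblocks ? block : 0;
}
```

With several scans in a pack all calling `report_scan_position`, the last writer wins, so a newcomer may start at the highest active block, the lowest, or one in between, which is exactly the behavior Tom describes. Whether that is good enough depends on how long the cache trail behind the pack is, and that remains an empirical question.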