Re: old synchronized scan patch
From | Jeff Davis |
---|---|
Subject | Re: old synchronized scan patch |
Date | |
Msg-id | 1165444056.2048.45.camel@dogma.v10.wvs |
In reply to | Re: old synchronized scan patch ("Jim C. Nasby" <jim@nasby.net>) |
List | pgsql-hackers |
On Wed, 2006-12-06 at 12:48 -0600, Jim C. Nasby wrote:
> On Tue, Dec 05, 2006 at 09:09:39AM -0800, Jeff Davis wrote:
> > That being said, I can just lock the hint table (the shared memory hint
> > table, not the relation) and only update the hint every K pages, as Neil
> > Conway suggested when I first proposed it. If we find a K small enough
> > that the feature is useful, but large enough that we're sure there won't
> > be contention, this might be a good option. However, I don't know that
> > we would eliminate the contention, because if K is a constant (rather
> > than random), the backends would still all want to update that shared
> > memory table at the same time.
>
> What about some algorithm where only one backend will update the hint
> entry (perhaps the first one, or the slowest one (ie: lowest page
> number))? ISTM that would eliminate a lot of contention, and if you get
> clever with the locking scheme you could probably allow other backends
> to do non-blocking reads except when the page number passes a 4-byte
> value (assuming 4-byte int updates are atomic).

If we have one backend in charge, how does it pass the torch when it finishes the scan? I think you're headed back in the direction of an independent "scanning" process. That's not unreasonable, but there are a lot of other issues to deal with.

One thought of mine goes something like this: a scanner process starts up and scans with a predefined length of a cache trail in shared_buffers, perhaps a chunk of buffers used like a circular list (so it doesn't interfere with caching). When a new scan starts, it could request a block from this scanner process and begin the scan there. If the new scan keeps up with the scanner process, it will always be getting cached data. If it falls behind, the request turns into a new block request. In theory, the scan could actually catch back up to the scanner process after falling behind.
We could use a meaningful event (like activity on a particular relation) to start/stop the scanner process.

It's just another idea, but I'm still not all that sure that synchronization is necessary. Does anyone happen to have an answer on whether OS-level readahead is system-wide, or per-process? I expect that it's system-wide, but Tom raised the issue and it may be a drawback if some OSs do per-process readahead.

Regards,
Jeff Davis