Re: GSoC on WAL-logging hash indexes
From | Heikki Linnakangas |
---|---|
Subject | Re: GSoC on WAL-logging hash indexes |
Date | |
Msg-id | 5319D3A3.9070209@vmware.com |
In response to | Re: GSoC on WAL-logging hash indexes (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 03/07/2014 03:48 PM, Robert Haas wrote:
> On Fri, Mar 7, 2014 at 4:34 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> Hmm. You suggested ensuring that a scan always has at least a pin, and split
>> takes a vacuum-lock. That ought to work. There's no need for the more
>> complicated maneuvers you described, ISTM that you can just replace the
>> heavy-weight share lock with holding a pin on the primary page of the
>> bucket, and an exclusive lock with a vacuum-lock. Note that
>> _hash_expandtable already takes the exclusive lock conditionally, ie. if it
>> doesn't get the lock immediately it just gives up. We could do the same with
>> the cleanup lock.
>
> We could try that. I assume you mean do *just* what you describe
> here, without the split-in-progress or moved-by-split flags I
> suggested.

Yep.

> The only issue I see with that is that instead of everyone
> piling up on the heavyweight lock, a wait which is interruptible,
> they'd all pile up on the buffer content lwlock, a wait which isn't.
> And splitting a bucket can involve an arbitrary number of I/O
> operations, so that's kind of unappealing. Even checkpoints would be
> blocked until the bucket split completed, which seems unfortunate.

Hmm. I doubt that's a big deal in practice, although I agree it's a bit
ugly. Once we solve the crash-safety of splits, we actually have the
option of doing the split in many small steps, even when there's no
crash involved. You could for example grab the vacuum-lock, move all the
tuples in the first 5 pages, and then release the lock to give other
backends that are queued up a chance to do their scans/insertions. Then
re-acquire the lock, and continue where you left off. Or just bail out
and let the next vacuum or insertion finish it later.

- Heikki
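To make the "many small steps" idea concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions and is not PostgreSQL code: `Bucket`, `split_step`, `belongs_in_new` and `BATCH_PAGES` are hypothetical names, a plain `threading.Lock` stands in for the cleanup (vacuum) lock, and pages are modelled as lists of integer keys. It shows the two behaviours discussed above: taking the lock conditionally and giving up if it is not immediately available, and releasing it after each batch of pages so that queued-up scans and insertions can get in.

```python
import threading

BATCH_PAGES = 5  # "the first 5 pages" from the message; purely illustrative


class Bucket:
    """Toy bucket: a list of pages, each page a list of integer hash keys."""

    def __init__(self, pages):
        self.pages = pages
        self.lock = threading.Lock()  # stand-in for the cleanup (vacuum) lock
        self.next_page = 0            # split progress: first page not yet processed

    def split_done(self):
        return self.next_page >= len(self.pages)


def split_step(old, new, belongs_in_new, batch_pages=BATCH_PAGES):
    """Process up to batch_pages pages of `old`, moving matching keys to `new`.

    Returns False if the lock could not be taken immediately (the caller
    just gives up, leaving the split to be continued later), True otherwise.
    """
    if not old.lock.acquire(blocking=False):  # conditional acquire: never wait
        return False
    try:
        end = min(old.next_page + batch_pages, len(old.pages))
        for i in range(old.next_page, end):
            page = old.pages[i]
            moved = [key for key in page if belongs_in_new(key)]
            old.pages[i] = [key for key in page if not belongs_in_new(key)]
            if moved:
                new.pages.append(moved)
        old.next_page = end  # remember where to continue after re-acquiring
    finally:
        old.lock.release()   # let queued-up scans/insertions run between steps
    return True


# Usage: split a 12-page bucket in batches, releasing the lock between batches.
old = Bucket([[i, i + 100] for i in range(12)])
new = Bucket([])
steps = 0
while not old.split_done():
    if split_step(old, new, lambda key: key % 2 == 1):
        steps += 1
```

The progress marker `next_page` is what lets the split be resumed, or left for a later vacuum or insertion to finish; in a real implementation that state would have to be stored durably and WAL-logged, which this sketch does not attempt.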