Re: B-tree parent pointer and checkpoints
From | Heikki Linnakangas |
---|---|
Subject | Re: B-tree parent pointer and checkpoints |
Date | |
Msg-id | 4CD7FDBA.1020506@enterprisedb.com |
In response to | Re: B-tree parent pointer and checkpoints (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses | Re: B-tree parent pointer and checkpoints |
List | pgsql-hackers |
On 02.11.2010 16:40, Heikki Linnakangas wrote:
> On 02.11.2010 16:30, Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> I think we can fix this by requiring that any multi-WAL-record actions
>>> that are in-progress when a checkpoint starts (at the REDO-pointer) must
>>> finish before the checkpoint record is written.
>>
>> What happens if someone wants to start a new split while the checkpoint
>> is hanging fire?
>
> You mean after CreateCheckPoint has determined the redo pointer, but
> before it has written the checkpoint record? The new split can go ahead,
> and the checkpoint doesn't need to care about it. Recovery will start at
> the redo pointer, so it will see the split record, and will know to
> finish the incomplete split if necessary.
>
> The logic is the same as with inCommit. Checkpoint will fetch the list
> of in-progress splits some time after determining the redo-pointer. It
> will then wait until all of those splits have finished. Any new splits
> that begin after fetching the list don't affect the checkpoint.
>
> inCommit can't be used as is, because it's tied to the Xid, but
> something similar should work.

Here's a first draft of this, using the inCommit flag as is. It works,
but suffers from starvation if you have a lot of concurrent
multi-WAL-record actions. I tested that by running INSERTs to a table
with a tsvector field with a GiST index on it from five concurrent
sessions, and saw checkpoints regularly busy-waiting for over a minute.

To avoid that, we need something a little bit more complicated than a
boolean flag. I'm thinking of adding a counter beside the inCommit flag
that's incremented every time a new multi-WAL-record action begins, so
that the checkpoint process can distinguish between a new action that
was started after deciding the REDO pointer and an old one that's still
running.

(inCommit is a misnomer now, of course. Will need to find a better name..)
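
To make the counter idea concrete, here is a rough sketch of one way it could work. This is not the attached draft and not actual PostgreSQL code: BackendSlot, SlotSnapshot, MAX_BACKENDS and all four function names are made up for illustration, and the locking and memory ordering that real shared memory would need are left out. Each backend bumps its own counter whenever it begins a multi-WAL-record action; the checkpointer remembers the (flag, counter) pairs it saw after deciding the REDO pointer and only waits for those exact actions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BACKENDS 8          /* hypothetical fixed-size backend array */

/* Hypothetical per-backend state; not PostgreSQL's actual PGPROC layout. */
typedef struct
{
    bool     inAction;          /* a multi-WAL-record action is in progress */
    uint32_t actionCounter;     /* bumped every time a new action begins */
} BackendSlot;

/* What the checkpointer remembers about one backend. */
typedef struct
{
    bool     wasInAction;       /* flag value at snapshot time */
    uint32_t counterSeen;       /* counter value at snapshot time */
} SlotSnapshot;

/* Stand-in for shared memory; zero-initialized, so all backends start idle. */
static BackendSlot backends[MAX_BACKENDS];

/* Backend side: called before writing the first WAL record of the action. */
static void
begin_action(int b)
{
    backends[b].actionCounter++;
    backends[b].inAction = true;
}

/* Backend side: called after writing the last WAL record of the action. */
static void
end_action(int b)
{
    backends[b].inAction = false;
}

/* Checkpointer: called once, some time after the REDO pointer is decided. */
static void
snapshot_actions(SlotSnapshot snap[MAX_BACKENDS])
{
    for (int i = 0; i < MAX_BACKENDS; i++)
    {
        snap[i].wasInAction = backends[i].inAction;
        snap[i].counterSeen = backends[i].actionCounter;
    }
}

/*
 * Checkpointer: returns true while at least one action that was running at
 * snapshot time is still running. A backend whose counter has moved on has
 * started a newer action, which recovery will see after the REDO pointer,
 * so it does not hold up the checkpoint.
 */
static bool
must_keep_waiting(const SlotSnapshot snap[MAX_BACKENDS])
{
    for (int i = 0; i < MAX_BACKENDS; i++)
    {
        if (snap[i].wasInAction &&
            backends[i].inAction &&
            backends[i].actionCounter == snap[i].counterSeen)
            return true;
    }
    return false;
}
```

With only the boolean, a backend that finishes one split and immediately starts another looks "still busy" every time the checkpointer polls, which is where the starvation comes from. Comparing the counter against the snapshot lets the checkpointer stop waiting as soon as the original action has ended, even if the flag is set again by then.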
-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com