Index Page Split logging
From | Simon Riggs |
---|---|
Subject | Index Page Split logging |
Date | |
Msg-id | 1199192779.9558.256.camel@ebony.site |
Responses |
Re: Index Page Split logging
Re: Implementing Sorting Refinements |
List | pgsql-hackers |
When we split an index page we perform a multi-block operation that is both fairly expensive and complex to reconstruct should we crash partway through. If we could log *only* the insert that caused the split, rather than the split itself, we would avoid that situation entirely. This would then mean that the recovery code would resolve the split by performing a full logical split rather than replaying pieces of the original physical split. Doing that would remove a ton of complexity, as well as reducing log volumes.

We would need to ensure that the right-hand page of the split reached disk before the left-hand page. If a crash occurs when only the right hand page has reached disk then there would be no link (on disk) to it and so it would be ignored. We would need an orphaned page detection mechanism to allow the page to be VACUUMed sometime in the future.

There would also be some sort of ordering required in the buffer manager, so that pages which must be written last are kept pinned until the first page is written. That sounds like it is fairly straightforward and it would allow a generic mechanism that worked for all index splits, rather than requiring individual code for each rmgr.

ISTM that would require Direct I/O to perform physical writes in a specific order, rather than just issue the writes and fsync. Which probably kills it for now, even assuming you followed me on every point up till now...

So I'm mentioning this really to get the idea out there and see if anybody has any bright ideas, rather than as a well-formed proposal for immediate implementation.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
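
To make the write-ordering idea above concrete, here is a minimal toy sketch in Python. It is not PostgreSQL code: `ToyBufferManager`, `write_before`, `flush`, and `reachable` are hypothetical names invented for this illustration, pages are plain dicts with a `right` link, and a "crash" is modelled simply by stopping after some pages have been written. It only shows that holding back the left-hand page until the right-hand page is on disk leaves, at worst, an unlinked (orphaned) right-hand page.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBufferManager:
    """Hypothetical model: dirty pages plus 'write X before Y' constraints."""
    disk: dict = field(default_factory=dict)         # page id -> contents on "disk"
    dirty: dict = field(default_factory=dict)        # page id -> contents in memory
    must_follow: dict = field(default_factory=dict)  # page id -> pages to write first

    def mark_dirty(self, page_id, contents):
        self.dirty[page_id] = contents

    def write_before(self, first, later):
        """Record that `first` must reach disk before `later`."""
        self.must_follow.setdefault(later, set()).add(first)

    def flush(self, page_id):
        """Write one page, flushing its prerequisites first (assumes no cycles).

        Returns the list of page ids in the order they were written.
        """
        if page_id not in self.dirty:
            return []
        order = []
        for dep in sorted(self.must_follow.pop(page_id, ())):
            order += self.flush(dep)
        self.disk[page_id] = self.dirty.pop(page_id)
        order.append(page_id)
        return order


def reachable(disk, start):
    """Follow on-disk right-links from `start`; pages never visited are orphans."""
    seen, page = [], start
    while page is not None and page in disk and page not in seen:
        seen.append(page)
        page = disk[page]["right"]
    return seen


def split_example():
    """Set up a split of page "L" into "L" and a new right-hand page "R"."""
    bm = ToyBufferManager(disk={"L": {"keys": [1, 2, 3, 4], "right": None}})
    bm.mark_dirty("R", {"keys": [3, 4, 5], "right": None})
    bm.mark_dirty("L", {"keys": [1, 2], "right": "R"})
    bm.write_before("R", "L")  # right-hand page must hit disk first
    return bm


# Normal case: asking for "L" forces "R" out first.
bm = split_example()
write_order = bm.flush("L")

# Crash case: only "R" was written; nothing on disk links to it.
crashed = split_example()
crashed.flush("R")
orphans = set(crashed.disk) - set(reachable(crashed.disk, "L"))
```

In the crash case the on-disk left-hand page still holds its pre-split contents, so recovery could redo the logical insert (and hence the split) from it, while `orphans` contains the unlinked right-hand page that some later cleanup pass would have to find. The sketch says nothing about how the ordering would actually be enforced at the I/O level, which is the hard part raised above.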