RE: Proposed WAL changes
From | Mikheev, Vadim |
---|---|
Subject | RE: Proposed WAL changes |
Date | |
Msg-id | 8F4C99C66D04D4118F580090272A7A234D32FC@sectorbase1.sectorbase.com |
In reply to | Proposed WAL changes (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Proposed WAL changes |
List | pgsql-hackers |
> >> * Store two past checkpoint locations, not just one, in pg_control.
> >>   On startup, we fall back to the older checkpoint if the newer one
> >>   is unreadable. Also, a physical copy of the newest
> >>   checkpoint record
> >
> > And what to do if older one is unreadable too?
> > (Isn't it like using 2 x CRC32 instead of CRC64 ? -:))
>
> Then you lose --- but two checkpoints gives you twice the chance of
> recovery (probably more, actually, since it's much more likely that
> the previous checkpoint will have reached disk safely).

This is not correct. If the log is corrupted somehow (the checkpoint wasn't flushed as promised), then you have no chance to *recover*, because the DB will (most likely) be in an inconsistent state (data pages flushed before the corresponding log records, etc.). So a second checkpoint gives us twice the chance to *restart* in the normal way (read the checkpoint and roll forward from the redo record), not to *recover*. But this approach doubles the on-line log size requirements and doesn't help in cases where pg_control itself is corrupted.

Note, I agree that disaster *restart* must be implemented; I just think the "two checkpoints" approach is not the best way to go. From my POV, scanning the logs is much better: it doesn't require doubling the size of the on-line logs, and it allows us to *restart* even if pg_control was lost or corrupted:

If there is no pg_control, or it's corrupted, or it points to a nonexistent/corrupted checkpoint record, then scan the logs from newest to oldest until we find the last valid checkpoint record or the oldest valid log record, and then redo from there.

> See later discussion --- Andreas convinced me that flushing NEXTXID
> records to disk isn't really needed after all. (I didn't
> take the flush out of my patch yet, but will do so.) I still want
> to leave the NEXTXID records in there, though, because I think that
> XID and OID assignment ought to work as nearly alike as possible.
As I already explained briefly: with UNDO we'll be able to reuse XIDs after restart, i.e. there will be no point in having NEXTXID records at all. And there is no point in adding them now. Does it fix anything? Isn't "fixing" all that we should be doing in beta?

Vadim
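
The log-scanning restart described above (scan from newest to oldest, stop at the last valid checkpoint record, otherwise fall back to the oldest valid log record) can be sketched roughly as follows. This is only an illustration of the idea, not PostgreSQL code: `LogRecord`, `make_record`, `find_restart_point`, the in-memory list-of-segments layout, and the use of `zlib.crc32` as the validity check are all assumptions made for the example, and it assumes each record can be validated on its own while walking backward.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogRecord:
    """Hypothetical log record; field names are illustrative only."""
    lsn: int        # position of the record in the log
    kind: str       # "checkpoint" or "data"
    payload: bytes
    crc: int        # checksum stored alongside the payload

    def is_valid(self) -> bool:
        # A record counts as valid if its stored CRC matches its payload.
        return zlib.crc32(self.payload) == self.crc


def make_record(lsn: int, kind: str, payload: bytes,
                corrupt: bool = False) -> LogRecord:
    """Build a record; corrupt=True stores a deliberately wrong CRC."""
    crc = zlib.crc32(payload)
    if corrupt:
        crc ^= 1
    return LogRecord(lsn, kind, payload, crc)


def find_restart_point(segments: List[List[LogRecord]]) -> Optional[LogRecord]:
    """Pick the record to start redo from without consulting pg_control.

    `segments` is ordered oldest first. Scanning runs newest to oldest and
    returns the first valid checkpoint met (that is, the last one written).
    If no valid checkpoint exists, the oldest valid record is returned;
    None means nothing in the log was readable.
    """
    oldest_valid = None
    for segment in reversed(segments):
        for rec in reversed(segment):
            if not rec.is_valid():
                continue  # unreadable record: keep scanning backward
            if rec.kind == "checkpoint":
                return rec
            oldest_valid = rec
    return oldest_valid


# The newest checkpoint (lsn 5) is corrupted, so redo starts from lsn 2.
segments = [
    [make_record(1, "data", b"a"), make_record(2, "checkpoint", b"cp"),
     make_record(3, "data", b"b")],
    [make_record(4, "data", b"c"),
     make_record(5, "checkpoint", b"cp", corrupt=True),
     make_record(6, "data", b"d")],
]
start = find_restart_point(segments)
print(start.lsn if start else None)
```

Nothing here reads pg_control, which is the point of the approach: the restart point is derived from the logs alone, so it still works when pg_control is lost or points at a bad checkpoint.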