Re: Hot Standby and VACUUM FULL
From | Tom Lane |
---|---|
Subject | Re: Hot Standby and VACUUM FULL |
Date | |
Msg-id | 19652.1265036777@sss.pgh.pa.us |
In reply to | Re: Hot Standby and VACUUM FULL (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Hot Standby and VACUUM FULL
Re: Hot Standby and VACUUM FULL |
List | pgsql-hackers |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> Once the updated map file is moved into place, the relocation is effectively
>> committed even if we subsequently abort the transaction. We can make that
>> window pretty narrow but not remove it completely.

> We could include the instructions to update the map file in the commit
> record, instead of introducing a new record type, and update the map
> file only *after* writing the commit record. The map file doesn't grow,
> so we can be pretty confident that updating it doesn't fail (failure
> would lead to PANIC).

> I'm assuming the map file is fixed size, with a fixed location for each
> relation, so that we can just overwrite the old file without the
> create+rename dance, and not worry about torn-pages.

That seems too fragile to me, as I don't find it a stretch at all to think that writing the map file might fail --- just think Windows antivirus code :-(. Now, once we have written the WAL record for the mapfile change, we can't really afford a failure in my approach either. But I think a rename() after successfully creating/writing/fsync'ing a temp file is a whole lot safer than writing from a standing start.

The other problem with what you sketch is that it'd require holding the mapfile write lock across commit, because we still have to have strict serialization of updates.

[ thinks for awhile ... ]

OTOH, overwrite-in-place is what we've always used for pg_control updates, and I don't recall ever seeing a report of a problem that could be traced to that. Maybe we should forget the rename() trick and overwrite the map file in place. I still think it needs to be a separate WAL record though. I'm thinking

* obtain lock
* open file for read/write
* read current contents
* construct modified contents
* write and sync WAL record
* writeback file through already-opened descriptor
* fsync
* release lock

Not totally clear if this is more or less safe than the rename method; but given the assumption that the file is less than one disk block, it should be just as atomic as pg_control updates are.

regards, tom lane
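
The eight-step overwrite-in-place sequence proposed in the message can be illustrated with a minimal sketch. This is not PostgreSQL code: the function names, the 512-byte map size, the layout of 4-byte slots, the `.lock` file standing in for the mapfile write lock, and the append-only file standing in for WAL are all assumptions made for illustration. It uses POSIX-only calls (`fcntl.flock`, `os.pread`, `os.pwrite`).

```python
import fcntl
import os
import struct

# Assumption: the whole map fits in one 512-byte disk block, so a single
# write is expected to be as atomic as a pg_control update.
MAP_SIZE = 512


def log_wal_record(wal_path, payload):
    """Hypothetical stand-in for 'write and sync WAL record'."""
    fd = os.open(wal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # Length-prefixed record, then force it to disk before returning.
        os.write(fd, struct.pack("<I", len(payload)) + payload)
        os.fsync(fd)
    finally:
        os.close(fd)


def update_map_in_place(map_path, wal_path, slot, new_filenode):
    """Overwrite one 4-byte slot of a fixed-size map file in place."""
    # 1. obtain lock (a lock file stands in for the mapfile write lock)
    lock_fd = os.open(map_path + ".lock", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        # 2. open file for read/write
        fd = os.open(map_path, os.O_RDWR)
        try:
            # 3. read current contents
            old = os.pread(fd, MAP_SIZE, 0)
            if len(old) != MAP_SIZE:
                raise ValueError("map file has unexpected size")
            # 4. construct modified contents
            new = bytearray(old)
            struct.pack_into("<I", new, slot * 4, new_filenode)
            # 5. write and sync WAL record; from here on a failure is fatal
            log_wal_record(wal_path, bytes(new))
            # 6. write back through the already-opened descriptor
            if os.pwrite(fd, bytes(new), 0) != MAP_SIZE:
                raise OSError("short write to map file")
            # 7. fsync
            os.fsync(fd)
        finally:
            os.close(fd)
    finally:
        # 8. release lock
        fcntl.flock(lock_fd, fcntl.LOCK_UN)
        os.close(lock_fd)
```

The point the sketch tries to make visible is the ordering: the WAL record is durable before the map file is touched, and the write-back reuses the descriptor opened in step 2, so no open or create can fail after the WAL record exists. In the sketch a failure after step 5 simply raises; the message implies a real implementation could not tolerate failing at that point.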