Re: Race condition between hot standby and restoring a FPW
From | Heikki Linnakangas |
---|---|
Subject | Re: Race condition between hot standby and restoring a FPW |
Date | |
Msg-id | 54637BAD.3040209@vmware.com |
In response to | Re: Race condition between hot standby and restoring a FPW (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 11/12/2014 05:08 PM, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 7:39 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> 2. When ReadBufferExtended doesn't find the page in cache, it returns the
>> buffer in !BM_VALID state (i.e. still in I/O in-progress state). Require the
>> caller to call a second function, after locking the page, to finish the I/O.
>
> This seems like a reasonable approach.
>
> If you tilt your head the right way, zeroing a page and restoring a
> backup block are the same thing: either way, you want to "read" the
> block into shared buffers without actually reading it, so that you can
> overwrite the prior contents with something else. So, you could fix
> this by adding a new mode, RBM_OVERWRITE, and passing the new page
> contents as an additional argument to ReadBufferExtended, which would
> then memcpy() that data into place where RBM_ZERO calls MemSet() to
> zero it.

Yes, that would be quite a clean API. However, there's a problem with
locking, when the redo routine modifies multiple pages. Currently, you
lock the page first, and replace the page with the new contents while
holding the lock. With RBM_OVERWRITE, the new page contents would sneak
into the buffer before RestoreBackupBlock has acquired the lock on the
page, and another backend might pin and lock the page before
RestoreBackupBlock does. The page contents would be valid, but they
might not be consistent with other buffers yet. The redo routine might
be doing an atomic operation that spans multiple pages, by holding the
locks on all the pages until it's finished with all the changes, but the
backend would see a partial result.

- Heikki
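
The ordering problem described in the reply can be illustrated with a small toy model. This is only a sketch and not PostgreSQL code: `ToyPage`, `backend_sees_torn_state`, `redo_lock_then_restore`, and `redo_overwrite_then_lock` are hypothetical names invented for this illustration, a page is reduced to a lock flag and a version number, and a concurrent backend is modelled as a probe that can only read pages that are not locked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; not a PostgreSQL structure. */
typedef struct ToyPage {
    bool locked;   /* exclusive content lock held by the redo routine */
    int  version;  /* stands in for the page contents */
} ToyPage;

/*
 * Models a concurrent backend probing both pages right now.
 * If either page is locked the backend would block, so it sees nothing.
 * Otherwise it reads both and may find them at different versions.
 */
bool backend_sees_torn_state(const ToyPage *a, const ToyPage *b)
{
    if (a->locked || b->locked)
        return false;
    return a->version != b->version;
}

/* Existing ordering: take the lock first, replace contents under it. */
bool redo_lock_then_restore(void)
{
    ToyPage a = { false, 1 }, b = { false, 1 };
    bool torn = false;

    a.locked = true;                 /* lock page A ...              */
    a.version = 2;                   /* ... then install new image   */
    torn |= backend_sees_torn_state(&a, &b);

    b.locked = true;
    b.version = 2;
    torn |= backend_sees_torn_state(&a, &b);

    a.locked = false;                /* release only after all pages */
    b.locked = false;                /* have been changed            */
    torn |= backend_sees_torn_state(&a, &b);
    return torn;
}

/* Assumed RBM_OVERWRITE-style ordering: contents land before the lock. */
bool redo_overwrite_then_lock(void)
{
    ToyPage a = { false, 1 }, b = { false, 1 };
    bool torn = false;

    a.version = 2;                   /* image installed by the read call */
    torn |= backend_sees_torn_state(&a, &b);   /* window: A unlocked */
    a.locked = true;                 /* lock arrives too late */

    b.version = 2;
    torn |= backend_sees_torn_state(&a, &b);
    b.locked = true;

    a.locked = false;
    b.locked = false;
    torn |= backend_sees_torn_state(&a, &b);
    return torn;
}
```

In this model `redo_lock_then_restore()` never exposes a mixed state, because every probe that could observe one finds page A locked, while `redo_overwrite_then_lock()` exposes one in the window between installing the new contents of page A and locking it. The model is single-threaded and deterministic on purpose; it only shows where the window is, not how likely a real backend is to hit it.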