Re: Online checksums verification in the backend
From | Julien Rouhaud |
---|---|
Subject | Re: Online checksums verification in the backend |
Date | |
Msg-id | CAOBaU_Yb=PBm1mkZqYrS1KoHA=jtQQOUzc4Ahnz7npwPbKew6w@mail.gmail.com |
In response to | Re: Online checksums verification in the backend (Andres Freund <andres@anarazel.de>) |
Responses | Re: Online checksums verification in the backend |
List | pgsql-hackers |
On Fri, Oct 30, 2020 at 10:58 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-10-30 10:01:08 +0800, Julien Rouhaud wrote:
> > On Fri, Oct 30, 2020 at 2:17 AM Andres Freund <andres@anarazel.de> wrote:
> > > The code does IO while holding the buffer mapping lock. That seems
> > > *entirely* unacceptable to me. That basically locks 1/128 of shared
> > > buffers against concurrent mapping changes, while reading data that is
> > > likely not to be on disk? Seriously?
> >
> > The initial implementation had a different approach, reading the buffer once
> > without holding the buffer mapping lock (which could lead to some false
> > positive in some unlikely scenario), and only if a corruption is detected the
> > read is done once again *while holding the buffer mapping lock* to ensure it's
> > not a false positive. Some benchmarking showed that the performance was worse,
> > so we dropped that optimisation. Should we go back to something like that or
> > do you have a better way to ensure a consistent read of a buffer which isn't in
> > shared buffers?
>
> I suspect that you're gonna need something quite different than what the
> function is doing right now. Not because such a method will be faster in
> isolation, but because there's a chance to have it correct and not have
> a significant performance impact onto the rest of the system.
>
> I've not thought about it in detail yet. I suspect you'll need to
> ensure there is a valid entry in the buffer mapping table for the buffer
> you're processing. By virtue of setting BM_IO_IN_PROGRESS on that entry
> you're going to prevent concurrent IO from starting until your part is
> done.

So I'm assuming that the previous optimization, which almost always avoided
doing IO while holding a buffer mapping lock, isn't an option? In that case,
I don't see any other option than reverting the patch and discussing a new
approach.
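
For readers following along, the "previous optimization" mentioned above (read once without the lock, and re-read under the lock only when the first read looks corrupted) can be sketched roughly as follows. This is a toy model in Python, not the patch's code and not PostgreSQL's API: `ToyStorage`, `TornOnceStorage`, `verify_block`, the CRC32 checksum, and the `stats` counters are all hypothetical names chosen for illustration, and a plain `threading.Lock` stands in for a buffer mapping partition lock.

```python
import threading
import zlib


class ToyStorage:
    """Hypothetical stand-in for on-disk blocks: payload plus a stored CRC32."""

    def __init__(self):
        self.blocks = {}

    def write(self, blkno, payload):
        self.blocks[blkno] = (payload, zlib.crc32(payload))

    def read(self, blkno):
        return self.blocks[blkno]


class TornOnceStorage(ToyStorage):
    """Simulates a concurrent write: the first read of a block looks corrupted."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def read(self, blkno):
        payload, stored = super().read(blkno)
        if blkno not in self.seen:
            self.seen.add(blkno)
            return payload, stored ^ 1  # deliberately wrong checksum, once
        return payload, stored


def block_is_valid(storage, blkno):
    """Recompute the checksum and compare it with the stored one."""
    payload, stored = storage.read(blkno)
    return zlib.crc32(payload) == stored


def verify_block(storage, blkno, mapping_lock, stats):
    """Two-pass check: optimistic unlocked read, locked re-read on failure."""
    # Pass 1: no lock held, so a concurrent write may cause a false positive.
    stats["unlocked_reads"] += 1
    if block_is_valid(storage, blkno):
        return True
    # Pass 2: only reached on a suspected corruption. Holding the lock is
    # assumed to keep writers out, so this result is taken as authoritative.
    with mapping_lock:
        stats["locked_reads"] += 1
        return block_is_valid(storage, blkno)
```

In this model the lock is only taken for blocks that fail the first check, so IO under the lock becomes the rare case instead of the common one. It still performs IO under the lock in that rare case, which is the part the quoted objection is about.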