Re: race condition when writing pg_control
От | Thomas Munro |
---|---|
Тема | Re: race condition when writing pg_control |
Дата | |
Msg-id | CA+hUKGJU378uN_OOc28pN0hgjj2TFOb-jYZX6kamkGB9BBMK5w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: race condition when writing pg_control (Thomas Munro <thomas.munro@gmail.com>) |
Ответы |
Re: race condition when writing pg_control
|
Список | pgsql-hackers |
On Fri, May 17, 2024 at 4:46 PM Thomas Munro <thomas.munro@gmail.com> wrote: > The specific problem here is that LocalProcessControlFile() runs in > every launched child for EXEC_BACKEND builds. Windows uses > EXEC_BACKEND, and Windows' NTFS file system is one of the two file > systems known to this list to have the concurrent read/write data > mashing problem (the other being ext4). Phngh... this is surprisingly difficult to fix. Things that don't work: We "just" need to acquire ControlFileLock while reading the file or examining the object in shared memory, or get a copy of it, passed through the EXEC_BACKEND BackendParameters that was acquired while holding the lock, but the current location of this code in child startup is too early to use LWLocks, and the postmaster can't acquire locks either so it can't even safely take a copy to pass on. You could reorder startup so that we are allowed to acquire LWLocks in children at that point, but then you'd need to convince yourself that there is no danger of breaking some ordering requirement in external preload libraries, and figure out what to do about children that don't even attach to shared memory. Maybe that's possible, but that doesn't sound like a good idea to back-patch. First idea idea I've come up with to avoid all of that: pass a copy of the "proto-controlfile", to coin a term for the one read early in postmaster startup by LocalProcessControlFile(). As far as I know, the only reason we need it is to suck some settings out of it that don't change while a cluster is running (mostly can't change after initdb, and checksums can only be {en,dis}abled while down). Right? Children can just "import" that sucker instead of calling LocalProcessControlFile() to figure out the size of WAL segments yada yada, I think? Later they will attach to the real one in shared memory for all future purposes, once normal interlocking is allowed. I dunno. Draft patch attached. Better plans welcome. This passes CI on Linux systems afflicted by EXEC_BACKEND, and Windows. Thoughts?
Вложения
В списке pgsql-hackers по дате отправления: