Обсуждение: Checks in RegisterBackgroundWorker.()

Поиск
Список
Период
Сортировка

Checks in RegisterBackgroundWorker.()

От
Heikki Linnakangas
Дата:
The first patch on my "Refactoring backend fork+exec code" thread [0] 
changes the allocations of BackgroundWorkerList from plain malloc() to 
MemoryContextAlloc(PostmasterContext). However, that actually caused a 
segfault in worker_spi tests in EXEC_BACKEND mode.

BackgroundWorkerList is a postmaster-private data structure and should 
not be accessed in backends. That assumption failed in 
RegisterBackgroundWorker(). When you put worker_spi in 
shared_preload_libraries, its _PG_init() function calls 
RegisterBackgroundWorker(), as expected. But in EXEC_BACKEND mode, the 
library is loaded *again* in each backend process, and each of those 
loads also call RegisterBackgroundWorker(). It's too late to correctly 
register any static background workers at that stage, but 
RegisterBackgroundWorker() still goes through the motions and adds the 
element to BackgroundWorkerList. If you change the malloc() to 
MemoryContextAlloc(PostmasterContext), it segfaults because 
PostmasterContext == NULL in a backend process.

In summary, RegisterBackgroundWorker() is doing some questionable and 
useless work, when a shared preload library is loaded to a backend 
process in EXEC_BACKEND mode.

Attached patches:

1. Tighten/clarify those checks. See also commit message for details.
2. The patch from the other thread to switch to 
MemoryContextAlloc(PostmasterContext)
3. A fix for a highly misleading comment in the same file.

Any comments?

[0] 
https://www.postgresql.org/message-id/flat/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef%40iki.fi

P.S. In addition to those, I also considered these changes but didn't 
implement them yet:

- Change RegisterBackgroundWorker() to return true/false to indicate 
whether the registration succeeded. Currently, the caller has no way of 
knowing. In many cases, I think even an ERROR and refusing to start up 
the server would be appropriate. But at least we should let the caller 
know and decide.

- Add "Assert(am_postmaster)" assertions to the functions in bgworker.c 
that are only supposed to be called in postmaster process. The functions 
have good explicit comments on that, but wouldn't hurt to also assert. 
(There is no 'am_postmaster' flag, the equivalent is actually 
(IsPostmasterEnvironment && !IsUnderPostmaster), but perhaps we should 
define a macro or flag for that)

-- 
Heikki Linnakangas
Neon (https://neon.tech)
Вложения

Re: Checks in RegisterBackgroundWorker.()

От
Thomas Munro
Дата:
On Fri, Aug 25, 2023 at 3:15 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> In summary, RegisterBackgroundWorker() is doing some questionable and
> useless work, when a shared preload library is loaded to a backend
> process in EXEC_BACKEND mode.

Yeah.  When I was working on 7389aad6 ("Use WaitEventSet API for
postmaster's event loop."), I also tried to move all of the
postmaster's state variables into PostmasterContext (since the only
reason for that scope was the signal handler code that is now gone),
and I hit a variant of this design problem.  I wonder if that would be
unblocked by this...

https://www.postgresql.org/message-id/CA+hUKGKH_RPAo=NgPfHKj--565aL1qiVpUGdWt1_pmJehY+dmw@mail.gmail.com



Re: Checks in RegisterBackgroundWorker.()

От
Heikki Linnakangas
Дата:
On 25/08/2023 00:00, Thomas Munro wrote:
> On Fri, Aug 25, 2023 at 3:15 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> In summary, RegisterBackgroundWorker() is doing some questionable and
>> useless work, when a shared preload library is loaded to a backend
>> process in EXEC_BACKEND mode.
> 
> Yeah.  When I was working on 7389aad6 ("Use WaitEventSet API for
> postmaster's event loop."), I also tried to move all of the
> postmaster's state variables into PostmasterContext (since the only
> reason for that scope was the signal handler code that is now gone),
> and I hit a variant of this design problem.  I wonder if that would be
> unblocked by this...
> 
> https://www.postgresql.org/message-id/CA+hUKGKH_RPAo=NgPfHKj--565aL1qiVpUGdWt1_pmJehY+dmw@mail.gmail.com

A-ha, yes I believe this patch will unblock that. 
RegisterBackgroundWorker() has no legit reason to access 
BackgroundWorkerList in child processes, and with these patches, it no 
longer does.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Checks in RegisterBackgroundWorker.()

От
Heikki Linnakangas
Дата:
Here's a new version of these patches. I fixed one comment and ran 
pgindent, no other changes.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Вложения

Re: Checks in RegisterBackgroundWorker.()

От
Thomas Munro
Дата:
On Thu, Sep 28, 2023 at 9:46 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Here's a new version of these patches. I fixed one comment and ran
> pgindent, no other changes.

> Subject: [PATCH v2 1/3] Clarify the checks in RegisterBackgroundWorker.

LGTM.  I see it passes on CI, and I also tested locally with
EXEC_BACKEND, with shared_preload_libraries=path/to/pg_prewarm.so
which works fine.

> Subject: [PATCH v2 2/3] Allocate Backend structs in PostmasterContext.

LGTM.  I checked that you preserve the behaviour on OOM (LOG), and you
converted free() to pfree() in code that runs in the postmaster, but
dropped it in the code that runs in the child because all children
should delete PostmasterContext, making per-object pfree redundant.
Good.

> Subject: [PATCH v2 3/3] Fix misleading comment on StartBackgroundWorker().

LGTM.  Hmm, maybe I would have called that function
"BackgroundWorkerMain()" like several other similar things, but that's
not important.

This doesn't quite fix the problem I was complaining about earlier,
but it de-confuses things.  (Namely that if BackgroundWorkerList
weren't a global variable, RegisterWorkerMain() wouldn't be able to
find it, and if it took some kind of context pointer as an argument,
_PG_init() functions wouldn't be able to provide it, unless we changed
_PG_init() to take an argument, which we can't really do.  Oh well.)



Re: Checks in RegisterBackgroundWorker.()

От
Heikki Linnakangas
Дата:
On 06/10/2023 13:13, Thomas Munro wrote:
> On Thu, Sep 28, 2023 at 9:46 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Subject: [PATCH v2 3/3] Fix misleading comment on StartBackgroundWorker().
> 
> LGTM.  Hmm, maybe I would have called that function
> "BackgroundWorkerMain()" like several other similar things, but that's
> not important.

That's a good idea. I renamed it to BackgroundWorkerMain().

Pushed with that change, thanks for the review!

-- 
Heikki Linnakangas
Neon (https://neon.tech)