Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds
От | Tom Lane |
---|---|
Тема | Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds |
Дата | |
Msg-id | 1273950.1707436232@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds
|
Список | pgsql-bugs |
Andres Freund <andres@anarazel.de> writes: > I think we ought to understand *why* we are getting the "Too many open > files". The AcquireExternalFD() in CreateWaitEventSet() should prevent > that. Actually, I think the AcquireExternalFD() in CreateWaitEventSet() is *causing* that and needs to be removed. What is happening in Alexander's new example is that we are doing AcquireExternalFD() for each postgres_fdw connection (cf. libpqsrv_connect in libpq-be-fe-helpers.h), and the example is tuned to bring that exactly up to the limit of what AcquireExternalFD() allows. Then the next WaitLatchOrSocket call fails, because it does WaitEventSet *set = CreateWaitEventSet(CurrentResourceOwner, 3); Then when pgfdw_abort_cleanup tries to clean up the connections' state, it needs to do WaitLatchOrSocket again, and that fails again, and we PANIC because we're already in abort state. Since WaitLatchOrSocket is going to free this WaitEventSet before it returns, it's not apparent to me why we need to count it as a long-lived FD: we could just as well assume that it can slide in under the NUM_RESERVED_FDS limit. Or perhaps use ReserveExternalFD instead of AcquireExternalFD. We'd need some API extension to tell latch.c to do that, but that doesn't seem hard. (Unless we could consider that all WaitEventSets should use ReserveExternalFD? Not sure I want to argue for that though.) I guess a third possibility is that WaitLatchOrSocket could just permanently hang onto the WaitEventSet once it's got one. > One annoying bit is that AcquireExternalFD() failing emits the same error as > if epoll_create1() itself failing, including the same errno. It's the former. I tend to agree now that maybe using the same error text wasn't too smart. regards, tom lane
В списке pgsql-bugs по дате отправления: