Discussion: latch usage and postmaster death

latch usage and postmaster death

From: Andres Freund
Date:
Hi,

a significant number of WaitLatch's in the backend currently don't check
for postmaster death. That's imo wrong.  E.g. SELECT pg_sleep(100); just
continues to run.

I think we should change most sites to error out in that case. I wonder
if we shouldn't add another WL_ flag that automatically makes the latch
code do so; instead of repeating the code at every callsite.
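
To make the idea of a flag handled inside the latch code more concrete, here is a minimal toy model, not PostgreSQL code. The TOY_ flag values, the TOY_WL_DIE_ON_PM_DEATH name, and the toy_wait_latch() helper are all hypothetical and exist only for this sketch. The "occurred" argument stands in for the events the real wait would have observed.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this toy model; not PostgreSQL's real values. */
enum
{
    TOY_WL_LATCH_SET        = 1 << 0,
    TOY_WL_POSTMASTER_DEATH = 1 << 1,
    TOY_WL_TIMEOUT          = 1 << 2,
    TOY_WL_DIE_ON_PM_DEATH  = 1 << 3   /* stands in for the proposed new flag */
};

/*
 * Toy stand-in for a wait primitive.  "occurred" is the set of events that
 * happened while waiting.  If the caller passed the hypothetical
 * TOY_WL_DIE_ON_PM_DEATH flag and the postmaster died, the decision to bail
 * out is made here, once, instead of at every call site.  A real
 * implementation would raise an error or exit at this point; the toy only
 * reports it through *must_exit.
 */
int
toy_wait_latch(int wake_events, int occurred, bool *must_exit)
{
    *must_exit = false;

    if ((wake_events & TOY_WL_DIE_ON_PM_DEATH) &&
        (occurred & TOY_WL_POSTMASTER_DEATH))
    {
        *must_exit = true;
        return 0;
    }

    /* Otherwise report only the events the caller asked about. */
    return occurred & wake_events;
}
```

A caller that passes the flag no longer has to test the return value for postmaster death itself. A caller that omits it, as syslogger presumably would, keeps the current behavior.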

Places that I've noticed in a quick skim:
* pg_sleep()
* gather_getnext()
* shm_mq_send_bytes()?
* shm_mq_receive_bytes()?
* ProcSleep()?
* ProcWaitForSignal()

The only case where we don't necessarily want to react to postmaster
death is syslogger, which is supposed to finish logging before shutting
down.


Additionally I noticed that we're not always diligent about following
the correct pattern when using latches. For example check
gather_readnext():

        /* Nothing to do except wait for developments. */
        WaitLatch(MyLatch, WL_LATCH_SET, 0);
        CHECK_FOR_INTERRUPTS();
        ResetLatch(MyLatch);

We should reset the latch before checking for interrupts, not
after. With the current ordering, an interrupt that arrives between
the two calls may be ignored.
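
The ordering problem can be sketched as a small deterministic model. This is not PostgreSQL code: ToyProc, toy_signal() and the other toy_ names are hypothetical. The model assumes a signal handler that both marks an interrupt as pending and sets the latch. The "signal" is injected by hand at the gap between the two steps.

```c
#include <stdbool.h>

/* Hypothetical per-process state: one latch bit, one pending-interrupt bit. */
typedef struct ToyProc
{
    bool latch_set;
    bool interrupt_pending;
} ToyProc;

/* Models a signal handler: flag the interrupt, then set the latch. */
void
toy_signal(ToyProc *p)
{
    p->interrupt_pending = true;
    p->latch_set = true;
}

/* Models an interrupt check: service the interrupt if one is pending. */
void
toy_check_for_interrupts(ToyProc *p)
{
    if (p->interrupt_pending)
        p->interrupt_pending = false;
}

/*
 * Order from the thread: wait, check interrupts, reset.  The signal lands
 * after the check and before the reset.  Returns true if the next wait
 * would sleep with an interrupt still pending.
 */
bool
toy_check_then_reset_loses_wakeup(void)
{
    ToyProc p = {true, false};      /* the wait just returned: latch is set */

    toy_check_for_interrupts(&p);   /* nothing pending yet */
    toy_signal(&p);                 /* interrupt arrives in the gap */
    p.latch_set = false;            /* reset wipes out the wakeup */

    return p.interrupt_pending && !p.latch_set;
}

/* Suggested order: wait, reset, check interrupts.  Same injected signal. */
bool
toy_reset_then_check_loses_wakeup(void)
{
    ToyProc p = {true, false};

    p.latch_set = false;            /* reset first */
    toy_signal(&p);                 /* interrupt arrives in the gap */
    toy_check_for_interrupts(&p);   /* serviced right away */

    return p.interrupt_pending && !p.latch_set;
}
```

In this model the first function returns true and the second returns false. With the reset done first, a signal that arrives after the check still leaves the latch set, so the next wait returns immediately.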

Greetings,

Andres Freund



Re: latch usage and postmaster death

From: Robert Haas
Date:
On Mon, Mar 21, 2016 at 5:35 AM, Andres Freund <andres@anarazel.de> wrote:
> a significant number of WaitLatch's in the backend currently don't check
> for postmaster death. That's imo wrong.  E.g. SELECT pg_sleep(100); just
> continues to run.
>
> I think we should change most sites to error out in that case. I wonder
> if we shouldn't add another WL_ flag that automatically makes the latch
> code do so; instead of repeating the code at every callsite.

Yeah, or just make it do it always.  And probably FATAL rather than ERROR.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company