removing set_latch_on_sigusr1
From: Robert Haas
Subject: removing set_latch_on_sigusr1
Date:
Msg-id: CA+TgmoY6QSwGj1Uhu-SqXExdMmCX=Mk6S0pOFn-VSF_2EWDwCQ@mail.gmail.com
Responses: Re: removing set_latch_on_sigusr1
           Re: removing set_latch_on_sigusr1
List: pgsql-hackers
As part of the dynamic background worker mechanism, I introduced a flag set_latch_on_sigusr1. When set, a process sets its own latch any time it receives SIGUSR1. The original motivation for this was that the postmaster sends SIGUSR1 to a process to indicate the death of a background worker, but cannot directly set the process latch since we want to minimize the postmaster's contact surface with the main shared memory segment.

The reason I introduced a new flag instead of just *always* setting the latch was that I thought somebody might complain about the cost of setting the latch every time. But now I think that's exactly what we should do: forget the flag, always set the latch. If you get a SIGUSR1, somebody is trying to get your attention. At worst, setting the latch will cause whatever latch-wait loop got interrupted to re-execute without making forward progress. Most of the time, though, you'll do something like CHECK_FOR_INTERRUPTS() in that loop, and you'll probably find that there's an interrupt pending.

The code we're talking about here is in procsignal_sigusr1_handler. That function can potentially call HandleCatchupInterrupt, HandleNotifyInterrupt, HandleParallelMessageInterrupt, or RecoveryConflictInterrupt, and all of those functions set the process latch. So the only thing we're gaining by having set_latch_on_sigusr1 is the possibility that we might avoid setting the latch in response to either (1) a postmaster signal related to a background worker that we happen not to care about at the moment or (2) a totally spurious SIGUSR1. But neither of those events should happen very often, so what's the point? The cost of setting the latch when we don't need to is typically small, but NOT setting it when we should have can lead to an indefinite hang.

As it happens, the TupleQueueFunnelNext function I recently committed has such a hazard, which I failed to spot during review and testing. If people don't like this approach, I can instead make that function set the flag. But every place that sets the flag has to use a PG_TRY()/PG_CATCH() block to make sure the old value of the flag gets restored. I'm pretty sure that's going to burn more cycles than the flag can ever hope to save, not to mention the risk of bugs from people forgetting to add the necessary volatile qualifiers. We've already got four PG_TRY() blocks in the code to cater to this stupid flag, and if we keep it around I'm sure we'll accumulate at least a few more.

Patch attached. Objections? Suggestions? Comments?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
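For illustration, here is a minimal sketch of the save/set/restore dance that every current user of set_latch_on_sigusr1 has to go through, written against the 9.5-era latch API. The function wait_for_flag(), its flag parameter, and the include list are hypothetical stand-ins for this message, not code from the attached patch:

#include "postgres.h"

#include <signal.h>

#include "miscadmin.h"
#include "postmaster/bgworker.h"
#include "storage/latch.h"

/*
 * Hypothetical caller that wants SIGUSR1 to wake it from a latch wait.
 * It must save, set, and later restore set_latch_on_sigusr1, and it
 * needs PG_TRY()/PG_CATCH() so the old value is put back even if an
 * error propagates out of the wait loop.
 */
static void
wait_for_flag(volatile sig_atomic_t *flag)
{
    /* volatile per the usual PG_TRY() rules for locals read in PG_CATCH() */
    volatile bool save_set_latch = set_latch_on_sigusr1;

    set_latch_on_sigusr1 = true;
    PG_TRY();
    {
        while (!*flag)
        {
            CHECK_FOR_INTERRUPTS();

            /* sleep until some signal handler sets our latch */
            WaitLatch(MyLatch, WL_LATCH_SET, 0);
            ResetLatch(MyLatch);
        }
    }
    PG_CATCH();
    {
        set_latch_on_sigusr1 = save_set_latch;
        PG_RE_THROW();
    }
    PG_END_TRY();

    set_latch_on_sigusr1 = save_set_latch;
}

With the flag gone, the PG_TRY()/PG_CATCH() wrapper and the save/restore disappear and only the wait loop itself remains, since SIGUSR1 always sets the latch.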