Re: Issue with the PRNG used by Postgres
From | Andres Freund
---|---
Subject | Re: Issue with the PRNG used by Postgres
Date |
Msg-id | 20240412060241.hm3o7jvgrq6q7kyi@awork3.anarazel.de
In reply to | Re: Issue with the PRNG used by Postgres (Andres Freund <andres@anarazel.de>)
Responses | Re: Issue with the PRNG used by Postgres
List | pgsql-hackers
Hi,

On 2024-04-11 21:41:39 -0700, Andres Freund wrote:
> FWIW, I just reproduced the scenario with signals. I added tracking of the
> total time actually slept and lost to SpinDelayStatus, and added a function to
> trigger a wait on a spinlock.
>
> To wait less, I set max_standby_streaming_delay=0.1, but that's just for
> easier testing in isolation. In reality that could have been reached before
> the spinlock is even acquired.
>
> On a standby, while a recovery conflict is happening:
> PANIC: XX000: stuck spinlock detected at crashme, path/to/file:line, after 4.38s, lost 127.96s
>
> So right now it's really not hard to trigger the stuck-spinlock logic
> completely spuriously. This doesn't just happen with hot standby, there are
> plenty other sources of lots of signals being sent.

Oh my. There's a workload that completely trivially hits this, without even
trying hard. LISTEN/NOTIFY.

PANIC: XX000: stuck spinlock detected at crashme, file:line, after 0.000072s, lost 133.027159s

Yes, it really triggered in less than 1ms. That was with just one session
doing NOTIFYs from a client. There's plenty of users that send NOTIFY from
triggers, which afaics will result in much higher rates of signals being
sent.

Even with a bit less NOTIFY traffic, this very obviously gets into the
territory where plain scheduling delays will trigger the stuck spinlock
logic.

Greetings,

Andres