Re: failure in 019_replslot_limit
From | Alexander Lakhin |
---|---|
Subject | Re: failure in 019_replslot_limit |
Date | |
Msg-id | ff7bad44-bc27-7179-e9ed-79cb6866fe03@gmail.com |
In reply to | Re: failure in 019_replslot_limit (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
09.02.2024 21:59, Andres Freund wrote:
>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-02-04%2001%3A53%3A44
>> ) and saw that it's not checkpointer, but walsender is hanging:
> How did you reproduce this?

As kestrel didn't produce this failure until recently, I supposed that the cause is the same as with subscription/031_column_list: longer test duration. So I ran this test in parallel (with 20-30 jobs) in a slowed-down VM, so that the duration of one successful test run increased to 100-120 seconds. And I was lucky enough to catch it within 100 iterations.

But now that we know what's happening there, I think I could reproduce it much more easily, with some sleep(s) added, if it would be of any interest.

> So it's the issue that we wait effectively forever to send a FATAL. I've
> previously proposed that we should not block sending out fatal errors, given
> that allows clients to prevent graceful restarts and a lot of other things.
>

Yes, I had demonstrated one of those unpleasant things previously too:
https://www.postgresql.org/message-id/91c8860a-a866-71a7-a060-3f07af531295%40gmail.com

Best regards,
Alexander