Re: walsender bug: stuck during shutdown
From | Fujii Masao |
---|---|
Subject | Re: walsender bug: stuck during shutdown |
Date | |
Msg-id | 94910fe9-a720-7f49-c678-d9a16d42e6fb@oss.nttdata.com |
In response to | walsender bug: stuck during shutdown (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Responses |
Re: walsender bug: stuck during shutdown
|
List | pgsql-hackers |
On 2020/11/24 5:52, Alvaro Herrera wrote:
> Hello
>
> Chloe Dives reported that sometimes a walsender would become stuck
> during shutdown and *not* shutdown, thus preventing postmaster from
> completing the shutdown cycle. This has been observed to cause the
> servers to remain in such state for several hours.
>
> After a lengthy investigation and thanks to a handy reproducer by Chris
> Wilson, we found that the problem is that WalSndDone wants to avoid
> shutting down until everything has been sent and acknowledged; but this
> test is coded in a way that ignores the possibility that we have never
> received anything from the other end. In that case, both
> MyWalSnd->flush and MyWalSnd->write are InvalidRecPtr, so the condition
> in WalSndDone to terminate the loop is never fulfilled. So the
> walsender is looping forever and never terminates, blocking shutdown of
> the whole instance.
>
> The attached patch fixes the problem by testing for the problematic
> condition.
>
> Apparently this problem has existed forever. Fujii-san almost patched
> for it in 5c6d9fc4b2b8 (2014!), but missed it by a zillionth of an inch.

Thanks for working on this!
Could you point me to the discussion thread where Chloe Dives reported
the issue? Sorry, I could not find it. I'd like to see the procedure
to reproduce the issue.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
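
For readers following the thread, the failure mode described in the quoted message can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not the walsender source and not the attached patch: the type alias, the INVALID_REC_PTR constant, and both function names are hypothetical, and the real check may involve further conditions that are left out here. It assumes the check compares the sent position against the flush position when that is valid and against the write position otherwise, and that an invalid position is represented as zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the server's WAL position type and its
 * "invalid" value; both are assumptions made for this sketch. */
typedef uint64_t XLogRecPtr;
#define INVALID_REC_PTR ((XLogRecPtr) 0)

/*
 * Simplified model of the termination test as described in the report:
 * shutdown may proceed only once the position acknowledged by the peer
 * equals the position we have sent. If the peer never reported anything,
 * write and flush are both invalid, so the comparison can never succeed
 * once any WAL has been sent.
 */
bool
done_check_as_described(XLogRecPtr sent, XLogRecPtr write, XLogRecPtr flush)
{
    XLogRecPtr replicated = (flush == INVALID_REC_PTR) ? write : flush;

    return sent == replicated;
}

/*
 * One possible guard (an assumption, not necessarily what the patch does):
 * treat "never received any feedback" as a separate case so that the
 * shutdown loop is allowed to end.
 */
bool
done_check_with_guard(XLogRecPtr sent, XLogRecPtr write, XLogRecPtr flush)
{
    if (write == INVALID_REC_PTR && flush == INVALID_REC_PTR)
        return true;

    return done_check_as_described(sent, write, flush);
}
```

With sent = 1000 and both reported positions invalid, the first function keeps returning false no matter how often it is retried, which matches the endless loop in the report; the guarded variant returns true in that case while still waiting when the peer has acknowledged only part of the WAL.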