BUG #17345: pg_basebackup stucked for 2 hours before timeout
From | PG Bug reporting form |
---|---|
Subject | BUG #17345: pg_basebackup stucked for 2 hours before timeout |
Date | |
Msg-id | 17345-a66a0084532b7beb@postgresql.org |
Responses | Re: BUG #17345: pg_basebackup stucked for 2 hours before timeout |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 17345
Logged by: Bo Chen
Email address: bchen90@163.com
PostgreSQL version: 11.13
Operating system: euleros v2r9 x86_64
Description:

Hello experts,

I am facing an issue with pg_basebackup in a Docker environment. When the primary VM is restarted while pg_basebackup is running in the standby Docker container inside a VM, it takes 2 hours before pg_basebackup times out.

After analyzing and reproducing the problem, I think the reason is that the parent process, which fetches the data files, is blocked waiting for TCP keepalive, and it ignores or blocks SIGCHLD while it is in the poll() call. So we added code that signals the parent when the WAL-fetching child exits with a non-zero status. Below is the modified code.

     #include "streamutil.h"
    +#include <sys/prctl.h>

     #define ERRCODE_DATA_CORRUPTED "XX001"
    @@ -565,6 +566,8 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
         uint32      hi,
                     lo;
         char        statusdir[MAXPGPATH];
    +    pid_t       bgpid;
    +    int         ret;

         param = pg_malloc0(sizeof(logstreamer_param));
         param->timeline = timeline;
    @@ -662,12 +665,24 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
          * a fork(). On Windows, we create a thread.
          */
     #ifndef WIN32
    +    bgpid = getpid();
    +
         bgchild = fork();
         if (bgchild == 0)
         {
    +        (void)prctl(PR_SET_PDEATHSIG, SIGQUIT);
             /* in child process */
    -        exit(LogStreamerMain(param));
    +        ret = LogStreamerMain(param);
    +        if (ret != 0)
    +        {
    +            kill(bgpid, SIGINT);
    +        }
    +        exit(ret);
         }
         else if (bgchild < 0)
         {

This is the stack of pg_basebackup while it is stuck:

    #0 0xf7f6e039 in __kernel_vsyscall ()
    #1 0xf7a1f2ea in poll () from /usr/lib/libc.so.6
    #2 0xf7b25ea0 in pqSocketPoll (sock=5, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1127

The same issue was reported earlier by Ninad Shah:
https://www.postgresql.org/message-id/CAOFEiBd9j620TsBZPT0%2BuvdemQqwTrCLohcLjuDfQ2ye-xdswQ%40mail.gmail.com

Regards,
Bo Chenbo
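The mechanism described in the report can be reproduced outside PostgreSQL with a small POSIX program. The following is only a sketch under stated assumptions, not pg_basebackup code: `fake_streamer_main`, `fake_parent`, and `run_demo` are hypothetical names, a pipe stands in for the replication socket, and the `prctl(PR_SET_PDEATHSIG, ...)` part of the patch is left out. It shows a parent blocked in `poll()` with no timeout (as with `end_time=-1` in the stack trace) being terminated by a `SIGINT` from its child when the child fails.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for LogStreamerMain(): it only reports a status. */
static int fake_streamer_main(int status)
{
    return status;
}

/*
 * Hypothetical stand-in for the pg_basebackup parent. It forks a streamer
 * child and then blocks in poll() with no timeout on a descriptor that
 * stays silent unless the child succeeds. This function never returns.
 */
static void fake_parent(int child_status)
{
    int           fds[2];
    pid_t         parent_pid = getpid();   /* saved before fork(), like bgpid */
    pid_t         child;
    struct pollfd pfd;

    signal(SIGINT, SIG_DFL);   /* make sure SIGINT terminates this process */
    alarm(10);                 /* demo-only safety net against hanging */

    if (pipe(fds) != 0)
        _exit(2);

    child = fork();
    if (child < 0)
        _exit(2);
    if (child == 0)
    {
        int ret = fake_streamer_main(child_status);

        if (ret != 0)
            kill(parent_pid, SIGINT);            /* wake the blocked parent */
        else if (write(fds[1], "x", 1) != 1)     /* demo only: end the wait */
            _exit(2);
        _exit(ret);
    }

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Timeout -1 means "wait forever"; retry only if interrupted. */
    while (poll(&pfd, 1, -1) < 0)
    {
        if (errno != EINTR)
            _exit(2);
    }

    waitpid(child, NULL, 0);
    _exit(0);
}

/*
 * Runs the stand-in parent in its own process. Returns the number of the
 * signal that terminated it, 0 if it exited normally, or -1 on error.
 */
int run_demo(int child_status)
{
    int   status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        fake_parent(child_status);   /* never returns */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Called from a `main()`, `run_demo(1)` returns `SIGINT` because the failing child terminates the blocked parent, while `run_demo(0)` returns 0 because the parent finishes on its own. Relying on the default `SIGINT` action avoids the race in which a handled signal arrives just before `poll()` is entered and is then missed.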