Re: Debugging buildfarm pg_upgrade check failures
От | Andrew Dunstan |
---|---|
Тема | Re: Debugging buildfarm pg_upgrade check failures |
Дата | |
Msg-id | 55B3F7C4.8000205@dunslane.net обсуждение исходный текст |
Ответ на | Debugging buildfarm pg_upgrade check failures (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
On 07/25/2015 10:59 AM, Tom Lane wrote: > Now that we've restored proper logging of "make check", I looked into > today's failure report from axolotl: > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-07-24%2020%3A29%3A18 > > What evidently happened there is that "pg_ctl start" gave up waiting for > the postmaster to start too soon. The postmaster log appears to contain > > LOG: database system was shut down at 2015-07-24 16:45:40 EDT > FATAL: the database system is starting up > LOG: MultiXact member wraparound protections are now enabled > LOG: database system is ready to accept connections > > which indicates that it did successfully come up, but not till after one > "PQping" probe from pg_ctl, which was rejected with "still starting up". > Meanwhile we've got this log output from pg_ctl: > > waiting for server to start........ stopped waiting > pg_ctl: could not start server > Examine the log output. > > Counting the dots indicates that pg_ctl gave up after precisely 5 seconds. > Now, looking at the logic in pg_ctl's test_postmaster_connection(), the > only explanation that seems to fit the observed output is that the stat() > on the postmaster pidfile (at line 650 in HEAD) failed. It's not clear > why though, since the postmaster was clearly still alive at this point, > and we must have been able to read the pidfile earlier to construct a > connection string, else there would have been no PQping attempt. > > Maybe the stat failed for some unexpected resource-exhaustion kind of > reason? > > It seems plausible to me that we should change pg_ctl to only consider > stat() failure to be a reason to give up waiting if errno is ENOENT, > not anything else. At a minimum, I'd like to modify it to print the > errno if it's anything else, so that we can confirm or deny this theory > next time we see this buildfarm failure. > > Comments? > > Certainly let's look at the errno. cheers andrdew
В списке pgsql-hackers по дате отправления: