Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0
От | Tom Lane |
---|---|
Тема | Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0 |
Дата | |
Msg-id | 2187263.1644616494@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0 (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0
|
Список | pgsql-bugs |
I wrote: >> Hmm, maybe. It sure *looks* like we need to do >> - return conn->asyncStatus == PGASYNC_BUSY || conn->write_failed; >> + return conn->asyncStatus == PGASYNC_BUSY && !conn->write_failed; > Dunno anything about the Windows angle, but after hacking together > a reproducer for this, I'm fairly convinced that this is indeed a bug. After further study, I now think that checking write_failed here is the wrong thing entirely, because it's irrelevant to what we want to do, which is keep reading until we've collected a server message or seen read EOF. What we do need to do though is check to see if we've closed the socket. That's because places like libpqwalreceiver.c assume that if PQisBusy is true, then PQsocket() must be valid: while (PQisBusy(streamConn)) { int rc; rc = WaitLatchOrSocket(MyLatch, WL_EXIT_ON_PM_DEATH | WL_SOCKET_READABLE | WL_LATCH_SET, PQsocket(streamConn), 0, WAIT_EVENT_LIBPQWALRECEIVER_RECEIVE); It seems like the existing coding in PQisBusy could allow that to fail. I scraped the buildfarm for instances of the "cannot wait on socket event without a socket" error that latch.c would emit if that happened, but found none going back three months. That's perhaps because of the other behavior I noted that if the walsender crashes, we'll probably report write failure immediately in PQsendQuery. So maybe this is mostly hypothetical in practice, but I think what we actually want here is as attached. In any case, I still don't see a way for this error to result in an infinite loop, so it doesn't seem like an explanation for the Windows problems. regards, tom lane diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c index 59121873d2..9afd4d88b4 100644 --- a/src/interfaces/libpq/fe-exec.c +++ b/src/interfaces/libpq/fe-exec.c @@ -1957,10 +1957,14 @@ PQisBusy(PGconn *conn) parseInput(conn); /* - * PQgetResult will return immediately in all states except BUSY, or if we - * had a write failure. + * PQgetResult will return immediately in all states except BUSY. Also, + * if we've detected read EOF and dropped the connection, we can expect + * that PQgetResult will fail immediately. Note that we do *not* check + * conn->write_failed here --- once that's become set, we know we have + * trouble, but we need to keep trying to read until we have a complete + * server message or detect read EOF. */ - return conn->asyncStatus == PGASYNC_BUSY || conn->write_failed; + return conn->asyncStatus == PGASYNC_BUSY && conn->status != CONNECTION_BAD; } /*
В списке pgsql-bugs по дате отправления: