Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0
From | Tom Lane |
---|---|
Subject | Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0 |
Date | |
Msg-id | 1113966.1644280235@sss.pgh.pa.us |
In response to | Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #17391: While using --with-ssl=openssl and PG_TEST_EXTRA='ssl' options, SSL tests fail on OpenBSD 7.0
|
List | pgsql-bugs |
I wrote:
> The seeming timing problem with the two CRL tests remains.

I spent some more time poking at this, and found that:

* There are three tests, not two, that intermittently fail.
  They are at 001_ssltests.pl lines 565, 608, 618.  It's suspicious
  that these are exactly the tests that expect to see "sslv3 alert"
  or "tlsv1 alert" conditions rather than anything higher-level;
  but I don't have any insight as to why that might be relevant.

* The failure occurs on the WRITE side, not the read side;
  the 'server closed the connection unexpectedly' message we
  see coming back from libpq is from pqsecure_raw_write.
  (I verified this by changing the texts of the various
  instances of that message.)

* If I make my_sock_write ignore EPIPE/ECONNRESET, as per
  the attached entirely-uncommittable patch, the errors go away.

I hypothesize that something about OpenBSD scheduling is allowing
the server to (sometimes) exit before the client-side openssl has
flushed all its buffers, and the client-side code doesn't handle
that well.  It's not very clear why this wouldn't be affecting
all users of OpenSSL, but there you have it.

While the attached is surely no good as a general patch, could
we get away with ignoring EPIPE/ECONNRESET in writes during
connection startup?  We'd notice the failure soon enough on the
read side if it's not this problem.  (This seems a bit related
to libpq's other hacks that postpone recognition of write
failures.)

By the by, today's fairywren failure [1] sure looks related:

#   Failed test 'intermediate client certificate is missing: matches'
#   at t/001_ssltests.pl line 608.
#   'psql: error: connection to server at "127.0.0.1", port 50577 failed: could not receive data from server:Software caused connection abort (0x00002745/10053)
# SSL SYSCALL error: Software caused connection abort (0x00002745/10053)
# could not send startup packet: No error (0x00000000/0)'
#   doesn't match '(?^:SSL error: tlsv1 alert unknown ca)'

This is evidently on the read not write side, so it's not quite
the same thing, but ...

			regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-02-07%2021%3A04%3A53

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 9f735ba437..11084a6a07 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -1697,6 +1697,10 @@ my_sock_write(BIO *h, const char *buf, int size)
 				BIO_set_retry_write(h);
 				break;
 
+			case EPIPE:
+			case ECONNRESET:
+				return size;
+
 			default:
 				break;
 		}
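
A minimal sketch of the narrower idea raised in the message (ignore EPIPE/ECONNRESET on writes only during connection startup, and leave it to the read side to report a real failure) might look like the following. This is an illustration only, not libpq code: the names WriteAction, classify_write_error, write_result_for_bio, and the in_startup flag are all hypothetical, and the sketch assumes a POSIX errno environment.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classification of a failed raw socket write. */
typedef enum
{
	WRITE_RETRY,			/* transient condition: caller should retry */
	WRITE_PRETEND_OK,		/* swallow the error; the next read will report it */
	WRITE_FAIL				/* report the failure immediately */
} WriteAction;

/*
 * Decide what to do with a failed write.  "in_startup" is an assumed flag
 * meaning the connection is still being established; libpq has no such
 * parameter in this exact form.
 */
WriteAction
classify_write_error(int err, bool in_startup)
{
	switch (err)
	{
#ifdef EAGAIN
		case EAGAIN:
#endif
#if defined(EWOULDBLOCK) && (!defined(EAGAIN) || (EWOULDBLOCK != EAGAIN))
		case EWOULDBLOCK:
#endif
		case EINTR:
			return WRITE_RETRY;

		case EPIPE:
		case ECONNRESET:
			/* Postpone recognition of the failure only during startup. */
			return in_startup ? WRITE_PRETEND_OK : WRITE_FAIL;

		default:
			return WRITE_FAIL;
	}
}

/*
 * Compute the value a write callback would hand back to its caller:
 * "res" is the raw write result, "size" the number of bytes requested.
 */
int
write_result_for_bio(int res, int size, int err, bool in_startup)
{
	if (res >= 0)
		return res;
	if (classify_write_error(err, in_startup) == WRITE_PRETEND_OK)
		return size;		/* claim everything was written */
	return res;
}
```

Unlike the posted patch, which returns size unconditionally for these two errno values, this variant would still surface EPIPE/ECONNRESET as write failures once the connection is established.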