Re: backtrace_on_internal_error
From | Andres Freund |
---|---|
Subject | Re: backtrace_on_internal_error |
Date | |
Msg-id | 20231208193316.5ylgs4zb6zngwyg4@awork3.anarazel.de |
In response to | Re: backtrace_on_internal_error (Andres Freund <andres@anarazel.de>) |
Responses | Re: backtrace_on_internal_error (Andres Freund <andres@anarazel.de>); Re: backtrace_on_internal_error (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Hi,

On 2023-12-08 10:51:01 -0800, Andres Freund wrote:
> On 2023-12-08 13:46:07 -0500, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > On 2023-12-08 13:23:50 -0500, Tom Lane wrote:
> > >> Hmm, don't suppose you have a way to reproduce that?
> >
> > > After a bit of trying, yes. I put an abort() into pgtls_open_client(), after
> > > initialize_SSL(). Connecting does result in:
> > > LOG: could not accept SSL connection: Success
> >
> > OK. I can dig into that, unless you're already on it?
>
> I think I figured it out. Looks like we need to translate a closed socket
> (recvfrom() returning 0) to ECONNRESET or such.

I think we might just need to expand the existing branch for EOF:

    if (r < 0)
        ereport(COMMERROR,
                (errcode_for_socket_access(),
                 errmsg("could not accept SSL connection: %m")));
    else
        ereport(COMMERROR,
                (errcode(ERRCODE_PROTOCOL_VIOLATION),
                 errmsg("could not accept SSL connection: EOF detected")));

The openssl docs say:

    The following return values can occur:

    0
        The TLS/SSL handshake was not successful but was shut down controlled
        and by the specifications of the TLS/SSL protocol. Call SSL_get_error()
        with the return value ret to find out the reason.

    1
        The TLS/SSL handshake was successfully completed, a TLS/SSL connection
        has been established.

    <0
        The TLS/SSL handshake was not successful because a fatal error occurred
        either at the protocol level or a connection failure occurred. The
        shutdown was not clean. It can also occur if action is needed to
        continue the operation for nonblocking BIOs. Call SSL_get_error() with
        the return value ret to find out the reason.

Which fits with my reproducer - due to the abort the connection was *not* shut
down via SSL in a controlled manner, therefore r < 0.

Hm, oddly enough, there's this tidbit in the SSL_get_error() manpage:

    On an unexpected EOF, versions before OpenSSL 3.0 returned
    SSL_ERROR_SYSCALL, nothing was added to the error stack, and errno was 0.
    Since OpenSSL 3.0 the returned error is SSL_ERROR_SSL with a meaningful
    error on the error stack.

But I reproduced this with 3.1.

Seems like we should just treat errno == 0 as a reason to emit the "EOF
detected" message?

I wonder if we should treat send/recv returning 0 different from an error
message perspective during an established connection. Right now we produce

    could not receive data from client: Connection reset by peer

because be_tls_read() sets errno to ECONNRESET - despite that not having been
returned by the OS. But I guess that's a topic for another day.

Greetings,

Andres Freund
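
The "treat errno == 0 as EOF" idea from the message could look roughly like the sketch below. This is not the PostgreSQL patch and does not call OpenSSL: accept_failure_message() and its parameters are hypothetical names, and strerror() stands in for the %m expansion that ereport() performs in the backend. It only illustrates the branch condition, assuming errno is captured immediately after the failed handshake call.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): build the log text for a failed
 * SSL accept in the SSL_ERROR_SYSCALL case.
 *
 *   r            return value of the handshake call
 *   saved_errno  errno as captured right after that call
 */
static const char *
accept_failure_message(int r, int saved_errno, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    /*
     * Only report the errno text when the call failed AND errno is set.
     * With r < 0 but errno == 0 (the case reproduced in the message),
     * strerror(0) would yield a misleading "Success", so fall through to
     * the EOF wording instead.
     */
    if (r < 0 && saved_errno != 0)
        snprintf(buf, buflen, "could not accept SSL connection: %s",
                 strerror(saved_errno));
    else
        snprintf(buf, buflen,
                 "could not accept SSL connection: EOF detected");

    return buf;
}
```

With this condition, the reproducer's combination (r < 0, errno 0) would log "EOF detected" rather than "Success", while a real socket error such as ECONNRESET would still be reported with its own text.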