Discussion: Rare SSL failures on eelpout


Rare SSL failures on eelpout

From: Thomas Munro
(Moving this discussion from the pgsql-committers thread "pgsql:
Update ssl test certificates and keys", which is innocent.)

On Wed, Jan 16, 2019 at 10:37 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Jan 4, 2019 at 10:08 AM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > Thanks.  FWIW I've just updated eelpout (a Debian testing BF animal
> > that runs all the extra tests including SSL) to use libssl-dev
> > (instead of libssl1.0-dev), and cleared its accache files.  Let's see
> > if that works...
>
> Since that upgrade (to libssl 1.1.1a-1), there have been a few
> intermittent failures in the SSL tests on that animal (thanks Tom for
> pointing that out off-list).

I reproduced this manually.  From time to time (less than 1% of the
times I tried), the server hangs up without sending a goodbye message,
thereby causing a failure to match the expected psql error message.
From the kernel's perspective on the psql side, a working-as-expected
case looks like this:

sendto(3, "\27\3\3\2\356\313\334\372\201\353\201h\204\353\240A\4\355\232\340\375\0X\220\326V\374\253\222i\2724"...,
1115, MSG_NOSIGNAL, NULL, 0) = 1115
ppoll([{fd=3, events=POLLOUT|POLLERR}], 1, NULL, NULL, 0) = 1 ([{fd=3,
revents=POLLOUT}])
sendto(3, "\27\3\3\0f\335\313\224\263\256r\371\215\177\273,N\345\342\200\257\r\321\6\323@\316\241\316\17\204\26"...,
107, MSG_NOSIGNAL, NULL, 0) = 107
ppoll([{fd=3, events=POLLIN|POLLERR}], 1, NULL, NULL, 0) = 1 ([{fd=3,
revents=POLLIN}])
recvfrom(3, "\27\3\3\0\23", 5, 0, NULL, NULL) = 5
recvfrom(3, "I\"\t;\3006\276\347\344]7>\2\234m\340]B\241", 19, 0,
NULL, NULL) = 19
close(3)                                = 0
write(2, "psql: SSL error: sslv3 alert cer"..., 49psql: SSL error:
sslv3 alert certificate revoked
) = 49

... and the unexpected failure case looks like this:

sendto(3, "\27\3\3\2\356/\254\307\277\342G?&BB\321\f\314\245\22\246U\337\263;\203f\302s\306\37\276"...,
1115, MSG_NOSIGNAL, NULL, 0) = 1115
ppoll([{fd=3, events=POLLOUT|POLLERR}], 1, NULL, NULL, 0) = 1 ([{fd=3,
revents=POLLOUT|POLLERR|POLLHUP}])
sendto(3, "\27\3\3\0f\177\335\20\305\353\234\306\211#\343\321\336\22111J\312\323F\210\6U\331\264GAr"...,
107, MSG_NOSIGNAL, NULL, 0) = -1 ECONNRESET (Connection reset by peer)
write(2, "psql: server closed the connecti"..., 199psql: server closed
the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
could not send startup packet: Connection reset by peer
) = 199

To the kernel on the server side, a good case looks like this:

[pid 13740] sendto(8,
"\27\3\3\0\23\252\21\227\2\232s\354\21\243\236\207\301\3B\341\253\306k\346",
24, 0, NULL, 0) = 24
[pid 13740] write(2, "2019-01-22 09:23:04.425 UTC [137"...,
1112019-01-22 09:23:04.425 UTC [13740] [unknown] LOG:  could not
accept SSL connection: certificate verify failed
) = 111
[pid 13740] exit_group(0)               = ?
[pid 13740] +++ exited with 0 +++

I didn't manage to trace a bad case on the server side (maybe the
strace interferes with the required timing), but I noticed that
SSL_shutdown()'s return code (which we don't check) is always -1, and
I noticed that SSL_get_error() after that is always SSL_ERROR_SSL.  I
wondered if we might somehow be reaching exit() when the alert
response is still buffered inside OpenSSL, but that doesn't seem right
-- the port has noblock=0 at that stage.

Hmm.  Why is psql doing two sendto() calls without reading a response
in between, when it's possible for the server to exit after the first,
anyway?  Seems like a protocol violation somewhere?

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Hmm.  Why is psql doing two sendto() calls without reading a response
> in between, when it's possible for the server to exit after the first,
> anyway?  Seems like a protocol violation somewhere?

Keep in mind this is all down inside the SSL handshake, so if any
protocol is being violated, it's theirs not ours.

My gut reaction is that this probably indicates that in the "certificate
verify failed" code path, we're exiting the server too soon without
letting openssl finish out its handshake fully.  But that could be all
wet, or even if true it might not be convenient to postpone exit (e.g.,
we'd have to save the SSL error code somewhere, I suspect).

The whole thing reminds me of the recent bug #15598:

https://www.postgresql.org/message-id/87k1iy44fd.fsf%40news-spur.riddles.org.uk

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Jan 23, 2019 at 4:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > Hmm.  Why is psql doing two sendto() calls without reading a response
> > in between, when it's possible for the server to exit after the first,
> > anyway?  Seems like a protocol violation somewhere?
>
> Keep in mind this is all down inside the SSL handshake, so if any
> protocol is being violated, it's theirs not ours.

The sendto() of 1115 bytes is SSL_connect()'s last syscall, just
before it returns 1 to indicate success (even though it wasn't
successful?), without waiting for a further response.  The sendto() of
107 bytes is our start-up packet, which either succeeds and is
followed by reading a "certificate revoked" message from the server,
or fails with ECONNRESET if the socket has already been shut down at
the server end due to the racing exit.

It seems very strange to me that the error report is deferred until we
send our start-up packet.  It seems like a response that belongs to
the connection attempt, not our later data sending.  Bug in OpenSSL?
Unintended consequence of our switch to blocking IO at that point?

I tried to find out how this looked on 1.0.2, but it looks like Debian
has just removed the older version from the buster distro and I'm out
of time to hunt this on other OSes today.

> The whole thing reminds me of the recent bug #15598:
>
> https://www.postgresql.org/message-id/87k1iy44fd.fsf%40news-spur.riddles.org.uk

Yeah, if errors get moved to later exchanges but the server might exit
and close its end of the socket before we can manage to initiate a
later exchange, it starts to look just like that.

A less interesting bug is the appearance of 3 nonsensical "Success"
(glibc) or "No error: 0" (FreeBSD) error messages in the server logs
on systems running OpenSSL 1.1.1, much like this, which I guess might
mean EOF:

https://www.postgresql.org/message-id/CAEepm=3cc5wYv=X4Nzy7VOUkdHBiJs9bpLzqtqJWxdDUp5DiPQ@mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Jan 23, 2019 at 11:23 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Jan 23, 2019 at 4:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The whole thing reminds me of the recent bug #15598:
> >
> > https://www.postgresql.org/message-id/87k1iy44fd.fsf%40news-spur.riddles.org.uk
>
> Yeah, if errors get moved to later exchanges but the server might exit
> and close its end of the socket before we can manage to initiate a
> later exchange, it starts to look just like that.

Based on some clues from Andrew Gierth (in the email referenced above
and also in an off-list chat), I did some experimentation that seemed
to confirm a theory of his that Linux might be taking a shortcut when
both sides are local, bypassing the RST step because it can see both
ends (whereas normally the TCP stack should cause the *next* sendto()
to fail IIUC?).  I think this case is essentially the same as bug
#15598, it's just happening at a different time.

With a simple socket test program I can see that if you send a single
packet after the remote end has closed and after it had already read
everything you sent it up to now, you get EPIPE.  If there was some
outstanding data from a previous send that it hadn't read yet when it
closed its end, you get ECONNRESET.  This doesn't happen if client and
server are on different machines, or on FreeBSD even on the same
machine, but does happen if client and server are on the same Linux
system (whether using the loopback interface or a real network
interface).  However, after you get ECONNRESET, you can still read the
final data that was sent by the server before it closed, which
presumably contains the error we want to report to the user.  That
suggests that we could perhaps handle ECONNRESET both at startup
packet send time (for certificate rejection, eelpout's case) and at
initial query send (for idle timeout, bug #15598's case) by attempting
to read.  Does that make sense?  I haven't poked into the libpq state
machine stuff to see if that would be easy or hard.
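
A minimal sketch of a socket test along those lines (not the exact
program used; Python 2.7, with made-up addresses, payloads and timings,
assuming Linux with both ends on the same host):

import os
import socket
import time

def run(leave_data_unread):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    if os.fork() == 0:
        # Child acts as the server.
        conn, _ = listener.accept()
        if not leave_data_unread:
            conn.recv(1024)        # drain the client's first send
        conn.send("goodbye")       # parting words
        conn.close()               # close; unread input should cause RST
        os._exit(0)
    listener.close()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("127.0.0.1", port))
    s.send("hello")
    time.sleep(1)                  # let the server close and exit
    try:
        s.send("world")            # send after the remote end has closed
        print("second send: ok")
    except socket.error as e:
        print("second send: %s" % e)    # EPIPE or ECONNRESET, per the above
    print("still readable: %r" % s.recv(1024))  # parting message survives?
    s.close()
    os.wait()

run(leave_data_unread=True)        # expect ECONNRESET on local Linux
run(leave_data_unread=False)       # expect EPIPE on local Linux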

PS: looking again at the strace output from earlier, it's kinda funny
that it says revents=POLLOUT|POLLERR|POLLHUP, since that seems to be a
contradiction: if this were poll() and not ppoll() I think it might
violate POSIX which says "[POLLHUP] and POLLOUT are
mutually-exclusive; a stream can never be writable if a hangup has
occurred", but I don't see what we could do differently with that
anyway.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> With a simple socket test program I can see that if you send a single
> packet after the remote end has closed and after it had already read
> everything you sent it up to now, you get EPIPE.  If there was some
> outstanding data from a previous send that it hadn't read yet when it
> closed its end, you get ECONNRESET.  This doesn't happen if client and
> server are on different machines, or on FreeBSD even on the same
> machine, but does happen if client and server are on the same Linux
> system (whether using the loopback interface or a real network
> interface).  However, after you get ECONNRESET, you can still read the
> final data that was sent by the server before it closed, which
> presumably contains the error we want to report to the user.  That
> suggests that we could perhaps handle ECONNRESET both at startup
> packet send time (for certificate rejection, eelpout's case) and at
> initial query send (for idle timeout, bug #15598's case) by attempting
> to read.  Does that make sense?

Hmm ... it definitely makes sense that we shouldn't assume that a *write*
failure means there is nothing left to *read*.  I would say that's
generally true regardless of the specific errno.  How about the opposite
case, btw --- should we continue to try to flush data after a read
error?  Or is it safe to assume that means the other end is gone?

> I haven't poked into the libpq state
> machine stuff to see if that would be easy or hard.

Me either.

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Tom Lane
I wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> That suggests that we could perhaps handle ECONNRESET both at startup
>> packet send time (for certificate rejection, eelpout's case) and at
>> initial query send (for idle timeout, bug #15598's case) by attempting
>> to read.  Does that make sense?

> Hmm ... it definitely makes sense that we shouldn't assume that a *write*
> failure means there is nothing left to *read*.

After staring at the code for awhile, I am thinking that there may be
a bug of that ilk, but if so it's down inside OpenSSL.  Perhaps it's
specific to the OpenSSL version you're using on eelpout?  There is
not anything we could do differently in libpq, AFAICS, because it's
OpenSSL's responsibility to read any data that might be available.

I also looked into the idea that we're doing something wrong on the
server side, allowing the final error message to not get flushed out.
A plausible theory there is that SSL_shutdown is returning a WANT_READ
or WANT_WRITE error and we should retry it ... but that doesn't square
with your observation upthread that it's returning SSL_ERROR_SSL.

It's all very confusing, but I think there's a nontrivial chance
that this is an OpenSSL bug, especially since we haven't been able
to replicate it elsewhere.

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Tue, Mar 5, 2019 at 10:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Thomas Munro <thomas.munro@gmail.com> writes:
> >> That suggests that we could perhaps handle ECONNRESET both at startup
> >> packet send time (for certificate rejection, eelpout's case) and at
> >> initial query send (for idle timeout, bug #15598's case) by attempting
> >> to read.  Does that make sense?
>
> > Hmm ... it definitely makes sense that we shouldn't assume that a *write*
> > failure means there is nothing left to *read*.
>
> After staring at the code for awhile, I am thinking that there may be
> a bug of that ilk, but if so it's down inside OpenSSL.  Perhaps it's
> specific to the OpenSSL version you're using on eelpout?  There is
> not anything we could do differently in libpq, AFAICS, because it's
> OpenSSL's responsibility to read any data that might be available.
>
> I also looked into the idea that we're doing something wrong on the
> server side, allowing the final error message to not get flushed out.
> A plausible theory there is that SSL_shutdown is returning a WANT_READ
> or WANT_WRITE error and we should retry it ... but that doesn't square
> with your observation upthread that it's returning SSL_ERROR_SSL.
>
> It's all very confusing, but I think there's a nontrivial chance
> that this is an OpenSSL bug, especially since we haven't been able
> to replicate it elsewhere.

Hmm.  Yes, it is strange that we haven't seen it elsewhere, but
remember that very few animals are running the ssl tests; also it
requires particular timing to hit.

OK, here's something.  I can reproduce it quite easily on this
machine, and I can fix it like this:

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f29202db5f..e9c137f1bd 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2705,6 +2705,7 @@ keep_going:                     /* We will come back to here until there is
                                           libpq_gettext("could not send startup packet: %s\n"),
                                           SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
                                         free(startpacket);
+                                       pqHandleSendFailure(conn);
                                         goto error_return;
                                 }

If I add some printf debugging in there, I can see that block being
reached every hundred or so times I try to connect with a revoked
certificate, and with that extra call to pqHandleSendFailure() in
there the error comes out as it should:

psql: SSL error: sslv3 alert certificate revoked

Now I'm confused because we already have handling like that in
PQsendQuery(), so I can't explain bug #15598.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Mar 5, 2019 at 10:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's all very confusing, but I think there's a nontrivial chance
>> that this is an OpenSSL bug, especially since we haven't been able
>> to replicate it elsewhere.

> Hmm.  Yes, it is strange that we haven't seen it elsewhere, but
> remember that very few animals are running the ssl tests; also it
> requires particular timing to hit.

True.  I've spent some time today running the ssl tests on various
machines here, without any luck reproducing.

> OK, here's something.  I can reproduce it quite easily on this
> machine, and I can fix it like this:

>    libpq_gettext("could not send startup packet: %s\n"),
>    SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
>                                         free(startpacket);
> +                                       pqHandleSendFailure(conn);
>                                         goto error_return;
>                                 }

Yeah.  But I don't like pqHandleSendFailure all that much: it has strong
constraints on what state libpq has to be in, as a consequence of which
it's called from a bunch of ad-hoc places, and can't be called from
some others.  It's kind of accidental that it will work here.

I was toying with the idea of getting rid of that function in
favor of a design like this:

* Handle any socket-write failure at some low level of libpq by
recording the fact that the error occurred (plus whatever data we
need to produce a message about it), and then starting to just
discard output data.

* Eventually, we'll try to read.  Process any available input data
as usual (and, if we get a read error, report that).  When no more
input data is available, if a socket write failure has been recorded,
report that much as if it were an incoming ERROR message, and then
shut down the connection.

This would essentially give us pqHandleSendFailure-like behavior
across the board, which might be enough to fix this problem as well as
bug #15598.  Or not ... as you say, we haven't thoroughly understood
either issue, so it's possible this wouldn't do it.

This would have the effect of prioritizing reports of socket read
failures over those of write failures, which is an interesting
behavioral change, but offhand I can't see a problem with it.
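
As a toy model of that flow (purely illustrative Python, not libpq's
actual API; the class and method names are invented):

import socket

class Conn(object):
    def __init__(self, sock):
        self.sock = sock
        self.write_failed = False
        self.write_err_msg = None

    def send(self, data):
        # Low level: record a write failure and start discarding output.
        if self.write_failed:
            return
        try:
            self.sock.sendall(data)
        except socket.error as e:
            self.write_failed = True
            self.write_err_msg = str(e)

    def get_result(self):
        # Read side: drain whatever the server managed to send us first.
        try:
            reply = self.sock.recv(4096)
        except socket.error as e:
            return "read error: %s" % e     # read errors take priority
        if reply:
            return reply                    # e.g. the server's parting ERROR
        if self.write_failed:
            return "write error: %s" % self.write_err_msg
        return "EOF"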

One thing that isn't real clear to me is how much timing sensitivity
there is in "when no more input data is available".  Can we assume that
if we've gotten ECONNRESET or an allied error from a write, then any
data the far end might've wished to send us is already available to
read?  In principle, since TCP allows shutting down either transmission
direction independently, the server could close the read side of its
socket while continuing to transmit data.  In practice, PG servers
don't do that; but could network timing glitches create the same effect?
Even if it's possible, does it happen enough to worry about?

The reason I'm concerned is that I don't think it'd be bright to ignore a
send error until we see input EOF, which'd be the obvious way to solve a
timing problem if there is one.  If our send error was some transient
glitch and the connection is really still open, then we might not get EOF,
but we won't get a server reply either because no message went to the
server.  You could imagine waiting some small interval of time before
deciding that it's time to report the write failure, but ugh.

In any case, consuming immediately-available data before reporting the
write error would already be a step forward over what we're doing now,
I think.

Thoughts?

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Tue, Mar 5, 2019 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > OK, here's something.  I can reproduce it quite easily on this
> > machine, and I can fix it like this:
>
> >    libpq_gettext("could not send startup packet: %s\n"),
> >    SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
> >                                         free(startpacket);
> > +                                       pqHandleSendFailure(conn);
> >                                         goto error_return;
> >                                 }
>
> Yeah.  But I don't like pqHandleSendFailure all that much: it has strong
> constraints on what state libpq has to be in, as a consequence of which
> it's called from a bunch of ad-hoc places, and can't be called from
> some others.  It's kind of accidental that it will work here.
>
> I was toying with the idea of getting rid of that function in
> favor of a design like this:
>
> * Handle any socket-write failure at some low level of libpq by
> recording the fact that the error occurred (plus whatever data we
> need to produce a message about it), and then starting to just
> discard output data.
>
> * Eventually, we'll try to read.  Process any available input data
> as usual (and, if we get a read error, report that).  When no more
> input data is available, if a socket write failure has been recorded,
> report that much as if it were an incoming ERROR message, and then
> shut down the connection.
>
> This would essentially give us pqHandleSendFailure-like behavior
> across the board, which might be enough to fix this problem as well as
> bug #15598.  Or not ... as you say, we haven't thoroughly understood
> either issue, so it's possible this wouldn't do it.
>
> This would have the effect of prioritizing reports of socket read
> failures over those of write failures, which is an interesting
> behavioral change, but offhand I can't see a problem with it.

That seems to recreate (and extend) the behaviour that we see on
other TCP stacks, or on Linux with a remote connection, namely that
the first sendto() succeeds (even though future sendto() calls may
fail, we don't usually get to that because we read and discover an
application-level error message or whatever).

> One thing that isn't real clear to me is how much timing sensitivity
> there is in "when no more input data is available".  Can we assume that
> if we've gotten ECONNRESET or an allied error from a write, then any
> data the far end might've wished to send us is already available to
> read?  In principle, since TCP allows shutting down either transmission
> direction independently, the server could close the read side of its
> socket while continuing to transmit data.  In practice, PG servers
> don't do that; but could network timing glitches create the same effect?
> Even if it's possible, does it happen enough to worry about?

That is beyond my knowledge of networking; I have CC'd Andrew Gierth
in case he has something to say about that.  It certainly seems
sensible to assume that eg RST must follow any data that the other end
sent before closing, if it did indeed call close(), and in our case we
know that it never calls shutdown(SHUT_RD), so the only other
possibility seems to be that it crashed or the connection was lost.
So it seems safe to assume that we can read the server's parting words
after we've had an EPIPE/ECONNRESET error.  Based on some quick
googling, it looks like SHUT_RD doesn't actually send anything anyway
(unlike SHUT_WR which sends FIN and would result in EOF at the other
end), so I'm not sure if SHUT_RD even "exists" outside the mind of the
TCP stack doing it and therefore I'm not sure if there is any "timing
glitch" that could resemble it.  But I don't know.

> The reason I'm concerned is that I don't think it'd be bright to ignore a
> send error until we see input EOF, which'd be the obvious way to solve a
> timing problem if there is one.  If our send error was some transient
> glitch and the connection is really still open, then we might not get EOF,
> but we won't get a server reply either because no message went to the
> server.  You could imagine waiting some small interval of time before
> deciding that it's time to report the write failure, but ugh.
>
> In any case, consuming immediately-available data before reporting the
> write error would already be a step forward over what we're doing now,
> I think.
>
> Thoughts?

It seems reasonable to me on the grounds that sending data doesn't
mean the other guy got it anyway and we can see that on other TCP
stacks.  Preferring errors resulting from reads seems sensible because
it's exactly what we want in these cases, and I can't immediately
think of a downside.  It's bigger surgery than I was thinking of but
it seems like it removes some pre-existing complicated code and
replaces it with something simple, so that seems to be a win.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Tue, Mar 5, 2019 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> True.  I've spent some time today running the ssl tests on various
> machines here, without any luck reproducing.

BTW, I went looking for other failures on the buildfarm and noticed that
even for eelpout it's only happening on master and REL_11_STABLE:

https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=200&branch=&stage=sslCheck&filter=Submit

Disappointingly, that turned out to be just because 10 and earlier
didn't care what the error message said.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> BTW, I went looking for other failures on the buildfarm and noticed that
> even for eelpout it's only happening on master and REL_11_STABLE:

Yeah, I'd noticed that.

> Disappointingly, that turned out to be just because 10 and earlier
> didn't care what the error message said.

That is, you can reproduce the failure on old branches?  That lets
out a half-theory I'd had, which was that Andres' changes to make
the backend always run its socket in nonblock mode had had something
to do with it.  (Those changes do represent a plausible reason why
SSL_shutdown might be returning WANT_READ/WANT_WRITE; but I'm not
in a hurry to add such code without evidence that it actually
happens and something useful would change if we retry.)

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 3:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Disappointingly, that turned out to be just because 10 and earlier
> > didn't care what the error message said.
>
> That is, you can reproduce the failure on old branches?  That lets
> out a half-theory I'd had, which was that Andres' changes to make
> the backend always run its socket in nonblock mode had had something
> to do with it.  (Those changes do represent a plausible reason why
> SSL_shutdown might be returning WANT_READ/WANT_WRITE; but I'm not
> in a hurry to add such code without evidence that it actually
> happens and something useful would change if we retry.)

Yes, on REL_10_STABLE:

$ for i in `seq 1 1000` ; do
    psql "host=localhost port=56024 dbname=certdb user=postgres sslcert=ssl/client-revoked.crt sslkey=ssl/client-revoked.key"
  done
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
...
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
could not send startup packet: Connection reset by peer
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked
psql: SSL error: sslv3 alert certificate revoked

Zooming in with strace:

sendto(3, "\27\3\3\2\356\r\214\352@\21\320\202\236}\376\367\262\227\177\255\212\204`q\254\108\326\201+c)"...,
1115, MSG_NOSIGNAL, NULL, 0) = 1115
ppoll([{fd=3, events=POLLOUT|POLLERR}], 1, NULL, NULL, 0) = 1 ([{fd=3,
revents=POLLOUT|POLLERR|POLLHUP}])
sendto(3, "\27\3\3\0cW_\210\337Q\227\360\216k\221\346\372pw\27\325P\203\357\245km\304Rx\355\200"...,
104, MSG_NOSIGNAL, NULL, 0) = -1 ECONNRESET (Connection reset by peer)

You can see that poll() already knew the other end had closed the
socket.  Since this is clearly timing... let's see, yeah, I can make
it fail every time by adding sleep(1) before the comment "Send the
startup packet.".  I assume that'll work on any Linux machine?

To set this test up, I ran a server with the following config:

ssl=on
ssl_ca_file='root+client_ca.crt'
ssl_cert_file='server-cn-only.crt'
ssl_key_file='server-cn-only.key'
ssl_crl_file='root+client.crl'

I copied those files out of src/test/ssl/ssl/.  Then I ran the psql
command shown earlier.  I think I had to chmod 600 the keys.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> You can see that poll() already knew the other end had closed the
> socket.  Since this is clearly timing... let's see, yeah, I can make
> it fail every time by adding sleep(1) before the comment "Send the
> startup packet.".  I assume that'll work on any Linux machine?

Great idea, but no cigar --- doesn't do anything for me except make
the ssl test really slow.  (I tried it on RHEL6 and Fedora 28 and, just
for luck, current macOS.)  What this seems to prove is that the thing
that's different about eelpout is the particular kernel it's running,
and that that kernel has some weird timing behavior in this situation.

I've also been experimenting with reducing libpq's SO_SNDBUF setting
on the socket, with more or less the same idea of making the sending
of the startup packet slower.  No joy there either.

Annoying.  I'd be happier about writing code to fix this if I could
reproduce it :-(

            regards, tom lane

PS: but now I'm wondering about trying other non-Linux kernels.


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 6:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > You can see that poll() already knew the other end had closed the
> > socket.  Since this is clearly timing... let's see, yeah, I can make
> > it fail every time by adding sleep(1) before the comment "Send the
> > startup packet.".  I assume that'll work on any Linux machine?
>
> Great idea, but no cigar --- doesn't do anything for me except make
> the ssl test really slow.  (I tried it on RHEL6 and Fedora 28 and, just
> for luck, current macOS.)  What this seems to prove is that the thing
> that's different about eelpout is the particular kernel it's running,
> and that that kernel has some weird timing behavior in this situation.
>
> I've also been experimenting with reducing libpq's SO_SNDBUF setting
> on the socket, with more or less the same idea of making the sending
> of the startup packet slower.  No joy there either.
>
> Annoying.  I'd be happier about writing code to fix this if I could
> reproduce it :-(

Hmm.  Note that eelpout only started doing it with OpenSSL 1.1.1.  But
I just tried the sleep(1) trick on an x86 box running the same version
of Debian, OpenSSL etc and it didn't work.  So eelpout (a super cheap
virtualised 4-core ARMv8 system rented from scaleway.com running
Debian Buster with a kernel identifying itself as 4.9.23-std-1 and
libc6 2.28-7) is indeed starting to look pretty weird.  Let me know if
you want to log in and experiment on that machine.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 7:05 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Mar 6, 2019 at 6:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Annoying.  I'd be happier about writing code to fix this if I could
> > reproduce it :-(
>
> Hmm.  Note that eelpout only started doing it with OpenSSL 1.1.1.

Bleugh.  I think this OpenSSL package might just be buggy on ARM.  On
x86, apparently the same version of OpenSSL and all other details of
the test the same, I can see that SSL_connect() returns <= 0
(failure), and then we ask for that cert revoked message directly and
never even reach the startup packet sending code.  On ARM,
SSL_connect() returns 1 (success) and then we proceed as discussed and
eventually get the error later (or not).  So I think I should figure
out a minimal repro and report this to them.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> Bleugh.  I think this OpenSSL package might just be buggy on ARM.  On
> x86, apparently the same version of OpenSSL and all other details of
> the test the same, I can see that SSL_connect() returns <= 0
> (failure), and then we ask for that cert revoked message directly and
> never even reach the startup packet sending code.  On ARM,
> SSL_connect() returns 1 (success) and then we proceed as discussed and
> eventually get the error later (or not).  So I think I should figure
> out a minimal repro and report this to them.

Yeah, I've still been unable to reproduce even with the sleep idea,
so eelpout is definitely looking like a special snowflake from here.
In any case, there seems little doubt that getting past SSL_connect()
when the cert check has failed is an OpenSSL bug; I don't feel a need
to create a workaround for it.

The bug #15598 report is more troublesome, as we don't have a strong
reason to believe it's not common on Windows.  However, I wonder whether
we can really do anything at all about that one.  If I understand what
Andrew was hypothesizing in that thread, it was that Windows might be
dropping undelivered data on the floor once the server closes its end
of the connection.  That would be totally broken behavior, but I never
expect anything else from Microsoft :-(.  If that is an accurate theory
then rewriting libpq won't fix it.

I still think the redesign I suggested upthread would make things cleaner,
but I don't have time or interest to make it happen in the near future
if it's not fixing an observable bug.

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 9:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The bug #15598 report is more troublesome, as we don't have a strong
> reason to believe it's not common on Windows.  However, I wonder whether
> we can really do anything at all about that one.  If I understand what
> Andrew was hypothesizing in that thread, it was that Windows might be
> dropping undelivered data on the floor once the server closes its end
> of the connection.  That would be totally broken behavior, but I never
> expect anything else from Microsoft :-(.  If that is an accurate theory
> then rewriting libpq won't fix it.

Here is a stupid Python 2.7 program to try to test that.  Run one copy
of it like this:

$ python ./test.py --server

The server will wait for a client, send a message immediately, and
then close the socket after a second.  The client will connect and
send something once before and twice after the server closes the
socket, and finally see if it can read the message from the server.

Here's the output I get from the client on some different systems (the
server was running on the same system):

$ uname -a
Linux debian 4.18.0-3-amd64 #1 SMP Debian 4.18.20-2 (2018-11-23)
x86_64 GNU/Linux
$ python ./test.py --client
Sending A...
2
Sending B...
[Errno 104] Connection reset by peer
Sending C...
[Errno 32] Broken pipe
This is the server saying goodbye

$ uname -a
Darwin macaque.local 18.2.0 Darwin Kernel Version 18.2.0: Thu Dec 20
20:46:53 PST 2018; root:xnu-4903.241.1~1/RELEASE_X86_64 x86_64
$ python2.7 ./test.py --client
Sending A...
2
Sending B...
[Errno 32] Broken pipe
Sending C...
[Errno 32] Broken pipe
This is the server saying goodbye

$ uname -a
FreeBSD dogmatix 13.0-CURRENT FreeBSD 13.0-CURRENT c0873ea614a(master)
GENERIC  amd64
$ python2.7 ./test.py --client
Sending A...
2
Sending B...
2
Sending C...
2
This is the server saying goodbye

So... can anyone tell us what happens on Windows?

(A secondary question might be what happens if the server and client
are on different machines since I guess it could be different?)

-- 
Thomas Munro
https://enterprisedb.com

Attachments

Re: Rare SSL failures on eelpout

From: Shawn Debnath
On Wed, Mar 06, 2019 at 11:13:31AM +1300, Thomas Munro wrote:
> So... can anyone tell us what happens on Windows?

Voila!

C:\Users\Shawn Debnath\Desktop>systeminfo
OS Name:                   Microsoft Windows 10 Pro
OS Version:                10.0.17763 N/A Build 17763
[...]

C:\Users\Shawn Debnath\Desktop>c:\Python27\python.exe tmunro-ssl-test.py --server


C:\Users\Shawn Debnath\Desktop>c:\Python27\python.exe tmunro-ssl-test.py --client
Sending A...
2
Sending B...
[Errno 10054] An existing connection was forcibly closed by the remote host
Sending C...
[Errno 10054] An existing connection was forcibly closed by the remote host
Traceback (most recent call last):
  File "tmunro-ssl-test.py", line 57, in <module>
    client()
  File "tmunro-ssl-test.py", line 51, in client
    print s.recv(1024)
socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host


-- 
Shawn Debnath
Amazon Web Services (AWS)


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 4:07 PM Shawn Debnath <sdn@amazon.com> wrote:
> On Wed, Mar 06, 2019 at 11:13:31AM +1300, Thomas Munro wrote:
> > So... can anyone tell us what happens on Windows?

> C:\Users\Shawn Debnath\Desktop>c:\Python27\python.exe tmunro-ssl-test.py --client
> Sending A...
> 2
> Sending B...
> [Errno 10054] An existing connection was forcibly closed by the remote host
> Sending C...
> [Errno 10054] An existing connection was forcibly closed by the remote host
> Traceback (most recent call last):
>   File "tmunro-ssl-test.py", line 57, in <module>
>     client()
>   File "tmunro-ssl-test.py", line 51, in client
>     print s.recv(1024)
> socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host

Thanks!  Wow, so not only can we not read the final message sent by
the server, if we try we get an error.  That's... not what we want.  I
wonder if there might be a way to put the socket into don't-do-that
mode...

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Wed, Mar 6, 2019 at 9:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Bleugh.  I think this OpenSSL package might just be buggy on ARM.  On
> > x86, apparently the same version of OpenSSL and all other details of
> > the test the same, I can see that SSL_connect() returns <= 0
> > (failure), and then we ask for that cert revoked message directly and
> > never even reach the startup packet sending code.  On ARM,
> > SSL_connect() returns 1 (success) and then we proceed as discussed and
> > eventually get the error later (or not).  So I think I should figure
> > out a minimal repro and report this to them.
>
> Yeah, I've still been unable to reproduce even with the sleep idea,
> so eelpout is definitely looking like a special snowflake from here.
> In any case, there seems little doubt that getting past SSL_connect()
> when the cert check has failed is an OpenSSL bug; I don't feel a need
> to create a workaround for it.

I was wrong: it breaks on an x86 system for me too (either with the
sleep, or with clunky scheduling, I was running psql under strace when
I saw it).  Not sure what I did wrong last time I tried that.  I
opened a bug report with OpenSSL, let's see what they say:

https://github.com/openssl/openssl/issues/8500

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Thomas Munro
On Sun, Mar 17, 2019 at 2:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Mar 6, 2019 at 9:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Yeah, I've still been unable to reproduce even with the sleep idea,
> > so eelpout is definitely looking like a special snowflake from here.
> > In any case, there seems little doubt that getting past SSL_connect()
> > when the cert check has failed is an OpenSSL bug; I don't feel a need
> > to create a workaround for it.
>
> I was wrong: it breaks on an x86 system for me too (either with the
> sleep, or with clunky scheduling, I was running psql under strace when
> I saw it).  Not sure what I did wrong last time I tried that.  I
> opened a bug report with OpenSSL, let's see what they say:
>
> https://github.com/openssl/openssl/issues/8500

This was an intentional change in TLS1.3, reducing round trips by
verifying the client certificate later.

I'm pretty sure the fix I mentioned earlier -- namely adding an ad-hoc
call to pqHandleSendFailure() if we fail to send the start-up packet
-- would fix eelpout's measles (though obviously wouldn't solve the
problem for Windows given what we have learned about its TCP
implementation).  I should probably go and do that, unless you want to
write the more general handling for send failure you described, and
are prepared to back-patch it?

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From: Tom Lane
Thomas Munro <thomas.munro@gmail.com> writes:
> On Sun, Mar 17, 2019 at 2:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>> I opened a bug report with OpenSSL, let's see what they say:
>> https://github.com/openssl/openssl/issues/8500

> This was an intentional change in TLS1.3, reducing round trips by
> verifying the client certificate later.

Ugh.  So probably we can reproduce it elsewhere if we use cutting-edge
OpenSSL versions.

> I'm pretty sure the fix I mentioned earlier -- namely adding an ad-hoc
> call to pqHandleSendFailure() if we fail to send the start-up packet
> -- would fix eelpout's measles (though obviously wouldn't solve the
> problem for Windows given what we have learned about its TCP
> implementation).  I should probably go and do that, unless you want to
> write the more general handling for send failure you described, and
> are prepared to back-patch it?

Well, I'm not sure about the back-patching aspect of that, but maybe
I should write the patch and then we should see how risky it looks.
Give me a few days ...

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Tom Lane
I wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> This was an intentional change in TLS1.3, reducing round trips by
>> verifying the client certificate later.

> Ugh.  So probably we can reproduce it elsewhere if we use cutting-edge
> OpenSSL versions.

I installed OpenSSL 1.1.1a on my Mac laptop.  I got through 100 cycles
of the ssl tests without a problem, which is not too surprising because
longfin has been running on pretty much the exact same software stack
since late November, and it has not shown the problem.  However ...
I threw in the sleep() where you advised in fe-connect.c, and kaboom!

t/001_ssltests.pl .. 67/75
#   Failed test 'certificate authorization fails with revoked client cert: matches'
#   at t/001_ssltests.pl line 375.
#                   'psql: server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.
# could not send startup packet: Broken pipe
# '
#     doesn't match '(?^:SSL error)'
t/001_ssltests.pl .. 74/75
#   Failed test 'intermediate client certificate is missing: matches'
#   at t/001_ssltests.pl line 411.
#                   'psql: server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.
# could not send startup packet: Broken pipe
# '
#     doesn't match '(?^:SSL error)'
# Looks like you failed 2 tests of 75.
t/001_ssltests.pl .. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/75 subtests
t/002_scram.pl ..... ok

It seems quite repeatable this way.

So that confirms that it's the OpenSSL version that is critical,
and that you need a very new version to make it fail.

I shall now see about fixing it...

            regards, tom lane


Re: Rare SSL failures on eelpout

From: Tom Lane
I wrote:
> ... I don't like pqHandleSendFailure all that much: it has strong
> constraints on what state libpq has to be in, as a consequence of which
> it's called from a bunch of ad-hoc places, and can't be called from
> some others.  It's kind of accidental that it will work here.
> I was toying with the idea of getting rid of that function in
> favor of a design like this:
> * Handle any socket-write failure at some low level of libpq by
> recording the fact that the error occurred (plus whatever data we
> need to produce a message about it), and then starting to just
> discard output data.
> * Eventually, we'll try to read.  Process any available input data
> as usual (and, if we get a read error, report that).  When no more
> input data is available, if a socket write failure has been recorded,
> report that much as if it were an incoming ERROR message, and then
> shut down the connection.
> This would essentially give us pqHandleSendFailure-like behavior
> across the board, which might be enough to fix this problem as well as
> bug #15598.  Or not ... as you say, we haven't thoroughly understood
> either issue, so it's possible this wouldn't do it.

Here's a draft patch along that line.  It's gratifyingly small, and
it does fix the SSL problem for me.  I have no way of investigating
whether it fixes bug #15598, but it might.

Aside from the SSL cert-verify issue, I've checked the behavior when
the backend is shut down ("pg_ctl stop") between queries, and when
the backend is kill 9'd mid-query.  The shutdown case reacts well with
or without SSL:

regression=# select 2+2;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The backend-crash case is fine without SSL:

regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

but a bit less nice with it:

regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
SSL SYSCALL error: EOF detected
The connection to the server was lost. Attempting reset: Succeeded.

That seems cosmetic --- we just haven't expended the same effort on
what to print for connection EOF with SSL as without it.  (Also,
you get that same result without this patch.)

Of course, this hardly counts as exhaustive testing, especially since
I only tried one OpenSSL version.

> One thing that isn't real clear to me is how much timing sensitivity
> there is in "when no more input data is available".  Can we assume that
> if we've gotten ECONNRESET or an allied error from a write, then any
> data the far end might've wished to send us is already available to
> read?  In principle, since TCP allows shutting down either transmission
> direction independently, the server could close the read side of its
> socket while continuing to transmit data.  In practice, PG servers
> don't do that; but could network timing glitches create the same effect?
> Even if it's possible, does it happen enough to worry about?
> The reason I'm concerned is that I don't think it'd be bright to ignore a
> send error until we see input EOF, which'd be the obvious way to solve a
> timing problem if there is one.  If our send error was some transient
> glitch and the connection is really still open, then we might not get EOF,
> but we won't get a server reply either because no message went to the
> server.  You could imagine waiting some small interval of time before
> deciding that it's time to report the write failure, but ugh.

As written, this patch does seem vulnerable to this concern, if it's real.
We could address it for post-connection failures by tweaking PQgetResult
so that it doesn't call pqWait if there's already a write error recorded;
but I'm far from sure that that would be a net improvement.  Also, making
a comparable change to the behavior of the PQconnectPoll state machine
might be tricky.

My current feeling is that this is OK to put in HEAD but I think the
risk-reward ratio isn't very good for the back branches.  Even with
an OpenSSL version where this makes a difference, the problematic
behavior is pretty hard to hit.  So I'm a bit inclined to do nothing
in the back branches.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c96a52b..e3bf6a7 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -537,6 +537,10 @@ pqDropServerData(PGconn *conn)
     conn->last_sqlstate[0] = '\0';
     conn->auth_req_received = false;
     conn->password_needed = false;
+    conn->write_failed = false;
+    if (conn->write_err_msg)
+        free(conn->write_err_msg);
+    conn->write_err_msg = NULL;
     conn->be_pid = 0;
     conn->be_key = 0;
 }
@@ -3702,6 +3706,8 @@ freePGconn(PGconn *conn)
     /* Note that conn->Pfdebug is not ours to close or free */
     if (conn->last_query)
         free(conn->last_query);
+    if (conn->write_err_msg)
+        free(conn->write_err_msg);
     if (conn->inBuffer)
         free(conn->inBuffer);
     if (conn->outBuffer)
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index ac969e7..93ff428 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -791,6 +791,32 @@ pqSaveErrorResult(PGconn *conn)
 }

 /*
+ * As above, and append conn->write_err_msg to whatever other error we have.
+ * This is used when we've detected a write failure and have exhausted our
+ * chances of reporting something else instead.
+ */
+static void
+pqSaveWriteError(PGconn *conn)
+{
+    /*
+     * Ensure conn->result is an error result, and add anything in
+     * conn->errorMessage to it.
+     */
+    pqSaveErrorResult(conn);
+
+    /*
+     * Now append write_err_msg to that.  If it's null because of previous
+     * strdup failure, do what we can.  (It's likely our machinations here are
+     * all getting OOM failures as well, but ...)
+     */
+    if (conn->write_err_msg)
+        pqCatenateResultError(conn->result, conn->write_err_msg);
+    else
+        pqCatenateResultError(conn->result,
+                              libpq_gettext("write to server failed\n"));
+}
+
+/*
  * This subroutine prepares an async result object for return to the caller.
  * If there is not already an async result object, build an error object
  * using whatever is in conn->errorMessage.  In any case, clear the async
@@ -1224,7 +1250,7 @@ PQsendQuery(PGconn *conn, const char *query)
         pqPuts(query, conn) < 0 ||
         pqPutMsgEnd(conn) < 0)
     {
-        pqHandleSendFailure(conn);
+        /* error message should be set up already */
         return 0;
     }

@@ -1243,7 +1269,7 @@ PQsendQuery(PGconn *conn, const char *query)
      */
     if (pqFlush(conn) < 0)
     {
-        pqHandleSendFailure(conn);
+        /* error message should be set up already */
         return 0;
     }

@@ -1389,7 +1415,7 @@ PQsendPrepare(PGconn *conn,
     return 1;

 sendFailed:
-    pqHandleSendFailure(conn);
+    /* error message should be set up already */
     return 0;
 }

@@ -1641,40 +1667,11 @@ PQsendQueryGuts(PGconn *conn,
     return 1;

 sendFailed:
-    pqHandleSendFailure(conn);
+    /* error message should be set up already */
     return 0;
 }

 /*
- * pqHandleSendFailure: try to clean up after failure to send command.
- *
- * Primarily, what we want to accomplish here is to process any ERROR or
- * NOTICE messages that the backend might have sent just before it died.
- * Since we're in IDLE state, all such messages will get sent to the notice
- * processor.
- *
- * NOTE: this routine should only be called in PGASYNC_IDLE state.
- */
-void
-pqHandleSendFailure(PGconn *conn)
-{
-    /*
-     * Accept and parse any available input data, ignoring I/O errors.  Note
-     * that if pqReadData decides the backend has closed the channel, it will
-     * close our side of the socket --- that's just what we want here.
-     */
-    while (pqReadData(conn) > 0)
-        parseInput(conn);
-
-    /*
-     * Be sure to parse available input messages even if we read no data.
-     * (Note: calling parseInput within the above loop isn't really necessary,
-     * but it prevents buffer bloat if there's a lot of data available.)
-     */
-    parseInput(conn);
-}
-
-/*
  * Select row-by-row processing mode
  */
 int
@@ -1763,8 +1760,11 @@ PQisBusy(PGconn *conn)
     /* Parse any available data, if our state permits. */
     parseInput(conn);

-    /* PQgetResult will return immediately in all states except BUSY. */
-    return conn->asyncStatus == PGASYNC_BUSY;
+    /*
+     * PQgetResult will return immediately in all states except BUSY, or if we
+     * had a write failure.
+     */
+    return conn->asyncStatus == PGASYNC_BUSY || conn->write_failed;
 }


@@ -1804,7 +1804,13 @@ PQgetResult(PGconn *conn)
             }
         }

-        /* Wait for some more data, and load it. */
+        /*
+         * Wait for some more data, and load it.  (Note: if the connection has
+         * been lost, pqWait should return immediately because the socket
+         * should be read-ready, either with the last server data or with an
+         * EOF indication.  We expect therefore that this won't result in any
+         * undue delay in reporting a previous write failure.)
+         */
         if (flushResult ||
             pqWait(true, false, conn) ||
             pqReadData(conn) < 0)
@@ -1820,6 +1826,17 @@ PQgetResult(PGconn *conn)

         /* Parse it. */
         parseInput(conn);
+
+        /*
+         * If we had a write error, but nothing above obtained a query result
+         * or detected a read error, report the write error.
+         */
+        if (conn->write_failed && conn->asyncStatus == PGASYNC_BUSY)
+        {
+            pqSaveWriteError(conn);
+            conn->asyncStatus = PGASYNC_IDLE;
+            return pqPrepareAsyncResult(conn);
+        }
     }

     /* Return the appropriate thing. */
@@ -2252,7 +2269,7 @@ PQsendDescribe(PGconn *conn, char desc_type, const char *desc_target)
     return 1;

 sendFailed:
-    pqHandleSendFailure(conn);
+    /* error message should be set up already */
     return 0;
 }

diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index e5ef8d4..718da1c 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -824,6 +824,13 @@ definitelyFailed:
  *
  * Return 0 on success, -1 on failure and 1 when not all data could be sent
  * because the socket would block and the connection is non-blocking.
+ *
+ * Upon write failure, conn->write_failed is set and the error message is
+ * saved in conn->write_err_msg, but we clear the output buffer and return
+ * zero anyway; this is because callers should soldier on until it's possible
+ * to read from the server and check for an error message.  write_err_msg
+ * should be reported only when we are unable to obtain a server error first.
+ * (Thus, a -1 result is returned only for an internal *read* failure.)
  */
 static int
 pqSendSome(PGconn *conn, int len)
@@ -832,13 +839,32 @@ pqSendSome(PGconn *conn, int len)
     int            remaining = conn->outCount;
     int            result = 0;

+    /*
+     * If we already had a write failure, we will never again try to send data
+     * on that connection.  Even if the kernel would let us, we've probably
+     * lost message boundary sync with the server.  conn->write_failed
+     * therefore persists until the connection is reset, and we just discard
+     * all data presented to be written.
+     */
+    if (conn->write_failed)
+    {
+        /* conn->write_err_msg should be set up already */
+        conn->outCount = 0;
+        return 0;
+    }
+
     if (conn->sock == PGINVALID_SOCKET)
     {
         printfPQExpBuffer(&conn->errorMessage,
                           libpq_gettext("connection not open\n"));
+        conn->write_failed = true;
+        /* Transfer error message to conn->write_err_msg, if possible */
+        /* (strdup failure is OK, we'll cope later) */
+        conn->write_err_msg = strdup(conn->errorMessage.data);
+        resetPQExpBuffer(&conn->errorMessage);
         /* Discard queued data; no chance it'll ever be sent */
         conn->outCount = 0;
-        return -1;
+        return 0;
     }

     /* while there's still data to send */
@@ -876,17 +902,18 @@ pqSendSome(PGconn *conn, int len)

                 default:
                     /* pqsecure_write set the error message for us */
+                    conn->write_failed = true;

                     /*
-                     * We used to close the socket here, but that's a bad idea
-                     * since there might be unread data waiting (typically, a
-                     * NOTICE message from the backend telling us it's
-                     * committing hara-kiri...).  Leave the socket open until
-                     * pqReadData finds no more data can be read.  But abandon
-                     * attempt to send data.
+                     * Transfer error message to conn->write_err_msg, if
+                     * possible (strdup failure is OK, we'll cope later)
                      */
+                    conn->write_err_msg = strdup(conn->errorMessage.data);
+                    resetPQExpBuffer(&conn->errorMessage);
+
+                    /* Discard queued data; no chance it'll ever be sent */
                     conn->outCount = 0;
-                    return -1;
+                    return 0;
             }
         }
         else
@@ -921,6 +948,9 @@ pqSendSome(PGconn *conn, int len)
              * can do, and works pretty well in practice.  (The documentation
              * used to say that you only need to wait for write-ready, so
              * there are still plenty of applications like that out there.)
+             *
+             * Note that errors here don't result in write_failed becoming
+             * set.
              */
             if (pqReadData(conn) < 0)
             {
@@ -956,6 +986,7 @@ pqSendSome(PGconn *conn, int len)
  *
  * Return 0 on success, -1 on failure and 1 when not all data could be sent
  * because the socket would block and the connection is non-blocking.
+ * (See pqSendSome comments about how failure should be handled.)
  */
 int
 pqFlush(PGconn *conn)
diff --git a/src/interfaces/libpq/fe-protocol2.c b/src/interfaces/libpq/fe-protocol2.c
index 1ab1421..0e36974 100644
--- a/src/interfaces/libpq/fe-protocol2.c
+++ b/src/interfaces/libpq/fe-protocol2.c
@@ -1450,42 +1450,30 @@ pqFunctionCall2(PGconn *conn, Oid fnid,
         pqPutInt(fnid, 4, conn) != 0 || /* function id */
         pqPutInt(nargs, 4, conn) != 0)    /* # of args */
     {
-        pqHandleSendFailure(conn);
+        /* error message should be set up already */
         return NULL;
     }

     for (i = 0; i < nargs; ++i)
     {                            /* len.int4 + contents       */
         if (pqPutInt(args[i].len, 4, conn))
-        {
-            pqHandleSendFailure(conn);
             return NULL;
-        }

         if (args[i].isint)
         {
             if (pqPutInt(args[i].u.integer, 4, conn))
-            {
-                pqHandleSendFailure(conn);
                 return NULL;
-            }
         }
         else
         {
             if (pqPutnchar((char *) args[i].u.ptr, args[i].len, conn))
-            {
-                pqHandleSendFailure(conn);
                 return NULL;
-            }
         }
     }

     if (pqPutMsgEnd(conn) < 0 ||
         pqFlush(conn))
-    {
-        pqHandleSendFailure(conn);
         return NULL;
-    }

     for (;;)
     {
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 47dbc31..ec51fee 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -1926,50 +1926,35 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
         pqPutInt(1, 2, conn) < 0 || /* format code: BINARY */
         pqPutInt(nargs, 2, conn) < 0)    /* # of args */
     {
-        pqHandleSendFailure(conn);
+        /* error message should be set up already */
         return NULL;
     }

     for (i = 0; i < nargs; ++i)
     {                            /* len.int4 + contents       */
         if (pqPutInt(args[i].len, 4, conn))
-        {
-            pqHandleSendFailure(conn);
             return NULL;
-        }
         if (args[i].len == -1)
             continue;            /* it's NULL */

         if (args[i].isint)
         {
             if (pqPutInt(args[i].u.integer, args[i].len, conn))
-            {
-                pqHandleSendFailure(conn);
                 return NULL;
-            }
         }
         else
         {
             if (pqPutnchar((char *) args[i].u.ptr, args[i].len, conn))
-            {
-                pqHandleSendFailure(conn);
                 return NULL;
-            }
         }
     }

     if (pqPutInt(1, 2, conn) < 0)    /* result format code: BINARY */
-    {
-        pqHandleSendFailure(conn);
         return NULL;
-    }

     if (pqPutMsgEnd(conn) < 0 ||
         pqFlush(conn))
-    {
-        pqHandleSendFailure(conn);
         return NULL;
-    }

     for (;;)
     {
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4a93d8e..dbe0f7e 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -410,6 +410,8 @@ struct pg_conn
     bool        password_needed;    /* true if server demanded a password */
     bool        sigpipe_so;        /* have we masked SIGPIPE via SO_NOSIGPIPE? */
     bool        sigpipe_flag;    /* can we mask SIGPIPE via MSG_NOSIGNAL? */
+    bool        write_failed;    /* have we had a write failure on sock? */
+    char       *write_err_msg;    /* write error message, or NULL if OOM */

     /* Transient state needed while establishing connection */
     bool        try_next_addr;    /* time to advance to next address/host? */
@@ -585,7 +587,6 @@ extern void pqSaveMessageField(PGresult *res, char code,
 extern void pqSaveParameterStatus(PGconn *conn, const char *name,
                       const char *value);
 extern int    pqRowProcessor(PGconn *conn, const char **errmsgp);
-extern void pqHandleSendFailure(PGconn *conn);

 /* === in fe-protocol2.c === */
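
To make the net effect concrete for callers, here is a minimal libpq
client sketch (not part of the patch; the connection string and the
failure timing are hypothetical).  With the patch, a write failure is
recorded and reported only if no server error message can be read
first, so the final fprintf should show the server's FATAL message
when one was sent, rather than a bleat about the failed write:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("host=localhost dbname=postgres");
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* suppose the server hangs up or resets around here ... */
    res = PQexec(conn, "SELECT 1");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
        /* ... the server's parting error message, if any arrived,
         * is preferred over the saved write error */
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQclear(res);
    PQfinish(conn);
    return 0;
}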


Re: Rare SSL failures on eelpout

From
Thomas Munro
Date:
On Tue, Mar 19, 2019 at 9:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > ... I don't like pqHandleSendFailure all that much: it has strong
> > constraints on what state libpq has to be in, as a consequence of which
> > it's called from a bunch of ad-hoc places, and can't be called from
> > some others.  It's kind of accidental that it will work here.
> > I was toying with the idea of getting rid of that function in
> > favor of a design like this:
> > * Handle any socket-write failure at some low level of libpq by
> > recording the fact that the error occurred (plus whatever data we
> > need to produce a message about it), and then starting to just
> > discard output data.
> > * Eventually, we'll try to read.  Process any available input data
> > as usual (and, if we get a read error, report that).  When no more
> > input data is available, if a socket write failure has been recorded,
> > report that much as if it were an incoming ERROR message, and then
> > shut down the connection.
> > This would essentially give us pqHandleSendFailure-like behavior
> > across the board, which might be enough to fix this problem as well as
> > bug #15598.  Or not ... as you say, we haven't thoroughly understood
> > either issue, so it's possible this wouldn't do it.
>
> Here's a draft patch along that line.  It's gratifyingly small, and
> it does fix the SSL problem for me.  I have no way of investigating
> whether it fixes bug #15598, but it might.
>
> Aside from the SSL cert-verify issue, I've checked the behavior when
> the backend is shut down ("pg_ctl stop") between queries, and when
> the backend is kill 9'd mid-query.  The shutdown case reacts well with
> or without SSL:
>
> regression=# select 2+2;
> FATAL:  terminating connection due to administrator command
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> The backend-crash case is fine without SSL:
>
> regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> but a bit less nice with it:
>
> regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
> SSL SYSCALL error: EOF detected
> The connection to the server was lost. Attempting reset: Succeeded.
>
> That seems cosmetic --- we just haven't expended the same effort on
> what to print for connection EOF with SSL as without it.  (Also,
> you get that same result without this patch.)
>
> Of course, this hardly counts as exhaustive testing, especially since
> I only tried one OpenSSL version.

Neat.  Fixes the problem for me on eelpout's host.

> > One thing that isn't real clear to me is how much timing sensitivity
> > there is in "when no more input data is available".  Can we assume that
> > if we've gotten ECONNRESET or an allied error from a write, then any
> > data the far end might've wished to send us is already available to
> > read?

Following a trail beginning at
https://en.wikipedia.org/wiki/Transmission_Control_Protocol I see that
RFC 1122 4.2.2.13 discusses this topic and possible variations in this
area.  I don't know enough about any of this stuff to draw hard
conclusions from primary sources, but it does appear that an
implementation might be within its rights to jettison that data,
unfortunately.  As seen with the simple Python script experiment
up-thread (a rough C re-creation is sketched after this list), we
tested four implementations, and (amusingly) got four different
behaviours:

1.  For Windows, the answer to your question was clearly "no" in that
experiment.  I think this behaviour sucks for any protocol that
involves unsolicited messages in both directions (that is, not based
on pure request/response processing), because there is no way to
guarantee that you receive a "goodbye" message, and that's what's
happening in bug #15598.  I'm not sure there is anything we can ever
do about this, though, other than complaining on the internet or
opening a second TCP connection for unsolicited server->client
messages.

2.  Linux, FreeBSD and Darwin gave slightly different error sequences
when writing after the remote connection was closed (though I suspect
they'd behave much the same way for a connection to a remote host),
but all allowed the "goodbye" message to be read, so the answer there
is "yes".

> >  In principle, since TCP allows shutting down either transmission
> > direction independently, the server could close the read side of its
> > socket while continuing to transmit data.  In practice, PG servers
> > don't do that; but could network timing glitches create the same effect?
> > Even if it's possible, does it happen enough to worry about?
> > The reason I'm concerned is that I don't think it'd be bright to ignore a
> > send error until we see input EOF, which'd be the obvious way to solve a
> > timing problem if there is one.  If our send error was some transient
> > glitch and the connection is really still open, then we might not get EOF,
> > but we won't get a server reply either because no message went to the
> > server.  You could imagine waiting some small interval of time before
> > deciding that it's time to report the write failure, but ugh.
>
> As written, this patch does seem vulnerable to this concern, if it's real.
> We could address it for post-connection failures by tweaking PQgetResult
> so that it doesn't call pqWait if there's already a write error recorded;
> but I'm far from sure that that would be a net improvement.  Also, making
> a comparable change to the behavior of the PQconnectPoll state machine
> might be tricky.

I think ignoring send errors until you eventually get a recv error or
an EOF is OK, because that's roughly equivalent to the way sending
works in TCP in general anyway; you find out that the other end has
gone away a bit later.  That is the reason you can't reproduce this
problem on a FreeBSD box with OpenSSL 1.1.1 even if you add the sleep,
or (presumably, though I didn't test) on a Linux system with a remote
server: the send() succeeds, because it doesn't or can't peek at the
other end's TCP state and see that it's been closed already; instead,
it merely transmits the message, and the other end later sends back an
RST that'll cause some future send() syscall to fail.

I can't imagine what sort of transient glitch could apply here.  I
guess the question is: is there a way for send() to report an error,
but recv() to report EAGAIN so we finish up blocked?  I can't
immediately see how that could happen.
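
(For what it's worth, a sketch of the check in question, with "sock"
standing in for the socket whose send() just failed; this is roughly
what our pqWait() path reduces to, as far as I can tell:

#include <poll.h>
#include <stdbool.h>

/* A connection that has received an RST should immediately report
 * POLLIN/POLLERR/POLLHUP, after which recv() returns EOF or
 * ECONNRESET rather than EAGAIN, so this shouldn't block forever. */
static bool
wait_for_input(int sock)
{
    struct pollfd pfd;

    pfd.fd = sock;
    pfd.events = POLLIN | POLLERR;
    return poll(&pfd, 1, -1) == 1;
}

If anyone can produce a case where send() fails but this blocks, that
would be the problematic scenario.)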

> My current feeling is that this is OK to put in HEAD but I think the
> risk-reward ratio isn't very good for the back branches.  Even with
> an OpenSSL version where this makes a difference, the problematic
> behavior is pretty hard to hit.  So I'm a bit inclined to do nothing
> in the back branches.

Shouldn't we also back-patch the one-line change adding
pqHandleSendFailure()?  Then eelpout would be green for REL_11_STABLE,
as would any future buildfarm animals (probably underpowered ones with
wonky scheduling) once they are eventually upgraded to OpenSSL 1.1.1.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Mar 19, 2019 at 9:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My current feeling is that this is OK to put in HEAD but I think the
>> risk-reward ratio isn't very good for the back branches.  Even with
>> an OpenSSL version where this makes a difference, the problematic
>> behavior is pretty hard to hit.  So I'm a bit inclined to do nothing
>> in the back branches.

> Shouldn't we also back-patch the one-line change adding
> pqHandleSendFailure()?

As I said before, I don't like that patch: at best it's an abuse of
pqHandleSendFailure, because that function is only meant to be called
at start of a query cycle.  It wouldn't be hard to break this usage and
not notice, especially given that we often don't test back-patched
changes very carefully in the back branches if they seem OK in HEAD.

Possibly we could consider back-patching the more aggressive patch
once it's survived v12 beta testing, and just living with the issue
till then.  Given what we know now, I don't think this is a big
problem for the field: how many people use SSL on local connections?

            regards, tom lane


Re: Rare SSL failures on eelpout

From
Thomas Munro
Date:
On Tue, Mar 19, 2019 at 12:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> 2.  Linux, FreeBSD and Darwin gave slightly different error sequences
> when writing after the remote connection was closed (though I suspect
> they'd behave much the same way for a connection to a remote host),
> but all allowed the "goodbye" message to be read, so the answer there
> is "yes".

Qualification, after some off-list discussion with Andrew Gierth: even
though that appears to work in happy tests on those TCP
implementations, it may be that they can't really guarantee that it
works in adverse conditions, because the server may not be able to
retransmit packets that preceded the reset but somehow went missing,
and it's also not clear what happens if they arrive out of order.
Dunno.  (Oh man, all this hidden magic; is it actually possible to
write working application code without imbibing Stevens' TCP/IP
Illustrated?  It's a bit like last year's work on filesystem quirks,
when I realised that man pages are nice and all, but you basically
need to read the source or talk to someone who has...)

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From
Thomas Munro
Date:
On Tue, Mar 19, 2019 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Shouldn't we also back-patch the one-line change adding
> > pqHandleSendFailure()?
>
> As I said before, I don't like that patch: at best it's an abuse of
> pqHandleSendFailure, because that function is only meant to be called
> at start of a query cycle.  It wouldn't be hard to break this usage and
> not notice, especially given that we often don't test back-patched
> changes very carefully in the back branches if they seem OK in HEAD.
>
> Possibly we could consider back-patching the more aggressive patch
> once it's survived v12 beta testing, and just living with the issue
> till then.  Given what we know now, I don't think this is a big
> problem for the field: how many people use SSL on local connections?

Yeah, now that we understand this properly I agree this is unlikely to
bother anyone in real life.  I just want to make the build farm green.
I wondered about ssl_max_protocol_version = 'TLSv1.2', but that GUC's
too new.  Another option would be to change the "command_fails_like"
pattern to tolerate both errors in v11.

-- 
Thomas Munro
https://enterprisedb.com


Re: Rare SSL failures on eelpout

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Yeah, now that we understand this properly I agree this is unlikely to
> bother anyone in real life.  I just want to make the build farm green.
> I wondered about ssl_max_protocol_version = 'TLSv1.2', but that GUC's
> too new.

Yeah; also, forcing that would reduce our test coverage.

> Another option would be to change the "command_fails_like"
> pattern to tolerate both errors in v11.

This might be a reasonable stopgap measure, since it'd basically
only affect buildfarm results.

            regards, tom lane


Re: Rare SSL failures on eelpout

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Mar 19, 2019 at 9:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> One thing that isn't real clear to me is how much timing sensitivity
>>> there is in "when no more input data is available".  Can we assume that
>>> if we've gotten ECONNRESET or an allied error from a write, then any
>>> data the far end might've wished to send us is already available to
>>> read?

> Following a trail beginning at
> https://en.wikipedia.org/wiki/Transmission_Control_Protocol I see that
> RFC 1122 4.2.2.13 discusses this topic and possible variations in this
> area.  I don't know enough about any of this stuff to draw hard
> conclusions from primary sources, but it does appear that an
> implementation might be within its rights to jettison that data,
> unfortunately.

I spent some time looking at that, and as far as I can see, the
behavior reported for Windows is flat-out forbidden by TCP.  RFC 1122
does discuss the possibility that an O/S might not support half-closed
connections; but that would only matter if (in Unix terms) we issued
shutdown(sock, SHUT_WR) and expected to continue to be able to read,
which we do not.  In any other scenario, TCP is supposed to deliver
any data it successfully received.
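
(For concreteness, the half-closed case RFC 1122 is talking about
would look like the sketch below, with "sock" a hypothetical connected
TCP socket; as noted, libpq never does this:

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static void
half_close_and_drain(int sock)
{
    char        buf[1024];
    ssize_t     n;

    shutdown(sock, SHUT_WR);    /* peer's reads now see orderly EOF ... */
    while ((n = read(sock, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, n, stdout);      /* ... but we can still read */
    close(sock);
}

If an O/S dropped incoming data after the shutdown(), that would be
the unsupported-half-close case; it doesn't excuse discarding data on
an ordinary close.)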

We can't, of course, work magic and retrieve data the TCP stack never
got, nor is there much we can do about it if the stack throws away
data despite the spec.  So I'm not inclined to fuss about the corner
cases.  The main thing for us is to ensure that if a server error
message is available to read, we do so and return it rather than
returning a less-helpful bleat about being unable to write.  The
proposed patch does that.  It will also tend to report bleats about
read failure rather than write failure, even if from libpq's perspective
the write failure happened first; but that seems fine to me.

>>> The reason I'm concerned is that I don't think it'd be bright to ignore a
>>> send error until we see input EOF, which'd be the obvious way to solve a
>>> timing problem if there is one.  If our send error was some transient
>>> glitch and the connection is really still open, then we might not get EOF,
>>> but we won't get a server reply either because no message went to the
>>> server.  You could imagine waiting some small interval of time before
>>> deciding that it's time to report the write failure, but ugh.

I'm likewise inclined to dismiss this worry, because I don't see
how it could happen, given that the server doesn't use shutdown(2).
A send error should imply either that the kernel saw RST from the
remote end, or that the connection is local and the kernel knows that
the server closed its socket or crashed.  In either scenario the
kernel should consider that the incoming data stream is ended; maybe
it'll give us whatever data it received or maybe not, but it shouldn't
allow us to sit and wait for more data.

Or in other words: there are no client-visible "transient glitches" in
TCP.  Either the connection is still up, or it's been reset.

So I'm inclined to (1) commit the patch as-proposed in HEAD, and
(2) hack the ssl test cases in v11 as you suggested.  If we see field
complaints about this, we can consider reverting (2) in favor of
a back-patch once v12 beta is over.

            regards, tom lane


Re: Rare SSL failures on eelpout

From
Thomas Munro
Date:
On Wed, Mar 20, 2019 at 8:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So I'm inclined to (1) commit the patch as-proposed in HEAD, and
> (2) hack the ssl test cases in v11 as you suggested.  If we see field
> complaints about this, we can consider reverting (2) in favor of
> a back-patch once v12 beta is over.

This plan looks good to me.

-- 
Thomas Munro
https://enterprisedb.com