Обсуждение: YA race condition in 001_stream_rep.pl (was Re: pgsql: Allow using the updated tuple while moving it to a different par)

Поиск
Список
Период
Сортировка
I wrote:
> This looks to me like it's probably a low-probability timing problem in
> the test script itself.
> ...
> We could maybe hack around that with a short "sleep" in this script, but
> a less cruddy solution would be to change PostgresNode::reload so that
> it somehow waits for the server to process the reload signal.

On further reflection, I doubt that would help.  The issue isn't whether
the standby_2 postmaster has processed the SIGHUP, but whether its
walreceiver process has sent a feedback message.  The loop in
WalReceiverMain does things in this order:

1. Check for SIGTERM, exit if seen.

2. Check for SIGHUP, process that (and possibly send a feedback message)
if seen.

3. Read and process streaming data till no more is available.

We've seen just a tiny number of failures at this point in the test
--- scraping the buildfarm database, I find these:

   sysname    |      snapshot       |     stage     |                                      l
           

--------------+---------------------+---------------+------------------------------------------------------------------------------
 longfin      | 2017-09-26 14:39:05 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 155. 
 flaviventris | 2018-04-07 17:02:01 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 155. 
 serinus      | 2018-05-18 20:14:01 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 155. 
 idiacanthus  | 2018-06-23 12:46:01 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 157. 
 francolin    | 2018-06-29 13:30:01 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 157. 
 flaviventris | 2018-07-12 07:36:01 | recoveryCheck | Timed out waiting for slot xmins to advance at
t/001_stream_rep.plline 157. 
(6 rows)

So it's clearly a very low-probability race condition.  My hypothesis
is that the SIGHUP arrives, but the walreceiver process is already in
the read-and-process-data part of its loop, and the timing is such that
it manages to stay there until we issue the standby2 shutdown command.
When it finally notices that it has no more data available, it iterates
the outermost loop, and then shuts down without ever having sent a
feedback message.  The "replay_check()" call in the test script doesn't
ensure any synchronization here, since it only verifies that WAL data
has been applied, not that any feedback has happened.

I can reproduce the failure pretty reliably with a hack like the one
attached, which makes it unlikely that the walreceiver will send a
feedback message instantly when it gets the SIGHUP.

So the issue boils down to this: the test script is, effectively,
assuming that it's guaranteed that the walreceiver will send a feedback
message before it shuts down; but there is no such guarantee.  Is this
a bug in the test script, or a bug in the walreceiver logic?  I can
see the value of having such a guarantee, but I also think it would be
nigh impossible to make it a bulletproof guarantee.  We could perhaps
add "XLogWalRcvSendHSFeedback(true)" somewhere closer to process exit,
but that might add more failure modes than it removes.

Or we could change the test script to wait for feedback before it
issues the shutdown, but then I think the test is a bit pointless.

Comments?

            regards, tom lane

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 987bb84..d7bb1b4 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -447,7 +447,7 @@ WalReceiverMain(void)
                 {
                     got_SIGHUP = false;
                     ProcessConfigFile(PGC_SIGHUP);
-                    XLogWalRcvSendHSFeedback(true);
+                    XLogWalRcvSendHSFeedback(false);
                 }

                 /* See if we can read data immediately */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index a0d3e8f..8f8dbc5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -145,7 +145,7 @@ is( $node_standby_1->psql(
 $node_standby_2->append_conf('recovery.conf',
     "primary_slot_name = $slotname_2");
 $node_standby_2->append_conf('postgresql.conf',
-    "wal_receiver_status_interval = 1");
+    "wal_receiver_status_interval = 10");
 $node_standby_2->restart;

 # Fetch xmin columns from slot's pg_replication_slots row, after waiting for

On Sat, Jul 14, 2018 at 11:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>
> I can reproduce the failure pretty reliably with a hack like the one
> attached, which makes it unlikely that the walreceiver will send a
> feedback message instantly when it gets the SIGHUP.
>
> So the issue boils down to this: the test script is, effectively,
> assuming that it's guaranteed that the walreceiver will send a feedback
> message before it shuts down; but there is no such guarantee.  Is this
> a bug in the test script, or a bug in the walreceiver logic?  I can
> see the value of having such a guarantee, but I also think it would be
> nigh impossible to make it a bulletproof guarantee.  We could perhaps
> add "XLogWalRcvSendHSFeedback(true)" somewhere closer to process exit,
> but that might add more failure modes than it removes.
>
> Or we could change the test script to wait for feedback before it
> issues the shutdown, but then I think the test is a bit pointless.
>

Currently, neither the code nor our documentation suggests that we can
expect HSFeedback before the shutdown, so it is better to adjust the
test.  If the sole purpose of the test is to test feedback after
shutdown, then it is worth considering to remove the test altogether.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Amit Kapila <amit.kapila16@gmail.com> writes:
> On Sat, Jul 14, 2018 at 11:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So the issue boils down to this: the test script is, effectively,
>> assuming that it's guaranteed that the walreceiver will send a feedback
>> message before it shuts down; but there is no such guarantee.  Is this
>> a bug in the test script, or a bug in the walreceiver logic?  I can
>> see the value of having such a guarantee, but I also think it would be
>> nigh impossible to make it a bulletproof guarantee.  We could perhaps
>> add "XLogWalRcvSendHSFeedback(true)" somewhere closer to process exit,
>> but that might add more failure modes than it removes.
>> 
>> Or we could change the test script to wait for feedback before it
>> issues the shutdown, but then I think the test is a bit pointless.

We've seen two more instances of this failure in the buildfarm since
this exchange (dragonet and flaviventris --- seems like Andres' menagerie
is particularly prone to it).

> Currently, neither the code nor our documentation suggests that we can
> expect HSFeedback before the shutdown, so it is better to adjust the
> test.  If the sole purpose of the test is to test feedback after
> shutdown, then it is worth considering to remove the test altogether.

Hearing nobody speaking in favor of the other alternatives,
I've removed the test.

            regards, tom lane


On Wed, Jul 18, 2018 at 05:42:40PM -0400, Tom Lane wrote:
> Hearing nobody speaking in favor of the other alternatives,
> I've removed the test.

I had this problem running in the background for a couple of days, but I
could not come up with a solution cleaner than just removing the test.
Thanks for doing so.
--
Michael

Вложения