Обсуждение: A WalSnd issue related to state WALSNDSTATE_STOPPING

Поиск
Список
Период
Сортировка

A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Paul Guo
Дата:
Hello,

Recently I encountered a panic (assert failure) on a cassert enabled build of Greenplum database (an MPP database based on postgres). The based postgresql version is 9.4 stable (9.4.19). The panic is not complex. Please see below for more details. This issue seems to be a bug in postgresql also so I'm writing to community.

(gdb) bt                                                                        
#0  0x00007fb8c56321f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fb8c56338e8 in __GI_abort () at abort.c:90                           
#2  0x0000000000ae3fd2 in ExceptionalCondition (conditionName=0xdcec75 "!(syncWalSnd)",
    errorType=0xdce77a "FailedAssertion", fileName=0xdce770 "syncrep.c", lineNumber=505) at assert.c:66
#3  0x0000000000924dbc in SyncRepReleaseWaiters () at syncrep.c:505             
#4  0x0000000000915e5e in ProcessStandbyReplyMessage () at walsender.c:1667     
#5  0x0000000000915bea in ProcessStandbyMessage () at walsender.c:1566          
#6  0x0000000000915ab5 in ProcessRepliesIfAny () at walsender.c:1502            
#7  0x0000000000916303 in WalSndLoop (send_data=0x916d46 <XLogSendPhysical>) at walsender.c:1923
#8  0x000000000091439b in StartReplication (cmd=0x15abea8) at walsender.c:704   
#9  0x000000000091586e in exec_replication_command (                            
    cmd_string=0x161fa78 "START_REPLICATION 1/84000000 TIMELINE 6") at walsender.c:1416
#10 0x0000000000979915 in PostgresMain (argc=1, argv=0x15bafd8, dbname=0x15baec8 "", username=0x15baea8 "gpadmin")
    at postgres.c:5296                                                          

(gdb) p max_wal_senders                                                         
$1 = 1                                                                          

(gdb) p WalSndCtl->walsnds[0]                                                   
$2 = {                                                                          
  pid = 2765,                                                                   
  state = WALSNDSTATE_STOPPING,                                                 
  sentPtr = 6509560784,                                                         
  sendKeepalive = 0 '\000',                                                     
  needreload = 0 '\000',                                                        
  write = 6509560784,                                                           
  flush = 6509560664,                                                           
  apply = 6509560664,                                                           
  caughtup_within_range = 1 '\001',                                             
  xlogCleanUpTo = 6509560664,                                                   
  mutex = 0 '\000',                                                             
  latch = {                                                                     
    is_set = 1,                                                                 
    is_shared = 1 '\001',                                                       
    owner_pid = 2765                                                            
  },                                                                            
  sync_standby_priority = 1,                                                    
  synchronous = 0 '\000',                                                       
  replica_disconnected_at = 0                                                   
}                                                                               

Below is related code.

  for (i = 0; i < max_wal_senders; i++)
    {
        /* use volatile pointer to prevent code rearrangement */
        volatile WalSnd *walsnd = &walsndctl->walsnds[i];

        if (walsnd->pid != 0 &&
            walsnd->state >= WALSNDSTATE_STREAMING &&
            walsnd->sync_standby_priority > 0 &&
            (priority == 0 ||
             priority > walsnd->sync_standby_priority) &&
            !XLogRecPtrIsInvalid(walsnd->flush))
        {
            priority = walsnd->sync_standby_priority;
            syncWalSnd = walsnd;
        }
    }

    /*
     * We should have found ourselves at least.
     */
    Assert(syncWalSnd);

The panic reason is that since there is just one wal sender and WalSndCtl->walsnds[0].state is WALSNDSTATE_STOPPING so syncWalSnd will be NULL and that causes assert failure. Latest postgresql code removes the Assert code but the related similar code logic was kept. It looks like that we need to modify the logic similar like below (PG 9.4 STABLE) to allow WALSNDSTATE_STOPPING which is reasonable here If I understand correctly.

        if (walsnd->pid != 0 &&
-           walsnd->state == WALSNDSTATE_STREAMING &&
+           walsnd->state >= WALSNDSTATE_STREAMING &&
            walsnd->sync_standby_priority > 0 &&
            (priority == 0 ||
             priority > walsnd->sync_standby_priority) &&

For Latest master,  the logic is as below but it could be modified similarly.
          /* Must be streaming */
          if (state != WALSNDSTATE_STREAMING)
              continue;

If yes this change should be on all branches that have the patch which introduces WALSNDSTATE_STOPPING.

Thanks.

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Michael Paquier
Дата:
On Wed, Nov 21, 2018 at 12:54:30PM +0800, Paul Guo wrote:
> The panic reason is that since there is just one wal sender and
> WalSndCtl->walsnds[0].state is WALSNDSTATE_STOPPING so syncWalSnd will be
> NULL and that causes assert failure. Latest postgresql code removes the
> Assert code but the related similar code logic was kept. It looks like that
> we need to modify the logic similar like below (PG 9.4 STABLE) to allow
> WALSNDSTATE_STOPPING which is reasonable here If I understand correctly.

The checkpointer initializes a shutdown checkpoint where it tells to all
the WAL senders to stop once all the children processes are gone, so it
seems to me that there is little point in processing
SyncRepReleaseWaiters() when a WAL sender is in WALSNDSTATE_STOPPING
state as at this stage syncrep makes little sense.  It is still
necessary to process standby messages at this stage so as the primary
can shut down when it is sure that all the standbys have flushed the
shutdown checkpoint record of the primary.
--
Michael

Вложения

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Michael Paquier
Дата:
On Wed, Nov 21, 2018 at 04:09:41PM +0900, Michael Paquier wrote:
> The checkpointer initializes a shutdown checkpoint where it tells to all
> the WAL senders to stop once all the children processes are gone, so it
> seems to me that there is little point in processing
> SyncRepReleaseWaiters() when a WAL sender is in WALSNDSTATE_STOPPING
> state as at this stage syncrep makes little sense.  It is still
> necessary to process standby messages at this stage so as the primary
> can shut down when it is sure that all the standbys have flushed the
> shutdown checkpoint record of the primary.

Just refreshed my memory with c6c33343, which is actually at the origin
of the issue, and my previous argument is flawed.  If a WAL sender has
reached WALSNDSTATE_STOPPING no regular backends are around but a WAL
sender could always commit a transaction in parallel which may need to
make sure that its record is flushed and sync'ed, and this needs to make
sure that waiters are correctly released.  So it is necessary to patch
up SyncRepGetSyncStandbysPriority and SyncRepGetSyncStandbysQuorum as
mentioned upthread, perhaps adding a comment when looking at
MyWalSnd->state looks adapted. Paul, would you like to write a patch?
--
Michael

Вложения

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Paul Guo
Дата:
On Thu, Nov 22, 2018 at 1:29 PM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 21, 2018 at 04:09:41PM +0900, Michael Paquier wrote:
> The checkpointer initializes a shutdown checkpoint where it tells to all
> the WAL senders to stop once all the children processes are gone, so it
> seems to me that there is little point in processing
> SyncRepReleaseWaiters() when a WAL sender is in WALSNDSTATE_STOPPING
> state as at this stage syncrep makes little sense.  It is still
> necessary to process standby messages at this stage so as the primary
> can shut down when it is sure that all the standbys have flushed the
> shutdown checkpoint record of the primary.

Just refreshed my memory with c6c33343, which is actually at the origin
of the issue, and my previous argument is flawed.  If a WAL sender has
reached WALSNDSTATE_STOPPING no regular backends are around but a WAL
sender could always commit a transaction in parallel which may need to
make sure that its record is flushed and sync'ed, and this needs to make
sure that waiters are correctly released.  So it is necessary to patch
up SyncRepGetSyncStandbysPriority and SyncRepGetSyncStandbysQuorum as
mentioned upthread, perhaps adding a comment when looking at
MyWalSnd->state looks adapted. Paul, would you like to write a patch?
--
Michael

Yes, please see the attached patch.
 
Вложения

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Michael Paquier
Дата:
On Thu, Nov 22, 2018 at 06:31:04PM +0800, Paul Guo wrote:
> Yes, please see the attached patch.

Thanks, I have reviewed your patch, and could not resist to change
SyncRepReleaseWaiters() on top of the rest with a comment to not be
confused about the WAL sender states allowed to release waiters.

The previous experience is proving that it seems unwise to rely on the
order of the elements in WalSndState, so I have tweaked things so as the
state is listed for all the code paths involved.

Paul, what do you think?
--
Michael

Вложения

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Paul Guo
Дата:
On Thu, Nov 22, 2018 at 06:31:04PM +0800, Paul Guo wrote:
> Yes, please see the attached patch.

Thanks, I have reviewed your patch, and could not resist to change
SyncRepReleaseWaiters() on top of the rest with a comment to not be
confused about the WAL sender states allowed to release waiters.

The previous experience is proving that it seems unwise to rely on the
order of the elements in WalSndState, so I have tweaked things so as the
state is listed for all the code paths involved.

Paul, what do you think?
--
Michael

Agree. Your code change is safer.
 

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Michael Paquier
Дата:
On Tue, Nov 27, 2018 at 10:07:04AM +0800, Paul Guo wrote:
> Agree. Your code change is safer.

Thanks Paul for double-checking.  I'll come back to this patch in two
days, look at it again and commit it if there are no objections from
others.
--
Michael

Вложения

Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

От
Michael Paquier
Дата:
On Tue, Nov 27, 2018 at 11:40:25AM +0900, Michael Paquier wrote:
> Thanks Paul for double-checking.  I'll come back to this patch in two
> days, look at it again and commit it if there are no objections from
> others.

Committed and back-patched down to 9.4.  Thanks Paul for the report!
--
Michael

Вложения