Обсуждение: Make pg_waldump report replication origin ID, LSN, and timestamp.

Поиск
Список
Период
Сортировка

Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Masahiko Sawada
Дата:
Hi all,

I realized that pg_waldump doesn't show replication origin ID, LSN,
and timestamp of PREPARE TRANSACTION record. Commit 7b8a899bdeb
improved pg_waldump two years ago so that it reports the detail of
information of PREPARE TRANSACTION but ISTM that it overlooked showing
replication origin information. As far as I can see in the discussion
thread[1], there was no discussion on that. These are helpful when
diagnosing 2PC related issues on the subscriber side.

I've attached a patch to add replication origin information to
xact_desc_prepare().

Regards,

[1] https://www.postgresql.org/message-id/CAHGQGwEvhASad4JJnCv%3D0dW2TJypZgW_Vpb-oZik2a3utCqcrA%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Michael Paquier
Дата:
On Mon, Dec 06, 2021 at 04:35:07PM +0900, Masahiko Sawada wrote:
> I've attached a patch to add replication origin information to
> xact_desc_prepare().

Yeah.

+   if (origin_id != InvalidRepOriginId)
+       appendStringInfo(buf, "; origin: node %u, lsn %X/%X, at %s",
+                        origin_id,
+                        LSN_FORMAT_ARGS(parsed.origin_lsn),
+                        timestamptz_to_str(parsed.origin_timestamp));

Shouldn't you check for parsed.origin_lsn instead?  The replication
origin is stored there as far as I read EndPrepare().

Commit records check after XACT_XINFO_HAS_ORIGIN, but
xact_desc_abort() may include this information for ROLLBACK PREPARED
transactions so we could use the same logic as xact_desc_commit() for
the abort case, no?
--
Michael

Вложения

Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Masahiko Sawada
Дата:
On Mon, Dec 6, 2021 at 5:09 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 06, 2021 at 04:35:07PM +0900, Masahiko Sawada wrote:
> > I've attached a patch to add replication origin information to
> > xact_desc_prepare().
>
> Yeah.
>
> +   if (origin_id != InvalidRepOriginId)
> +       appendStringInfo(buf, "; origin: node %u, lsn %X/%X, at %s",
> +                        origin_id,
> +                        LSN_FORMAT_ARGS(parsed.origin_lsn),
> +                        timestamptz_to_str(parsed.origin_timestamp));
>
> Shouldn't you check for parsed.origin_lsn instead?  The replication
> origin is stored there as far as I read EndPrepare().

Yeah, I was thinking check origin_lsn instead. That way, we don't show
invalid origin_timestamp and origin_lsn even if origin_id is set. But
as far as I read, the same is true for xact_desc_commit() (and
xact_desc_rollback()). That is, since apply workers always its origin
id and could do commit transactions that are not replicated from the
publisher, it's possible that xact_desc_commit() reports like:

rmgr: Transaction len (rec/tot):    117/   117, tx:        725, lsn:
0/014BE938, prev 0/014BE908, desc: COMMIT 2021-12-06 22:04:44.462200
JST; inval msgs: catcache 55 catcache 54 catcache 64; origin: node 1,
lsn 0/0, at 2000-01-01 09:00:00.000000 JST

Also, looking at PrepareRedoAdd(), we check the replication origin id.
So I think that it'd be better to check origin_id for consistency.

>
> Commit records check after XACT_XINFO_HAS_ORIGIN, but
> xact_desc_abort() may include this information for ROLLBACK PREPARED
> transactions so we could use the same logic as xact_desc_commit() for
> the abort case, no?

Good catch! I'll submit an updated patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Masahiko Sawada
Дата:
On Mon, Dec 6, 2021 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Dec 6, 2021 at 5:09 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Mon, Dec 06, 2021 at 04:35:07PM +0900, Masahiko Sawada wrote:
> > > I've attached a patch to add replication origin information to
> > > xact_desc_prepare().
> >
> > Yeah.
> >
> > +   if (origin_id != InvalidRepOriginId)
> > +       appendStringInfo(buf, "; origin: node %u, lsn %X/%X, at %s",
> > +                        origin_id,
> > +                        LSN_FORMAT_ARGS(parsed.origin_lsn),
> > +                        timestamptz_to_str(parsed.origin_timestamp));
> >
> > Shouldn't you check for parsed.origin_lsn instead?  The replication
> > origin is stored there as far as I read EndPrepare().
>
> Yeah, I was thinking check origin_lsn instead. That way, we don't show
> invalid origin_timestamp and origin_lsn even if origin_id is set. But
> as far as I read, the same is true for xact_desc_commit() (and
> xact_desc_rollback()). That is, since apply workers always its origin
> id and could do commit transactions that are not replicated from the
> publisher, it's possible that xact_desc_commit() reports like:
>
> rmgr: Transaction len (rec/tot):    117/   117, tx:        725, lsn:
> 0/014BE938, prev 0/014BE908, desc: COMMIT 2021-12-06 22:04:44.462200
> JST; inval msgs: catcache 55 catcache 54 catcache 64; origin: node 1,
> lsn 0/0, at 2000-01-01 09:00:00.000000 JST

Hmm, is that okay in the first place? This happens since the apply
worker updates twophaesstate value of pg_subscription after setting
the origin id and before entering the apply loop. No changes in this
transaction will be replicated but an empty transaction that has
origin id and doesn't have origin lsn and time will be replicated.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Michael Paquier
Дата:
On Mon, Dec 06, 2021 at 11:24:09PM +0900, Masahiko Sawada wrote:
> On Mon, Dec 6, 2021 at 5:09 PM Michael Paquier <michael@paquier.xyz> wrote:
>> Shouldn't you check for parsed.origin_lsn instead?  The replication
>> origin is stored there as far as I read EndPrepare().
>
> Also, looking at PrepareRedoAdd(), we check the replication origin id.
> So I think that it'd be better to check origin_id for consistency.

Okay, this consistency would make sense, then.  Perhaps some comments
should be added to tell that?

>> Commit records check after XACT_XINFO_HAS_ORIGIN, but
>> xact_desc_abort() may include this information for ROLLBACK PREPARED
>> transactions so we could use the same logic as xact_desc_commit() for
>> the abort case, no?
>
> Good catch! I'll submit an updated patch.

Thanks!
--
Michael

Вложения

Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Masahiko Sawada
Дата:
On Wed, Dec 8, 2021 at 4:31 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 06, 2021 at 11:24:09PM +0900, Masahiko Sawada wrote:
> > On Mon, Dec 6, 2021 at 5:09 PM Michael Paquier <michael@paquier.xyz> wrote:
> >> Shouldn't you check for parsed.origin_lsn instead?  The replication
> >> origin is stored there as far as I read EndPrepare().
> >
> > Also, looking at PrepareRedoAdd(), we check the replication origin id.
> > So I think that it'd be better to check origin_id for consistency.
>
> Okay, this consistency would make sense, then.  Perhaps some comments
> should be added to tell that?

Agreed. I've attached an updated patch that incorporated your review
comments. Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Michael Paquier
Дата:
On Wed, Dec 08, 2021 at 05:03:30PM +0900, Masahiko Sawada wrote:
> Agreed. I've attached an updated patch that incorporated your review
> comments. Please review it.

That looks correct to me.  One thing that I have noticed while
reviewing is that we don't check XactCompletionApplyFeedback() in
xact_desc_commit(), which would happen if a transaction needs to do
a remote_apply on a standby.  synchronous_commit is a user-settable
parameter, so it seems to me that it could be useful for debugging?

That's not related to your patch, but while we are looking at the
area..
--
Michael

Вложения

Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Masahiko Sawada
Дата:
On Thu, Dec 9, 2021 at 4:02 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Dec 08, 2021 at 05:03:30PM +0900, Masahiko Sawada wrote:
> > Agreed. I've attached an updated patch that incorporated your review
> > comments. Please review it.
>
> That looks correct to me.  One thing that I have noticed while
> reviewing is that we don't check XactCompletionApplyFeedback() in
> xact_desc_commit(), which would happen if a transaction needs to do
> a remote_apply on a standby.  synchronous_commit is a user-settable
> parameter, so it seems to me that it could be useful for debugging?
>

Agreed.

Thank you for updating the patch. The patch looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: Make pg_waldump report replication origin ID, LSN, and timestamp.

От
Michael Paquier
Дата:
On Thu, Dec 09, 2021 at 05:42:56PM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch. The patch looks good to me.

Done this way.  Please note that while testing this patch, I have
found a completely different issue.  I'll spawn a thread about that in
a minute..
--
Michael

Вложения