Обсуждение: Re: pgsql: Preserve conflict-relevant data during logical replication.
On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > The fix looks good to me. I'll push your patch in sometime. The tests in this patch are insufficient to prove that this logic works properly. I tried with this patch: diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 7918176fc58..7c1d0ef07b5 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -2325,6 +2325,16 @@ RecordTransactionCommitPrepared(TransactionId xid, TimestampTz committs; bool replorigin; + /* + * Note it is important to set committs value after marking ourselves as + * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is because + * we want to ensure all transactions that have acquired commit timestamp + * are finished before we allow the logical replication client to advance + * its xid which is used to hold back dead rows for conflict detection. + * See comments atop worker.c. + */ + committs = GetCurrentTimestamp(); + /* * Are we using the replication origins feature? Or, in other words, are * we replaying remote actions? @@ -2344,16 +2354,6 @@ RecordTransactionCommitPrepared(TransactionId xid, */ pg_write_barrier(); - /* - * Note it is important to set committs value after marking ourselves as - * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is because - * we want to ensure all transactions that have acquired commit timestamp - * are finished before we allow the logical replication client to advance - * its xid which is used to hold back dead rows for conflict detection. - * See comments atop worker.c. - */ - committs = GetCurrentTimestamp(); - /* * Emit the XLOG commit record. Note that we mark 2PC commits as * potentially having AccessExclusiveLocks since we don't know whether or If it is in fact important to acquire the commit timestamp after setting delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for me. I understand it probably requires an injection point to be certain of hitting the race condition, but I think that would be worth doing. Otherwise, if something gets broken here by accident, it might be a long time before anyone notices. -- Robert Haas EDB: http://www.enterprisedb.com
RE: pgsql: Preserve conflict-relevant data during logical replication.
От
"Zhijie Hou (Fujitsu)"
Дата:
On Tuesday, September 2, 2025 11:03 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > The fix looks good to me. I'll push your patch in sometime. > > The tests in this patch are insufficient to prove that this logic works properly. I > tried with this patch: > > diff --git a/src/backend/access/transam/twophase.c > b/src/backend/access/transam/twophase.c > index 7918176fc58..7c1d0ef07b5 100644 > --- a/src/backend/access/transam/twophase.c > +++ b/src/backend/access/transam/twophase.c > @@ -2325,6 +2325,16 @@ RecordTransactionCommitPrepared(TransactionId > xid, > TimestampTz committs; > bool replorigin; > > + /* > + * Note it is important to set committs value after marking ourselves as > + * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is > because > + * we want to ensure all transactions that have acquired commit > timestamp > + * are finished before we allow the logical replication client to advance > + * its xid which is used to hold back dead rows for conflict detection. > + * See comments atop worker.c. > + */ > + committs = GetCurrentTimestamp(); > + > /* > * Are we using the replication origins feature? Or, in other words, are > * we replaying remote actions? > @@ -2344,16 +2354,6 @@ RecordTransactionCommitPrepared(TransactionId > xid, > */ > pg_write_barrier(); > > - /* > - * Note it is important to set committs value after marking ourselves as > - * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is > because > - * we want to ensure all transactions that have acquired commit > timestamp > - * are finished before we allow the logical replication client to advance > - * its xid which is used to hold back dead rows for conflict detection. > - * See comments atop worker.c. > - */ > - committs = GetCurrentTimestamp(); > - > /* > * Emit the XLOG commit record. Note that we mark 2PC commits as > * potentially having AccessExclusiveLocks since we don't know whether > or > > If it is in fact important to acquire the commit timestamp after setting > delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for > me. I understand it probably requires an injection point to be certain of hitting > the race condition, but I think that would be worth doing. Otherwise, if > something gets broken here by accident, it might be a long time before anyone > notices. Thanks for pointing it out! I agree that adding a test is valuable to mitigate the risk of future code changes. We added a similar safeguard for the RecordTransactionCommit() function by adding Assert(xactStopTimestamp == 0) after marking the DELAY_CHKPT_xxx flag, and did not do any precautionary check for RecordTransactionCommitPrepared as the code to acquire the timestamp and setting flag was close by and had explicit comments. I'll prepare a test case that uses the injection point and share it in the original thread. Best Regards, Hou zj
On Wed, Sep 3, 2025 at 5:10 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, September 2, 2025 11:03 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> > > wrote: > > > > If it is in fact important to acquire the commit timestamp after setting > > delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for > > me. I understand it probably requires an injection point to be certain of hitting > > the race condition, but I think that would be worth doing. Otherwise, if > > something gets broken here by accident, it might be a long time before anyone > > notices. > > Thanks for pointing it out! > > I agree that adding a test is valuable to mitigate the risk of future code > changes. We added a similar safeguard for the RecordTransactionCommit() function > by adding Assert(xactStopTimestamp == 0) after marking the DELAY_CHKPT_xxx flag, > and did not do any precautionary check for RecordTransactionCommitPrepared as > the code to acquire the timestamp and setting flag was close by and had explicit > comments. > > I'll prepare a test case that uses the injection point and share it in the > original thread. > The test for this case is added in commit 6456c6e2c4ad1cf9752e09cce37bfcfe2190c5e0. -- With Regards, Amit Kapila.
On Wed, Sep 10, 2025 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > The test for this case is added in commit > 6456c6e2c4ad1cf9752e09cce37bfcfe2190c5e0. Thanks! -- Robert Haas EDB: http://www.enterprisedb.com