Обсуждение: Re: pgsql: Preserve conflict-relevant data during logical replication.
On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> The fix looks good to me. I'll push your patch in sometime.
The tests in this patch are insufficient to prove that this logic
works properly. I tried with this patch:
diff --git a/src/backend/access/transam/twophase.c
b/src/backend/access/transam/twophase.c
index 7918176fc58..7c1d0ef07b5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2325,6 +2325,16 @@ RecordTransactionCommitPrepared(TransactionId xid,
TimestampTz committs;
bool replorigin;
+ /*
+ * Note it is important to set committs value after marking ourselves as
+ * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is because
+ * we want to ensure all transactions that have acquired commit timestamp
+ * are finished before we allow the logical replication client to advance
+ * its xid which is used to hold back dead rows for conflict detection.
+ * See comments atop worker.c.
+ */
+ committs = GetCurrentTimestamp();
+
/*
* Are we using the replication origins feature? Or, in other words, are
* we replaying remote actions?
@@ -2344,16 +2354,6 @@ RecordTransactionCommitPrepared(TransactionId xid,
*/
pg_write_barrier();
- /*
- * Note it is important to set committs value after marking ourselves as
- * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is because
- * we want to ensure all transactions that have acquired commit timestamp
- * are finished before we allow the logical replication client to advance
- * its xid which is used to hold back dead rows for conflict detection.
- * See comments atop worker.c.
- */
- committs = GetCurrentTimestamp();
-
/*
* Emit the XLOG commit record. Note that we mark 2PC commits as
* potentially having AccessExclusiveLocks since we don't know whether or
If it is in fact important to acquire the commit timestamp after
setting delayChkptFlags, you'd hope this would lead to a test failure,
but it doesn't for me. I understand it probably requires an injection
point to be certain of hitting the race condition, but I think that
would be worth doing. Otherwise, if something gets broken here by
accident, it might be a long time before anyone notices.
--
Robert Haas
EDB: http://www.enterprisedb.com
RE: pgsql: Preserve conflict-relevant data during logical replication.
От
"Zhijie Hou (Fujitsu)"
Дата:
On Tuesday, September 2, 2025 11:03 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > The fix looks good to me. I'll push your patch in sometime. > > The tests in this patch are insufficient to prove that this logic works properly. I > tried with this patch: > > diff --git a/src/backend/access/transam/twophase.c > b/src/backend/access/transam/twophase.c > index 7918176fc58..7c1d0ef07b5 100644 > --- a/src/backend/access/transam/twophase.c > +++ b/src/backend/access/transam/twophase.c > @@ -2325,6 +2325,16 @@ RecordTransactionCommitPrepared(TransactionId > xid, > TimestampTz committs; > bool replorigin; > > + /* > + * Note it is important to set committs value after marking ourselves as > + * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is > because > + * we want to ensure all transactions that have acquired commit > timestamp > + * are finished before we allow the logical replication client to advance > + * its xid which is used to hold back dead rows for conflict detection. > + * See comments atop worker.c. > + */ > + committs = GetCurrentTimestamp(); > + > /* > * Are we using the replication origins feature? Or, in other words, are > * we replaying remote actions? > @@ -2344,16 +2354,6 @@ RecordTransactionCommitPrepared(TransactionId > xid, > */ > pg_write_barrier(); > > - /* > - * Note it is important to set committs value after marking ourselves as > - * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is > because > - * we want to ensure all transactions that have acquired commit > timestamp > - * are finished before we allow the logical replication client to advance > - * its xid which is used to hold back dead rows for conflict detection. > - * See comments atop worker.c. > - */ > - committs = GetCurrentTimestamp(); > - > /* > * Emit the XLOG commit record. Note that we mark 2PC commits as > * potentially having AccessExclusiveLocks since we don't know whether > or > > If it is in fact important to acquire the commit timestamp after setting > delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for > me. I understand it probably requires an injection point to be certain of hitting > the race condition, but I think that would be worth doing. Otherwise, if > something gets broken here by accident, it might be a long time before anyone > notices. Thanks for pointing it out! I agree that adding a test is valuable to mitigate the risk of future code changes. We added a similar safeguard for the RecordTransactionCommit() function by adding Assert(xactStopTimestamp == 0) after marking the DELAY_CHKPT_xxx flag, and did not do any precautionary check for RecordTransactionCommitPrepared as the code to acquire the timestamp and setting flag was close by and had explicit comments. I'll prepare a test case that uses the injection point and share it in the original thread. Best Regards, Hou zj
On Wed, Sep 3, 2025 at 5:10 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, September 2, 2025 11:03 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com> > > wrote: > > > > If it is in fact important to acquire the commit timestamp after setting > > delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for > > me. I understand it probably requires an injection point to be certain of hitting > > the race condition, but I think that would be worth doing. Otherwise, if > > something gets broken here by accident, it might be a long time before anyone > > notices. > > Thanks for pointing it out! > > I agree that adding a test is valuable to mitigate the risk of future code > changes. We added a similar safeguard for the RecordTransactionCommit() function > by adding Assert(xactStopTimestamp == 0) after marking the DELAY_CHKPT_xxx flag, > and did not do any precautionary check for RecordTransactionCommitPrepared as > the code to acquire the timestamp and setting flag was close by and had explicit > comments. > > I'll prepare a test case that uses the injection point and share it in the > original thread. > The test for this case is added in commit 6456c6e2c4ad1cf9752e09cce37bfcfe2190c5e0. -- With Regards, Amit Kapila.
On Wed, Sep 10, 2025 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > The test for this case is added in commit > 6456c6e2c4ad1cf9752e09cce37bfcfe2190c5e0. Thanks! -- Robert Haas EDB: http://www.enterprisedb.com