Re: logical replication empty transactions

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: logical replication empty transactions
Дата
Msg-id CAHut+PvVC8=qhxuhk3AWsK6Oo_hgLfvBiPQXJw=04acEcKFVFg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: logical replication empty transactions  (Ajin Cherian <itsajin@gmail.com>)
Список pgsql-hackers
On Fri, Mar 4, 2022 at 12:41 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> I have split the patch into two. I have kept the logic of skipping
> streaming changes in the second patch.
> I will work on the second patch once we can figure out a solution for
> the COMMIT PREPARED after restart problem.
>

Please see below my review comments for the first patch only (v23-0001)

======

1. Patch failed to apply cleanly - whitespace warnings.

git apply ../patches_misc/v23-0001-Skip-empty-transactions-for-logical-replication.patch
../patches_misc/v23-0001-Skip-empty-transactions-for-logical-replication.patch:68:
trailing whitespace.
 * change in a transaction is processed. This makes it possible
warning: 1 line adds whitespace errors.

~~~

2. src/backend/replication/pgoutput/pgoutput.c - typedef struct PGOutputTxnData

+/*
+ * Maintain a per-transaction level variable to track whether the
+ * transaction has sent BEGIN. BEGIN is only sent when the first
+ * change in a transaction is processed. This makes it possible
+ * to skip transactions that are empty.
+ */
+typedef struct PGOutputTxnData

I felt that this comment is describing details all about its bool
member but I think maybe it should be describing something also about
the structure itself (because this is the structure comment). E.g. it
should say about it only being allocated by the pgoutput_begin_txn()
and it is accessible via txn->output_plugin_private. Maybe also say
this has subtle implications if this is NULL then it means the tx
can't be 2PC etc...

~~~

3. src/backend/replication/pgoutput/pgoutput.c - pgoutput_send_begin

+/*
+ * Send BEGIN.
+ *
+ * This is where the BEGIN is actually sent. This is called while processing
+ * the first change of the transaction.
+ */
+static void
+pgoutput_send_begin(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)

IMO there is no need to repeat "This is where the BEGIN is actually
sent.", because "Send BEGIN." already said the same thing :-)

~~~

4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_commit_txn

+ /*
+ * If a BEGIN message was not yet sent, then it means there were no relevant
+ * changes encountered, so we can skip the COMMIT message too.
+ */
+ sent_begin_txn = txndata->sent_begin_txn;
+ txn->output_plugin_private = NULL;
+ OutputPluginUpdateProgress(ctx, !sent_begin_txn);
+
+ pfree(txndata);

Not quite sure why this pfree is positioned where it is (after that
function call). I felt this should be a couple of lines up so txndata
is freed as soon as you had no more use for it (i.e. after you copied
the bool from it)

~~~

5. src/backend/replication/pgoutput/pgoutput.c - maybe_send_schema

@@ -594,6 +658,13 @@ maybe_send_schema(LogicalDecodingContext *ctx,
  if (schema_sent)
  return;

+   /* set up txndata */
+   txndata = toptxn->output_plugin_private;

The comment does quite feel right. Nothing is "setting up" anything.
Really, all this does is assign a reference to the tx private data.
Probably better with no comment at all?

~~~

6. src/backend/replication/pgoutput/pgoutput.c - maybe_send_schema

I observed that every call to the maybe_send_schema function also has
adjacent code that already/always is checking to call
pgoutput_send_begin_tx function.

So then I am wondering is the added logic to the maybe_send_schema
even needed at all? It looks a bit redundant. Thoughts?

~~~

7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change

@@ -1141,6 +1212,7 @@ pgoutput_change(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
  Relation relation, ReorderBufferChange *change)
 {
  PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
+ PGOutputTxnData *txndata = (PGOutputTxnData *) txn->output_plugin_private;
  MemoryContext old;

Maybe if is worth deferring this assignment until after the row-filter
check. Otherwise, you are maybe doing it for nothing and IIRC this is
hot code so the less you do here the better. OTOH a single assignment
probably amounts to almost nothing.

~~~

8. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change

@@ -1354,6 +1438,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
    int nrelations, Relation relations[], ReorderBufferChange *change)
 {
  PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
+ PGOutputTxnData *txndata;
  MemoryContext old;

This variable declaration should be done later in the block where it
is assigned.

~~~

9. src/backend/replication/pgoutput/pgoutput.c - suggestion

I notice there is quite a few places in the patch that look like:

+ txndata = (PGOutputTxnData *) txn->output_plugin_private;
+
+ /* Send BEGIN if we haven't yet */
+ if (txndata && !txndata->sent_begin_txn)
+ pgoutput_send_begin(ctx, txn);
+

It might be worth considering encapsulating all those in a helper function like:
pgoutput_maybe_send_begin(ctx, txn)

It would certainly be a lot tidier.

~~~

10. src/backend/replication/syncrep.c - SyncRepEnabled

@@ -539,6 +538,15 @@ SyncRepReleaseWaiters(void)
 }

 /*
+ * Check if synchronous replication is enabled
+ */
+bool
+SyncRepEnabled(void)

Missing period for that function comment.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: wal_compression=zstd
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: pg_tablespace_location() failure with allow_in_place_tablespaces