Discussion: Re: Documentation update of wal_retrieve_retry_interval to mention table sync worker

From: Peter Smith

On Thu, Dec 26, 2024 at 1:37 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Hi,
>
> Currently, we restart the table synchronization worker after the
> duration specified by wal_retrieve_retry_interval following the last
> failure. While this behavior is documented for apply workers, it is
> not mentioned for table synchronization workers. I believe this detail
> should be included in the documentation for table synchronization
> workers as well. Attached is a patch to address this omission.
>
> Regards,
> Vignesh

Hi Vignesh,

Here are some review comments for your v1 patch.

+1 to enhance the documentation.

======

1.
        <para>
         In logical replication, this parameter also limits how often a failing
-        replication apply worker will be respawned.
+        replication apply worker, and table synchronization worker will be
+        respawned.
        </para>

/, and/or/


SUGGESTION
In logical replication, this parameter also limits how often a failing
replication apply worker or table synchronization worker will be
respawned.

======

2.
I think the reader might never be aware of any of this (throttled
relaunch) behaviour unless they accidentally stumble across the docs
for this GUC, so IMO this information should be mentioned elsewhere --
wherever the tablesync worker errors are documented. But, TBH, I can't
find anywhere in the PostgreSQL docs where it even mentions
re-launching failed tablesync workers!

Anyway, I think it might be good to include such information in some
suitable place (maybe in the CREATE SUBSCRIPTION notes? or maybe in
Chapter 29?) to say something like...

SUGGESTION:
In practice, if a table synchronization worker fails during logical
replication, the apply worker detects the failure and attempts to
respawn the table synchronization worker to continue the
synchronization process. This behaviour ensures that transient errors
do not permanently disrupt the replication setup. See also
wal_retrieve_retry_interval.
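
To make the throttled relaunch concrete, here is a minimal sketch of the idea in Python. It is only a model, not PostgreSQL source: the class `ThrottledRespawner`, its method `try_launch`, and the per-table bookkeeping are hypothetical names chosen for illustration, and the sketch assumes the interval is measured from the previous launch attempt for the same table. The 5000 ms default mirrors the documented default of wal_retrieve_retry_interval.

```python
import time
from typing import Callable, Dict


class ThrottledRespawner:
    """Hypothetical model: at most one launch per table per retry interval."""

    def __init__(self, retry_interval_ms: int = 5000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Stand-in for wal_retrieve_retry_interval (milliseconds).
        self.retry_interval_s = retry_interval_ms / 1000.0
        self.clock = clock
        # Assumption: the last launch time is remembered per table.
        self.last_start: Dict[str, float] = {}

    def try_launch(self, table: str) -> bool:
        """Return True if a worker for `table` may be (re)launched now."""
        now = self.clock()
        last = self.last_start.get(table)
        if last is not None and now - last < self.retry_interval_s:
            # Too soon after the previous attempt; skip this round.
            return False
        self.last_start[table] = now
        return True


if __name__ == "__main__":
    fake_now = [0.0]
    respawner = ThrottledRespawner(5000, clock=lambda: fake_now[0])
    print(respawner.try_launch("public.t1"))  # first launch is allowed
    fake_now[0] = 2.0
    print(respawner.try_launch("public.t1"))  # retry inside the interval
```

In this model a worker that keeps failing is retried at most once per interval, while a different table is not held back by it.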

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Patch v3-0001 LGTM

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Mon, 13 Jan 2025 at 12:33, vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 6 Jan 2025 at 08:47, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Vignesh,
> >
> > Some review comments for your v2 patch.
> >
> > ======
> > doc/src/sgml/logical-replication.sgml
> >
> > AFAICT the only difference you made is changing:
> > FROM "a special kind of apply process"
> > TO "a special kind of table synchronization worker process".
> >
> > There is only ONE kind of tablesync process, so I think saying "a
> > special kind of table synchronization worker process" seems
> > misleading. I also thought maybe it is better to mention that this is
> > PER table.
> >
> > SUGGESTION:
> > ... a special table synchronization worker process per table.
>
> Thanks, the updated v3 version patch has the changes for the same.
>
Hi Vignesh,

I reviewed the v3 patch. And it looks good to me.

Thanks and Regards,
Shlok Kyal