Thread: Dropping publication breaks logical replication

Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
Hi Vignesh, Amit,
We encountered a situation where a customer dropped a publication
accidentally and that broke logical replication in an irrecoverable
manner. This is PG 15.3 but the team confirmed that the behaviour is
reproducible with PG 17 as well.

When a WAL sender processes a WAL record recording a change in
publication, it ends up calling LoadPublication() which throws an
error if a publication mentioned in START_REPLICATION command is not
found. The downstream tries to reconnect but the WAL sender again
repeats the same process going in an error loop. Creating the
publication does not help since WAL sender will always encounter the
WAL record dropping the publication first.
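
The loop described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: `Record`, `PublicationMissing`, `run_walsender` and `retry` are hypothetical names, and treating the catalog as "whatever the WAL has said so far" is a simplifying assumption about how decoding sees publications at each WAL position.

```python
from dataclasses import dataclass


class PublicationMissing(Exception):
    """Stands in for the ERROR raised when a requested publication is not found."""


@dataclass
class Record:
    lsn: int
    kind: str  # "change", "drop_pub" or "create_pub"
    pub: str


def run_walsender(wal, start_lsn, requested_pubs, initial_catalog):
    """Replay WAL from start_lsn; return the last LSN, or raise on a missing publication."""
    catalog = set(initial_catalog)  # publications visible at start_lsn (assumption)
    for rec in wal:
        if rec.lsn < start_lsn:
            continue
        if rec.kind == "drop_pub":
            catalog.discard(rec.pub)
        elif rec.kind == "create_pub":
            catalog.add(rec.pub)
        if rec.kind != "change":
            # A publication change forces the requested publications to be re-validated.
            missing = [p for p in requested_pubs if p not in catalog]
            if missing:
                raise PublicationMissing(
                    f"publication {missing[0]!r} does not exist at LSN {rec.lsn}"
                )
    return wal[-1].lsn


wal = [
    Record(1, "change", "pub1"),
    Record(2, "drop_pub", "pub1"),
    Record(3, "change", "pub1"),
    Record(4, "create_pub", "pub1"),  # re-created later, but never reached
    Record(5, "change", "pub1"),
]


def retry(attempts):
    """Reconnect repeatedly; in this model the restart point stays before the drop record."""
    confirmed_lsn = 1
    errors = []
    for _ in range(attempts):
        try:
            confirmed_lsn = run_walsender(wal, confirmed_lsn, ["pub1"], {"pub1"})
        except PublicationMissing as exc:
            errors.append(str(exc))
    return confirmed_lsn, errors


confirmed_lsn, errors = retry(3)  # every attempt stops at the drop record (LSN 2)
```

In this model the record at LSN 4 that re-creates the publication is never reached, which mirrors why creating the publication again does not help.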

There are ways out of this situation, but they are not always clean:
1. Remove the publication from the subscription, run logical replication until
it passes the point where the publication was added back, then add the
publication to the subscription again and continue. It's not always possible
to know when the publication was added back, so these steps are tedious or
next to impossible to apply.
2. Reseed the replication slot, which involves copying all the data
again and is not feasible for large databases.
3. Skip the transaction which dropped the publication. This will
work if DROP PUBLICATION was the only thing in that transaction, but
not otherwise. Confirming that is tricky and requires some expert
help.

In PG 18 onwards, this behaviour is fixed by throwing a WARNING
instead of an error. In the relevant thread [1] where the fix to PG 18
was discussed, backpatching was also discussed. Back then it was
deferred because of lack of field reports. But we are seeing this
situation now. So maybe it's time to backpatch the fix. Further PG 15
documentation mentions that
https://www.postgresql.org/docs/15/sql-createsubscription.html. So the
users will expect that their logical replication will not be affected
(except for the data published by the publication) if a publication is
dropped or does not exist. So, backpatching the change would make the
behaviour compatible with the documentation.
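
The difference between the two behaviours can be sketched as follows. This is a toy model under stated assumptions, not the server code: `load_publications`, `stream`, the `missing_ok` parameter, and the message texts are all illustrative, and the real WARNING wording may differ.

```python
import logging

logger = logging.getLogger("walsender_sketch")


def load_publications(requested, catalog, missing_ok):
    """Return the requested publications that exist in the catalog.

    missing_ok=False models the older behaviour (raise an error);
    missing_ok=True models the PG 18 behaviour described above (warn and skip).
    """
    found = []
    for name in requested:
        if name in catalog:
            found.append(name)
        elif missing_ok:
            # Illustrative message text only.
            logger.warning('publication "%s" not found, skipping', name)
        else:
            raise LookupError(f'publication "{name}" does not exist')
    return found


def stream(changes, requested, catalog, missing_ok):
    """Return the rows that would be sent for the publications that could be loaded."""
    active = set(load_publications(requested, catalog, missing_ok))
    return [row for pub, row in changes if pub in active]


changes = [("pub_a", "a1"), ("pub_b", "b1"), ("pub_a", "a2")]
catalog = {"pub_a"}  # pub_b has been dropped

# Replication continues, but rows belonging to pub_b are not sent.
sent = stream(changes, ["pub_a", "pub_b"], catalog, missing_ok=True)
```

With `missing_ok=False` the same call raises instead, which corresponds to the error loop; with `missing_ok=True` only the data of the missing publication is affected.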

The backport seems to be straightforward. Please let me know if you
need my help with it, should we decide to backport the fix.

-- 
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
Dilip Kumar
Date:
On Fri, Aug 1, 2025 at 10:55 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh, Amit,
> We encountered a situation where a customer dropped a publication
> accidentally and that broke logical replication in an irrecoverable
> manner. This is PG 15.3 but the team confirmed that the behaviour is
> reproducible with PG 17 as well.
>
> When a WAL sender processes a WAL record recording a change in
> publication, it ends up calling LoadPublication() which throws an
> error if a publication mentioned in START_REPLICATION command is not
> found. The downstream tries to reconnect but the WAL sender again
> repeats the same process going in an error loop. Creating the
> publication does not help since WAL sender will always encounter the
> WAL record dropping the publication first.
>
> There are ways to come out of this situation, but not very clean always
> 1. Remove publication from subscription, run logical replication till
> it passes the point where publication was added, add the publication
> back and continue. It's not always possible to know when the
> publication was added back and thus it becomes tedious or next to
> impossible to apply these steps.
> 2. Reseeding the replication slot which involves copying all the data
> again and not feasible in case of large databases.
> 3. Skipping the transaction which dropped the publication. This will
> work if drop publication was the only thing in that transaction but
> not otherwise. Confirming that is tricky and requires some expert
> help.
>
> In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> instead of an error. In the relevant thread [1] where the fix to PG 18
> was discussed, backpatching was also discussed. Back then it was
> deferred because of lack of field reports. But we are seeing this
> situation now. So maybe it's time to backpatch the fix. Further PG 15
> documentation mentions that
> https://www.postgresql.org/docs/15/sql-createsubscription.html. So the
> users will expect that their logical replication will not be affected
> (except for the data published by the publication) if a publication is
> dropped or does not exist. So, backpatching the change would make the
> behaviour compatible with the documentation.
>
> The backport seems to be straight forward. Please let me know if you
> need my help in doing so, if we decide to backport the fix.

I think you forgot to add the link to the "relevant thread [1]".

--
Regards,
Dilip Kumar
Google



Re: Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
On Fri, Aug 1, 2025 at 11:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 10:55 AM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Hi Vignesh, Amit,
> > We encountered a situation where a customer dropped a publication
> > accidentally and that broke logical replication in an irrecoverable
> > manner. This is PG 15.3 but the team confirmed that the behaviour is
> > reproducible with PG 17 as well.
> >
> > When a WAL sender processes a WAL record recording a change in
> > publication, it ends up calling LoadPublication() which throws an
> > error if a publication mentioned in START_REPLICATION command is not
> > found. The downstream tries to reconnect but the WAL sender again
> > repeats the same process going in an error loop. Creating the
> > publication does not help since WAL sender will always encounter the
> > WAL record dropping the publication first.
> >
> > There are ways to come out of this situation, but not very clean always
> > 1. Remove publication from subscription, run logical replication till
> > it passes the point where publication was added, add the publication
> > back and continue. It's not always possible to know when the
> > publication was added back and thus it becomes tedious or next to
> > impossible to apply these steps.
> > 2. Reseeding the replication slot which involves copying all the data
> > again and not feasible in case of large databases.
> > 3. Skipping the transaction which dropped the publication. This will
> > work if drop publication was the only thing in that transaction but
> > not otherwise. Confirming that is tricky and requires some expert
> > help.
> >
> > In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> > instead of an error. In the relevant thread [1] where the fix to PG 18
> > was discussed, backpatching was also discussed. Back then it was
> > deferred because of lack of field reports. But we are seeing this
> > situation now. So maybe it's time to backpatch the fix. Further PG 15
> > documentation mentions that
> > https://www.postgresql.org/docs/15/sql-createsubscription.html. So the
> > users will expect that their logical replication will not be affected
> > (except for the data published by the publication) if a publication is
> > dropped or does not exist. So, backpatching the change would make the
> > behaviour compatible with the documentation.
> >
> > The backport seems to be straight forward. Please let me know if you
> > need my help in doing so, if we decide to backport the fix.
>
> I think you missed to add the link to the "relevant thread [1] "

Thanks for noticing. Here it is:

[1]
https://www.postgresql.org/message-id/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
Amit Kapila
Date:
On Fri, Aug 1, 2025 at 10:54 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh, Amit,
> We encountered a situation where a customer dropped a publication
> accidentally and that broke logical replication in an irrecoverable
> manner. This is PG 15.3 but the team confirmed that the behaviour is
> reproducible with PG 17 as well.
>
> When a WAL sender processes a WAL record recording a change in
> publication, it ends up calling LoadPublication() which throws an
> error if a publication mentioned in START_REPLICATION command is not
> found. The downstream tries to reconnect but the WAL sender again
> repeats the same process going in an error loop. Creating the
> publication does not help since WAL sender will always encounter the
> WAL record dropping the publication first.
>
> There are ways to come out of this situation, but not very clean always
> 1. Remove publication from subscription, run logical replication till
> it passes the point where publication was added, add the publication
> back and continue. It's not always possible to know when the
> publication was added back and thus it becomes tedious or next to
> impossible to apply these steps.
> 2. Reseeding the replication slot which involves copying all the data
> again and not feasible in case of large databases.
> 3. Skipping the transaction which dropped the publication. This will
> work if drop publication was the only thing in that transaction but
> not otherwise. Confirming that is tricky and requires some expert
> help.
>
> In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> instead of an error. In the relevant thread [1] where the fix to PG 18
> was discussed, backpatching was also discussed. Back then it was
> deferred because of lack of field reports. But we are seeing this
> situation now.
>

Thanks for the report. One more reason we were hesitant to backpatch
was that some users may expect replication to stop
in this case, as mentioned by Tomas in one of his emails [1] (see the
paragraph starting with "Imagine you have a subscriber ..." in his email).
Since it could be perceived as a behavior change, we thought it better
to make it a HEAD-only change.

Now, seeing this report, it seems the customer(s) are probably okay with
skipping a missing publication and letting replication continue. So, we should
consider backpatching this change, but it would be better if a few more
people could share their opinion on this matter.

[1] - https://www.postgresql.org/message-id/dc08add3-10a8-738b-983a-191c7406707b%40enterprisedb.com

--
With Regards,
Amit Kapila.



Re: Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 10:54 AM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Hi Vignesh, Amit,
> > We encountered a situation where a customer dropped a publication
> > accidentally and that broke logical replication in an irrecoverable
> > manner. This is PG 15.3 but the team confirmed that the behaviour is
> > reproducible with PG 17 as well.
> >
> > When a WAL sender processes a WAL record recording a change in
> > publication, it ends up calling LoadPublication() which throws an
> > error if a publication mentioned in START_REPLICATION command is not
> > found. The downstream tries to reconnect but the WAL sender again
> > repeats the same process going in an error loop. Creating the
> > publication does not help since WAL sender will always encounter the
> > WAL record dropping the publication first.
> >
> > There are ways to come out of this situation, but not very clean always
> > 1. Remove publication from subscription, run logical replication till
> > it passes the point where publication was added, add the publication
> > back and continue. It's not always possible to know when the
> > publication was added back and thus it becomes tedious or next to
> > impossible to apply these steps.
> > 2. Reseeding the replication slot which involves copying all the data
> > again and not feasible in case of large databases.
> > 3. Skipping the transaction which dropped the publication. This will
> > work if drop publication was the only thing in that transaction but
> > not otherwise. Confirming that is tricky and requires some expert
> > help.
> >
> > In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> > instead of an error. In the relevant thread [1] where the fix to PG 18
> > was discussed, backpatching was also discussed. Back then it was
> > deferred because of lack of field reports. But we are seeing this
> > situation now.
> >
>
> Thanks for the report. One more reason we were hesitant to backpatch
> was that it is possible that some users may expect replication to stop
> in this case as mentioned by Tomas in one of his emails [1] ("See the
> para starting with "Imagine you have a subscriber ..." in his email").
> We thought, as it could be perceived as a behavior change, so better
> to do it as a HEAD only change.

Yes, that's a valid concern. We have to choose between missing some
changes because of a missing publication and an irrecoverable error. The
latter seems more serious. The former is covered by our documentation,
maybe indirectly, and we throw a WARNING. So accepting the former seems
the better option. Maybe we could do a better job of documenting this.

I wish we could pass a "missing_ok" flag with the START_REPLICATION
command, but we can't do that in the back branches, and we didn't
do that when we committed the fix to PG 18.

>
> Now, seeing this report, it seems the customer(s) are probably okay to
> skip a missing publication and let replication continue. So, we should
> consider backpatching this change but it would be better if few more
> people can share their opinion on this matter.

Including Tomas for his opinion. Who else do you think can provide an
opinion based on experience?

Thinking aloud about what you suggest in [1] in the same thread: the
problem there is that the upstream cannot access downstream subscriptions and
has no control over them, so it cannot prevent a publication from being
dropped even if it's being used by a subscription. The most we can do is
disallow dropping a publication that is being used by a running WAL sender,
by locking the publication in use somehow. However, even that won't help
much. Assume that a WAL sender disconnects for some other reason,
and then the publication gets dropped. We end up in the same
situation.

[1] https://www.postgresql.org/message-id/CAA4eK1K40xhObN1MWO7%3DrzrJmo%2BoQ048O8drZ86-F7artVvwQQ%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
vignesh C
Date:
On Fri, 1 Aug 2025 at 10:54, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh, Amit,
> We encountered a situation where a customer dropped a publication
> accidentally and that broke logical replication in an irrecoverable
> manner. This is PG 15.3 but the team confirmed that the behaviour is
> reproducible with PG 17 as well.
>
> When a WAL sender processes a WAL record recording a change in
> publication, it ends up calling LoadPublication() which throws an
> error if a publication mentioned in START_REPLICATION command is not
> found. The downstream tries to reconnect but the WAL sender again
> repeats the same process going in an error loop. Creating the
> publication does not help since WAL sender will always encounter the
> WAL record dropping the publication first.
>
> There are ways to come out of this situation, but not very clean always
> 1. Remove publication from subscription, run logical replication till
> it passes the point where publication was added, add the publication
> back and continue. It's not always possible to know when the
> publication was added back and thus it becomes tedious or next to
> impossible to apply these steps.
> 2. Reseeding the replication slot which involves copying all the data
> again and not feasible in case of large databases.
> 3. Skipping the transaction which dropped the publication. This will
> work if drop publication was the only thing in that transaction but
> not otherwise. Confirming that is tricky and requires some expert
> help.
>
> In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> instead of an error. In the relevant thread [1] where the fix to PG 18
> was discussed, backpatching was also discussed. Back then it was
> deferred because of lack of field reports. But we are seeing this
> situation now. So maybe it's time to backpatch the fix. Further PG 15
> documentation mentions that
> https://www.postgresql.org/docs/15/sql-createsubscription.html. So the
> users will expect that their logical replication will not be affected
> (except for the data published by the publication) if a publication is
> dropped or does not exist. So, backpatching the change would make the
> behaviour compatible with the documentation.
>
> The backport seems to be straight forward. Please let me know if you
> need my help in doing so, if we decide to backport the fix.

Now that this has been reported on the back branches, we should
consider whether it's appropriate to backport the fix. Here are the
patches prepared for the back branches.

Regards,
Vignesh

Attachments

Re: Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
Hi Vignesh,
Thanks for the patches.

On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:

> >
> > The backport seems to be straight forward. Please let me know if you
> > need my help in doing so, if we decide to backport the fix.
>
> Now that this has been reported on the back branches, we should
> consider whether it's appropriate to backport the fix. Here are the
> patches prepared for the back branches.

The patches for PG14 and later do not test that DROP PUBLICATION does not
disrupt the publication. I think we need to test that as well.

The PG13 patch, OTOH, does test DROP PUBLICATION. That's good. I think it has
a race condition because +my $offset = -s $node_publisher->logfile; is
executed after dropping the publication. If some background change
triggers publication validation before the file offset is captured, we
might miss the WARNING and the test will fail. Capturing the
offset before dropping the publication may be safer: the publication
exists until it is dropped, so the log file cannot contain the WARNING
when the offset is captured.
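
The ordering issue can be sketched outside the TAP framework. The following is a toy model, not the actual test: `LogFile`, `run_test` and the `WARNING` text are hypothetical, and the model assumes the WARNING is written only once.

```python
class LogFile:
    """In-memory stand-in for the publisher's server log."""

    def __init__(self):
        self.text = ""

    def append(self, line):
        self.text += line + "\n"

    def size(self):
        return len(self.text)

    def contains_after(self, offset, needle):
        return needle in self.text[offset:]


WARNING = "WARNING: publication missing"  # illustrative text only


def run_test(capture_before_drop, early_warning):
    """Return True if the test would find the WARNING after its captured offset."""
    log = LogFile()
    log.append("LOG: startup")
    if capture_before_drop:
        offset = log.size()
    # DROP PUBLICATION happens at this point.
    if early_warning:
        # Background activity triggers publication validation right after the drop.
        log.append(WARNING)
    if not capture_before_drop:
        offset = log.size()
    if not early_warning:
        # Otherwise the WARNING appears only after the test makes its own change.
        log.append(WARNING)
    return log.contains_after(offset, WARNING)


# Offset captured after the drop misses an early WARNING; captured before, it does not.
late_capture_sees_warning = run_test(capture_before_drop=False, early_warning=True)
early_capture_sees_warning = run_test(capture_before_drop=True, early_warning=True)
```

Capturing the offset first makes the result independent of when the WARNING is written relative to the drop.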

--
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
vignesh C
Date:
On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh,
> Thanks for the patches.
>
> On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > >
> > > The backport seems to be straight forward. Please let me know if you
> > > need my help in doing so, if we decide to backport the fix.
> >
> > Now that this has been reported on the back branches, we should
> > consider whether it's appropriate to backport the fix. Here are the
> > patches prepared for the back branches.
>
> PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> the publication. I think we need to test that as well.

Currently, the test in all branches except PG13 is the same test
used in the master branch. For PG13, since there was no existing
subscription, I modified the test slightly to accommodate that. If I
address your comment, the test in master and the back branches
will differ. Should we keep the test the same as in master, or is
it OK to address your comment above and let them differ?

> PG13 tests DROP PUBLICATION OTOH. That's good. I think it has a race
> condition because +my $offset = -s $node_publisher->logfile; is
> executed after dropping the publication. If some background change
> triggers publication validation before capturing the file offset, we
> might miss the WARNING and the test will fail. Instead capturing
> offset before dropping publication may be safer - the publication
> exists till it dropped, so the log file cannot have WARNING in there
> when offset is captured.

I will handle this in the next version.

Regards,
Vignesh



Re: Dropping publication breaks logical replication

From
vignesh C
Date:
On Mon, 4 Aug 2025 at 16:08, vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Hi Vignesh,
> > Thanks for the patches.
> >
> > On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > >
> > > > The backport seems to be straight forward. Please let me know if you
> > > > need my help in doing so, if we decide to backport the fix.
> > >
> > > Now that this has been reported on the back branches, we should
> > > consider whether it's appropriate to backport the fix. Here are the
> > > patches prepared for the back branches.
> >
> > PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> > the publication. I think we need to test that as well.
>
> Currently, the test across all branches except PG13 is the same test
> used in the master branch. For PG13, since there was no existing
> subscription, I modified the test slightly to accommodate that. If I
> handle the comment you suggest, the test in master and the backbranch
> will be different. Should we keep the test similar to the master or is
> it ok to address your above comment and keep it different?
>
> > PG13 tests DROP PUBLICATION OTOH. That's good. I think it has a race
> > condition because +my $offset = -s $node_publisher->logfile; is
> > executed after dropping the publication. If some background change
> > triggers publication validation before capturing the file offset, we
> > might miss the WARNING and the test will fail. Instead capturing
> > offset before dropping publication may be safer - the publication
> > exists till it dropped, so the log file cannot have WARNING in there
> > when offset is captured.
>
> I will handle this in the next version.

This is addressed in the attached patch. Only the PG13 branch patch is
updated; there is no change in the patches for the other branches.

Regards,
Vignesh

Attachments

Re: Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
On Mon, Aug 4, 2025 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Hi Vignesh,
> > Thanks for the patches.
> >
> > On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > >
> > > > The backport seems to be straight forward. Please let me know if you
> > > > need my help in doing so, if we decide to backport the fix.
> > >
> > > Now that this has been reported on the back branches, we should
> > > consider whether it's appropriate to backport the fix. Here are the
> > > patches prepared for the back branches.
> >
> > PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> > the publication. I think we need to test that as well.
>
> Currently, the test across all branches except PG13 is the same test
> used in the master branch. For PG13, since there was no existing
> subscription, I modified the test slightly to accommodate that. If I
> handle the comment you suggest, the test in master and the backbranch
> will be different. Should we keep the test similar to the master or is
> it ok to address your above comment and keep it different?

IMO we should modify the test on master as well and either backpatch
both commits or backpatch after combining those two commits.

--
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
Amit Kapila
Date:
On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Now, seeing this report, it seems the customer(s) are probably okay to
> > skip a missing publication and let replication continue. So, we should
> > consider backpatching this change but it would be better if few more
> > people can share their opinion on this matter.
>
> Including Tomas for his opinion. Who else do you think can provide an
> opinion based on experience?
>

I don't have any particular names in mind, but Dilip's and Sawada-San's
names are listed as reviewers in the commit [1], so it would be good
to hear their thoughts on this.

Please note that this behavior dates from the time logical replication
was introduced, so we need to be a bit careful about changing the
behavior in back branches.

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7c99dc587a010a0c40d72a0e435111ca7a371c02

--
With Regards,
Amit Kapila.



Re: Dropping publication breaks logical replication

From
Dilip Kumar
Date:
On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Now, seeing this report, it seems the customer(s) are probably okay to
> > > skip a missing publication and let replication continue. So, we should
> > > consider backpatching this change but it would be better if few more
> > > people can share their opinion on this matter.
> >
> > Including Tomas for his opinion. Who else do you think can provide an
> > opinion based on experience?
> >
>
> I don't have any particular names in mind but Dilip and Sawada-San
> names are listed as reviewers in the commit [1], so it would be good
> to see what are their thoughts on this.
>
> Please note that this behavior is from the time logical replication
> was introduced, so we need to be a bit careful in changing the
> behavior in backbranches.
>
> [1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7c99dc587a010a0c40d72a0e435111ca7a371c02

I believe we should backpatch this fix. The old behavior doesn't seem
intentional, and IMHO users are unlikely to be relying on it. That's
just my view, though; someone may come across a real-world use
case where a user depends on that behavior. Although we
initially didn't backpatch it because it changed existing behavior and
we hadn't received any complaints, a recent complaint suggests that it's
now better to fix the back branches as well.


--
Regards,
Dilip Kumar
Google



Re: Dropping publication breaks logical replication

From
vignesh C
Date:
On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Vignesh,
> Thanks for the patches.
>
> On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:
>
> > >
> > > The backport seems to be straight forward. Please let me know if you
> > > need my help in doing so, if we decide to backport the fix.
> >
> > Now that this has been reported on the back branches, we should
> > consider whether it's appropriate to backport the fix. Here are the
> > patches prepared for the back branches.
>
> PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> the publication. I think we need to test that as well.

The attached v3 patch includes the changes for this.

Regards,
Vignesh

Attachments

Re: Dropping publication breaks logical replication

From
Ashutosh Bapat
Date:
On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Now, seeing this report, it seems the customer(s) are probably okay to
> > > skip a missing publication and let replication continue. So, we should
> > > consider backpatching this change but it would be better if few more
> > > people can share their opinion on this matter.
> >
> > Including Tomas for his opinion. Who else do you think can provide an
> > opinion based on experience?
> >
>
> I don't have any particular names in mind but Dilip and Sawada-San
> names are listed as reviewers in the commit [1], so it would be good
> to see what are their thoughts on this.
>
> Please note that this behavior is from the time logical replication
> was introduced, so we need to be a bit careful in changing the
> behavior in backbranches.

Agreed.

Only Dilip has expressed an opinion so far. Haven't heard from others,
so can't guess what their opinions are.

If we are ok backpatching it, I will review Vignesh's patches thoroughly.

--
Best Wishes,
Ashutosh Bapat



Re: Dropping publication breaks logical replication

From
Amit Kapila
Date:
On Fri, Aug 8, 2025 at 5:19 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > Now, seeing this report, it seems the customer(s) are probably okay to
> > > > skip a missing publication and let replication continue. So, we should
> > > > consider backpatching this change but it would be better if few more
> > > > people can share their opinion on this matter.
> > >
> > > Including Tomas for his opinion. Who else do you think can provide an
> > > opinion based on experience?
> > >
> >
> > I don't have any particular names in mind but Dilip and Sawada-San
> > names are listed as reviewers in the commit [1], so it would be good
> > to see what are their thoughts on this.
> >
> > Please note that this behavior is from the time logical replication
> > was introduced, so we need to be a bit careful in changing the
> > behavior in backbranches.
>
> Agreed.
>
> Only Dilip has expressed an opinion so far. Haven't heard from others,
> so can't guess what their opinions are.
>

Yeah, let's wait for a few more days. Even if we decide to backpatch
it, let's target the next minor release.

--
With Regards,
Amit Kapila.



Re: Dropping publication breaks logical replication

From
Masahiko Sawada
Date:
On Fri, Aug 8, 2025 at 5:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 5:19 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > Now, seeing this report, it seems the customer(s) are probably okay to
> > > > > skip a missing publication and let replication continue. So, we should
> > > > > consider backpatching this change but it would be better if few more
> > > > > people can share their opinion on this matter.
> > > >
> > > > Including Tomas for his opinion. Who else do you think can provide an
> > > > opinion based on experience?
> > > >
> > >
> > > I don't have any particular names in mind but Dilip and Sawada-San
> > > names are listed as reviewers in the commit [1], so it would be good
> > > to see what are their thoughts on this.
> > >
> > > Please note that this behavior is from the time logical replication
> > > was introduced, so we need to be a bit careful in changing the
> > > behavior in backbranches.
> >
> > Agreed.
> >
> > Only Dilip has expressed an opinion so far. Haven't heard from others,
> > so can't guess what their opinions are.
> >
>
> Yeah, let's wait for a few more days. Even if we decide to backpatch
> it, let's target the next minor release.

I'm personally hesitant to backpatch this change. I'm not sure if
there are any users who are aware of this behavior and depend on it, but
it seems to me that for users who update to a new minor version with
this change, the problem will simply change from replication
stopping due to missing publications to replication continuing while
they almost silently lose some changes (users often don't see
warnings in server logs). I guess dealing with the latter problem
would be more difficult.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com