Discussion: Dropping publication breaks logical replication
Hi Vignesh, Amit,

We encountered a situation where a customer accidentally dropped a publication, and that broke logical replication in an irrecoverable manner. This is PG 15.3, but the team confirmed that the behaviour is reproducible with PG 17 as well.

When a WAL sender processes a WAL record recording a change in publication, it ends up calling LoadPublication(), which throws an error if a publication mentioned in the START_REPLICATION command is not found. The downstream tries to reconnect, but the WAL sender repeats the same process and ends up in an error loop. Creating the publication does not help, since the WAL sender will always encounter the WAL record dropping the publication first.

There are ways to get out of this situation, but they are not always clean:
1. Remove the publication from the subscription, run logical replication till it passes the point where the publication was added, add the publication back and continue. It's not always possible to know when the publication was added back, and thus it becomes tedious or next to impossible to apply these steps.
2. Reseed the replication slot, which involves copying all the data again and is not feasible for large databases.
3. Skip the transaction which dropped the publication. This will work if the drop publication was the only thing in that transaction, but not otherwise. Confirming that is tricky and requires some expert help.

In PG 18 onwards, this behaviour is fixed by throwing a WARNING instead of an error. In the relevant thread [1] where the fix to PG 18 was discussed, backpatching was also discussed. Back then it was deferred because of a lack of field reports. But we are seeing this situation now. So maybe it's time to backpatch the fix. Further PG 15 documentation mentions that https://www.postgresql.org/docs/15/sql-createsubscription.html. So users will expect that their logical replication will not be affected (except for the data published by the publication) if a publication is dropped or does not exist. So, backpatching the change would make the behaviour compatible with the documentation.

The backport seems to be straightforward. Please let me know if you need my help in doing so, if we decide to backport the fix.

--
Best Wishes,
Ashutosh Bapat
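The error loop described in the message above can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not PostgreSQL code: the WAL is reduced to hypothetical `(lsn, kind, publication)` tuples, the function names `decode_once` and `run_with_retries` are invented for illustration, and every decoding attempt rebuilds the publication state by replaying the WAL from the start, which loosely mimics decoding against the catalog as it was at that point in the WAL.

```python
# Toy model only: names and record layout are illustrative assumptions.

def decode_once(wal, confirmed_lsn, requested, missing_is_error):
    """Replay the WAL once; return (confirmed_lsn, sent, warnings, error)."""
    existing = set(requested)  # assumption: all requested publications exist at LSN 0
    sent, warnings = [], []
    for lsn, kind, pub in wal:
        if kind == "drop_pub":
            existing.discard(pub)
        elif kind == "create_pub":
            existing.add(pub)
        elif lsn > confirmed_lsn:  # a data change that is not yet confirmed
            missing = [p for p in requested if p not in existing]
            if missing and missing_is_error:
                # Old behaviour: fail without advancing past this change.
                return confirmed_lsn, sent, warnings, f"{missing[0]} missing at LSN {lsn}"
            warnings.extend((lsn, p) for p in missing)
            if pub in existing and pub in requested:
                sent.append(lsn)   # change is published by an existing publication
            confirmed_lsn = lsn    # skipped or sent, the stream moves on
    return confirmed_lsn, sent, warnings, None


def run_with_retries(wal, requested, missing_is_error, attempts=3):
    """Model the subscriber reconnecting after every failure."""
    confirmed, all_sent, all_warnings, error = 0, [], [], None
    for _ in range(attempts):
        confirmed, sent, warnings, error = decode_once(
            wal, confirmed, requested, missing_is_error)
        all_sent += sent
        all_warnings += warnings
        if error is None:
            break
    return confirmed, all_sent, all_warnings, error


wal = [
    (1, "change", "pub_a"),
    (2, "drop_pub", "pub_a"),
    (3, "change", "pub_a"),      # happens while pub_a does not exist
    (4, "change", "pub_b"),
    (5, "create_pub", "pub_a"),  # recreating the publication later in the WAL
    (6, "change", "pub_a"),
]
requested = ["pub_a", "pub_b"]

print(run_with_retries(wal, requested, missing_is_error=True))
print(run_with_retries(wal, requested, missing_is_error=False))
```

In this model the error variant never gets past LSN 1, no matter how often it retries, because the recreated publication only appears after the point where decoding fails. The warning variant reaches LSN 6 but does not send the change at LSN 3, which is the trade-off discussed in the rest of the thread.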
On Fri, Aug 1, 2025 at 10:55 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> instead of an error. In the relevant thread [1] where the fix to PG 18
> was discussed, backpatching was also discussed.

I think you missed adding the link to the "relevant thread [1]".

--
Regards,
Dilip Kumar
Google
On Fri, Aug 1, 2025 at 11:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I think you missed to add the link to the "relevant thread [1] "

Thanks for noticing it. Here it is:

[1] https://www.postgresql.org/message-id/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
On Fri, Aug 1, 2025 at 10:54 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> In PG 18 onwards, this behaviour is fixed by throwing a WARNING
> instead of an error. In the relevant thread [1] where the fix to PG 18
> was discussed, backpatching was also discussed. Back then it was
> deferred because of lack of field reports. But we are seeing this
> situation now.

Thanks for the report. One more reason we were hesitant to backpatch was that some users may expect replication to stop in this case, as mentioned by Tomas in one of his emails [1] (see the paragraph starting with "Imagine you have a subscriber ..." in his email). We thought that, as it could be perceived as a behavior change, it was better to do it as a HEAD-only change.

Now, seeing this report, it seems the customer(s) are probably okay with skipping a missing publication and letting replication continue. So, we should consider backpatching this change, but it would be better if a few more people could share their opinion on this matter.

[1] - https://www.postgresql.org/message-id/dc08add3-10a8-738b-983a-191c7406707b%40enterprisedb.com

--
With Regards,
Amit Kapila.
On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Thanks for the report. One more reason we were hesitant to backpatch
> was that it is possible that some users may expect replication to stop
> in this case as mentioned by Tomas in one of his emails [1] ("See the
> para starting with "Imagine you have a subscriber ..." in his email").
> We thought, as it could be perceived as a behavior change, so better
> to do it as a HEAD only change.

Yes, that's a valid concern. We have to choose between missing some changes because of a missing publication and an irrecoverable error. The latter seems more serious. The former is covered by our documentation, maybe indirectly, and we throw a WARNING. So accepting the former (skipping the missing publication with a WARNING) seems the better option. Maybe we could do a better job of documenting this. I wish we could pass a "missing_ok" flag with the START_REPLICATION command, but we can't do that in the back branches. And we didn't do that when we committed the fix to PG 18.

> Now, seeing this report, it seems the customer(s) are probably okay to
> skip a missing publication and let replication continue. So, we should
> consider backpatching this change but it would be better if few more
> people can share their opinion on this matter.

Including Tomas for his opinion. Who else do you think can provide an opinion based on experience?

Thinking aloud about what you suggest in [1] in the same thread. The problem there is that the upstream cannot access downstream subscriptions and has no control over them, so it cannot avoid dropping a publication even if it's being used by a subscription. The most we can do is disallow dropping a publication that is being used by a running WAL sender, by locking the publication in use somehow. However, even that won't help much. Assume that a WAL sender disconnects for some other reason, followed by the publication getting dropped. We end up in the same situation.

[1] https://www.postgresql.org/message-id/CAA4eK1K40xhObN1MWO7%3DrzrJmo%2BoQ048O8drZ86-F7artVvwQQ%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
On Fri, 1 Aug 2025 at 10:54, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> The backport seems to be straight forward. Please let me know if you
> need my help in doing so, if we decide to backport the fix.

Now that this has been reported on the back branches, we should consider whether it's appropriate to backport the fix. Here are the patches prepared for the back branches.

Regards,
Vignesh
Attachments
- v1_PG13-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION.patch
- v1_PG14-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG15-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG17-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG16-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
Hi Vignesh,
Thanks for the patches.

On Sat, Aug 2, 2025 at 7:10 PM vignesh C <vignesh21@gmail.com> wrote:
> > The backport seems to be straight forward. Please let me know if you
> > need my help in doing so, if we decide to backport the fix.
>
> Now that this has been reported on the back branches, we should
> consider whether it's appropriate to backport the fix. Here are the
> patches prepared for the back branches.

The patches for PG14 and later do not test that DROP PUBLICATION does not disrupt the publication. I think we need to test that as well.

PG13 tests DROP PUBLICATION OTOH. That's good. I think it has a race condition because
+my $offset = -s $node_publisher->logfile;
is executed after dropping the publication. If some background change triggers publication validation before the file offset is captured, we might miss the WARNING and the test will fail. Capturing the offset before dropping the publication may be safer: the publication exists till it is dropped, so the log file cannot contain the WARNING when the offset is captured.

--
Best Wishes,
Ashutosh Bapat
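The ordering problem raised in the review above can be sketched in Python, independent of the actual Perl TAP test. This is a minimal illustration under stated assumptions: the log file is a temporary file, the WARNING text is a placeholder rather than the server's real message, and `log_contains_after` and `simulate` are hypothetical helpers. The simulation models the worst-case interleaving, in which the warning is written as soon as the publication is dropped.

```python
import os
import tempfile


def log_contains_after(path, pattern, offset):
    """Return True if pattern occurs in the log at or after the byte offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return pattern.encode() in f.read()


def simulate(capture_offset_before_drop):
    """Worst case: the WARNING is logged immediately after the drop."""
    with tempfile.TemporaryDirectory() as tmpdir:
        log = os.path.join(tmpdir, "publisher.log")
        with open(log, "w") as f:
            f.write("LOG: server started\n")

        def drop_publication():
            # Placeholder message; the real server text is not reproduced here.
            with open(log, "a") as f:
                f.write("WARNING: placeholder for the missing-publication warning\n")

        if capture_offset_before_drop:
            offset = os.path.getsize(log)  # like Perl's: -s $logfile
            drop_publication()
        else:
            drop_publication()
            offset = os.path.getsize(log)  # too late: the warning is behind the offset

        return log_contains_after(log, "WARNING", offset)


print(simulate(capture_offset_before_drop=True))
print(simulate(capture_offset_before_drop=False))
```

Capturing the offset first means the search window always covers everything written after the drop, so the check cannot miss the warning regardless of how quickly it is logged.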
On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> the publication. I think we need to test that as well.

Currently, the test across all branches except PG13 is the same test used in the master branch. For PG13, since there was no existing subscription, I modified the test slightly to accommodate that. If I address your comment, the test in master and the test in the back branches will be different. Should we keep the test similar to master, or is it OK to address your comment above and keep it different?

> PG13 tests DROP PUBLICATION OTOH. That's good. I think it has a race
> condition because +my $offset = -s $node_publisher->logfile; is
> executed after dropping the publication. If some background change
> triggers publication validation before capturing the file offset, we
> might miss the WARNING and the test will fail. Instead capturing
> offset before dropping publication may be safer - the publication
> exists till it dropped, so the log file cannot have WARNING in there
> when offset is captured.

I will handle this in the next version.

Regards,
Vignesh
On Mon, 4 Aug 2025 at 16:08, vignesh C <vignesh21@gmail.com> wrote:
> > PG13 tests DROP PUBLICATION OTOH. That's good. I think it has a race
> > condition because +my $offset = -s $node_publisher->logfile; is
> > executed after dropping the publication.
>
> I will handle this in the next version.

This is addressed in the attached patch. Only the PG13 branch patch is updated; there is no change in the other branch patches.

Regards,
Vignesh
Attachments
- v1_PG17-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG15-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG14-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v1_PG16-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v2_PG13-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION.patch
On Mon, Aug 4, 2025 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote:
> Currently, the test across all branches except PG13 is the same test
> used in the master branch. For PG13, since there was no existing
> subscription, I modified the test slightly to accommodate that. If I
> handle the comment you suggest, the test in master and the backbranch
> will be different. Should we keep the test similar to the master or is
> it ok to address your above comment and keep it different?

IMO we should modify the test on master as well and either backpatch both commits or backpatch after combining those two commits.

--
Best Wishes,
Ashutosh Bapat
On Fri, Aug 1, 2025 at 5:06 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> On Fri, Aug 1, 2025 at 4:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Now, seeing this report, it seems the customer(s) are probably okay to
> > skip a missing publication and let replication continue. So, we should
> > consider backpatching this change but it would be better if few more
> > people can share their opinion on this matter.
>
> Including Tomas for his opinion. Who else do you think can provide an
> opinion based on experience?

I don't have any particular names in mind, but Dilip and Sawada-San are listed as reviewers in the commit [1], so it would be good to hear their thoughts on this.

Please note that this behavior dates from the time logical replication was introduced, so we need to be a bit careful about changing it in the back branches.

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7c99dc587a010a0c40d72a0e435111ca7a371c02

--
With Regards,
Amit Kapila.
On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> I don't have any particular names in mind but Dilip and Sawada-San
> names are listed as reviewers in the commit [1], so it would be good
> to see what are their thoughts on this.
>
> Please note that this behavior is from the time logical replication
> was introduced, so we need to be a bit careful in changing the
> behavior in backbranches.
>
> [1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7c99dc587a010a0c40d72a0e435111ca7a371c02

I believe we should backpatch this fix. The old behavior doesn't seem intentional, and IMHO users are unlikely to be relying on it. That's just my view, though, and someone may yet come across a real-world use case where a user depends on that behavior. Although we initially didn't backpatch it because it changed existing behavior and we hadn't received any complaints, a recent complaint suggests that it's now better to improve the back branches as well.

--
Regards,
Dilip Kumar
Google
On Mon, 4 Aug 2025 at 09:47, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> PG14 and + patches do not test that DROP PUBLICATION does not disrupt
> the publication. I think we need to test that as well.

The attached v3 version patch has the changes for the same.

Regards,
Vignesh
Attachments
- v3_PG13-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION.patch
- v3_PG16-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v3_PG14-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v3_PG15-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
- v3_PG17-0001-Fix-ALTER-SUBSCRIPTION-.-SET-PUBLICATION-.-c.patch
On Tue, Aug 5, 2025 at 9:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> I don't have any particular names in mind but Dilip and Sawada-San
> names are listed as reviewers in the commit [1], so it would be good
> to see what are their thoughts on this.
>
> Please note that this behavior is from the time logical replication
> was introduced, so we need to be a bit careful in changing the
> behavior in backbranches.

Agreed.

Only Dilip has expressed an opinion so far. Haven't heard from others, so can't guess what their opinions are.

If we are ok backpatching it, I will review Vignesh's patches thoroughly.

--
Best Wishes,
Ashutosh Bapat
On Fri, Aug 8, 2025 at 5:19 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> Only Dilip has expressed an opinion so far. Haven't heard from others,
> so can't guess what their opinions are.

Yeah, let's wait for a few more days. Even if we decide to backpatch it, let's target the next minor release.

--
With Regards,
Amit Kapila.
On Fri, Aug 8, 2025 at 5:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah, let's wait for a few more days. Even if we decide to backpatch
> it, let's target the next minor release.

I'm personally hesitant to backpatch this change. I'm not sure whether there are any users who are aware of this behavior and depend on it. But it seems to me that, for users who update to a new minor version containing this change, the problem will simply shift: instead of replication stopping due to a missing publication, replication will continue while they almost silently lose some changes (users often don't see warnings in server logs). I guess dealing with the latter problem would be more difficult.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com