Обсуждение: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext
When working on the REPACK command, we see an ERROR caused by unexpected change of CurrentResourceOwner [1]. I think the problem is that reorderbuffer.c does not restore the original value after calling RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle the call like other callers throughout the tree do. [1] https://www.postgresql.org/message-id/CADzfLwUgPMLiFkXRnk97ugPqkDfsNJ3TRdw9gjJM%3D8WB4_nXwQ%40mail.gmail.com -- Antonin Houska Web: https://www.cybertec-postgresql.com
Вложения
Hello, Antonin! if (using_subtxn) { RollbackAndReleaseCurrentSubTransaction(); MemoryContextSwitchTo(ccxt); CurrentResourceOwner = cowner; } IIUC memory context is already switched above: MemoryContext ecxt = MemoryContextSwitchTo(ccxt); Best regards, Mikhail.
On 2025-Sep-03, Antonin Houska wrote: > When working on the REPACK command, we see an ERROR caused by unexpected > change of CurrentResourceOwner [1]. I think the problem is that > reorderbuffer.c does not restore the original value after calling > RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle > the call like other callers throughout the tree do. I have registered this as https://commitfest.postgresql.org/patch/6051/ I've been wondering whether this should be backpatched. In principle this is a bugfix, so it should, but I don't offhand recall any cases where failure to set the current context/resowner in the other reorderbuffer.c users causes a live bug, so ... maybe master only? I'm wondering if it's possible where anybody _depends_ on the current behavior, but I suppose that's quite unlikely. It applies cleanly back to 14, so I would probably stop there in any case. (A good thing also, because if we mess up 13 with it now and don't notice until after the next minor is out, there won't be a chance to unbreak it later.) -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ Essentially, you're proposing Kevlar shoes as a solution for the problem that you want to walk around carrying a loaded gun aimed at your foot. (Tom Lane)
On Thu, Sep 11, 2025, at 3:05 PM, Álvaro Herrera wrote: > On 2025-Sep-03, Antonin Houska wrote: > >> When working on the REPACK command, we see an ERROR caused by unexpected >> change of CurrentResourceOwner [1]. I think the problem is that >> reorderbuffer.c does not restore the original value after calling >> RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle >> the call like other callers throughout the tree do. > Interesting. I'm wondering that if this patch is applied we could remove the following code /* * Logical decoding could have clobbered CurrentResourceOwner during * transaction management, so restore the executor's value. (This is * a kluge, but it's not worth cleaning up right now.) */ CurrentResourceOwner = old_resowner; from pg_logical_slot_get_changes_guts and LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred code is a band-aid that will be improved someday. > I have registered this as > https://commitfest.postgresql.org/patch/6051/ > > I've been wondering whether this should be backpatched. In principle > this is a bugfix, so it should, but I don't offhand recall any cases > where failure to set the current context/resowner in the other > reorderbuffer.c users causes a live bug, so ... maybe master only? I'm > wondering if it's possible where anybody _depends_ on the current > behavior, but I suppose that's quite unlikely. > I would say apply it to master only. If/when we have a bug report we can backpatch it. Per the crash description, I'm not sure we can create a reproducible test case with the current supported commands. Am I wrong? > It applies cleanly back to 14, so I would probably stop there in any > case. (A good thing also, because if we mess up 13 with it now and > don't notice until after the next minor is out, there won't be a chance > to unbreak it later.) > I would also refrain to apply scary fixes into an EOL branch. -- Euler Taveira EDB https://www.enterprisedb.com/
Euler Taveira <euler@eulerto.com> wrote: > On Thu, Sep 11, 2025, at 3:05 PM, Álvaro Herrera wrote: > > On 2025-Sep-03, Antonin Houska wrote: > > > >> When working on the REPACK command, we see an ERROR caused by unexpected > >> change of CurrentResourceOwner [1]. I think the problem is that > >> reorderbuffer.c does not restore the original value after calling > >> RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle > >> the call like other callers throughout the tree do. > > > > Interesting. I'm wondering that if this patch is applied we could remove the > following code > > /* > * Logical decoding could have clobbered CurrentResourceOwner during > * transaction management, so restore the executor's value. (This is > * a kluge, but it's not worth cleaning up right now.) > */ > CurrentResourceOwner = old_resowner; > > from pg_logical_slot_get_changes_guts and LogicalSlotAdvanceAndCheckSnapState > functions too. IIUC the referred code is a band-aid that will be improved > someday. Even though we're fixing the likely reason of this problem, we cannot be 100% sure that no other problem like this still exists. So I'd not remove this assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of that, and adjust the comment? > > I have registered this as > > https://commitfest.postgresql.org/patch/6051/ > > > > I've been wondering whether this should be backpatched. In principle > > this is a bugfix, so it should, but I don't offhand recall any cases > > where failure to set the current context/resowner in the other > > reorderbuffer.c users causes a live bug, so ... maybe master only? I'm > > wondering if it's possible where anybody _depends_ on the current > > behavior, but I suppose that's quite unlikely. > > > > I would say apply it to master only. If/when we have a bug report we can > backpatch it. +1 > Per the crash description, I'm not sure we can create a > reproducible test case with the current supported commands. Am I wrong? It seems so, at least with he "CurrentResourceOwner = old_resowner" assignment in place. REPACK CONCURRENTLY exposes the problem a bit more because it has at least one kind of resource open during logical decoding: relation. -- Antonin Houska Web: https://www.cybertec-postgresql.com
On 2025-Sep-12, Antonin Houska wrote: > Euler Taveira <euler@eulerto.com> wrote: > > > Interesting. I'm wondering that if this patch is applied we could remove the > > following code [...] from pg_logical_slot_get_changes_guts and > > LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred > > code is a band-aid that will be improved someday. > > Even though we're fixing the likely reason of this problem, we cannot be 100% > sure that no other problem like this still exists. So I'd not remove this > assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of > that, and adjust the comment? Yeah, I'm going to risk removing it, because if we don't do it now, we're never going to do it. We can mitigate the risk of missing remaining bugs by having that assertion you suggest, so that if anyone actually hits a problem here, we'll know soon enough. I have pushed it with that change. I'll also add an open item for pg19 so that we remember to come back to remove the assertions, if we feel we no longer need them. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Update: super-fast reaction on the Postgres bugs mailing list. The report was acknowledged [...], and a fix is under discussion. The wonders of open-source !" https://twitter.com/gunnarmorling/status/1596080409259003906
Álvaro Herrera <alvherre@kurilemu.de> wrote: > On 2025-Sep-12, Antonin Houska wrote: > > > Euler Taveira <euler@eulerto.com> wrote: > > > > > Interesting. I'm wondering that if this patch is applied we could remove the > > > following code [...] from pg_logical_slot_get_changes_guts and > > > LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred > > > code is a band-aid that will be improved someday. > > > > Even though we're fixing the likely reason of this problem, we cannot be 100% > > sure that no other problem like this still exists. So I'd not remove this > > assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of > > that, and adjust the comment? > > Yeah, I'm going to risk removing it, because if we don't do it now, > we're never going to do it. We can mitigate the risk of missing > remaining bugs by having that assertion you suggest, so that if anyone > actually hits a problem here, we'll know soon enough. > > I have pushed it with that change. I'll also add an open item for pg19 > so that we remember to come back to remove the assertions, if we feel we > no longer need them. I was worried about removing those workarounds because it was not trivial to diagnose the issue. But it should be ok for the master branch. Thanks. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Hi!
> Hello, Antonin!
>
> if (using_subtxn)
> {
> RollbackAndReleaseCurrentSubTransaction();
> MemoryContextSwitchTo(ccxt);
> CurrentResourceOwner = cowner;
> }
>
> IIUC memory context is already switched above:
>
> MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
>
Finally realized RollbackAndReleaseCurrentSubTransaction leaves as
with TransactionAbortContext, so, nevermind :)
> Hello, Antonin!
>
> if (using_subtxn)
> {
> RollbackAndReleaseCurrentSubTransaction();
> MemoryContextSwitchTo(ccxt);
> CurrentResourceOwner = cowner;
> }
>
> IIUC memory context is already switched above:
>
> MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
>
Finally realized RollbackAndReleaseCurrentSubTransaction leaves as
with TransactionAbortContext, so, nevermind :)