Обсуждение: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks

Поиск

Список

Период

Сортировка

BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks

От

PG Bug reporting form

Дата:

09 июля 2019 г., 01:22:45

The following bug has been logged on the website:

Bug reference:      15900
Logged by:          Alex Aktsipetrov
Email address:      alex.akts@gmail.com
PostgreSQL version: 12beta2
Operating system:   Ubuntu 16.04
Description:

SELECT FOR UPDATE query that references a transition table in AFTER
INSERT/UPDATE triggers produces an unexpected error. The same query with FOR
UPDATE omitted finishes without any error, which is my expectation for the
original one as well. AFTER DELETE triggers were not tested.

For example, the following query:

    create table testtr (a int, b text);

    create function testtr_trigger() returns trigger language plpgsql as
    $$begin
     perform(
       select array_agg(a) from
       (select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
       as tmp
     );
     return new;
    end$$;

    create trigger testtr_trigger
    after insert on testtr
    referencing new table as new_table
    for each statement execute procedure testtr_trigger();

    insert into testtr values (1, 'one'), (2, 'two');

produces the following error:

    ERROR:  executor could not find named tuplestore "new_table"
    CONTEXT:  SQL statement "SELECT (
       select array_agg(a) from
       (select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
       as tmp
    )"

I think the issue was introduced in ad0bda5d24ea2bcc72b5e50020e3c79bab10836b
as the query finishes successfully in its ancestor.

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Александр Акципетров

Дата:

09 июля 2019 г., 02:32:55

On Tue, 9 Jul 2019 at 01:22, PG Bug reporting form <noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 15900
Logged by: Alex Aktsipetrov
Email address: alex.akts@gmail.com
PostgreSQL version: 12beta2
Operating system: Ubuntu 16.04
Description:

SELECT FOR UPDATE query that references a transition table in AFTER
INSERT/UPDATE triggers produces an unexpected error. The same query with FOR
UPDATE omitted finishes without any error, which is my expectation for the
original one as well. AFTER DELETE triggers were not tested.

For example, the following query:

create table testtr (a int, b text);

create function testtr_trigger() returns trigger language plpgsql as
$$begin
perform(
select array_agg(a) from
(select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
as tmp
);
return new;
end$$;

create trigger testtr_trigger
after insert on testtr
referencing new table as new_table
for each statement execute procedure testtr_trigger();

insert into testtr values (1, 'one'), (2, 'two');

produces the following error:

ERROR: executor could not find named tuplestore "new_table"
CONTEXT: SQL statement "SELECT (
select array_agg(a) from
(select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
as tmp
)"

I think the issue was introduced in ad0bda5d24ea2bcc72b5e50020e3c79bab10836b
as the query finishes successfully in its ancestor.

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Alex Aktsipetrov

Дата:

09 июля 2019 г., 02:39:23

Sorry, the previous email was a fat finger error.

Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the right idea.

Вложения

trigger_select_for_update.patch

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Thomas Munro

Дата:

09 июля 2019 г., 02:49:19

On Tue, Jul 9, 2019 at 11:39 AM Alex Aktsipetrov <alex.akts@gmail.com> wrote:
> Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the
rightidea.
 

Thanks for the report and the proposed fix!

Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2):

https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org

-- 
Thomas Munro
https://enterprisedb.com

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Thomas Munro

Дата:

09 июля 2019 г., 04:13:31

On Tue, Jul 9, 2019 at 11:49 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Jul 9, 2019 at 11:39 AM Alex Aktsipetrov <alex.akts@gmail.com> wrote:
> > Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the
rightidea.

>
> Thanks for the report and the proposed fix!

The repro works for me, and I think the patch is correct.  Here's a
version with that line moved to a more natural place IMHO.

> Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2):
>
> https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org

I think it's highly likely that bug #15720 was a case of this bug and
would be fixed by this patch.  Alex's repro doesn't work on 11 though,
because EPQ is not entered at all.  Which raises the question: why do
we need to enter EPQ after commit ad0bda5d on 12/master, for a row
that hasn't been updated by anyone else?

-- 
Thomas Munro
https://enterprisedb.com

Вложения

0001-Pass-queryEnv-down-to-EvalPlanQual-s-EState.patch

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Thomas Munro

Дата:

09 июля 2019 г., 14:47:28

On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2):
> >
> > https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org
>
> I think it's highly likely that bug #15720 was a case of this bug and
> would be fixed by this patch.  Alex's repro doesn't work on 11 though,
> because EPQ is not entered at all.  Which raises the question: why do
> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
> that hasn't been updated by anyone else?

Explanation: since ad0bda5d24ea, ExecLockRows() always calls
EvalPlanQualBegin() which initialises the plan state, and in this case
ExecInitNamedTuplestoreScan() errors out due to the bug.  Before, you
needed the right concurrency scenario (epq_needed) before we did that,
as the reporter of bug #15720 discovered.

I'm planning to commit that patch tomorrow.

-- 
Thomas Munro
https://enterprisedb.com

Re: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks

От

Tom Lane

Дата:

09 июля 2019 г., 18:38:13

Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> I think it's highly likely that bug #15720 was a case of this bug and
>> would be fixed by this patch.

Agreed.  I think your version of the fix is good, and you should
mention #15720 too in the commit message.

>> Alex's repro doesn't work on 11 though,
>> because EPQ is not entered at all.  Which raises the question: why do
>> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
>> that hasn't been updated by anyone else?

> Explanation: since ad0bda5d24ea, ExecLockRows() always calls
> EvalPlanQualBegin() which initialises the plan state, and in this case
> ExecInitNamedTuplestoreScan() errors out due to the bug.  Before, you
> needed the right concurrency scenario (epq_needed) before we did that,
> as the reporter of bug #15720 discovered.

I'm quite desperately unhappy about this observation, because
EvalPlanQualBegin is a *large* amount of overhead that is usually
unnecessary, and is now going to be paid for *every locked row*
whether there's any conflict on it or not.  I do not find that
acceptable.  Why is it necessary to do this before finding that
there's an update conflict?

            regards, tom lane

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Thomas Munro

Дата:

10 июля 2019 г., 01:43:28

On Wed, Jul 10, 2019 at 3:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> >> I think it's highly likely that bug #15720 was a case of this bug and
> >> would be fixed by this patch.
>
> Agreed.  I think your version of the fix is good, and you should
> mention #15720 too in the commit message.

Thanks.  Pushed.

> >> Alex's repro doesn't work on 11 though,
> >> because EPQ is not entered at all.  Which raises the question: why do
> >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
> >> that hasn't been updated by anyone else?
>
> > Explanation: since ad0bda5d24ea, ExecLockRows() always calls
> > EvalPlanQualBegin() which initialises the plan state, and in this case
> > ExecInitNamedTuplestoreScan() errors out due to the bug.  Before, you
> > needed the right concurrency scenario (epq_needed) before we did that,
> > as the reporter of bug #15720 discovered.
>
> I'm quite desperately unhappy about this observation, because
> EvalPlanQualBegin is a *large* amount of overhead that is usually
> unnecessary, and is now going to be paid for *every locked row*
> whether there's any conflict on it or not.  I do not find that
> acceptable.  Why is it necessary to do this before finding that
> there's an update conflict?

I haven't seriously looked into it and haven't succeeded in finding
the discussion of why this is absolutely necessary in the commit's
thread, but you'd think it should be possible to defer slot creation a
bit, or if not, do something cheaper than EPQBegin() that just
initialises the slots.  Andres, Haribabu, Ashutosh?

-- 
Thomas Munro
https://enterprisedb.com

Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

От

Andres Freund

Дата:

11 июля 2019 г., 03:52:02

Hi,

On 2019-07-09 11:38:13 -0400, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> >> Alex's repro doesn't work on 11 though,
> >> because EPQ is not entered at all.  Which raises the question: why do
> >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
> >> that hasn't been updated by anyone else?
> 
> > Explanation: since ad0bda5d24ea, ExecLockRows() always calls
> > EvalPlanQualBegin() which initialises the plan state, and in this case
> > ExecInitNamedTuplestoreScan() errors out due to the bug.  Before, you
> > needed the right concurrency scenario (epq_needed) before we did that,
> > as the reporter of bug #15720 discovered.
> 
> I'm quite desperately unhappy about this observation, because
> EvalPlanQualBegin is a *large* amount of overhead that is usually
> unnecessary, and is now going to be paid for *every locked row*
> whether there's any conflict on it or not.  I do not find that
> acceptable.  Why is it necessary to do this before finding that
> there's an update conflict?

Two main reasons:

Previously we referenced tuples from LockRowsState->lr_curtuples, and
the management of that was pretty tightly interlinked with EPQ, and only
worked for heap tuples.Keeping that scheme would have been somewhat
complicated to continue to maintain.

Secondly, previously the tuple fetched by heap_lock_tuple() was just
stored in a local variable - but now that happens via a slot (as we
otherwise cannot reasonably handle things like heap wanting to return a
pinned buffer, and others not).  As we potentially need to lock rows
from multiple tables, we'd need multiple slots suitable to lock/fetch those
rows.

EPQ already needed similar infrastructure internally - and those tuples
were fetched and retained for EPQ's benefit. So it seemed sensible to
just use the slots from EPQ.  Note that we don't need
EvalPlanQualFetch() anymore, as it's work is done inside the AM -
obviously there's no AM independent way to perform correct ctid chasing.

Which large overhead you mean is going to be paid for "*every locked
row*"? EvalPlanQualBegin() ought to be fairly fast for repeated calls in
the same node - although I do admit that it'd be a lot nicer if the
ExecSetParamPlanMulti() work wouldn't need be redone.

I think it might be reasonable to split EvalPlanQualBegin() (and perhaps
EvalPlanQualStart()) into two pieces: One to just have a minimal
EPQState, with valid slots, and one to get ready to actually run EPQ?

Unfortunately I'm going to be unreachable pretty soon till Monday
(hiking without any network access), so I won't be immediately able to respond.

Greetings,

Andres Freund

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks

Вложения

Вложения