Thread: SEGFAULT on a concurrent UPDATE of mix of local and foreign partitions

SEGFAULT on a concurrent UPDATE of mix of local and foreign partitions

From: Andrey Lepikhov
Hi,
Postgres segfaulted on an UPDATE of a mix of local and foreign partitions.
For initialization, see t.sql.
To reproduce the segfault, just execute the following in parallel sessions:
UPDATE test SET x = x - 1;

The problem was introduced by commit 1375422.
ExecUpdate finds a concurrently updated tuple and starts re-evaluating 
the subplan. This creates a new EState for the EPQState and leaves 
es_result_relations set to NULL. Next, ExecInitNode(subplan) is 
launched, and the underlying ExecInitForeignScan tries to access an 
element of es_result_relations. This causes the SEGFAULT.

I studied this problem briefly. I think EPQState can use the 
es_result_relations of the parent EState. The attached patch does this. 
check-world passed cleanly.

-- 
regards,
Andrey Lepikhov
Postgres Professional
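
The failure mode and the proposed direction described in this message can be modeled in a few lines of C. This is only a toy sketch: ToyEState, ToyResultRelInfo and the three functions are hypothetical stand-ins, not PostgreSQL's actual structs or executor code, and the attached patch may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for executor structs. */
typedef struct ToyResultRelInfo
{
    const char *relname;
} ToyResultRelInfo;

typedef struct ToyEState
{
    ToyResultRelInfo **result_relations;   /* models es_result_relations */
    int                n_result_relations;
} ToyEState;

/* Models the child state built for rechecking: its array starts out NULL. */
static ToyEState
make_child_state(void)
{
    ToyEState child = { NULL, 0 };

    return child;
}

/*
 * Models a scan node looking up its 1-based result relation index.
 * The thread reports that indexing a NULL array at this point is what
 * crashed; this toy version returns NULL instead.
 */
static ToyResultRelInfo *
lookup_result_rel(const ToyEState *estate, int index)
{
    assert(index >= 1);
    if (estate->result_relations == NULL ||
        index > estate->n_result_relations)
        return NULL;
    return estate->result_relations[index - 1];
}

/* Models the proposal: the child state reuses the parent's array. */
static void
share_parent_relations(ToyEState *child, const ToyEState *parent)
{
    child->result_relations = parent->result_relations;
    child->n_result_relations = parent->n_result_relations;
}
```

In this model, a lookup against a fresh child state finds nothing, and the same lookup succeeds once share_parent_relations() has been called. The sharing only helps if the parent has already filled in the entry being looked up.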

Attachments

Re: SEGFAULT on a concurrent UPDATE of mix of local and foreign partitions

From: Heikki Linnakangas
On 06/08/2021 08:34, Andrey Lepikhov wrote:
> Hi,
> Postgres SEGFAULT'ed on the UPDATE of mix of local and foreign partitions.
> Initialization - see t.sql
> For replaying this segfault just execute in parallel:
> UPDATE test SET x = x - 1;
> 
> The problem was introduced by commit 1375422.
> ExecUpdate has found a concurrently updated tuples and starts subplan
> evaluation. This operation creates new EState for EPQState and sets
> es_result_relations in NULL value. Next, ExecInitNode(subplan) is
> launched and underlying ExecInitForeignScan tries to access to an
> element of es_result_relations. This causes SEGFAULT.

Thanks for the report!

> I studied this problem shortly. I think, EPQState can use
> es_result_relations of a parent EState. Patch in attachment fixes this.
> check-world passed clearly.

Hmm, that seems to work, but I have a few questions:

Can we be sure that the es_result_relations entry of the result relation 
has been initialized when the EPQState is constructed? The general model 
now is that you initialize them lazily, when needed.

Does it ever make sense to re-evaluate a direct Foreign Update as part 
of EvalPlanQual? It doesn't seem sane to me, you cannot just run an 
UPDATE statement again and expect it to be idempotent. So I think it 
would make sense to skip this during EvalPlanQual processing altogether. 
And no need to call the FDW's Begin/EndDirectModify functions either.

It's not clear to me what guarantee there is that a direct-modify 
Foreign Update/Delete node is never re-evaluated as part of 
EvalPlanQual. Is it because a ModifyTable node is always at the top 
of the plan tree? Or is it because the FDW never uses direct-modify in 
queries where EvalPlanQual might be needed? Or because we use 
ROW_MARK_COPY in those cases?

In any case, re-evaluating a direct-modify statement doesn't seem sane, 
so I propose the attached patch to add some runtime checks for that and 
to avoid the original segfault.

- Heikki
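
The idea of skipping direct modifications while EvalPlanQual is running can be sketched roughly as follows. This is a toy model under stated assumptions, not the attached patch: ToyEState, ToyForeignScan, the epq_active flag and both functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structs. */
typedef struct ToyResultRelInfo
{
    const char *relname;
} ToyResultRelInfo;

typedef struct ToyEState
{
    ToyResultRelInfo **result_relations;  /* NULL in the recheck state */
    bool               epq_active;        /* true while rechecking a row */
} ToyEState;

typedef struct ToyForeignScan
{
    int               result_relation;    /* 1-based; 0 = plain scan */
    ToyResultRelInfo *result_rel_info;
} ToyForeignScan;

/* Init step: only look up the result relation outside of a recheck. */
static void
toy_init_foreign_scan(ToyForeignScan *scan, const ToyEState *estate)
{
    scan->result_rel_info = NULL;
    if (scan->result_relation > 0 && !estate->epq_active)
    {
        /* Runtime check: the array must exist before it is indexed. */
        assert(estate->result_relations != NULL);
        scan->result_rel_info =
            estate->result_relations[scan->result_relation - 1];
    }
}

/*
 * Exec step: returns false when the node must not run. A direct
 * modification is not idempotent, so it is never re-run during a recheck.
 */
static bool
toy_exec_foreign_scan(const ToyForeignScan *scan, const ToyEState *estate)
{
    if (estate->epq_active && scan->result_relation > 0)
        return false;
    return true;
}
```

With this shape, the recheck state never touches its NULL array, and a direct-modify node is simply skipped during the recheck instead of being executed a second time.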

Attachments

Re: SEGFAULT on a concurrent UPDATE of mix of local and foreign partitions

From: Andrey V. Lepikhov
On 8/10/21 1:29 AM, Heikki Linnakangas wrote:
> On 06/08/2021 08:34, Andrey Lepikhov wrote:
>> Hi,
>> Postgres SEGFAULT'ed on the UPDATE of mix of local and foreign 
>> partitions.
>> Initialization - see t.sql
>> For replaying this segfault just execute in parallel:
>> UPDATE test SET x = x - 1;
>>
>> The problem was introduced by commit 1375422.
>> ExecUpdate has found a concurrently updated tuples and starts subplan
>> evaluation. This operation creates new EState for EPQState and sets
>> es_result_relations in NULL value. Next, ExecInitNode(subplan) is
>> launched and underlying ExecInitForeignScan tries to access to an
>> element of es_result_relations. This causes SEGFAULT.
> 
> Thanks for the report!
> 
>> I studied this problem shortly. I think, EPQState can use
>> es_result_relations of a parent EState. Patch in attachment fixes this.
>> check-world passed clearly.
> 
> Hmm, that seems to work, but I have a few questions:
> 
> Can we be sure that the es_result_relations entry of the result relation 
> has been initialized when the EPQState is constructed? The general model 
> now is that you initialize them lazily, when needed.
If I understand the code correctly, yes: a ModifyTable node sits higher 
in the plan and will have initialized this field earlier. But in many 
cases it is not needed.
> 
> Does it ever make sense to re-evaluate a direct Foreign Update as part 
> of EvalPlanQual? It doesn't seem sane to me, you cannot just run an 
> UPDATE statement again and expect it to be idempotent. So I think it 
> would make sense skip this during EvalPlanQual processing altogether. 
> And no need to call the FDW's Begin/EndDirectModify functions either.
Agreed. This is my first dive into the EPQ code. After studying it, I 
think this makes no sense.
> 
> It's not clear to me what guarantee there is that a direct-modify 
> Foreign Update/Delete node is never re-evaluated as part of 
> EvalPlanQual. Is it because the a ModifyTable node is always at the top 
> of the plan tree? Or is it because the FDW never uses direct-modify in 
> queries where EvalPlanQual might be needed? Or because we use 
> ROW_MARK_COPY in those cases?
A ModifyTable node may be found in an underlying CTE, for example. And, 
right now, it's not clear to me either.
> 
> In any case, re-evaluating a direct-modify statement doesn't seem sane, 
> so I propose the attached patch to add some runtime checks for that and 
> to avoid the original segfault.

Your patch looks much better.

-- 
regards,
Andrey Lepikhov
Postgres Professional



Re: SEGFAULT on a concurrent UPDATE of mix of local and foreign partitions

From: Heikki Linnakangas
On 10/08/2021 12:18, Andrey V. Lepikhov wrote:
> On 8/10/21 1:29 AM, Heikki Linnakangas wrote:
>> Does it ever make sense to re-evaluate a direct Foreign Update as part
>> of EvalPlanQual? It doesn't seem sane to me, you cannot just run an
>> UPDATE statement again and expect it to be idempotent. So I think it
>> would make sense skip this during EvalPlanQual processing altogether.
>> And no need to call the FDW's Begin/EndDirectModify functions either.
> Agree. It is my first dive into the EPQ code. After studying i think
> this makes no sense.

Pushed the fix to master and v14. Thanks for the report!

- Heikki