Обсуждение: Correction of RowMark Removal During Sel-Join Elimination

Поиск
Список
Период
Сортировка

Correction of RowMark Removal During Sel-Join Elimination

От
Andrei Lepikhov
Дата:
Hi,

I've noticed that the code for sel-join elimination incorrectly removes 
RowMarks. Currently, this doesn't lead to any issues because the 
optimiser retains pointers to the simple_rte_array. A RowMark refers to 
an RTE entry that is the same for both the relations being kept and 
those being removed.

I believe it would be beneficial to address this issue now to prevent 
potential problems in the future. See the patch attached.

-- 
regards, Andrei Lepikhov

Вложения

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Greg Sabino Mullane
Дата:
Basic concept looks good. However:

and fixes the incorrect behaviour. Additionally, it renames variables to make
similar errors more apparent in the future.

- if (!innerrel_is_unique_ext(root, joinrelids, inner->relids,
- outer, JOIN_INNER, selfjoinquals,
+ if (!innerrel_is_unique_ext(root, joinrelids, rrel->relids,
+ krel, JOIN_INNER, selfjoinquals,

I'm not convinced this is an improvement from someone just coming in to this part of the code, especially given (for example) the comment right above it:

 * Determine if the inner table can duplicate outer rows.  We must
 * bypass the unique rel cache here since we're possibly using a


Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Andrei Lepikhov
Дата:
On 11/8/2025 20:15, Greg Sabino Mullane wrote:
> I'm not convinced this is an improvement from someone just coming in to 
> this part of the code, especially given (for example) the comment right 
> above it:
> 
>   * Determine if the inner table can duplicate outer rows.  We must
>   * bypass the unique rel cache here since we're possibly using aThanks for your feedback.
I made some minor adjustments to the comments to make the code more 
consistent. Sure, rrel and krel don't seem like the best solution - I 
guess natives could find less wordy synonyms than just dumb 
'keeping_rel' and 'removing_rel'. But it is simpler to track which 
relation and RowMark should be removed.

-- 
regards, Andrei Lepikhov
Вложения

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Korotkov
Дата:
Hi, Andrei!

On Tue, Aug 12, 2025 at 12:40 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 11/8/2025 20:15, Greg Sabino Mullane wrote:
> > I'm not convinced this is an improvement from someone just coming in to
> > this part of the code, especially given (for example) the comment right
> > above it:
> >
> >   * Determine if the inner table can duplicate outer rows.  We must
> >   * bypass the unique rel cache here since we're possibly using aThanks for your feedback.
> I made some minor adjustments to the comments to make the code more
> consistent. Sure, rrel and krel don't seem like the best solution - I
> guess natives could find less wordy synonyms than just dumb
> 'keeping_rel' and 'removing_rel'. But it is simpler to track which
> relation and RowMark should be removed.

Thank you for catching this.  And thank you for the fix.  I think it
worth separating fix and refactoring.  This helps to understand what
exactly the fix is by looking at the patch.  I also edited commit
message.  I'm going to push this if no objections.

------
Regards,
Alexander Korotkov
Supabase

Вложения

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Lakhin
Дата:
Hello Alexander,

24.08.2025 03:44, Alexander Korotkov wrote:
Thank you for catching this.  And thank you for the fix.  I think it
worth separating fix and refactoring.  This helps to understand what
exactly the fix is by looking at the patch.  I also edited commit
message.  I'm going to push this if no objections.

Please try the following script:
create table t1(i1 int primary key);
create table t2 (i2 int, check (i2 is not null));

set constraint_exclusion = 'on';

select 1 from t1 right join t2 on i1 = i2 where (select true);

which triggers (starting from 5f6f951f8):
Core was generated by `postgres: law regression [local] SELECT                                       '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at clauses.c:4242
4242                    if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
(gdb) bt
#0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at clauses.c:4242
#1  0x000058dd6a7b4f74 in eval_const_expressions_mutator (node=0x58dd78d75590, context=0x7fff875a9f00) at clauses.c:3556
#2  0x000058dd6a7b2935 in eval_const_expressions (root=0x58dd78d505f8, node=0x58dd78d75590) at clauses.c:2272
#3  0x000058dd6a7c857d in get_relation_constraints (root=0x58dd78d505f8, relationObjectId=16385, rel=0x58dd78d738e0, include_noinherit=true, include_notnull=true, include_partition=true) at plancat.c:1419
#4  0x000058dd6a7c8fdb in relation_excluded_by_constraints (root=0x58dd78d505f8, rel=0x58dd78d738e0, rte=0x58dd78c6b968) at plancat.c:1808
#5  0x000058dd6a73675e in set_rel_size (root=0x58dd78d505f8, rel=0x58dd78d738e0, rti=2, rte=0x58dd78c6b968) at allpaths.c:364
#6  0x000058dd6a73664c in set_base_rel_sizes (root=0x58dd78d505f8) at allpaths.c:322
#7  0x000058dd6a73631e in make_one_rel (root=0x58dd78d505f8, joinlist=0x58dd78d74a90) at allpaths.c:183
...

Discovered with SQLsmith.

Best regards,
Alexander

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Korotkov
Дата:
On Sat, Aug 30, 2025 at 4:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hello Alexander,
>
> 24.08.2025 03:44, Alexander Korotkov wrote:
>
> Thank you for catching this.  And thank you for the fix.  I think it
> worth separating fix and refactoring.  This helps to understand what
> exactly the fix is by looking at the patch.  I also edited commit
> message.  I'm going to push this if no objections.
>
>
> Please try the following script:
> create table t1(i1 int primary key);
> create table t2 (i2 int, check (i2 is not null));
>
> set constraint_exclusion = 'on';
>
> select 1 from t1 right join t2 on i1 = i2 where (select true);
>
> which triggers (starting from 5f6f951f8):
> Core was generated by `postgres: law regression [local] SELECT                                       '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> 4242                    if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
> (gdb) bt
> #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> #1  0x000058dd6a7b4f74 in eval_const_expressions_mutator (node=0x58dd78d75590, context=0x7fff875a9f00) at
clauses.c:3556
> #2  0x000058dd6a7b2935 in eval_const_expressions (root=0x58dd78d505f8, node=0x58dd78d75590) at clauses.c:2272
> #3  0x000058dd6a7c857d in get_relation_constraints (root=0x58dd78d505f8, relationObjectId=16385, rel=0x58dd78d738e0,
include_noinherit=true,include_notnull=true, include_partition=true) at plancat.c:1419 
> #4  0x000058dd6a7c8fdb in relation_excluded_by_constraints (root=0x58dd78d505f8, rel=0x58dd78d738e0,
rte=0x58dd78c6b968)at plancat.c:1808 
> #5  0x000058dd6a73675e in set_rel_size (root=0x58dd78d505f8, rel=0x58dd78d738e0, rti=2, rte=0x58dd78c6b968) at
allpaths.c:364
> #6  0x000058dd6a73664c in set_base_rel_sizes (root=0x58dd78d505f8) at allpaths.c:322
> #7  0x000058dd6a73631e in make_one_rel (root=0x58dd78d505f8, joinlist=0x58dd78d74a90) at allpaths.c:183

Thank you for pointing.  I'm investigating this now.

------
Regards,
Alexander Korotkov
Supabase



Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Korotkov
Дата:
On Sat, Aug 30, 2025 at 9:54 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sat, Aug 30, 2025 at 4:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> >
> > Hello Alexander,
> >
> > 24.08.2025 03:44, Alexander Korotkov wrote:
> >
> > Thank you for catching this.  And thank you for the fix.  I think it
> > worth separating fix and refactoring.  This helps to understand what
> > exactly the fix is by looking at the patch.  I also edited commit
> > message.  I'm going to push this if no objections.
> >
> >
> > Please try the following script:
> > create table t1(i1 int primary key);
> > create table t2 (i2 int, check (i2 is not null));
> >
> > set constraint_exclusion = 'on';
> >
> > select 1 from t1 right join t2 on i1 = i2 where (select true);
> >
> > which triggers (starting from 5f6f951f8):
> > Core was generated by `postgres: law regression [local] SELECT                                       '.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> > 4242                    if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
> > (gdb) bt
> > #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> > #1  0x000058dd6a7b4f74 in eval_const_expressions_mutator (node=0x58dd78d75590, context=0x7fff875a9f00) at
clauses.c:3556
> > #2  0x000058dd6a7b2935 in eval_const_expressions (root=0x58dd78d505f8, node=0x58dd78d75590) at clauses.c:2272
> > #3  0x000058dd6a7c857d in get_relation_constraints (root=0x58dd78d505f8, relationObjectId=16385,
rel=0x58dd78d738e0,include_noinherit=true, include_notnull=true, include_partition=true) at plancat.c:1419 
> > #4  0x000058dd6a7c8fdb in relation_excluded_by_constraints (root=0x58dd78d505f8, rel=0x58dd78d738e0,
rte=0x58dd78c6b968)at plancat.c:1808 
> > #5  0x000058dd6a73675e in set_rel_size (root=0x58dd78d505f8, rel=0x58dd78d738e0, rti=2, rte=0x58dd78c6b968) at
allpaths.c:364
> > #6  0x000058dd6a73664c in set_base_rel_sizes (root=0x58dd78d505f8) at allpaths.c:322
> > #7  0x000058dd6a73631e in make_one_rel (root=0x58dd78d505f8, joinlist=0x58dd78d74a90) at allpaths.c:183
>
> Thank you for pointing.  I'm investigating this now.

Hmm... It seems that commit 5f6f951f88 spotted some existing bug.
get_relation_constraints() deserializes the constraint node, but it
initially refers varno == 1.  get_relation_constraints() calls
ChangeVarNodes() to change varno from 1 to 2, but only after calling
eval_const_expressions() which access root->simple_rte_array[] with
wrong varno...

The draft patch fixing this is attached.  I will continue the investigation.

------
Regards,
Alexander Korotkov
Supabase

Вложения

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Korotkov
Дата:
On Sat, Aug 30, 2025 at 10:29 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sat, Aug 30, 2025 at 9:54 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > On Sat, Aug 30, 2025 at 4:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > >
> > > Hello Alexander,
> > >
> > > 24.08.2025 03:44, Alexander Korotkov wrote:
> > >
> > > Thank you for catching this.  And thank you for the fix.  I think it
> > > worth separating fix and refactoring.  This helps to understand what
> > > exactly the fix is by looking at the patch.  I also edited commit
> > > message.  I'm going to push this if no objections.
> > >
> > >
> > > Please try the following script:
> > > create table t1(i1 int primary key);
> > > create table t2 (i2 int, check (i2 is not null));
> > >
> > > set constraint_exclusion = 'on';
> > >
> > > select 1 from t1 right join t2 on i1 = i2 where (select true);
> > >
> > > which triggers (starting from 5f6f951f8):
> > > Core was generated by `postgres: law regression [local] SELECT                                       '.
> > > Program terminated with signal SIGSEGV, Segmentation fault.
> > > #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> > > 4242                    if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
> > > (gdb) bt
> > > #0  0x000058dd6a7b6014 in var_is_nonnullable (root=0x58dd78d505f8, var=0x58dd78d75610, use_rel_info=false) at
clauses.c:4242
> > > #1  0x000058dd6a7b4f74 in eval_const_expressions_mutator (node=0x58dd78d75590, context=0x7fff875a9f00) at
clauses.c:3556
> > > #2  0x000058dd6a7b2935 in eval_const_expressions (root=0x58dd78d505f8, node=0x58dd78d75590) at clauses.c:2272
> > > #3  0x000058dd6a7c857d in get_relation_constraints (root=0x58dd78d505f8, relationObjectId=16385,
rel=0x58dd78d738e0,include_noinherit=true, include_notnull=true, include_partition=true) at plancat.c:1419 
> > > #4  0x000058dd6a7c8fdb in relation_excluded_by_constraints (root=0x58dd78d505f8, rel=0x58dd78d738e0,
rte=0x58dd78c6b968)at plancat.c:1808 
> > > #5  0x000058dd6a73675e in set_rel_size (root=0x58dd78d505f8, rel=0x58dd78d738e0, rti=2, rte=0x58dd78c6b968) at
allpaths.c:364
> > > #6  0x000058dd6a73664c in set_base_rel_sizes (root=0x58dd78d505f8) at allpaths.c:322
> > > #7  0x000058dd6a73631e in make_one_rel (root=0x58dd78d505f8, joinlist=0x58dd78d74a90) at allpaths.c:183
> >
> > Thank you for pointing.  I'm investigating this now.
>
> Hmm... It seems that commit 5f6f951f88 spotted some existing bug.
> get_relation_constraints() deserializes the constraint node, but it
> initially refers varno == 1.  get_relation_constraints() calls
> ChangeVarNodes() to change varno from 1 to 2, but only after calling
> eval_const_expressions() which access root->simple_rte_array[] with
> wrong varno...
>
> The draft patch fixing this is attached.  I will continue the investigation.

The same patch with a bit revised comment and commit message.

------
Regards,
Alexander Korotkov
Supabase

Вложения

Re: Correction of RowMark Removal During Sel-Join Elimination

От
Richard Guo
Дата:
On Sun, Aug 31, 2025 at 8:12 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > The draft patch fixing this is attached.  I will continue the investigation.
>
> The same patch with a bit revised comment and commit message.

FWIW, I reported this same issue and proposed the patch last week in
Discussion: https://postgr.es/m/CAMbWs48x=GsGan_bvQ+j0rLqYFNi9W725EmJpWUZv1RrmaicZQ@mail.gmail.com

Thanks
Richard



Re: Correction of RowMark Removal During Sel-Join Elimination

От
Richard Guo
Дата:
On Sun, Aug 31, 2025 at 8:41 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Sun, Aug 31, 2025 at 8:12 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > The same patch with a bit revised comment and commit message.

> FWIW, I reported this same issue and proposed the patch last week in
> Discussion: https://postgr.es/m/CAMbWs48x=GsGan_bvQ+j0rLqYFNi9W725EmJpWUZv1RrmaicZQ@mail.gmail.com

I think it's better to push this patch sooner rather than later, as
multiple people have encountered the issue in different ways.  I'll go
ahead and push 0001 from my patch set shortly.  I don't think we need
to backpatch it, since the issue only occurs after commit e2debb643.

Thanks
Richard



Re: Correction of RowMark Removal During Sel-Join Elimination

От
Alexander Korotkov
Дата:
On Sun, Aug 31, 2025 at 2:51 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Sun, Aug 31, 2025 at 8:41 AM Richard Guo <guofenglinux@gmail.com> wrote:
> > On Sun, Aug 31, 2025 at 8:12 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > The same patch with a bit revised comment and commit message.
>
> > FWIW, I reported this same issue and proposed the patch last week in
> > Discussion: https://postgr.es/m/CAMbWs48x=GsGan_bvQ+j0rLqYFNi9W725EmJpWUZv1RrmaicZQ@mail.gmail.com
>
> I think it's better to push this patch sooner rather than later, as
> multiple people have encountered the issue in different ways.  I'll go
> ahead and push 0001 from my patch set shortly.

I see.  You're much farther than I'm.  Please, go ahead pushing this.

> I don't think we need
> to backpatch it, since the issue only occurs after commit e2debb643.

I think you're right.

------
Regards,
Alexander Korotkov
Supabase



Re: Correction of RowMark Removal During Sel-Join Elimination

От
Richard Guo
Дата:
On Sun, Aug 31, 2025 at 9:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sun, Aug 31, 2025 at 2:51 AM Richard Guo <guofenglinux@gmail.com> wrote:
> > I think it's better to push this patch sooner rather than later, as
> > multiple people have encountered the issue in different ways.  I'll go
> > ahead and push 0001 from my patch set shortly.

> I see.  You're much farther than I'm.  Please, go ahead pushing this.

Done.

Thanks
Richard



Re: Correction of RowMark Removal During Sel-Join Elimination

От
Andrei Lepikhov
Дата:
On 31/8/2025 03:03, Richard Guo wrote:
> On Sun, Aug 31, 2025 at 9:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> On Sun, Aug 31, 2025 at 2:51 AM Richard Guo <guofenglinux@gmail.com> wrote:
>>> I think it's better to push this patch sooner rather than later, as
>>> multiple people have encountered the issue in different ways.  I'll go
>>> ahead and push 0001 from my patch set shortly.
>
>> I see.  You're much farther than I'm.  Please, go ahead pushing this.
>
> Done.
Hmm, is this enough? As I see, it is a typical pattern: after
deserialisation by the stringToNode function, Postgres attempts to call
eval_const_expressions. See, for example, RelationGetIndexExpressions.

That seems strange, because who knows what the entry No.1 is and how
constant evaluation over this relation influences the expression tree.
Perhaps it would be beneficial to refactor all this logic? Maybe go
further and, before storing any relation constraint to the system
catalogue, replace varno with something like 'UNDEF_VAR (-5)'?

I thought about why it was designed that way. Commit eabc714 shows that
at that time, the eval_const_expressions didn't touch PlannerInfo data
at all. Currently, it requires query structures in some cases,
particularly in prosupport functions, which necessitates having correct
relation references.

--
regards, Andrei Lepikhov