Обсуждение: BUG #17975: Nested Loop Index Scan returning wrong result

Поиск

Список

Период

Сортировка

BUG #17975: Nested Loop Index Scan returning wrong result

От

PG Bug reporting form

Дата:

14 июня 2023 г., 18:17:31

The following bug has been logged on the website:

Bug reference:      17975
Logged by:          Tor Erik Linnerud
Email address:      tel@jklm.no
PostgreSQL version: 15.3
Operating system:   MacOS 13.4, Linux 5.16
Description:

Hi, first let me say thanks for all the hard work that goes into Postgres.


I ran into a very specific query + index + data combination that appears to
return the wrong result. After much trial and error I’ve been able to
construct a dump to reproduce the problem, when running ANALYZE after the
import.

1. Grab the DB dump (13 MB)

curl -L "https://www.dropbox.com/s/k1ai0765gc2k98f/bug5.sql?dl=1" -o
bug5.sql

2. Create an empty database, import the dump and analyze:

createdb bug5 && psql -d bug5 -f bug5.sql && psql -d bug5 -c "ANALYZE"

3. Run queries:

psql -d bug5 -c "set enable_indexscan = 'off'; SELECT a.id FROM a JOIN b ON
b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"

1 row expected, get 1

psql -d bug5 -c "set enable_indexscan = 'on'; SELECT a.id FROM a JOIN b ON
b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"

1 row expected, get 0

Thanks, 
Tor Erik

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

14 июня 2023 г., 23:36:51

Hi,

On 2023-06-14 15:17:31 +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      17975
> Logged by:          Tor Erik Linnerud
> Email address:      tel@jklm.no
> PostgreSQL version: 15.3
> Operating system:   MacOS 13.4, Linux 5.16
> Description:        
> 
> Hi, first let me say thanks for all the hard work that goes into Postgres.
> 
> 
> I ran into a very specific query + index + data combination that appears to
> return the wrong result. After much trial and error I’ve been able to
> construct a dump to reproduce the problem, when running ANALYZE after the
> import.
> 
> 1. Grab the DB dump (13 MB)
> 
> curl -L "https://www.dropbox.com/s/k1ai0765gc2k98f/bug5.sql?dl=1" -o
> bug5.sql
> 
> 2. Create an empty database, import the dump and analyze:
> 
> createdb bug5 && psql -d bug5 -f bug5.sql && psql -d bug5 -c "ANALYZE"
> 
> 3. Run queries:
> 
> psql -d bug5 -c "set enable_indexscan = 'off'; SELECT a.id FROM a JOIN b ON
> b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"
> 
> 1 row expected, get 1
> 
> psql -d bug5 -c "set enable_indexscan = 'on'; SELECT a.id FROM a JOIN b ON
> b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"
> 
> 1 row expected, get 0

Uh, huh. I can reproduce that. And it's not just 15, reproduces in old
versions too.

The plan doesn't look obviously wrong:

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                             QUERY PLAN
            │
 

├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=1.26..17.46 rows=1 width=8) (actual time=0.063..0.065 rows=0 loops=1)
            │
 
│   Output: a.id
            │
 
│   Inner Unique: true
            │
 
│   ->  Nested Loop  (cost=0.84..16.88 rows=1 width=8) (actual time=0.056..0.059 rows=1 loops=1)
            │
 
│         Output: b.a_id
            │
 
│         Inner Unique: true
            │
 
│         ->  Index Scan using c_tag_idx on public.c  (cost=0.42..8.44 rows=1 width=8) (actual time=0.034..0.036 rows=1
loops=1)    │
 
│               Output: c.id, c.tag
            │
 
│               Index Cond: ((c.tag)::text = '13880'::text)
            │
 
│         ->  Index Scan using index_b_c_id on public.b  (cost=0.42..8.44 rows=1 width=16) (actual time=0.015..0.016
rows=1loops=1) │
 
│               Output: b.id, b.c_id, b.a_id
            │
 
│               Index Cond: (b.c_id = c.id)
            │
 
│   ->  Index Only Scan using a_pkey on public.a  (cost=0.42..0.59 rows=1 width=8) (actual time=0.002..0.003 rows=0
loops=1)        │
 
│         Output: a.id
            │
 
│         Index Cond: (a.id = b.a_id)
            │
 
│         Heap Fetches: 0
            │
 
│ Planning Time: 0.998 ms
            │
 
│ Execution Time: 0.148 ms
            │
 

└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(18 rows)

It's not an IOS issue, it happens without IOS as well.


Something seems to be off with the relevant param - it's NULL. Haven't dug
deeper.

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Peter Geoghegan

Дата:

15 июня 2023 г., 00:10:02

On Wed, Jun 14, 2023 at 1:37 PM Andres Freund <andres@anarazel.de> wrote:
> Something seems to be off with the relevant param - it's NULL. Haven't dug
> deeper.

I see that the bad plan has _bt_first() return false early during a
btgettuple() call for one of the a_pkey index scans. That is,
_bt_first() returns false early in the "Quit now if
_bt_preprocess_keys() discovered that the scan keys can never be
satisfied (eg, x == 1 AND x > 2)" path.

It's possible that this is a red herring -- this is not the only
difference between the "bad" index scan heavy plan and the "good" plan
that uses bitmap scans. But right now my suspicion falls on
_bt_preprocess_keys().

--
Peter Geoghegan

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

15 июня 2023 г., 00:12:31

Hi,

On 2023-06-14 13:36:51 -0700, Andres Freund wrote:
> On 2023-06-14 15:17:31 +0000, PG Bug reporting form wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference:      17975
> > Logged by:          Tor Erik Linnerud
> > Email address:      tel@jklm.no
> > PostgreSQL version: 15.3
> > Operating system:   MacOS 13.4, Linux 5.16
> > Description:
> >
> > Hi, first let me say thanks for all the hard work that goes into Postgres.
> >
> >
> > I ran into a very specific query + index + data combination that appears to
> > return the wrong result. After much trial and error I’ve been able to
> > construct a dump to reproduce the problem, when running ANALYZE after the
> > import.
> >
> > 1. Grab the DB dump (13 MB)
> >
> > curl -L "https://www.dropbox.com/s/k1ai0765gc2k98f/bug5.sql?dl=1" -o
> > bug5.sql
> >
> > 2. Create an empty database, import the dump and analyze:
> >
> > createdb bug5 && psql -d bug5 -f bug5.sql && psql -d bug5 -c "ANALYZE"
> >
> > 3. Run queries:
> >
> > psql -d bug5 -c "set enable_indexscan = 'off'; SELECT a.id FROM a JOIN b ON
> > b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"
> >
> > 1 row expected, get 1
> >
> > psql -d bug5 -c "set enable_indexscan = 'on'; SELECT a.id FROM a JOIN b ON
> > b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'"
> >
> > 1 row expected, get 0
>
> Uh, huh. I can reproduce that. And it's not just 15, reproduces in old
> versions too.
>
> The plan doesn't look obviously wrong:
>
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> │                                                             QUERY PLAN
              │
 
>
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
> │ Nested Loop  (cost=1.26..17.46 rows=1 width=8) (actual time=0.063..0.065 rows=0 loops=1)
              │
 
> │   Output: a.id
              │
 
> │   Inner Unique: true
              │
 
> │   ->  Nested Loop  (cost=0.84..16.88 rows=1 width=8) (actual time=0.056..0.059 rows=1 loops=1)
              │
 
> │         Output: b.a_id
              │
 
> │         Inner Unique: true
              │
 
> │         ->  Index Scan using c_tag_idx on public.c  (cost=0.42..8.44 rows=1 width=8) (actual time=0.034..0.036
rows=1loops=1)     │
 
> │               Output: c.id, c.tag
              │
 
> │               Index Cond: ((c.tag)::text = '13880'::text)
              │
 
> │         ->  Index Scan using index_b_c_id on public.b  (cost=0.42..8.44 rows=1 width=16) (actual time=0.015..0.016
rows=1loops=1) │
 
> │               Output: b.id, b.c_id, b.a_id
              │
 
> │               Index Cond: (b.c_id = c.id)
              │
 
> │   ->  Index Only Scan using a_pkey on public.a  (cost=0.42..0.59 rows=1 width=8) (actual time=0.002..0.003 rows=0
loops=1)        │
 
> │         Output: a.id
              │
 
> │         Index Cond: (a.id = b.a_id)
              │
 
> │         Heap Fetches: 0
              │
 
> │ Planning Time: 0.998 ms
              │
 
> │ Execution Time: 0.148 ms
              │
 
>
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
> (18 rows)
>
> It's not an IOS issue, it happens without IOS as well.
>
>
> Something seems to be off with the relevant param - it's NULL. Haven't dug
> deeper.

I think it's a problem with the uniqueness determination / missing a
qual / index selection.

There are two rows in b with b.c_id = 13880, except that one of them has a
NULL a_id:

=> SELECT * FROM b WHERE c_id = 13880;
┌────────┬───────┬────────┐
│   id   │ c_id  │  a_id  │
├────────┼───────┼────────┤
│ 326048 │ 13880 │ (null) │
│ 572151 │ 13880 │ 955968 │
└────────┴───────┴────────┘
(2 rows)

. The uniqueness information comes from:
  "index_a_cannot_share_c" UNIQUE, btree (c_id) WHERE a_id IS NOT NULL

But note that we aren't using that index, we use
  "index_b_c_id" btree (c_id)

which of course also contains the a_id = NULL row.


We either need to force the index that we got the uniqueness information to be
used when it is partial, or add the quals from the partial unique index to all
other index scans.

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

15 июня 2023 г., 00:21:27

Hi,

On 2023-06-14 14:10:02 -0700, Peter Geoghegan wrote:
> On Wed, Jun 14, 2023 at 1:37 PM Andres Freund <andres@anarazel.de> wrote:
> > Something seems to be off with the relevant param - it's NULL. Haven't dug
> > deeper.
> 
> I see that the bad plan has _bt_first() return false early during a
> btgettuple() call for one of the a_pkey index scans. That is,
> _bt_first() returns false early in the "Quit now if
> _bt_preprocess_keys() discovered that the scan keys can never be
> satisfied (eg, x == 1 AND x > 2)" path.

I think that's a consequence of the planner bug (see subsequent email) - the
bad plan wrongly uses an index that first returns a null a_id, which then
quite legitimately makes _bt_preprocess_keys() exit early.  That bit of
smartness did confuse me for a moment, I was expecting the index scan to
actually scan the index at first :).

I haven't figured out where we're dropping the implied index qual on the floor
yet...

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Peter Geoghegan

Дата:

15 июня 2023 г., 00:38:30

On Wed, Jun 14, 2023 at 2:21 PM Andres Freund <andres@anarazel.de> wrote:
> I think that's a consequence of the planner bug (see subsequent email) - the
> bad plan wrongly uses an index that first returns a null a_id, which then
> quite legitimately makes _bt_preprocess_keys() exit early.  That bit of
> smartness did confuse me for a moment, I was expecting the index scan to
> actually scan the index at first :).

It certainly looks that way now.

On further examination the only substantive difference is that the
indexes themselves differ in one leaf of the plan. The good plan and
the bad plan match everywhere else (except that the good plan uses
bitmap index scans rather than index scans everywhere).

--
Peter Geoghegan

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

15 июня 2023 г., 01:49:28

Hi,

On 2023-06-14 14:12:31 -0700, Andres Freund wrote:
> I think it's a problem with the uniqueness determination / missing a
> qual / index selection.
> 
> There are two rows in b with b.c_id = 13880, except that one of them has a
> NULL a_id:
> 
> => SELECT * FROM b WHERE c_id = 13880;
> ┌────────┬───────┬────────┐
> │   id   │ c_id  │  a_id  │
> ├────────┼───────┼────────┤
> │ 326048 │ 13880 │ (null) │
> │ 572151 │ 13880 │ 955968 │
> └────────┴───────┴────────┘
> (2 rows)
> 
> . The uniqueness information comes from:
>   "index_a_cannot_share_c" UNIQUE, btree (c_id) WHERE a_id IS NOT NULL
> 
> But note that we aren't using that index, we use
>   "index_b_c_id" btree (c_id)
> 
> which of course also contains the a_id = NULL row.
> 
> 
> We either need to force the index that we got the uniqueness information to be
> used when it is partial, or add the quals from the partial unique index to all
> other index scans.

I suspect this is an issue going back to 9c7f5229. Indeed, 9.6 doesn't
reproduce the issue (9c7f5229 was in 10). I haven't bisected it down to that,
but it seems pretty likely - and if it's not that commit, it's a closely
related one.

It's not immediately obvious to me how to nicely fix this in a backpatchable
way. An easy fix would be to not allow predicate indexes at all anymore in
relation_has_unique_index_for(), but that's a pretty big cannon - fixes the
issue though.

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Tom Lane

Дата:

15 июня 2023 г., 02:02:48

Andres Freund <andres@anarazel.de> writes:
> On 2023-06-14 14:12:31 -0700, Andres Freund wrote:
>> We either need to force the index that we got the uniqueness information to be
>> used when it is partial, or add the quals from the partial unique index to all
>> other index scans.

I did not study this example yet, but generally we ignore predicate
indexes unless their predicates can be proven from base restrictions
of their table (that's what predOK means).  If the base restrictions
are enforced at scan level, which they should be, then uniqueness
should hold at any join level regardless of whether we actually
scanned with that index or some other way.  Maybe we broke that
chain of reasoning somehow?

> It's not immediately obvious to me how to nicely fix this in a backpatchable
> way. An easy fix would be to not allow predicate indexes at all anymore in
> relation_has_unique_index_for(), but that's a pretty big cannon - fixes the
> issue though.

Yeah, that would be the easy way out if we don't find a better answer.
But right at the moment I don't understand why this is failing.

            regards, tom lane

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

15 июня 2023 г., 02:57:11

Hi,

On 2023-06-14 19:02:48 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-06-14 14:12:31 -0700, Andres Freund wrote:
> >> We either need to force the index that we got the uniqueness information to be
> >> used when it is partial, or add the quals from the partial unique index to all
> >> other index scans.
>
> I did not study this example yet, but generally we ignore predicate
> indexes unless their predicates can be proven from base restrictions
> of their table (that's what predOK means).  If the base restrictions
> are enforced at scan level, which they should be, then uniqueness
> should hold at any join level regardless of whether we actually
> scanned with that index or some other way.  Maybe we broke that
> chain of reasoning somehow?

It doesn't really hold at lower join levels with partial unique indexes, at
least as far as inner_unique goes. In this case we have one partial unique
index on b(c_id) WHERE a_id IS NOT NULL, and we have a plain index on b(c_id).
inner_unique is set to true based on the partial index - but then we decide
use the non-partial index for the index scan. That ends up returning a row
which with a_is = NULL, which won't find a match in the upper join levels. But
because the join is inner_unique, it'll not try to find another row on the
inner side.

> > It's not immediately obvious to me how to nicely fix this in a backpatchable
> > way. An easy fix would be to not allow predicate indexes at all anymore in
> > relation_has_unique_index_for(), but that's a pretty big cannon - fixes the
> > issue though.
>
> Yeah, that would be the easy way out if we don't find a better answer.
> But right at the moment I don't understand why this is failing.

Hope the above made it a bit clearer?

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Tom Lane

Дата:

15 июня 2023 г., 02:59:26

Andres Freund <andres@anarazel.de> writes:
> On 2023-06-14 19:02:48 -0400, Tom Lane wrote:
>> I did not study this example yet, but generally we ignore predicate
>> indexes unless their predicates can be proven from base restrictions
>> of their table (that's what predOK means).

> It doesn't really hold at lower join levels with partial unique indexes, at
> least as far as inner_unique goes. In this case we have one partial unique
> index on b(c_id) WHERE a_id IS NOT NULL, and we have a plain index on b(c_id).
> inner_unique is set to true based on the partial index - but then we decide
> use the non-partial index for the index scan. That ends up returning a row
> which with a_is = NULL, which won't find a match in the upper join
> levels.

But how did it decide that the partial index is predOK, if there's not
a qual forcing a_id to not be null?

            regards, tom lane

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Andres Freund

Дата:

15 июня 2023 г., 03:09:20

Hi,

On 2023-06-14 19:59:26 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-06-14 19:02:48 -0400, Tom Lane wrote:
> >> I did not study this example yet, but generally we ignore predicate
> >> indexes unless their predicates can be proven from base restrictions
> >> of their table (that's what predOK means).
> 
> > It doesn't really hold at lower join levels with partial unique indexes, at
> > least as far as inner_unique goes. In this case we have one partial unique
> > index on b(c_id) WHERE a_id IS NOT NULL, and we have a plain index on b(c_id).
> > inner_unique is set to true based on the partial index - but then we decide
> > use the non-partial index for the index scan. That ends up returning a row
> > which with a_is = NULL, which won't find a match in the upper join
> > levels.
> 
> But how did it decide that the partial index is predOK, if there's not
> a qual forcing a_id to not be null?

There is - but it's at a higher join level. That would prevent us from
returning a wrongly matching row, but in the inner_unique case we don't even
get to that point.

We obviously could make it correct by injecting the relevant check into the
index scan on the inner side, but it doesn't look trivial to do so.

Greetings,

Andres Freund

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

David Rowley

Дата:

15 июня 2023 г., 03:16:54

On Thu, 15 Jun 2023 at 11:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
> > It doesn't really hold at lower join levels with partial unique indexes, at
> > least as far as inner_unique goes. In this case we have one partial unique
> > index on b(c_id) WHERE a_id IS NOT NULL, and we have a plain index on b(c_id).
> > inner_unique is set to true based on the partial index - but then we decide
> > use the non-partial index for the index scan. That ends up returning a row
> > which with a_is = NULL, which won't find a match in the upper join
> > levels.
>
> But how did it decide that the partial index is predOK, if there's not
> a qual forcing a_id to not be null?

In check_index_predicates(), we seem to use the results of
generate_join_implied_equalities() as index predicate proofs. That
seems fishy because surely those are only valid to use in cases after
the join for the particular predicate is evaluated.

In this case a.id = b.a_id AND c.id = b.c_id are used as proofs. I
didn't debug all the way, but I assume we deduce that the NOT NULL
index is ok due to the strict join quals.

David

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

David Rowley

Дата:

15 июня 2023 г., 03:23:59

On Thu, 15 Jun 2023 at 12:16, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 15 Jun 2023 at 11:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > But how did it decide that the partial index is predOK, if there's not
> > a qual forcing a_id to not be null?
>
> In check_index_predicates(), we seem to use the results of
> generate_join_implied_equalities() as index predicate proofs. That
> seems fishy because surely those are only valid to use in cases after
> the join for the particular predicate is evaluated.
>
> In this case a.id = b.a_id AND c.id = b.c_id are used as proofs. I
> didn't debug all the way, but I assume we deduce that the NOT NULL
> index is ok due to the strict join quals.

Isn't it just a wrong assumption to use predOK for unique proofs for
all cases here.  predOK seems ok to set to true for when it comes to
using that index for the query as by the time we finish evaluating all
the joins we'd have filtered out all rows that the partial index won't
contain... The problem just lies with the assumption that it's ok to
use the unique partial index as proofs before evaluation of all quals
that were used to determine predOK is true.

Perhaps we need another predOK field that just uses baserestrictinfos...

David

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Tom Lane

Дата:

15 июня 2023 г., 03:28:20

David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 15 Jun 2023 at 11:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> But how did it decide that the partial index is predOK, if there's not
>> a qual forcing a_id to not be null?

> In this case a.id = b.a_id AND c.id = b.c_id are used as proofs. I
> didn't debug all the way, but I assume we deduce that the NOT NULL
> index is ok due to the strict join quals.

Ah.  So the index is okay to use as far as the final result is
concerned, but it's not okay for the purpose to which it's being put.

I concur that we'd better just not use partial indexes in
relation_has_unique_index_for.

            regards, tom lane

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

David Rowley

Дата:

15 июня 2023 г., 03:38:23

On Thu, 15 Jun 2023 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I concur that we'd better just not use partial indexes in
> relation_has_unique_index_for.

I wonder if that's ok for a backpatch.  This affects both left join
removals and unique joins.  Seems like suddenly making a left join
removal not work might cause someone some pain.

Is it worth trying to jam in a new boolean field into IndexOptInfo
into some spare padding to that we run predicate_implied_by() just
using baserestrictinfo and use those in
relation_has_unique_index_for()?

One problem might be that I don't see anywhere we could jam another
boolean into IndexOptInfo on 32-bit builds. Maybe we could put it at
the end given that we should only be making these structs in core?

David

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

Tom Lane

Дата:

15 июня 2023 г., 03:50:53

David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 15 Jun 2023 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I concur that we'd better just not use partial indexes in
>> relation_has_unique_index_for.

> I wonder if that's ok for a backpatch.  This affects both left join
> removals and unique joins.  Seems like suddenly making a left join
> removal not work might cause someone some pain.

I kind of doubt that this will affect any large number of users.
If it did, we'd have had reports sooner.

> Is it worth trying to jam in a new boolean field into IndexOptInfo
> into some spare padding to that we run predicate_implied_by() just
> using baserestrictinfo and use those in
> relation_has_unique_index_for()?

How will that work with the caching in innerrel_is_unique?
I also seriously doubt that we can make such a thing work
without adding parameters to any externally-visible functions.

            regards, tom lane

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

David Rowley

Дата:

15 июня 2023 г., 07:33:58

On Thu, 15 Jun 2023 at 12:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > On Thu, 15 Jun 2023 at 12:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I concur that we'd better just not use partial indexes in
> >> relation_has_unique_index_for.
>
> > I wonder if that's ok for a backpatch.  This affects both left join
> > removals and unique joins.  Seems like suddenly making a left join
> > removal not work might cause someone some pain.
>
> I kind of doubt that this will affect any large number of users.
> If it did, we'd have had reports sooner.

OK, I've attached a patch for just not using partial indexes as
uniqueness proofs.

> > Is it worth trying to jam in a new boolean field into IndexOptInfo
> > into some spare padding to that we run predicate_implied_by() just
> > using baserestrictinfo and use those in
> > relation_has_unique_index_for()?
>
> How will that work with the caching in innerrel_is_unique?
> I also seriously doubt that we can make such a thing work
> without adding parameters to any externally-visible functions.

relation_has_unique_index_for() would just check this new field that
gets set only from baserestrictinfo quals.  We'd just completely
ignore join quals for the new boolean field that'll be used in
relation_has_unique_index_for().

You're probably thinking that we could use join quals up to the level
that we've already joined to for unique joins.  That's probably true,
but seems more complex than it might be worth to do that.  I imagined
the majority of predOK are set to true based on baserestrictinfo
quals. I've no data to back that up, however.  I was just trying to
think of ways to lessen the cases we regress from fixing this bug.

David

Вложения

v1-0001-Don-t-use-partial-unique-indexes-for-unique-proof.patch

Re: BUG #17975: Nested Loop Index Scan returning wrong result

От

David Rowley

Дата:

21 июня 2023 г., 02:01:39

On Thu, 15 Jun 2023 at 16:33, David Rowley <dgrowleyml@gmail.com> wrote:
> OK, I've attached a patch for just not using partial indexes as
> uniqueness proofs.

I still have concerns about not doing anything for partial unique
indexes that are predOK as far as the baserestrictinfos are concerned.
I did push the patch to disable the unique join optimisation for all
partial unique indexes for now as we do need to at least fix the bug.

I've attached another patch which adds predOKBase and uses just
baserestrictinfo quals to set that field.  If we were to backpatch
something like this, we'd need to figure out how to represent this new
field in IndexOptInfo.  Changing the predOK bool to a uint8 and having
3 values might be an ok method.

David

Вложения

predOKBase.patch

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: BUG #17975: Nested Loop Index Scan returning wrong result

Вложения

Вложения