Re: Optimization outcome depends on the index order

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Optimization outcome depends on the index order
Дата
Msg-id CAPpHfdsE+v+mDoX2OF-wBTR4eiyiB3urwMEJUPbSwjJiG-P=Gg@mail.gmail.com
обсуждение исходный текст
Ответ на Optimization outcome depends on the index order  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: Optimization outcome depends on the index order  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers
On Fri, Dec 22, 2023 at 8:53 AM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
> On 21/12/2023 12:10, Alexander Korotkov wrote:
>  > I took a closer look at the patch in [9].  I should drop my argument
>  > about breaking the model, because add_path() already considers other
>  > aspects than just costs.  But I have two more note about that patch:
>  >
>  > 1) It seems that you're determining the fact that the index path
>  > should return strictly one row by checking path->rows <= 1.0 and
>  > indexinfo->unique.  Is it really guaranteed that in this case quals
>  > are matching unique constraint?  path->rows <= 1.0 could be just an
>  > estimation error.  Or one row could be correctly estimated, but it's
>  > going to be selected by some quals matching unique constraint and
>  > other quals in recheck.  So, it seems there is a risk to select
>  > suboptimal index due to this condition.
>
> Operating inside the optimizer, we consider all estimations to be the
> sooth. This patch modifies only one place: having two equal assumptions,
> we just choose one that generally looks more stable.
> Filtered tuples should be calculated and included in the cost of the
> path. The decision on the equality of paths has been made in view of the
> estimation of these filtered tuples.

Even if estimates are accurate the conditions in the patch doesn't guarantee there is actually a unique condition.

# create table t as select i/1000 a, i % 1000 b, i % 1000 c from generate_series(1,1000000) i;
# create unique index t_unique_idx on t(a,b);
# create index t_another_idx on t(a,c);
# \d t
                 Table "public.t"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
 c      | integer |           |          |
Indexes:
    "t_another_idx" btree (a, c)
    "t_unique_idx" UNIQUE, btree (a, b)
# set enable_bitmapscan = off; explain select * from t where a = 1 and c = 1;
SET
Time: 0.459 ms
                                QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using t_unique_idx on t  (cost=0.42..1635.16 rows=1 width=12)
   Index Cond: (a = 1)
   Filter: (c = 1)
(3 rows)

 
>  > 2) Even for non-unique indexes this patch is putting new logic on top
>  > of the subsequent code.  How we can prove it's going to be a win?
>  > That could lead, for instance, to dropping parallel-safe paths in
>  > cases we didn't do so before.
> Because we must trust all predictions made by the planner, we just
> choose the most trustworthy path. According to the planner logic, it is
> a path with a smaller selectivity. We can make mistakes anyway just
> because of the nature of estimation.

Even if we need to take selectivity into account here, it's still not clear why this should be on top of other logic later in add_path().

>  > Anyway, please start a separate thread if you're willing to put more
>  > work into this.
>
> Done

Thanks.

------
Regards,
Alexander Korotkov

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Junwang Zhao
Дата:
Сообщение: Re: [meson] expose buildtype debug/optimization info to pg_config
Следующее
От: shveta malik
Дата:
Сообщение: Re: Synchronizing slots from primary to standby