Re: BUG #17964: Missed query planner optimization

Поиск

Список

Период

Сортировка

От	Mathias Kunter
Тема	Re: BUG #17964: Missed query planner optimization
Дата	7 июня 2023 г. 11:47:59
Msg-id	94608857-707f-55ce-99a0-ac500aeb06f8@gmail.com обсуждение исходный текст
Ответ на	Re: BUG #17964: Missed query planner optimization (David Rowley <dgrowleyml@gmail.com>)
Ответы	Re: BUG #17964: Missed query planner optimization
Список	pgsql-bugs

Дерево обсуждения

> It's not a bug that the planner does not consider evaluating the join
> before the UNION

Yes, it's not a bug, but it's something which can be improved. If I 
simply change the original query from this:

> SELECT ... WHERE id IN (SELECT ...);

into this:

> SELECT ... WHERE id = ANY(ARRAY(SELECT ...));

then Postgres uses an index scan, and the query is orders of magnitude 
faster. Note that the planner actually correctly computes the estimated 
costs for both variants, since I get:

> cost=769.11..1227.17 when using IN
> cost=86.45..86.65    when using ANY

See https://dbfiddle.uk/iOkiiTJJ

Also note that this issue doesn't only affect UNION queries. For 
example, the following query will also execute orders of magnitude 
faster if I simply replace IN with ANY:

> SELECT * FROM t WHERE x = 'a' OR y IN (SELECT ...);

Again, estimated costs say that using ANY should be faster:

> cost=8.30..2443.31 when using IN
> cost=56.45..350.69 when using ANY

See https://dbfiddle.uk/b9piwQr4

Hence, why doesn't the planner simply test whether it's beneficial to 
replace IN with ANY? It seems that all which has to be done is to 
compare the query plans for both possible execution variants. I guess 
this should be rather simple to implement, isn't it?

Thanks

Mathias

Am 06.06.23 um 23:32 schrieb David Rowley:
> On Wed, 7 Jun 2023 at 04:44, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> In the example below, the query planner uses a sequential scan (query 1)
>> even though it could use an index scan (query 2).
>>
>> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION
>> SELECT id, name FROM table2) AS q
>> WHERE id IN (SELECT id FROM table3);
> 
>> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION
>> SELECT id, name FROM table2) AS q
>> WHERE id IN (1538,8836,5486,3464,2673);
> 
> It's not a bug that the plannSer does not consider evaluating the join
> before the UNION, it's just an optimisation opportunity we don't
> currently explore.
> 
> If you want that, then write:
> 
> EXPLAIN ANALYZE SELECT id, name FROM table1 WHERE id IN (SELECT id
> FROM table3) UNION SELECT id, name FROM table2 WHERE id IN (SELECT id
> FROM table3);
> 
> David

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #17964: Missed query planner optimization