Re: BUG #17964: Missed query planner optimization
От | Mathias Kunter |
---|---|
Тема | Re: BUG #17964: Missed query planner optimization |
Дата | |
Msg-id | 94608857-707f-55ce-99a0-ac500aeb06f8@gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #17964: Missed query planner optimization (David Rowley <dgrowleyml@gmail.com>) |
Ответы |
Re: BUG #17964: Missed query planner optimization
|
Список | pgsql-bugs |
> It's not a bug that the planner does not consider evaluating the join > before the UNION Yes, it's not a bug, but it's something which can be improved. If I simply change the original query from this: > SELECT ... WHERE id IN (SELECT ...); into this: > SELECT ... WHERE id = ANY(ARRAY(SELECT ...)); then Postgres uses an index scan, and the query is orders of magnitude faster. Note that the planner actually correctly computes the estimated costs for both variants, since I get: > cost=769.11..1227.17 when using IN > cost=86.45..86.65 when using ANY See https://dbfiddle.uk/iOkiiTJJ Also note that this issue doesn't only affect UNION queries. For example, the following query will also execute orders of magnitude faster if I simply replace IN with ANY: > SELECT * FROM t WHERE x = 'a' OR y IN (SELECT ...); Again, estimated costs say that using ANY should be faster: > cost=8.30..2443.31 when using IN > cost=56.45..350.69 when using ANY See https://dbfiddle.uk/b9piwQr4 Hence, why doesn't the planner simply test whether it's beneficial to replace IN with ANY? It seems that all which has to be done is to compare the query plans for both possible execution variants. I guess this should be rather simple to implement, isn't it? Thanks Mathias Am 06.06.23 um 23:32 schrieb David Rowley: > On Wed, 7 Jun 2023 at 04:44, PG Bug reporting form > <noreply@postgresql.org> wrote: >> In the example below, the query planner uses a sequential scan (query 1) >> even though it could use an index scan (query 2). >> >> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION >> SELECT id, name FROM table2) AS q >> WHERE id IN (SELECT id FROM table3); > >> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION >> SELECT id, name FROM table2) AS q >> WHERE id IN (1538,8836,5486,3464,2673); > > It's not a bug that the plannSer does not consider evaluating the join > before the UNION, it's just an optimisation opportunity we don't > currently explore. > > If you want that, then write: > > EXPLAIN ANALYZE SELECT id, name FROM table1 WHERE id IN (SELECT id > FROM table3) UNION SELECT id, name FROM table2 WHERE id IN (SELECT id > FROM table3); > > David
В списке pgsql-bugs по дате отправления: