Performance improvement hints
From | devik@cdi.cz |
---|---|
Subject | Performance improvement hints |
Date | |
Msg-id | 39BE21D1.FB8BD89E@cdi.cz |
Replies |
Re: Performance improvement hints
Re: Performance improvement hints |
List | pgsql-hackers |
Hello,

I have encountered problems with a particular query, so I started to dig into the sources. I have two questions/ideas:

1) When the optimizer computes the size of a join, it does so as card(R1)*card(R2)*selectivity. Suppose two relations (R1 and R2), each with 10000 rows. If you (inner) join them using the equality operator, the result is at most 10000 rows, i.e. min(card(R1), card(R2)). But pg estimates 1,000,000 (it uses selectivity 0.01 here). When computing the cost, this results in a very high cost for the hash and nested-loop joins BUT a low (correct) cost for the merge join. This is because for hash and loop joins the cost is estimated from the row count, while the merge join uses a different estimation (as it always knows that a merge join can be done only on an equality op). This then leads to the use of mergejoin for the majority of joins. Unfortunately, I found that in the majority of such cases the hash join is two times faster. I tested it using SET ENABLE_MERGEJOIN=OFF. What about changing the cost estimator to use min(card(R1), card(R2)) instead of card(R1)*card(R2)*selectivity in the case where R1 and R2 are joined on equality? It should lead to much faster plans for the majority of SQLs.

2) Suppose we have a relation R1(id, name) and an index ix(id, name) on it. In a query like:

    select id, name from R1 order by id

the planner will prefer to do seqscan+sort (although R1 is rather big). And yes, it is really faster than using an indexscan. But the indexscan always looks up the actual record in the heap, even when all the needed attributes are contained in the index. Oracle and even MSSQL read attributes directly from the index without fetching the actual tuple from the heap. Is there any need to do it in such an inefficient way?

regards, devik
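The arithmetic behind point 1 can be sketched as follows. This is an illustrative toy, not PostgreSQL internals: the function names are made up, and the min() bound assumed here is only tight when the join key is unique in at least one of the relations.

```python
# Toy comparison of the two join-size estimates discussed above.
# Assumes an inner equality join R1.id = R2.id.

def generic_estimate(card_r1, card_r2, selectivity):
    """The generic formula the planner applies: card(R1) * card(R2) * sel."""
    return card_r1 * card_r2 * selectivity

def min_bound_estimate(card_r1, card_r2):
    """Proposed estimate for equality joins: min(card(R1), card(R2)).
    Note: this is an upper bound only if the key is unique on one side."""
    return min(card_r1, card_r2)

# Two relations of 10000 rows each, default equality selectivity 0.01:
print(generic_estimate(10000, 10000, 0.01))   # overestimates: 1000000.0
print(min_bound_estimate(10000, 10000))       # proposed: 10000
```

The 100x gap between the two numbers is what inflates the estimated cost of the hash and nested-loop joins relative to the merge join.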