Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1
От | Tom Lane |
---|---|
Тема | Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1 |
Дата | |
Msg-id | 13031.1300123473@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1 ("John Surcombe" <John.Surcombe@digimap.gg>) |
Список | pgsql-performance |
"John Surcombe" <John.Surcombe@digimap.gg> writes: >> It'd be >> useful to see the pg_stats.correlation value for both the userid and >> receiveddatetime columns. > Yes, the table is indeed nearly perfectly ordered by receiveddatetime > (correlation 0.998479). correlation on userid is -0.065556. n_distinct > on userid is also low: 1097. Ah-hah. > Is the problem perhaps something like the following: PostgreSQL is > thinking that because there are not many userids and there is low > correlation, that if it just scans the table from the top in date order, > this will be cheap (because receiveddatetime correlation is high so it > won't have to seek randomly), and it won't have to scan very far before > it finds the first row with a matching userid. There's some of that, but I think the main problem is that there's a very high discount on the cost estimate for a perfectly-correlated index, and that makes it end up looking cheaper to use than the uncorrelated one. (It doesn't help any that we don't do correlation properly for multicolumn indexes; but given what you say above, the correlation estimate for the two-column index would be small even if we'd computed it exactly.) You might find that reducing random_page_cost would avoid the problem. That should reduce the advantage conferred on the high-correlation index, and it probably would represent your actual configuration better anyway, given the results you're showing here. regards, tom lane
В списке pgsql-performance по дате отправления: