Re: Plan uses wrong index, preferring to scan pkey index instead

Поиск
Список
Период
Сортировка
От Yuri Kunde Schlesner
Тема Re: Plan uses wrong index, preferring to scan pkey index instead
Дата
Msg-id 1416355354.465985.192652933.31058871@webmail.messagingengine.com
обсуждение исходный текст
Ответ на Re: Plan uses wrong index, preferring to scan pkey index instead  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
On Sun, Nov 16, 2014, at 03:18 PM, Tom Lane wrote:
> I suspect that the reason the planner likes the backlog_pkey is that it's
> almost perfectly correlated with table order, which greatly reduces the
> number of table fetches that need to happen over the course of a
> indexscan
> compared to using the less-well-correlated bufferid+messageid index.
> So that way is estimated to be cheaper than using the less-correlated
> index ... and that may even be true except for outlier bufferid values
> with no recent messages.
Indeed, and I can imagine that this is more advantageous in the general
case, as I described in my last message. The problem is that the
variance is too high, with a 500x slowdown between the best and the
worst cases for that plan.

> What I think might be a workable solution, assuming you can stand a
> little
> downtime to do it, is to CLUSTER the table on the bufferid+messageid
> index.  This would reverse the correlation advantage and thereby solve
> your problem.  Now, ordinarily CLUSTER is only a temporary solution
> because the cluster-induced ordering degrades over time.  But I think it
> would likely be a very long time until you accumulate so many new
> messages
> that the table as a whole looks well-correlated on messageid alone.
I tried this and it seems to have solved my problem! The better plan is
consistently chosen now, and it's as fast as former plan on the fast
cases, and much faster on the slow case. I will continue monitoring the
DB to see if it eventually switches back to the former scheme, and if it
does I can just include a re-cluster on my maintenance schedule. Thanks
so much for the suggestion.


В списке pgsql-performance по дате отправления:

Предыдущее
От: Dave Johansen
Дата:
Сообщение: Re: Partitions and work_mem?
Следующее
От: Stuart Bishop
Дата:
Сообщение: A pessimistic planner