Re: BUG #5120: Performance difference between running a query with named cursor and straight SELECT

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #5120: Performance difference between running a query with named cursor and straight SELECT
Дата
Msg-id 14062.1255642397@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #5120: Performance difference between running a query with named cursor and straight SELECT  (Steve McLellan <smclellan@mintel.com>)
Список pgsql-bugs
Steve McLellan <smclellan@mintel.com> writes:
> Sorry, I should've tried that. The two explains are below; without cursor
> then with cursor. I don't know enough to reliably say whether there's
> anything wrong with either (we use default_statistics_target=100 although
> the estimates don't look right for some of the query fragments), but they're
> certainly very different.

Right, so the cursor plan is one that is estimated to take 3x longer,
but can deliver rows starting right away, where the other plan will have
a long delay and then spit everything out in a bunch.  It's not a bad
tradeoff, if the estimate were right.  The killer mistake is here:

>    ->  Index Scan using "Originator Type" on originator_dimension
>  (cost=0.00..125.29 rows=1 width=4) (actual time=26.067..41.906 rows=169
> loops=1)
>          Index Cond: ("Originator Type" = 'NO DATA'::text)
>          Filter: ("Area" = ANY ('{National,"Phoenix, AZ"}'::text[]))

Since that's on the outside of a nestloop, the 169x rowcount error
translates directly to a 169x error in the total runtime estimate ---
and it looks like that's not very far at all from the real ratio,
so the other estimates are pretty good.

I'm not sure whether you'd get any real improvement from a further
increase in stats target, though that's certainly an easy thing to
try.  What this looks like to me is the longstanding problem of having
correlations between the distributions of two different columns
(here, Area and Originator Type), which the planner doesn't know about.
So its estimate of the combined probability of meeting the conditions
is off.  You could try checking estimated vs real rowcount for the
area and originator conditions separately to see if that's the case
or not.  If either one is off by itself then a stats increase for that
column would help, if not there's not much to be done except lean on
the cursor target knob ...

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: BUG #5121: Segmentation Fault when using pam w/ krb5
Следующее
От: Steve McLellan
Дата:
Сообщение: Re: BUG #5120: Performance difference between running a query with named cursor and straight SELECT