Re: index v. seqscan for certain values

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: index v. seqscan for certain values
Дата	12 апреля 2004 г. 18:02:50
Msg-id	13762.1081803767@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: index v. seqscan for certain values ("Jeremy Dunn" <jdunn@autorevenue.com>)
Ответы	Re: index v. seqscan for certain values
Список	pgsql-performance

Дерево обсуждения

"Jeremy Dunn" <jdunn@autorevenue.com> writes:
> Agreed.  However, given that count(*) is a question that can be answered
> _solely_ using the index (without reference to the actual data blocks),

As Bruno noted, that is not the case in Postgres; we must visit the
table rows anyway.

> When I just tried it again with a value of 300, analyze, then run the
> query, I get a *worse* result for an estimate.  I don't understand this.

That's annoying.  How repeatable are these results --- if you do ANALYZE
over again several times, how much does the row count estimate change
each time?  (It should change somewhat, since ANALYZE is taking a random
sample, but one would like to think not a whole lot.)  Is the variance
more or less at the higher stats target?  Take a look at a few different
CID values to get a sense of the accuracy, don't look at just one ...

(Actually, you might find it more profitable to look at the pg_stats
entry for the CID column rather than reverse-engineering the stats via
ANALYZE.  Look at how well the most-common-values list and associated
frequency numbers track reality.)

Also, can you think of any reason for the distribution of CID values
to be nonuniform within the table?  For instance, do rows get inserted
in order of increasing CID, or is there any clustering of rows with the
same CID?

            regards, tom lane

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: index v. seqscan for certain values