Re: index v. seqscan for certain values
От | Tom Lane |
---|---|
Тема | Re: index v. seqscan for certain values |
Дата | |
Msg-id | 13762.1081803767@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: index v. seqscan for certain values ("Jeremy Dunn" <jdunn@autorevenue.com>) |
Ответы |
Re: index v. seqscan for certain values
|
Список | pgsql-performance |
"Jeremy Dunn" <jdunn@autorevenue.com> writes: > Agreed. However, given that count(*) is a question that can be answered > _solely_ using the index (without reference to the actual data blocks), As Bruno noted, that is not the case in Postgres; we must visit the table rows anyway. > When I just tried it again with a value of 300, analyze, then run the > query, I get a *worse* result for an estimate. I don't understand this. That's annoying. How repeatable are these results --- if you do ANALYZE over again several times, how much does the row count estimate change each time? (It should change somewhat, since ANALYZE is taking a random sample, but one would like to think not a whole lot.) Is the variance more or less at the higher stats target? Take a look at a few different CID values to get a sense of the accuracy, don't look at just one ... (Actually, you might find it more profitable to look at the pg_stats entry for the CID column rather than reverse-engineering the stats via ANALYZE. Look at how well the most-common-values list and associated frequency numbers track reality.) Also, can you think of any reason for the distribution of CID values to be nonuniform within the table? For instance, do rows get inserted in order of increasing CID, or is there any clustering of rows with the same CID? regards, tom lane
В списке pgsql-performance по дате отправления: