Re: Slow count(*) again...
From: Scott Marlowe
Subject: Re: Slow count(*) again...
Date:
Msg-id: AANLkTimayTM__hHu9RoiLaNhzfpNY6Vo5y4OkbX_sV=o@mail.gmail.com
In reply to: Slow count(*) again... (Neil Whelchel <neil.whelchel@gmail.com>)
Responses: Re: Slow count(*) again...
           Re: Slow count(*) again...
List: pgsql-performance
On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel <neil.whelchel@gmail.com> wrote:
> I know that there have been many discussions on the slowness of count(*) even
> when an index is involved, because the visibility of the rows has to be
> checked. In the past I have seen many suggestions about using triggers and
> tables to keep track of counts, and while this works fine in a situation where
> you know what the report is going to be ahead of time, this is simply not an
> option when an unknown WHERE clause is to be used (dynamically generated).
> I ran into a fine example of this when I was searching this mailing list:
> "Searching in 856,646 pages took 13.48202 seconds. Site search powered by
> PostgreSQL 8.3." Obviously at some point count(*) came into play here, because
> the site made a list of pages (1 2 3 4 5 6 > next). I very commonly make a
> list of pages from search results, and the biggest time killer here is the
> count(*) portion. Even worse, I sometimes have to hit the database with
> two SELECT statements: one with OFFSET and LIMIT to get the page of results I
> need, and another to get the total number of rows so I can estimate how many
> pages of results are available. The point I am driving at here is that since
> building a list of pages of results is such a common thing to do, there needs
> to be some specific high-speed way to do this in one query. Maybe an
> estimate(*) that works like count but gives an answer from the index without
> checking visibility? I am sure that this would be good enough to make a page
> list; it is really no big deal if it errs on the positive side, maybe the
> list of pages has an extra page off the end. I can live with that. What I
> can't live with is taking 13 seconds to get a page of results from 850,000
> rows in a table.

In 99% of situations like this you don't need an exact count, and assuming ANALYZE has run recently, selecting reltuples from pg_class for the given table is more than close enough. I'm sure wrapping that in a simple estimated_rows() function would be easy enough to do.
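As a rough sketch (not tested against your setup), something like the following should do it, assuming pg_class.reltuples, which is only refreshed by ANALYZE / autovacuum, is accurate enough for building a page list; 'my_big_table' is just a placeholder name:

    create or replace function estimated_rows(tbl regclass)
    returns bigint as $$
        -- reltuples is a planner estimate maintained by ANALYZE and autovacuum,
        -- so it can lag behind the true row count, but it's cheap to read and
        -- plenty good enough for deciding how many page links to show.
        select reltuples::bigint from pg_class where oid = tbl;
    $$ language sql stable;

    -- e.g.: select estimated_rows('my_big_table');

Note that this only covers the whole-table case; with an arbitrary WHERE clause you'd be looking at the planner's row estimate from EXPLAIN instead.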