Re: [RFC] speed up count(*)
От | Tomas Vondra |
---|---|
Тема | Re: [RFC] speed up count(*) |
Дата | |
Msg-id | d55a7173-865f-03e8-7d0b-b12942560024@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: [RFC] speed up count(*) (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [RFC] speed up count(*)
|
Список | pgsql-hackers |
On 10/20/21 19:57, Tom Lane wrote: > John Naylor <john.naylor@enterprisedb.com> writes: >> Perennially our users have complaints about slow count(*) when coming from >> some other systems. Index-only scans help, but I think we can do better. I >> recently wondered if a BRIN index could be used to answer min/max aggregate >> queries over the whole table, and it turns out it doesn't. However, then it >> occurred to me that if we had an opclass that keeps track of the count in >> each page range, that would be a way to do a fast count(*) by creating the >> right index. That would require planner support and other work, but it >> seems doable. Any opinions on whether this is worth the effort? > > The core reason why this is hard is that we insist on giving the right > answer. In particular, count(*) is supposed to count the rows that > satisfy the asker's snapshot. So I don't see a good way to answer it > from an index only, given that we don't track visibility accurately > in indexes. > Couldn't we simply inspect the visibility map, use the index data only for fully visible/summarized ranges, and inspect the heap for the remaining pages? That'd still be a huge improvement for tables with most only a few pages modified recently, which is a pretty common case. I think the bigger issue is that people rarely do COUNT(*) on the whole table. There are usually other conditions and/or GROUP BY, and I'm not sure how would that work. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: