Re: speeding up COUNT and DISTINCT queries

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: speeding up COUNT and DISTINCT queries
Дата
Msg-id 874r67ff28.fsf@stark.dyndns.tv
обсуждение исходный текст
Ответ на Re: speeding up COUNT and DISTINCT queries  (Max Baker <max@warped.org>)
Ответы Re: speeding up COUNT and DISTINCT queries  (Robert Treat <xzilla@users.sourceforge.net>)
Список pgsql-performance
Max Baker <max@warped.org> writes:

> On Wed, Mar 12, 2003 at 05:57:50PM -0800, Joe Conway wrote:
> > Max Baker wrote:
> > >Thanks for the help.  I guess i'm not clear on why there is so much
> > >extra cruft.  Does postgres leave a little bit behind every time it does
> > >an update?  Because this table is updated constantly.
> > >
> >
> > Yes. See:
> > http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/routine-vacuuming.html
>
> That would explain why once a night isn't enough.  Thanks.
> The contents of this table get refreshed every 4 hours.  I'll add a
> vacuum after every refresh and comapre the results in a couple days.

If it gets completely refreshed, ie, every tuple is updated or deleted and
re-inserted in a big batch job then VACUUM might never be enough without
boosting some config values a lot. You might need to do a VACUUM FULL after
the refresh. VACUUM FULL locks the table though which might be unfortunate.

VACUUM FULL should be sufficient but you might want to consider instead
TRUNCATE-ing the table and then reinserting records rather than deleting if
that's what you're doing. Or alternatively building the new data in a new
table and then doing a switcheroo with ALTER TABLE RENAME. However ALTER TABLE
(and possible TRUNCATE as well?) will invalidate functions and other objects
that refer to the table.

Regarding the original question:

. 7.4 will probably be faster than 7.3 at least if you stick with GROUP BY.

. You could try building an index on mac, but I suspect even then it'll choose
  the sequential scan. But try it with an index and enable_seqscan = off to
  see if it's even worth trying to get it to use the index. If so you'll have
  to lower random_page_cost and/or play with cpu_tuple_cost and other
  variables to get it to do so.

. You might also want to cluster the table on that index. You would have to
  recluster it every time you do your refresh and it's not clear how much it
  would help if any. But it might be worth trying.

--
greg

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Christopher Kings-Lynne"
Дата:
Сообщение: Re: speeding up COUNT and DISTINCT queries
Следующее
От: Robert Treat
Дата:
Сообщение: Re: speeding up COUNT and DISTINCT queries