Re: Querying distinct values from a large table
От | Igor Lobanov |
---|---|
Тема | Re: Querying distinct values from a large table |
Дата | |
Msg-id | 45BF10D7.3040203@swsoft.com обсуждение исходный текст |
Ответ на | Re: Querying distinct values from a large table (Richard Huxton <dev@archonet.com>) |
Ответы |
Re: Querying distinct values from a large table
Re: Querying distinct values from a large table |
Список | pgsql-performance |
Richard Huxton wrote: >> I have rather large table with about 5 millions of rows and a dozen of >> columns. Let's suppose that columns are named 'a', 'b', 'c' etc. I >> need to query distinct pairs of ('a';'b') from this table. > > What version of PostgreSQL is it? 8.1.4 > How many distinct values are you getting back from your 5 million rows? > If there are too many, an index isn't going to help. No more than 10,000. > Can you share the EXPLAIN ANALYSE output? You might want to try > increasing work_mem for this one query to speed any sorting. Real table and colum names are obfuscated because of NDA, sorry. explain analyze select distinct a, b from tbl EXPLAIN ANALYZE output is: Unique (cost=500327.32..525646.88 rows=1848 width=6) (actual time=52719.868..56126.356 rows=5390 loops=1) -> Sort (cost=500327.32..508767.17 rows=3375941 width=6) (actual time=52719.865..54919.989 rows=3378864 loops=1) Sort Key: a, b -> Seq Scan on tbl (cost=0.00..101216.41 rows=3375941 width=6) (actual time=16.643..20652.610 rows=3378864 loops=1) Total runtime: 57307.394 ms > How often is the table updated? Clustering might buy you some > improvements (but not a huge amount I suspect). It is updated once per 3-5 seconds. And one more thing. I don't know if it helps, but column 'a' can have value from a limited set: 0, 1 or 2. Column 'b' is also an integer (foreign key, actually). -- Igor Lobanov Internal Development Engineer SWsoft, Inc.
В списке pgsql-performance по дате отправления: