Re: Performance problems with DISTINCT ON
От | Sgarbossa Domenico |
---|---|
Тема | Re: Performance problems with DISTINCT ON |
Дата | |
Msg-id | 0D9FA0704ED847C089451F8B79D1EE4C@laneniac.it обсуждение исходный текст |
Ответ на | Performance problems with DISTINCT ON ("Sgarbossa Domenico" <domenico.sgarbossa@eniac.it>) |
Ответы |
Re: Performance problems with DISTINCT ON
Re: Performance problems with DISTINCT ON |
Список | pgsql-performance |
Subject: Re: [PERFORM] Performance problems with DISTINCT ON > Sgarbossa Domenico wrote: >> I need to retrieve the most recent prices per products from a price list >> table: > >> select distinct on (articolo) articolo,data_ent,prezzo from >> listini_anagrafici order by articolo, data_ent desc >> >> but it seems that this query runs slowly... about 5/6 seconds. >> the table contains more or less 500K records, PostgreSQL version is >> 8.1.11 and the server has 4gb of RAM entirely dedicate to the db. > >> 'Unique (cost=73893.89..76551.25 rows=88312 width=24) (actual >> time=4022.578..5076.206 rows=193820 loops=1)' >> ' -> Sort (cost=73893.89..75222.57 rows=531472 width=24) (actual >> time=4022.574..4505.538 rows=531472 loops=1)' >> ' Sort Key: articolo, data_ent' >> ' -> Seq Scan on listini_anagrafici (cost=0.00..16603.72 >> rows=531472 width=24) (actual time=0.009..671.797 rows=531472 loops=1)' >> 'Total runtime: 5217.452 ms' > > You've got 531472 rows in the table and the query is going to output > 193820 of them. Scanning the whole table is almost certainly the way to > go. > > If the table doesn't change much, you could try running a CLUSTER on the > index you've created. That will lock the table while it re-orders the > physical layout of the rows based on your index though, so it's no good > if the table is updated much. > > Failing that, you could try issuing "set work_mem = ..." before the > query with increasing sizes for work_mem. That might make the sort > faster too. > Thank you for the answer, I've tried as you suggest but the only things that seems make some differences is the work_mem parameter This helps to reduce the amount of time about for the half (3 seconds) but unfortunately this ain't enough. If there are a lot of concurrent request I think it could made the data swap to the disk. Should I try a different approach to solve this issue?
В списке pgsql-performance по дате отправления: