Re: Order by (for 15 rows) adds 30 seconds to query time
От | Richard Neill |
---|---|
Тема | Re: Order by (for 15 rows) adds 30 seconds to query time |
Дата | |
Msg-id | 4B15FE63.10800@cam.ac.uk обсуждение исходный текст |
Ответ на | Re: Order by (for 15 rows) adds 30 seconds to query time ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Список | pgsql-performance |
Dear Kevin, Thanks for a very helpful reply. Kevin Grittner wrote: > Richard Neill <rn214@cam.ac.uk> wrote: > >> Am I wrong in thinking that ORDER BY is always applied after the >> main query is run? > > Yes, you are wrong to think that. It compares the costs of various > plans, and when it has an index with the high order portion matching > your ORDER BY clause, it may think that it can scan that index to > generate the correct sequence. If the sort is estimated to be > expensive enough compared to the index scan, it will use the index > scan and skip the sort. Sorting hundreds of billions of rows can be > expensive. > That makes sense now. >> Even if I run it this way: >> >> select * from (select * from h.inventory where demand_id >> =289276563) as sqry order by id; >> >> which should(?) surely force it to run the first select, then >> sort, > > I wouldn't necessarily assume that. You can EXPLAIN that form of > the query and find out easily enough. Does it say: > > -> Index Scan using inventory_demand_id on > inventory (cost=0.00..22.36 rows=28 width=56) (actual time=0.025..0.053 > rows=15 loops=1) > Index Cond: (demand_id = 289276563) > > or: > > -> Index Scan using inventory_pkey on > inventory (cost=0.00..879728.20 rows=28 width=56) (actual > time=31738.956..32860.738 rows=15 loops=1) > Filter: (demand_id = 289276563) > >> it's quick if I do order by id+1 > > You don't have an index on id+1. > Your explanation is validated by the explain - it only does the sort last iff I use "order by id+1", where there is no index for that. [Aside: using "id+0" also forces a sort.] > > The real problem to solve here is that it's estimating the rows > count for the result so badly. If you need a short-term > work-around, you've already discovered that you can keep it from > using the index on id for ordering by creating an expression using > id which causes it not to consider the index a match. That's kind > of ugly to keep long term, though. > We seem to have a general case of very bad query plans, where in other cases, explain analyze shows that the query-planner's guesses are miles adrift. Others have said that this is symptomatic of a lack of doing analyze, however we are doing quite a lot of analyzing (both through autovacuum, and a manual "vacuum verbose analyze" every night). Our underlying statistical distribution isn't that changeable. Thanks, Richard
В списке pgsql-performance по дате отправления: