Re: Selecting max(pk) is slow on empty set
From:        Richard Huxton
Subject:     Re: Selecting max(pk) is slow on empty set
Date:
Msg-id:      4795FA5B.8030003@archonet.com
In reply to: Re: Selecting max(pk) is slow on empty set ("Alexander Staubo" <alex@purefiction.net>)
Replies:     Re: Selecting max(pk) is slow on empty set
List:        pgsql-general
Alexander Staubo wrote:
> On 1/22/08, Richard Huxton <dev@archonet.com> wrote:
>> Then see if an ALTER TABLE SET STATISTICS 100 makes a difference.
>
> So it does:
>
> # explain analyze select max(id) from user_messages where user_id = 13604;
>
>                              QUERY PLAN
> --------------------------------------------------------------------------
>  Aggregate  (cost=1320.52..1320.53 rows=1 width=4) (actual
>  time=13.640..13.640 rows=1 loops=1)
>    ->  Index Scan using user_messages_user on user_messages
>  (cost=0.00..1319.62 rows=358 width=4) (actual time=13.631..13.631
>  rows=0 loops=1)
>          Index Cond: (user_id = 13604)
>  Total runtime: 13.712 ms
>
> Thank you! That solves my performance problem, at least.

The row estimate still seems quite high, though. You might want to increase the statistics target even further (the maximum is 1000); the exact command is sketched below. If this is a common query, I'd perhaps look at an index on (user_id, id) rather than just (user_id); there's a sketch of that below too.

> But it's worrying that PostgreSQL should be so off in planning the
> query. Does this behaviour qualify as a bug, or is this -- that is,
> the need to tweak statistics parameters -- just your garden-variety
> application-specific optimization?

Well, it's data-specific rather than application-specific, I suppose. The issue is that there is a cost to tracking 100 values, and you don't want to pay that cost on every column in every table. If user 13604 isn't in the list of most-common users, then all the planner can really do is fix an upper bound on how many matches that user could have (the list itself is visible in pg_stats; see the sketch below). Of course, you and I can reason outside of the data and guess that many users won't have more than a handful of messages, but that's not something PG can do.

In theory, PG could auto-tune itself for various parameters. The problem then is, do you:

1. Learn constantly, meaning you constantly pay the cost of checking your decisions and never get consistent plans.

2. Learn once, in which case a change in data frequencies or usage patterns renders your learning out of date.

You might find http://pgfoundry.org/ useful, with the pgFouine / PQA projects for analysing query logs.

--
  Richard Huxton
  Archonet Ltd
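For concreteness, a minimal sketch of the per-column statistics change discussed above, using the table and column names taken from the quoted query (the follow-up ANALYZE is needed so the planner actually sees the larger sample):

    -- sample up to 100 distinct values for user_id when ANALYZE runs
    -- (the default target in 8.x was 10)
    ALTER TABLE user_messages ALTER COLUMN user_id SET STATISTICS 100;
    -- recollect statistics so the new target takes effect
    ANALYZE user_messages;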
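And a sketch of the composite index suggested above (the index name is made up; the columns come from the quoted query):

    -- index messages by user, then by id within each user
    CREATE INDEX user_messages_user_id_id ON user_messages (user_id, id);

With such an index, max(id) for a single user_id can be answered by scanning the index backwards from the highest id within that user's entries and stopping at the first row found, so it stays fast whether the user has thousands of messages or none at all.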
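To check whether a particular value (such as user 13604) actually appears in the most-common-values list mentioned above, the collected statistics are visible in the pg_stats system view; a sketch, again assuming the names from the thread:

    -- show the distinct-value estimate and most-common-values list
    -- gathered by the last ANALYZE for user_messages.user_id
    SELECT n_distinct, most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'user_messages'
       AND attname = 'user_id';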