Re: Mystery: functions are slow with group by, but...
От | Tom Lane |
---|---|
Тема | Re: Mystery: functions are slow with group by, but... |
Дата | |
Msg-id | 20624.956080922@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Mystery: functions are slow with group by, but... (Allan Kelly <allan.kelly@buildstore.co.uk>) |
Список | pgsql-sql |
Allan Kelly <allan.kelly@buildstore.co.uk> writes: > I have a big performance problem using a function: > This takes 5-10 _minutes_ : > select a4, total_live_a4_properties( a4 ) > from properties group by a4; > There are 5132 records in the properties table, and 64 distinct a4 values. > However this next one is < 3 _seconds_! > select a4, count(*), (100*count(*)/total_live_a4_properties( a4 )) as percentage > from properties group by a4; > Can anyone tell me what's going on? You were right to look to EXPLAIN for a hint about the problem, but unfortunately EXPLAIN doesn't show you anything about where a function call is evaluated in the plan tree. If you dig into the EXPLAIN VERBOSE output, you can see where it's evaluated. It turns out that in the first query, 6.5.* evaluates the function call in the seqscan's target list, which means that the function is evaluated separately at each row of the table :-(. Then, all the duplicate rows are thrown away by the group step, so you just did 5000+ useless function evaluations, each of which needed a scan of the table. In the second case, the behavior is different because of the presence of an aggregate function --- the planner knows it cannot evaluate that until the top level, so the whole expression including the user function call is done at the output of the aggregate step. Net result, only one function evaluation per group. This is, of course, pretty braindead. It's fixed in 7.0: functions on group columns are not evaluated until after the group step, with or without aggregates. regards, tom lane
В списке pgsql-sql по дате отправления: