Re: horrendous query challenge :-)
От | Tom Lane |
---|---|
Тема | Re: horrendous query challenge :-) |
Дата | |
Msg-id | 2935.1022790172@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: horrendous query challenge :-) (Fran Fabrizio <ffabrizio@mmrd.com>) |
Список | pgsql-general |
Fran Fabrizio <ffabrizio@mmrd.com> writes: > ... at which point the query runs pretty much instantly. > That's an awful lot of overhead for that poor function. findsite() is a > key function that we use all over the place. I thought it was fairly > efficient but this demonstrates how it can quickly get out of hand. I > suppose if I could always ensure that findsite_cache was completely > populated, we could always just hit that directly. Since "what is the > site id of the site that holds this entity?" is such a common question, > we really should have it in a table as opposed to a function lookup, > huh? Does even the simplest plpgsql function have this kind of > overhead? Or is my function poorly written? The problem is not that there's anything wrong with the function on its own terms. The problem is that the query planner has no idea what the semantics of findsite(a.foo) = b.bar are, and so it has no alternative but to execute the query as a nested loop: compare every row of A to every row of B, computing findsite() again for each such comparison. And each findsite call performs an independent probe into findparent_cache, making the thing effectively equivalent to a three-level nested loop. You did not show the query plan being used after you converted this to a join against findparent_cache, but obviously it's a lot better than a 3-level nested loop... It would have helped if the planner had computed findsite(a.foo) only once per row of A. In the current scheme of things I believe that would happen if B were being scanned with an inner indexscan, but unfortunately B was too small to justify an indexscan. (The extra cost of re-evaluating findsite isn't weighted sufficiently high in making that choice, because the planner doesn't know that findsite() is any more expensive than any other function.) However, the real lesson is that pushing table lookups down into functions prevents the planner from optimizing those lookups as joins. regards, tom lane
В списке pgsql-general по дате отправления: