match_unsorted_outer() vs. cost_nestloop()
From | Robert Haas |
---|---|
Subject | match_unsorted_outer() vs. cost_nestloop() |
Date | |
Msg-id | 603c8f070909041802p18ed2fb1v91245ccfb5c2a24a@mail.gmail.com |
Responses | Re: match_unsorted_outer() vs. cost_nestloop() |
List | pgsql-hackers |
In joinpath.c, match_unsorted_outer() considers materializing the inner side of each nested loop if the inner path is not an index scan, bitmap heap scan, tid scan, material path, function scan, CTE scan, or worktable scan. In costsize.c, cost_nestloop() charges the inner path's startup cost only once if the inner path is a hash path or material path; otherwise, it charges it for every anticipated rescan.

It seems to me, perhaps naively, that the criteria used in these two places differ more than they should. For example, function scan nodes insert their results into a tuplestore so that rescans get the same set of tuples, which is why we don't consider inserting a materialize node over them in match_unsorted_outer() - but I think that also means that we oughtn't to be counting the startup cost for every rescan.

I'm not exactly sure which cases should match and which should not. Hash paths, maybe, shouldn't. I believe the reason we don't count the startup cost of the hash path over again is that we're assuming it's attributable to the cost of building the hash table, which only needs to be done once. I don't think that's 100% accurate, because the hash path could have inherited some of that cost from its underlying paths. At any rate, it's conceivable that materializing could be enough cheaper than repeating the join that a materialize node makes sense.

Thoughts?

...Robert
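To make the mismatch concrete, here is a minimal sketch of the two criteria as described in this message. It is not the actual planner code: the `PathKind` enum and the functions `skips_materialize()`, `startup_charged_once()`, and `inner_startup_charge()` are hypothetical names invented for illustration, and the cost formula is a deliberate simplification that looks only at the inner startup cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for planner path types; not PostgreSQL's node tags. */
typedef enum
{
    PATH_SEQ_SCAN,
    PATH_INDEX_SCAN,
    PATH_BITMAP_HEAP_SCAN,
    PATH_TID_SCAN,
    PATH_MATERIAL,
    PATH_FUNCTION_SCAN,
    PATH_CTE_SCAN,
    PATH_WORKTABLE_SCAN,
    PATH_HASH_JOIN
} PathKind;

/*
 * Criterion 1 (as described for match_unsorted_outer()): inner path kinds
 * for which adding a materialize node is NOT considered.
 */
static bool
skips_materialize(PathKind kind)
{
    switch (kind)
    {
        case PATH_INDEX_SCAN:
        case PATH_BITMAP_HEAP_SCAN:
        case PATH_TID_SCAN:
        case PATH_MATERIAL:
        case PATH_FUNCTION_SCAN:
        case PATH_CTE_SCAN:
        case PATH_WORKTABLE_SCAN:
            return true;
        default:
            return false;
    }
}

/*
 * Criterion 2 (as described for cost_nestloop()): inner path kinds whose
 * startup cost is charged once rather than on every rescan.
 */
static bool
startup_charged_once(PathKind kind)
{
    return kind == PATH_HASH_JOIN || kind == PATH_MATERIAL;
}

/*
 * Simplified model: total inner startup cost charged over the whole join,
 * assuming one inner scan per outer row.
 */
static double
inner_startup_charge(PathKind kind, double startup_cost, double outer_rows)
{
    assert(outer_rows >= 1.0);
    if (startup_charged_once(kind))
        return startup_cost;
    return startup_cost * outer_rows;
}
```

Under this model a function scan is treated as cheap to rescan by the first criterion (no materialize node is considered), yet the second criterion still charges its startup cost once per outer row, which is the inconsistency in question.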