add_path optimization
От | Robert Haas |
---|---|
Тема | add_path optimization |
Дата | |
Msg-id | 603c8f070901312037ld88d3f4qd4821c4b44c5c3f6@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: add_path optimization
Re: add_path optimization Re: add_path optimization Re: add_path optimization Re: add_path optimization |
Список | pgsql-hackers |
I've been doing some benchmarking and profiling on the PostgreSQL query analyzer, and it seems that (at least for the sorts of queries that I typically run) the dominant cost is add_path(). I've been able to find two optimizations that seem to help significantly: 1. add_path() often calls compare_fuzzy_path_costs() twice on the same pair of paths, and when the paths compare equal on one criterion, some comparisons are duplicated. I've refactored this function to return the results of both calculations without repeating any floating-point arithmetic. 2. match_unsorted_outer() adds as many as 5 nested loop joins at a time with the same set of pathkeys. In my tests, it tended to be ~3 - cheapest inner, cheapest inner materialized, and cheapest inner index. Since these all have the same pathkeys, clearly only the one with the cheapest total cost is in the running for cheapest total cost for that set of pathkeys, and likewise for startup cost (and the two may be the same). Yet we compare all of them against the whole pathlist, one after the other, including (for the most part) the rather expensive pathkey comparison. I've added a function add_similar_paths() and refactored match_unsorted_outer() to use it. On a couple of complex (and proprietary) queries with 12+ joins each, I measure a planning time improvement of 8-12% with the attached patch applied. It would be interesting to try to replicate this on a publicly available data set, but I don't know of a good one to use. Suggestions welcome - results of performance testing on your own favorite big queries even more welcome. Simple test harness also attached. I took the approach of dropping caches, starting the server, and then running this 5 times each on several queries, dropping top and bottom results. ...Robert
Вложения
В списке pgsql-hackers по дате отправления: