Re: WIP patch for parameterized inner paths
| From | Tom Lane |
|---|---|
| Subject | Re: WIP patch for parameterized inner paths |
| Date | |
| Msg-id | 29697.1327508653@sss.pgh.pa.us |
| In reply to | WIP patch for parameterized inner paths (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: WIP patch for parameterized inner paths |
| | Re: WIP patch for parameterized inner paths |
| List | pgsql-hackers |
I wrote:
> Attached is a WIP patch for parameterized paths, along the
> lines we have discussed before: ...

I've made considerable progress on the TODO items I listed: indxpath.c has been ripped apart and restructured, and I have it considering parameterized paths for hash sub-joins. I made a deliberate policy decision not to work very hard on exploring parameterized mergejoin plans, because that seems to inflate the number of paths considered way more than the relative usefulness of merge over hash joins justifies. Also, the attached patch undoes the lobotomization of IN/EXISTS pullup that we had to install in 8.4 as a stopgap, for lack of ability to consider the type of plan that can now be handled.

After that I started doing some performance testing, and the initial news was bad: the planner was about 3x slower than 9.1 on a moderately large join problem. I've spent the past several days hacking away at that, and have got it down to about 35% slower by dint of the following changes:

* micro-optimization of add_path(), in particular avoiding duplicate calculations during cost comparisons.

* introducing a two-step mechanism for testing whether proposed join paths are interesting. The patch first calculates a quick and dirty lower bound on the cost of the join path (mostly, from the costs of its input paths) and looks through the joinrel's pathlist to see if there is already a path that is clearly better. If not, it proceeds with the full cost calculation and add_path insertion. In my testing the precheck is able to eliminate 80% or so of the proposed paths, more than repaying the extra pathlist search.

Now this is of course cheating with both hands, in that either of these optimization techniques could have been applied before ... but they weren't. I think that 35% slower on large join problems is probably acceptable, given that this is investigating a larger number of possible solutions and hopefully often finding a better plan.

I think I can knock some more off of that by refactoring the costsize.c APIs, anyway. Right now the first-pass and second-pass cost calculations are independent, and so there's some repeated work, which I'd like to eliminate both for speed reasons and to get rid of duplicate code that'd have to be kept in sync if it's left as-is.

If there are no objections, I'd like to go ahead and commit this after one more round of polishing. There's still a lot left to do, but what it mainly needs now is some testing on real-world planning problems, and I don't think it's going to get that until it's been merged in.

			regards, tom lane
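PS: To make the two-step precheck concrete, here is a minimal sketch in C of the shape of the test. The type and function names below are illustrative simplifications, not the patch's actual data structures or API:

    #include <stdbool.h>

    /*
     * Hypothetical stand-in for a path in a rel's pathlist; the real
     * planner Path node carries much more than this.
     */
    typedef struct SketchPath
    {
        double      total_cost;     /* estimated total execution cost */
        struct SketchPath *next;    /* next path in the rel's pathlist */
    } SketchPath;

    /*
     * Step 1: derive a quick-and-dirty lower bound on the proposed join
     * path's cost from its input paths alone (a join can't be cheaper
     * than the sum of its inputs).  Step 2: scan the joinrel's existing
     * pathlist; if some path already beats that optimistic bound, the
     * proposed path can't survive add_path(), so we skip the expensive
     * full cost calculation.  (A real test must also account for
     * pathkeys, startup cost, and parameterization before rejecting.)
     */
    static bool
    proposed_join_path_is_interesting(SketchPath *pathlist,
                                      double outer_total_cost,
                                      double inner_total_cost)
    {
        double      lower_bound = outer_total_cost + inner_total_cost;
        SketchPath *old;

        for (old = pathlist; old != NULL; old = old->next)
        {
            if (old->total_cost <= lower_bound)
                return false;   /* an existing path is clearly better */
        }
        return true;            /* worth costing fully and add_path'ing */
    }

The payoff is that the cheap loop rejects most candidates before the full cost model ever runs; per the numbers above, something like 80% of proposed paths die at this stage.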