Re: New design for FK-based join selectivity estimation
От | Tomas Vondra |
---|---|
Тема | Re: New design for FK-based join selectivity estimation |
Дата | |
Msg-id | 0cc0ee1c-7c47-7971-767a-0cf22b7f0a93@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: New design for FK-based join selectivity estimation (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: New design for FK-based join selectivity estimation
|
Список | pgsql-hackers |
On 06/18/2016 09:27 PM, Tom Lane wrote: > I wrote: >> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >>> Is this a good idea? I'd probably vote to do what's proposed (and >>> rejected) in the second half of the comment, i.e. put back the clauses >>> and skip the FK as that pretty much says "improve estimate or keep the >>> current one, but don't risk making it worse." > >> I would buy that approach when it comes to "loose" quals, but I think >> it's not right for EC-derived matches, because of the behavior we >> discussed previously that an EC should be considered to have implicitly >> generated all the matching quals even though it actually made only one >> qual that might not be any of them exactly. > > After further thought I decided you're right, because one of the main > original goals of ECs was to prevent double-counting the selectivity > of redundant equality quals. Acting as though the EC had generated > multiple redundant quals is likely to make things worse not better. > > I still feel that we're leaving something on the table here, but it > would need to take the form of intelligently combining estimates for > overlapping FKs, not just blindly multiplying them together. Seems > like a task for later, especially considering that cases of this sort > are likely to be rare. > > Pushed with adjustments for that. OK, thanks. The one thing we haven't done is testing the performance, to see how this fares. So I've repeated the tests I've done on the original version of the patch here https://www.postgresql.org/message-id/8344835e-18af-9d40-aed7-bd2261be9162@2ndquadrant.com Sadly I don't have access to the machine used for the previous tests (on a vacation and the machine sits under my desk at home), so I had to use my laptop. That means a fair amount of variability due to power saving built into the CPU, and also VM variability (using Xen VM). So the numbers are not directly comparable to the old results, and I believe the differences between the patched and unpatched version seem to be quite clear despite the variability. It's true the results are not as bad as with the originally committed patch, but there are multiple cases where the planning time gets up to ~2x compared to master. See the old-results.ods, and also old-scripts.tgz (the old scripts used the GUC we removed, so I had to tweak it a bit, and you'll have to whack it a bit to get it working). Sure, those cases use many foreign keys (generally >=100), but the conclusion with the old patch was that it matters and we spent a lot of time to get it within 10% of master. There are also two or three cases where the planning got quite a bit faster, for some reason. I've also constructed another script, joining just 2 tables and doing some funky things (e.g. adding a lot of overlapping foreign keys), and in these cases the slowdown is even more significant - up to ~13x, and the stddev also increased. See new-results.ods and new-scripts.tgz. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: