Re: Performance issue in foreign-key-aware join estimation
От | Tomas Vondra |
---|---|
Тема | Re: Performance issue in foreign-key-aware join estimation |
Дата | |
Msg-id | 3faaa4df-2fa1-9162-2e1b-db48d4334aca@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: Performance issue in foreign-key-aware join estimation (David Rowley <david.rowley@2ndquadrant.com>) |
Ответы |
Re: Performance issue in foreign-key-aware join estimation
|
Список | pgsql-hackers |
On 12/24/18 1:07 PM, David Rowley wrote: > On Mon, 24 Dec 2018 at 09:38, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Using the above idea, but rather than going to the trouble of storing >> PlannerInfo->eq_classes as an array type list, if we build the array >> on the fly inside match_foreign_keys_to_quals(), then build a >> Bitmapset type index to mark which of the eclasses contains members >> for each relation, then I can get the run-time for the function down >> to just 0.89%. Looking at other functions appearing high on the >> profile I also see; have_relevant_eclass_joinclause() (14%), >> generate_join_implied_equalities_for_ecs() (23%). > > I've now expanded the proof of concept patch to use this indexing > technique for have_relevant_eclass_joinclause() and > generate_join_implied_equalities_for_ecs(). With Tom's test from > up-thread, I get: > > Master: > latency average = 14125.374 ms > > Patched: > latency average = 2417.164 ms > Yes, I can confirm these measurements. On my machine, timing on master is about 10530ms, with v1 of the patch it drops to 2600ms and v2 pushes it down to 1610ms. I however observe failures on 4 regression test suites - inherit, equivclass, partition_join and partition_prune (diff attached). That's a bit surprising, because AFAICS the patch merely optimizes the execution and should not change the planning otherwise (all the failures seem to be a query plan changing in some way). I'm not sure if the plans are correct or better than the old ones. The other thing is that we probably should not use a single test case to measure the optimization - I doubt it can improve less extreme queries, but maybe we should verify it does not regress them? With the patch attached, bms_overlap gets quite high in the profiles. I think one reason for that is that all bitmap operations (not just bms_overlap) always start the loop at 0. For the upper boundary is usually determined as Min() of the lengths, but we don't do that for lower boundary because we don't track that. The bitmaps for eclasses are likely sparse, so this is quite painful. Attaches is a simple patch that adds tracking of "minword" and uses it in various bms_ methods to skip initial part of the loop. On my machine this reduces the timings by roughly 5% (from 1610 to 1530 ms). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: