Re: [PoC] Reducing planning time when tables have many partitions

From: Yuya Watari
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Msg-id: CAJ2pMkaNzmvMUm9igQwRH0AAo39gsjnE1VXupPGyLR2T7ENnUQ@mail.gmail.com
In reply to: Re: [PoC] Reducing planning time when tables have many partitions  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Replies: Re: [PoC] Reducing planning time when tables have many partitions  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Re: [PoC] Reducing planning time when tables have many partitions  ("Lepikhov Andrei" <a.lepikhov@postgrespro.ru>)
List: pgsql-hackers
Hello Ashutosh and Andrey,

Thank you for your email, and I really apologize for my late response.

On Thu, Sep 7, 2023 at 3:43 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> It seems that  you are still investigating and fixing issues. But the
> CF entry is marked as "needs review". I think a better status is
> "WoA". Do you agree with that?

Yes, I am now investigating and fixing issues. I agree with you and
changed the entry's status to "Waiting on Author". Thank you for your
advice.

On Tue, Sep 19, 2023 at 5:21 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Working on self-join removal in the thread [1] nearby, I stuck into the
> problem, which made an additional argument to work in this new direction
> than a couple of previous ones.
> With indexing positions in the list of equivalence members, we make some
> optimizations like join elimination more complicated - it may need to
> remove some clauses and equivalence class members.
> For changing lists of derives or ec_members, we should go through all
> the index lists and fix them, which is a non-trivial operation.

Thank you for looking into this and pointing that out. I understand
that this problem will occur in places such as your patch [1], quoted
below, because we would need to modify
RelOptInfo->eclass_child_members in addition to ec_members. Is my
understanding correct? (Of course, I am aware of ec_[no]rel_members,
but I doubt we need them.)

=====
+static void
+update_eclass(EquivalenceClass *ec, int from, int to)
+{
+   List       *new_members = NIL;
+   ListCell   *lc;
+
+   foreach(lc, ec->ec_members)
+   {
+       EquivalenceMember  *em = lfirst_node(EquivalenceMember, lc);
+       bool                is_redundant = false;
+
        ...
+
+       if (!is_redundant)
+           new_members = lappend(new_members, em);
+   }
+
+   list_free(ec->ec_members);
+   ec->ec_members = new_members;
=====
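
As a rough illustration of the concern (a standalone sketch, not
PostgreSQL code; Member, compact_members, and fix_index_list are
hypothetical names): once redundant members are dropped from a list,
every stored position after the removal point shifts, so each separate
list of positions has to be rewritten as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an equivalence member; not a PostgreSQL struct. */
typedef struct Member
{
    int id;
    int redundant;      /* nonzero if the member should be removed */
} Member;

/*
 * Compact members[] in place, dropping redundant entries.  remap[old_pos]
 * receives the new position of each surviving member, or -1 if it was
 * removed.  Returns the new member count.
 */
static size_t
compact_members(Member *members, size_t n, int *remap)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (members[i].redundant)
            remap[i] = -1;
        else
        {
            remap[i] = (int) kept;
            members[kept++] = members[i];
        }
    }
    return kept;
}

/*
 * Rewrite one list of stored positions using remap[].  Positions that
 * pointed at removed members are dropped.  Returns the new list length.
 * This has to be repeated for every index list that refers to members[].
 */
static size_t
fix_index_list(int *positions, size_t n, const int *remap)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        int new_pos = remap[positions[i]];

        if (new_pos >= 0)
            positions[kept++] = new_pos;
    }
    return kept;
}
```

The second function is the extra work: it is cheap for one list, but it
must be applied to every per-relation index list that refers to the
member list.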

I think we may be able to remove the eclass_child_members field by
creating child members on demand. v20 creates child members in
add_[child_]join_rel_equivalences() and adds them to
RelOptInfo->eclass_child_members. If we instead translate parent
members when child members are requested,
RelOptInfo->eclass_child_members is no longer necessary. After that,
there is only ec_members, which consists of parent members, so
removing clauses will still be simple. Do you think this idea will
solve your problem? If so, I will experiment with this and share a new
patch version. The main concern with this idea is that the same child
member will be created many times, wasting time and memory. Some
techniques like caching might solve this.
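
A minimal sketch of what on-demand translation with caching could look
like, under simplifying assumptions. This is standalone C, not
PostgreSQL code: Member, ChildMemberCache, get_child_member, and the
fixed-size arrays are all hypothetical, and "translation" here just
copies the column number onto the child relation.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PARENTS  8      /* assumed upper bound on parent members */
#define MAX_CHILDREN 16     /* assumed upper bound on child relations */

/* Hypothetical, simplified member: a column of some relation. */
typedef struct Member
{
    int relid;      /* relation the expression refers to */
    int attno;      /* column number within that relation */
} Member;

/* Cache of translated child members, filled lazily. */
typedef struct ChildMemberCache
{
    Member child[MAX_PARENTS][MAX_CHILDREN];
    bool   valid[MAX_PARENTS][MAX_CHILDREN];
    int    n_translations;  /* counts real translations, for illustration */
} ChildMemberCache;

/*
 * Return the child version of parents[parent_idx] for the child relation
 * child_relids[child_idx].  The member is translated on the first request
 * and served from the cache afterwards.
 */
static const Member *
get_child_member(ChildMemberCache *cache, const Member *parents,
                 int parent_idx, int child_idx, const int *child_relids)
{
    assert(parent_idx >= 0 && parent_idx < MAX_PARENTS);
    assert(child_idx >= 0 && child_idx < MAX_CHILDREN);

    if (!cache->valid[parent_idx][child_idx])
    {
        /* "Translate": same column, but on the child relation. */
        cache->child[parent_idx][child_idx].relid = child_relids[child_idx];
        cache->child[parent_idx][child_idx].attno = parents[parent_idx].attno;
        cache->valid[parent_idx][child_idx] = true;
        cache->n_translations++;
    }
    return &cache->child[parent_idx][child_idx];
}
```

With this shape, only the parent member list needs to be edited when a
clause is removed, although a real implementation would presumably also
have to invalidate the cached entries for a removed parent member; the
sketch does not address that.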

[1] https://www.postgresql.org/message-id/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru

--
Best regards,
Yuya Watari
