Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
От | Ashutosh Bapat |
---|---|
Тема | Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids) |
Дата | |
Msg-id | CAExHW5tREjmH9qMTVLR0fjiCojjwfgqsax4RUrxxmaQKWbLWZQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids) (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Ответы |
Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
|
Список | pgsql-hackers |
On Fri, Sep 8, 2023 at 3:22 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Fri, Jul 28, 2023 at 3:16 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > Hi Tom, Richard, > > > > On Mon, Jul 24, 2023 at 8:17 AM Richard Guo <guofenglinux@gmail.com> wrote: > > > > > > Thanks for pushing it! > > > > With this fix, I saw a noticeable increase in the memory consumption > > of planner. I was running experiments mentioned in [1] The reason is > > the Bitmapset created by bms_union() are not freed during planning and > > when there are thousands of child joins involved, bitmapsets takes up > > a large memory and there any a large number of bitmaps. > > > > Attached 0002 patch fixes the memory consumption by calculating > > appinfos only once and using them twice. The number look like below > > > > Number of tables joined | without patch | with patch | > > ------------------------------------------------------ > > 2 | 40.8 MiB | 40.3 MiB | > > 3 | 151.6 MiB | 146.9 MiB | > > 4 | 463.9 MiB | 445.5 MiB | > > 5 | 1663.9 MiB | 1563.3 MiB | > > > > The memory consumption is prominent at higher number of joins as that > > exponentially increases the number of child joins. > > > > Attached setup.sql and queries.sql and patch 0001 were used to measure > > memory similar to [1]. > > > > I don't think it's a problem with the patch itself. We should be able > > to use Bitmapset APIs similar to what patch is doing. But there's a > > problem with our Bitmapset implementation. It's not space efficient > > for thousands of partitions. A quick calculation reveals this. > > > > If the number of partitions is 1000, the matching partitions will > > usually be 1000, 2000, 3000 and so on. Thus the bitmapset represnting > > the relids will be {b 1000, 2000, 3000, ...}. To represent a single > > 1000th digit current Bitmapset implementation will allocate 1000/8 = > > 125 bytes of memory. A 5 way join will require 125 * 5 = 625 bytes of > > memory. This is even true for lower relid numbers since they will be > > 1000 bits away e.g. (1, 1001, 2001, 3001, ...). So 1000 such child > > joins require 625KiB memory. Doing this as many times as the number of > > joins we get 120 * 625 KiB = 75 MiB which is closer to the memory > > difference we see above. > > > > Even if we allocate a list to hold 5 integers it will not take 625 > > bytes. I think we need to improve our Bitmapset representation to be > > efficient in such cases. Of course whatever representation we choose > > has to be space efficient for a small number of tables as well and > > should gel well with our planner logic. So I guess some kind of > > dynamic representation which changes the underlying layout based on > > the contents is required. I have looked up past hacker threads to see > > if this has been discussed previously. > > > > I don't think this is the thread to discuss it and also I am not > > planning to work on it in the immediate future. But I thought I would > > mention it where I found it. > > > > [1] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com > > > > Adding this small patch to the commitfest in case somebody finds it > worth fixing this specific memory consumption. With a new subject. Rebased patches. 0001 - is same as the squashed version of patches at https://www.postgresql.org/message-id/CAExHW5sCJX7696sF-OnugAiaXS=Ag95=-m1cSrjcmyYj8Pduuw@mail.gmail.com. 0002 is the actual fix described earlier -- Best Wishes, Ashutosh Bapat
Вложения
В списке pgsql-hackers по дате отправления: