Re: Memory consumed by paths during partitionwise join planning

From: Ashutosh Bapat
Subject: Re: Memory consumed by paths during partitionwise join planning
Msg-id: CAExHW5tn25KxL92FF5R66phhfeUCV0AeFEZ95KCst+Fm5nm1Rg@mail.gmail.com
In reply to: Re: Memory consumed by paths during partitionwise join planning  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
List: pgsql-hackers

On Tue, Feb 20, 2024 at 8:19 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
>
> On 19/2/2024 19:25, Ashutosh Bapat wrote:
> > On Fri, Feb 16, 2024 at 8:42 AM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> Live example: right now, I am working on the code like MSSQL has - a
> >> combination of NestLoop and HashJoin paths and switching between them in
> >> real-time. It requires both paths in the path list at the moment when
> >> extensions are coming. Even if one of them isn't referenced from the
> >> upper pathlist, it may still be helpful for the extension.
> >
> > There is no guarantee that every path presented to add_path will be
> > preserved. Suboptimal paths are freed as and when add_path discovers
> > that they are suboptimal. So I don't think an extension can rely on
> > existence of a path. But having a refcount makes it easy to preserve
> > the required paths by referencing them.
> I don't insist, just provide my use case. It would be ideal if you would
> provide some external routines for extensions that allow for sticking
> the path in pathlist even when it has terrible cost estimation.

With refcounts, you can take a reference to the path and store it
somewhere other than the pathlist. The path won't be freed until that
reference is released. RelOptInfo::pathlist is for optimal paths.
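
A minimal sketch of the idea, not the patch itself: ToyPath, toy_path_ref,
toy_path_unref and toy_paths_live are hypothetical names made up for
illustration, and plain malloc/free stand in for the planner's memory
management. It only shows how a second holder keeps a path alive after
the pathlist drops it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a planner path; NOT PostgreSQL's real Path. */
typedef struct ToyPath
{
    double total_cost;
    int    refcount;    /* number of holders currently referencing the path */
} ToyPath;

/* Demo-only counter of paths that are allocated and not yet freed. */
static int toy_paths_live = 0;

/* Allocate a path with no holders; the caller is expected to reference it. */
static ToyPath *
toy_path_create(double total_cost)
{
    ToyPath *path = malloc(sizeof(ToyPath));

    if (path == NULL)
        abort();
    path->total_cost = total_cost;
    path->refcount = 0;
    toy_paths_live++;
    return path;
}

/* Register one more holder (a pathlist, a parent path, an extension...). */
static ToyPath *
toy_path_ref(ToyPath *path)
{
    path->refcount++;
    return path;
}

/* Drop one holder; the path is freed only when the last holder lets go. */
static void
toy_path_unref(ToyPath *path)
{
    assert(path->refcount > 0);
    if (--path->refcount == 0)
    {
        free(path);
        toy_paths_live--;
    }
}
```

In this model, an extension that calls toy_path_ref() on a path before
the pathlist discards it as suboptimal still holds a valid pointer; the
memory is released only when the extension calls toy_path_unref() too.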

> >
> >>
> >>
> > IIUC, you are suggesting that instead of planning each
> > partition/partitionwise join, we only create paths with the strategies
> > which were found to be optimal with previous partitions. That's a good
> > heuristic but it won't work if partition properties - statistics,
> > indexes etc. differ between groups of partitions.
> Sure, but the "Symmetry" strategy assumes that on the scope of a
> thousand partitions, especially with parallel append involved, it
> doesn't cause sensible performance degradation if we find a bit
> suboptimal path in a small subset of partitions. Does it make sense?
> As I see, when people use 10-100 partitions for the table, they usually
> strive to keep indexes symmetrical for all partitions.
>

I agree that we need something like that. To do that, we need
machinery to prove that all partitions have similar properties. Once
that is proved, we can skip creating paths for similar partitions. But
that is outside the scope of this work, though it complements it.

--
Best Wishes,
Ashutosh Bapat


