Re: Hybrid Hash/Nested Loop joins and caching results from subplans

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Дата	28 августа 2020 г. 14:54:58
Msg-id	CAApHDvo2acQSogMCa3hB7moRntXWHO8G+WSwhyty2+c8vYRq3A@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Hybrid Hash/Nested Loop joins and caching results from subplans (Andres Freund <andres@anarazel.de>)
Ответы	Re: Hybrid Hash/Nested Loop joins and caching results from subplans Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Список	pgsql-hackers

Дерево обсуждения

On Wed, 26 Aug 2020 at 03:52, Andres Freund <andres@anarazel.de> wrote:
>
> On 2020-08-25 20:48:37 +1200, David Rowley wrote:
> > However, given the correct planner choice, there will never be
> > a gross slowdown due to having the extra node.
>
> There'll be a significant reduction in increase in performance.

So I did a very rough-cut change to the patch to have the caching be
part of Nested Loop.  It can be applied on top of the other 3 v7
patches.

For the performance, the test I did results in the performance
actually being reduced from having the Result Cache as a separate
node.  The reason for this is mostly because Nested Loop projects.
Each time I fetch a MinimalTuple from the cache, the patch will deform
it in order to store it in the virtual inner tuple slot for the nested
loop. Having the Result Cache as a separate node can skip this step as
it's result tuple slot is a TTSOpsMinimalTuple, so we can just store
the cached MinimalTuple right into the slot without any
deforming/copying.

Here's an example of a query that's now slower:

select count(*) from hundredk hk inner join lookup100 l on hk.one = l.a;

In this case, the original patch does not have to deform the
MinimalTuple from the cache as the count(*) does not require any Vars
from it.   With the rough patch that's attached, the MinimalTuple is
deformed in during the transformation during ExecCopySlot(). The
slowdown exists no matter which column of the hundredk table I join to
(schema in [1]).

Performance comparison is as follows:

v7 (Result Cache as a separate node)
postgres=# explain (analyze, timing off) select count(*) from hundredk
hk inner join lookup l on hk.one = l.a;
 Execution Time: 652.582 ms

v7 + attached rough patch
postgres=# explain (analyze, timing off) select count(*) from hundredk
hk inner join lookup l on hk.one = l.a;
 Execution Time: 843.566 ms

I've not yet thought of any way to get rid of the needless
MinimalTuple deform.  I suppose the cache could just have already
deformed tuples, but that requires more memory and would result in a
worse cache hit ratios for workloads where the cache gets filled.

I'm open to ideas to make the comparison fairer.

(Renamed the patch file to .txt to stop the CFbot getting upset with me)

David

[1] https://www.postgresql.org/message-id/CAApHDvrPcQyQdWERGYWx8J+2DLUNgXu+fOSbQ1UscxrunyXyrQ@mail.gmail.com

Вложения

resultcache_in_nestloop_hacks.patch.txt

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Hybrid Hash/Nested Loop joins and caching results from subplans

Вложения