Re: A performance issue with Memoize

Поиск
Список
Период
Сортировка
От Richard Guo
Тема Re: A performance issue with Memoize
Дата
Msg-id CAMbWs48ePZupHo01wAek9Qv+naf0ZY+CUF0fHW1OHXrXL_fMzQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: A performance issue with Memoize  (Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>)
Список pgsql-hackers

On Thu, Feb 1, 2024 at 3:43 PM Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote:
Hi,

I've seen a similar issue with the following query (tested on the current head):

EXPLAIN ANALYZE SELECT * FROM tenk1 t1
LEFT JOIN LATERAL (SELECT t1.two, tenk2.hundred, tenk2.two FROM tenk2) t2
ON t1.hundred = t2.hundred WHERE t1.hundred < 5;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=8.46..1495.10 rows=50000 width=256)
(actual time=0.860..111.013 rows=50000 loops=1)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=8.16..376.77 rows=500
width=244) (actual time=0.798..1.418 rows=500 loops=1)
         Recheck Cond: (hundred < 5)
         Heap Blocks: exact=263
         ->  Bitmap Index Scan on tenk1_hundred  (cost=0.00..8.04
rows=500 width=0) (actual time=0.230..0.230 rows=500 loops=1)
               Index Cond: (hundred < 5)
   ->  Memoize  (cost=0.30..4.89 rows=100 width=12) (actual
time=0.009..0.180 rows=100 loops=500)
         Cache Key: t1.hundred
         Cache Mode: logical
         Hits: 0  Misses: 500  Evictions: 499  Overflows: 0  Memory Usage: 5kB
         ->  Index Scan using tenk2_hundred on tenk2  (cost=0.29..4.88
rows=100 width=12) (actual time=0.007..0.124 rows=100 loops=500)
               Index Cond: (hundred = t1.hundred)
 Planning Time: 0.661 ms
 Execution Time: 113.076 ms
(14 rows)

The memoize's cache key only uses t1.hundred while the nested loop has
two changed parameters: the lateral var t1.two and t1.hundred. This
leads to a chgParam that is always different and the cache is purged
on each rescan.

Thanks for the report!  This issue is caused by that we fail to check
PHVs for lateral references, and hence cannot know that t1.two should
also be included in the cache keys.

I reported exactly the same issue in [1], and verified that it can be
fixed by the patch in [2] (which seems to require a rebase).

[1] https://www.postgresql.org/message-id/CAMbWs4_imG5C8rXt7xdU7zf6whUDc2rdDun%2BVtrowcmxb41CzA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAMbWs49%2BCjoy0S0xkCRDcHXGHvsYLOdvr9jq9OTONOBnsgzXOg%40mail.gmail.com

Thanks
Richard

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Fix bugs not to discard statistics when changing stats_fetch_consistency
Следующее
От: David Rowley
Дата:
Сообщение: Re: set_cheapest without checking pathlist