Re: Shared detoast Datum proposal

From: Andy Fan
Subject: Re: Shared detoast Datum proposal
Date:
Msg-id: 87zfvedbsr.fsf@163.com
In reply to: Re: Shared detoast Datum proposal  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Shared detoast Datum proposal  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

> On 2/26/24 16:29, Andy Fan wrote:
>>
>> ...
>> I can understand the benefits of the TOAST cache, but the following
>> issues don't look good to me, IIUC:
>>
>> 1). We can't put the result into tts_values[*], since without the
>> planner's decision we don't know if that would break the small_tlist
>> logic. But if we put it into the cache only, a cache lookup becomes an
>> unavoidable overhead.
>
> True - if you're comparing having the detoasted value in tts_values[*]
> directly with having to do a lookup in a cache, then yes, there's a bit
> of an overhead.
>
> But I think from the discussion it's clear having to detoast the value
> into tts_values[*] has some weaknesses too, in particular:
>
> - It requires deciding which attributes to detoast eagerly, which is
> quite invasive (having to walk the plan, ...).
>
> - I'm sure there will be cases where we choose to not detoast, but it
> would be beneficial to detoast.
>
> - Detoasting just the initial slices does not seem compatible with this.
>
> IMHO the overhead of the cache lookup would be negligible compared to
> the repeated detoasting of the value (which is the current baseline). I
> somewhat doubt the difference compared to tts_values[*] will be even
> measurable.
>
>> 
>> 2). It is hard to decide which entry should be evicted without tying
>> the cache to the TupleTableSlot's life-cycle; we can't guarantee that
>> the entry we keep is the entry we will reuse soon.
>> 
>> 
>
> True. But is that really a problem? I imagined we'd set some sort of
> memory limit on the cache (work_mem?), and evict oldest entries. So the
> entries would eventually get evicted, and the memory limit would ensure
> we don't consume arbitrary amounts of memory.
>
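
For concreteness, here is a minimal sketch of what such a memory-limited
eviction could look like, assuming entries are kept on a dlist in
insertion order and accounted against work_mem. All of the names here
(ToastCacheEntry, toast_cache_evict, toast_cache_lru, ...) are
hypothetical, not from any committed code:

#include "postgres.h"
#include "lib/ilist.h"
#include "miscadmin.h"          /* work_mem */

typedef struct ToastCacheEntry
{
    dlist_node  lru_node;       /* position in eviction order */
    Size        size;           /* bytes held by the detoasted copy */
    struct varlena *detoasted;  /* the cached detoasted value */
} ToastCacheEntry;

static dlist_head toast_cache_lru = DLIST_STATIC_INIT(toast_cache_lru);
static Size toast_cache_bytes = 0;

/*
 * Evict the oldest entries until the cache fits within work_mem
 * (work_mem is in kilobytes).
 */
static void
toast_cache_evict(void)
{
    while (toast_cache_bytes > (Size) work_mem * 1024 &&
           !dlist_is_empty(&toast_cache_lru))
    {
        ToastCacheEntry *victim =
            dlist_tail_element(ToastCacheEntry, lru_node,
                               &toast_cache_lru);

        dlist_delete(&victim->lru_node);
        toast_cache_bytes -= victim->size;
        pfree(victim->detoasted);
        /* the matching HTAB entry would be removed here as well */
    }
}

The policy is deliberately crude (pure insertion order); whether the
evicted entry is the one we were about to reuse is exactly the open
question discussed below.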

Here is a copy from the shared_detoast_datum.org file in the patch. My
feeling is that when / which entry to free is the key problem, while the
run-time (detoast_attr) overhead vs. the createplan.c overhead is a
smaller difference; the overhead I paid in createplan.c/setref.c does
not look huge either.

"""
An alternative design: toast cache
----------------------------------

This method was suggested by Tomas during the review process. IIUC, it
maintains a local HTAB which maps a toast datum to its detoasted datum,
and the entries are maintained / used inside the detoast_attr
function. With this method the overall design is pretty clear, and the
code modifications can be confined to the toasting system only.

I assumed that releasing all of the memory at once at the end of the
executor is not an option, since it may consume too much memory. Then,
when and which entry to release becomes a problem for me. For example:

          QUERY PLAN
------------------------------
 Nested Loop
   Join Filter: (t1.a = t2.a)
   ->  Seq Scan on t1
   ->  Seq Scan on t2
(4 rows)

In this case t1.a needs a longer lifespan than t2.a since it is in the
outer relation. Without the help of the slot's life-cycle system, I
can't think of an answer to the above question.

The other difference between the two methods is that my method has many
modifications in createplan.c/setref.c/execExpr.c/execExprInterp.c, but
it can save some run-time effort, such as the hash_search find / enter
calls needed in method 2, since I put the detoasted values directly into
tts_values[*].

I'm not sure factor 2 makes a measurable difference in real cases, so my
current concern mainly comes from factor 1.
"""


-- 
Best Regards
Andy Fan



