Re: Avoid detoast overhead when possible

Поиск
Список
Период
Сортировка
От Nikita Malakhov
Тема Re: Avoid detoast overhead when possible
Дата
Msg-id CAN-LCVNdc3yO3BaSpza5G76=o2eePgY4BX9azNyFW+Smf-GEDA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Avoid detoast overhead when possible  (zhihuifan1213@163.com)
Ответы Re: Avoid detoast overhead when possible  (zhihuifan1213@163.com)
Список pgsql-hackers
Hi,

Hmmm, I've checked this patch and can't see performance difference on a large
(20000 key-value pairs) json, using toasted json column several times makes no
difference between current implementation on master (like queries mentioned above).

Maybe I'm doing something wrong?

On Tue, Dec 5, 2023 at 4:16 AM <zhihuifan1213@163.com> wrote:

Hi,

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

> SELECT toastable_col FROM t1
> WHERE f(t1.toastable_col)
> ORDER BY nonindexed;

Thanks for this example! it's true that the current design requires more
memory to sort since toastable_col is detoasted at the scan stage and it
is output to the sort node. It should be avoided.

> SELECT ev_class
> FROM pg_rewrite
> WHERE octet_length(ev_action) > 1
> ORDER BY ev_class;

This one is different I think, since the ev_action (the toastable_col) is
*NOT* output to sort node, so no extra memory is required IIUC.

 * CP_SMALL_TLIST specifies that a narrower tlist is preferred.  This is
 * passed down by parent nodes such as Sort and Hash, which will have to
 * store the returned tuples.

We can also verify this by

explain (costs off, verbose) SELECT ev_class
FROM pg_rewrite
WHERE octet_length(ev_action) > 1
ORDER BY ev_class;
                            QUERY PLAN                           
------------------------------------------------------------------
 Sort
   Output: ev_class
   Sort Key: pg_rewrite.ev_class
   ->  Seq Scan on pg_catalog.pg_rewrite
         Output: ev_class
         Filter: (octet_length((pg_rewrite.ev_action)::text) > 1)
(6 rows)

Only ev_class is output to Sort node.

So if we want to make sure there is performance regression for all the
existing queries in any case, we can add 1 more restriction into the
saved-detoast-value logic. It must be (NOT under CP_SMALL_TLIST) OR (the
toastable_col is not in the output list). It can be a planner decision.

If we code like this, the result will be we need to dotoast N times
for toastable_col in qual for the below query.

SELECT toastable_col FROM t
WHERE f1(toastable_col)
AND f2(toastable_col)
..
AND fn(toastable_col)
ORDER BY any-target-entry;

However

SELECT
  f1(toastable_col),
  f2(toastable_col),
  ..
  fn(toastable_col)
FROM t
ORDER BY any-target-entry;

the current path still works for it.

This one is my favorite one so far. Another option is saving the
detoast-value in some other memory or existing-slot-in-place for
different sistuation, that would requires more expr expression changes
and planner changes. I just checked all the queries in my hand, the
current design can cover all of them.

--
Best Regards
Andy Fan





--
Regards,

--
Nikita Malakhov
Postgres Professional
The Russian Postgres Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Test 002_pg_upgrade fails with olddump on Windows
Следующее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Re: Synchronizing slots from primary to standby