Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Дата
Msg-id CAH2-Wzmcaf=uNhTGBj4jqXFSwbVwMw895JEMXr0bzcBdc8=FRA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers
On Mon, Jan 22, 2024 at 4:13 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> Attached 2 patches: v11.patch-a and v11.patch-b. Both are incremental
> on top of your earlier set, and both don't allocate additional memory
> in the merge operation in non-assertion builds.
>
> patch-a is a trivial and clean implementation of mergesort, which
> tends to reduce the total number of compare operations if both arrays
> are of similar size and value range. It writes data directly back into
> the main array on non-assertion builds, and with assertions it reuses
> your binary search join on scratch space for algorithm validation.

This patch fails some of my tests on non-assert builds only (assert
builds pass all my tests, though). I'm using the first patch on its
own here.

While I tend to be relatively in favor of complicated assertions (I
tend to think the risk of introducing side-effects is worth it), it
looks like you're only performing certain steps in release builds.
This is evident from just looking at the code (there is an #else block
just for the release build in the loop). Note also that
"Assert(_bt_compare_array_elements(&merged[merged_nelems++], orig,
&cxt) == 0)" has side-effects in assert-enabled builds only (it
increments merged_nelems). While it's true that you *also* increment
merged_nelems *outside* of the assertion (or in an #else block used
during non-assert builds), that is conditioned on some other thing (so
it's in no way equivalent to the debug #ifdef USE_ASSERT_CHECKING
case). It's also just really hard to understand what's going on here.

If I was going to do this kind of thing, I'd use two completely
separate loops, that were obviously completely separate (maybe even
two functions). I'd then memcmp() each array at the end.

--
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Maiquel Grassi
Дата:
Сообщение: RE: Psql meta-command conninfo+
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Transaction timeout