Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

Поиск

Список

Период

Сортировка

От	Matthias van de Meent
Тема	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Дата	20 марта 2024 г. 19:26:38
Msg-id	CAEze2WiYOpAytLraPs72FV1MvOP1Q6vZx04zHj++-ZWyNBkqUg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Peter Geoghegan <pg@bowt.ie>)
Ответы	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Список	pgsql-hackers

Дерево обсуждения

On Sat, 16 Mar 2024 at 01:12, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Fri, Mar 8, 2024 at 9:00 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I've attached v14, where 0001 is v13, 0002 is a patch with small
> > changes + some large comment suggestions, and 0003 which contains
> > sorted merge join code for _bt_merge_arrays.
>
> This is part of my next revision, v15, which I've attached (along with
> a test case that you might find useful, explained below).
>
> v15 makes the difference between the non-required scan key trigger and
> required scan key trigger cases clearer within _bt_advance_array_keys.

OK, here's a small additional review, with a suggestion for additional
changes to _bt_preprocess:

> @@ -1117,6 +3160,46 @@ _bt_compare_scankey_args(IndexScanDesc scan, ScanKey op,
>      /*
>      * The opfamily we need to worry about is identified by the index column.
>      */
>     Assert(leftarg->sk_attno == rightarg->sk_attno);
>
> +    /*
> +     * If either leftarg or rightarg are equality-type array scankeys, we need
> +     * specialized handling (since by now we know that IS NULL wasn't used)
> +     */
> [...]
> +    }
> +
>     opcintype = rel->rd_opcintype[leftarg->sk_attno - 1];

Here, you insert your code between the comment about which opfamily to
choose and the code assigning the opfamily. I think this needs some
cleanup.

> +             * Don't call _bt_preprocess_array_keys_final in this fast path
> +             * (we'll miss out on the single value array transformation, but
> +             * that's not nearly as important when there's only one scan key)

Why is it OK to ignore it? Or, why don't we apply it here?

---

Attached 2 patches for further optimization of the _bt_preprocess_keys
path (on top of your v15), according to the following idea:

Right now, we do "expensive" processing with xform merging for all
keys when we have more than 1 keys in the scan. However, we only do
per-attribute merging of these keys, so if there is only one key for
any attribute, the many cycles spent in that loop are mostly wasted.
By checking for single-scankey attributes, we can short-track many
multi-column index scans because they usually have only a single scan
key per attribute.
The first implements that idea, the second reduces the scope of
various variables so as to improve compiler optimizability.

I'll try to work a bit more on merging the _bt_preprocess steps into a
single main iterator, but this is about as far as I got with clean
patches. Merging the steps for array preprocessing with per-key
processing and post-processing is proving a bit more complex than I'd
anticipated, so I don't think I'll be able to finish that before the
feature freeze, especially with other things that keep distracting me.

Matthias van de Meent

Вложения

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

Вложения