Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From	Matthias van de Meent
Subject	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date	4 March 2024, 20:51:37
Msg-id	CAEze2Whe5L-z65DgvUgPkkLt6d2PxG1gV8Z_fNWjS9+-1jkDWw@mail.gmail.com
In response to	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
List	pgsql-hackers

On Sat, 2 Mar 2024 at 02:30, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Feb 15, 2024 at 6:36 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Attached is v11, which now says something like that in the commit
> > message.
>
> Attached is v12.

Some initial comments on the documentation:

> +    that searches for multiple values together.  Queries that use certain
> +    <acronym>SQL</acronym> constructs to search for rows matching any value
> +    out of a list (or an array) of multiple scalar values might perform
> +    multiple <quote>primitive</quote> index scans (up to one primitive scan
> +    per scalar value) at runtime.  See <xref linkend="functions-comparisons"/>
> +    for details.

I don't think the "see <functions-comparisons> for details" is
correctly worded: the surrounding text implies that the linked section
explains precisely which situations result in multiple primitive index
scans, while it only contains documentation about
IN/NOT IN/ANY/ALL/SOME.

Something like the following would fit better IMO:

+    that searches for multiple values together.  Queries that use certain
+    <acronym>SQL</acronym> constructs to search for rows matching any value
+    out of a list or array of multiple scalar values (such as those
+    described in <functions-comparisons>) might perform multiple
+    <quote>primitive</quote> index scans (up to one primitive scan per
+    scalar value) at runtime.
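To illustrate why the wording is "up to" one primitive scan per scalar
value, here is a toy model in Python. It is only a sketch and not the
actual nbtree logic: the function name, the fixed page size, and the
"same leaf page" rule are all assumptions made up for the illustration.

```python
from bisect import bisect_left

def count_primitive_scans(sorted_keys, page_size, search_values):
    """Toy model (hypothetical, not nbtree's real algorithm).

    The index is a sorted list cut into fixed-size "leaf pages". A new
    descent is counted only when the next array value lands on a
    different leaf page than the one the scan is already positioned on.
    """
    if not sorted_keys:
        return 0
    scans = 0
    current_page = None
    # Array values are processed in index order, without duplicates.
    for value in sorted(set(search_values)):
        pos = min(bisect_left(sorted_keys, value), len(sorted_keys) - 1)
        page = pos // page_size
        if page != current_page:
            scans += 1          # a fresh descent from the root
            current_page = page
    return scans

keys = list(range(100))
print(count_primitive_scans(keys, 10, [1, 2, 3]))    # neighbouring values
print(count_primitive_scans(keys, 10, [5, 50, 95]))  # widely spread values
```

In this model, values that are close together share a descent, while
widely spread values each need their own, so the count never exceeds
the number of distinct scalar values.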

Then there is a second issue in the paragraph: Inverted indexes such
as GIN might well decide to start counting more than one "primitive
scan" per scalar value, because they may need to go through their
internal structure more than once to produce results for a single
scalar value; e.g. with queries WHERE textfield LIKE '%some%word%', a
trigram index would likely use at least 4 descents here: one for each
of "som", "ome", "wor", "ord".
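
As a rough illustration of where those four trigrams come from, here
is a simplified Python sketch. It is not pg_trgm's implementation: the
function name is hypothetical, and it ignores escape characters,
case folding, and the blank-padding pg_trgm applies at word boundaries.

```python
import re

def like_pattern_trigrams(pattern):
    """Return the distinct trigrams found in the literal parts of a
    LIKE pattern (simplified; assumes no escaped wildcards)."""
    # Split on the LIKE wildcards % and _ to get the literal fragments.
    fragments = re.split(r"[%_]", pattern)
    trigrams = []
    for fragment in fragments:
        # Fragments shorter than three characters yield no trigram.
        for i in range(len(fragment) - 2):
            trigram = fragment[i:i + 3]
            if trigram not in trigrams:
                trigrams.append(trigram)
    return trigrams

print(like_pattern_trigrams("%some%word%"))
```

Each extracted trigram would correspond to one lookup in the index's
internal structure, which is why a single scalar value could plausibly
be counted as several primitive scans.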

> > All that really remains now is to research how we might integrate this
> > work with the recently added continuescanPrechecked/haveFirstMatch
> > stuff from Alexander Korotkov, if at all.
>
> The main change in v12 is that I've integrated both the
> continuescanPrechecked and the haveFirstMatch optimizations. Both of
> these fields are now page-level state, shared between the _bt_readpage
> caller and the _bt_checkkeys/_bt_advance_array_keys callees (so they
> appear next to the new home for _bt_checkkeys' continuescan field, in
> the new page state struct).

Cool. I'm planning to review the rest of this patch this
week/tomorrow. Could you take some time to review some of my btree
patches, too?

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)
