Re: DRAFT: Pass sk_attno to consistent function

Поиск
Список
Период
Сортировка
От Michał Kłeczek
Тема Re: DRAFT: Pass sk_attno to consistent function
Дата
Msg-id 78D9054A-B4EC-475D-8118-C434C6D3261D@kleczek.org
обсуждение исходный текст
Ответ на DRAFT: Pass sk_attno to consistent function  (Michał Kłeczek <michal@kleczek.org>)
Ответы Re: DRAFT: Pass sk_attno to consistent function  (Michał Kłeczek <michal@kleczek.org>)
Список pgsql-hackers
Wrong file in the previous message - sorry for the noise.

Attached is a fixed patch.

Thanks,
Michal



On 18 Mar 2024, at 15:14, Michał Kłeczek <michal@kleczek.org> wrote:

I realised it might be enough to pass sk_attno to consistent function as that should be enough to lookup metadata if needed.

Attached is a draft patch that does this.

Michal

<0001-Pass-key-sk_attno-to-consistent-function.patch>

On 18 Mar 2024, at 14:31, Michał Kłeczek <michal@kleczek.org> wrote:

Hi All,

On 11 Mar 2024, at 18:58, Michał Kłeczek <michal@kleczek.org> wrote:

Hi All,

During my journey of designing Pg based solution at my work I was severely hit by several shortcomings in GiST.
The most severe one is the lack of support for SAOP filters as it makes it difficult to have partition pruning and index (only) scans working together.

To overcome the difficulties I implemented a simple extension:

Since it provides a separate operator class from btree_gist it requires re-indexing the data.
So I thought maybe it would be a good idea to incorporate it into btree_gist.


While working on supporting (sort of) SAOP support for Gist I was stuck with the interplay between Gist consistent function and partition pruning.
Not sure how it applies to SAOP handling in general though.

I’ve implemented an operator class/family that supports Gist index scan for the following query:

SELECT * FROM tbl WHERE col ||= ARRAY[‘a’, ‘b’, ‘c’];

It all works well except cases where tbl is partitioned by “col” column.
In this case index scan unnecessarily scans pages for values that are not in the partition.

I am not sure if it works as expected (ie. no unnecessary scans) in case of ANY = (ARRAY[]) and nbtree.
In case of Gist the only place where pre-processing of queries can be performed is consistent function.
But right now there is no possibility to access any scan related information as it is not passed to consistent function.
The only thing available is GISTENTRY and it does not contain any metadata.

As a workaround I’ve added options to the op family that allows (redundantly) specifying MODULUS/REMAINDER for the index:

CREATE INDEX idx ON tbl_partition_01 USING gist ( col gist_text_extra_ops (modulus=4, remainder=2) )

and use the values to filter out query array passed to consistent function.

This is of course not ideal:
- the information is redundant
- changing partitioning scheme requires re-indexing

I am wondering if it is possible to extend consistent function API so that either ScanKey or even the whole IndexScanDesc is passed as additional parameter.

Michal


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michał Kłeczek
Дата:
Сообщение: DRAFT: Pass sk_attno to consistent function
Следующее
От: Dean Rasheed
Дата:
Сообщение: Re: Improving EXPLAIN's display of SubPlan nodes