Обсуждение: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

Поиск
Список
Период
Сортировка

Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

От
"Jinjing Zhou"
Дата:
Hi hackers, 

I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter conditions during an index scan. The goal is to optimize the query performance by efficiently identifying the rows that meet the given criteria.

The query plan looks like this
Index Scan using products_feature_idx on products  (cost=0.00..27.24 rows=495 width=12)
         Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
         Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision))

We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other columns like price column. We want to access the bitmap of price column's filter condition in the feature column index. Is there any way I can achieve this goal?

Any help or guidance is appreciated!

Thanks.
Jinjing Zhou

Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

От
Tomas Vondra
Дата:

On 11/19/23 18:19, Jinjing Zhou wrote:
> Hi hackers, 
> 
> I hope this message finds you well. I am reaching out to seek guidance
> on a specific aspect of PostgreSQL's index scanning functionality.
> 
> I am currently working on a vector search extension for postgres, where
> I need to generate bitmaps based on filter conditions during an index
> scan. The goal is to optimize the query performance by efficiently
> identifying the rows that meet the given criteria.
> 
> The query plan looks like this
> 
>     Index Scan using products_feature_idx on products  (cost=0.00..27.24
>     rows=495 width=12)
>              Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
>              Filter: ((price > '0.2'::double precision) AND (price <=
>     '0.7'::double precision))
> 
> 
> We have a custom index for the order by clause on the feature column.
> Now we want to utilize the index on other columns like price column. We
> want to access the bitmap of price column's filter condition in the
> feature column index. Is there any way I can achieve this goal?
> 
> Any help or guidance is appreciated!
> 

I suppose you'll need to give more details about what exactly are you
trying to achieve, what you tried, maybe some code examples, etc. Your
question is quite vague, and it's unclear what "bitmaps generated on
filter conditions" or "custom index" means.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

От
Matthias van de Meent
Дата:
On Mon, 20 Nov 2023 at 09:30, Jinjing Zhou <allenzhou@tensorchord.ai> wrote:
>
> Hi hackers,
>
> I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index
scanningfunctionality. 
>
> I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter
conditionsduring an index scan. The goal is to optimize the query performance by efficiently identifying the rows that
meetthe given criteria. 
>
> The query plan looks like this
>
> Index Scan using products_feature_idx on products  (cost=0.00..27.24 rows=495 width=12)
>          Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
>          Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision))
>
>
> We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other
columnslike price column. We want to access the bitmap of price column's filter condition in the feature column index.
Isthere any way I can achieve this goal? 

If you mean "I'd like to use bitmaps generated by combining filter
results from index A, B, and C for (pre-)filtering the ordered index
lookups in index D",
then there is no current infrastructure to do this. Bitmap scans
currently generate a data structure that is not indexable, and can
thus not be used efficiently to push an index's generated bitmap into
another bitmap's scans.

There are efforts to improve the data structures we use for storing
TIDs during vacuum [0] which could extend to the TID bitmap structure,
but even then we'd need some significant effort to rewire Postgres'
internals to push down the bitmap filters; and that is even under the
assumption that pushing the bitmap down into the index AM is more
efficient than doing the merges above the index AM and then re-sorting
the data.

So, in short, it's not currently available in community PostgreSQL.
You could probably create a planner hook + custom executor node that
does this, but it won't be able to use much of the features available
inside PostgreSQL.

Kind regards,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/flat/CANWCAZbrZ58-w1W_3pg-0tOfbx8K41_n_03_0ndGV78hJWswBA%2540mail.gmail.com



Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

От
"Jinjing Zhou"
Дата:
Thanks a lot! This is exactly what I'm asking. We've tried the CustomScanAPI at https://github.com/tensorchord/pgvecto.rs/pull/126, but met error with "variable not found in subplan target list". We're still investigating the root cause and thanks for your guidance!

Best
Jinjing Zhou
From: "Matthias van de Meent"<boekewurm+postgres@gmail.com>
Date: Mon, Nov 20, 2023, 19:33
Subject: Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans
To: "Jinjing Zhou"<allenzhou@tensorchord.ai>
On Mon, 20 Nov 2023 at 09:30, Jinjing Zhou <allenzhou@tensorchord.ai> wrote: > > Hi hackers, > > I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index scanning functionality. > > I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter conditions during an index scan. The goal is to optimize the query performance by efficiently identifying the rows that meet the given criteria. > > The query plan looks like this > > Index Scan using products_feature_idx on products (cost=0.00..27.24 rows=495 width=12) > Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector) > Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision)) > > > We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other columns like price column. We want to access the bitmap of price column's filter condition in the feature column index. Is there any way I can achieve this goal? If you mean "I'd like to use bitmaps generated by combining filter results from index A, B, and C for (pre-)filtering the ordered index lookups in index D", then there is no current infrastructure to do this. Bitmap scans currently generate a data structure that is not indexable, and can thus not be used efficiently to push an index's generated bitmap into another bitmap's scans. There are efforts to improve the data structures we use for storing TIDs during vacuum [0] which could extend to the TID bitmap structure, but even then we'd need some significant effort to rewire Postgres' internals to push down the bitmap filters; and that is even under the assumption that pushing the bitmap down into the index AM is more efficient than doing the merges above the index AM and then re-sorting the data. So, in short, it's not currently available in community PostgreSQL. You could probably create a planner hook + custom executor node that does this, but it won't be able to use much of the features available inside PostgreSQL. Kind regards, Matthias van de Meent [0] https://www.postgresql.org/message-id/flat/CANWCAZbrZ58-w1W_3pg-0tOfbx8K41_n_03_0ndGV78hJWswBA%2540mail.gmail.com

Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

От
"Jinjing Zhou"
Дата:
Thanks. Our project is at https://github.com/tensorchord/pgvecto.rs. A custom index is implemented for the vector similarity search, which implements `amgettuples` with direction support to provide candidates for the order by clause. 

And we want to inject the filter condition using bitmap into the amgettuples process, instead of checking the tuples one by one to accelerate the whole process. 

Best
Jinjing Zhou
From: "Tomas Vondra"<tomas.vondra@enterprisedb.com>
Date: Mon, Nov 20, 2023, 18:52
Subject: Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans
On 11/19/23 18:19, Jinjing Zhou wrote: > Hi hackers,  > > I hope this message finds you well. I am reaching out to seek guidance > on a specific aspect of PostgreSQL's index scanning functionality. > > I am currently working on a vector search extension for postgres, where > I need to generate bitmaps based on filter conditions during an index > scan. The goal is to optimize the query performance by efficiently > identifying the rows that meet the given criteria. > > The query plan looks like this > > Index Scan using products_feature_idx on products  (cost=0.00..27.24 > rows=495 width=12) >          Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector) >          Filter: ((price > '0.2'::double precision) AND (price <= > '0.7'::double precision)) > > > We have a custom index for the order by clause on the feature column. > Now we want to utilize the index on other columns like price column. We > want to access the bitmap of price column's filter condition in the > feature column index. Is there any way I can achieve this goal? > > Any help or guidance is appreciated! > I suppose you'll need to give more details about what exactly are you trying to achieve, what you tried, maybe some code examples, etc. Your question is quite vague, and it's unclear what "bitmaps generated on filter conditions" or "custom index" means. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company