Thread: [WIP]Vertical Clustered Index (columnar store extension) - take2

[WIP]Vertical Clustered Index (columnar store extension) - take2

From

"Aya Iwata (Fujitsu)"

Date:

07 October 2024, 17:53:19

Hi All,

Suggestions

==========

When analyzing real-time data collected by PostgreSQL, it can be difficult to tune the current PostgreSQL server for satisfactory performance.

Therefore, we propose Vertical Clustered Indexing (VCI), an in-memory column store function that holds data in a state suitable for business analysis and is also expected to improve analysis performance.

With VCI, you can also expect to run analysis 7.8 times faster. This is achieved by the analytics engine, which optimizes parallel processing of column-oriented data, in addition to the fact that VCI stores data in a columnar format, enabling efficient retrieval of the columns needed for analysis.

Similar Features

============

One column store feature available for PostgreSQL is the Citus columnar table.

Installing the citus extension lets you create columnar tables through the columnar access method.

That feature is intended for analyzing accumulated data, so you cannot update or delete data.

VCI supports data updates and deletions. This enables you to analyze not only the accumulated data but also the data that occurs in real time.

Implementing VCI

============

To introduce an updatable column store, we explain how updates to row-oriented data are propagated to column-oriented data.

VCI has two storage areas.

- Write Optimized Storage (WOS)

- Read Optimized Storage (ROS)

First, the WOS.

The WOS stores data for all columns in the VCI in a row-oriented format.

All newly added data is stored in the WOS relation along with the transaction information.

Deleting or updating newly added data in the WOS is much cheaper than doing so in columnar storage.

ROS is the storage area where all column data is stored.

On INSERT/UPDATE/DELETE, data is written synchronously to the WOS, which neither compresses nor indexes it.

This avoids the overhead of converting to columnar format while updating the data.

After a certain amount of data accumulates in the WOS, the ROS control daemon converts it to column data asynchronously with respect to updates.

This conversion compresses and indexes the data and writes it to the ROS.
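
For illustration only, the write path described above can be sketched in C as follows. This is not code from the VCI patches: the `Row`, `Wos`, and `Ros` types, the two capacities, and the function names are assumptions made up for the example, and compression and indexing are left out.

```c
#include <stddef.h>

#define WOS_CAPACITY 4    /* hypothetical size of the row buffer */
#define ROS_CAPACITY 64   /* hypothetical size of each column array */

/* One tuple in row-oriented form, as the writer produces it. */
typedef struct { int id; double amount; } Row;

/* Write Optimized Storage: an append-only buffer of whole rows. */
typedef struct { Row rows[WOS_CAPACITY]; size_t count; } Wos;

/* Read Optimized Storage: one array per column. */
typedef struct {
    int    ids[ROS_CAPACITY];
    double amounts[ROS_CAPACITY];
    size_t count;
} Ros;

/* Synchronous path: append the row as-is. Returns 0 if the WOS is full. */
int wos_insert(Wos *wos, Row row)
{
    if (wos->count == WOS_CAPACITY)
        return 0;
    wos->rows[wos->count++] = row;
    return 1;
}

/* Asynchronous path: split buffered rows into columns. Returns rows moved. */
size_t wos_to_ros(Wos *wos, Ros *ros)
{
    size_t moved = 0;

    while (moved < wos->count && ros->count < ROS_CAPACITY) {
        ros->ids[ros->count]     = wos->rows[moved].id;
        ros->amounts[ros->count] = wos->rows[moved].amount;
        ros->count++;
        moved++;
    }
    /* Rows that did not fit stay at the front of the WOS. */
    for (size_t i = moved; i < wos->count; i++)
        wos->rows[i - moved] = wos->rows[i];
    wos->count -= moved;
    return moved;
}
```

In this sketch a caller would use `wos_insert()` on the DML path and leave `wos_to_ros()` to a background worker once enough rows have accumulated.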

Next, searching.

Because there are two storage formats, a SELECT must first convert the WOS data into a local ROS to determine which rows are visible. The cost of this conversion depends on the number of tuples in the WOS, so it may add some overhead.

The search result is then obtained by reading the local ROS and the ROS in parallel.
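
As a rough sketch of that read path (again, not taken from the patches), the C code below builds a query-local column from the visible WOS rows and then aggregates over both stores. The `WosEntry` type, the function names, and the visibility rule (`xmin <= snapshot_xid`) are simplifying assumptions; real PostgreSQL visibility checks are considerably more involved, and the two scans run one after the other here rather than in parallel.

```c
#include <stddef.h>

/* A WOS row: the value plus the id of the transaction that inserted it. */
typedef struct { unsigned xmin; double amount; } WosEntry;

/* Copy the rows visible to the snapshot into a query-local column. */
size_t build_local_ros(const WosEntry *wos, size_t n,
                       unsigned snapshot_xid, double *local_col)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++)
        if (wos[i].xmin <= snapshot_xid)   /* toy rule, not real MVCC */
            local_col[out++] = wos[i].amount;
    return out;
}

/* Sum one column stored as a plain array. */
double sum_column(const double *col, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += col[i];
    return total;
}

/* SUM(amount) over both stores; scratch must hold at least wos_n values. */
double sum_ros_and_wos(const double *ros_col, size_t ros_n,
                       const WosEntry *wos, size_t wos_n,
                       unsigned snapshot_xid, double *scratch)
{
    size_t local_n = build_local_ros(wos, wos_n, snapshot_xid, scratch);

    return sum_column(ros_col, ros_n) + sum_column(scratch, local_n);
}
```

The loop in `build_local_ros()` is where the per-query cost mentioned above comes from: it grows with the number of rows still waiting in the WOS.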

These implementation ideas are also posted on Fujitsu's blog for your reference. [1]

Past discussions

===========

We have proposed this feature before. [2]

That thread also explains the details, so please refer to it.

In the previous thread, we proposed implementing this by modifying the PostgreSQL backend code.

Based on the feedback we received at that time, we think we need to re-implement this feature as pluggable storage using the table access method API.

I also received feedback on required features from a member of Postgres Pro. We believe it is necessary to deal with these issues in stages.

- Need to provide vector processing for nodes (filter, grand aggregate, aggregation with group by...) to speed up computation

- Requires parallel processing support such as scanning

We expect that the re-implementation will also require additions to the current Table Access Method API.

Those additions would be useful not only for VCI but also for other access methods.

Therefore, we decided to propose the VCI feature to the community and proceed with development.

Request matter

===========

Are members of the PostgreSQL hackers interested in VCI features?

We welcome your comments and suggestions on this feature.

In particular, if you have any questions, required features, or implementations, please let me know.

[1] https://www.postgresql.fastware.com/blog/improve-data-analysis-performance-without-impacting-business-transactions-with-vertical-clustered-index

[2] https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN+hCy8xOLAh2doYhVg5d6HsAA@mail.gmail.com

Regards,

Aya Iwata

FUJITSU LIMITED

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Alvaro Herrera

Date:

14 January 2025, 15:00:22

Hello,

I came across this email by chance while looking for something else.

On 2024-Oct-07, Aya Iwata (Fujitsu) wrote:

> Therefore, we propose Vertical Clustered Indexing (VCI), an in-memory
> column store function that holds data in a state suitable for business
> analysis and is also expected to improve analysis performance.

> With VCI, you can also expect to run analysis 7.8 times faster. This
> is achieved by the analytics engine, which optimizes parallel
> processing of column-oriented data, in addition to the fact that VCI
> stores data in a columnar format, enabling efficient retrieval of the
> columns needed for analysis.

Wow.

> Request matter
> ===========
> 
> Are members of the PostgreSQL hackers interested in VCI features?
> We welcome your comments and suggestions on this feature.
> In particular, if you have any questions, required features, or implementations, please let me know.

I think this is definitely an important and welcome development.
I look forward to your patches in this area.

Thank you,

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Yura Sokolov

Date:

15 January 2025, 17:43:46

07.10.2024 17:53, Aya Iwata (Fujitsu) wrote:
> Hi All,
> 
> Suggestions
> 
> ==========
> 
> When analyzing real-time data collected by PostgreSQL,
> 
> it can be difficult to tune the current PostgreSQL server for 
> satisfactory performance.
> 
> Therefore, we propose Vertical Clustered Indexing (VCI), an in-memory 
> column store function that holds data in a state suitable for business 
> analysis and is also expected to improve analysis performance.

I just don't get, why it should be "in-memory"? All the same things you 
describe further, but storing in paged index on-disk with caching 
through shared_buffers - why this way it wouldn't work?

RE: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

"Aya Iwata (Fujitsu)"

Date:

08 April 2025, 10:41:38

Hi Alvaro san,

I am sorry for my late reply. I am continuing to work on proposing the VCI feature to the community.

> I think this is definitely an important and welcome development.
> I look forward to your patches in this area.

Thank you!
I am currently preparing to share the VCI design at PGConf.dev.
I look forward to sharing more about VCI with you.


Best regards,
Aya Iwata
FUJITSU LIMITED

RE: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

"Aya Iwata (Fujitsu)"

Date:

08 April 2025, 13:29:48

Hi Yura san,


> I just don't get, why it should be "in-memory"? All the same things you
> describe further, but storing in paged index on-disk with caching
> through shared_buffers - why this way it wouldn't work?

We make the columnar store resident in memory for maximum search performance.
But I'm not very particular about this. Comments are welcome.

Best regards,
Aya Iwata
FUJITSU LIMITED



Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Yura Sokolov

Date:

08 April 2025, 18:24:20

08.04.2025 13:29, Aya Iwata (Fujitsu) wrote:
> Hi Yura san,
> 
> 
>> I just don't get, why it should be "in-memory"? All the same things you
>> describe further, but storing in paged index on-disk with caching
>> through shared_buffers - why this way it wouldn't work?
> 
> We make the columnar store resident in memory for maximum search performance.
> But I'm not very particular about this. Comments are welcome.

I just wanted to say: there is no need to be super fast.
There is a need to be remarkably faster than it is now.

ClickHouse, DuckDB, Vertica - they are not in-memory, they are disk based.
But they are very fast.
If PostgreSQL were only twice as slow as ClickHouse, that would be
great! Most users would then not set up ClickHouse at all, because twice
as slow is still very fast.

Databases can be huge, even in "columnar" format, which usually
consumes less space. And memory still costs more than disk space.

Certainly there are users who think they need "in-memory". But the truth is
that very few of them really need "in-memory".

All of this is just my opinion. I could be wrong.


-- 
regards
Yura Sokolov aka funny-falcon

RE: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

"Aya Iwata (Fujitsu)"

Date:

13 May 2025, 04:08:34

Hello,

I found some dead code in the 0001 patch and removed it.
Here are new patches.

Regards,
Aya Iwata
Fujitsu Limited

On Fri, May 23, 2025 at 4:29 PM Tomas Vondra <tomas@vondra.me> wrote:

> Also, Alvaro seemed to think TAM is the way to go, and in order to keep
> the OLTP performance he suggested to use both heap and VCI at the same
> time, in different "forks". I'm not sure how would that work, or if we
> can already do that - AFAIK we can't, because ForkNumber does not allow
> adding custom forks. We'd have to relax that, or invent some sort of
> federated TAM (that just multiplexes it to two TAMs). Maybe.
>
> But it's not like the IAM approach doesn't need to do this. The first
> patch had to add stuff to a lot of random places to make this work. And
> some of the places touch stuff that we don't expect indexes to worry
> about, like ALTER TABLE, etc.

I suspect another option would be to handle this with table inheritance: have one child that is heap-based, a second that's VCI, and a background job to move data from heap to VCI (and vice-versa for updates and maybe deletes).

Note that you could actually implement all that in user-space. Personally I'd much rather have a way to do pure VCI / column-store sooner and manage it myself than have to wait another release (or more) to get a complete solution...

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Tomas Vondra

Date:

04 June 2025, 21:15:58

On 6/4/25 19:59, Jim Nasby wrote:
> 
> 
> On Fri, May 23, 2025 at 4:29 PM Tomas Vondra <tomas@vondra.me
> <mailto:tomas@vondra.me>> wrote:
> 
>     Also, Alvaro seemed to think TAM is the way to go, and in order to keep
>     the OLTP performance he suggested to use both heap and VCI at the same
>     time, in different "forks". I'm not sure how would that work, or if we
>     can already do that - AFAIK we can't, because ForkNumber does not allow
>     adding custom forks. We'd have to relax that, or invent some sort of
>     federated TAM (that just multiplexes it to two TAMs). Maybe.
> 
>     But it's not like the IAM approach doesn't need to do this. The first
>     patch had to add stuff to a lot of random places to make this work. And
>     some of the places touch stuff that we don't expect indexes to worry
>     about, like ALTER TABLE, etc.
> 
> 
> I suspect another option would be to handle this with table inheritance:
> have one child that is heap-based, a second that's VCI, and a background
> job to move data from heap to VCI (and vice-versa for updates and maybe
> deletes).
> 
> Note that you could actually implement all that in user-space.
> Personally I'd much rather have a way to do pure VCI / column-store
> sooner and manage it myself than have to wait another release (or more)
> to get a complete solution...  

I don't see how could this ever work with the optimizer, which assumes
scanning an inheritance hierarchy means scanning all parts. But this
would require making planner "smarter" to know it should scan only one
of the child relations. And I believe it's not possible to do that while
constructing scans for the heap/VCI parts, those places are not aware of
what other parts are being scanned etc.

Sure, you could do this in "user-space" by constructing queries that
reference either the heap or VCI part. But then why put that into
inheritance tree at all? It certainly does not help with moving data
between the parts.

I may be missing something, of course. But it's not clear to me how is
this supposed to work ...

What I can imagine is "VCI" as a "proxy" TAM on top of heap, keeping the
columnar format in a separate fork. And using either that from custom
scans, or the heap as a fallback for cases not supported by VCI.

regards

-- 
Tomas Vondra

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Jim Nasby

Date:

05 June 2025, 01:19:35

On Wed, Jun 4, 2025 at 1:16 PM Tomas Vondra <tomas@vondra.me> wrote:

> I don't see how could this ever work with the optimizer, which assumes
> scanning an inheritance hierarchy means scanning all parts. But this
> would require making planner "smarter" to know it should scan only one
> of the child relations. And I believe it's not possible to do that while
> constructing scans for the heap/VCI parts, those places are not aware of
> what other parts are being scanned etc.

Right; I was envisioning that one child would be a conventional heap that stored very recent data and another child would be columnar in nature. So you'd definitely want to always look at both children.

I am making an assumption (based on the comment about multiple forks) that we'd have some way to handle VCI without having an actual heap.

> Sure, you could do this in "user-space" by constructing queries that
> reference either the heap or VCI part. But then why put that into
> inheritance tree at all? It certainly does not help with moving data
> between the parts.

Right; I only brought it up because just having a working column-store would be a big win, even if you had to code something to deal with any DML that wasn't already batched up. Of course it would be better if it just did the RightThing(TM) out of the box... but the perfect can be the enemy of the good.

> What I can imagine is "VCI" as a "proxy" TAM on top of heap, keeping the
> columnar format in a separate fork. And using either that from custom
> scans, or the heap as a fallback for cases not supported by VCI.

Yeah, there'd definitely need to be some kind of proxy... I'm just suggesting that we don't *have* to do that as a separate fork...

Of course I could also just be missing something :)

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Jim Nasby

Date:

05 June 2025, 01:51:16

On Wed, Jun 4, 2025 at 5:19 PM Jim Nasby <jnasby@upgrade.com> wrote:

>> What I can imagine is "VCI" as a "proxy" TAM on top of heap, keeping the
>> columnar format in a separate fork. And using either that from custom
>> scans, or the heap as a fallback for cases not supported by VCI.
>
> Yeah, there'd definitely need to be some kind of proxy... I'm just suggesting that we don't *have* to do that as a separate fork...

(tl;dr: there are some key things that can only be implemented in the engine that would enable much more complex features to be added at the SQL level, without requiring tons of C code to implement the larger feature)

Oh, one other thing worth mentioning... it's actually not terribly hard to build a column-store in userspace today: just turn every column of a table into an array and set the TOAST target low enough so that it all gets toasted. I tested that many years ago, and even though I couldn't set the toast target back then saw some really encouraging results... provided that I constructed my queries carefully (I also had range fields for each column that stored the min/max of each array, so the planner could completely skip de-toasting anything that would not contain values of interest.)

The reason I never went anywhere with this concept is it'd be very hard for most folks to write queries that performed well. The transform from column back to row-based was actually hidden behind a view (a bunch of unnest()'s) - but if you didn't make use of the range fields in your query you lost a lot (but not all[1]) of the performance gain. I know that I could have used the hooks to teach the planner how to do this, but it would have been a huge amount of work (at least for me) to do so.
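
To make the min/max idea concrete, here is a small C sketch of the skipping logic. It is only an analogy for the SQL setup described above: the `Chunk` type and `count_in_range()` are made up for the example, the plain `values` pointer stands in for a TOASTed array, and "reading" a chunk stands in for de-toasting it.

```c
#include <stddef.h>

/* One stored array plus the range field that summarizes it. */
typedef struct {
    int        min, max;   /* smallest and largest value in the array */
    const int *values;     /* stands in for the TOASTed array */
    size_t     n;
} Chunk;

/*
 * Count values in [lo, hi]. *chunks_read reports how many arrays had to be
 * opened; chunks whose range cannot match are skipped without reading them.
 */
size_t count_in_range(const Chunk *chunks, size_t nchunks,
                      int lo, int hi, size_t *chunks_read)
{
    size_t hits = 0;

    *chunks_read = 0;
    for (size_t c = 0; c < nchunks; c++) {
        if (chunks[c].max < lo || chunks[c].min > hi)
            continue;              /* no overlap: skip the array entirely */
        (*chunks_read)++;
        for (size_t i = 0; i < chunks[c].n; i++)
            if (chunks[c].values[i] >= lo && chunks[c].values[i] <= hi)
                hits++;
    }
    return hits;
}
```

A query that never consults `min`/`max` would have to open every chunk, which mirrors the performance loss described above for queries that ignore the range fields.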

It did occur to me recently that a generic system for teaching the optimizer additional transforms it could make would be generally useful. By far the biggest example would be a way to teach it that

WHERE timestamp_field :: date = '2025-6-4'

is the same thing as

WHERE timestamp_field >= '2025-6-4' AND timestamp_field < '2025-6-5'

That would be extremely helpful in a lot of environments. There are definitely other cases where you can apply the same kinds of logic. In particular, such a feature (if generic enough) would make it possible to write simple queries against a view that transformed columnar data (stored as arrays) back into a row format *and* apply additional predicates that would make those queries highly efficient - all done via pure SQL.
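
The equivalence behind that rewrite can be checked with a short C sketch. Everything here is hypothetical scaffolding (the `Date` and `Timestamp` types and all function names), time zones are ignored, and no planner code is involved; the point is only that the cast predicate and the half-open range select the same timestamps.

```c
typedef struct { int year, month, day; } Date;

/* A timestamp split into its calendar date and the seconds since midnight. */
typedef struct { Date date; int second_of_day; } Timestamp;

int is_leap_year(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

int days_in_month(int y, int m)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    return (m == 2 && is_leap_year(y)) ? 29 : days[m - 1];
}

/* Exclusive upper bound of the range: the day after d. */
Date next_day(Date d)
{
    if (d.day < days_in_month(d.year, d.month)) {
        d.day++;
    } else if (d.month < 12) {
        d.month++;
        d.day = 1;
    } else {
        d.year++;
        d.month = 1;
        d.day = 1;
    }
    return d;
}

/* Negative, zero, or positive, like strcmp(). */
int date_cmp(Date a, Date b)
{
    if (a.year != b.year)   return a.year - b.year;
    if (a.month != b.month) return a.month - b.month;
    return a.day - b.day;
}

/* Original form: timestamp_field::date = d */
int matches_cast(Timestamp ts, Date d)
{
    return date_cmp(ts.date, d) == 0;
}

/* Rewritten form: ts >= d 00:00:00 AND ts < next_day(d) 00:00:00 */
int matches_range(Timestamp ts, Date d)
{
    Date upper = next_day(d);

    /* second_of_day is never negative, so comparing dates is enough. */
    return date_cmp(ts.date, d) >= 0 && date_cmp(ts.date, upper) < 0;
}
```

The range form compares the column itself rather than an expression on it, which is typically what lets a plain index on the timestamp column be used.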

[1] In my testing (which used the taxicab database) there was still a performance gain from storing the data as arrays, even if the queries to access it took no special efforts to eliminate unnecessary data. The reason is that TOAST meant that the base data was being compressed. In fact, testing showed that there was a win even if you didn't treat each individual column as an array; you could simply store an array of a composite type and still see a win.

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

19 June 2025, 07:14:16

Here are the v8 patches. The main changes are as follows:

v8-0001 VCI - changes to postgres core
- same

v8-0002 VCI module - main
- extracted "compression" related code from this main patch
- applied Timur's top-up patch [1] re "session_preload_libraries"
- removed some dead code

v8-0003 VCI module - documentation
- removed mentions about compression

~~~

To avoid too much noise, other extracted patches (below) will be
maintained off-list.
0004 VCI module - compression
0005 VCI module - hothash
0006 VCI module - defer WOS2ROS

======
[1] https://www.postgresql.org/message-id/c4e314ef0a58050c8b7847ac1852555876674a69.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

25 June 2025, 05:22:57

Hi.

Here are the latest v9* patches. These have the following changes:

0001
-  fixes some of the minor quirks reported by Tomas [1].

0002
- fixes to address some of Timur's feedback/patches [2]
- test code updated to remove unsupported type
- test code updated to remove dynamic costings

======
[1] Tomas - https://www.postgresql.org/message-id/a748aa6b-c7e6-4d02-a590-ab404d590448%40vondra.me
[2] Timur - https://www.postgresql.org/message-id/6a6058fc089f89561b2545f024953e4daa0b8561.camel%40postgrespro.ru

Kind Regards,
Peter Smith
Fujitsu Australia

Hi. Here is the latest patch set v12*

Main differences are:

Patch 0001 (core)
- removed SizeOfIptrData macro, as reported by Tomas [1] and Japin [2]

Patch 0002 (vci module)
- Made fixes so the "ROS Control Worker" (for background WOS->ROS
transfer) can now launch ok.

======
[1] Tomas - https://www.postgresql.org/message-id/a748aa6b-c7e6-4d02-a590-ab404d590448%40vondra.me
[2] Japin -
https://www.postgresql.org/message-id/ME0P300MB04457E24CA8965F008FB2CDBB648A%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

17 July 2025, 06:22:17

Hi. Here are the latest v13 patches.

Changes include:

PATCH 0002.
- README improvements -- as previously sent separately [1]
- Refactor InitPageCoreWithoutLock -- per proposal from Japin [2-#2]
- Changes to eliminate warnings from "headerscheck" and "cpluspluscheck"
- Add missing assignments in vci_handler()

======
[1] https://www.postgresql.org/message-id/CAHut%2BPvYQZAHcD-tK5XaobUpWoTf0Gkjx7nAA9eJq_HbPCSxCQ%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/ME0P300MB0445FD473D75F65E8B0A6F5DB64BA%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

22 July 2025, 10:46:44

Hi.

Here are the latest v14 patches.

Changes include:

PATCH 0002.
- Fixes the enable_seqscan PANIC bug reported by Japin [1]
- Adds a new regression test case for this

======
[1]
https://www.postgresql.org/message-id/ME0P300MB04457E24CA8965F008FB2CDBB648A%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

28 July 2025, 23:57:54

Here are the latest v15 patches.

Changes include:

PATCH 0002.
- README now says user should not tamper with VCI internal relations
- fixes/tests the VACUUM bug -- fix provided by Japin [1]
- fixes/tests the reported segv for attempted REFRESH of VCI internal
relation -- see [2 comment#1]
- fixes/tests VCI internal relation dependency on the indexed table
- simplifies code for PG_TEMP_FILES_DIR -- see [2 comment#2]

======
[1]
https://www.postgresql.org/message-id/ME0P300MB0445891C69BD82561055F218B65FA%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
[2]
https://www.postgresql.org/message-id/ME0P300MB0445EBA04D6947DD717074DFB65CA%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

01 August 2025, 11:24:25

Here are the latest v16 patches.

Changes include:

PATCH 0001.
- REINDEX bugfix -- fix provided by Japin [1]

PATCH 0002.
- REINDEX bugfix test cases - per [1]
- README now lists all the internal relations -- per [2]
- Query of VCI internal relation no longer causes confusing HINT -- per [3]
- Renamed the VCI internal relations to have a "pg_" prefix

======
[1]
https://www.postgresql.org/message-id/ME0P300MB04453BEE52F84048683460E4B627A%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
[2] https://www.postgresql.org/message-id/CAHut%2BPt8naGc7pH0YG_0G8Wu5aqJiHoT6xP%2BY81_eJWapg9%3DDA%40mail.gmail.com
[3]
https://www.postgresql.org/message-id/ME0P300MB0445307958A2DC0831CEF56DB624A%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

08 August 2025, 11:08:27

Here are the latest v17 patches.

Changes include:

PATCH 0002.
- Rebase to fix compile error, result of recent master change
- Bugfix for an unreported EXPLAIN ANALYZE error
- Bugfix for an unreported double pfree
- Code cleanup (ran pgindent; corrected ~100 typos; removed empty
coverage comments)

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Japin Li

Date:

11 August 2025, 12:39:01

On Fri, Aug 08, 2025 at 06:08:27PM +1000, Peter Smith wrote:
> Here are the latest v17 patches.
> 
> Changes include:
> 
> PATCH 0002.
> - Rebase to fix compile error, result of recent master change
> - Bugfix for an unreported EXPLAIN ANALYZE error
> - Bugfix for an unreported double pfree
> - Code cleanup (ran pgindent; corrected ~100 typos; removed empty
> coverage comments)
> 

1.
+static struct
+{
+    int         transfn_oid;    /* Transition function's funcoid. Arrays are
+                                 * sorted in ascending order */
+    Oid         transtype;      /* Transition data type */
+    PGFunction  merge_trans;    /* Function pointer set required for parallel
+                                 * aggregation for each transfn_oid */
+    vci_aggtranstype_kind kind; /* If transtype is INTERNALOID, its details */
+}           trans_funcs_table[] = {
+    {F_FLOAT4_ACCUM, 1022, merge_floatX_accum, VCI_AGG_NOT_INTERNAL},   /* 208 */
+    {F_FLOAT8_ACCUM, 1022, merge_floatX_accum, VCI_AGG_NOT_INTERNAL},   /* 222 */
+    {F_INT8INC, 20, int8pl, VCI_AGG_NOT_INTERNAL},  /* 1833 */
+    {F_NUMERIC_ACCUM, 2281, numeric_combine, VCI_AGG_NUMERIC_AGG_STATE},    /* 1834 */
+    {F_INT2_ACCUM, 2281, numeric_poly_combine, VCI_AGG_POLY_NUM_AGG_STATE}, /* 1836 */
+    {F_INT4_ACCUM, 2281, numeric_poly_combine, VCI_AGG_POLY_NUM_AGG_STATE}, /* 1835 */
+    {F_INT8_ACCUM, 2281, numeric_combine, VCI_AGG_NUMERIC_AGG_STATE},   /* 1836 */
+    {F_INT2_SUM, 20, int8pl, VCI_AGG_NOT_INTERNAL}, /* 1840 */
+    {F_INT4_SUM, 20, int8pl, VCI_AGG_NOT_INTERNAL}, /* 1841 */
+    {F_INTERVAL_AVG_COMBINE, 2281, merge_interval_avg_accum, VCI_AGG_NOT_INTERNAL}, /* 3325 */
+    {F_INT2_AVG_ACCUM, 1016, merge_intX_accum, VCI_AGG_NOT_INTERNAL},   /* 1962 */
+    {F_INT4_AVG_ACCUM, 1016, merge_intX_accum, VCI_AGG_NOT_INTERNAL},   /* 1963 */
+    {F_INT8INC_ANY, 20, int8pl, VCI_AGG_NOT_INTERNAL},  /* 2804 */
+    {F_INT8_AVG_ACCUM, 2281, int8_avg_combine, VCI_AGG_POLY_AVG_NUM_AGG_STATE}, /* 2746 */
+    {F_NUMERIC_AVG_ACCUM, 2281, numeric_avg_combine, VCI_AGG_AVG_NUMERIC_AGG_STATE},    /* 2858 */
+};

The comments state that this is sorted in ascending order, but the code doesn't
follow that rule. While the current linear search works, a future change to
binary search could cause problems.
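
To illustrate the concern, here is a minimal, self-contained C sketch using a hypothetical `Entry` table rather than the real `trans_funcs_table`. It pairs a `bsearch()` lookup with a check that the table really is in ascending key order, since `bsearch()` gives no useful guarantee on unsorted input.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table entry: a numeric key plus a payload. */
typedef struct { int key; const char *name; } Entry;

/* qsort()/bsearch() comparator ordering entries by key. */
int cmp_entry(const void *a, const void *b)
{
    int ka = ((const Entry *) a)->key;
    int kb = ((const Entry *) b)->key;

    return (ka > kb) - (ka < kb);
}

/* Returns 1 if keys are strictly ascending, which bsearch() relies on. */
int is_sorted_ascending(const Entry *table, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (table[i - 1].key >= table[i].key)
            return 0;
    return 1;
}

/* Binary search by key; the result is only meaningful on a sorted table. */
const Entry *lookup(const Entry *table, size_t n, int key)
{
    Entry probe = {key, NULL};

    return bsearch(&probe, table, n, sizeof(Entry), cmp_entry);
}
```

One possible safeguard, if the table is ever searched this way, would be to run a check like `is_sorted_ascending()` once under an assertion so that a mis-ordered entry is caught early.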

2.
+static void
+CheckIndexedRelationKind(Relation rel)
+{
+    char        relKind = get_rel_relkind(RelationGetRelid(rel));
+
+    if (relKind == RELKIND_MATVIEW)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("access method \"%s\" does not support index on materialized view", VCI_STRING)));
+
+    if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+        ereport(ERROR,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("access method \"%s\" does not support index on temporary table", VCI_STRING)));
+}

Would it be possible to use rel->rd_rel->relkind directly?  This might avoid
the overhead of a function call.

3.
The discussion on add_index_delete_hook [1] makes me wonder if an Index Access
Method callback could serve the same purpose. This also raises the question:
would we then need an index update callback as well?

4.
Here are some typos.

a)
@@ -475,7 +477,7 @@ vci_scan_EndCustomPlan(CustomScanState *node)

         default:
             /* LCOV_EXCL_START */
-            elog(PANIC, "Should not reach hare");
+            elog(PANIC, "Should not reach here");
             /* LCOV_EXCL_STOP */
             break;
     }
b)
@@ -543,7 +545,7 @@ vci_create_relation(const char *rel_identifier, Relation indexRel, IndexInfo *in
             TupleDescInitEntry(new_tupdesc, (AttrNumber) 1, "bindata", BYTEAOID, -1, 0);    /* */
             break;

-            /* TIC-CRID  */
+            /* TID-CRID  */
         case VCI_RELTYPE_TIDCRID:
             natts = 1;
             new_tupdesc = CreateTemplateTupleDesc(natts);   /* no Oid */

c)
@@ -1065,7 +1065,7 @@ vci_inner_build(Relation heapRel, Relation indexRel, IndexInfo *indexInfo)
 /*
  * Put or Copy page into INIT_FORK.
  * If valid page is given, that page will be putted into INIT_FORK.
- * If Invalid page (NULL pointer) is given, MAIN_FORK page well be copied.
+ * If invalid page (NULL pointer) is given, MAIN_FORK page well be copied.
  */
 static void
 vci_putInitPage(Oid oid, Page page, BlockNumber blkno)


[1]
https://www.postgresql.org/message-id/OS7PR01MB119642862AA1CE536549D08CFEA40A%40OS7PR01MB11964.jpnprd01.prod.outlook.com

-- 
Best regards,
Japin Li
ChengDu WenWu Information Technology Co., LTD.

Attachments

v17-vci-partial-review.diff

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Japin Li

Date:

12 August 2025, 06:48:20

On Mon, Aug 11, 2025 at 05:39:01PM +0800, Japin Li wrote:
> On Fri, Aug 08, 2025 at 06:08:27PM +1000, Peter Smith wrote:
> > Here are the latest v17 patches.
> > 
> > Changes include:
> > 
> > PATCH 0002.
> > - Rebase to fix compile error, result of recent master change
> > - Bugfix for an unreported EXPLAIN ANALYZE error
> > - Bugfix for an unreported double pfree
> > - Code cleanup (ran pgindent; corrected ~100 typos; removed empty
> > coverage comments)
> > 
> 3.
> Here are some typos.
> 
> a)
> @@ -475,7 +477,7 @@ vci_scan_EndCustomPlan(CustomScanState *node)
> 
>          default:
>              /* LCOV_EXCL_START */
> -            elog(PANIC, "Should not reach hare");
> +            elog(PANIC, "Should not reach here");
>              /* LCOV_EXCL_STOP */
>              break;
>      }
> b)
> @@ -543,7 +545,7 @@ vci_create_relation(const char *rel_identifier, Relation indexRel, IndexInfo *in
>              TupleDescInitEntry(new_tupdesc, (AttrNumber) 1, "bindata", BYTEAOID, -1, 0);    /* */
>              break;
> 
> -            /* TIC-CRID  */
> +            /* TID-CRID  */
>          case VCI_RELTYPE_TIDCRID:
>              natts = 1;
>              new_tupdesc = CreateTemplateTupleDesc(natts);   /* no Oid */
> 
> c)
> @@ -1065,7 +1065,7 @@ vci_inner_build(Relation heapRel, Relation indexRel, IndexInfo *indexInfo)
>  /*
>   * Put or Copy page into INIT_FORK.
>   * If valid page is given, that page will be putted into INIT_FORK.
> - * If Invalid page (NULL pointer) is given, MAIN_FORK page well be copied.
> + * If invalid page (NULL pointer) is given, MAIN_FORK page well be copied.
>   */
>  static void
>  vci_putInitPage(Oid oid, Page page, BlockNumber blkno)
> 
> 
1.
PFA the other typos.

2.
I found that vci_intialize_query_context() skips the VCI query context
initialization if full_page_writes is disabled. Could you explain why VCI needs
full page writes to be enabled?

3.
Both vci_ros.h and vci_ros.c have a comment about accessing the VCI main relation
header, but they are slightly different. Could we sync them and keep only one?

It seems the comment is outdated, as the functions vci_KeepReadingMainRelHeader()
and vci_KeepWritingMainRelHeader() do not exist in the current VCI implementation.

4.
+/**
+ * This function is assumed when the VCI index is newly built, and
+ * it converts all the data in the relation of PostgreSQL into ROS.
+ */
+uint64
+vci_ConvertWos2RosForBuild(Relation mainRel,
+                           Size workareaSize,
+                           IndexInfo *indexInfo)
...
+    result = (uint64) table_index_build_scan(comContext.heapRel,
+                                             mainRel,
+                                             indexInfo,
+                                             true,  /* allow syncscan */
+                                             true,
+                                             vci_build_callback,
+                                             (void *) &comContext, NULL)

Perhaps we can use a double return type to avoid type casting since
other places also use double.

-- 
Best regards,
Japin Li
ChengDu WenWu Information Technology Co., LTD.

Attachments

v17-typos.diff

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

13 August 2025, 19:28:34

On Mon, 2025-08-11 at 17:39 +0800, Japin Li wrote:
>
> 1.
> +static struct
> +{
> +    int         transfn_oid;    /* Transition function's funcoid.
> Arrays are
> +                                 * sorted in ascending order */
> +    Oid         transtype;      /* Transition data type */
> +    PGFunction  merge_trans;    /* Function pointer set required for
> parallel
> +                                 * aggregation for each transfn_oid
> */
> +    vci_aggtranstype_kind kind; /* If transtype is INTERNALOID, its
> details */
> +}           trans_funcs_table[] = {
> +    {F_FLOAT4_ACCUM, 1022, merge_floatX_accum,
> VCI_AGG_NOT_INTERNAL},   /* 208 */
> +    {F_FLOAT8_ACCUM, 1022, merge_floatX_accum,
> VCI_AGG_NOT_INTERNAL},   /* 222 */
> +    {F_INT8INC, 20, int8pl, VCI_AGG_NOT_INTERNAL},  /* 1833 */
> +    {F_NUMERIC_ACCUM, 2281, numeric_combine,
> VCI_AGG_NUMERIC_AGG_STATE},    /* 1834 */
> +    {F_INT2_ACCUM, 2281, numeric_poly_combine,
> VCI_AGG_POLY_NUM_AGG_STATE}, /* 1836 */
> +    {F_INT4_ACCUM, 2281, numeric_poly_combine,
> VCI_AGG_POLY_NUM_AGG_STATE}, /* 1835 */
> +    {F_INT8_ACCUM, 2281, numeric_combine,
> VCI_AGG_NUMERIC_AGG_STATE},   /* 1836 */
> +    {F_INT2_SUM, 20, int8pl, VCI_AGG_NOT_INTERNAL}, /* 1840 */
> +    {F_INT4_SUM, 20, int8pl, VCI_AGG_NOT_INTERNAL}, /* 1841 */
> +    {F_INTERVAL_AVG_COMBINE, 2281, merge_interval_avg_accum,
> VCI_AGG_NOT_INTERNAL}, /* 3325 */
> +    {F_INT2_AVG_ACCUM, 1016, merge_intX_accum,
> VCI_AGG_NOT_INTERNAL},   /* 1962 */
> +    {F_INT4_AVG_ACCUM, 1016, merge_intX_accum,
> VCI_AGG_NOT_INTERNAL},   /* 1963 */
> +    {F_INT8INC_ANY, 20, int8pl, VCI_AGG_NOT_INTERNAL},  /* 2804 */
> +    {F_INT8_AVG_ACCUM, 2281, int8_avg_combine,
> VCI_AGG_POLY_AVG_NUM_AGG_STATE}, /* 2746 */
> +    {F_NUMERIC_AVG_ACCUM, 2281, numeric_avg_combine,
> VCI_AGG_AVG_NUMERIC_AGG_STATE},    /* 2858 */
> +};
>
> The comments state that this is sorted in ascending order, but the
> code doesn't
> follow that rule. While the current linear search works, a future
> change to
> binary search could cause problems.
>

Hi Japin!
I've looked at the code. The vci_set_merge_and_copy_trans_funcs() function
is unused, and almost all of the code in vci_aggmergetranstype.c, including
trans_funcs_table[], can be either removed or replaced with a simple
switch-case. Only the transfn_oid fields of trans_funcs_table[] were
actually used. Here is my patch, made on top of v17.
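
To illustrate the ordering concern quoted above, here is a minimal standalone sketch (not VCI code). The array demo_oids and all helper names are hypothetical; the values only mimic the out-of-order numbers in the trailing comments of the quoted trans_funcs_table[]. Under these assumptions, a linear scan still finds 3325, while a hand-written binary search over the same unsorted array misses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values, deliberately not in ascending order. */
static const int demo_oids[] = {
    208, 222, 1833, 1834, 1836, 1835, 1840,
    1841, 3325, 1962, 1963, 2804, 2746, 2858
};
#define DEMO_NOIDS (sizeof(demo_oids) / sizeof(demo_oids[0]))

/* Linear search: correct regardless of element order. */
static bool
found_linear(int oid)
{
    for (size_t i = 0; i < DEMO_NOIDS; i++)
    {
        if (demo_oids[i] == oid)
            return true;
    }
    return false;
}

/* Binary search: only correct if the array is sorted ascending. */
static bool
found_binary(int oid)
{
    size_t      lo = 0;
    size_t      hi = DEMO_NOIDS;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (demo_oids[mid] == oid)
            return true;
        if (demo_oids[mid] < oid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

/* Guard that could back up a "sorted in ascending order" comment. */
static bool
is_ascending(void)
{
    for (size_t i = 1; i < DEMO_NOIDS; i++)
    {
        if (demo_oids[i - 1] >= demo_oids[i])
            return false;
    }
    return true;
}
```

A check like is_ascending() could run once at startup (or in an assert-enabled build) so that the comment and the data cannot silently drift apart.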

Peter, what do you think? Is it OK to remove that code?

--
Regards,
Timur Magomedov

Attachments

0001-Removed-vci_set_merge_and_copy_trans_funcs.patch

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

14 August 2025, 04:23:34

Here are the latest v18* patches.

Changes include:

PATCH 0002.
Cleaning:
- Fix typos (per Japin patches [1] and [2])
- Remove all #if 0 code
- Make all header comments more consistent
- Cleanup Doxygen annotation formatting
- Remove all double blank lines
Fixes:
- trans_funcs_table[] ordering (per Japin patch [1])
- Access relkind via member instead of function calls (per Japin patch [1])
- Change vci_ConvertWos2RosForBuild return type to reduce casts (per Japin [2])

======
[1] Japin 11/8.
https://www.postgresql.org/message-id/ME0P300MB04457AAC931AD1E3D0CE32FBB628A%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
[2] Japin 12/8.
https://www.postgresql.org/message-id/ME0P300MB04450DF54484C145B77BD0D8B62BA%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia

On Thu, 2025-08-14 at 11:23 +1000, Peter Smith wrote:
> Here are the latest v18* patches.
>

Hi Peter!
I've reworked my recent patch [1] so it is now based on v18 and is
divided into several simpler patches. Here they are plus one additional
patch.

0001-Fixed-comment-and-guard-name-in-vci_pg_copy.h.patch
Looks like vci_pg_copy.h was renamed from vci_numeric.h but file name
comment and define guard name were not updated. Fixed it.

0002-Removed-vci_set_merge_and_copy_trans_funcs.patch
Found that vci_set_merge_and_copy_trans_funcs() is not used anywhere, so I
removed it along with the code that was only called inside it.
trans_funcs_table[] now contains only a single transfn_oid field; the other
(unused) fields are removed.

0003-Replaced-linear-search-by-switch-case.patch
Replaced the linear search over the trans_funcs_table array with a more
efficient switch-case.
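
As a rough illustration of the idea (a sketch only, not the code from the patch), a membership test written as a switch has no ordering requirement at all. The DEMO_* constants and the function name are hypothetical stand-ins for the F_* transition function OIDs; the numeric values are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for transition function OIDs. */
enum
{
    DEMO_FLOAT4_ACCUM = 208,
    DEMO_FLOAT8_ACCUM = 222,
    DEMO_INT8INC = 1833,
    DEMO_INTERVAL_AVG_COMBINE = 3325
};

/* Returns true if the given transition function is in the supported set. */
static bool
demo_is_supported_transfn(int transfn_oid)
{
    switch (transfn_oid)
    {
        case DEMO_FLOAT4_ACCUM:
        case DEMO_FLOAT8_ACCUM:
        case DEMO_INT8INC:
        case DEMO_INTERVAL_AVG_COMBINE:
            return true;
        default:
            return false;
    }
}
```

The compiler is free to turn the switch into a jump table or a comparison tree, and a duplicated case label becomes a compile error rather than a silent table bug.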

0004-Removed-worker-name-check-in-lock.c.patch
This is one I'm not sure about.
Found that the changes in the Postgres core file lock.c check for a "backend="
substring in the background worker name. There is also a comment in
vci_ros_daemon.c mentioning bgw_name checks in LockAcquire(). The names
don't match, however. So as far as I understand, the check for "backend="
in the name is always false, since no code in VCI sets bgw_name to anything
similar.
This is either a forgotten feature that can be easily fixed by removing the
bgw_name checks, or a bug, or my misunderstanding.
For the first case, here is a patch that removes the bgw_name checks in
lock.c. It makes the core patch a bit smaller, and it no longer touches lock.c
at all (Yay!).
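
For readers unfamiliar with this kind of check, here is a minimal sketch of a substring test on a worker name. It is not the lock.c code; the helper name and the example worker name used below are hypothetical. It only shows why such a test is always false when no worker name ever contains the marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the worker name contains the given marker substring. */
static bool
demo_name_has_marker(const char *bgw_name, const char *marker)
{
    return bgw_name != NULL && strstr(bgw_name, marker) != NULL;
}
```

With an assumed worker name such as "vci: ROS control worker", demo_name_has_marker(name, "backend=") is false and any branch guarded by it is dead code, whereas a "vci:" marker would match.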

[1]
https://www.postgresql.org/message-id/8beac6e8a01971b22ccf0f2e2a8eb12a78e5a7ac.camel%40postgrespro.ru

--
Regards,
Timur Magomedov

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

19 August 2025, 09:36:07

Hi Timur.

Thanks for the patches you provided. My replies are inline below.

On Sat, Aug 16, 2025 at 3:45 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
>
> On Thu, 2025-08-14 at 11:23 +1000, Peter Smith wrote:
> > Here are the latest v18* patches.
> >
>
> Hi Peter!
> I've reworked my recent patch [1] so it is now based on v18 and is
> divided into several simpler patches. Here they are plus one additional
> patch.
>
> 0001-Fixed-comment-and-guard-name-in-vci_pg_copy.h.patch
> Looks like vci_pg_copy.h was renamed from vci_numeric.h but file name
> comment and define guard name were not updated. Fixed it.

In v19 I intend to merge vci_pg_copy.h into postgresql_copy.h, so this is moot.

>
> 0002-Removed-vci_set_merge_and_copy_trans_funcs.patch
> Found that vci_set_merge_and_copy_trans_funcs() is not used anywhere,
> removed it alogn with the code that was only called inside it.
> trans_funcs_table[] now only contains single transfn_oid field, others
> (unused) are removed.
>
> 0003-Replaced-linear-search-by-switch-case.patch
> Replaced linear search inside trans_funcs_table array to more optimal
> switch-case.

Thanks, these 0002 and 0003 changes will be in v19 patches which I'm
hoping to post tomorrow or the day after.

>
> 0004-Removed-worker-name-check-in-lock.c.patch
> This is one I'm not sure about.
> Found that changes in Postgres core lock.c file check for "backend="
> substring in background worker name. There is also a comment in
> vci_ros_daemon.c mentioning bgw_name checks of LockAquire(). Names
> don't match however. So as far as I understand the check for "backend="
> in name is always false since no code in VCI sets bgw_name to something
> similar.
> This is either forgotten feature that can be easily fixed by removing
> bgw_name checks, either some bug, either my misunderstanding.
> For the first case, here is a patch that removes bgw_name checks in
> lock.c. It makes core patch a bit smaller and not touching lock.c at
> all (Yay!).
>

There is code in the function vci_LaunchROSControlWorker() which does
a sprintf to assign the worker.bgw_name, but I also do not see how
LockAcquire can have a bgw_name containing string “backend=”.
Frankly, I expected the patch 0001 code should say more like strstr(…,
“vci:”) because otherwise it does not seem VCI specific. Indeed, if it
was checking “vci:” then the comment in storage/vci_ros_daemon.c would
make sense to me.

So, I agree that the Acquire/ReleaseLock code seems like it might be
unreachable. OTOH, I am reluctant to remove it without understanding
more about what the intended purpose was in the first place. Checking…

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

20 August 2025, 05:27:00

Here are the latest v19* patches.

Changes include:

PATCH 0002.
Cleaning
- Removed unused code (per Timur's patch 0002 [1])
- Removed vci_pg_copy.h; Combined this into postgresql_copy.h
Code changes
- Changed linear search to switch (per Timur's patch 0003 [1])
- Removed UpperUniquePath (fix build issue due to master commit 24225ad)

======
[1] Timur's patches.
https://www.postgresql.org/message-id/07c53a696afb8089d724214dbaeded6fcaa8fc0d.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

28 August 2025, 03:20:36

Here are the latest v20* patches.

Changes include:

PATCH 0002.
- Addressed all compiler warnings. Hopefully, cfbot CI will now report
green for these.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

29 August 2025, 06:37:10

Here are the latest v21* patches.

Changes include:

PATCH 0002.
- Removed all the unused "port" subfolder files plus associated include files.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

09 September 2025, 04:41:44

Here are the latest v22* patches.

Changes include:

PATCH 0002.
- Code cleanups -- remove some redundant code related to "devload_t",
because the OSS patch only implements the "unmonitored" device.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

10 September 2025, 05:31:57

Here are the latest v23* patches.

Changes include:

PATCH 0002.
- Some code/config changes so that cfbot can work now for MacOS and FreeBSD.
- Modify test code to make the ANALYZE expected results more stable
(don't display buffers).

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

12 September 2025, 18:58:58

Hello Peter!
Thanks for the updates. Here is a small fix for clang "variable set but not
used" warnings.

--
Regards,
Timur Magomedov

Attachments

0001-Fixed-set-but-not-used-clang-warnings.patch

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

15 September 2025, 02:30:40

On Sat, Sep 13, 2025 at 1:58 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
>
> Hello Peter!
> Thanks for updates. Here is a small fix for clang "variable set but not
> used" warnings.
>

Thanks for your fixes. I will include these whenever I post the next version.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

17 September 2025, 18:15:00

Hello Peter!


I've found (using Valgrind) some cases of reading random garbage past the end
of allocated memory. Investigation showed this was caused by casting some
nodes to VciScanState* even though they were originally allocated with a
smaller size. So accessing VciScanState fields read memory beyond the
palloc'ed chunk, which could be in use by any other allocation.

I think casting to VciScanState* is only valid for nodes with tag
T_CustomScanState, so here is a patch that adds assertions for this:
0001-Assert-corrrect-node-tags-when-casting-to-VciScanSta.patch
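
The general pattern behind such assertions can be sketched as follows. This is a standalone illustration under stated assumptions, not the patch itself: the Demo* types and DEMO_T_* tags are hypothetical stand-ins for PostgreSQL's node structs and NodeTag values, and plain assert() stands in for Assert().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags; the real ones are PostgreSQL NodeTag values. */
typedef enum DemoNodeTag
{
    DEMO_T_NestLoopState = 1,
    DEMO_T_CustomScanState = 2
} DemoNodeTag;

/* Common header: every node starts with its tag. */
typedef struct DemoNode
{
    DemoNodeTag type;
} DemoNode;

/* A small node type, standing in for a regular join state. */
typedef struct DemoNestLoopState
{
    DemoNode    node;
    int         dummy;
} DemoNestLoopState;

/* A larger node type, standing in for VciScanState. */
typedef struct DemoVciScanState
{
    DemoNode    node;
    bool        first_fetch;
    int         num_fetched_rows;
    int         current_row;
    char        extra[64];
} DemoVciScanState;

/* Checked downcast: returns NULL instead of exposing memory past a smaller node. */
static DemoVciScanState *
demo_as_vci_scan(DemoNode *node)
{
    if (node == NULL || node->type != DEMO_T_CustomScanState)
        return NULL;
    return (DemoVciScanState *) node;
}

/* Assert-style variant: aborts in debug builds when the tag is wrong. */
static DemoVciScanState *
demo_cast_vci_scan(DemoNode *node)
{
    assert(node != NULL && node->type == DEMO_T_CustomScanState);
    return (DemoVciScanState *) node;
}
```

Without the tag check, a DemoNestLoopState cast to DemoVciScanState* would let field accesses run past the end of the smaller allocation, which is the kind of invalid read described above.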

VCI v23 passes the tests with this patch applied.

Unfortunately, there are queries that fail. I've added one of them to
bugs.sql:
0002-Reproducer-of-invalid-cast-to-VciScanState.patch
A node with tag 420 (T_NestLoopState) is cast to VciScanState*, which fails
the newly added assertion.

I'm not sure where to look next to fix this. Looking forward to your
comments and ideas.

--
Regards,
Timur Magomedov

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

18 September 2025, 07:46:17

Here are the latest v24* patches.

Changes include:

PATCH 0002.
- Some compiler warning fixes, from Timur [1]
- Pre-emptive removal of PointerIsValid() macro to prevent a rebase
- Added sanity Asserts for T_CustomScanState, from Timur [2]

======
[1] https://www.postgresql.org/message-id/2af90dfaf6004e17782bd6c8cb8444670ab4d82c.camel%40postgrespro.ru
[2] https://www.postgresql.org/message-id/149d6694c0c5a789b0ee865e80109022002bade5.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

18 September 2025, 08:07:36

Hi Timur,

Thanks for your ongoing work for this patch.

On Thu, Sep 18, 2025 at 1:15 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
...
> I've found (using valgrind) some cases of reading random garbage after
> allocated memory. Investigation showed this was caused by converting
> some nodes to VciScanState* even if they have smaller size allocated
> originally. So accessing VciScanState fields was accessing memory after
> palloc'ed memory which could be used by any other allocation.
>
> I think converting to VciScanState* is only valid for nodes with tag
> T_CustomScanState so here is a patch that adds assertions for this:
> 0001-Assert-corrrect-node-tags-when-casting-to-VciScanSta.patch

What exactly did Valgrind report? For example, you said the
VciScanState points beyond allocated memory. Do you have any more
clues, like where that happened? Did you discover where that (smaller
than it should be) memory was allocated in the first place?

>
> VCI v23 passes the tests with this patch applied.

OK. I am not 100% certain about the asserts, but since the existing
VCI tests are passing, I have merged your patch as-is into v24-0002. I
guess we will find out later if the bug below is due to an old code
cast problem or a new code assert problem.

>
> There are queries that fail unfortunately. I've added one of them to
> bugs.sql:
> 0002-Reproducer-of-invalid-cast-to-VciScanState.patch
> Node with tag 420 (T_NestLoopState) is cast to VciScanState* that fails
> newly added assertion.
>
> I'm not sure where to look next to fix this. Looking forward for you
> comments and ideas.

OK. I ran with your 2nd patch applied and reproduced the core-dump
below, where it tripped over one of your new Asserts at
executor/vci_sort.c:89. I can see it is an unexpected value
T_NestLoopState.

(gdb) bt 15
#0  0x00007ff948aa62c7 in raise () from /lib64/libc.so.6
#1  0x00007ff948aa79b8 in abort () from /lib64/libc.so.6
#2  0x0000000000c07977 in ExceptionalCondition
(conditionName=0x7ff940839fa8 "scanstate->vci.css.ss.ps.type ==
T_CustomScanState", fileName=0x7ff940839f90 "executor/vci_sort.c",
    lineNumber=89) at assert.c:66
#3  0x00007ff9408084e6 in vci_sort_ExecCustomPlan (node=0x2a862f0) at
executor/vci_sort.c:89
#4  0x000000000079d5bd in ExecCustomScan (pstate=0x2a862f0) at nodeCustom.c:137
#5  0x000000000077f693 in ExecProcNodeInstr (node=0x2a862f0) at
execProcnode.c:486
#6  0x000000000077f664 in ExecProcNodeFirst (node=0x2a862f0) at
execProcnode.c:470
#7  0x0000000000772b72 in ExecProcNode (node=0x2a862f0) at
../../../src/include/executor/executor.h:316
#8  0x0000000000775774 in ExecutePlan (queryDesc=0x2a89100,
operation=CMD_SELECT, sendTuples=true, numberTuples=0,
direction=ForwardScanDirection, dest=0xe5b1a0 <donothingDR>)
    at execMain.c:1697
#9  0x000000000077317b in standard_ExecutorRun (queryDesc=0x2a89100,
direction=ForwardScanDirection, count=0) at execMain.c:366
#10 0x00007ff9407f9efd in vci_executor_run_routine
(queryDesc=0x2a89100, direction=ForwardScanDirection, count=0) at
executor/vci_executor.c:178
#11 0x0000000000772ff5 in ExecutorRun (queryDesc=0x2a89100,
direction=ForwardScanDirection, count=0) at execMain.c:301
#12 0x00000000006b7f66 in ExplainOnePlan (plannedstmt=0x2a8a628,
into=0x0, es=0x2a81388,
    queryString=0x28b0fd0 "EXPLAIN (ANALYZE, COSTS FALSE, BUFFERS
FALSE, TIMING FALSE, SUMMARY FALSE)\nSELECT *\n  FROM main m\n  JOIN
secondary s\n\tON m.id = s.main_id\n WHERE s.val in (\n\t\tSELECT
MAX(val)\n\t\t  FROM secondary s2\n\t\t W"..., params=0x0,
queryEnv=0x0, planduration=0x7ffe51311320, bufusage=0x0,
mem_counters=0x0) at explain.c:579
#13 0x00000000006b799a in standard_ExplainOneQuery (query=0x2ac21f8,
cursorOptions=2048, into=0x0, es=0x2a81388,
    queryString=0x28b0fd0 "EXPLAIN (ANALYZE, COSTS FALSE, BUFFERS
FALSE, TIMING FALSE, SUMMARY FALSE)\nSELECT *\n  FROM main m\n  JOIN
secondary s\n\tON m.id = s.main_id\n WHERE s.val in (\n\t\tSELECT
MAX(val)\n\t\t  FROM secondary s2\n\t\t W"..., params=0x0,
queryEnv=0x0) at explain.c:372
#14 0x00007ff9407f9ff3 in vci_explain_one_query_routine
(queryDesc=0x2ac21f8, cursorOptions=2048, into=0x0, es=0x2a81388,
    queryString=0x28b0fd0 "EXPLAIN (ANALYZE, COSTS FALSE, BUFFERS
FALSE, TIMING FALSE, SUMMARY FALSE)\nSELECT *\n  FROM main m\n  JOIN
secondary s\n\tON m.id = s.main_id\n WHERE s.val in (\n\t\tSELECT
MAX(val)\n\t\t  FROM secondary s2\n\t\t W"..., params=0x0,
queryEnv=0x0) at executor/vci_executor.c:224
(More stack frames follow...)
(gdb)

I will keep investigating it...

I have not included your test case in the v24* patches because I
didn't want this known test failure to mask out any other unknown test
problems that might occur.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

19 September 2025, 17:13:51

Hi Peter!

> What exactly did Valgrind report? For example, you said the
> VciScanState points beyond allocated memory. Do you have any more
> clues, like where that happened? Did you discover where that (smaller
> than it should be) memory was allocated in the first place?

While doing some experiments, I hit a segfault on a query joining tables
filled with some amount of data. It was flaky, so I used Valgrind.
There is a line in vci_scan.c, exec_custom_plan_enabling_vp():
if (!scanstate->first_fetch || (scanstate->pos.num_fetched_rows <=
scanstate->pos.current_row))
Valgrind reported that line as Invalid read of size 1, 4 and 4. So all
three of the values checked in this line are read from some random
memory, possibly allocated and used by other objects already.

When the expression in exec_custom_plan_enabling_vp() randomly
evaluated to true, the following ExecClearTuple() dereferences NULL in
slot->tts_ops.
The memory was originally allocated in nodeHashJoin.c, in hjstate =
makeNode(HashJoinState) line so it is really smaller than VciScanState.

I did not use any table data for the reproducer, since the asserts help to
catch the original problem. I also simplified the original query for the
reproducer.

> OK. I am not 100% certain about the asserts, but since the existing
> VCI tests are passing, I have merged your patch as-is into v24-0002.
> I
> guess we will find out later if the bug below is due to an old code
> cast problem or a new code assert problem.
>

Thanks for merging the asserts. It looks like the problem is related to
VCI join nodes.
There is no VCI hash join or VCI nested loop. However, there is code in the
VCI planner that still puts VCI Sort or VCI Aggregate nodes on top of
regular join nodes, which makes no sense to me. This is the cause of the
problem. VCI Sort and VCI Aggregate then cast their outer nodes to VCI
Scan, since they assume there cannot be anything else. This can be fixed
either by implementing VCI joins or by disabling them in a deeper
way. Since we already have developer GUCs for them, I would rather set
them to disabled by default instead of removing all the useful VCI
join-related code.

I made a patch with a test and the simplest fix (disabling joins in the GUCs).

--
Regards,
Timur Magomedov

Attachments

0001-Avoid-VCI-sort-after-non-VCI-join-in-planner.patch

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

23 September 2025, 05:54:13

Here are the latest v25* patches.

Changes include:

PATCH 0001.
- A rebase was needed due to a recent commit d4d1fc5.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

24 September 2025, 05:46:39

On Sat, Sep 20, 2025 at 12:13 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
>
> Hi Peter!
>
> > What exactly did Valgrind report? For example, you said the
> > VciScanState points beyond allocated memory. Do you have any more
> > clues, like where that happened? Did you discover where that (smaller
> > than it should be) memory was allocated in the first place?
>
> Doing some experiments I've faced a segfault on a query joining tables
> filled with some amount of data. It was flaky so I used Valgrind.
> There is a line in vci_scan.c, exec_custom_plan_enabling_vp():
> if (!scanstate->first_fetch || (scanstate->pos.num_fetched_rows <=
> scanstate->pos.current_row))
> Valgrind reported that line as Invalid read of size 1, 4 and 4. So all
> three of the values checked in this line are read from some random
> memory, possibly allocated and used by other objects already.
>
> When the expression in exec_custom_plan_enabling_vp() randomly
> evaluated to true, the following ExecClearTuple() dereferences NULL in
> slot->tts_ops.
> The memory was originally allocated in nodeHashJoin.c, in hjstate =
> makeNode(HashJoinState) line so it is really smaller than VciScanState.
>
> I did not use any table data for reproducer since asserts helps to
> catch the original problem. I also simplified the original query for a
> reproducer.
>
> > OK. I am not 100% certain about the asserts, but since the existing
> > VCI tests are passing, I have merged your patch as-is into v24-0002.
> > I
> > guess we will find out later if the bug below is due to an old code
> > cast problem or a new code assert problem.
> >
>
> Thanks for merging asserts. And looks like the problem is related to
> VCI join nodes.
> There is no VCI hash join or VCI nested loop. There is a code in VCI
> planner that still puts VCI Sort or VCI Aggregate nodes on top of
> regular join nodes which makes no sense to me. This is the cause of the
> problem. VCI Sort and VCI Aggregate then convert outer nodes to VCI
> Scan since they know there can't be anything another. This can be fixed
> either by implementing VCI joins either by disabling them in a deeper
> way. Since we already have developer GUCs for them I would rather set
> them to disabled by default instead of removing all useful VCI joins
> related code.
>
> Made a patch with a test and a simplest fix (disabling joins in GUCs).
>

Hi Timur,

Thanks for the patch! Unfortunately, this is straying into areas with
which I'm not familiar, so I'm taking it on faith that these are good
changes. For now, I'm happy to merge your patch into the next VCI
version that is posted, unless someone else objects.

~

But, I still have a couple of questions for clarification:

1. What about the original Valgrind issue?

Is that still a problem that needs to be addressed? E.g., is the bad
allocation still lurking, and your sort avoidance patch is simply
preventing the bad allocation from being exposed until some next thing
randomly fails? Or is there no allocation problem anymore to worry
about?

2. What about your added Assert that was previously failing at
executor/vci_sort.c:89?

That Assert is still present in vci_sort.c, but AFAICT the current
tests are not executing that code. Do those patched GUC changes simply
make that code unreachable now? In other words, should that previously
failing Assert be left where it is or not? Should there be another
test case added to execute this Assert?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

24 September 2025, 16:47:37

Hi Peter!

On Wed, 2025-09-24 at 12:46 +1000, Peter Smith wrote:
> On Sat, Sep 20, 2025 at 12:13 AM Timur Magomedov
> <t.magomedov@postgrespro.ru> wrote:
> >
> > Hi Peter!
> >
> > > What exactly did Valgrind report? For example, you said the
> > > VciScanState points beyond allocated memory. Do you have any more
> > > clues, like where that happened? Did you discover where that
> > > (smaller
> > > than it should be) memory was allocated in the first place?
> >
> > Doing some experiments I've faced a segfault on a query joining
> > tables
> > filled with some amount of data. It was flaky so I used Valgrind.
> > There is a line in vci_scan.c, exec_custom_plan_enabling_vp():
> > if (!scanstate->first_fetch || (scanstate->pos.num_fetched_rows <=
> > scanstate->pos.current_row))
> > Valgrind reported that line as Invalid read of size 1, 4 and 4. So
> > all
> > three of the values checked in this line are read from some random
> > memory, possibly allocated and used by other objects already.
> >
> > When the expression in exec_custom_plan_enabling_vp() randomly
> > evaluated to true, the following ExecClearTuple() dereferences NULL
> > in
> > slot->tts_ops.
> > The memory was originally allocated in nodeHashJoin.c, in hjstate =
> > makeNode(HashJoinState) line so it is really smaller than
> > VciScanState.
> >
> > I did not use any table data for reproducer since asserts helps to
> > catch the original problem. I also simplified the original query
> > for a
> > reproducer.
> >
> > > OK. I am not 100% certain about the asserts, but since the
> > > existing
> > > VCI tests are passing, I have merged your patch as-is into v24-
> > > 0002.
> > > I
> > > guess we will find out later if the bug below is due to an old
> > > code
> > > cast problem or a new code assert problem.
> > >
> >
> > Thanks for merging asserts. And looks like the problem is related
> > to
> > VCI join nodes.
> > There is no VCI hash join or VCI nested loop. There is a code in
> > VCI
> > planner that still puts VCI Sort or VCI Aggregate nodes on top of
> > regular join nodes which makes no sense to me. This is the cause of
> > the
> > problem. VCI Sort and VCI Aggregate then convert outer nodes to VCI
> > Scan since they know there can't be anything another. This can be
> > fixed
> > either by implementing VCI joins either by disabling them in a
> > deeper
> > way. Since we already have developer GUCs for them I would rather
> > set
> > them to disabled by default instead of removing all useful VCI
> > joins
> > related code.
> >
> > Made a patch with a test and a simplest fix (disabling joins in
> > GUCs).
> >
>
> Hi Timur,
>
> Thanks for the patch! Unfortunately, this is straying into areas with
> which I'm not familiar, so I'm taking it on faith that these are good
> changes. For now, I'm happy to merge your patch into the next VCI
> version, posted unless someone else objects.
>
> ~
>
> But, I still have a couple of questions for clarification:
>
> 1. What about the original Valgrind issue?
>
> Is that still a problem that needs to be addressed? E.g., is the bad
> allocation still lurking, and your sort avoidance patch is simply
> preventing the bad allocation from being exposed until some next
> thing
> randomly fails? Or is there no allocation problem anymore to worry
> about?

Allocations are fine. The problem was treating some nodes as nodes of
another (larger) type, which leads to reading past the boundary of the
allocated memory. We are safe now, and the asserts guard us from repeating
the original bug.


> 2. What about your added Assert that was previously failing at
> executor/vci_sort.c:89?
>
> That Assert is still present in vci_sort.c, but AFAICT the current
> tests are not executing that code. Do those patched GUC changes
> simply
> make that code unreachable now? In other words, should that
> previously
> failing Assert be left where it is or not? Should there be another
> test case added to execute this Assert?

I added a simple test that runs the VCI Sort node; it executes the assertion
code in vci_sort.c. No assertion fails, so VCI Sort itself is OK. Here
are both commits on top of v25.


--
Regards,
Timur Magomedov

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

25 September 2025, 04:47:54

Hi Timur.

On Wed, Sep 24, 2025 at 11:47 PM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
...
> >
> > Thanks for the patch! Unfortunately, this is straying into areas with
> > which I'm not familiar, so I'm taking it on faith that these are good
> > changes. For now, I'm happy to merge your patch into the next VCI
> > version, posted unless someone else objects.
> >
> > ~
> >
> > But, I still have a couple of questions for clarification:
> >
> > 1. What about the original Valgrind issue?
> >
> > Is that still a problem that needs to be addressed? E.g., is the bad
> > allocation still lurking, and your sort avoidance patch is simply
> > preventing the bad allocation from being exposed until some next
> > thing
> > randomly fails? Or is there no allocation problem anymore to worry
> > about?
>
> Allocations are fine, the problem was using some nodes as nodes of
> another type (and bigger size) which leads to crossing boundary of
> allocated memory. We are safe now and asserts guard us from repeating
> the original bug.
>

Thanks for the clarification.

>
> > 2. What about your added Assert that was previously failing at
> > executor/vci_sort.c:89?
> >
> > That Assert is still present in vci_sort.c, but AFAICT the current
> > tests are not executing that code. Do those patched GUC changes simply
> > make that code unreachable now? In other words, should that previously
> > failing Assert be left where it is or not? Should there be another
> > test case added to execute this Assert?
>
> I added a simple test that runs the VCI sort node; it executes the
> assertion code in vci_sort.c. No assertion fails, so VCI Sort itself is
> OK. Here are both commits on top of the v25 version.
>

These have been included in v26. Thanks!

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

25 September 2025, 04:49:16

Here are the latest v26* patches.

Changes include:

PATCH 0002.
- Rebase was needed due to a5b35fc
- Merged the sort/join patches provided by Timur [1]

======
[1] https://www.postgresql.org/message-id/fd67d6c54cccaf0d98ec8a19182635067392b928.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

26 September 2025, 17:09:32

Hello Peter!
There is code in vci_ros.c that initializes an xl_heap_inplace xlrec.
A comment says this code was taken from src/backend/access/heap/heapam.c.
It was fine for Postgres 17 and earlier; however, struct xl_heap_inplace
has had 6 fields, not one, since commit 243e9b40f1b2. So the nmsgs field
of xlrec holds a random uninitialized value from the stack. It goes to
WAL, and a large nmsgs can cause a segfault during server startup.

Here are a backtrace of the segfault while applying WAL on server
startup and a patch that initializes all necessary fields of xlrec to
avoid bad WAL records.
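
The failure mode described above can be sketched in a few lines. This is
only an illustration in Python with ctypes, not the VCI or PostgreSQL
code: the struct layout below is a simplified, hypothetical stand-in for
xl_heap_inplace (the trailing message array is omitted), and the 0xAB
fill merely simulates stale stack bytes. The attached patch may
initialize the fields differently.

```python
import ctypes


class XlHeapInplaceSketch(ctypes.Structure):
    # Hypothetical, simplified layout for illustration only; the real
    # struct lives in PostgreSQL's heapam_xlog.h.
    _fields_ = [
        ("offnum", ctypes.c_uint16),
        ("dbId", ctypes.c_uint32),
        ("tsId", ctypes.c_uint32),
        ("relcacheInitFileInval", ctypes.c_bool),
        ("nmsgs", ctypes.c_int),
    ]


def stale_stack():
    # Stand-in for stack memory still holding bytes from earlier calls.
    return bytearray(b"\xab" * ctypes.sizeof(XlHeapInplaceSketch))


def build_record_old_style(offnum):
    # Mirrors code written when the struct had a single field:
    # only offnum is assigned, everything else keeps the stale bytes.
    rec = XlHeapInplaceSketch.from_buffer(stale_stack())
    rec.offnum = offnum
    return rec


def build_record_fixed(offnum):
    # Zero the whole record first, then assign the fields that matter.
    rec = XlHeapInplaceSketch.from_buffer(stale_stack())
    ctypes.memset(ctypes.addressof(rec), 0, ctypes.sizeof(rec))
    rec.offnum = offnum
    return rec


bad = build_record_old_style(7)
good = build_record_fixed(7)
print("old style nmsgs:", bad.nmsgs)   # garbage taken from the stale bytes
print("fixed nmsgs:", good.nmsgs)
```

In the sketch the old-style record carries a nonsense nmsgs value, which
is the kind of value that would then be written to WAL and misread
during replay.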

--
Regards,
Timur Magomedov

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

29 September 2025, 02:39:12

Here are the latest v27* patches.

Changes include:

PATCH 0002.
- Rebase vci_ros.c was needed due to 243e9b4 (patch provided by Timur [1])

======
[1] https://www.postgresql.org/message-id/b0183172fbe8fbf4260d10df50a57127753eba68.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

06 October 2025, 05:11:59

Here are the latest v28* patches.

Changes include:

PATCH 0001.
- Remove some code that Iwata-San has already separated out to a
different thread [1].

PATCH 0002.
- Temp removal of call to the now separated out function.

======
[1]
https://www.postgresql.org/message-id/flat/OS7PR01MB11964335F36BE41021B62EAE8EAE4A%40OS7PR01MB11964.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

10 October 2025, 02:37:49

Here are the latest v29* patches.

Changes include:

PATCH 0001.
- Rebase was needed

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

14 October 2025, 08:44:25

Here are the latest v30* patches.

Changes include:

PATCH 0002.
- Rebase was needed due to commit add323d

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

06 November 2025, 10:22:55

Here are the latest patches.

Some rebasing was needed.

Note -- I've changed the versioning -- now date-based instead of v31.

~~~

Changes include:

PATCH 0001.
- Rebase due to commit e1ac846f3d2836dcfa0ad15310e28d0a0b495500
(ItemPointerData)

PATCH 0002.
- Rebase due to commit e1ac846f3d2836dcfa0ad15310e28d0a0b495500
(ItemPointerData)
- Rebase due to commit a13833c (GUC structs)

~~~

KNOWN ISSUES
The test cases are currently failing with a TRAP Assert.
Unfortunately, the CI has failed to apply this patch for the last few
weeks, so any master changes during that time could have impacted VCI
without being noticed. Meanwhile, I am reposting these rebased patches
despite the known test issue, because then at least it may
catch/report any other emerging build problems early, while the cause
of the TRAP is still being investigated.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

12 November 2025, 19:11:51

Hello Peter!
I've succeeded in making a reproducer for an infrequent bug I've seen
several times with the ROS control daemon enabled.
It looks like the WAL records produced by the ROS control daemon while
processing the "vci_rc_update_del_vec" command are not compatible with
what the heap_xlog_prune_freeze() function expects to read from WAL.
Those records are produced in cleanUpWos(); the specific line is "recptr
= XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_PRUNE_ON_ACCESS);".
The reproduction steps may look tricky; any ideas for improving them are
welcome. Ideally this would be a TAP test. For now I just patch the code
so that the ROS daemon terminates right after the "update delete vector"
command, then kill all postgres processes. The next time PostgreSQL is
started, it hits an assertion inside heap_xlog_prune_freeze().

This is the reproduction routine in four steps:

1. Patch VCI using vci_always_fail_update_delete_vector.patch and build
it.

2. Set up VCI in the config file (ros_control_daemon enabled):
shared_preload_libraries = 'vci'
max_worker_processes = 20
vci.table_rows_threshold = 0
vci.cost_threshold = 0
vci.enable_ros_control_daemon = true

3. Run the reproducer.sh script, which runs pgbench on a VCI-enabled
table and terminates all postgres processes immediately using "killall
-s 9 postgres" after pgbench has failed. "pg_ctl stop" can't terminate
PostgreSQL here. The "update delete vector" command is usually executed
in less than ten minutes on my system, so some waiting is needed.

4. Now there are some WAL records on storage that (at least on my
machine) PostgreSQL is unable to apply, and it fails the assertion:

$ pg_ctl start
...
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/017689D8
..TRAP: failed Assert("do_prune || nplans > 0 || vmflags & VISIBILITYMAP_VALID_BITS"), File: "heapam_xlog.c", Line: 117, PID: 841207

The debugger shows that the data actually contains some offsets, in
order, but the format and flags combination are unexpected:
#6  0x000055fad895d1f7 in heap_xlog_prune_freeze (record=0x55faf23f0ce0) at heapam_xlog.c:117
117            Assert(do_prune || nplans > 0 || vmflags & VISIBILITYMAP_VALID_BITS);
(gdb) print dataptr
$1 = 0x7072b5e18248 "\001"
(gdb) print datalen
$2 = 370
(gdb) print frz_offsets
$3 = (OffsetNumber *) 0x7072b5e18248
(gdb) print *frz_offsets@185
$4 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
  37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
  72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104,
105,
  106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133,
  134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147,
148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161,
  162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
176, 177, 178, 179, 180, 181, 182, 183, 184, 185}
(gdb) print frz_offsets==dataptr
$5 = 1

I also attached backtrace from GDB.
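
The sizes in the gdb session above can be cross-checked with a little
arithmetic. The sketch below is an illustration only, not part of the
reproducer: it assumes a little-endian machine and relies on
OffsetNumber being a 16-bit unsigned integer, then packs the offsets
1..185 the way a bare OffsetNumber array would be laid out in memory.

```python
import struct

# OffsetNumber is a 16-bit unsigned integer, so each entry takes 2 bytes.
OFFSET_NUMBER_SIZE = struct.calcsize("<H")

# The 185 consecutive offsets printed by gdb for frz_offsets.
offsets = list(range(1, 186))

# Little-endian byte order is an assumption about the reporting machine.
payload = struct.pack("<185H", *offsets)

print(OFFSET_NUMBER_SIZE * len(offsets))  # 370, matching datalen
print(len(payload))                       # 370
print(payload[:2])                        # first entry: byte 0x01, then 0x00
```

The packed array is exactly 370 bytes, which equals datalen, and its
first byte is 0x01 followed by a zero byte, which matches gdb printing
dataptr as the one-character string "\001". In other words, the record
payload looks like a plain offset array with nothing preceding it.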

I don't understand yet how to fix this, and the reproduction is clunky,
so any ideas are welcome.
Does this reproduce on your system too? Is it a known problem?

--
Regards,
Timur Magomedov

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

13 November 2025, 10:02:20

Hi Timur,

Thank you so much for your continued interest in this patch and for
taking the time to report these bugs! It is really appreciated and
your info has been recorded so it won't be forgotten.

Recent work and focus:

* As you might have seen, there have been many changes to master
lately, so I've been rebasing VCI to keep everything building cleanly.
Good news - the existing tests are now working again (locally) after
failing on -hackers for the past couple of weeks. I'm planning to post
the updated patches before the end of this week.

* Once that's done, my next priority is to split the patch 0002 into
smaller, more manageable pieces for easier review.

In other words, it is going to take a little while before I can circle
back to help investigate intermittent bugs.

Regarding your questions:

On Thu, Nov 13, 2025 at 3:11 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
>
...
> I don't understand yet how to fix this, and the reproduction is clunky,
> so any ideas are welcome.
> Does this reproduce on your system too? Is it a known problem?
>

I'll follow your reproduction steps when I have a chance and get back to you.

I am not aware that this is a known problem.

~~

Thanks again. Sorry I can't be of more immediate help.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

14 November 2025, 07:24:11

Here are the latest patches v20251114*

Changes include:

PATCH 0002.
- A fix was needed to make VCI code compatible with the recent master
change c106ef0.

VCI tests should now be passing again.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

17 November 2025, 04:34:44

Here are the latest patches.

Changes include:

PATCH 0002.
- Address a new compiler warning reported by CI

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Timur Magomedov

Date:

17 November 2025, 17:32:34

On Mon, 2025-11-17 at 12:34 +1100, Peter Smith wrote:
> Here are the latest patches.
>
> Changes include:
>
> PATCH 0002.
> - Address a new compiler warning reported by CI
>
Hello Peter!
Thanks for updating the patch!
Regarding the issue I found recently with an incorrect WAL record
[1], I've prepared a patch that replaces the manual XLog...() calls with
a call to log_heap_prune_and_freeze(), in a way similar to what
freezeWos() and lazy_vacuum_heap_page() do. This fixes the reproducer
scenario on my machine.

[1]
https://www.postgresql.org/message-id/36cedffdfcac437afd692442cf9c1d16d7f28b01.camel%40postgrespro.ru

--
Regards,
Timur Magomedov

Attachments

0001-Fixed-WAL-record-in-cleanUpWos.patch

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

18 November 2025, 01:45:39

On Tue, Nov 18, 2025 at 1:32 AM Timur Magomedov
<t.magomedov@postgrespro.ru> wrote:
> Hello Peter!
> Thanks for updating the patch!
> Regarding the issue I found recently with an incorrect WAL record
> [1], I've prepared a patch that replaces the manual XLog...() calls with
> a call to log_heap_prune_and_freeze(), in a way similar to what
> freezeWos() and lazy_vacuum_heap_page() do. This fixes the reproducer
> scenario on my machine.
>
> [1]
> https://www.postgresql.org/message-id/36cedffdfcac437afd692442cf9c1d16d7f28b01.camel%40postgrespro.ru
>

Here are the latest patches.

Changes include:

PATCH 0002.
- Merged the fix provided by Timur [1].

======
[1] https://www.postgresql.org/message-id/1de545d00a5c0bb8aaf42f05dd118a2c19030041.camel%40postgrespro.ru

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

09 December 2025, 08:41:30

Hi. Here is the latest set of VCI patches.

The previous huge patch 0002 has been split into smaller parts to make
it more manageable. The split parts are buildable, but they are not
separately functional. In other words, nothing will work unless/until
all of the parts have been applied.

Summary of changes:

Patch 0001 (unchanged)

~

Patch 0002 (content unchanged; split into multiple parts)

BEFORE
0002-VCI-module-main
AFTER
0002-VCI-main-part1
0003-VCI-main-part2
0004-VCI-main-part3
0005-VCI-main-part4
0006-VCI-main-part5
0007-VCI-main-part6
0008-VCI-tests

~

Patch 0003 (content unchanged)

BEFORE
0003-VCI-module-documentation
AFTER
0009-VCI-docs

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

17 December 2025, 04:40:13

Here is the latest set of VCI patches.

Summary of changes:

Where possible, now declare 'for-loop' variables locally. There were
many of these, resulting in a code reduction of 200+ lines.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Hi.

Here is the latest set of VCI patches.

Summary of changes:
- Most palloc's have been changed to use palloc_array and
palloc_object per similar recent HEAD changes
- In passing, noticed and fixed a MemSet bug.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From

Peter Smith

Date:

23 December 2025, 08:00:13

Hi.

Here is the latest set of VCI patches.

Summary of changes:
- Patch 0001. Modify the locking in function isRelHasVCIIndex() to
address a random TRAP seen in CI results.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
