Обсуждение: partitioning and identity column

Поиск

Список

Период

Сортировка

partitioning and identity column

От

Ashutosh Bapat

Дата:

27 октября 2023 г., 14:32:11

Hi All,
Reading [1] I have been experimenting with behaviour of identity
columns and serial column in case of partitioned tables. My
observations related to serial column can be found at [2]. This email
is about identity column behaviour with partitioned tables. Serial and
identity columns have sufficiently different behaviour and
implementation to have separate discussions on their behaviour. I
don't want to mix this with [1] since that thread is about replacing
serial with identity. The discussion in this and [2] will be useful to
drive [1] forward.

Behaviour 1
=========
If a partitioned table has an identity column, the partitions do not
inherit identity property.
#create table tpart (a int generated always as identity primary key,
                    src varchar) partition by range(a);
#create table t_p1 partition of tpart for values from (1) to (3);
#\d tpart
                         Partitioned table "public.tpart"
 Column |       Type        | Collation | Nullable |           Default
--------+-------------------+-----------+----------+------------------------------
 a      | integer           |           | not null | generated always
as identity
 src    | character varying |           |          |
Partition key: RANGE (a)
Indexes:
    "tpart_pkey" PRIMARY KEY, btree (a)
Number of partitions: 2 (Use \d+ to list them.)

#\d t_p1
                     Table "public.t_p1"
 Column |       Type        | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
 a      | integer           |           | not null |
 src    | character varying |           |          |
Partition of: tpart FOR VALUES FROM (1) TO (3)
Indexes:
    "t_p1_pkey" PRIMARY KEY, btree (a)

Notice that the default column of t_p1. This means that a direct
INSERT into partition will fail if it does not specify value for the
identity column. As a consequence such a value may conflict with an
existing value or a future value of the identity column. In the
example, the identity column is a primary key and also a partition
key, thus the conflict would result in an error. But when it's not a
partition key (and hence a primary key), it will just allow those
conflicting values.

Behaviour 2
=========
If a table being attached as a partition to a partitioned table and
both of them have column with same name as identity column, the ATTACH
succeeds and allow both tables to use different sequences.
#create table t_p5 (a int primary key, b int generated always as
identity, src varchar);
#alter table tpart attach partition t_p5 for values from (7) to (9);
#\d t_p5
                               Table "public.t_p5"
 Column |       Type        | Collation | Nullable |           Default
--------+-------------------+-----------+----------+------------------------------
 a      | integer           |           | not null |
 b      | integer           |           | not null | generated always
as identity
 src    | character varying |           |          |
Partition of: tpart FOR VALUES FROM (7) TO (9)
Indexes:
    "t_p5_pkey" PRIMARY KEY, btree (a)

As a consequence a direct INSERT into the partition will result in a
value for identity column which conflicts with an existing value or a
future value in the partitioned table. Again, if the identity column
in a primary key (and partition key), the conflicting INSERT will
fail. But otherwise, the conflicting values will go unnoticed.

I consulted Vik Fearing, offline, about SQL standard's take on
identity columns in partitioned table. SQL standard does not specify
partitioned tables as a separate entity. Thus a partitioned table is
at par with a regular table. Hence an identity column in partitioned
table should share the same identity space across all the partitions.

Behaviour 3
=========
We allow identity column to be added to a partitioned table which
doesn't have partitions but we disallow adding identity column to a
partitioned table which has partitions.
#create table tpart (a int primary key,
                    src varchar) partition by range(a);
#create table t_p1 partition of tpart for values from (1) to (3);
#alter table tpart add column b int generated always as identity;
ERROR:  cannot recursively add identity column to table that has child tables

I don't understand why is that prohibited. If we allow partitions to
be added to a partitioned table with identity column, we should allow
an identity column to be added to a partitioned table with partitions.

Behaviour 4
=========
Even though we don't allow an identity column to be added to a
partitioned table with partitions, we allow an existing column to be
converted to an identity column in such a table.

#create table tpart (a int primary key,
                    src varchar) partition by range(a);
#create table t_p1 partition of tpart for values from (1) to (3);
#create table t_p2 partition of tpart for values from (3) to (5);
#alter table tpart alter column a add generated always as identity;

#\d tpart
                         Partitioned table "public.tpart"
 Column |       Type        | Collation | Nullable |           Default
--------+-------------------+-----------+----------+------------------------------
 a      | integer           |           | not null | generated always
as identity
 src    | character varying |           |          |
Partition key: RANGE (a)
Indexes:
    "tpart_pkey" PRIMARY KEY, btree (a)
Number of partitions: 2 (Use \d+ to list them.)

#\d t_p1
                     Table "public.t_p1"
 Column |       Type        | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
 a      | integer           |           | not null |
 src    | character varying |           |          |
Partition of: tpart FOR VALUES FROM (1) TO (3)
Indexes:
    "t_p1_pkey" PRIMARY KEY, btree (a)

Behaviour 3 and 4 are conflicting with each other themselves.

I think we should fix these anomalies as follows
1. Allow identity columns to be added to the partitioned table
irrespective of whether they have partitions of not.
2. Propagate identity property to partitions.
3. Use the same underlying sequence for getting default value of an
identity column when INSERTing directly in a partition.
4. Disallow attaching a partition with identity column.

1 will fix inconsistencies in Behaviour 3 and 4. 2 and 3 will fix
anomalies in Behaviour 1. 4 will fix Behaviour 2.

Note on point 3: The current implementation uses pg_depend to find the
sequence associated with the identity column. We don't necessarily
need to add dependencies for individual partitions though. Instead we
could use the dependency on the partitioned table itself. I haven't
checked feasibility of this option, but it makes things simpler esp.
for DETACH and DROP of partition.

Note on point 4: The proposal again simplifies DETACH and DROP. If we
decide to somehow coalesce the identity columns of partition and
partitioned table, it would make DETACH and DROP complex. Also if the
identity column of the partition being attached is not identity column
in partition table, INSERT on partition table would fail in case of
ALWAYS. Of course the risk is we will break backward compatibility.
But given that the current behaviour is quite erroneous, I doubt if
there are users relying on this behaviour.

Thoughts?


[1] https://www.postgresql.org/message-id/flat/70be435b-05db-06f2-7c01-9bb8ee2fccce%40enterprisedb.com
[2] https://www.postgresql.org/message-id/CAExHW5toAsjc7uwSeSzX6sgvktFxsv7pd606zP6DnTX7Y6O4jg@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

13 ноября 2023 г., 13:21:43

On 27.10.23 13:32, Ashutosh Bapat wrote:
> I think we should fix these anomalies as follows
> 1. Allow identity columns to be added to the partitioned table
> irrespective of whether they have partitions of not.
> 2. Propagate identity property to partitions.
> 3. Use the same underlying sequence for getting default value of an
> identity column when INSERTing directly in a partition.
> 4. Disallow attaching a partition with identity column.
> 
> 1 will fix inconsistencies in Behaviour 3 and 4. 2 and 3 will fix
> anomalies in Behaviour 1. 4 will fix Behaviour 2.

This makes sense to me.

Note, here is a writeup about the behavior of generated columns with 
partitioning: 
https://www.postgresql.org/docs/devel/ddl-generated-columns.html.  It 
would be useful if we documented the behavior of identity columns 
similarly.  (I'm not saying the behavior has to match.)

One thing that's not clear to me is what should happen if you have a 
partitioned table with an identity column and you try to attach a 
partition that has its own identity definition for that column.  I 
suppose we shouldn't allow that.  (The equivalent case for generated 
columns is allowed.)

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

16 ноября 2023 г., 13:53:25

On Mon, Nov 13, 2023 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 27.10.23 13:32, Ashutosh Bapat wrote:
> > I think we should fix these anomalies as follows
> > 1. Allow identity columns to be added to the partitioned table
> > irrespective of whether they have partitions of not.
> > 2. Propagate identity property to partitions.
> > 3. Use the same underlying sequence for getting default value of an
> > identity column when INSERTing directly in a partition.
> > 4. Disallow attaching a partition with identity column.
> >
> > 1 will fix inconsistencies in Behaviour 3 and 4. 2 and 3 will fix
> > anomalies in Behaviour 1. 4 will fix Behaviour 2.
>
> This makes sense to me.
>
> Note, here is a writeup about the behavior of generated columns with
> partitioning:
> https://www.postgresql.org/docs/devel/ddl-generated-columns.html.  It
> would be useful if we documented the behavior of identity columns
> similarly.  (I'm not saying the behavior has to match.)

Yes. Will add the documentation while working on the code.

>
> One thing that's not clear to me is what should happen if you have a
> partitioned table with an identity column and you try to attach a
> partition that has its own identity definition for that column.  I
> suppose we shouldn't allow that.

That's point 4 above. We shouldn't allow that case.

> (The equivalent case for generated
> columns is allowed.)
>

There might be some weird behaviour because of that like virtual
columns from the same partition reporting different values based on
the table used in the SELECT clause OR stored generated columns having
different values for two rows with the same underlying columns just
because they were INSERTed into different tables (partitioned vs
partition). That may have some impact on the logical replication. I
haven't tested this myself. Maybe a topic for a separate thread.

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

19 декабря 2023 г., 13:47:38

On Mon, Nov 13, 2023 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 27.10.23 13:32, Ashutosh Bapat wrote:
> > I think we should fix these anomalies as follows
> > 1. Allow identity columns to be added to the partitioned table
> > irrespective of whether they have partitions of not.
> > 2. Propagate identity property to partitions.
> > 3. Use the same underlying sequence for getting default value of an
> > identity column when INSERTing directly in a partition.
> > 4. Disallow attaching a partition with identity column.
> >
> > 1 will fix inconsistencies in Behaviour 3 and 4. 2 and 3 will fix
> > anomalies in Behaviour 1. 4 will fix Behaviour 2.
>
> This makes sense to me.

PFA WIP patches implementing/fixing identity support for partitioned
tables as outlined above.

A partitioned table is a single relation and thus an identity column
of a partitioned table should use the same identity space across all
the partitions. This means that the sequence underlying the identity
column will be shared by all the partitions of a partitioned table and
the column will have the same identity properties across all the
partitions. Thus
1. When a new partition is added or a table is attached as a
partition, it inherits the identity column along with the underlying
sequence from the partitioned table. It can not have an identity
column of its own.
2. Since a partition never had its own identity column, when detaching
a partition, it will loose identity property of any column that had
it. If it were to retain the identity property it can not use
underlying sequence. That's not possible anyway.

This is different from the way we treat identity in inheritance.
Children in inheritance hierarchy are independent enough to have
separate identity columns and sequences of their own. So the above
discussion applies only to partitioned table. The patches too deal
with only partitioned tables.

At this point I am looking for opinions on the above rules and whether
the implementation is on the right track.

The work consists of many small code changes. In order to know which
code change is associated with which SQL each patch has test changes
and associated code change. Each patch has a commit message explaining
the changes in detail (and some times repeating the above rules again,
sorry for the repetition). These patches will be merged into a single
patch or a couple patches at most. Here's what each patch does

0001 - change to get_partition_ancestors() prologue. Can be reviewed
and committed independent of other patches.

0002 - A new partition inherits identity column and uses the
underlying sequence for direct INSERTs

0004 - An attached partition inherits identity property and uses the
underlying sequence for direct INSERTs. When inheriting the identity
property it should also inherit the NOT NULL constraint, but that's a
TODO in this patch. We expect matching NOT NULL constraints to be
present in the partition being attached. I am not sure whether we want
to add NOT NULL constraints automatically for an identity column. We
require a NOT NULL constraint to be present when adding identity
property to a column. The behavior in the patch seems to be consistent
with this.

0006 - supports ADD COLUMN ... GENERATED AS IDENTITY on a partitioned
table. identity property is propagated down the partition hierarchy.

0008 - A TODO: that I need verify/address before finalizing these
patches. Any hints are welcome.

0009 - supports ALTER COLUMN ... ADD GENERATED AS IDENTITY on a
partitioned table. Propagates the identity property down the partition
hierarchy. It requires adding NOT NULL constraint before adding
identity property just like regular table.

0011 - dropping identity property of a column of a partitioned table
drops it from the corresponding columns of all of its partitions. But
the NOT NULL constraint is retained just like in case of regular
table.

0013 - detaching a partition, drops identity property from all the
columns of partition.

patches 0003, 0005, 0007, 0010, 0012, 0014 have detailed white box
tests testing the catalog changes for each SQL. But they are not meant
to be part of the final patch-set.

--
Best Wishes,
Ashutosh Bapat

Вложения

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

21 декабря 2023 г., 14:02:02

On 19.12.23 11:47, Ashutosh Bapat wrote:
> At this point I am looking for opinions on the above rules and whether
> the implementation is on the right track.

This looks on the right track to me.

> 0001 - change to get_partition_ancestors() prologue. Can be reviewed
> and committed independent of other patches.

I committed that.

> 0004 - An attached partition inherits identity property and uses the
> underlying sequence for direct INSERTs. When inheriting the identity
> property it should also inherit the NOT NULL constraint, but that's a
> TODO in this patch. We expect matching NOT NULL constraints to be
> present in the partition being attached. I am not sure whether we want
> to add NOT NULL constraints automatically for an identity column. We
> require a NOT NULL constraint to be present when adding identity
> property to a column. The behavior in the patch seems to be consistent
> with this.

I think it makes sense that the NOT NULL constraint must be added 
manually before attaching  is allowed.

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

09 января, 17:10:33

On Thu, Dec 21, 2023 at 4:32 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 19.12.23 11:47, Ashutosh Bapat wrote:
> > At this point I am looking for opinions on the above rules and whether
> > the implementation is on the right track.
>
> This looks on the right track to me.

Thanks.

>
> > 0001 - change to get_partition_ancestors() prologue. Can be reviewed
> > and committed independent of other patches.
>
> I committed that.

Thanks.

>
> > 0004 - An attached partition inherits identity property and uses the
> > underlying sequence for direct INSERTs. When inheriting the identity
> > property it should also inherit the NOT NULL constraint, but that's a
> > TODO in this patch. We expect matching NOT NULL constraints to be
> > present in the partition being attached. I am not sure whether we want
> > to add NOT NULL constraints automatically for an identity column. We
> > require a NOT NULL constraint to be present when adding identity
> > property to a column. The behavior in the patch seems to be consistent
> > with this.
>
> I think it makes sense that the NOT NULL constraint must be added
> manually before attaching  is allowed.
>
Ok. I have modified the test case to add NOT NULL constraint.

Here's complete patch-set.
0001 - fixes unrelated documentation style - can be committed
independently OR ignored
0002 - adds an Assert in related code - can be independently committed

On Mon, Nov 13, 2023 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> Note, here is a writeup about the behavior of generated columns with
> partitioning:
> https://www.postgresql.org/docs/devel/ddl-generated-columns.html.  It
> would be useful if we documented the behavior of identity columns
> similarly.  (I'm not saying the behavior has to match.)
0003 - addresses this request

0004 - 0011 - each patch contains code changes and SQL testing those
changes for ease of review. Each patch has commit message that
describes the changes and rationale, if any, behind those changes.
0012 - test changes
0013 - expected output change because of code changes
All these patches should be committed as a single commit finally.
Please let me know when I can squash those all together. We may commit
0003 separately or along with 0004-0013.

0014 and 0015 - pg_dump/restore and pg_upgrade tests. But these
patches are not expected to be committed for the reasons explained in
the commit message. Since identity columns of a partitioned table are
not marked as such in partitions in the older version, I tested their
upgrade from PG 14 through the changes in 0015. pg_dumpall_14.out
contains the dump file from PG 14 I used for this testing.

--
Best Wishes,
Ashutosh Bapat

Вложения

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

16 января, 21:59:59

On 09.01.24 15:10, Ashutosh Bapat wrote:
> Here's complete patch-set.

Looks good!  Committed.

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

17 января, 08:36:33

On Wed, Jan 17, 2024 at 12:30 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 09.01.24 15:10, Ashutosh Bapat wrote:
> > Here's complete patch-set.
>
> Looks good!  Committed.
>

Thanks a lot Peter.

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

22 января, 15:02:30

On 17.01.24 06:36, Ashutosh Bapat wrote:
> On Wed, Jan 17, 2024 at 12:30 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> On 09.01.24 15:10, Ashutosh Bapat wrote:
>>> Here's complete patch-set.
>>
>> Looks good!  Committed.
>>
> 
> Thanks a lot Peter.

I found another piece of code that might need updating, or at least the 
comment.

In MergeAttributes(), in the part that merges the specified column 
definitions into the inherited ones, it says

     /*
      * Identity is never inherited.  The new column can have an
      * identity definition, so we always just take that one.
      */
     def->identity = newdef->identity;

This is still correct for regular inheritance, but not for partitioning. 
  I think for partitioning, this is not reachable because you can't 
specify identity information when you create a partition(?).  So maybe 
something like

     if (newdef->identity)
     {
         Assert(!is_partioning);
         /*
          * Identity is never inherited.  The new column can have an
          * identity definition, so we always just take that one.
          */
         def->identity = newdef->identity;
     }

Thoughts?

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

22 января, 15:23:19

On Mon, Jan 22, 2024 at 5:32 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I found another piece of code that might need updating, or at least the
> comment.
>
> In MergeAttributes(), in the part that merges the specified column
> definitions into the inherited ones, it says
>
>      /*
>       * Identity is never inherited.  The new column can have an
>       * identity definition, so we always just take that one.
>       */
>      def->identity = newdef->identity;
>
> This is still correct for regular inheritance, but not for partitioning.
>   I think for partitioning, this is not reachable because you can't
> specify identity information when you create a partition(?).  So maybe
> something like

You may specify the information when creating a partition, but it will
cause an error. We have tests in identity.sql for the same (look for
pitest1_pfail).

>
>      if (newdef->identity)
>      {
>          Assert(!is_partioning);
>          /*
>           * Identity is never inherited.  The new column can have an
>           * identity definition, so we always just take that one.
>           */
>          def->identity = newdef->identity;
>      }
>
> Thoughts?

That code block already has Assert(!is_partition) at line 3085. I
thought that Assert is enough.

There's another thing I found. The file isn't using
check_stack_depth() in the function which traverse inheritance
hierarchies. This isn't just a problem of the identity related
function but most of the functions in that file. Do you think it's
worth fixing it?

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

22 января, 21:59:32

On 22.01.24 13:23, Ashutosh Bapat wrote:
>>       if (newdef->identity)
>>       {
>>           Assert(!is_partioning);
>>           /*
>>            * Identity is never inherited.  The new column can have an
>>            * identity definition, so we always just take that one.
>>            */
>>           def->identity = newdef->identity;
>>       }
>>
>> Thoughts?
> 
> That code block already has Assert(!is_partition) at line 3085. I
> thought that Assert is enough.

Ok.  Maybe just rephrase that comment somehow then?

> There's another thing I found. The file isn't using
> check_stack_depth() in the function which traverse inheritance
> hierarchies. This isn't just a problem of the identity related
> function but most of the functions in that file. Do you think it's
> worth fixing it?

I suppose the number of inheritance levels is usually not a problem for 
stack depth?

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

24 января, 09:34:43

On Tue, Jan 23, 2024 at 12:29 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 22.01.24 13:23, Ashutosh Bapat wrote:
> >>       if (newdef->identity)
> >>       {
> >>           Assert(!is_partioning);
> >>           /*
> >>            * Identity is never inherited.  The new column can have an
> >>            * identity definition, so we always just take that one.
> >>            */
> >>           def->identity = newdef->identity;
> >>       }
> >>
> >> Thoughts?
> >
> > That code block already has Assert(!is_partition) at line 3085. I
> > thought that Assert is enough.
>
> Ok.  Maybe just rephrase that comment somehow then?

Please see refactoring patches attached to [1]. Refactoring that way
makes it unnecessary to mention "regular inheritance" in each comment.
Yet I have included a modified version of the comment in that patch
set.

>
> > There's another thing I found. The file isn't using
> > check_stack_depth() in the function which traverse inheritance
> > hierarchies. This isn't just a problem of the identity related
> > function but most of the functions in that file. Do you think it's
> > worth fixing it?
>
> I suppose the number of inheritance levels is usually not a problem for
> stack depth?
>

Practically it should not. I would rethink the application design if
it requires so many inheritance or partition levels. But functions in
optimizer like try_partitionwise_join() and set_append_rel_size() call

/* Guard against stack overflow due to overly deep inheritance tree. */
check_stack_depth();

I am fine if we want to skip this.

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

24 января, 10:02:30

On Wed, Jan 24, 2024 at 12:04 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> >
> > Ok.  Maybe just rephrase that comment somehow then?
>
> Please see refactoring patches attached to [1]. Refactoring that way
> makes it unnecessary to mention "regular inheritance" in each comment.
> Yet I have included a modified version of the comment in that patch
> set.

Sorry forgot to add the reference. Here it is.

[1] https://www.postgresql.org/message-id/CAExHW5vz7A-skzt05=4frFx9-VPjfjK4jKQZT7ufRNh4J7=xmQ@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

15 февраля, 21:00:00

Hello Ashutosh,

24.01.2024 09:34, Ashutosh Bapat wrote:
>
>>> There's another thing I found. The file isn't using
>>> check_stack_depth() in the function which traverse inheritance
>>> hierarchies. This isn't just a problem of the identity related
>>> function but most of the functions in that file. Do you think it's
>>> worth fixing it?
>> I suppose the number of inheritance levels is usually not a problem for
>> stack depth?
>>
> Practically it should not. I would rethink the application design if
> it requires so many inheritance or partition levels. But functions in
> optimizer like try_partitionwise_join() and set_append_rel_size() call
>
> /* Guard against stack overflow due to overly deep inheritance tree. */
> check_stack_depth();
>
> I am fine if we want to skip this.

I've managed to reach stack overflow inside ATExecSetIdentity() with
the following script:
(echo "CREATE TABLE tp0 (a int PRIMARY KEY,
         b int GENERATED ALWAYS AS IDENTITY) PARTITION BY RANGE (a);";
for ((i=1;i<=80000;i++)); do
   echo "CREATE TABLE tp$i PARTITION OF tp$(( $i - 1 ))
          FOR VALUES FROM ($i) TO (1000000) PARTITION BY RANGE (a);";
done;
echo "ALTER TABLE tp0 ALTER COLUMN b SET GENERATED BY DEFAULT;") | psql >psql.log

(with max_locks_per_transaction = 400 in the config)

It runs about 15 minutes for me and ends with:
Program terminated with signal SIGSEGV, Segmentation fault.

#0  0x000055a8ced20de9 in LWLockAcquire (lock=0x7faec200b900, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1169
1169    {
(gdb) bt
#0  0x000055a8ced20de9 in LWLockAcquire (lock=0x7faec200b900, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1169
#1  0x000055a8cea0342d in WALInsertLockAcquire () at xlog.c:1389
#2  XLogInsertRecord (rdata=0x55a8cf1ccee8 <hdr_rdt>, fpw_lsn=fpw_lsn@entry=1261347512, flags=0 '\000', 
num_fpi=num_fpi@entry=0, topxid_included=false) at xlog.c:817
#3  0x000055a8cea1396e in XLogInsert (rmid=rmid@entry=11 '\v', info=<optimized out>) at xloginsert.c:524
#4  0x000055a8ce9c1541 in _bt_insertonpg (rel=0x7faeb8478c98, heaprel=0x7faecf63d378, 
itup_key=itup_key@entry=0x55a8d5064678, buf=3210, cbuf=cbuf@entry=0, stack=stack@entry=0x55a8d1063d08, 
itup=0x55a8d5064658, itemsz=16,
     newitemoff=<optimized out>, postingoff=0, split_only_page=<optimized out>) at nbtinsert.c:1389
#5  0x000055a8ce9bf9a7 in _bt_doinsert (rel=<optimized out>, rel@entry=0x7faeb8478c98, itup=<optimized out>, 
itup@entry=0x55a8d5064658, checkUnique=<optimized out>, checkUnique@entry=UNIQUE_CHECK_YES, indexUnchanged=<optimized
out>,
     heapRel=<optimized out>, heapRel@entry=0x7faecf63d378) at nbtinsert.c:260
#6  0x000055a8ce9c92ad in btinsert (rel=0x7faeb8478c98, values=<optimized out>, isnull=<optimized out>, 
ht_ctid=0x55a8d50643cc, heapRel=0x7faecf63d378, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=<optimized out>,
     indexInfo=<optimized out>) at nbtree.c:205
#7  0x000055a8cea41391 in CatalogIndexInsert (indstate=indstate@entry=0x55a8d0fc03e8, heapTuple=<optimized out>, 
heapTuple@entry=0x55a8d50643c8, updateIndexes=<optimized out>) at indexing.c:170
#8  0x000055a8cea4172c in CatalogTupleUpdate (heapRel=heapRel@entry=0x7faecf63d378, otid=0x55a8d50643cc, 
tup=tup@entry=0x55a8d50643c8) at indexing.c:324
#9  0x000055a8ceb18173 in ATExecSetIdentity (rel=0x7faeab1288a8, colName=colName@entry=0x55a8d0fbc2b8 "b", 
def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8307
#10 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab127f28, colName=colName@entry=0x55a8d0fbc2b8 "b", 
def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
#11 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab1275a8, colName=colName@entry=0x55a8d0fbc2b8 "b", 
def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
#12 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab126c28, colName=colName@entry=0x55a8d0fbc2b8 "b", 
def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
...

Functions ATExecAddIdentity() and ATExecDropIdentity() are recursive too,
so I think they can be exploited as well.

Best regards,
Alexander

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

19 февраля, 15:17:34

On Thu, Feb 15, 2024 at 11:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hello Ashutosh,
>
> 24.01.2024 09:34, Ashutosh Bapat wrote:
> >
> >>> There's another thing I found. The file isn't using
> >>> check_stack_depth() in the function which traverse inheritance
> >>> hierarchies. This isn't just a problem of the identity related
> >>> function but most of the functions in that file. Do you think it's
> >>> worth fixing it?
> >> I suppose the number of inheritance levels is usually not a problem for
> >> stack depth?
> >>
> > Practically it should not. I would rethink the application design if
> > it requires so many inheritance or partition levels. But functions in
> > optimizer like try_partitionwise_join() and set_append_rel_size() call
> >
> > /* Guard against stack overflow due to overly deep inheritance tree. */
> > check_stack_depth();
> >
> > I am fine if we want to skip this.
>
> I've managed to reach stack overflow inside ATExecSetIdentity() with
> the following script:
> (echo "CREATE TABLE tp0 (a int PRIMARY KEY,
>          b int GENERATED ALWAYS AS IDENTITY) PARTITION BY RANGE (a);";
> for ((i=1;i<=80000;i++)); do
>    echo "CREATE TABLE tp$i PARTITION OF tp$(( $i - 1 ))
>           FOR VALUES FROM ($i) TO (1000000) PARTITION BY RANGE (a);";
> done;
> echo "ALTER TABLE tp0 ALTER COLUMN b SET GENERATED BY DEFAULT;") | psql >psql.log
>
> (with max_locks_per_transaction = 400 in the config)
>
> It runs about 15 minutes for me and ends with:
> Program terminated with signal SIGSEGV, Segmentation fault.
>
> #0  0x000055a8ced20de9 in LWLockAcquire (lock=0x7faec200b900, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1169
> 1169    {
> (gdb) bt
> #0  0x000055a8ced20de9 in LWLockAcquire (lock=0x7faec200b900, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1169
> #1  0x000055a8cea0342d in WALInsertLockAcquire () at xlog.c:1389
> #2  XLogInsertRecord (rdata=0x55a8cf1ccee8 <hdr_rdt>, fpw_lsn=fpw_lsn@entry=1261347512, flags=0 '\000',
> num_fpi=num_fpi@entry=0, topxid_included=false) at xlog.c:817
> #3  0x000055a8cea1396e in XLogInsert (rmid=rmid@entry=11 '\v', info=<optimized out>) at xloginsert.c:524
> #4  0x000055a8ce9c1541 in _bt_insertonpg (rel=0x7faeb8478c98, heaprel=0x7faecf63d378,
> itup_key=itup_key@entry=0x55a8d5064678, buf=3210, cbuf=cbuf@entry=0, stack=stack@entry=0x55a8d1063d08,
> itup=0x55a8d5064658, itemsz=16,
>      newitemoff=<optimized out>, postingoff=0, split_only_page=<optimized out>) at nbtinsert.c:1389
> #5  0x000055a8ce9bf9a7 in _bt_doinsert (rel=<optimized out>, rel@entry=0x7faeb8478c98, itup=<optimized out>,
> itup@entry=0x55a8d5064658, checkUnique=<optimized out>, checkUnique@entry=UNIQUE_CHECK_YES, indexUnchanged=<optimized
out>,
>      heapRel=<optimized out>, heapRel@entry=0x7faecf63d378) at nbtinsert.c:260
> #6  0x000055a8ce9c92ad in btinsert (rel=0x7faeb8478c98, values=<optimized out>, isnull=<optimized out>,
> ht_ctid=0x55a8d50643cc, heapRel=0x7faecf63d378, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=<optimized out>,
>      indexInfo=<optimized out>) at nbtree.c:205
> #7  0x000055a8cea41391 in CatalogIndexInsert (indstate=indstate@entry=0x55a8d0fc03e8, heapTuple=<optimized out>,
> heapTuple@entry=0x55a8d50643c8, updateIndexes=<optimized out>) at indexing.c:170
> #8  0x000055a8cea4172c in CatalogTupleUpdate (heapRel=heapRel@entry=0x7faecf63d378, otid=0x55a8d50643cc,
> tup=tup@entry=0x55a8d50643c8) at indexing.c:324
> #9  0x000055a8ceb18173 in ATExecSetIdentity (rel=0x7faeab1288a8, colName=colName@entry=0x55a8d0fbc2b8 "b",
> def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8307
> #10 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab127f28, colName=colName@entry=0x55a8d0fbc2b8 "b",
> def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
> #11 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab1275a8, colName=colName@entry=0x55a8d0fbc2b8 "b",
> def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
> #12 0x000055a8ceb18251 in ATExecSetIdentity (rel=0x7faeab126c28, colName=colName@entry=0x55a8d0fbc2b8 "b",
> def=def@entry=0x55a8d1063918, lockmode=lockmode@entry=8, recurse=true, recursing=<optimized out>) at tablecmds.c:8337
> ...
>
> Functions ATExecAddIdentity() and ATExecDropIdentity() are recursive too,
> so I think they can be exploited as well.

not just Identity related functions, but many other functions in
tablecmds.c have that problem as I mentioned earlier.

--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

19 февраля, 18:00:00

Hello Ashutosh,

19.02.2024 15:17, Ashutosh Bapat wrote:
>
>> Functions ATExecAddIdentity() and ATExecDropIdentity() are recursive too,
>> so I think they can be exploited as well.
> not just Identity related functions, but many other functions in
> tablecmds.c have that problem as I mentioned earlier.
>

Could you please name functions, which you suspect, for me to recheck them?
Perhaps we should consider fixing all of such functions, in light of
b0f7dd915 and d57b7cc33...

Best regards,
Alexander

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

20 февраля, 07:57:49

On Mon, Feb 19, 2024 at 8:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hello Ashutosh,
>
> 19.02.2024 15:17, Ashutosh Bapat wrote:
> >
> >> Functions ATExecAddIdentity() and ATExecDropIdentity() are recursive too,
> >> so I think they can be exploited as well.
> > not just Identity related functions, but many other functions in
> > tablecmds.c have that problem as I mentioned earlier.
> >
>
> Could you please name functions, which you suspect, for me to recheck them?
> Perhaps we should consider fixing all of such functions, in light of
> b0f7dd915 and d57b7cc33...

Looks like the second commit has fixed all other places I knew except
Identity related functions. So worth fixing identity related functions
too. I see
dropconstraint_internal() has two calls to check_stack_depth() back to
back. The second one is not needed?


--
Best Wishes,
Ashutosh Bapat

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

20 февраля, 09:00:00

20.02.2024 07:57, Ashutosh Bapat wrote:
>> Could you please name functions, which you suspect, for me to recheck them?
>> Perhaps we should consider fixing all of such functions, in light of
>> b0f7dd915 and d57b7cc33...
> Looks like the second commit has fixed all other places I knew except
> Identity related functions. So worth fixing identity related functions
> too. I see
> dropconstraint_internal() has two calls to check_stack_depth() back to
> back. The second one is not needed?

Yeah, that's funny. It looks like such a double protection emerged
because Alvaro protected the function (in b0f7dd915), which was waiting for
adding check_stack_depth() in the other thread (resulted in d57b7cc33).

Thank you for spending time on this!

Best regards,
Alexander

Re: partitioning and identity column

От

Alexander Korotkov

Дата:

21 февраля, 03:51:20

On Tue, Feb 20, 2024 at 8:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
> 20.02.2024 07:57, Ashutosh Bapat wrote:
> >> Could you please name functions, which you suspect, for me to recheck them?
> >> Perhaps we should consider fixing all of such functions, in light of
> >> b0f7dd915 and d57b7cc33...
> > Looks like the second commit has fixed all other places I knew except
> > Identity related functions. So worth fixing identity related functions
> > too. I see
> > dropconstraint_internal() has two calls to check_stack_depth() back to
> > back. The second one is not needed?
>
> Yeah, that's funny. It looks like such a double protection emerged
> because Alvaro protected the function (in b0f7dd915), which was waiting for
> adding check_stack_depth() in the other thread (resulted in d57b7cc33).
>
> Thank you for spending time on this!

Thank you, I removed the second check.

------
Regards,
Alexander Korotkov

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

26 апреля, 15:00:00

Hello Ashutosh and Peter,

16.01.2024 21:59, Peter Eisentraut wrote:
> On 09.01.24 15:10, Ashutosh Bapat wrote:
>> Here's complete patch-set.
>
> Looks good!  Committed.
>

Please take a look at a new error case introduced by 699586315:
CREATE TABLE tbl1 (a int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY)
   PARTITION BY LIST (a);
CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;

CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR:  no owned sequence found

Best regards,
Alexander

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

26 апреля, 15:57:58

Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:

Hello Ashutosh and Peter,

16.01.2024 21:59, Peter Eisentraut wrote:
> On 09.01.24 15:10, Ashutosh Bapat wrote:
>> Here's complete patch-set.
>
> Looks good! Committed.
>

Please take a look at a new error case introduced by 699586315:
CREATE TABLE tbl1 (a int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY)
PARTITION BY LIST (a);
CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;

CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR: no owned sequence found

I don't think creating a table like a partition is common or even useful. Usually it would create it from partitithe oned table. But if we consider that to be a use case, I think the error is expected since a partition doesn't have its own identity; it shares it with the partitioned table. Maybe we could give a better message. But I will look into this and fix it if the solution makes sense.

Do you want to track this in open items?

Best Wishes,

Ashutosh Bapat

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

26 апреля, 21:00:01

26.04.2024 15:57, Ashutosh Bapat wrote:

Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:

CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR: no owned sequence found

I don't think creating a table like a partition is common or even useful. Usually it would create it from partitithe oned table. But if we consider that to be a use case, I think the error is expected since a partition doesn't have its own identity; it shares it with the partitioned table. Maybe we could give a better message. But I will look into this and fix it if the solution makes sense.

Maybe it's uncommon, but it's allowed, so users may want to
CREATE TABLE sometable (LIKE partX INCLUDING ALL), for example, if the
partition has a somewhat different structure. And thinking about how such
a restriction could be described in the docs, I would prefer to avoid this
error at the implementation level.

Do you want to track this in open items?

If you are inclined to fix this behavior, I would add this item.

Best regards,
Alexander

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

27 апреля, 18:00:01

Hello Ashutosh,

26.04.2024 21:00, Alexander Lakhin wrote:

26.04.2024 15:57, Ashutosh Bapat wrote:
Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:

CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR: no owned sequence found

Do you want to track this in open items?

If you are inclined to fix this behavior, I would add this item.

Please look also at another script, which produces the same error:
CREATE TABLE tbl1 (a int GENERATED BY DEFAULT AS IDENTITY, b text)
PARTITION BY LIST (b);
CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;

ALTER TABLE tbl1 ALTER COLUMN a SET DATA TYPE bigint;
ERROR: no owned sequence found

(On 699586315~1, it executes successfully and changes the data type of the
identity column and it's sequence.)

Best regards,
Alexander

Re: partitioning and identity column

От

Alexander Lakhin

Дата:

28 апреля, 09:59:44

27.04.2024 18:00, Alexander Lakhin wrote:
>
> Please look also at another script, which produces the same error:

I've discovered yet another problematic case:
CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
     PARTITION BY LIST (a);
CREATE TABLE tbl2 (b text, a int NOT NULL);
ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;

INSERT INTO tbl2 DEFAULT VALUES;
ERROR:  no owned sequence found

Though it works with tbl2(a int NOT NULL, b text)...
Take a look at this too, please.

Best regards,
Alexander

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

30 апреля, 13:59:11

On Sun, Apr 28, 2024 at 12:29 PM Alexander Lakhin <exclusion@gmail.com> wrote:

27.04.2024 18:00, Alexander Lakhin wrote:
>
> Please look also at another script, which produces the same error:

I've discovered yet another problematic case:
CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
PARTITION BY LIST (a);
CREATE TABLE tbl2 (b text, a int NOT NULL);
ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;

INSERT INTO tbl2 DEFAULT VALUES;
ERROR: no owned sequence found

Though it works with tbl2(a int NOT NULL, b text)...
Take a look at this too, please.

Thanks Alexander for the report.

PFA patch which fixes all the three problems.

I had not fixed getIdentitySequence() to fetch identity sequence associated with the partition because I thought it would be better to fail with an error when it's not used correctly. But these bugs show 1. the error is misleading and unwanted 2. there are more places where adding that logic to getIdentitySequence() makes sense. Fixed the function in these patches. Now callers like transformAlterTableStmt have to be careful not to call the function on a partition.

I have examined all the callers of getIdentitySequence() and they seem to be fine. The code related to SetIdentity, DropIdentity is not called for partitions, errors for which are thrown elsewhere earlier.

Best Wishes,

Ashutosh Bapat

Вложения

0001-Fix-assorted-bugs-related-to-identity-colum-20240430.patch

Re: partitioning and identity column

От

Michael Paquier

Дата:

01 мая, 06:53:31

On Tue, Apr 30, 2024 at 04:29:11PM +0530, Ashutosh Bapat wrote:
> PFA patch which fixes all the three problems.

Please note that this was not tracked as an open item, so I have added
one referring to the failures reported by Alexander.
--
Michael

Вложения

signature.asc

Re: partitioning and identity column

От

Dmitry Dolgov

Дата:

04 мая, 23:13:19

> On Tue, Apr 30, 2024 at 04:29:11PM +0530, Ashutosh Bapat wrote:
> On Sun, Apr 28, 2024 at 12:29 PM Alexander Lakhin <exclusion@gmail.com>
> wrote:
>
> > 27.04.2024 18:00, Alexander Lakhin wrote:
> > >
> > > Please look also at another script, which produces the same error:
> >
> > I've discovered yet another problematic case:
> > CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
> >      PARTITION BY LIST (a);
> > CREATE TABLE tbl2 (b text, a int NOT NULL);
> > ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;
> >
> > INSERT INTO tbl2 DEFAULT VALUES;
> > ERROR:  no owned sequence found
> >
> > Though it works with tbl2(a int NOT NULL, b text)...
> > Take a look at this too, please.
> >
>
> Thanks Alexander for the report.
>
> PFA patch which fixes all the three problems.
>
> I had not fixed getIdentitySequence() to fetch identity sequence associated
> with the partition because I thought it would be better to fail with an
> error when it's not used correctly. But these bugs show 1. the error is
> misleading and unwanted 2. there are more places where adding that logic
> to  getIdentitySequence() makes sense. Fixed the function in these patches.
> Now callers like transformAlterTableStmt have to be careful not to call the
> function on a partition.
>
> I have examined all the callers of getIdentitySequence() and they seem to
> be fine. The code related to SetIdentity, DropIdentity is not called for
> partitions, errors for which are thrown elsewhere earlier.

Thanks for the fix.

I had a quick look, it covers the issues mentioned above in the thread.
Few nitpicks/questions:

* I think it makes sense to verify if the ptup is valid. This approach
  would fail if the target column of the root partition is marked as
  attisdropped.

     Oid
    -getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
    +getIdentitySequence(Relation rel, AttrNumber attnum, bool missing_ok)
     {

    [...]

    +        relid = llast_oid(ancestors);
    +        ptup = SearchSysCacheAttName(relid, attname);
    +        attnum = ((Form_pg_attribute) GETSTRUCT(ptup))->attnum;

* getIdentitySequence is used in build_column_default, which in turn
  often appears in loops over table attributes. AFAICT it means that the
  same root partition search will be repeated multiple times in such
  situations if there is more than one identity. I assume the
  performance impact of this repetition is negligible?

* Maybe a silly question, but since I'm not aware about all the details
  here, I'm curious -- the approach of mapping attributes of a partition
  to the root partition attributes, how robust is it? I guess there is
  no way that the root partition column will be not what is expected,
  e.g. due to some sort of concurrency?

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

06 мая, 16:22:41

On Sun, May 5, 2024 at 1:43 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

> On Tue, Apr 30, 2024 at 04:29:11PM +0530, Ashutosh Bapat wrote:
> On Sun, Apr 28, 2024 at 12:29 PM Alexander Lakhin <exclusion@gmail.com>
> wrote:
>
> > 27.04.2024 18:00, Alexander Lakhin wrote:
> > >
> > > Please look also at another script, which produces the same error:
> >
> > I've discovered yet another problematic case:
> > CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
> > PARTITION BY LIST (a);
> > CREATE TABLE tbl2 (b text, a int NOT NULL);
> > ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;
> >
> > INSERT INTO tbl2 DEFAULT VALUES;
> > ERROR: no owned sequence found
> >
> > Though it works with tbl2(a int NOT NULL, b text)...
> > Take a look at this too, please.
> >
>
> Thanks Alexander for the report.
>
> PFA patch which fixes all the three problems.
>
> I had not fixed getIdentitySequence() to fetch identity sequence associated
> with the partition because I thought it would be better to fail with an
> error when it's not used correctly. But these bugs show 1. the error is
> misleading and unwanted 2. there are more places where adding that logic
> to getIdentitySequence() makes sense. Fixed the function in these patches.
> Now callers like transformAlterTableStmt have to be careful not to call the
> function on a partition.
>
> I have examined all the callers of getIdentitySequence() and they seem to
> be fine. The code related to SetIdentity, DropIdentity is not called for
> partitions, errors for which are thrown elsewhere earlier.

Thanks for the fix.

I had a quick look, it covers the issues mentioned above in the thread.
Few nitpicks/questions:

* I think it makes sense to verify if the ptup is valid. This approach
would fail if the target column of the root partition is marked as
attisdropped.

The column is searched by name which is derived from attno of child partition. So it has to exist in the root partition. If it doesn't something is seriously wrong. Do you have a reproducer? We may want to add Assert(HeapTupleIsValid(ptup)) just in case. But it seems unnecessary to me.

Oid
-getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
+getIdentitySequence(Relation rel, AttrNumber attnum, bool missing_ok)
{

[...]

+ relid = llast_oid(ancestors);
+ ptup = SearchSysCacheAttName(relid, attname);
+ attnum = ((Form_pg_attribute) GETSTRUCT(ptup))->attnum;

* getIdentitySequence is used in build_column_default, which in turn
often appears in loops over table attributes. AFAICT it means that the
same root partition search will be repeated multiple times in such
situations if there is more than one identity. I assume the
performance impact of this repetition is negligible?

I thought having multiple identity columns would be rare and hence avoided making code complex. Otherwise we have to get root partition somewhere in the caller hierarchy separately the logic much farther apart. Usually the ancestor entries will be somewhere in the cache

* Maybe a silly question, but since I'm not aware about all the details
here, I'm curious -- the approach of mapping attributes of a partition
to the root partition attributes, how robust is it? I guess there is
no way that the root partition column will be not what is expected,
e.g. due to some sort of concurrency?

Any such thing would require a lock on the partition relation in the question which is locked before passing rel around? So it shouldn't happen.

Best Wishes,

Ashutosh Bapat

Re: partitioning and identity column

От

Dmitry Dolgov

Дата:

06 мая, 17:01:33

> On Mon, May 06, 2024 at 06:52:41PM +0530, Ashutosh Bapat wrote:
> On Sun, May 5, 2024 at 1:43 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > I had a quick look, it covers the issues mentioned above in the thread.
> > Few nitpicks/questions:
> >
> > * I think it makes sense to verify if the ptup is valid. This approach
> >   would fail if the target column of the root partition is marked as
> >   attisdropped.
> >
>
> The column is searched by name which is derived from attno of child
> partition. So it has to exist in the root partition. If it doesn't
> something is seriously wrong. Do you have a reproducer? We may want to add
> Assert(HeapTupleIsValid(ptup)) just in case. But it seems unnecessary to me.

Sure, normally it should work. I don't have any particular situation in
mind, when attisdropped might be set on a root partition, but obviously
setting it manually crashes this path. Consider it mostly as suggestion
for a more defensive implementation "just in case".

> >      Oid
> >     -getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
> >     +getIdentitySequence(Relation rel, AttrNumber attnum, bool missing_ok)
> >      {
> >
> >     [...]
> >
> >     +           relid = llast_oid(ancestors);
> >     +           ptup = SearchSysCacheAttName(relid, attname);
> >     +           attnum = ((Form_pg_attribute) GETSTRUCT(ptup))->attnum;
> >
> > * getIdentitySequence is used in build_column_default, which in turn
> >   often appears in loops over table attributes. AFAICT it means that the
> >   same root partition search will be repeated multiple times in such
> >   situations if there is more than one identity. I assume the
> >   performance impact of this repetition is negligible?
> >
>
> I thought having multiple identity columns would be rare and hence avoided
> making code complex. Otherwise we have to get root partition somewhere in
> the caller hierarchy separately the logic much farther apart. Usually the
> ancestor entries will be somewhere in the cache

Yeah, agree, it's reasonable to expect that the case with multiple
identity columns will be rare.

Re: partitioning and identity column

От

Peter Eisentraut

Дата:

08 мая, 00:04:51

On 30.04.24 12:59, Ashutosh Bapat wrote:
> PFA patch which fixes all the three problems.

I have committed this patch.  Thanks all.

Re: partitioning and identity column

От

Ashutosh Bapat

Дата:

09 мая, 23:32:08

Thanks a lot Peter.

On Wed, May 8, 2024 at 2:34 AM Peter Eisentraut <peter@eisentraut.org> wrote:

On 30.04.24 12:59, Ashutosh Bapat wrote:
> PFA patch which fixes all the three problems.

I have committed this patch. Thanks all.

Best Wishes,

Ashutosh Bapat

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: partitioning and identity column

Вложения

Вложения

Вложения

Вложения