Обсуждение: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

Поиск

Список

Период

Сортировка

Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

11 декабря 2025 г., 12:15:51

Hi,
While working with logical replication and partitioned tables, I noticed an inconsistency between how publications treat partitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves.

When a publication is created on a partitioned table, e.g.:

```
CREATE PUBLICATION pub FOR TABLE parent;
```

PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s expectation that a partitioned table behaves as a single logical entity.

However, if the user then runs:
```

ALTER TABLE parent REPLICA IDENTITY FULL;
```

only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the parent itself has no storage and its replication identity plays no role in logical replication. Logical decoding always operates on the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated safely.

This gap leads to several problems:

* The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores tuples or produces WAL.

* Users expect that altering the replica identity on the partitioned table would apply to all partitions that are implicitly included in the publication.

* As a result, users currently need to run ALTER TABLE ... REPLICA IDENTITY separately for every leaf partition, which is tedious and error-prone on large partition hierarchies.

* Misconfiguration usually surfaces only when logical replication starts failing on UPDATE/DELETE for specific leaf partitions due to mismatched replica identities.

To address this, the attached patch makes:
```

ALTER TABLE parent REPLICA IDENTITY <type>
```

cascade the new setting to all leaf partitions of the table. Partitioned tables (RELKIND_PARTITIONED_TABLE) are skipped since they have no storage and no effective replica identity.

This aligns ALTER TABLE behavior with how publications already expand partitioned tables, and makes replication identity configuration consistent with logical replication semantics.

The attached patch is not yet fully ready for detailed review, this is more of a proof-of-concept. At this stage, I mainly want to see whether people agree with the idea, or if there are objections to cascading replica identity changes for partitioned tables before I refine the patch further.

Comments and feedback are welcome.

Best regards,

--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Вложения

v1-0001-tablecmds-Cascade-REPLICA-IDENTITY-changes-to-lea.patch

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Amit Kapila

Дата:

11 декабря 2025 г., 15:43:07

On Thu, Dec 11, 2025 at 2:46 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi,
> While working with logical replication and partitioned tables, I noticed an inconsistency between how publications
treatpartitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves. 
>
> When a publication is created on a partitioned table, e.g.:
> ```
> CREATE PUBLICATION pub FOR TABLE parent;
> ```
>
> PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s
expectationthat a partitioned table behaves as a single logical entity. 
>
> However, if the user then runs:
> ```
> ALTER TABLE parent REPLICA IDENTITY FULL;
> ```
> only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the
parentitself has no storage and its replication identity plays no role in logical replication. Logical decoding always
operateson the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated
safely.
>
> This gap leads to several problems:
>
> * The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores
tuplesor produces WAL. 
>

When we use row filters, if publish_via_partition_root option of
publication is true, the root partitioned table's row filter is used.
I think this would then refer RI of partitioned table for validity of
row filter. Please see docs [1] (There can be a case where a
subscription combines multiple publications. If a partitioned table is
published by any subscribed publications which set
publish_via_partition_root = true, changes on this partitioned table
(or on its partitions) will be published using the identity and schema
of this partitioned table rather than that of the individual
partitions. This parameter also affects how row filters and column
lists are chosen for partitions; see below for details.) for more
details.

I have not tested it but you can once try to see how it behaves.

The other point is what if one of the partition already has RI defined
to a different value than what is defined for parent table?

[1] - https://www.postgresql.org/docs/devel/sql-createpublication.html

--
With Regards,
Amit Kapila.

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

12 декабря 2025 г., 06:58:42


> On Dec 11, 2025, at 20:43, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 2:46 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>> Hi,
>> While working with logical replication and partitioned tables, I noticed an inconsistency between how publications
treatpartitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves. 
>>
>> When a publication is created on a partitioned table, e.g.:
>> ```
>> CREATE PUBLICATION pub FOR TABLE parent;
>> ```
>>
>> PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s
expectationthat a partitioned table behaves as a single logical entity. 
>>
>> However, if the user then runs:
>> ```
>> ALTER TABLE parent REPLICA IDENTITY FULL;
>> ```
>> only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the
parentitself has no storage and its replication identity plays no role in logical replication. Logical decoding always
operateson the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated
safely.
>>
>> This gap leads to several problems:
>>
>> * The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores
tuplesor produces WAL. 
>>
>
> When we use row filters, if publish_via_partition_root option of
> publication is true, the root partitioned table's row filter is used.
> I think this would then refer RI of partitioned table for validity of
> row filter. Please see docs [1] (There can be a case where a
> subscription combines multiple publications. If a partitioned table is
> published by any subscribed publications which set
> publish_via_partition_root = true, changes on this partitioned table
> (or on its partitions) will be published using the identity and schema
> of this partitioned table rather than that of the individual
> partitions. This parameter also affects how row filters and column
> lists are chosen for partitions; see below for details.) for more
> details.
>
> I have not tested it but you can once try to see how it behaves.
>
> The other point is what if one of the partition already has RI defined
> to a different value than what is defined for parent table?
>
> [1] - https://www.postgresql.org/docs/devel/sql-createpublication.html
>
> --
> With Regards,
> Amit Kapila.

Hi Amit,

Thanks for pointing out that my assumption of “RI of parent is not used” is not always true.

I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about to
extendthe ALTER TABLE syntax like: 

```
ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE]
```

So, that the current syntax will behave the same as usual, and

With CASCADE
============
1. Root's RI updated
2. All children (including middle partitioned tables and leaf tables) RI updated unless 3
3. If any child’s RI is different from the root's RI, fail out, no change happens

With CASCADE FORCE
===================
1. Root's RI updated
2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a
child’sRI is different from root’s RI 

"ALTER TABLE parent REPLICA IDENTITY” is a PG specific syntax, so the change won’t break the SQL standard. And
“CASCADE”is known keyword that has been used in many SQL commands. 

I can see the usefulness of “CASCADE” when a partitioned table has many partitions. A single command will be able to
updateall partitions’ RI. 

What do you think?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Amit Kapila

Дата:

13 декабря 2025 г., 14:10:46

On Fri, Dec 12, 2025 at 9:28 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi Amit,
>
> Thanks for pointing out that my assumption of “RI of parent is not used” is not always true.
>
> I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about
toextend the ALTER TABLE syntax like: 
>
> ```
> ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE]
> ```
>

CASCADE is used for dependent objects, so I don't think using it will
be appropriate in this context. However, the INHERIT (NO INHERIT)
could be used. We already use them for constraints, see ALTER TABLE
... ALTER CONSTRAINT syntax in docs.

> So, that the current syntax will behave the same as usual, and
>
> With CASCADE
> ============
> 1. Root's RI updated
> 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3
> 3. If any child’s RI is different from the root's RI, fail out, no change happens
>
> With CASCADE FORCE
> ===================
> 1. Root's RI updated
> 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a
child’sRI is different from root’s RI 
>

I think you can try to experiment with CHECK or NOT NULL constraint
behavior for similar cases in case of partition tables.

BTW, did you get this use case in the field or just browsing docs, you
thought it would be useful to have such a feature?

--
With Regards,
Amit Kapila.

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

15 декабря 2025 г., 10:13:52

Hi Amit,

Thank you very much for the guidance.

> On Dec 13, 2025, at 19:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 12, 2025 at 9:28 AM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>> Hi Amit,
>>
>> Thanks for pointing out that my assumption of “RI of parent is not used” is not always true.
>>
>> I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about
toextend the ALTER TABLE syntax like: 
>>
>> ```
>> ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE]
>> ```
>>
>
> CASCADE is used for dependent objects, so I don't think using it will
> be appropriate in this context. However, the INHERIT (NO INHERIT)
> could be used. We already use them for constraints, see ALTER TABLE
> ... ALTER CONSTRAINT syntax in docs.
>
>> So, that the current syntax will behave the same as usual, and
>>
>> With CASCADE
>> ============
>> 1. Root's RI updated
>> 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3
>> 3. If any child’s RI is different from the root's RI, fail out, no change happens
>>
>> With CASCADE FORCE
>> ===================
>> 1. Root's RI updated
>> 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a
child’sRI is different from root’s RI 
>>
>
> I think you can try to experiment with CHECK or NOT NULL constraint
> behavior for similar cases in case of partition tables.
>
> BTW, did you get this use case in the field or just browsing docs, you
> thought it would be useful to have such a feature?
>

The main problem I am trying to solve is [1], where you are already in the thread. This is a real pain point reported
byour users and field teams. While working on [1], I noticed this additional issue during my own tests. I then
discussedit with our field teams, and they confirmed that such a feature would be very helpful in practice. We have
manydeployments where a single partitioned table has several thousands of partitions, and having a fast, single command
toupdate the replica identity of all partitions would significantly simplify operations. 

I also confirmed one thing with the field teams: across our deployments (my company has 100K+ deployments in China),
theyhave never seen a case where partitions under the same parent/root use different replica identities. In theory,
thisis allowed, since RI can be set per partition, but in practice I am not sure whether such a use case really exists. 

Currently, replica identity is not inheritable for partitions. I verified this behavior: if I create a partitioned
table,alter its RI to FULL, and then create a new partition, the new partition still uses DEFAULT. If we keep this
behavior,we can easily run into a scenario like this: 

* create a partitioned table with 10 partitions
* ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT   -- assume this feature exists
* create 5 new partitions
* ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT -- conflict occurs

In this case, a conflict occurs because the newly created partitions still have DEFAULT RI, but this is not the user’s
intention.

From this perspective, when a new partition is created, it should automatically inherit the parent’s RI. With that
behavior,a “FORCE” option would rarely be needed, because having one partition use a different RI from its siblings
shouldbe an uncommon case. 

Based on this, I imagine the feature roughly like this:

* When a new partition is created, it inherits its parent’s RI
* ALTER TABLE <table_name> REPLICA IDENTITY [ INHERIT | NO INHERIT ]  -- error out on conflicts

This leads to a couple of follow-up questions:

* Should RI be switchable between “inheritable” and “non-inheritable”, similar to constraints? IMO, no. RI is much
simplerthan constraints. For constraints, parent–child relationships exist between tables with potentially different
structures,so it is natural that child tables might have different constraints. RI, however, only applies to
partitionedtables, where partitions must share the same structure as the parent. In practice, it seems rare for
partitionsto intentionally use a different RI than the parent. 

* Should publish_via_partition_root in publications affect this feature? IMO, no. A table can belong to multiple
publications,and different publications may have different publish_via_partition_root settings. 

[1] https://postgr.es/m/CAEoWx2mMorbMwjKbT4YCsjDyL3r9Mp+z0bbK57VZ+OkJTgJQVQ@mail.gmail.com

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Amit Kapila

Дата:

16 декабря 2025 г., 13:12:35

On Mon, Dec 15, 2025 at 12:44 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Thank you very much for the guidance.
>
> > On Dec 13, 2025, at 19:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Dec 12, 2025 at 9:28 AM Chao Li <li.evan.chao@gmail.com> wrote:
> >>
> >> Hi Amit,
> >>
> >> Thanks for pointing out that my assumption of “RI of parent is not used” is not always true.
> >>
> >> I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how
aboutto extend the ALTER TABLE syntax like: 
> >>
> >> ```
> >> ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE]
> >> ```
> >>
> >
> > CASCADE is used for dependent objects, so I don't think using it will
> > be appropriate in this context. However, the INHERIT (NO INHERIT)
> > could be used. We already use them for constraints, see ALTER TABLE
> > ... ALTER CONSTRAINT syntax in docs.
> >
> >> So, that the current syntax will behave the same as usual, and
> >>
> >> With CASCADE
> >> ============
> >> 1. Root's RI updated
> >> 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3
> >> 3. If any child’s RI is different from the root's RI, fail out, no change happens
> >>
> >> With CASCADE FORCE
> >> ===================
> >> 1. Root's RI updated
> >> 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a
child’sRI is different from root’s RI 
> >>
> >
> > I think you can try to experiment with CHECK or NOT NULL constraint
> > behavior for similar cases in case of partition tables.
> >
> > BTW, did you get this use case in the field or just browsing docs, you
> > thought it would be useful to have such a feature?
> >
>
> The main problem I am trying to solve is [1], where you are already in the thread. This is a real pain point reported
byour users and field teams. While working on [1], I noticed this additional issue during my own tests. I then
discussedit with our field teams, and they confirmed that such a feature would be very helpful in practice. We have
manydeployments where a single partitioned table has several thousands of partitions, and having a fast, single command
toupdate the replica identity of all partitions would significantly simplify operations. 
>
> I also confirmed one thing with the field teams: across our deployments (my company has 100K+ deployments in China),
theyhave never seen a case where partitions under the same parent/root use different replica identities. In theory,
thisis allowed, since RI can be set per partition, but in practice I am not sure whether such a use case really exists. 
>
> Currently, replica identity is not inheritable for partitions. I verified this behavior: if I create a partitioned
table,alter its RI to FULL, and then create a new partition, the new partition still uses DEFAULT. If we keep this
behavior,we can easily run into a scenario like this: 
>
> * create a partitioned table with 10 partitions
> * ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT   -- assume this feature exists
> * create 5 new partitions
> * ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT -- conflict occurs
>
> In this case, a conflict occurs because the newly created partitions still have DEFAULT RI, but this is not the
user’sintention. 
>
> From this perspective, when a new partition is created, it should automatically inherit the parent’s RI. With that
behavior,a “FORCE” option would rarely be needed, because having one partition use a different RI from its siblings
shouldbe an uncommon case. 
>
> Based on this, I imagine the feature roughly like this:
>
> * When a new partition is created, it inherits its parent’s RI
> * ALTER TABLE <table_name> REPLICA IDENTITY [ INHERIT | NO INHERIT ]  -- error out on conflicts
>

Sounds reasonable to me.

> This leads to a couple of follow-up questions:
>
> * Should RI be switchable between “inheritable” and “non-inheritable”, similar to constraints? IMO, no. RI is much
simplerthan constraints. For constraints, parent–child relationships exist between tables with potentially different
structures,so it is natural that child tables might have different constraints. RI, however, only applies to
partitionedtables, where partitions must share the same structure as the parent. In practice, it seems rare for
partitionsto intentionally use a different RI than the parent. 
>

I see your point but I think we should provide resetting the option
unless it is too complex or not feasible.

> * Should publish_via_partition_root in publications affect this feature? IMO, no. A table can belong to multiple
publications,and different publications may have different publish_via_partition_root settings. 
>

I also don't think so.

--
With Regards,
Amit Kapila.

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

"Euler Taveira"

Дата:

17 декабря 2025 г., 03:10:33

On Tue, Dec 16, 2025, at 7:12 AM, Amit Kapila wrote:
> On Mon, Dec 15, 2025 at 12:44 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>> From this perspective, when a new partition is created, it should automatically inherit the parent’s RI. With that
behavior,a “FORCE” option would rarely be needed, because having one partition use a different RI from its siblings
shouldbe an uncommon case. 
>>
>> Based on this, I imagine the feature roughly like this:
>>
>> * When a new partition is created, it inherits its parent’s RI
>> * ALTER TABLE <table_name> REPLICA IDENTITY [ INHERIT | NO INHERIT ]  -- error out on conflicts
>>
>
> Sounds reasonable to me.
>

+1.

>> This leads to a couple of follow-up questions:
>>
>> * Should RI be switchable between “inheritable” and “non-inheritable”, similar to constraints? IMO, no. RI is much
simplerthan constraints. For constraints, parent–child relationships exist between tables with potentially different
structures,so it is natural that child tables might have different constraints. RI, however, only applies to
partitionedtables, where partitions must share the same structure as the parent. In practice, it seems rare for
partitionsto intentionally use a different RI than the parent. 
>>
>
> I see your point but I think we should provide resetting the option
> unless it is too complex or not feasible.
>

+1.

ALTER TABLE ...
  REPLICA IDENTITY { DEFAULT [ INHERIT | NO INHERIT ] |
                     USING INDEX index_name |
                     FULL [ INHERIT | NO INHERIT ] |
                     NOTHING [ INHERIT | NO INHERIT ] }

It doesn't make sense to have the inheritance property for INDEX. Although the
natural default value is NO INHERIT to preserve backward compatibility, I
wondering if we use INHERIT as default. The main advantage is usability as Chao
Li already mentioned. Is there any cases that having a different replica
identity from parent/partitioned table makes sense?


--
Euler Taveira
EDB   https://www.enterprisedb.com/

Вложения

ri.sql

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

17 декабря 2025 г., 10:56:18

Hi Amit and Euler,

Thank you both for all your advice. Here comes my first implementation of INHERIT in the attached v2 patch.

On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira <euler@eulerto.com> wrote:

I wondering if we use INHERIT as default. The main advantage is usability as Chao
Li already mentioned. Is there any cases that having a different replica
identity from parent/partitioned table makes sense?

We can leave this topic open for discussion. In my current implementation, NO INHERIT is still the default. But if we decide to switch the default, I can add a new commit that should include only 1 line code change in gram.y and a tiny doc update.

0001 - when a new partition is created, use the parent's replication identity

0002 - add INHERIT | NO INHERIT

Best regards,

--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Вложения

RE: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

"Zhijie Hou (Fujitsu)"

Дата:

17 декабря 2025 г., 11:48:33

On Wednesday, December 17, 2025 3:56 PM Chao Li <li.evan.chao@gmail.com>  wrote:
> Thank you both for all your advice. Here comes my first implementation of
> INHERIT in the attached v2 patch.
> 
> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira <mailto:euler@eulerto.com> wrote:
>  
> > I wondering if we use INHERIT as default. The main advantage is usability as
> > Chao Li already mentioned. Is there any cases that having a different
> > replica identity from parent/partitioned table makes sense?
> 
> We can leave this topic open for discussion. In my current implementation, NO
> INHERIT is still the default. But if we decide to switch the default, I can add
> a new commit that should include only 1 line code change in gram.y and a tiny
> doc update.
> 
> 0001 - when a new partition is created, use the parent's replication identity
> 0002 - add INHERIT | NO INHERIT

Thanks for updating the patches.

I think there are several design considerations for this proposal:

1) Since the index names can vary across different partitions, what should be the
expected behavior if a new partition cannot identify the same replica identity
key as the root partitioned table?

2) Should we simply use the ONLY keyword to determine whether to propagate the
replica identity to partitions instead of adding [NOT] INHERIT? For example, if
a user specifies ONLY, it changes the identity of the parent table, and any
newly created partitions will adopt this new identity. However, the identities
of existing partitions remain unchanged.

3) There have been previous discussions on similar proposals[1][2]. It might be
beneficial to review those debates to see whether any old issues or arguments
are pertinent to this proposal.

[1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
[2]
https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06

Best Regards,
Hou zj

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

18 декабря 2025 г., 07:21:07

> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, December 17, 2025 3:56 PM Chao Li <li.evan.chao@gmail.com>  wrote:
>> Thank you both for all your advice. Here comes my first implementation of
>> INHERIT in the attached v2 patch.
>>
>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira <mailto:euler@eulerto.com> wrote:
>>
>>> I wondering if we use INHERIT as default. The main advantage is usability as
>>> Chao Li already mentioned. Is there any cases that having a different
>>> replica identity from parent/partitioned table makes sense?
>>
>> We can leave this topic open for discussion. In my current implementation, NO
>> INHERIT is still the default. But if we decide to switch the default, I can add
>> a new commit that should include only 1 line code change in gram.y and a tiny
>> doc update.
>>
>> 0001 - when a new partition is created, use the parent's replication identity
>> 0002 - add INHERIT | NO INHERIT
>

Hi Zhijie,

Thanks for your feedback and linked information. I think this patch is avoiding the hard problem of “index” RI.

>
> I think there are several design considerations for this proposal:
>
> 1) Since the index names can vary across different partitions, what should be the
> expected behavior if a new partition cannot identify the same replica identity
> key as the root partitioned table?

Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and NONE.

>
> 2) Should we simply use the ONLY keyword to determine whether to propagate the
> replica identity to partitions instead of adding [NOT] INHERIT? For example, if
> a user specifies ONLY, it changes the identity of the parent table, and any
> newly created partitions will adopt this new identity. However, the identities
> of existing partitions remain unchanged.

The current syntax “ALTER TABLE [ONLY] table_name action”, I think here “ONLY” works in terms of inherited tables but
partitions.(I surprisedly find the doc doesn’t explain “ONLY” at all.) The current behavior (without my patch) proves
myunderstanding, “ALTER TABLE table_name REPLICA IDENTITY” only updates the parent table itself. 

Given that it is not allowed to inherit a table from either a partitioned table (root/parent) or partition (leaf),
reusingthe “ONLY" at “ALTER TABLE” level won’t generate a conflict. But the problem is, we will have to revisit all
ALTERTABLE actions to see if they can propagate to partitions when the target table is a partitioned table. In that
case,scope of this patch is extremely extended. 

The current approach adding INHERIT to the action “REPLICA IDENTIFY” has no global impacts. If you look at the patch,
thesyntax this patch uses is: 
```
ALTER TABLE ...
 REPLICA IDENTITY { DEFAULT [ INHERIT | NO INHERIT ] |
                    USING INDEX index_name |
                    FULL [ INHERIT | NO INHERIT ] |
                    NOTHING [ INHERIT | NO INHERIT ] }
```
It explicitly avoids the complexity of “USING INDEX”, thus we don’t have to mention any exceptions in docs.

Amit, what do you think?

>
> 3) There have been previous discussions on similar proposals[1][2]. It might be
> beneficial to review those debates to see whether any old issues or arguments
> are pertinent to this proposal.
>
> [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
> [2]
https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06
>

I read through the both threads. I think the key difference between the patch and the previous one is that this patch
onlyadds “INHERIT” to DEFAULT/FULL/NONE, which is narrow scoped. Let’s see how the folks who involved in the previous
discussionwill feedback to this patch. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

18 декабря 2025 г., 08:31:24

> On Dec 18, 2025, at 12:21, Chao Li <li.evan.chao@gmail.com> wrote:
>
>>
>> 2) Should we simply use the ONLY keyword to determine whether to propagate the
>> replica identity to partitions instead of adding [NOT] INHERIT? For example, if
>> a user specifies ONLY, it changes the identity of the parent table, and any
>> newly created partitions will adopt this new identity. However, the identities
>> of existing partitions remain unchanged.
>
> The current syntax “ALTER TABLE [ONLY] table_name action”, I think here “ONLY” works in terms of inherited tables but
partitions.(I surprisedly find the doc doesn’t explain “ONLY” at all.) The current behavior (without my patch) proves
myunderstanding, “ALTER TABLE table_name REPLICA IDENTITY” only updates the parent table itself. 
>
> Given that it is not allowed to inherit a table from either a partitioned table (root/parent) or partition (leaf),
reusingthe “ONLY" at “ALTER TABLE” level won’t generate a conflict. But the problem is, we will have to revisit all
ALTERTABLE actions to see if they can propagate to partitions when the target table is a partitioned table. In that
case,scope of this patch is extremely extended. 
>
> The current approach adding INHERIT to the action “REPLICA IDENTIFY” has no global impacts. If you look at the patch,
thesyntax this patch uses is: 
> ```
> ALTER TABLE ...
> REPLICA IDENTITY { DEFAULT [ INHERIT | NO INHERIT ] |
>                    USING INDEX index_name |
>                    FULL [ INHERIT | NO INHERIT ] |
>                    NOTHING [ INHERIT | NO INHERIT ] }
> ```
> It explicitly avoids the complexity of “USING INDEX”, thus we don’t have to mention any exceptions in docs.
>
> Amit, what do you think?

I read through the doc of “ALTER TABLE” again, and got some findings.

```
name - The name (optionally schema-qualified) of an existing table to alter. If ONLY is specified before the table
name,only that table is altered. If ONLY is not specified, the table and all its descendant tables (if any) are
altered.Optionally, * can be specified after the table name to explicitly indicate that descendant tables are included. 
```

For table name, it says "If ONLY is specified before the table name, only that table is altered. If ONLY is not
specified,the table and all its descendant tables (if any) are altered.” Here, I think we can “descendant tables” as
bothinherited table or partition table, as they are mutually exclusive. 

And for “DISABLE/ENABLE [ REPLICA | ALWAYS ] TRIGGER”, the doc says:
```
DISABLE/ENABLE [ REPLICA | ALWAYS ] TRIGGER

When this command is applied to a partitioned table, the states of corresponding clone triggers in the partitions are
updatedtoo, unless ONLY is specified. 
```

Here, ONLY is used for partition table. From this perspective, “REPLICA IDENTITY” not propagating to children feels
likea “bug”. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

RE: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

"Zhijie Hou (Fujitsu)"

Дата:

19 декабря 2025 г., 08:44:42

On Thursday, December 18, 2025 12:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
> 
> > On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> > On Wednesday, December 17, 2025 3:56 PM Chao Li
> <li.evan.chao@gmail.com>  wrote:
> >> Thank you both for all your advice. Here comes my first
> >> implementation of INHERIT in the attached v2 patch.
> >>
> >> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
> <mailto:euler@eulerto.com> wrote:
> >>
> >>> I wondering if we use INHERIT as default. The main advantage is
> >>> usability as Chao Li already mentioned. Is there any cases that
> >>> having a different replica identity from parent/partitioned table makes
> sense?
> >>
> >> We can leave this topic open for discussion. In my current
> >> implementation, NO INHERIT is still the default. But if we decide to
> >> switch the default, I can add a new commit that should include only 1
> >> line code change in gram.y and a tiny doc update.
> >>
> >> 0001 - when a new partition is created, use the parent's replication
> >> identity
> >> 0002 - add INHERIT | NO INHERIT
> >
> 
> Hi Zhijie,
> 
> Thanks for your feedback and linked information. I think this patch is avoiding
> the hard problem of “index” RI.
> 
> >
> > I think there are several design considerations for this proposal:
> >
> > 1) Since the index names can vary across different partitions, what
> > should be the expected behavior if a new partition cannot identify the
> > same replica identity key as the root partitioned table?
> 
> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
> NONE.
> 

I personally find skipping this part to be inconsistent, particularly given the
existing disparities among ALTER TABLE commands related to partitioned table handling.
Omitting this part introduces further inconsistency within the ALTER TABLE
REPLICA IDENTITY.

That said, I understand that skipping this implementation might be necessary due
to technical challenges. It's important, however, to discuss, analyze and document the
specific difficulties associated with implementing this part, and we should
include this information in the code comments and commit message.

Best Regards,
Hou zj

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Amit Kapila

Дата:

22 декабря 2025 г., 09:26:57

On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, December 18, 2025 12:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >
> > > On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > On Wednesday, December 17, 2025 3:56 PM Chao Li
> > <li.evan.chao@gmail.com>  wrote:
> > >> Thank you both for all your advice. Here comes my first
> > >> implementation of INHERIT in the attached v2 patch.
> > >>
> > >> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
> > <mailto:euler@eulerto.com> wrote:
> > >>
> > >>> I wondering if we use INHERIT as default. The main advantage is
> > >>> usability as Chao Li already mentioned. Is there any cases that
> > >>> having a different replica identity from parent/partitioned table makes
> > sense?
> > >>
> > >> We can leave this topic open for discussion. In my current
> > >> implementation, NO INHERIT is still the default. But if we decide to
> > >> switch the default, I can add a new commit that should include only 1
> > >> line code change in gram.y and a tiny doc update.
> > >>
> > >> 0001 - when a new partition is created, use the parent's replication
> > >> identity
> > >> 0002 - add INHERIT | NO INHERIT
> > >
> >
> > Hi Zhijie,
> >
> > Thanks for your feedback and linked information. I think this patch is avoiding
> > the hard problem of “index” RI.
> >
> > >
> > > I think there are several design considerations for this proposal:
> > >
> > > 1) Since the index names can vary across different partitions, what
> > > should be the expected behavior if a new partition cannot identify the
> > > same replica identity key as the root partitioned table?
> >
> > Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
> > NONE.
> >
>
> I personally find skipping this part to be inconsistent, particularly given the
> existing disparities among ALTER TABLE commands related to partitioned table handling.
> Omitting this part introduces further inconsistency within the ALTER TABLE
> REPLICA IDENTITY.
>

Fair point. I think one should summarize the previous discussions with
key differences and where the previous patch got stuck. Then, it would
be good to get some feedback from the people involved previously. If
there is an agreement that we can do INHERIT stuff for specific parts
then fine, otherwise, I think we need to address the index part as
well.

--
With Regards,
Amit Kapila.

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

29 декабря 2025 г., 11:24:55

> On Dec 22, 2025, at 14:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
>>
>> On Thursday, December 18, 2025 12:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>
>>>> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
>>> wrote:
>>>>
>>>> On Wednesday, December 17, 2025 3:56 PM Chao Li
>>> <li.evan.chao@gmail.com>  wrote:
>>>>> Thank you both for all your advice. Here comes my first
>>>>> implementation of INHERIT in the attached v2 patch.
>>>>>
>>>>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
>>> <mailto:euler@eulerto.com> wrote:
>>>>>
>>>>>> I wondering if we use INHERIT as default. The main advantage is
>>>>>> usability as Chao Li already mentioned. Is there any cases that
>>>>>> having a different replica identity from parent/partitioned table makes
>>> sense?
>>>>>
>>>>> We can leave this topic open for discussion. In my current
>>>>> implementation, NO INHERIT is still the default. But if we decide to
>>>>> switch the default, I can add a new commit that should include only 1
>>>>> line code change in gram.y and a tiny doc update.
>>>>>
>>>>> 0001 - when a new partition is created, use the parent's replication
>>>>> identity
>>>>> 0002 - add INHERIT | NO INHERIT
>>>>
>>>
>>> Hi Zhijie,
>>>
>>> Thanks for your feedback and linked information. I think this patch is avoiding
>>> the hard problem of “index” RI.
>>>
>>>>
>>>> I think there are several design considerations for this proposal:
>>>>
>>>> 1) Since the index names can vary across different partitions, what
>>>> should be the expected behavior if a new partition cannot identify the
>>>> same replica identity key as the root partitioned table?
>>>
>>> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
>>> NONE.
>>>
>>
>> I personally find skipping this part to be inconsistent, particularly given the
>> existing disparities among ALTER TABLE commands related to partitioned table handling.
>> Omitting this part introduces further inconsistency within the ALTER TABLE
>> REPLICA IDENTITY.
>>
>
> Fair point. I think one should summarize the previous discussions with
> key differences and where the previous patch got stuck. Then, it would
> be good to get some feedback from the people involved previously. If
> there is an agreement that we can do INHERIT stuff for specific parts
> then fine, otherwise, I think we need to address the index part as
> well.

> [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
> [2]
https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06

Sure, let me try to summarize the two discussions.

For [1], key participants included Álvaro Herrera (patch author), Dmitry Dolgov (reviewer/tester), Robert Haas, Simon
Riggs,Michael Paquier, and Peter Eisentraut. 
----

Brief summary with rough timeline:

* In 2019, Álvaro proposed a patch that essentially did the same thing as this patch. That patch also attempted to
handle“index-based” replica identity, which my patch intentionally avoids. 
* Dmitry tested the patch and reported issues related to attaching partitions, including index-related errors.
* Robert then pointed out that having REPLICA IDENTITY recurse was inconsistent with the behavior of other ALTER TABLE
actionssuch as TABLESPACE, and emphasized that we should really try to think through *all* actions that might recurse
tochild partitions. 
* Álvaro noted the use of `ALTER TABLE ONLY`, suggesting that WITHOUT ONLY could recurse to children, while ONLY would
affectjust the parent. 
* Simon commented that recursing TABLESPACE changes could be problematic because they imply physical data rewrites.
* Robert listed several ALTER TABLE actions that lacked consistent recurse behavior (identity columns, triggers,
CLUSTER,OWNER, TABLESPACE, CHECK constraints). 
* This led to broader discussion about whether TABLESPACE/OWNER/etc. should recurse.
* Michael echoed support for having REPLICA IDENTITY recurse, controlled via ONLY.
* Peter pointed out that recursing ADD GENERATED AS IDENTITY / DROP IDENTITY may not be feasible.
* Álvaro wanted to proceed with the patch.
* Robert maintained that defining consistent semantics for all relevant ALTER TABLE actions should come first.

Overall, the blocker was an unresolved semantic disagreement, rather than a concrete technical objection to the patch
itself.There appeared to be broad agreement that: 
* New partitions should use the parent’s replica identity.
* ONLY could be used to control whether replica identity changes recurse.

However, the discussion about “semantic consistency” significantly broadened the scope, and there was no clear
agreementon whether TABLESPACE/OWNER/etc. should recurse, which ultimately stalled the effort. 

For [2], key participants included: Takayuki Tsunakawa (patch author), Bharath Rupireddy (reviewer), Álvaro Herrera,
MichaelPaquier, and Kyotaro Horiguchi. 
----

Brief summary with rough timeline:

* In 2020, Takayuki proposed a patch intended to propagate ALTER TABLE SET LOGGED / UNLOGGED to partitions.
* Bharath reviewed the patch and raised a number of questions and edge cases.
* There were initial discussions about the patch mechanics and expected behavior.
* Álvaro then pointed out the strong relation to the earlier discussion in [1].
* The focus of the discussion shifted to the more fundamental question of “what is the parent?”:
  * Takayuki viewed ALTER TABLE on a partitioned table as a convenient batch operation on existing partitions, with
futurepartitions remaining independent. 
  * Álvaro, Michael, and Kyotaro argued that changing a property on the parent should define the default for future
partitionsas well. 
* No clear agreement was reached on this semantic question, and the discussion expanded into broader concerns about
consistencyacross ALTER TABLE actions. 
* Takayuki withdrew the patch

Overall, [2] also fell under the umbrella of “semantic consistency”, the main discussion was not about replica identity
itself.

Current situation:
----

* Resolving “semantic consistency” across ALTER TABLE actions in a single release appears to be the biggest challenge.
However,addressing everything in one patch set does not seem realistic. 
* The core question of “what is the parent?” from [1] remains central. That said, the discussion appeared to lean
towardthe view that future partitions should inherit properties from the parent. 
* Different properties behave very differently. For example, propagating REPLICA IDENTITY is a metadata-only change and
relativelysafe, while propagating TABLESPACE implies physical data movement and is much riskier. As a result, each
ALTERTABLE action may deserve its own discussion and patch set. 
* The inconsistency is not limited to ALTER TABLE but also exists in CREATE TABLE behavior. For example, a new
partitioninherits TABLESPACE from the parent, but not REPLICA IDENTITY. 
* “ALTER TABLE ONLY table_name” is commonly suggested as the mechanism to control whether changes should recurse to
partitions.

How to proceed?
----

If we stop here, these inconsistencies will remain indefinitely, which I believe nobody really wants. With that in
mind,I’d like to suggest a two-phase approach. 

Phase 1: Document current behavior and set expectations

* Identify all ALTER TABLE actions involved in these inconsistencies.
* Update the ALTER TABLE and CREATE TABLE documentation to clearly describe the current behavior for partitioned
tables,and (where appropriate) the intended or ideal behavior. 
* Explicitly document the meaning of ONLY for partitioned tables, and note that some actions may behave differently,
withdetails described in each action’s section. 

Phase 2: Address actions incrementally

* Work on each ALTER TABLE action individually, recognizing that some may be straightforward while others require more
designdiscussion, coding, and testing. 
* With Phase 1 in place to set expectations, it may not be necessary to complete all actions in a single release.
* That said, it would still be desirable to keep the work bounded, for example within one or two major releases, to
avoidlong-term fragmentation. 

I’ve included the participants from the previous discussions on CC, in case they want to comment further.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

26 января, 05:51:10


> On Dec 29, 2025, at 16:24, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
>> On Dec 22, 2025, at 14:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu)
>> <houzj.fnst@fujitsu.com> wrote:
>>>
>>> On Thursday, December 18, 2025 12:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>>
>>>>> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
>>>> wrote:
>>>>>
>>>>> On Wednesday, December 17, 2025 3:56 PM Chao Li
>>>> <li.evan.chao@gmail.com>  wrote:
>>>>>> Thank you both for all your advice. Here comes my first
>>>>>> implementation of INHERIT in the attached v2 patch.
>>>>>>
>>>>>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
>>>> <mailto:euler@eulerto.com> wrote:
>>>>>>
>>>>>>> I wondering if we use INHERIT as default. The main advantage is
>>>>>>> usability as Chao Li already mentioned. Is there any cases that
>>>>>>> having a different replica identity from parent/partitioned table makes
>>>> sense?
>>>>>>
>>>>>> We can leave this topic open for discussion. In my current
>>>>>> implementation, NO INHERIT is still the default. But if we decide to
>>>>>> switch the default, I can add a new commit that should include only 1
>>>>>> line code change in gram.y and a tiny doc update.
>>>>>>
>>>>>> 0001 - when a new partition is created, use the parent's replication
>>>>>> identity
>>>>>> 0002 - add INHERIT | NO INHERIT
>>>>>
>>>>
>>>> Hi Zhijie,
>>>>
>>>> Thanks for your feedback and linked information. I think this patch is avoiding
>>>> the hard problem of “index” RI.
>>>>
>>>>>
>>>>> I think there are several design considerations for this proposal:
>>>>>
>>>>> 1) Since the index names can vary across different partitions, what
>>>>> should be the expected behavior if a new partition cannot identify the
>>>>> same replica identity key as the root partitioned table?
>>>>
>>>> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
>>>> NONE.
>>>>
>>>
>>> I personally find skipping this part to be inconsistent, particularly given the
>>> existing disparities among ALTER TABLE commands related to partitioned table handling.
>>> Omitting this part introduces further inconsistency within the ALTER TABLE
>>> REPLICA IDENTITY.
>>>
>>
>> Fair point. I think one should summarize the previous discussions with
>> key differences and where the previous patch got stuck. Then, it would
>> be good to get some feedback from the people involved previously. If
>> there is an agreement that we can do INHERIT stuff for specific parts
>> then fine, otherwise, I think we need to address the index part as
>> well.
>
>> [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
>> [2]
https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06
>
> Sure, let me try to summarize the two discussions.
>
> For [1], key participants included Álvaro Herrera (patch author), Dmitry Dolgov (reviewer/tester), Robert Haas, Simon
Riggs,Michael Paquier, and Peter Eisentraut. 
> ----
>
> Brief summary with rough timeline:
>
> * In 2019, Álvaro proposed a patch that essentially did the same thing as this patch. That patch also attempted to
handle“index-based” replica identity, which my patch intentionally avoids. 
> * Dmitry tested the patch and reported issues related to attaching partitions, including index-related errors.
> * Robert then pointed out that having REPLICA IDENTITY recurse was inconsistent with the behavior of other ALTER
TABLEactions such as TABLESPACE, and emphasized that we should really try to think through *all* actions that might
recurseto child partitions. 
> * Álvaro noted the use of `ALTER TABLE ONLY`, suggesting that WITHOUT ONLY could recurse to children, while ONLY
wouldaffect just the parent. 
> * Simon commented that recursing TABLESPACE changes could be problematic because they imply physical data rewrites.
> * Robert listed several ALTER TABLE actions that lacked consistent recurse behavior (identity columns, triggers,
CLUSTER,OWNER, TABLESPACE, CHECK constraints). 
> * This led to broader discussion about whether TABLESPACE/OWNER/etc. should recurse.
> * Michael echoed support for having REPLICA IDENTITY recurse, controlled via ONLY.
> * Peter pointed out that recursing ADD GENERATED AS IDENTITY / DROP IDENTITY may not be feasible.
> * Álvaro wanted to proceed with the patch.
> * Robert maintained that defining consistent semantics for all relevant ALTER TABLE actions should come first.
>
> Overall, the blocker was an unresolved semantic disagreement, rather than a concrete technical objection to the patch
itself.There appeared to be broad agreement that: 
> * New partitions should use the parent’s replica identity.
> * ONLY could be used to control whether replica identity changes recurse.
>
> However, the discussion about “semantic consistency” significantly broadened the scope, and there was no clear
agreementon whether TABLESPACE/OWNER/etc. should recurse, which ultimately stalled the effort. 
>
> For [2], key participants included: Takayuki Tsunakawa (patch author), Bharath Rupireddy (reviewer), Álvaro Herrera,
MichaelPaquier, and Kyotaro Horiguchi. 
> ----
>
> Brief summary with rough timeline:
>
> * In 2020, Takayuki proposed a patch intended to propagate ALTER TABLE SET LOGGED / UNLOGGED to partitions.
> * Bharath reviewed the patch and raised a number of questions and edge cases.
> * There were initial discussions about the patch mechanics and expected behavior.
> * Álvaro then pointed out the strong relation to the earlier discussion in [1].
> * The focus of the discussion shifted to the more fundamental question of “what is the parent?”:
>  * Takayuki viewed ALTER TABLE on a partitioned table as a convenient batch operation on existing partitions, with
futurepartitions remaining independent. 
>  * Álvaro, Michael, and Kyotaro argued that changing a property on the parent should define the default for future
partitionsas well. 
> * No clear agreement was reached on this semantic question, and the discussion expanded into broader concerns about
consistencyacross ALTER TABLE actions. 
> * Takayuki withdrew the patch
>
> Overall, [2] also fell under the umbrella of “semantic consistency”, the main discussion was not about replica
identityitself. 
>
> Current situation:
> ----
>
> * Resolving “semantic consistency” across ALTER TABLE actions in a single release appears to be the biggest
challenge.However, addressing everything in one patch set does not seem realistic. 
> * The core question of “what is the parent?” from [1] remains central. That said, the discussion appeared to lean
towardthe view that future partitions should inherit properties from the parent. 
> * Different properties behave very differently. For example, propagating REPLICA IDENTITY is a metadata-only change
andrelatively safe, while propagating TABLESPACE implies physical data movement and is much riskier. As a result, each
ALTERTABLE action may deserve its own discussion and patch set. 
> * The inconsistency is not limited to ALTER TABLE but also exists in CREATE TABLE behavior. For example, a new
partitioninherits TABLESPACE from the parent, but not REPLICA IDENTITY. 
> * “ALTER TABLE ONLY table_name” is commonly suggested as the mechanism to control whether changes should recurse to
partitions.
>
> How to proceed?
> ----
>
> If we stop here, these inconsistencies will remain indefinitely, which I believe nobody really wants. With that in
mind,I’d like to suggest a two-phase approach. 
>
> Phase 1: Document current behavior and set expectations
>
> * Identify all ALTER TABLE actions involved in these inconsistencies.
> * Update the ALTER TABLE and CREATE TABLE documentation to clearly describe the current behavior for partitioned
tables,and (where appropriate) the intended or ideal behavior. 
> * Explicitly document the meaning of ONLY for partitioned tables, and note that some actions may behave differently,
withdetails described in each action’s section. 
>
> Phase 2: Address actions incrementally
>
> * Work on each ALTER TABLE action individually, recognizing that some may be straightforward while others require
moredesign discussion, coding, and testing. 
> * With Phase 1 in place to set expectations, it may not be necessary to complete all actions in a single release.
> * That said, it would still be desirable to keep the work bounded, for example within one or two major releases, to
avoidlong-term fragmentation. 
>
> I’ve included the participants from the previous discussions on CC, in case they want to comment further.
>

To move this patch forward, I’ve been working on a few related patches in parallel:

    • [1] Enhances the “ALTER TABLE” documentation to clarify how subcommands behave on partitioned tables. In that
discussion,I summarized the current behaviors of all subcommands into 15 categories. 
    • [2] Adds a NOTICE when ONLY is not specified but the action does not recurse to child partitions.
    • [3] Improves the header comments in tablecmds.c to better explain the meaning of “recurse” and “recursing”.

While working on [1], [2] and [3], I gained a deeper understanding of ALTER TABLE behavior. Based on that, v3 is a
completerework compared to v2. The behavior implemented by v3 is: 

    • Recursion is performed during the preparation phase.
    • When ONLY is specified, no recursion occurs; without ONLY, the command recurses to all child partitions.
    • All replica identity types are supported, including USING INDEX (which was not supported in v2).
    • For non-index replica identities, the existing ATSimpleRecursion() infrastructure is reused; for index-based
replicaidentity, the corresponding index name must be determined for each child partition. 

In a previous discussion [4], Dmitry Dolgov pointed out a test case that resulted in a DEADLOCK. I ran that test
againstv3. The test still fails, but I no longer observe a deadlock; instead, the server now crashes during partition
attachment.I will investigate this further. 

While working on v3, I also noticed another issue that may be a bug. When creating an index on a partitioned table, the
indexis automatically created on all partitions. However, if a column type change causes an index rebuild, it appears
thatthe rebuilt index may be missing from child partitions. I will verify this behavior and, if confirmed, discuss it
ina separate thread. 

At this point, I think v3 is in a reviewable state.

[1] https://postgr.es/m/CAEoWx2%3DmYhCfsnHaN96Qqwq5b0GVS2YgO3zpVqPPRd_iO52wRw%40mail.gmail.com
[2] https://postgr.es/m/CAEoWx2=SLga-xH09Cq_PAvsHhQHrBK+V0vF821JKgzS=Bm0haA@mail.gmail.com
[3] https://postgr.es/m/CAEoWx2n9E6_zxPbqwMpaPuC1C_p4b3y635SjiDuCPSVm8GBjsA@mail.gmail.com
[4] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Вложения

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

26 января, 11:56:36

> On Jan 26, 2026, at 10:51, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
> While working on v3, I also noticed another issue that may be a bug. When creating an index on a partitioned table,
theindex is automatically created on all partitions. However, if a column type change causes an index rebuild, it
appearsthat the rebuilt index may be missing from child partitions. I will verify this behavior and, if confirmed,
discussit in a separate thread. 
>

I have verified the problem. The real problem is, after index rebuild, indisreplident flag of index on child partition
willbe lost, which should be a bug. I will start a new thread to give more details of the problem and propose a fix. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Michael Paquier

Дата:

27 января, 09:56:27

On Mon, Jan 26, 2026 at 04:56:36PM +0800, Chao Li wrote:
> I have verified the problem. The real problem is, after index
> rebuild, indisreplident flag of index on child partition will be
> lost, which should be a bug. I will start a new thread to give more
> details of the problem and propose a fix.

That sounds like a real problem, yes.  Thanks for starting a new
thread about that.
--
Michael

Вложения

signature.asc

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

27 января, 10:28:40


> On Jan 27, 2026, at 14:56, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jan 26, 2026 at 04:56:36PM +0800, Chao Li wrote:
>> I have verified the problem. The real problem is, after index
>> rebuild, indisreplident flag of index on child partition will be
>> lost, which should be a bug. I will start a new thread to give more
>> details of the problem and propose a fix.
>
> That sounds like a real problem, yes.  Thanks for starting a new
> thread about that.
> --
> Michael

This is the new discussion for the bug:
https://www.postgresql.org/message-id/DB533C25-C6BA-4C0F-8046-96168E9CDD72%40gmail.com

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

30 января, 04:22:16


> On Jan 26, 2026, at 10:51, Chao Li <li.evan.chao@gmail.com> wrote:
>
> In a previous discussion [4], Dmitry Dolgov pointed out a test case that resulted in a DEADLOCK. I ran that test
againstv3. The test still fails, but I no longer observe a deadlock; instead, the server now crashes during partition
attachment.I will investigate this further. 

I tried to investigate the server crash yesterday, but I’m no longer able to reproduce it. From the record of the first
crashI encountered, the call stack looked 
like this:
```
TRAP: failed Assert("entry->data.lockmode == BUFFER_LOCK_UNLOCK"), File: "bufmgr.c", Line: 5908, PID: 47991
0   postgres                            0x00000001013d9bb0 ExceptionalCondition + 216
1   postgres                            0x0000000101129a80 BufferLockConditional + 88
2   postgres                            0x0000000101129a04 ConditionalLockBuffer + 224
3   postgres                            0x0000000100b8966c _bt_conditionallockbuf + 28
4   postgres                            0x0000000100b88714 _bt_allocbuf + 128
5   postgres                            0x0000000100b858d4 _bt_split + 1496
6   postgres                            0x0000000100b82cec _bt_insertonpg + 1520
7   postgres                            0x0000000100b81220 _bt_doinsert + 608
8   postgres                            0x0000000100b9a008 btinsert + 120
9   postgres                            0x0000000100b7a224 index_insert + 552
10  postgres                            0x0000000100c5dd50 CatalogIndexInsert + 764
11  postgres                            0x0000000100c5df60 CatalogTupleUpdate + 100
12  postgres                            0x0000000100c7e608 ConstraintSetParentConstraint + 580
13  postgres                            0x0000000100de6598 AttachPartitionEnsureIndexes + 1596
14  postgres                            0x0000000100de5cac attachPartitionTable + 80
15  postgres                            0x0000000100dd864c ATExecAttachPartition + 2520
16  postgres                            0x0000000100dcb8e8 ATExecCmd + 4464
17  postgres                            0x0000000100dc6054 ATRewriteCatalogs + 408
18  postgres                            0x0000000100dbfa18 ATController + 256
19  postgres                            0x0000000100dbf84c AlterTable + 96
20  postgres                            0x00000001011a3508 ProcessUtilitySlow + 1704
21  postgres                            0x00000001011a111c standard_ProcessUtility + 3504
22  postgres                            0x00000001011a035c ProcessUtility + 360
23  postgres                            0x000000010119fa10 PortalRunUtility + 216
24  postgres                            0x000000010119eae0 PortalRunMulti + 688
25  postgres                            0x000000010119e018 PortalRun + 788
26  postgres                            0x0000000101198dcc exec_simple_query + 1380
27  postgres                            0x0000000101197ee8 PostgresMain + 3244
28  postgres                            0x000000010118f8d0 BackendInitialize + 0
29  postgres                            0x0000000101061f3c postmaster_child_launch + 456
30  postgres                            0x00000001010696c8 BackendStartup + 304
31  postgres                            0x0000000101067564 ServerLoop + 372
32  postgres                            0x0000000101066044 PostmasterMain + 6440
33  postgres                            0x0000000100ee40a4 main + 924
34  dyld                                0x000000019a36dd54 start + 7184
2026-01-26 09:52:41.240 CST [46845] LOG:  client backend (PID 47991) was terminated by signal 6: Abort trap: 6
```

I noticed that the Assert in bufmgr.c was removed earlier today by commit 333f58637.

However, with the server crash no longer occurring, the DEADLOCK issue reappeared. After some investigation, I
confirmedthat the deadlock is not specific to this patch, I can consistently reproduce it with ATTACH PARTITION on the
masterbranch. That suggests this is a more general problem. 

I’ll start a new thread to follow up on the deadlock separately.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

От

Chao Li

Дата:

02 февраля, 09:47:53


> On Jan 26, 2026, at 10:51, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
>> On Dec 29, 2025, at 16:24, Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>
>>
>>> On Dec 22, 2025, at 14:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu)
>>> <houzj.fnst@fujitsu.com> wrote:
>>>>
>>>> On Thursday, December 18, 2025 12:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>>>
>>>>>> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
>>>>> wrote:
>>>>>>
>>>>>> On Wednesday, December 17, 2025 3:56 PM Chao Li
>>>>> <li.evan.chao@gmail.com>  wrote:
>>>>>>> Thank you both for all your advice. Here comes my first
>>>>>>> implementation of INHERIT in the attached v2 patch.
>>>>>>>
>>>>>>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
>>>>> <mailto:euler@eulerto.com> wrote:
>>>>>>>
>>>>>>>> I wondering if we use INHERIT as default. The main advantage is
>>>>>>>> usability as Chao Li already mentioned. Is there any cases that
>>>>>>>> having a different replica identity from parent/partitioned table makes
>>>>> sense?
>>>>>>>
>>>>>>> We can leave this topic open for discussion. In my current
>>>>>>> implementation, NO INHERIT is still the default. But if we decide to
>>>>>>> switch the default, I can add a new commit that should include only 1
>>>>>>> line code change in gram.y and a tiny doc update.
>>>>>>>
>>>>>>> 0001 - when a new partition is created, use the parent's replication
>>>>>>> identity
>>>>>>> 0002 - add INHERIT | NO INHERIT
>>>>>>
>>>>>
>>>>> Hi Zhijie,
>>>>>
>>>>> Thanks for your feedback and linked information. I think this patch is avoiding
>>>>> the hard problem of “index” RI.
>>>>>
>>>>>>
>>>>>> I think there are several design considerations for this proposal:
>>>>>>
>>>>>> 1) Since the index names can vary across different partitions, what
>>>>>> should be the expected behavior if a new partition cannot identify the
>>>>>> same replica identity key as the root partitioned table?
>>>>>
>>>>> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
>>>>> NONE.
>>>>>
>>>>
>>>> I personally find skipping this part to be inconsistent, particularly given the
>>>> existing disparities among ALTER TABLE commands related to partitioned table handling.
>>>> Omitting this part introduces further inconsistency within the ALTER TABLE
>>>> REPLICA IDENTITY.
>>>>
>>>
>>> Fair point. I think one should summarize the previous discussions with
>>> key differences and where the previous patch got stuck. Then, it would
>>> be good to get some feedback from the people involved previously. If
>>> there is an agreement that we can do INHERIT stuff for specific parts
>>> then fine, otherwise, I think we need to address the index part as
>>> well.
>>
>>> [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
>>> [2]
https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06
>>
>> Sure, let me try to summarize the two discussions.
>>
>> For [1], key participants included Álvaro Herrera (patch author), Dmitry Dolgov (reviewer/tester), Robert Haas,
SimonRiggs, Michael Paquier, and Peter Eisentraut. 
>> ----
>>
>> Brief summary with rough timeline:
>>
>> * In 2019, Álvaro proposed a patch that essentially did the same thing as this patch. That patch also attempted to
handle“index-based” replica identity, which my patch intentionally avoids. 
>> * Dmitry tested the patch and reported issues related to attaching partitions, including index-related errors.
>> * Robert then pointed out that having REPLICA IDENTITY recurse was inconsistent with the behavior of other ALTER
TABLEactions such as TABLESPACE, and emphasized that we should really try to think through *all* actions that might
recurseto child partitions. 
>> * Álvaro noted the use of `ALTER TABLE ONLY`, suggesting that WITHOUT ONLY could recurse to children, while ONLY
wouldaffect just the parent. 
>> * Simon commented that recursing TABLESPACE changes could be problematic because they imply physical data rewrites.
>> * Robert listed several ALTER TABLE actions that lacked consistent recurse behavior (identity columns, triggers,
CLUSTER,OWNER, TABLESPACE, CHECK constraints). 
>> * This led to broader discussion about whether TABLESPACE/OWNER/etc. should recurse.
>> * Michael echoed support for having REPLICA IDENTITY recurse, controlled via ONLY.
>> * Peter pointed out that recursing ADD GENERATED AS IDENTITY / DROP IDENTITY may not be feasible.
>> * Álvaro wanted to proceed with the patch.
>> * Robert maintained that defining consistent semantics for all relevant ALTER TABLE actions should come first.
>>
>> Overall, the blocker was an unresolved semantic disagreement, rather than a concrete technical objection to the
patchitself. There appeared to be broad agreement that: 
>> * New partitions should use the parent’s replica identity.
>> * ONLY could be used to control whether replica identity changes recurse.
>>
>> However, the discussion about “semantic consistency” significantly broadened the scope, and there was no clear
agreementon whether TABLESPACE/OWNER/etc. should recurse, which ultimately stalled the effort. 
>>
>> For [2], key participants included: Takayuki Tsunakawa (patch author), Bharath Rupireddy (reviewer), Álvaro Herrera,
MichaelPaquier, and Kyotaro Horiguchi. 
>> ----
>>
>> Brief summary with rough timeline:
>>
>> * In 2020, Takayuki proposed a patch intended to propagate ALTER TABLE SET LOGGED / UNLOGGED to partitions.
>> * Bharath reviewed the patch and raised a number of questions and edge cases.
>> * There were initial discussions about the patch mechanics and expected behavior.
>> * Álvaro then pointed out the strong relation to the earlier discussion in [1].
>> * The focus of the discussion shifted to the more fundamental question of “what is the parent?”:
>> * Takayuki viewed ALTER TABLE on a partitioned table as a convenient batch operation on existing partitions, with
futurepartitions remaining independent. 
>> * Álvaro, Michael, and Kyotaro argued that changing a property on the parent should define the default for future
partitionsas well. 
>> * No clear agreement was reached on this semantic question, and the discussion expanded into broader concerns about
consistencyacross ALTER TABLE actions. 
>> * Takayuki withdrew the patch
>>
>> Overall, [2] also fell under the umbrella of “semantic consistency”, the main discussion was not about replica
identityitself. 
>>
>> Current situation:
>> ----
>>
>> * Resolving “semantic consistency” across ALTER TABLE actions in a single release appears to be the biggest
challenge.However, addressing everything in one patch set does not seem realistic. 
>> * The core question of “what is the parent?” from [1] remains central. That said, the discussion appeared to lean
towardthe view that future partitions should inherit properties from the parent. 
>> * Different properties behave very differently. For example, propagating REPLICA IDENTITY is a metadata-only change
andrelatively safe, while propagating TABLESPACE implies physical data movement and is much riskier. As a result, each
ALTERTABLE action may deserve its own discussion and patch set. 
>> * The inconsistency is not limited to ALTER TABLE but also exists in CREATE TABLE behavior. For example, a new
partitioninherits TABLESPACE from the parent, but not REPLICA IDENTITY. 
>> * “ALTER TABLE ONLY table_name” is commonly suggested as the mechanism to control whether changes should recurse to
partitions.
>>
>> How to proceed?
>> ----
>>
>> If we stop here, these inconsistencies will remain indefinitely, which I believe nobody really wants. With that in
mind,I’d like to suggest a two-phase approach. 
>>
>> Phase 1: Document current behavior and set expectations
>>
>> * Identify all ALTER TABLE actions involved in these inconsistencies.
>> * Update the ALTER TABLE and CREATE TABLE documentation to clearly describe the current behavior for partitioned
tables,and (where appropriate) the intended or ideal behavior. 
>> * Explicitly document the meaning of ONLY for partitioned tables, and note that some actions may behave differently,
withdetails described in each action’s section. 
>>
>> Phase 2: Address actions incrementally
>>
>> * Work on each ALTER TABLE action individually, recognizing that some may be straightforward while others require
moredesign discussion, coding, and testing. 
>> * With Phase 1 in place to set expectations, it may not be necessary to complete all actions in a single release.
>> * That said, it would still be desirable to keep the work bounded, for example within one or two major releases, to
avoidlong-term fragmentation. 
>>
>> I’ve included the participants from the previous discussions on CC, in case they want to comment further.
>>
>
> To move this patch forward, I’ve been working on a few related patches in parallel:
>
>    • [1] Enhances the “ALTER TABLE” documentation to clarify how subcommands behave on partitioned tables. In that
discussion,I summarized the current behaviors of all subcommands into 15 categories. 
>    • [2] Adds a NOTICE when ONLY is not specified but the action does not recurse to child partitions.
>    • [3] Improves the header comments in tablecmds.c to better explain the meaning of “recurse” and “recursing”.
>
> While working on [1], [2] and [3], I gained a deeper understanding of ALTER TABLE behavior. Based on that, v3 is a
completerework compared to v2. The behavior implemented by v3 is: 
>
>    • Recursion is performed during the preparation phase.
>    • When ONLY is specified, no recursion occurs; without ONLY, the command recurses to all child partitions.
>    • All replica identity types are supported, including USING INDEX (which was not supported in v2).
>    • For non-index replica identities, the existing ATSimpleRecursion() infrastructure is reused; for index-based
replicaidentity, the corresponding index name must be determined for each child partition. 
>
> In a previous discussion [4], Dmitry Dolgov pointed out a test case that resulted in a DEADLOCK. I ran that test
againstv3. The test still fails, but I no longer observe a deadlock; instead, the server now crashes during partition
attachment.I will investigate this further. 
>
> While working on v3, I also noticed another issue that may be a bug. When creating an index on a partitioned table,
theindex is automatically created on all partitions. However, if a column type change causes an index rebuild, it
appearsthat the rebuilt index may be missing from child partitions. I will verify this behavior and, if confirmed,
discussit in a separate thread. 
>
> At this point, I think v3 is in a reviewable state.
>
> [1] https://postgr.es/m/CAEoWx2%3DmYhCfsnHaN96Qqwq5b0GVS2YgO3zpVqPPRd_iO52wRw%40mail.gmail.com
> [2] https://postgr.es/m/CAEoWx2=SLga-xH09Cq_PAvsHhQHrBK+V0vF821JKgzS=Bm0haA@mail.gmail.com
> [3] https://postgr.es/m/CAEoWx2n9E6_zxPbqwMpaPuC1C_p4b3y635SjiDuCPSVm8GBjsA@mail.gmail.com
> [4] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
>
<v3-0001-Inherit-replica-identity-for-new-and-attached-par.patch><v3-0002-ALTER-TABLE-recurse-REPLICA-IDENTITY-to-partition.patch>

Trying to move forward this patch, as the previous blockers have been addressed:

 * ALTER TABLE “inconsistency” is clarified by the other patch, see [1].
 * DEADLOCK has been addressed by the other patch, see [2].
 * “index” replica identity is handled.

PFA v4: rebased and deleted an unneeded test case.

[1] https://postgr.es/m/CAEoWx2mUiCYJEBuo5D74gi7pHfNz82b54oDHjZNtrXRFDnBPOg@mail.gmail.com
[2] https://postgr.es/m/CFACA0EB-7E6F-4FAA-9ACE-1FC2226D7482@gmail.com

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения