Обсуждение: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions
Hi,
While working with logical replication and partitioned tables, I noticed an inconsistency between how publications treat partitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves.
When a publication is created on a partitioned table, e.g.:
PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s expectation that a partitioned table behaves as a single logical entity.
This gap leads to several problems:
To address this, the attached patch makes:
```
This aligns ALTER TABLE behavior with how publications already expand partitioned tables, and makes replication identity configuration consistent with logical replication semantics.
The attached patch is not yet fully ready for detailed review, this is more of a proof-of-concept. At this stage, I mainly want to see whether people agree with the idea, or if there are objections to cascading replica identity changes for partitioned tables before I refine the patch further.
Comments and feedback are welcome.
While working with logical replication and partitioned tables, I noticed an inconsistency between how publications treat partitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves.
When a publication is created on a partitioned table, e.g.:
```
CREATE PUBLICATION pub FOR TABLE parent;
```
CREATE PUBLICATION pub FOR TABLE parent;
```
PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s expectation that a partitioned table behaves as a single logical entity.
However, if the user then runs:
```
```
ALTER TABLE parent REPLICA IDENTITY FULL;
```
```
only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the parent itself has no storage and its replication identity plays no role in logical replication. Logical decoding always operates on the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated safely.
This gap leads to several problems:
* The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores tuples or produces WAL.
* Users expect that altering the replica identity on the partitioned table would apply to all partitions that are implicitly included in the publication.
* As a result, users currently need to run ALTER TABLE ... REPLICA IDENTITY separately for every leaf partition, which is tedious and error-prone on large partition hierarchies.
* Misconfiguration usually surfaces only when logical replication starts failing on UPDATE/DELETE for specific leaf partitions due to mismatched replica identities.
```
ALTER TABLE parent REPLICA IDENTITY <type>
```
```
cascade the new setting to all leaf partitions of the table. Partitioned tables (RELKIND_PARTITIONED_TABLE) are skipped since they have no storage and no effective replica identity.
This aligns ALTER TABLE behavior with how publications already expand partitioned tables, and makes replication identity configuration consistent with logical replication semantics.
The attached patch is not yet fully ready for detailed review, this is more of a proof-of-concept. At this stage, I mainly want to see whether people agree with the idea, or if there are objections to cascading replica identity changes for partitioned tables before I refine the patch further.
Comments and feedback are welcome.
Вложения
On Thu, Dec 11, 2025 at 2:46 PM Chao Li <li.evan.chao@gmail.com> wrote: > > Hi, > While working with logical replication and partitioned tables, I noticed an inconsistency between how publications treatpartitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves. > > When a publication is created on a partitioned table, e.g.: > ``` > CREATE PUBLICATION pub FOR TABLE parent; > ``` > > PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s expectationthat a partitioned table behaves as a single logical entity. > > However, if the user then runs: > ``` > ALTER TABLE parent REPLICA IDENTITY FULL; > ``` > only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the parentitself has no storage and its replication identity plays no role in logical replication. Logical decoding always operateson the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated safely. > > This gap leads to several problems: > > * The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores tuplesor produces WAL. > When we use row filters, if publish_via_partition_root option of publication is true, the root partitioned table's row filter is used. I think this would then refer RI of partitioned table for validity of row filter. Please see docs [1] (There can be a case where a subscription combines multiple publications. If a partitioned table is published by any subscribed publications which set publish_via_partition_root = true, changes on this partitioned table (or on its partitions) will be published using the identity and schema of this partitioned table rather than that of the individual partitions. This parameter also affects how row filters and column lists are chosen for partitions; see below for details.) for more details. I have not tested it but you can once try to see how it behaves. The other point is what if one of the partition already has RI defined to a different value than what is defined for parent table? [1] - https://www.postgresql.org/docs/devel/sql-createpublication.html -- With Regards, Amit Kapila.
> On Dec 11, 2025, at 20:43, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 2:46 PM Chao Li <li.evan.chao@gmail.com> wrote: >> >> Hi, >> While working with logical replication and partitioned tables, I noticed an inconsistency between how publications treatpartitions and how "ALTER TABLE ... REPLICA IDENTITY" behaves. >> >> When a publication is created on a partitioned table, e.g.: >> ``` >> CREATE PUBLICATION pub FOR TABLE parent; >> ``` >> >> PostgreSQL automatically includes all leaf partitions of the table in the publication. This matches the user’s expectationthat a partitioned table behaves as a single logical entity. >> >> However, if the user then runs: >> ``` >> ALTER TABLE parent REPLICA IDENTITY FULL; >> ``` >> only the parent table’s relreplident is updated. None of the leaf partitions inherit this change, even though the parentitself has no storage and its replication identity plays no role in logical replication. Logical decoding always operateson the leaf partitions, and their replication identities determine whether UPDATE/DELETE can be replicated safely. >> >> This gap leads to several problems: >> >> * The parent table’s replica identity is effectively irrelevant during logical replication, since it never stores tuplesor produces WAL. >> > > When we use row filters, if publish_via_partition_root option of > publication is true, the root partitioned table's row filter is used. > I think this would then refer RI of partitioned table for validity of > row filter. Please see docs [1] (There can be a case where a > subscription combines multiple publications. If a partitioned table is > published by any subscribed publications which set > publish_via_partition_root = true, changes on this partitioned table > (or on its partitions) will be published using the identity and schema > of this partitioned table rather than that of the individual > partitions. This parameter also affects how row filters and column > lists are chosen for partitions; see below for details.) for more > details. > > I have not tested it but you can once try to see how it behaves. > > The other point is what if one of the partition already has RI defined > to a different value than what is defined for parent table? > > [1] - https://www.postgresql.org/docs/devel/sql-createpublication.html > > -- > With Regards, > Amit Kapila. Hi Amit, Thanks for pointing out that my assumption of “RI of parent is not used” is not always true. I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about to extendthe ALTER TABLE syntax like: ``` ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE] ``` So, that the current syntax will behave the same as usual, and With CASCADE ============ 1. Root's RI updated 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3 3. If any child’s RI is different from the root's RI, fail out, no change happens With CASCADE FORCE =================== 1. Root's RI updated 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a child’sRI is different from root’s RI "ALTER TABLE parent REPLICA IDENTITY” is a PG specific syntax, so the change won’t break the SQL standard. And “CASCADE”is known keyword that has been used in many SQL commands. I can see the usefulness of “CASCADE” when a partitioned table has many partitions. A single command will be able to updateall partitions’ RI. What do you think? Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On Fri, Dec 12, 2025 at 9:28 AM Chao Li <li.evan.chao@gmail.com> wrote: > > Hi Amit, > > Thanks for pointing out that my assumption of “RI of parent is not used” is not always true. > > I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about toextend the ALTER TABLE syntax like: > > ``` > ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE] > ``` > CASCADE is used for dependent objects, so I don't think using it will be appropriate in this context. However, the INHERIT (NO INHERIT) could be used. We already use them for constraints, see ALTER TABLE ... ALTER CONSTRAINT syntax in docs. > So, that the current syntax will behave the same as usual, and > > With CASCADE > ============ > 1. Root's RI updated > 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3 > 3. If any child’s RI is different from the root's RI, fail out, no change happens > > With CASCADE FORCE > =================== > 1. Root's RI updated > 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a child’sRI is different from root’s RI > I think you can try to experiment with CHECK or NOT NULL constraint behavior for similar cases in case of partition tables. BTW, did you get this use case in the field or just browsing docs, you thought it would be useful to have such a feature? -- With Regards, Amit Kapila.
Hi Amit, Thank you very much for the guidance. > On Dec 13, 2025, at 19:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Dec 12, 2025 at 9:28 AM Chao Li <li.evan.chao@gmail.com> wrote: >> >> Hi Amit, >> >> Thanks for pointing out that my assumption of “RI of parent is not used” is not always true. >> >> I agree that automatic-cascade will introduce a lot of complexities. To ensure the backward-compatibility, how about toextend the ALTER TABLE syntax like: >> >> ``` >> ALTER TABLE <root> REPLICA IDENTITY <type> [CASCADE | FORCE CASCADE] >> ``` >> > > CASCADE is used for dependent objects, so I don't think using it will > be appropriate in this context. However, the INHERIT (NO INHERIT) > could be used. We already use them for constraints, see ALTER TABLE > ... ALTER CONSTRAINT syntax in docs. > >> So, that the current syntax will behave the same as usual, and >> >> With CASCADE >> ============ >> 1. Root's RI updated >> 2. All children (including middle partitioned tables and leaf tables) RI updated unless 3 >> 3. If any child’s RI is different from the root's RI, fail out, no change happens >> >> With CASCADE FORCE >> =================== >> 1. Root's RI updated >> 2. All children (including middle partitioned tables and leaf tables) RI updated, prints a warning message when a child’sRI is different from root’s RI >> > > I think you can try to experiment with CHECK or NOT NULL constraint > behavior for similar cases in case of partition tables. > > BTW, did you get this use case in the field or just browsing docs, you > thought it would be useful to have such a feature? > The main problem I am trying to solve is [1], where you are already in the thread. This is a real pain point reported byour users and field teams. While working on [1], I noticed this additional issue during my own tests. I then discussedit with our field teams, and they confirmed that such a feature would be very helpful in practice. We have manydeployments where a single partitioned table has several thousands of partitions, and having a fast, single command toupdate the replica identity of all partitions would significantly simplify operations. I also confirmed one thing with the field teams: across our deployments (my company has 100K+ deployments in China), theyhave never seen a case where partitions under the same parent/root use different replica identities. In theory, thisis allowed, since RI can be set per partition, but in practice I am not sure whether such a use case really exists. Currently, replica identity is not inheritable for partitions. I verified this behavior: if I create a partitioned table,alter its RI to FULL, and then create a new partition, the new partition still uses DEFAULT. If we keep this behavior,we can easily run into a scenario like this: * create a partitioned table with 10 partitions * ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT -- assume this feature exists * create 5 new partitions * ALTER TABLE <parent> REPLICA IDENTITY FULL INHERIT -- conflict occurs In this case, a conflict occurs because the newly created partitions still have DEFAULT RI, but this is not the user’s intention. From this perspective, when a new partition is created, it should automatically inherit the parent’s RI. With that behavior,a “FORCE” option would rarely be needed, because having one partition use a different RI from its siblings shouldbe an uncommon case. Based on this, I imagine the feature roughly like this: * When a new partition is created, it inherits its parent’s RI * ALTER TABLE <table_name> REPLICA IDENTITY [ INHERIT | NO INHERIT ] -- error out on conflicts This leads to a couple of follow-up questions: * Should RI be switchable between “inheritable” and “non-inheritable”, similar to constraints? IMO, no. RI is much simplerthan constraints. For constraints, parent–child relationships exist between tables with potentially different structures,so it is natural that child tables might have different constraints. RI, however, only applies to partitionedtables, where partitions must share the same structure as the parent. In practice, it seems rare for partitionsto intentionally use a different RI than the parent. * Should publish_via_partition_root in publications affect this feature? IMO, no. A table can belong to multiple publications,and different publications may have different publish_via_partition_root settings. [1] https://postgr.es/m/CAEoWx2mMorbMwjKbT4YCsjDyL3r9Mp+z0bbK57VZ+OkJTgJQVQ@mail.gmail.com Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/