Thread: Skipping schema changes in publication

Skipping schema changes in publication

From

vignesh C

Date:

March 22, 2022, 07:08:43

Hi,

This feature adds an option to skip changes of all tables in specified
schema while creating publication.
This feature is helpful for use cases where the user wants to
subscribe to all the changes except for the changes present in a few
schemas.
Ex:
CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;

A new column pnskip is added to table "pg_publication_namespace", to
maintain the schemas that the user wants to skip publishing through
the publication. Modified the output plugin (pgoutput) to skip
publishing the changes if the relation is part of skip schema
publication.
As a continuation to this, I will work on implementing skipping tables
from all tables in schema and skipping tables from all tables
publication.
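
The skip check described above can be pictured with a minimal sketch. It assumes a simplified, hypothetical in-memory stand-in for pg_publication_namespace rows (the struct and function names below are illustrative and are not taken from the patch); the real check lives in pgoutput and works on catalog OIDs rather than names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one pg_publication_namespace row. */
typedef struct PubNamespaceRow
{
    const char *pubname;    /* publication the row belongs to */
    const char *nspname;    /* schema name */
    bool        pnskip;     /* true: schema is skipped by this publication */
} PubNamespaceRow;

/*
 * For a FOR ALL TABLES publication, a change to a table in schema `nspname`
 * is published unless a row for that publication and schema has pnskip set.
 */
bool
schema_change_is_published(const PubNamespaceRow *rows, size_t nrows,
                           const char *pubname, const char *nspname)
{
    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].pnskip &&
            strcmp(rows[i].pubname, pubname) == 0 &&
            strcmp(rows[i].nspname, nspname) == 0)
            return false;
    }
    return true;
}
```

With rows for (pub1, s1) and (pub1, s2) marked as skipped, a change in s1 is filtered out for pub1, while changes in any other schema, or for any other publication, still go through.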

Attached patch has the implementation for this.
This feature is for the pg16 version.
Thoughts?

Regards,
Vignesh

Attachments

v1-0001-Skip-publishing-the-tables-of-schema.patch

Re: Skipping schema changes in publication

From

vignesh C

Date:

March 26, 2022, 14:07:26

On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Hi,
>
> This feature adds an option to skip changes of all tables in specified
> schema while creating publication.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> schemas.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
>
> A new column pnskip is added to table "pg_publication_namespace", to
> maintain the schemas that the user wants to skip publishing through
> the publication. Modified the output plugin (pgoutput) to skip
> publishing the changes if the relation is part of skip schema
> publication.
> As a continuation to this, I will work on implementing skipping tables
> from all tables in schema and skipping tables from all tables
> publication.
>
> Attached patch has the implementation for this.

The patch was not applying on top of HEAD because of the recent
commits, attached patch is rebased on top of HEAD.

Regards,
Vignesh

Attachments

v1-0001-Skip-publishing-the-tables-of-schema.patch

Re: Skipping schema changes in publication

From

vignesh C

Date:

April 12, 2022, 06:23:29

On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Hi,
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
> > A new column pnskip is added to table "pg_publication_namespace", to
> > maintain the schemas that the user wants to skip publishing through
> > the publication. Modified the output plugin (pgoutput) to skip
> > publishing the changes if the relation is part of skip schema
> > publication.
> > As a continuation to this, I will work on implementing skipping tables
> > from all tables in schema and skipping tables from all tables
> > publication.
> >
> > Attached patch has the implementation for this.
>
> The patch was not applying on top of HEAD because of the recent
> commits, attached patch is rebased on top of HEAD.

The patch does not apply on top of HEAD because of the recent commit,
attached patch is rebased on top of HEAD.

I have also included the implementation for skipping a few tables from
all tables publication, the 0002 patch has the implementation for the
same.
This feature is helpful for use cases where the user wants to
subscribe to all the changes except for the changes present in a few
tables.
Ex:
CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
OR
ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;

Regards,
Vignesh

Re: Skipping schema changes in publication

From

Euler Taveira

On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action. We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action. I think it would be confusing to use the term SKIP for DDL
> construction.

I didn't like the SKIP choice either. We already have EXCEPT for IMPORT FOREIGN
SCHEMA and if I were to suggest a keyword, it would be EXCEPT.

> I would also think about this in broader terms. For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.

The questions are:
What kind of issues does it solve?
Do we have a workaround for it?

> That said, I'm not sure this feature is worth the trouble. If this is
> useful, what about "whole database except these schemas"? What about
> "create this database from this template except these schemas". This
> could get out of hand. I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.

I have the same impression too. We already provide a way to:

* include individual tables;
* include all tables;
* include all tables in a certain schema.

Doesn't it cover the majority of the use cases? We don't need to cover all
possible cases in one DDL command. IMO the current grammar for CREATE
PUBLICATION is already complicated after the ALL TABLES IN SCHEMA. You are
proposing to add "ALL TABLES SKIP ALL TABLES" that sounds repetitive but it is
not; doesn't seem well-thought-out. I'm also concerned about possible gotchas
for this proposal. The first command below suggests that it skips all tables in a
certain schema. What happens if I decide to include a particular table of the
skipped schema (second command)?

ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
ALTER PUBLICATION pub1 ADD TABLE s1.foo;

Having said that I'm not wedded to this proposal. Unless someone provides
compelling use cases for this additional syntax, I think we should leave the
publication syntax as is.

Euler Taveira
EDB https://www.enterprisedb.com/
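
The gotcha raised above (a schema is skipped, then one of its tables is added explicitly) can be made concrete with a small sketch. The two policies below are hypothetical: the thread does not say which one the patch follows, and a third option would be to reject the second command with an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resolution policies; the thread does not settle on one. */
typedef enum ConflictPolicy
{
    EXCLUSION_WINS,         /* a skipped schema hides every table in it */
    EXPLICIT_TABLE_WINS     /* an explicitly added table overrides the skip */
} ConflictPolicy;

/* Decide whether a table is published by a FOR ALL TABLES publication. */
bool
table_is_published(bool schema_skipped, bool table_added_explicitly,
                   ConflictPolicy policy)
{
    assert(policy == EXCLUSION_WINS || policy == EXPLICIT_TABLE_WINS);
    if (!schema_skipped)
        return true;        /* FOR ALL TABLES already covers the table */
    if (table_added_explicitly && policy == EXPLICIT_TABLE_WINS)
        return true;
    return false;
}
```

For s1.foo in the two commands above, the answer flips depending on the policy, which is exactly the ambiguity a user would face.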

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

April 18, 2022, 07:01:59

On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
>
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.
>
> I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
> SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
>

+1 for EXCEPT.

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.
>
> The questions are:
> What kind of issues does it solve?

As far as I understand, it is for usability; otherwise, users need to
list all required column names even if they want to hide only a few
of the columns in the table. Consider a user who doesn't want to publish
the 'salary' or other sensitive information of executives/employees but
would like to publish all other columns. I feel in such cases it will
be a lot of work for the user, especially when the table has many
columns. I see that Oracle has a similar feature [1]. I think without
this it will be difficult for users to use this feature in some cases.
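
To illustrate what "all columns except these" would save the user from typing, here is a minimal sketch that derives the published column list from the full column list and a short exclusion list. The function name and the use of plain strings are assumptions made for the example; nothing here reflects an actual PostgreSQL API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into `out` every name from `all` that is not listed in `excluded`,
 * preserving order. Returns the number of names written. `out` must have
 * room for `nall` entries.
 */
size_t
columns_except(const char *const *all, size_t nall,
               const char *const *excluded, size_t nexcluded,
               const char **out)
{
    size_t      nout = 0;

    assert(out != NULL);
    for (size_t i = 0; i < nall; i++)
    {
        int         skip = 0;

        for (size_t j = 0; j < nexcluded; j++)
        {
            if (strcmp(all[i], excluded[j]) == 0)
            {
                skip = 1;
                break;
            }
        }
        if (!skip)
            out[nout++] = all[i];
    }
    return nout;
}
```

With columns (id, name, salary, dept) and the exclusion list (salary), the result is (id, name, dept); without such a feature the user has to write that list out by hand and keep it current as columns are added.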

> Do we have a workaround for it?
>

I can't think of any except the user needs to manually input all
required columns. Can you think of any other workaround?

> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.
>
> I have the same impression too. We already provide a way to:
>
> * include individual tables;
> * include all tables;
> * include all tables in a certain schema.
>
> Doesn't it cover the majority of the use cases?
>

Similar to columns, the same applies to tables. Users need to manually
add all tables for a database even when they want to leave out only a
handful of tables, say because they contain sensitive information or
are not required. I think we don't need to cover all possible
exceptions, but a few where users can avoid some tables would be
useful. If not, what kind of alternative do users have except for
listing all columns or all tables that are required?

[1] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20

-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

vignesh C

Date:

April 18, 2022, 09:40:46

On Thu, Apr 14, 2022 at 7:18 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.
>
> Let's find another term like "omit", "except", etc.

+1 for Except

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.
>
> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.

I thought this feature would help when there are many tables in the
database and the user wants to exclude only certain confidential
tables, such as those holding credit card information. In this case,
instead of specifying the whole table list, it is better to specify
"ALL TABLES EXCEPT cred_info_tbl".
I have seen that MySQL also has a similar option, replicate-ignore-table,
to ignore the changes on specific tables, as mentioned in [1].
A similar use case exists in pg_dump too: pg_dump has an option,
exclude-table, that skips dumping any tables matching the specified
table pattern, as in [2].

[1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html
[2] - https://www.postgresql.org/docs/devel/app-pgdump.html

Regards,
Vignesh

Re: Skipping schema changes in publication

From

vignesh C

Date:

April 21, 2022, 03:15:07

On Mon, Apr 18, 2022 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote:
> >
> > On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
> >
> > On 12.04.22 08:23, vignesh C wrote:
> > > I have also included the implementation for skipping a few tables from
> > > all tables publication, the 0002 patch has the implementation for the
> > > same.
> > > This feature is helpful for use cases where the user wants to
> > > subscribe to all the changes except for the changes present in a few
> > > tables.
> > > Ex:
> > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > > OR
> > > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
> >
> > We have already allocated the "skip" terminology for skipping
> > transactions, which is a dynamic run-time action.  We are also using the
> > term "skip" elsewhere to skip locked rows, which is similarly a run-time
> > action.  I think it would be confusing to use the term SKIP for DDL
> > construction.
> >
> > I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
> > SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
> >
>
> +1 for EXCEPT.

Updated patch by changing the syntax to use EXCEPT instead of SKIP.

Regards,
Vignesh

Attachments

v2-0001-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch

Re: Skipping schema changes in publication

From

Bharath Rupireddy

Date:

April 22, 2022, 16:09:24

On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Hi,
>
> This feature adds an option to skip changes of all tables in specified
> schema while creating publication.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> schemas.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
>
> A new column pnskip is added to table "pg_publication_namespace", to
> maintain the schemas that the user wants to skip publishing through
> the publication. Modified the output plugin (pgoutput) to skip
> publishing the changes if the relation is part of skip schema
> publication.
> As a continuation to this, I will work on implementing skipping tables
> from all tables in schema and skipping tables from all tables
> publication.
>
> Attached patch has the implementation for this.
> This feature is for the pg16 version.
> Thoughts?

The feature seems to be useful especially when there are lots of
schemas in a database. However, I don't quite like the syntax. Do we
have 'SKIP' identifier in any of the SQL statements in SQL standard?
Can we think of adding skip_schema_list as an option, something like
below?

CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
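
An option value such as 's1, s2' has to be split into schema names somewhere. The following is a minimal sketch of that step, assuming plain unquoted names separated by commas; the function name and the fixed-size buffers are illustrative only, and real identifier rules (quoting, case folding) are ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 64

/*
 * Split a comma-separated option value such as "s1, s2" into names,
 * trimming surrounding whitespace. Returns the number of names stored
 * (at most max_names). Empty items and names that do not fit are ignored.
 * An empty value yields 0, which the proposal above treats as a reset.
 */
size_t
parse_schema_list(const char *value, char names[][MAX_NAME_LEN],
                  size_t max_names)
{
    size_t      count = 0;
    const char *p = value;

    assert(value != NULL);
    while (*p != '\0' && count < max_names)
    {
        const char *start;
        const char *end;
        size_t      len;

        /* skip leading whitespace */
        while (isspace((unsigned char) *p))
            p++;
        start = p;
        /* advance to the next comma or the end of the string */
        while (*p != '\0' && *p != ',')
            p++;
        end = p;
        /* trim trailing whitespace */
        while (end > start && isspace((unsigned char) end[-1]))
            end--;
        len = (size_t) (end - start);
        if (len > 0 && len < MAX_NAME_LEN)
        {
            memcpy(names[count], start, len);
            names[count][len] = '\0';
            count++;
        }
        if (*p == ',')
            p++;
    }
    return count;
}
```

One consequence of a string-valued option is that all validation (unknown schema, duplicate entry) happens after parsing rather than in the grammar.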

Regards,
Bharath Rupireddy.

Re: Skipping schema changes in publication

From

Peter Smith

Date:

April 26, 2022, 01:55:21

On Sat, Apr 23, 2022 at 2:09 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Hi,
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
> > A new column pnskip is added to table "pg_publication_namespace", to
> > maintain the schemas that the user wants to skip publishing through
> > the publication. Modified the output plugin (pgoutput) to skip
> > publishing the changes if the relation is part of skip schema
> > publication.
> > As a continuation to this, I will work on implementing skipping tables
> > from all tables in schema and skipping tables from all tables
> > publication.
> >
> > Attached patch has the implementation for this.
> > This feature is for the pg16 version.
> > Thoughts?
>
> The feature seems to be useful especially when there are lots of
> schemas in a database. However, I don't quite like the syntax. Do we
> have 'SKIP' identifier in any of the SQL statements in SQL standard?
> Can we think of adding skip_schema_list as an option, something like
> below?
>
> CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
> ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
> ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
>

I had been wondering for some time if there was any way to introduce a
more flexible pattern matching into PUBLICATION but without bloating
the syntax. Maybe your idea to use an option for the "skip" gives a
way to do it...

For example, if we could use regex (for <schemaname>.<tablename>
patterns) for the option value then....

~~

e.g.1. Exclude certain tables:

// do NOT publish any tables of schemas s1,s2
CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(s1\..*)|(s2\..*)');

// do NOT publish my secret tables (those called "mysecretXXX")
CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(.*\.mysecret.*)');

~~

e.g.2. Only allow certain tables.

// ONLY publish my tables (those called "mytableXXX")
CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(.*\.mytable.*)');

// So following is equivalent to FOR ALL TABLES IN SCHEMA s1
CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(s1\..*)');
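
The matching behind an exclude_match or subset_match option could look roughly like the sketch below. It is only an approximation: it uses POSIX extended regular expressions from <regex.h> rather than PostgreSQL's own regex engine, the function name is made up for the example, and it assumes the pattern must match the whole "schema.table" string, which the proposal does not spell out.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if "schema.table" matches `pattern` in full.
 * The pattern is wrapped in ^( ... )$ so that a partial match does not count.
 * An invalid or over-long pattern is treated as "no match".
 */
bool
qualified_name_matches(const char *pattern, const char *schema,
                       const char *table)
{
    char        anchored[256];
    char        name[256];
    regex_t     re;
    bool        matched;

    assert(pattern != NULL && schema != NULL && table != NULL);
    if (snprintf(anchored, sizeof(anchored), "^(%s)$", pattern)
        >= (int) sizeof(anchored))
        return false;
    if (snprintf(name, sizeof(name), "%s.%s", schema, table)
        >= (int) sizeof(name))
        return false;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    matched = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

Note that without the anchoring, a pattern like '(s1\..*)' would also match a table in a schema named "xs1", so the exact matching rule would need to be part of any such proposal.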

------
Kind Regards,
Peter Smith.
Fujitsu Australia

RE: Skipping schema changes in publication

From

"osumi.takamichi@fujitsu.com"

Date:

April 26, 2022, 06:02:46

On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote:
> Updated patch by changing the syntax to use EXCEPT instead of SKIP.
Hi


These are my review comments on the v2 patch.

(1) gram.y

I think we can make a unified function that merges
preprocess_alltables_pubobj_list with check_except_in_pubobj_list.

With regard to preprocess_alltables_pubobj_list,
we don't use the 2nd argument "location" in this function.

(2) create_publication.sgml

+  <para>
+   Create a publication that publishes all changes in all the tables except for
+   the changes of <structname>users</structname> and
+   <structname>departments</structname> table;

This sentence should end with ":", not ";".

(3) publication.out & publication.sql

+-- fail - can't set except table to schema  publication
+ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1;

There is one unnecessary space in the comment.
Kindly change from "schema  publication" to "schema publication".

(4) pg_dump.c & describe.c

In your first email of this thread, you explained this feature
is for PG16. Don't we need an additional branch for PG16?

@@ -6322,6 +6328,21 @@ describePublications(const char *pattern)
                        }
                }

+               if (pset.sversion >= 150000)
+               {


@@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
        /* Collect all publication membership info. */
        if (fout->remoteVersion >= 150000)
                appendPQExpBufferStr(query,
-                                                        "SELECT tableoid, oid, prpubid, prrelid, "
+                                                        "SELECT tableoid, oid, prpubid, prrelid, prexcept,"
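
The point about version branches can be shown with a minimal sketch. It assumes the new prexcept column would first exist in PG16 (server version number 160000), as the thread states the feature targets; the function names and the constant are illustrative and do not come from pg_dump or psql.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed cutoff: the first server version that would have prexcept. */
#define PREXCEPT_MIN_VERSION 160000

/* True if a server with this version number has the prexcept column. */
bool
server_has_prexcept(int server_version_num)
{
    assert(server_version_num > 0);
    return server_version_num >= PREXCEPT_MIN_VERSION;
}

/*
 * Pick the column list for the pg_publication_rel query. Asking a PG15
 * server for prexcept would fail, so older servers get the old list.
 */
const char *
publication_rel_columns(int server_version_num)
{
    if (server_has_prexcept(server_version_num))
        return "tableoid, oid, prpubid, prrelid, prexcept";
    return "tableoid, oid, prpubid, prrelid";
}
```

Reusing the existing ">= 150000" branch, as in the hunks above, would send the new column name to PG15 servers as well, which is why a separate branch is being asked for.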


(5) psql-ref.sgml

+        If <literal>+</literal> is appended to the command name, the tables,
+        except tables and schemas associated with each publication are shown as
+        well.

I'm not sure if "except tables" is a good description.
I suggest "excluded tables". This applies to the entire patch,
if this is a reasonable suggestion.


Best Regards,
    Takamichi Osumi

Re: Skipping schema changes in publication

From

vignesh C

Date:

April 27, 2022, 12:50:11

On Tue, Apr 26, 2022 at 11:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote:
> > Updated patch by changing the syntax to use EXCEPT instead of SKIP.
> Hi
>
>
> This is my review comments on the v2 patch.
>
> (1) gram.y
>
> I think we can make a unified function that merges
> preprocess_alltables_pubobj_list with check_except_in_pubobj_list.
>
> With regard to preprocess_alltables_pubobj_list,
> we don't use the 2nd argument "location" in this function.

Removed location and made a unified function.

> (2) create_publication.sgml
>
> +  <para>
> +   Create a publication that publishes all changes in all the tables except for
> +   the changes of <structname>users</structname> and
> +   <structname>departments</structname> table;
>
> This sentence should end ":" not ";".

Modified

> (3) publication.out & publication.sql
>
> +-- fail - can't set except table to schema  publication
> +ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1;
>
> There is one unnecessary space in the comment.
> Kindly change from "schema  publication" to "schema publication".

Modified

> (4) pg_dump.c & describe.c
>
> In your first email of this thread, you explained this feature
> is for PG16. Don't we need additional branch for PG16 ?
>
> @@ -6322,6 +6328,21 @@ describePublications(const char *pattern)
>                         }
>                 }
>
> +               if (pset.sversion >= 150000)
> +               {
>
>
> @@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
>         /* Collect all publication membership info. */
>         if (fout->remoteVersion >= 150000)
>                 appendPQExpBufferStr(query,
> -                                                        "SELECT tableoid, oid, prpubid, prrelid, "
> +                                                        "SELECT tableoid, oid, prpubid, prrelid, prexcept,"
>

Modified by adding a comment saying "FIXME: 150000 should be changed
to 160000 later for PG16."

> (5) psql-ref.sgml
>
> +        If <literal>+</literal> is appended to the command name, the tables,
> +        except tables and schemas associated with each publication are shown as
> +        well.
>
> I'm not sure if "except tables" is a good description.
> I suggest "excluded tables". This applies to the entire patch,
> in case if this is reasonable suggestion.

Modified it in most of the places where it was applicable. I felt the
usage was ok in a few places.

Thanks for the comments, the attached v3 patch has the changes for the same.

Regards,
Vignesh

Attachments

v3-0001-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch

RE: Skipping schema changes in publication

From

"osumi.takamichi@fujitsu.com"

Date:

April 28, 2022, 11:20:52

On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the comments, the attached v3 patch has the changes for the same.
Hi

Thank you for updating the patch. Several minor comments on v3.

(1) commit message

The new syntax allows specifying schemas. For example:
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
OR
ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2;

We have above sentence, but it looks better
to make the description a bit more accurate.

Kindly change
From :
"The new syntax allows specifying schemas"
To :
"The new syntax allows specifying excluded relations"

Also, kindly change "OR" to "or",
because this description is not syntax.

(2) publication_add_relation

@@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
                ObjectIdGetDatum(pubid);
        values[Anum_pg_publication_rel_prrelid - 1] =
                ObjectIdGetDatum(relid);
+       values[Anum_pg_publication_rel_prexcept - 1] =
+               BoolGetDatum(pri->except);
+

        /* Add qualifications, if available */

It would be better to remove the blank line,
because with this change, we'll have two blank
lines in a row.

(3) pg_dump.h & pg_dump_sort.c

@@ -80,6 +80,7 @@ typedef enum
        DO_REFRESH_MATVIEW,
        DO_POLICY,
        DO_PUBLICATION,
+       DO_PUBLICATION_EXCEPT_REL,
        DO_PUBLICATION_REL,
        DO_PUBLICATION_TABLE_IN_SCHEMA,
        DO_SUBSCRIPTION

@@ -90,6 +90,7 @@ enum dbObjectTypePriorities
        PRIO_FK_CONSTRAINT,
        PRIO_POLICY,
        PRIO_PUBLICATION,
+       PRIO_PUBLICATION_EXCEPT_REL,
        PRIO_PUBLICATION_REL,
        PRIO_PUBLICATION_TABLE_IN_SCHEMA,
        PRIO_SUBSCRIPTION,
@@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] =
        PRIO_REFRESH_MATVIEW,           /* DO_REFRESH_MATVIEW */
        PRIO_POLICY,                            /* DO_POLICY */
        PRIO_PUBLICATION,                       /* DO_PUBLICATION */
+       PRIO_PUBLICATION_EXCEPT_REL,    /* DO_PUBLICATION_EXCEPT_REL */
        PRIO_PUBLICATION_REL,           /* DO_PUBLICATION_REL */
        PRIO_PUBLICATION_TABLE_IN_SCHEMA,       /* DO_PUBLICATION_TABLE_IN_SCHEMA */
        PRIO_SUBSCRIPTION                       /* DO_SUBSCRIPTION */

How about having similar order between
pg_dump.h and pg_dump_sort.c, like
we'll add DO_PUBLICATION_EXCEPT_REL
after DO_PUBLICATION_REL in pg_dump.h ?
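
The array in the hunk above is positional: each entry is tied to its object type only by position and a comment, so the enum and the array have to be kept in the same order by hand. A minimal sketch of an alternative, using C99 designated initializers on reduced, hypothetical copies of the two enums, makes the pairing explicit regardless of declaration order. This is an illustration of the ordering concern, not a description of how pg_dump is written.

```c
#include <assert.h>

/* Reduced, hypothetical copy of the object type enum. */
typedef enum
{
    DO_PUBLICATION,
    DO_PUBLICATION_REL,
    DO_PUBLICATION_EXCEPT_REL,
    DO_PUBLICATION_TABLE_IN_SCHEMA,
    DO_SUBSCRIPTION,
    NUM_DUMPABLE_TYPES          /* must stay last */
} DumpableObjectType;

/* Reduced, hypothetical copy of the priority enum. */
enum dbObjectTypePriorities
{
    PRIO_PUBLICATION = 1,
    PRIO_PUBLICATION_REL,
    PRIO_PUBLICATION_EXCEPT_REL,
    PRIO_PUBLICATION_TABLE_IN_SCHEMA,
    PRIO_SUBSCRIPTION
};

/* Each priority is bound to its object type by name, not by position. */
static const int dbObjectTypePriority[NUM_DUMPABLE_TYPES] = {
    [DO_PUBLICATION] = PRIO_PUBLICATION,
    [DO_PUBLICATION_REL] = PRIO_PUBLICATION_REL,
    [DO_PUBLICATION_EXCEPT_REL] = PRIO_PUBLICATION_EXCEPT_REL,
    [DO_PUBLICATION_TABLE_IN_SCHEMA] = PRIO_PUBLICATION_TABLE_IN_SCHEMA,
    [DO_SUBSCRIPTION] = PRIO_SUBSCRIPTION
};

/* Look up the sort priority for an object type. */
int
priority_of(DumpableObjectType type)
{
    assert((int) type >= 0 && (int) type < NUM_DUMPABLE_TYPES);
    return dbObjectTypePriority[type];
}
```

In this sketch the except-relation entry is placed after DO_PUBLICATION_REL in both enums, which is the ordering suggested above.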


(4) GetAllTablesPublicationRelations

+       /*
+        * pg_publication_rel and pg_publication_namespace  will only have except
+        * tables in case of all tables publication, no need to pass except flag
+        * to get the relations.
+        */
+       List       *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
+

There is one unnecessary space in a comment
"...pg_publication_namespace  will only have...". Kindly remove it.

Then, how about dividing the variable declaration and
the assignment of the return value of GetPublicationRelations ?
That might be aligned with other places in this file.

(5) GetTopMostAncestorInPublication


@@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
        foreach(lc, ancestors)
        {
                Oid                     ancestor = lfirst_oid(lc);
-               List       *apubids = GetRelationPublications(ancestor);
+               List       *apubids = GetRelationPublications(ancestor, false);
                List       *aschemaPubids = NIL;
+               List       *aexceptpubids;

                level++;

@@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
                else
                {
                        aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
-                       if (list_member_oid(aschemaPubids, puboid))
+                       aexceptpubids = GetRelationPublications(ancestor, true);
+                       if (list_member_oid(aschemaPubids, puboid) ||
+                               (puballtables && !list_member_oid(aexceptpubids, puboid)))
                        {
                                topmost_relid = ancestor;

It seems we forgot to call list_free for "aexceptpubids".
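
The condition in the hunk and the missing cleanup can be sketched outside PostgreSQL with a toy list type. Everything below (OidList and its helpers) is a stand-in written for this example and is not the pg_list.h API; the point is only that each temporary list built for the check is released before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;

/* Toy replacement for a PostgreSQL OID list: a heap-allocated array. */
typedef struct OidList
{
    Oid        *items;
    size_t      length;
} OidList;

static OidList *
oid_list_make(const Oid *items, size_t length)
{
    OidList    *list = malloc(sizeof(OidList));

    assert(list != NULL);
    list->items = malloc(length > 0 ? length * sizeof(Oid) : 1);
    assert(list->items != NULL);
    if (length > 0)
        memcpy(list->items, items, length * sizeof(Oid));
    list->length = length;
    return list;
}

static void
oid_list_free(OidList *list)
{
    if (list != NULL)
    {
        free(list->items);
        free(list);
    }
}

static bool
oid_list_member(const OidList *list, Oid oid)
{
    for (size_t i = 0; i < list->length; i++)
        if (list->items[i] == oid)
            return true;
    return false;
}

/*
 * Mirrors the condition in the quoted hunk: the ancestor counts as published
 * if its schema is in the publication, or if the publication is FOR ALL
 * TABLES and does not list the ancestor as an except table.
 */
bool
ancestor_is_published(Oid puboid, bool puballtables,
                      const Oid *schema_pubs, size_t nschema,
                      const Oid *except_pubs, size_t nexcept)
{
    OidList    *aschemaPubids = oid_list_make(schema_pubs, nschema);
    OidList    *aexceptpubids = oid_list_make(except_pubs, nexcept);
    bool        published;

    published = oid_list_member(aschemaPubids, puboid) ||
        (puballtables && !oid_list_member(aexceptpubids, puboid));

    /* Free both temporary lists, including the newly added except list. */
    oid_list_free(aschemaPubids);
    oid_list_free(aexceptpubids);
    return published;
}
```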


Best Regards,
    Takamichi Osumi

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

April 28, 2022, 11:31:53

On Fri, Apr 22, 2022 at 9:39 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
>
> The feature seems to be useful especially when there are lots of
> schemas in a database. However, I don't quite like the syntax. Do we
> have 'SKIP' identifier in any of the SQL statements in SQL standard?
>

After discussion, it seems EXCEPT is a preferred choice and the same
is used in the other existing syntax as well.

> Can we think of adding skip_schema_list as an option, something like
> below?
>
> CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
> ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
> ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
>

Yeah, that is also an option but it seems it will be difficult to
extend if we want to support "all columns except (c1, ..)" for the
column list feature.

The other thing to decide is which objects should support the EXCEPT
clause, as it may not be useful for everything, as indicated by
Peter E. and Euler. We have seen that Oracle supports "all columns
except (c1, ..)" [1] and MySQL seems to support it for tables [2]. I
guess we should restrict ourselves to those two cases for now and then
we can extend it later for schemas if required or if people agree. Also,
we should make sure the syntax we choose here is extensible.

Another idea that occurred to me today for tables is as follows:
1. Allow to mention except during create publication ... For All Tables.
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
2. Allow to Reset it. This new syntax will reset all objects in the
publications.
Alter Publication ... RESET;
3. Allow to add it to an existing publication
Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];

I think it can be extended in a similar way for schema syntax as well.
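
The three steps above can be pictured as operations on a small state record. This is a sketch under the assumption that a publication's table selection is just an "all tables" flag plus an except list; the struct and function names are hypothetical, and RESET is modelled only as far as the message describes it (removing all objects).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXCEPT 8

/* Hypothetical in-memory picture of a publication's table selection. */
typedef struct Publication
{
    bool        all_tables;
    const char *except_tables[MAX_EXCEPT];
    size_t      nexcept;
} Publication;

/* ALTER PUBLICATION ... RESET: remove every object from the publication. */
void
publication_reset(Publication *pub)
{
    assert(pub != NULL);
    pub->all_tables = false;
    pub->nexcept = 0;
}

/* ALTER PUBLICATION ... ADD ALL TABLES [EXCEPT TABLE ...]. */
bool
publication_add_all_tables(Publication *pub, const char *const *except,
                           size_t nexcept)
{
    assert(pub != NULL);
    if (nexcept > MAX_EXCEPT)
        return false;
    pub->all_tables = true;
    for (size_t i = 0; i < nexcept; i++)
        pub->except_tables[i] = except[i];
    pub->nexcept = nexcept;
    return true;
}

/* A table is published if ALL TABLES is set and it is not excepted. */
bool
publication_includes(const Publication *pub, const char *table)
{
    if (!pub->all_tables)
        return false;
    for (size_t i = 0; i < pub->nexcept; i++)
        if (strcmp(pub->except_tables[i], table) == 0)
            return false;
    return true;
}
```

Changing the except list then becomes a RESET followed by ADD ALL TABLES EXCEPT TABLE with the new list, rather than a separate ADD EXCEPT or DROP EXCEPT form.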

[1] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20
[2] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html

-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

vignesh C

Date:

April 29, 2022, 11:42:59

On Thu, Apr 28, 2022 at 4:50 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, the attached v3 patch has the changes for the same.
> Hi
>
> Thank you for updating the patch. Several minor comments on v3.
>
> (1) commit message
>
> The new syntax allows specifying schemas. For example:
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> OR
> ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2;
>
> We have above sentence, but it looks better
> to make the description a bit more accurate.
>
> Kindly change
> From :
> "The new syntax allows specifying schemas"
> To :
> "The new syntax allows specifying excluded relations"
>
> Also, kindly change "OR" to "or",
> because this description is not syntax.

Slightly reworded and modified

> (2) publication_add_relation
>
> @@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
>                 ObjectIdGetDatum(pubid);
>         values[Anum_pg_publication_rel_prrelid - 1] =
>                 ObjectIdGetDatum(relid);
> +       values[Anum_pg_publication_rel_prexcept - 1] =
> +               BoolGetDatum(pri->except);
> +
>
>         /* Add qualifications, if available */
>
> It would be better to remove the blank line,
> because with this change, we'll have two blank
> lines in a row.

Modified

> (3) pg_dump.h & pg_dump_sort.c
>
> @@ -80,6 +80,7 @@ typedef enum
>         DO_REFRESH_MATVIEW,
>         DO_POLICY,
>         DO_PUBLICATION,
> +       DO_PUBLICATION_EXCEPT_REL,
>         DO_PUBLICATION_REL,
>         DO_PUBLICATION_TABLE_IN_SCHEMA,
>         DO_SUBSCRIPTION
>
> @@ -90,6 +90,7 @@ enum dbObjectTypePriorities
>         PRIO_FK_CONSTRAINT,
>         PRIO_POLICY,
>         PRIO_PUBLICATION,
> +       PRIO_PUBLICATION_EXCEPT_REL,
>         PRIO_PUBLICATION_REL,
>         PRIO_PUBLICATION_TABLE_IN_SCHEMA,
>         PRIO_SUBSCRIPTION,
> @@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] =
>         PRIO_REFRESH_MATVIEW,           /* DO_REFRESH_MATVIEW */
>         PRIO_POLICY,                            /* DO_POLICY */
>         PRIO_PUBLICATION,                       /* DO_PUBLICATION */
> +       PRIO_PUBLICATION_EXCEPT_REL,    /* DO_PUBLICATION_EXCEPT_REL */
>         PRIO_PUBLICATION_REL,           /* DO_PUBLICATION_REL */
>         PRIO_PUBLICATION_TABLE_IN_SCHEMA,       /* DO_PUBLICATION_TABLE_IN_SCHEMA */
>         PRIO_SUBSCRIPTION                       /* DO_SUBSCRIPTION */
>
> How about using the same order in
> pg_dump.h and pg_dump_sort.c, i.e.
> adding DO_PUBLICATION_EXCEPT_REL
> after DO_PUBLICATION_REL in pg_dump.h?
>

Modified
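
The ordering concern in comment (3) comes from the priority table being indexed by the object-type enum, so the two lists have to stay in step. The following is a minimal, self-contained sketch of that relationship, not the pg_dump code itself: the enums are trimmed-down stand-ins for the ones quoted above, and the DO_NUM_TYPES sentinel, the designated initializers, and the object_type_priority() helper are assumptions made for illustration only.

```c
/*
 * Illustrative sketch only: trimmed-down stand-ins for the pg_dump enums
 * quoted above. DO_NUM_TYPES, the designated-initializer layout and
 * object_type_priority() are hypothetical, not part of the patch.
 */
typedef enum
{
    DO_PUBLICATION,
    DO_PUBLICATION_REL,
    DO_PUBLICATION_EXCEPT_REL,
    DO_PUBLICATION_TABLE_IN_SCHEMA,
    DO_SUBSCRIPTION,
    DO_NUM_TYPES                /* hypothetical sentinel: number of types */
} DumpableObjectType;

enum dbObjectTypePriorities
{
    PRIO_PUBLICATION = 1,
    PRIO_PUBLICATION_REL,
    PRIO_PUBLICATION_EXCEPT_REL,
    PRIO_PUBLICATION_TABLE_IN_SCHEMA,
    PRIO_SUBSCRIPTION
};

/*
 * Each slot is tied to its enum value by name, so reordering the enum
 * cannot silently pair an object type with the wrong priority.
 */
static const int dbObjectTypePriority[DO_NUM_TYPES] = {
    [DO_PUBLICATION] = PRIO_PUBLICATION,
    [DO_PUBLICATION_REL] = PRIO_PUBLICATION_REL,
    [DO_PUBLICATION_EXCEPT_REL] = PRIO_PUBLICATION_EXCEPT_REL,
    [DO_PUBLICATION_TABLE_IN_SCHEMA] = PRIO_PUBLICATION_TABLE_IN_SCHEMA,
    [DO_SUBSCRIPTION] = PRIO_SUBSCRIPTION
};

/* Look up the sort priority for one object type. */
int
object_type_priority(DumpableObjectType type)
{
    return dbObjectTypePriority[type];
}
```

With a positional initializer, as in the quoted diff, the same guarantee depends on keeping both lists in identical order by hand, which is what the review comment asks for.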

> (4) GetAllTablesPublicationRelations
>
> +       /*
> +        * pg_publication_rel and pg_publication_namespace  will only have except
> +        * tables in case of all tables publication, no need to pass except flag
> +        * to get the relations.
> +        */
> +       List       *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> +
>
> There is one unnecessary space in a comment
> "...pg_publication_namespace  will only have...". Kindly remove it.
>
> Then, how about dividing the variable declaration and
> the assignment of the return value of GetPublicationRelations ?
> That might be aligned with other places in this file.

Modified

> (5) GetTopMostAncestorInPublication
>
>
> @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
>         foreach(lc, ancestors)
>         {
>                 Oid                     ancestor = lfirst_oid(lc);
> -               List       *apubids = GetRelationPublications(ancestor);
> +               List       *apubids = GetRelationPublications(ancestor, false);
>                 List       *aschemaPubids = NIL;
> +               List       *aexceptpubids;
>
>                 level++;
>
> @@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
>                 else
>                 {
>                         aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> -                       if (list_member_oid(aschemaPubids, puboid))
> +                       aexceptpubids = GetRelationPublications(ancestor, true);
> +                       if (list_member_oid(aschemaPubids, puboid) ||
> +                               (puballtables && !list_member_oid(aexceptpubids, puboid)))
>                         {
>                                 topmost_relid = ancestor;
>
> It seems we forgot to call list_free for "aexceptpubids".

Modified
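
The condition changed in comment (5) can be modelled in isolation. The sketch below is a simplified stand-in, not the PostgreSQL function: the AncestorInfo struct, its flags, and topmost_ancestor_in_publication() are hypothetical, and catalog lookups are replaced by precomputed booleans. It assumes ancestors are ordered from nearest parent to root, so the last match is the topmost one. The real code builds lists per ancestor, which is why the review asks for list_free on "aexceptpubids"; this model allocates nothing.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical per-ancestor summary replacing the catalog lookups. */
typedef struct AncestorInfo
{
    Oid  relid;
    bool in_rel_list;       /* explicitly added to the publication */
    bool in_schema_list;    /* its schema is published */
    bool in_except_list;    /* named in the EXCEPT TABLE list */
} AncestorInfo;

/*
 * Return the topmost ancestor that the publication covers, or InvalidOid.
 * An ancestor counts as published if it is listed explicitly, if its
 * schema is published, or if the publication is FOR ALL TABLES and the
 * ancestor is not excluded.
 */
Oid
topmost_ancestor_in_publication(const AncestorInfo *ancestors, size_t n,
                                bool puballtables, int *ancestor_level)
{
    Oid topmost = InvalidOid;
    int level = 0;

    for (size_t i = 0; i < n; i++)
    {
        const AncestorInfo *a = &ancestors[i];

        level++;
        if (a->in_rel_list || a->in_schema_list ||
            (puballtables && !a->in_except_list))
        {
            topmost = a->relid;
            if (ancestor_level)
                *ancestor_level = level;
        }
    }
    return topmost;
}
```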

The attached v4 patch has the changes for the same.

Regards,
Vignesh

Attachments

v4-0001-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch

Re: Skipping schema changes in publication

From

Peter Smith

Date:

May 3, 2022, 08:54:33

On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
> Another idea that occurred to me today for tables this is as follows:
> 1. Allow to mention except during create publication ... For All Tables.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> 2. Allow to Reset it. This new syntax will reset all objects in the
> publications.
> Alter Publication ... RESET;
> 3. Allow to add it to an existing publication
> Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
>
> I think it can be extended in a similar way for schema syntax as well.
>

Consider if the user does
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4;

What does it mean?
e.g. Is there only one exception list that is modified? Or did the ADD
ALL TABLES override all meaning of the original list?
e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4?

~~~

Here is a similar example, where the ADD TABLE seems confusing to me
when it intersects with a prior EXCEPT
e.g.
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok
ALTER PUBLICATION pub1 ADD TABLE t1; ???

What does it mean?
e.g. Does the explicit ADD TABLE override the original exception list?
e.g. Is t1 published now or should that ALTER have caused an error?

~~

It feels like there are too many tricky rules when using EXCEPT with
ALTER PUBLICATION. I guess complexities can be described in the
documentation but IMO it would be better if the ALTER syntax could be
unambiguous in the first place. So perhaps the rules should be more
restrictive (e.g. just disallow ALTER ... ADD any table that overlaps
the existing EXCEPT list ??)

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

May 4, 2022, 04:14:53

On Tue, May 3, 2022 at 2:24 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> ...
> > Another idea that occurred to me today for tables this is as follows:
> > 1. Allow to mention except during create publication ... For All Tables.
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > 2. Allow to Reset it. This new syntax will reset all objects in the
> > publications.
> > Alter Publication ... RESET;
> > 3. Allow to add it to an existing publication
> > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> >
> > I think it can be extended in a similar way for schema syntax as well.
> >
>
> Consider if the user does
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4;
>
> What does it mean?
> e.g. Is there only one exception list that is modified? Or did the ADD
> ALL TABLES override all meaning of the original list?
> e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4?
>

This won't be allowed. We won't allow changing ALL TABLES publication
unless the user first performs RESET. This is the purpose of providing
the RESET variant.

> ~~~
>
> Here is a similar example, where the ADD TABLE seems confusing to me
> when it intersects with a prior EXCEPT
> e.g.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok
> ALTER PUBLICATION pub1 ADD TABLE t1; ???
>
> What does it mean?
> e.g. Does the explicit ADD TABLE override the original exception list?
> e.g. Is t1 published now or should that ALTER have caused an error?
>

This won't be allowed either. We don't allow to Add/Drop from All
Tables publication unless the user performs a RESET. This is true even
today except that we don't have a RESET syntax.
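
The rule described in both answers can be written as a single predicate. This is a minimal sketch under stated assumptions: PublicationModel, AlterKind, alter_allowed() and apply_reset() are hypothetical names invented here, only the ALL TABLES flag is modelled, and privilege checks are left out.

```c
#include <stdbool.h>

/* Hypothetical model: only the ALL TABLES flag matters for this rule. */
typedef struct PublicationModel
{
    bool alltables;
} PublicationModel;

typedef enum
{
    ALTER_ADD_OBJECTS,
    ALTER_DROP_OBJECTS,
    ALTER_RESET
} AlterKind;

/*
 * ADD and DROP are rejected while the publication is FOR ALL TABLES
 * (with or without an EXCEPT list); RESET is always accepted.
 */
bool
alter_allowed(const PublicationModel *pub, AlterKind kind)
{
    if (kind == ALTER_RESET)
        return true;
    return !pub->alltables;
}

/* RESET clears the ALL TABLES flag, which re-enables ADD and DROP. */
void
apply_reset(PublicationModel *pub)
{
    pub->alltables = false;
}
```

Under this model both of the ALTER statements questioned above are rejected, and the only way to change the exception list is RESET followed by a fresh ADD ALL TABLES ... EXCEPT.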

> ~~
>
> It feels like there are too many tricky rules when using EXCEPT with
> ALTER PUBLICATION. I guess complexities can be described in the
> documentation but IMO it would be better if the ALTER syntax could be
> unambiguous in the first place.
>

Agreed.

> So perhaps the rules should be more
> restrictive (e.g. just disallow ALTER ... ADD any table that overlaps
> the existing EXCEPT list ??)
>

I think the current proposal seems to be restrictive enough to avoid
any tricky issues. Do you see any other problem?


-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

Peter Eisentraut

Date:

May 4, 2022, 13:34:54

On 14.04.22 15:47, Peter Eisentraut wrote:
> That said, I'm not sure this feature is worth the trouble.  If this is 
> useful, what about "whole database except these schemas"?  What about 
> "create this database from this template except these schemas".  This 
> could get out of hand.  I think we should encourage users to group their 
> objects the way they want and not offer these complicated negative 
> selection mechanisms.

Another problem in general with this "all except these" way of 
specifying things is that you need to track negative dependencies.

For example, assume you can't add a table to a publication unless it has 
a replica identity.  Now, if you have a publication p1 that says it 
includes "all tables except t1", you now have to check p1 whenever a new 
table is created, even though the new table has no direct dependency 
link with p1.  So in more general cases, you would have to check all 
existing objects to see whether their specification is in conflict with 
the new object being created.

Now publications don't actually work that way, so it's not a real 
problem right now, but similar things could work like that.  So I think 
it's worth thinking this through a bit.
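
To make the "negative dependency" concrete, here is a small hypothetical model, not PostgreSQL code: the Pub struct and both functions are invented for illustration. With an "all except" publication, a table is covered simply by not being listed, so a newly created table has no catalog entry pointing at the publication, and any creation-time check would have to scan every publication.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical publication model for this sketch. */
typedef struct Pub
{
    bool        alltables;      /* FOR ALL TABLES [EXCEPT ...] */
    const Oid  *except_rels;    /* tables named in EXCEPT TABLE */
    size_t      n_except;
    const Oid  *included_rels;  /* tables named explicitly (FOR TABLE) */
    size_t      n_included;
} Pub;

static bool
oid_in(const Oid *list, size_t n, Oid relid)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == relid)
            return true;
    return false;
}

/* Positive membership is a lookup; negative membership is "not excluded". */
bool
publication_covers(const Pub *pub, Oid relid)
{
    if (pub->alltables)
        return !oid_in(pub->except_rels, pub->n_except, relid);
    return oid_in(pub->included_rels, pub->n_included, relid);
}

/*
 * What a creation-time check would need: a pass over all publications,
 * because nothing links the new table to the ones that cover it.
 */
size_t
count_publications_covering(const Pub *pubs, size_t npubs, Oid new_relid)
{
    size_t count = 0;

    for (size_t i = 0; i < npubs; i++)
        if (publication_covers(&pubs[i], new_relid))
            count++;
    return count;
}
```

The same scan is already implied by a plain FOR ALL TABLES publication, since those also cover new tables without any per-table catalog entry.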

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

May 5, 2022, 03:50:36

On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 14.04.22 15:47, Peter Eisentraut wrote:
> > That said, I'm not sure this feature is worth the trouble.  If this is
> > useful, what about "whole database except these schemas"?  What about
> > "create this database from this template except these schemas".  This
> > could get out of hand.  I think we should encourage users to group their
> > object the way they want and not offer these complicated negative
> > selection mechanisms.
>
> Another problem in general with this "all except these" way of
> specifying things is that you need to track negative dependencies.
>
> For example, assume you can't add a table to a publication unless it has
> a replica identity.  Now, if you have a publication p1 that says
> includes "all tables except t1", you now have to check p1 whenever a new
> table is created, even though the new table has no direct dependency
> link with p1.  So in more general cases, you would have to check all
> existing objects to see whether their specification is in conflict with
> the new object being created.
>

Yes, I think we should avoid adding such negative dependencies. We
have carefully avoided such dependencies during row filter, column
list work where we don't try to perform DDL time verification.
However, it is not clear to me how this proposal is related to this
example or, in general, to tracking negative dependencies. AFAIR, we
currently have such a check while changing persistence of logged table
(logged to unlogged, see ATPrepChangePersistence) where we cannot
allow changing persistence if that relation is part of some
publication. But as per my understanding, this feature shouldn't add
any such new dependencies. I agree that we have to ensure that
existing checks shouldn't break due to this feature.

> Now publications don't actually work that way, so it's not a real
> problem right now, but similar things could work like that.  So I think
> it's worth thinking this through a bit.
>

This is a good point and I agree that we should be careful to not add
some new negative dependencies unless it is really required but I
can't see how this proposal will make it more prone to such checks.

-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

May 5, 2022, 04:12:32

On Thu, May 5, 2022 at 9:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 14.04.22 15:47, Peter Eisentraut wrote:
> > > That said, I'm not sure this feature is worth the trouble.  If this is
> > > useful, what about "whole database except these schemas"?  What about
> > > "create this database from this template except these schemas".  This
> > > could get out of hand.  I think we should encourage users to group their
> > > object the way they want and not offer these complicated negative
> > > selection mechanisms.
> >
> > Another problem in general with this "all except these" way of
> > specifying things is that you need to track negative dependencies.
> >
> > For example, assume you can't add a table to a publication unless it has
> > a replica identity.  Now, if you have a publication p1 that says
> > includes "all tables except t1", you now have to check p1 whenever a new
> > table is created, even though the new table has no direct dependency
> > link with p1.  So in more general cases, you would have to check all
> > existing objects to see whether their specification is in conflict with
> > the new object being created.
> >
>
> Yes, I think we should avoid adding such negative dependencies. We
> have carefully avoided such dependencies during row filter, column
> list work where we don't try to perform DDL time verification.
> However, it is not clear to me how this proposal is related to this
> example or in general about tracking negative dependencies?
>

I mean to say that even if we have such a restriction, it would apply
to "for all tables" or other publications as well. In your example,
consider one wants to Alter a table and remove its replica identity,
we have to check whether the table is part of any publication similar
to what we are doing for relation persistence in
ATPrepChangePersistence.

> AFAIR, we
> currently have such a check while changing persistence of logged table
> (logged to unlogged, see ATPrepChangePersistence) where we cannot
> allow changing persistence if that relation is part of some
> publication. But as per my understanding, this feature shouldn't add
> any such new dependencies. I agree that we have to ensure that
> existing checks shouldn't break due to this feature.
>
> > Now publications don't actually work that way, so it's not a real
> > problem right now, but similar things could work like that.  So I think
> > it's worth thinking this through a bit.
> >
>
> This is a good point and I agree that we should be careful to not add
> some new negative dependencies unless it is really required but I
> can't see how this proposal will make it more prone to such checks.
>

-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

Peter Smith

Date:

May 6, 2022, 02:35:16

On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
>
> Another idea that occurred to me today for tables this is as follows:
> 1. Allow to mention except during create publication ... For All Tables.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> 2. Allow to Reset it. This new syntax will reset all objects in the
> publications.
> Alter Publication ... RESET;
> 3. Allow to add it to an existing publication
> Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
>
> I think it can be extended in a similar way for schema syntax as well.
>

If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
objects in the publication then there still seems to be no simple way to
remove only the EXCEPT list but leave everything else intact. IIUC to clear
just the EXCEPT list would require a 2 step process - 1) ALTER ...
RESET then 2) ALTER ... ADD ALL TABLES again.

I was wondering if it might be useful to have a variation that *only*
resets the EXCEPT list, but still leaves everything else as-is?

So, instead of:
ALTER PUBLICATION pubname RESET

use a syntax something like:
ALTER PUBLICATION pubname RESET {ALL | EXCEPT}
or
ALTER PUBLICATION pubname RESET [EXCEPT]
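
The difference between the two variants can be sketched as follows. This is a hypothetical model only: ResetKind, PubState and reset_publication() are names made up for this example, object lists are reduced to counts, and publication options are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset variants matching the syntax suggested above. */
typedef enum
{
    RESET_ALL,          /* ALTER PUBLICATION pubname RESET [ALL] */
    RESET_EXCEPT_ONLY   /* ALTER PUBLICATION pubname RESET EXCEPT */
} ResetKind;

/* Hypothetical publication state; lists are reduced to their lengths. */
typedef struct PubState
{
    bool   alltables;
    size_t n_except;    /* entries in the EXCEPT list */
    size_t n_tables;    /* explicitly published tables */
    size_t n_schemas;   /* published schemas */
} PubState;

void
reset_publication(PubState *pub, ResetKind kind)
{
    /* Both variants clear the exception list. */
    pub->n_except = 0;
    if (kind == RESET_EXCEPT_ONLY)
        return;         /* everything else is left as-is */

    /* Full reset: back to an empty, non-ALL TABLES publication. */
    pub->alltables = false;
    pub->n_tables = 0;
    pub->n_schemas = 0;
}
```

With only RESET_ALL available, reaching the RESET_EXCEPT_ONLY end state takes the two-step sequence described above: a full reset followed by ADD ALL TABLES.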

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

vignesh C

Date:

May 10, 2022, 03:38:48

On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> ...
> >
> > Another idea that occurred to me today for tables this is as follows:
> > 1. Allow to mention except during create publication ... For All Tables.
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > 2. Allow to Reset it. This new syntax will reset all objects in the
> > publications.
> > Alter Publication ... RESET;
> > 3. Allow to add it to an existing publication
> > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> >
> > I think it can be extended in a similar way for schema syntax as well.
> >
>
> If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
> objects in the publication then there still seems simple way to remove
> only the EXCEPT list but leave everything else intact. IIUC to clear
> just the EXCEPT list would require a 2 step process - 1) ALTER ...
> RESET then 2) ALTER ... ADD ALL TABLES again.
>
> I was wondering if it might be useful to have a variation that *only*
> resets the EXCEPT list, but still leaves everything else as-is?
>
> So, instead of:
> ALTER PUBLICATION pubname RESET

+1 for this syntax, as it can be extended later to include options
such as EXCEPT or ALL.
For now we can support this syntax and extend it later based
on the requirements.

The new feature will handle the various use cases based on the
behavior given below:
-- CREATE Publication with EXCEPT TABLE syntax
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; -- ok
Alter Publication pub1 RESET;
-- All Tables and options are reset similar to creating publication
-- without any publication object and publication option (create
-- publication pub1)
\dRp+ pub1
                               Publication pub1
  Owner  | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------+------------+---------+---------+---------+-----------+----------
 vignesh | f          | t       | t       | t       | t         | f
(1 row)

-- Can add except table after reset of publication
ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2; -- ok

-- Cannot add except table without reset of publication
ALTER PUBLICATION pub1 Add EXCEPT TABLE t3,t4; -- not ok, need to be reset

Alter Publication pub1 RESET;
-- Cannot add table to ALL TABLES Publication
ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2, t3, t4,
TABLE t5; -- not ok, ALL TABLES Publications does not support
-- including of TABLES

Alter Publication pub1 RESET;
-- Cannot add table to ALL TABLES Publication
ALTER PUBLICATION pub1 Add ALL TABLES TABLE t1,t2; -- not ok, ALL
-- TABLES Publications does not support including of TABLES

-- Cannot add ALL TABLES IN SCHEMA to ALL TABLES Publication
ALTER PUBLICATION pub1 Add ALL TABLES ALL TABLES IN SCHEMA sch1, sch2;
-- not ok, ALL TABLES Publications does not support including of ALL
-- TABLES IN SCHEMA

-- Existing syntax should work as it is
CREATE PUBLICATION pub1 FOR TABLE t1;
ALTER PUBLICATION pub1 ADD TABLE t1; -- ok, existing ALTER should work
-- as it is (ok without reset)
ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1; -- ok, existing
-- ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 DROP TABLE t1; -- ok, existing ALTER should
-- work as it is (ok without reset)
ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1; -- ok, existing
-- ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 SET TABLE t1; -- ok, existing ALTER should work
-- as it is (ok without reset)
ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1; -- ok, existing
-- ALTER should work as it is (ok without reset)

I will modify the patch to handle this.

Regards,
Vignesh

Re: Skipping schema changes in publication

From

vignesh C

Date:

May 12, 2022, 04:24:39

On Tue, May 10, 2022 at 9:08 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > ...
> > >
> > > Another idea that occurred to me today for tables this is as follows:
> > > 1. Allow to mention except during create publication ... For All Tables.
> > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > > 2. Allow to Reset it. This new syntax will reset all objects in the
> > > publications.
> > > Alter Publication ... RESET;
> > > 3. Allow to add it to an existing publication
> > > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> > >
> > > I think it can be extended in a similar way for schema syntax as well.
> > >
> >
> > If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
> > objects in the publication then there still seems simple way to remove
> > only the EXCEPT list but leave everything else intact. IIUC to clear
> > just the EXCEPT list would require a 2 step process - 1) ALTER ...
> > RESET then 2) ALTER ... ADD ALL TABLES again.
> >
> > I was wondering if it might be useful to have a variation that *only*
> > resets the EXCEPT list, but still leaves everything else as-is?
> >
> > So, instead of:
> > ALTER PUBLICATION pubname RESET
>
> +1 for this syntax as this syntax can be extendable to include options
> like (except/all/etc) later.
> Currently we can support this syntax and can be extended later based
> on the requirements.

The attached patch has the implementation for "ALTER PUBLICATION
pubname RESET". This command will reset the publication to default
state which includes resetting the publication options, setting ALL
TABLES option to false and dropping the relations and schemas that are
associated with the publication.

Regards,
Vignesh

Attachments

v1-0001-Add-RESET-option-to-Alter-Publication-which-will-.patch

Re: Skipping schema changes in publication

From

Peter Smith

Date:

May 13, 2022, 04:07:17

On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
...
> The attached patch has the implementation for "ALTER PUBLICATION
> pubname RESET". This command will reset the publication to default
> state which includes resetting the publication options, setting ALL
> TABLES option to false and dropping the relations and schemas that are
> associated with the publication.
>

Please see below my review comments for the v1-0001 (RESET) patch

======

1. Commit message

This patch adds a new RESET option to ALTER PUBLICATION which

Wording: "RESET option" -> "RESET clause"

~~~

2. doc/src/sgml/ref/alter_publication.sgml

+  <para>
+   The <literal>RESET</literal> clause will reset the publication to default
+   state which includes resetting the publication options, setting
+   <literal>ALL TABLES</literal> option to <literal>false</literal> and drop the
+   relations and schemas that are associated with the publication.
   </para>

2a. Wording: "to default state" -> "to the default state"

2b. Wording: "and drop the relations..." -> "and dropping all relations..."

~~~

3. doc/src/sgml/ref/alter_publication.sgml

+   invoking user to be a superuser.  <literal>RESET</literal> of publication
+   requires invoking user to be a superuser. To alter the owner, you must also

Wording: "requires invoking user" -> "requires the invoking user"

~~~

4. doc/src/sgml/ref/alter_publication.sgml - Example

@@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales;
    <structname>production_publication</structname>:
 <programlisting>
 ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production;
+</programlisting></para>
+
+  <para>
+   Resetting the publication <structname>production_publication</structname>:
+<programlisting>
+ALTER PUBLICATION production_publication RESET;

Wording: "Resetting the publication" -> "Reset the publication"

~~~

5. src/backend/commands/publicationcmds.c

+ /* Check and reset the options */

IMO the code can just reset all these options unconditionally. I did
not see the point to check for existing option values first. I feel
the simpler code outweighs any negligible performance difference in
this case.

~~~

6. src/backend/commands/publicationcmds.c

+ /* Check and reset the options */

Somehow it seemed a pity having to hardcode all these default values
true/false in multiple places; e.g. the same is already hardcoded in
the parse_publication_options function.

To avoid multiple hard coded bools you could just call the
parse_publication_options with an empty options list. That would set
the defaults which you can then use:
values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert);

Alternatively, maybe there should be #defines to use instead of having
the scattered hardcoded bool defaults:
#define PUBACTION_DEFAULT_INSERT true
#define PUBACTION_DEFAULT_UPDATE true
etc

~~~

7. src/include/nodes/parsenodes.h

@@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction
 {
  AP_AddObjects, /* add objects to publication */
  AP_DropObjects, /* remove objects from publication */
- AP_SetObjects /* set list of objects */
+ AP_SetObjects, /* set list of objects */
+ AP_ReSetPublication /* reset the publication */
 } AlterPublicationAction;

Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication"

~~~

8. src/test/regress/sql/publication.sql

8a.
+-- Test for RESET PUBLICATION
SUGGESTED
+-- Tests for ALTER PUBLICATION ... RESET

8b.
+-- Verify that 'ALL TABLES' option is reset
SUGGESTED:
+-- Verify that 'ALL TABLES' flag is reset

8c.
+-- Verify that publish option and publish via root option is reset
SUGGESTED:
+-- Verify that publish options and publish_via_partition_root option are reset

8d.
+-- Verify that only superuser can execute RESET publication
SUGGESTED
+-- Verify that only superuser can reset a publication

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

vignesh C

Date:

May 14, 2022, 13:32:54

On Fri, May 13, 2022 at 9:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> ...
> > The attached patch has the implementation for "ALTER PUBLICATION
> > pubname RESET". This command will reset the publication to default
> > state which includes resetting the publication options, setting ALL
> > TABLES option to false and dropping the relations and schemas that are
> > associated with the publication.
> >
>
> Please see below my review comments for the v1-0001 (RESET) patch
>
> ======
>
> 1. Commit message
>
> This patch adds a new RESET option to ALTER PUBLICATION which
>
> Wording: "RESET option" -> "RESET clause"

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to default
> +   state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> option to <literal>false</literal>
> and drop the
> +   relations and schemas that are associated with the publication.
>    </para>
>
> 2a. Wording: "to default state" -> "to the default state"

Modified

> 2b. Wording: "and drop the relations..." -> "and dropping all relations..."

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml
>
> +   invoking user to be a superuser.  <literal>RESET</literal> of publication
> +   requires invoking user to be a superuser. To alter the owner, you must also
>
> Wording: "requires invoking user" -> "requires the invoking user"

Modified

> ~~~
>
> 4. doc/src/sgml/ref/alter_publication.sgml - Example
>
> @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL
> TABLES IN SCHEMA marketing, sales;
>     <structname>production_publication</structname>:
>  <programlisting>
>  ALTER PUBLICATION production_publication ADD TABLE users,
> departments, ALL TABLES IN SCHEMA production;
> +</programlisting></para>
> +
> +  <para>
> +   Resetting the publication <structname>production_publication</structname>:
> +<programlisting>
> +ALTER PUBLICATION production_publication RESET;
>
> Wording: "Resetting the publication" -> "Reset the publication"

Modified

> ~~~
>
> 5. src/backend/commands/publicationcmds.c
>
> + /* Check and reset the options */
>
> IMO the code can just reset all these options unconditionally. I did
> not see the point to check for existing option values first. I feel
> the simpler code outweighs any negligible performance difference in
> this case.

Modified

> ~~~
>
> 6. src/backend/commands/publicationcmds.c
>
> + /* Check and reset the options */
>
> Somehow it seemed a pity having to hardcode all these default values
> true/false in multiple places; e.g. the same is already hardcoded in
> the parse_publication_options function.
>
> To avoid multiple hard coded bools you could just call the
> parse_publication_options with an empty options list. That would set
> the defaults which you can then use:
> values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert);
>
> Alternatively, maybe there should be #defines to use instead of having
> the scattered hardcoded bool defaults:
> #define PUBACTION_DEFAULT_INSERT true
> #define PUBACTION_DEFAULT_UPDATE true
> etc

I have used #define for default value and used it in both the functions.
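
A minimal sketch of that approach is shown below. It is an illustration under assumptions, not the patch: only PUB_ACTION_INSERT_DEFAULT and PUB_ACTION_UPDATE_DEFAULT appear in this thread, so the other macro names, the structs, and both functions are hypothetical. The default values follow the \dRp+ output shown earlier for a freshly reset publication.

```c
#include <stdbool.h>
#include <stddef.h>

/* One place for the defaults, shared by option parsing and RESET. */
#define PUB_ACTION_INSERT_DEFAULT   true
#define PUB_ACTION_UPDATE_DEFAULT   true
#define PUB_ACTION_DELETE_DEFAULT   true    /* assumed name */
#define PUB_ACTION_TRUNCATE_DEFAULT true    /* assumed name */
#define PUB_VIA_ROOT_DEFAULT        false   /* assumed name */

/* Hypothetical stand-in for the publication options. */
typedef struct PubOptions
{
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
    bool pubtruncate;
    bool pubviaroot;
} PubOptions;

/* Hypothetical publication model; object lists are reduced to counts. */
typedef struct PubModel
{
    PubOptions opts;
    bool       alltables;
    size_t     n_relations;
    size_t     n_schemas;
} PubModel;

/* Used when no options are given and again when resetting. */
void
set_default_options(PubOptions *opts)
{
    opts->pubinsert = PUB_ACTION_INSERT_DEFAULT;
    opts->pubupdate = PUB_ACTION_UPDATE_DEFAULT;
    opts->pubdelete = PUB_ACTION_DELETE_DEFAULT;
    opts->pubtruncate = PUB_ACTION_TRUNCATE_DEFAULT;
    opts->pubviaroot = PUB_VIA_ROOT_DEFAULT;
}

/*
 * RESET as described in this thread: restore default options
 * unconditionally, clear the ALL TABLES flag, and drop the associated
 * relations and schemas.
 */
void
reset_publication(PubModel *pub)
{
    set_default_options(&pub->opts);
    pub->alltables = false;
    pub->n_relations = 0;
    pub->n_schemas = 0;
}
```

Resetting unconditionally, rather than comparing against the current values first, matches review comment 5.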

> ~~~
>
> 7. src/include/nodes/parsenodes.h
>
> @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction
>  {
>   AP_AddObjects, /* add objects to publication */
>   AP_DropObjects, /* remove objects from publication */
> - AP_SetObjects /* set list of objects */
> + AP_SetObjects, /* set list of objects */
> + AP_ReSetPublication /* reset the publication */
>  } AlterPublicationAction;
>
> Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication"

Modified

> ~~~
>
> 8. src/test/regress/sql/publication.sql
>
> 8a.
> +-- Test for RESET PUBLICATION
> SUGGESTED
> +-- Tests for ALTER PUBLICATION ... RESET

Modified

> 8b.
> +-- Verify that 'ALL TABLES' option is reset
> SUGGESTED:
> +-- Verify that 'ALL TABLES' flag is reset

Modified

> 8c.
> +-- Verify that publish option and publish via root option is reset
> SUGGESTED:
> +-- Verify that publish options and publish_via_partition_root option are reset

Modified

> 8d.
> +-- Verify that only superuser can execute RESET publication
> SUGGESTED
> +-- Verify that only superuser can reset a publication

Modified

Thanks for the comments, the attached v5 patch has the changes for the
same. Also I have made the changes for SKIP Table based on the new
syntax, the changes for the same are available in
v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.

Regards,
Vignesh

On Mon, May 16, 2022 at 8:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, the attached v5 patch has the changes for the same.
> > Also I have made the changes for SKIP Table based on the new syntax, the
> > changes for the same are available in
> > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
> Hi,
>
>
> Thank you for updating the patch.
> I'll share a few minor review comments on v5-0001.
>
>
> (1) doc/src/sgml/ref/alter_publication.sgml
>
> @@ -73,12 +85,13 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
>     Adding a table to a publication additionally requires owning that table.
>     The <literal>ADD ALL TABLES IN SCHEMA</literal> and
>     <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
> -   invoking user to be a superuser.  To alter the owner, you must also be a
> -   direct or indirect member of the new owning role. The new owner must have
> -   <literal>CREATE</literal> privilege on the database.  Also, the new owner
> -   of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
> -   SCHEMA</literal> publication must be a superuser. However, a superuser can
> -   change the ownership of a publication regardless of these restrictions.
> +   invoking user to be a superuser.  <literal>RESET</literal> of publication
> +   requires the invoking user to be a superuser. To alter the owner, you must
> ...
>
>
> I suggest to combine the first part of your change with one existing sentence
> before your change, to make our description concise.
>
> FROM:
> "The <literal>ADD ALL TABLES IN SCHEMA</literal> and
> <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
> invoking user to be a superuser.  <literal>RESET</literal> of publication
> requires the invoking user to be a superuser."
>
> TO:
> "The <literal>ADD ALL TABLES IN SCHEMA</literal>,
> <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and
> <literal>RESET</literal> of publication requires the invoking user to be a superuser."

Modified

>
> (2) typo
>
> +++ b/src/backend/commands/publicationcmds.c
> @@ -53,6 +53,13 @@
>  #include "utils/syscache.h"
>  #include "utils/varlena.h"
>
> +#define PUB_ATION_INSERT_DEFAULT true
> +#define PUB_ACTION_UPDATE_DEFAULT true
>
>
> Kindly change
> FROM:
> "PUB_ATION_INSERT_DEFAULT"
> TO:
> "PUB_ACTION_INSERT_DEFAULT"

Modified

>
> (3) src/test/regress/expected/publication.out
>
> +-- Verify that only superuser can reset a publication
> +ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
> +SET ROLE regress_publication_user2;
> +ALTER PUBLICATION testpub_reset RESET; -- fail
>
>
> We have "-- fail" for one case in this patch.
> On the other hand, isn't better to add "-- ok" (or "-- success") for
> other successful statements,
> when we consider the entire tests description consistency ?

We generally do not add success comments for all the successful cases,
as that might be overkill. I felt it is better to keep it as it is.
Thoughts?

The attached v6 patch has the changes for the same.

Regards,
Vignesh

On Thu, May 19, 2022 at 1:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v6-0001.
>
> ======
>
> 1. General.
>
> The patch failed 'publication' tests in the make check phase.
>
> Please add this work to the commit-fest so that the 'cfbot' can report
> such errors sooner.

Added commitfest entry

> ~~~
>
> 2. src/backend/commands/publicationcmds.c - AlterPublicationReset
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, publication relations and
> publication schemas.
> + */
> +static void
> +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
> + Relation rel, HeapTuple tup)
>
> SUGGESTION (Make the comment similar to the sgml text instead of
> repeating "publication" 4x !)
> /*
>  * Reset the publication options, set the ALL TABLES flag to false, and
>  * drop all relations and schemas that are associated with the publication.
>  */

Modified

> ~~~
>
> 3. src/test/regress/expected/publication.out
>
> make check failed. The diff is below:
>
> @@ -1716,7 +1716,7 @@
>  -- Verify that only superuser can reset a publication
>  ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
>  SET ROLE regress_publication_user2;
> -ALTER PUBLICATION testpub_reset RESET; -- fail
> +ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser
>  ERROR:  must be superuser to RESET publication
>  SET ROLE regress_publication_user;
>  DROP PUBLICATION testpub_reset;

It passed for me locally because the change was present in the 002
patch. I have moved the change to 001.

The attached v7 patch has the changes for the same.
[1] - https://commitfest.postgresql.org/38/3646/

Regards,
Vignesh

On Sat, May 21, 2022 at 11:06 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, May 20, 2022 at 11:23 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Below are my review comments for v6-0002.
> >
> > ======
> >
> > 1. Commit message.
> > The psql \d family of commands to display excluded tables.
> >
> > SUGGESTION
> > The psql \d family of commands can now display excluded tables.
>
> Modified
>
> > ~~~
> >
> > 2. doc/src/sgml/ref/alter_publication.sgml
> >
> > @@ -22,6 +22,7 @@ PostgreSQL documentation
> >   <refsynopsisdiv>
> >  <synopsis>
> >  ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> > ADD <replaceable class="parameter">publication_object</replaceable> [,
> > ...]
> > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> > ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
> >
> > The "exception_object" font is wrong. Should look the same as
> > "publication_object"
>
> Modified
>
> > ~~~
> >
> > 3. doc/src/sgml/ref/alter_publication.sgml - Examples
> >
> > @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL
> > TABLES IN SCHEMA marketing, sales;
> >  </programlisting>
> >    </para>
> >
> > +  <para>
> > +   Alter publication <structname>production_publication</structname> to publish
> > +   all tables except <structname>users</structname> and
> > +   <structname>departments</structname> tables:
> > +<programlisting>
> > +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE
> > users, departments;
> > +</programlisting></para>
> >
> > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> > show TABLE keyword is optional.
>
> Modified
>
> > ~~~
> >
> > 4. doc/src/sgml/ref/create_publication.sgml
> >
> > An SGML tag error caused building the docs to fail. My fix was
> > previously reported [1].
>
> Modified
>
> > ~~~
> >
> > 5. doc/src/sgml/ref/create_publication.sgml
> >
> > @@ -22,7 +22,7 @@ PostgreSQL documentation
> >   <refsynopsisdiv>
> >  <synopsis>
> >  CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
> > -    [ FOR ALL TABLES
> > +    [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
> >
> > The "exception_object" font is wrong. Should look the same as
> > "publication_object"
>
> Modified
>
> > ~~~
> >
> > 6. doc/src/sgml/ref/create_publication.sgml - Examples
> >
> > @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR
> > TABLE users, departments, ALL TABL
> >  CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
> >  </programlisting></para>
> >
> > +  <para>
> > +   Create a publication that publishes all changes in all the tables except for
> > +   the changes of <structname>users</structname> and
> > +   <structname>departments</structname> table:
> > +<programlisting>
> > +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments;
> > +</programlisting>
> > +  </para>
> > +
> >
> > 6a.
> > Typo: "FOR ALL TABLE" -> "FOR ALL TABLES"
>
> Modified
>
> > 6b.
> > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> > show TABLE keyword is optional.
>
> Modified
>
> > ~~~
> >
> > 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
> >
> > @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List
> > *ancestors, int *ancestor_level
> >   }
> >   else
> >   {
> > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> > - if (list_member_oid(aschemaPubids, puboid))
> > + List    *aschemapubids = NIL;
> > + List    *aexceptpubids = NIL;
> > +
> > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
> > + aexceptpubids = GetRelationPublications(ancestor, true);
> > + if (list_member_oid(aschemapubids, puboid) ||
> > + (puballtables && !list_member_oid(aexceptpubids, puboid)))
> >   {
> >
> > You could re-write this as multiple conditions instead of one. That
> > could avoid always assigning the 'aexceptpubids', so it might be a
> > more efficient way to write this logic.
>
> Modified
>
> > ~~~
> >
> > 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues
> >
> > +/*
> > + * Check if the publication has default values
> > + *
> > + * Check the following:
> > + * Publication is having default options
> > + *  Publication is not associated with relations
> > + *  Publication is not associated with schemas
> > + *  Publication is not set with "FOR ALL TABLES"
> > + */
> > +static bool
> > +CheckPublicationDefValues(HeapTuple tup)
> >
> > 8a.
> > Remove the tab. Replace with spaces.
>
> Modified
>
> > 8b.
> > It might be better if this comment order is the same as the logic order.
> > e.g.
> >
> > * Check the following:
> > *  Publication is not set with "FOR ALL TABLES"
> > *  Publication is having default options
> > *  Publication is not associated with schemas
> > *  Publication is not associated with relations
>
> Modified
>
> > ~~~
> >
> > 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
> >
> > +/*
> > + * Reset the publication.
> > + *
> > + * Reset the publication options, publication relations and
> > publication schemas.
> > + */
> > +static void
> > +AlterPublicationSetAllTables(Relation rel, HeapTuple tup)
> >
> > The function comment and the function name do not seem to match here;
> > something looks like a cut/paste error ??
>
> Modified
>
> > ~~~
> >
> > 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
> >
> > + /* set all tables option */
> > + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true);
> > + replaces[Anum_pg_publication_puballtables - 1] = true;
> >
> > SUGGEST (comment)
> > /* set all ALL TABLES flag */
>
> Modified
>
> > ~~~
> >
> > 11. src/backend/catalog/pg_publication.c - AlterPublication
> >
> > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt)
> >   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
> >      stmt->pubname);
> >
> > + if (stmt->for_all_tables)
> > + {
> > + bool isdefault = CheckPublicationDefValues(tup);
> > +
> > + if (!isdefault)
> > + ereport(ERROR,
> > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> > default values",
> > +    stmt->pubname),
> > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> > The errmsg should start with a lowercase letter.
>
> Modified
>
> > ~~~
> >
> > 12. src/backend/catalog/pg_publication.c - AlterPublication
> >
> > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt)
> >   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
> >      stmt->pubname);
> >
> > + if (stmt->for_all_tables)
> > + {
> > + bool isdefault = CheckPublicationDefValues(tup);
> > +
> > + if (!isdefault)
> > + ereport(ERROR,
> > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> > default values",
> > +    stmt->pubname),
> > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> > Example test:
> >
> > postgres=# create table t1(a int);
> > CREATE TABLE
> > postgres=# create publication p1 for table t1;
> > CREATE PUBLICATION
> > postgres=# alter publication p1 add all tables except t1;
> > 2022-05-20 14:34:49.301 AEST [21802] ERROR:  Setting ALL TABLES
> > requires publication "p1" to have default values
> > 2022-05-20 14:34:49.301 AEST [21802] HINT:  Use ALTER PUBLICATION ...
> > RESET to reset the publication
> > 2022-05-20 14:34:49.301 AEST [21802] STATEMENT:  alter publication p1
> > add all tables except t1;
> > ERROR:  Setting ALL TABLES requires publication "p1" to have default values
> > HINT:  Use ALTER PUBLICATION ... RESET to reset the publication
> > postgres=# alter publication p1 set all tables except t1;
> >
> > That error message does not quite match what the user was doing.
> > Firstly, they were adding the ALL TABLES, not setting it. Secondly,
> > all the values of the publication were already defaults (only there
> > was an existing table t1 in the publication). Maybe some minor changes
> > to the message wording can be a better reflect what the user is doing
> > here.
>
> Modified
>
> > ~~~
> >
> > 13. src/backend/parser/gram.y
> >
> > @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE
> > aggregate_with_argtypes OWNER TO RoleSpec
> >   *
> >   * CREATE PUBLICATION name [WITH options]
> >   *
> > - * CREATE PUBLICATION FOR ALL TABLES [WITH options]
> > + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]]
> > [WITH options]
> >
> > Comment should show the "TABLE" keyword is optional
>
> Modified
>
> > ~~~
> >
> > 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable
> >
> > @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const
> > PublicationRelInfo *pubrinfo)
> >
> >   appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
> >     fmtId(pubinfo->dobj.name));
> > +
> >   appendPQExpBuffer(query, " %s",
> >     fmtQualifiedDumpable(tbinfo));
> >
> > This additional whitespace seems unrelated to this patch
>
> Modified
>
> > ~~~
> >
> > 15. src/include/nodes/parsenodes.h
> >
> > 15a.
> > @@ -3999,6 +3999,7 @@ typedef struct PublicationTable
> >   RangeVar   *relation; /* relation to be published */
> >   Node    *whereClause; /* qualifications */
> >   List    *columns; /* List of columns in a publication table */
> > + bool except; /* except relation */
> >  } PublicationTable;
> >
> > Maybe the comment should be more like similar ones:
> > /* exclude the relation */
>
> Modified
>
> > 15b.
> > @@ -4007,6 +4008,7 @@ typedef struct PublicationTable
> >  typedef enum PublicationObjSpecType
> >  {
> >   PUBLICATIONOBJ_TABLE, /* A table */
> > + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */
> >   PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */
> >   PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of
> >
> > Maybe the comment should be more like:
> > /* A table to be excluded */
>
> Modified
>
> > ~~~
> >
> > 16. src/test/regress/sql/publication.sql
> >
> > I did not see any test cases using EXCEPT when the optional TABLE
> > keyword is omitted.
>
> Added a test
>
> Thanks for the comments, the v7 patch attached at [1] has the changes
> for the same.
> [1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com

Attached v7 patch which fixes the buildfarm warning for an unused
warning in release mode as in  [1].
[1] - https://cirrus-ci.com/task/6220288017825792

Regards,
Vignesh

On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > Attached v7 patch which fixes the buildfarm warning for an unused warning in
> > release mode as in  [1].
> Hi, thank you for the patches.
>
>
> I'll share several review comments.
>
> For v7-0001.
>
> (1) I'll suggest some minor rewording.
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the
> +   default state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> +   dropping all relations and schemas that are associated with the publication.
>
> My suggestion is
> "The RESET clause will reset the publication to the
> default state. It resets the publication operations,
> sets ALL TABLES flag to false and drops all relations
> and schemas associated with the publication."

I felt the existing looks better. I would prefer to keep it that way.

> (2) typo and rewording
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, setting ALL TABLES flag to false and drop
> + * all relations and schemas that are associated with the publication.
> + */
>
> The "setting" in this sentence should be "set".
>
> How about changing like below ?
> FROM:
> "Reset the publication options, setting ALL TABLES flag to false and drop
> all relations and schemas that are associated with the publication."
> TO:
> "Reset the publication operations, set ALL TABLES flag to false and drop
> all relations and schemas associated with the publication."

 I felt the existing looks better. I would prefer to keep it that way.

> (3) AlterPublicationReset
>
> Do we need to call CacheInvalidateRelcacheAll() or
> InvalidatePublicationRels() at the end of
> AlterPublicationReset() like AlterPublicationOptions() ?

CacheInvalidateRelcacheAll should be called if we change all tables
from true to false, else the cache will not be invalidated. Modified

>
> For v7-0002.
>
> (4)
>
> +       if (stmt->for_all_tables)
> +       {
> +               bool            isdefault = CheckPublicationDefValues(tup);
> +
> +               if (!isdefault)
> +                       ereport(ERROR,
> +                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> +                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
> +                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
>
>
> The errmsg string has three messages for user and is a bit long
> (we have two sentences there connected by 'and').
> Can't we make it concise and split it into a couple of lines for code readability ?
>
> I'll suggest a change below.
> FROM:
> "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
> TO:
> "adding ALL TABLES requires the publication defined not for ALL TABLES"
> "to have default publish actions without any associated tables/schemas"

Added errdetail and split it

> (5) typo
>
>    <varlistentry>
> +    <term><literal>EXCEPT TABLE</literal></term>
> +    <listitem>
> +     <para>
> +      This clause specifies a list of tables to exclude from the publication.
> +      It can only be used with <literal>FOR ALL TABLES</literal>.
> +     </para>
> +    </listitem>
> +   </varlistentry>
> +
>
> Kindly change
> FROM:
> This clause specifies a list of tables to exclude from the publication.
> TO:
> This clause specifies a list of tables to be excluded from the publication.
> or
> This clause specifies a list of tables excluded from the publication.

Modified

> (6) Minor suggestion for an expression change
>
>        Marks the publication as one that replicates changes for all tables in
> -      the database, including tables created in the future.
> +      the database, including tables created in the future. If
> +      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
> +      the changes for the specified tables.
>
>
> I'll suggest a minor rewording.
> FROM:
> ...exclude replicating the changes for the specified tables
> TO:
> ...exclude replication changes for the specified tables

I felt the existing is better.

> (7)
> (7-1)
>
> +/*
> + * Check if the publication has default values
> + *
> + * Check the following:
> + * a) Publication is not set with "FOR ALL TABLES"
> + * b) Publication is having default options
> + * c) Publication is not associated with schemas
> + * d) Publication is not associated with relations
> + */
> +static bool
> +CheckPublicationDefValues(HeapTuple tup)
>
>
> I think this header comment can be improved.
> FROM:
> Check the following:
> TO:
> Returns true if the publication satisfies all the following conditions:

Modified

> (7-2)
>
> b) should be changed as well
> FROM:
> Publication is having default options
> TO:
> Publication has the default publish operations

Changed it to "Publication is having default publication parameter values"
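
The four conditions a) to d) from that header comment can be sketched as a single predicate. This is only a simplified, self-contained model and not the code in the patch: the `PubState` struct, its fields, and `publication_has_default_values` are hypothetical names, and it assumes the default parameter values are "publish all four actions" and "publish_via_partition_root off".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a publication; not the catalog layout. */
typedef struct PubState
{
    bool alltables;     /* FOR ALL TABLES flag */
    bool pubinsert;     /* publish INSERT (default: true) */
    bool pubupdate;     /* publish UPDATE (default: true) */
    bool pubdelete;     /* publish DELETE (default: true) */
    bool pubtruncate;   /* publish TRUNCATE (default: true) */
    bool pubviaroot;    /* publish_via_partition_root (default: false) */
    int  nschemas;      /* schemas associated with the publication */
    int  nrelations;    /* relations associated with the publication */
} PubState;

/* Returns true only if the publication satisfies all four conditions. */
bool
publication_has_default_values(const PubState *pub)
{
    assert(pub != NULL);

    /* a) Publication is not set with FOR ALL TABLES */
    if (pub->alltables)
        return false;

    /* b) Publication has the default publication parameter values */
    if (!pub->pubinsert || !pub->pubupdate ||
        !pub->pubdelete || !pub->pubtruncate || pub->pubviaroot)
        return false;

    /* c) Publication is not associated with schemas */
    if (pub->nschemas > 0)
        return false;

    /* d) Publication is not associated with relations */
    return pub->nrelations == 0;
}
```

Under this model, a publication created with "for table t1" (as in the earlier example test) fails condition d), so adding ALL TABLES is rejected until the publication is reset.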

Thanks for the comments, the attached v8 patch has the changes for the same.

Regards,
Vignesh

On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v8 patch has the changes for the same.
>

AFAICS, the summary of this proposal is that we want to support
excluding certain objects from a publication, with two kinds of
variants. The first variant is to add support for excluding specific
tables from an ALL TABLES publication. Without this feature, a user needs
to manually add all tables of a database even when they want to avoid
only a handful of tables, say because they contain
sensitive information or are not required. We have seen that other
databases like MySQL also provide a similar feature [1] (see
REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
follows:

CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
or
ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;

This will allow us to publish all the tables in the current database
except t1 and t2. Now, I see that pg_dump has a similar option
provided by switch --exclude-table but that allows tables matching
patterns which is not the case here. I am not sure if we need a
similar variant here.
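
The intended effect of this first variant can be sketched as a membership test. This is a simplified, self-contained model rather than the patch's implementation: the `Publication` struct, `is_excepted`, and `table_is_published` are hypothetical names, and tables are assumed to be identified by exact name (no pattern matching, in line with the note above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a publication; not the catalog layout. */
typedef struct Publication
{
    bool         alltables;   /* FOR ALL TABLES was specified */
    const char **except;      /* tables listed after EXCEPT TABLE */
    size_t       nexcept;     /* number of entries in "except" */
} Publication;

/* Return true if "table" appears in the publication's EXCEPT list. */
bool
is_excepted(const Publication *pub, const char *table)
{
    for (size_t i = 0; i < pub->nexcept; i++)
    {
        if (strcmp(pub->except[i], table) == 0)
            return true;
    }
    return false;
}

/* An ALL TABLES publication publishes every table that is not excepted. */
bool
table_is_published(const Publication *pub, const char *table)
{
    assert(pub != NULL && table != NULL);
    return pub->alltables && !is_excepted(pub, table);
}
```

With an except list of t1 and t2, `table_is_published` is false for t1 and true for any other table, including tables created later, which is the point of combining ALL TABLES with EXCEPT.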

Then users will be allowed to reset the publication by:
ALTER PUBLICATION pub1 RESET;

This will reset the publication to the default state which includes
resetting the publication parameters, setting the ALL TABLES flag to
false, and dropping the relations and schemas that are associated with
the publication. I don't know if we want to go further with allowing
to RESET specific parameters and if so which parameters and what would
its syntax be?

The second variant is to add support for excluding certain columns of a
table while publishing that table. Currently, users need to
list all required columns' names even if they want to publish most
of the columns in the table (for example Create Publication pub For
Table t1 (c1, c2)). Consider a user who doesn't want to publish the 'salary'
or other sensitive information of executives/employees but would like
to publish all other columns. I feel in such cases it will be a lot of
work for the user, especially when the table has many columns. I see
that Oracle has a similar feature [2]. I think without this it will be
difficult for users to use column lists in some cases. The patch for
this is not proposed, but I would imagine the syntax for it to be something
like "Create Publication pub For Table t1 Except (c3)" and similar
variants for Alter Publication.

Have I missed anything?

Thoughts on the proposal/syntax would be appreciated.

[1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html
[2] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20

--
With Regards,
Amit Kapila.

RE: Skipping schema changes in publication

From

"houzj.fnst@fujitsu.com"

Date:

June 14, 2022, 03:40:42

On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v8 patch has the changes for the
> same.
> >
> 
> AFAICS, the summary of this proposal is that we want to support
> exclude of certain objects from publication with two kinds of
> variants. The first variant is to add support to exclude specific
> tables from ALL TABLES PUBLICATION. Without this feature, users need
> to manually add all tables for a database even when she wants to avoid
> only a handful of tables from the database say because they contain
> sensitive information or are not required. We have seen that other
> database like MySQL also provides similar feature [1] (See
> REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
> follows:
> 
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> or
> ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;
> 
> This will allow us to publish all the tables in the current database
> except t1 and t2. Now, I see that pg_dump has a similar option
> provided by switch --exclude-table but that allows tables matching
> patterns which is not the case here. I am not sure if we need a
> similar variant here.
> 
> Then users will be allowed to reset the publication by:
> ALTER PUBLICATION pub1 RESET;
> 
> This will reset the publication to the default state which includes
> resetting the publication parameters, setting the ALL TABLES flag to
> false, and dropping the relations and schemas that are associated with
> the publication. I don't know if we want to go further with allowing
> to RESET specific parameters and if so which parameters and what would
> its syntax be?
> 
> The second variant is to add support to exclude certain columns of a
> table while publishing a particular table. Currently, users need to
> list all required columns' names even if they don't want to hide most
> of the columns in the table (for example Create Publication pub For
> Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary'
> or other sensitive information of executives/employees but would like
> to publish all other columns. I feel in such cases it will be a lot of
> work for the user especially when the table has many columns. I see
> that Oracle has a similar feature [2]. I think without this it will be
> difficult for users to use this feature in some cases. The patch for
> this is not proposed but I would imagine syntax for it to be something
> like "Create Publication pub For Table t1 Except (c3)" and similar
> variants for Alter Publication.

I think the feature to exclude certain columns of a table would be useful.

In some production scenarios, we do not want to replicate
sensitive fields (columns) of a table. Although we can already achieve
this by specifying all replicated columns in the list [1], that seems like
hard work when the table has hundreds of columns.

[1]
CREATE TABLE test(a int, b int, c int,..., sensitive text);
CREATE PUBLICATION pub FOR TABLE test(a,b,c,...);

In addition, it's not easy to maintain a column list like the one above, because
we sometimes need to add or delete fields due to business
needs. Every time we add a column (or delete a column that is in the column list), we
need to update the column list.

If we support Except:
CREATE PUBLICATION pub FOR TABLE test EXCEPT (sensitive);

We don't need to update the column list in most cases.
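
The maintenance argument can be sketched in a few lines: with an except list, the published columns are derived from the table's current columns, so a newly added column is picked up without touching the publication. This is only an illustration of the idea, since no patch for column exclusion has been proposed; `published_columns` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into "out" every column of the table that is not named in the
 * except list, preserving order. "out" must have room for ncols entries.
 * Returns the number of published columns.
 */
size_t
published_columns(const char **cols, size_t ncols,
                  const char **except, size_t nexcept,
                  const char **out)
{
    size_t n = 0;

    assert(cols != NULL && out != NULL);

    for (size_t i = 0; i < ncols; i++)
    {
        int skip = 0;

        /* Is this column named in the except list? */
        for (size_t j = 0; j < nexcept; j++)
        {
            if (strcmp(cols[i], except[j]) == 0)
            {
                skip = 1;
                break;
            }
        }

        if (!skip)
            out[n++] = cols[i];
    }
    return n;
}
```

If a column d is later added to the table, calling the function again with the same one-element except list yields a, b, c, d; an explicit column list would have needed an ALTER PUBLICATION instead.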

Thanks to "hametan" for providing the use case off-list.

Best regards,
Hou zj

Re: Skipping schema changes in publication

From

Amit Kapila

Date:

June 16, 2022, 04:04:54

On Tue, Jun 14, 2022 at 9:10 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for the comments, the attached v8 patch has the changes for the
> > same.
> > >
> >
> > AFAICS, the summary of this proposal is that we want to support
> > exclude of certain objects from publication with two kinds of
> > variants. The first variant is to add support to exclude specific
> > tables from ALL TABLES PUBLICATION. Without this feature, users need
> > to manually add all tables for a database even when she wants to avoid
> > only a handful of tables from the database say because they contain
> > sensitive information or are not required. We have seen that other
> > database like MySQL also provides similar feature [1] (See
> > REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
> > follows:
> >
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > or
> > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;
> >
> > This will allow us to publish all the tables in the current database
> > except t1 and t2. Now, I see that pg_dump has a similar option
> > provided by switch --exclude-table but that allows tables matching
> > patterns which is not the case here. I am not sure if we need a
> > similar variant here.
> >
> > Then users will be allowed to reset the publication by:
> > ALTER PUBLICATION pub1 RESET;
> >
> > This will reset the publication to the default state which includes
> > resetting the publication parameters, setting the ALL TABLES flag to
> > false, and dropping the relations and schemas that are associated with
> > the publication. I don't know if we want to go further with allowing
> > to RESET specific parameters and if so which parameters and what would
> > its syntax be?
> >
> > The second variant is to add support to exclude certain columns of a
> > table while publishing a particular table. Currently, users need to
> > list all required columns' names even if they don't want to hide most
> > of the columns in the table (for example Create Publication pub For
> > Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary'
> > or other sensitive information of executives/employees but would like
> > to publish all other columns. I feel in such cases it will be a lot of
> > work for the user especially when the table has many columns. I see
> > that Oracle has a similar feature [2]. I think without this it will be
> > difficult for users to use this feature in some cases. The patch for
> > this is not proposed but I would imagine syntax for it to be something
> > like "Create Publication pub For Table t1 Except (c3)" and similar
> > variants for Alter Publication.
>
> I think the feature to exclude certain columns of a table would be useful.
>
> In some production scenarios, we usually do not want to replicate
> sensitive fields(column) in the table. Although we already can achieve
> this by specify all replicated columns in the list[1], but that seems a
> hard work when the table has hundreds of columns.
>
> [1]
> CREATE TABLE test(a int, b int, c int,..., sensitive text);
> CREATE PUBLICATION pub FOR TABLE test(a,b,c,...);
>
> In addition, it's not easy to maintain the column list like above. Because
> we sometimes need to add new fields or delete fields due to business
> needs. Every time we add a column(or delete a column in column list), we
> need to update the column list.
>
> If we support Except:
> CREATE PUBLICATION pub FOR TABLE test EXCEPT (sensitive);
>
> We don't need to update the column list in most cases.
>

Right, this is a valid point, and I think it makes sense to
support such a feature for column lists and also to exclude
particular tables from an ALL TABLES publication.

Peter E., Euler, and others, do you have any objections to supporting
the above-mentioned two cases?

-- 
With Regards,
Amit Kapila.

Re: Skipping schema changes in publication

From

vignesh C

Date:

August 8, 2022, 07:16:39

On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > > Attached v7 patch which fixes the buildfarm warning for an unused warning in
> > > release mode as in  [1].
> > Hi, thank you for the patches.
> >
> >
> > I'll share several review comments.
> >
> > For v7-0001.
> >
> > (1) I'll suggest some minor rewording.
> >
> > +  <para>
> > +   The <literal>RESET</literal> clause will reset the publication to the
> > +   default state which includes resetting the publication options, setting
> > +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> > +   dropping all relations and schemas that are associated with the publication.
> >
> > My suggestion is
> > "The RESET clause will reset the publication to the
> > default state. It resets the publication operations,
> > sets ALL TABLES flag to false and drops all relations
> > and schemas associated with the publication."
>
> I felt the existing looks better. I would prefer to keep it that way.
>
> > (2) typo and rewording
> >
> > +/*
> > + * Reset the publication.
> > + *
> > + * Reset the publication options, setting ALL TABLES flag to false and drop
> > + * all relations and schemas that are associated with the publication.
> > + */
> >
> > The "setting" in this sentence should be "set".
> >
> > How about changing like below ?
> > FROM:
> > "Reset the publication options, setting ALL TABLES flag to false and drop
> > all relations and schemas that are associated with the publication."
> > TO:
> > "Reset the publication operations, set ALL TABLES flag to false and drop
> > all relations and schemas associated with the publication."
>
>  I felt the existing looks better. I would prefer to keep it that way.
>
> > (3) AlterPublicationReset
> >
> > Do we need to call CacheInvalidateRelcacheAll() or
> > InvalidatePublicationRels() at the end of
> > AlterPublicationReset() like AlterPublicationOptions() ?
>
> CacheInvalidateRelcacheAll() should be called when the ALL TABLES flag
> changes from true to false; otherwise the cache will not be invalidated. Modified.
>
> >
> > For v7-0002.
> >
> > (4)
> >
> > +       if (stmt->for_all_tables)
> > +       {
> > +               bool            isdefault = CheckPublicationDefValues(tup);
> > +
> > +               if (!isdefault)
> > +                       ereport(ERROR,
> > +                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > +                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
> > +                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> >
> > The errmsg string has three messages for user and is a bit long
> > (we have two sentences there connected by 'and').
> > Can't we make it concise and split it into a couple of lines for code readability ?
> >
> > I'll suggest a change below.
> > FROM:
> > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
> > TO:
> > "adding ALL TABLES requires the publication defined not for ALL TABLES"
> > "to have default publish actions without any associated tables/schemas"
>
> Added errdetail and split it
>
> > (5) typo
> >
> >    <varlistentry>
> > +    <term><literal>EXCEPT TABLE</literal></term>
> > +    <listitem>
> > +     <para>
> > +      This clause specifies a list of tables to exclude from the publication.
> > +      It can only be used with <literal>FOR ALL TABLES</literal>.
> > +     </para>
> > +    </listitem>
> > +   </varlistentry>
> > +
> >
> > Kindly change
> > FROM:
> > This clause specifies a list of tables to exclude from the publication.
> > TO:
> > This clause specifies a list of tables to be excluded from the publication.
> > or
> > This clause specifies a list of tables excluded from the publication.
>
> Modified
>
> > (6) Minor suggestion for an expression change
> >
> >        Marks the publication as one that replicates changes for all tables in
> > -      the database, including tables created in the future.
> > +      the database, including tables created in the future. If
> > +      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
> > +      the changes for the specified tables.
> >
> >
> > I'll suggest a minor rewording.
> > FROM:
> > ...exclude replicating the changes for the specified tables
> > TO:
> > ...exclude replication changes for the specified tables
>
> I felt the existing is better.
>
> > (7)
> > (7-1)
> >
> > +/*
> > + * Check if the publication has default values
> > + *
> > + * Check the following:
> > + * a) Publication is not set with "FOR ALL TABLES"
> > + * b) Publication is having default options
> > + * c) Publication is not associated with schemas
> > + * d) Publication is not associated with relations
> > + */
> > +static bool
> > +CheckPublicationDefValues(HeapTuple tup)
> >
> >
> > I think this header comment can be improved.
> > FROM:
> > Check the following:
> > TO:
> > Returns true if the publication satisfies all the following conditions:
>
> Modified
>
> > (7-2)
> >
> > b) should be changed as well
> > FROM:
> > Publication is having default options
> > TO:
> > Publication has the default publish operations
>
> Changed it to "Publication is having default publication parameter values"
>
> Thanks for the comments, the attached v8 patch has the changes for the same.
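
As an illustration of the four conditions listed in (7-1), here is a minimal, self-contained C sketch. It is not the patch's CheckPublicationDefValues(): the PubState struct, its field names, and pub_has_default_values() are hypothetical stand-ins for the catalog lookups the real function would perform. "Default options" is assumed here to mean publish = 'insert, update, delete, truncate' with publish_via_partition_root = false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the catalog state the real check inspects. */
typedef struct PubState
{
    bool for_all_tables;  /* publication was defined FOR ALL TABLES */
    bool pubinsert;       /* publish option: insert (default true) */
    bool pubupdate;       /* publish option: update (default true) */
    bool pubdelete;       /* publish option: delete (default true) */
    bool pubtruncate;     /* publish option: truncate (default true) */
    bool pubviaroot;      /* publish_via_partition_root (default false) */
    int  nschemas;        /* number of schemas associated */
    int  nrelations;      /* number of relations associated */
} PubState;

/*
 * Returns true if the publication satisfies all the following conditions:
 * a) it is not set with FOR ALL TABLES
 * b) it has default publication parameter values
 * c) it is not associated with schemas
 * d) it is not associated with relations
 */
bool
pub_has_default_values(const PubState *pub)
{
    assert(pub != NULL);

    if (pub->for_all_tables)                          /* a) */
        return false;

    if (!pub->pubinsert || !pub->pubupdate ||
        !pub->pubdelete || !pub->pubtruncate ||
        pub->pubviaroot)                              /* b) */
        return false;

    if (pub->nschemas > 0)                            /* c) */
        return false;

    if (pub->nrelations > 0)                          /* d) */
        return false;

    return true;
}
```

In this model, a caller handling the ALL TABLES case would raise the error discussed in (4) whenever pub_has_default_values() returns false.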

The patch needed to be rebased on top of HEAD because of commit
0c20dd33db1607d6a85ffce24238c1e55e384b49. Attached is a rebased v8
version.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From

vignesh C

Date:

August 8, 2022, 09:23:28

On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote:
> The patch needed to be rebased on top of HEAD because of commit
> "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8
> version for the changes of the same.

I had missed including one of the changes that was present only locally.
The updated patch includes it.

Regards,
Vignesh

On Mon, Aug 8, 2022 at 2:53 PM vignesh C <vignesh21@gmail.com> wrote:
> I had missed attaching one of the changes that was present locally.
> The updated patch has the changes for the same.

The patch needed to be rebased on top of HEAD because of a recent
commit. The attached v8 patch is rebased accordingly.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From

Ian Lawrence Barwick

Date:

November 4, 2022, 02:49:46

On Fri, Aug 19, 2022 at 2:41 vignesh C <vignesh21@gmail.com> wrote:
> The patch needed to be rebased on top of HEAD because of a recent
> commit. The updated v8 patch has the changes for the same.

Hi

cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

[1] http://cfbot.cputube.org/patch_40_3646.log

Thanks

Ian Barwick

Re: Skipping schema changes in publication

From

vignesh C

Date:

November 7, 2022, 13:39:41

On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> Hi
>
> cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.
>
> [1] http://cfbot.cputube.org/patch_40_3646.log

Here is an updated patch which is rebased on top of HEAD.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From

Ian Lawrence Barwick

Date:

November 16, 2022, 04:04:18

On Mon, Nov 7, 2022 at 22:39 vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >
> > Hi
> >
> > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > currently underway, this would be an excellent time to update the patch.
> >
> > [1] http://cfbot.cputube.org/patch_40_3646.log
>
> Here is an updated patch which is rebased on top of HEAD.

Thanks for the updated patch.

While reviewing the patch backlog, we have determined that this patch adds
one or more TAP tests but has not added the test to the "meson.build" file.

To do this, locate the relevant "meson.build" file for each test and add it
in the 'tests' dictionary, which will look something like this:

  'tap': {
    'tests': [
      't/001_basic.pl',
    ],
  },

For some additional details please see this Wiki article:

  https://wiki.postgresql.org/wiki/Meson_for_patch_authors

For more information on the meson build system for PostgreSQL see:

  https://wiki.postgresql.org/wiki/Meson

Regards

Ian Barwick

Re: Skipping schema changes in publication

From

vignesh C

Date:

November 16, 2022, 10:05:31

On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> On Mon, Nov 7, 2022 at 22:39 vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > >
> > > Hi
> > >
> > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > > currently underway, this would be an excellent time to update the patch.
> > >
> > > [1] http://cfbot.cputube.org/patch_40_3646.log
> >
> > Here is an updated patch which is rebased on top of HEAD.
>
> Thanks for the updated patch.
>
> While reviewing the patch backlog, we have determined that this patch adds
> one or more TAP tests but has not added the test to the "meson.build" file.

Thanks, I have updated meson.build to include the TAP test. The
attached patch includes this change.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From

vignesh C

Date:

January 20, 2023, 10:00:54

On Wed, 16 Nov 2022 at 15:35, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >
> > On Mon, Nov 7, 2022 at 22:39 vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > > >
> > > > Hi
> > > >
> > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > > > currently underway, this would be an excellent time to update the patch.
> > > >
> > > > [1] http://cfbot.cputube.org/patch_40_3646.log
> > >
> > > Here is an updated patch which is rebased on top of HEAD.
> >
> > Thanks for the updated patch.
> >
> > While reviewing the patch backlog, we have determined that this patch adds
> > one or more TAP tests but has not added the test to the "meson.build" file.
>
> Thanks, I have updated the meson.build to include the TAP test. The
> attached patch has the changes for the same.

The patch no longer applied on top of HEAD; attached is a rebased version.

Regards,
Vignesh

On Thu, 17 Apr 2025 at 09:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 16, 2025 at 8:22 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote:
> > >
> > > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > As I did not see much interest from others, I'm withdrawing this patch
> > > > for now. But if there is any interest from others in the future, I would be
> > > > more than happy to work on this feature.
> > > >
> > >
> > > Just FYI, I noticed a use case for this patch in email [1]. Users would like to
> > > replicate all except a few columns having sensitive information. The challenge
> > > with the current column list feature is that adding new columns to tables would
> > > lead users to change the respective publications as well.
> > >
> > > [1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com
> >
> > BTW, I noticed that debezium, an open source distributed platform for change
> > data capture that relies on logical decoding, also supports specifying a
> > column exclusion list[1]. So, this indicates that there could be some use cases
> > for this feature.
> >
>
> Thanks for sharing the link. I see that they support both the include
> and exclude lists for columns and tables.
>

Hi Hackers,

I see there is some interest in the functionality added by this patch.
I have rebased the patches in [1]. I saw a new column 'pubgencols' was
added in pg_publication in PG 18. So, I have modified v11-0001 to
RESET this as well.
I am also working on a patch to exclude columns in a
publication, as per the suggestion in [2].

[1]: https://www.postgresql.org/message-id/CALDaNm3dWZCYDih55qTNAYsjCvYXMFv%3D46UsDWmfCnXMt3kPCg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1KRdAPC%3D5%3D7tQ1GW0cRwD%3DzaDMi%2BT4u_k4GxPhPY6e8BQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From

Peter Smith

Date:

June 18, 2025, 04:04:31

On Tue, Jun 17, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
...
> I have attached a patch to support excluding columns for publication.
>
> I have added a syntax: "FOR TABLE table_name EXCEPT (c1, c2, ..)"
> It can be used with CREATE or ALTER PUBLICATION.
>
> v12-0003 patch contains the changes for the same.
>

Hi Shlok,

I was interested in your new EXCEPT (col-list) so I had a quick look
at your patch v12-0003 (only looked at the documentation).

Below are some comments:

======

1. Chapter 29.5 "Column Lists".

I think the new EXCEPT syntax needs a mention here as well.

======

doc/src/sgml/catalogs.sgml

2.
+      <para>
+       This is an array of values that indicates which table columns are
+       excluded from the publication.  For example, a value of
+       <literal>1 3</literal> would mean that the columns except the first and
+       the third columns are published.
+       A null value indicates that no columns are excluded from being published.
+      </para></entry>

The sentence "A null value indicates that no columns are excluded from
being published" seems kind of confusing, because if the user has a
"normal" column list, then although nothing is being *explicitly* excluded
(using EXCEPT), any columns not named are *implicitly* excluded from
being published.

~

3.
TBH, I was wondering why a new catalog attribute was necessary...

Can't you simply re-use the existing "prattrs" attribute?
e.g. let's just define negative to mean exclude.

e.g. a value of 1 3 means only the 1st and 3rd columns are published
e.g. a value of -1 -3 means all columns except the 1st and 3rd columns are published
e.g. a value of null means all columns are published

(mixes of negative and positive will not be possible)

======

doc/src/sgml/ref/alter_publication.sgml

4. ALTER PUBLICATION syntax

The syntax is currently written as:
TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [ EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ... ]

Can't this be more simply written as:
TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ]

~~~

5.
+  <para>
+   Alter publication <structname>mypublication</structname> to add table
+   <structname>users</structname> except column
+   <structname>security_pin</structname>:
+<programlisting>
+ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);

Those tags don't seem correct. e.g. "users" and "security_pin" are not
<structname> (???).

Perhaps, every other example here is wrong too and you just copied
them? Anyway, something here looks wrong to me.

======
doc/src/sgml/ref/create_publication.sgml

6. CREATE PUBLICATION syntax

The syntax is currently written as:
TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [ EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ... ]

Can't this be more simply written as:
TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ]

~~~

7.
+     <para>
+      When a column list is specified with EXCEPT, the named columns are not
+      replicated. The excluded column list cannot contain generated columns. The
+      column list and excluded column list cannot be specified together.
+      Specifying a column list has no effect on <literal>TRUNCATE</literal>
+      commands.
+     </para>

IMO you don't need to say "The column list and excluded column list
cannot be specified together." because AFAIK the syntax makes that
impossible to do anyhow.

~~~

8.
+  <para>
+   Create a publication that publishes all changes for table <structname>users</structname>
+   except changes for columns <structname>security_pin</structname>:
+<programlisting>
+CREATE PUBLICATION users_safe FOR TABLE users EXCEPT (security_pin);
+</programlisting>
+  </para>

8a.
Same review comment as previously -- Those tags don't seem correct.
e.g. "users" and "security_pin" are not <structname> (???).
Again, are all the other existing tags also wrong? Maybe a new thread
is needed to address these?

~

8b.
Plural?  /except changes for columns/except changes for column/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

June 19, 2025, 09:41:48

On Wed, 18 Jun 2025 at 06:34, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, Jun 17, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ...
> > I have attached a patch support excluding columns for publication.
> >
> > I have added a syntax: "FOR TABLE table_name EXCEPT (c1, c2, ..)"
> > It can be used with CREATE or ALTER PUBLICATION.
> >
> > v12-0003 patch contains the changes for the same.
> >
>
> Hi Shlok,
>
> I was interested in your new EXCEPT (col-list) so I had a quick look
> at your patch v12-0003 (only looked at the documentation).
>
> Below are some comments:
>
> ======
>
> 1. Chapter 29.5 "Column Lists".
>
> I think new EXCEPT syntax needs a mention here as well.
>
Added

> ======
>
> doc/src/sgml/catalogs.sgml
>
> 2.
> +      <para>
> +       This is an array of values that indicates which table columns are
> +       excluded from the publication.  For example, a value of
> +       <literal>1 3</literal> would mean that the columns except the first and
> +       the third columns are published.
> +       A null value indicates that no columns are excluded from being
> published.
> +      </para></entry>
>
> The sentence "A null value indicates that no columns are excluded from
> being published" seems kind of confusing, because if the user has a
> "normal" column-list  although nothing was being *explicitly* excluded
> (using EXCEPT), any columns not named are *implicitly* excluded from
> being published.
>
I have removed this line.

> ~
>
> 3.
> TBH, I was wondering why a new catalog attribute was necessary...
>
> Can't you simply re-use the existing attribute "prattrs" attribute.
> e.g. let's just define negative means exclude.
>
> e.g. a value of 1 3 means only the 1st and 3rd columns are published
> e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> e.g. a value of null mean all columns are published
>
> (mixes of negative and positive will not be possible)
>

Currently I have added a new attribute 'prexcludeattrs' to the
pg_publication_rel table. I used this approach because it makes it
easier for the user to get the exclude column list, and in the code no
extra processing is required to get it.

The approach of using negative numbers for excluded columns has the
advantage that we do not need to introduce a new column in
pg_publication_rel. But in the code, each time we want to get a column
list or an exclude column list, we need extra processing of the
'prattrs' column. Also, I don't see any existing catalog table using
negative attribute numbers for a column list.

Based on the above observations, I feel that the current approach is better.

Please correct me if I missed an advantage for the approach you suggested.
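
To make the "extra processing" concrete, here is a minimal standalone
sketch of what decoding the suggested signed encoding could look like.
It is not PostgreSQL code: the name decode_signed_attrs and the plain
int arrays are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: decode a signed attribute list such as {1, 3} or
 * {-1, -3}.  Writes the absolute column numbers to 'out' and sets
 * '*is_except' when the list is an exclusion list.  Returns false for a
 * zero entry or a mixed-sign list, which the encoding would not allow.
 */
static bool
decode_signed_attrs(const int *attrs, size_t n, int *out, bool *is_except)
{
    assert(attrs != NULL && out != NULL && is_except != NULL);

    /* The sign of the first entry decides the meaning of the whole list. */
    *is_except = (n > 0 && attrs[0] < 0);

    for (size_t i = 0; i < n; i++)
    {
        if (attrs[i] == 0 || (attrs[i] < 0) != *is_except)
            return false;
        out[i] = abs(attrs[i]);
    }
    return true;
}
```

Every reader of 'prattrs' would need a step like this before it could
use the column numbers, which is the cost described above.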

> ======
>
> doc/src/sgml/ref/alter_publication.sgml
>
> 4. ALTER PUBLICATION syntax
>
> The syntax is currently written as:
> TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
> EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
> ]
>
> Can't this be more simply written as:
> TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
> ] [ WHERE ( expression ) ] [, ... ]
>
> ~~~
Fixed

>
> 5.
> +  <para>
> +   Alter publication <structname>mypublication</structname> to add table
> +   <structname>users</structname> except column
> +   <structname>security_pin</structname>:
> +<programlisting>
> +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
>
> Those tags don't seem correct. e.g. "users" and "security_pin" are not
> <structname> (???).
>
> Perhaps, every other example here is wrong too and you just copied
> them? Anyway, something here looks wrong to me.
>
I looked at different documents, and the usage of tags does not seem
well defined. For example, for table names <structname> is used in
create_publication.sgml and update.sgml, <classname> is used in
table.sgml and advanced.sgml, and <literal> is used in
logical-replication.sgml. Similarly, for column names <structname>,
<structfield> or <literal> are used in different parts of the
documentation.

For this patch, I have changed the tag to <structfield> for the column.
Do you have any suggestions?

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 6. CREATE PUBLICATION syntax
>
> The syntax is currently written as:
> TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
> EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
> ]
>
> Can't this be more simply written as:
> TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
> ] [ WHERE ( expression ) ] [, ... ]
>
> ~~~
Fixed

>
> 7.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. The excluded column list cannot contain generated
> columns. The
> +      column list and excluded column list cannot be specified together.
> +      Specifying a column list has no effect on <literal>TRUNCATE</literal>
> +      commands.
> +     </para>
>
> IMO you don't need to say "The column list and excluded column list
> cannot be specified together." because AFAIK the syntax makes that
> impossible to do anyhow.
>
Removed this line

> ~~~
>
> 8.
> +  <para>
> +   Create a publication that publishes all changes for table
> <structname>users</structname>
> +   except changes for columns <structname>security_pin</structname>:
> +<programlisting>
> +CREATE PUBLICATION users_safe FOR TABLE users EXCEPT (security_pin);
> +</programlisting>
> +  </para>
>
> 8a.
> Same review comment as previously -- Those tags don't seem correct.
> e.g. "users" and "security_pin" are not <structname> (???).
> Again, are all the other existing tags also wrong? Maybe a new thread
> needed to address these?
>
> ~
Same as point 5.
I also feel this should be addressed in a new thread.

> 8b.
> Plural?  /except changes for columns/except changes for column/
Fixed

Also, in this patch I added display of "EXCEPT (column_list)" to the
\dRp+ and \d table_name psql commands.

Thanks and Regards,
Shlok Kyal

On Fri, 20 Jun 2025 at 09:28, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ...
> > > 3.
> > > TBH, I was wondering why a new catalog attribute was necessary...
> > >
> > > Can't you simply re-use the existing attribute "prattrs" attribute.
> > > e.g. let's just define negative means exclude.
> > >
> > > e.g. a value of 1 3 means only the 1st and 3rd columns are published
> > > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> > > e.g. a value of null mean all columns are published
> > >
> > > (mixes of negative and positive will not be possible)
> > >
> >
> > Currently I have added a new attribute 'prexcludeattrs' in
> > pg_publication_rel table. I used this approach because it will be
> > easier for user to get the exclude column list, in code no extra
> > processing is required to get the exclude column list.
> >
> > For an approach to use negative numbers for exclude columns. I see an
> > advantage that we do not need to introduce a new column for
> > pg_publication_rel. But in code, each time we want to get a column
> > list or exclude column list we need an extra processing of 'prattrs'
> > columns. Also I don't see any existing catalog table using a negative
> > attribute for column list.
> >
> > Based on above observations, I feel that the current is better.
> >
> > Please correct me if I missed an advantage for the approach you suggested.
> >
>
> OK. Maybe using negative numbers was a bridge too far...
>
> But IMO it is not good to have 2 separate attributes for the lists.
> Doing so implies they can coexist, but that is not true. I felt there
> are not really 2 "kinds" of columns list anyway -- there is just a
> "column list" which defines columns that are either included or
> excluded from the publication determined by EXCEPT.
>
> Having  dual lists gets weird/confusing to describe them -- you end up
> continually having to refer to the other one to clarify behaviour.
>
> e.g. Does 'prattrs' value NULL mean publish everything? Well, no...
> that depends if there is a non null 'prexcludeattrs'
> e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well,
> no... that depends if there is a non null 'prattrs'
>
> Furthermore, all the code is doubling up referring to "column list"
> and "exclude column list"  -- code / docs / comments / error messages.
> There are quite a lot of places the patch touches that I thought were
> not really needed if you don't have 2 different kinds of column-lists.
>
> To summarise, I felt it would be better to just keep the existing
> 'prattrs' as the one-and-only column list, but add another BOOLEAN
> attribute to flag whether 'prattrs' columns should be included or
> excluded.
>
> prattrs   prattrs_exclude   Means
> -------   ---------------   ----------------------------------------
> 1 2 3     f                 only cols 1,2,3 will be published
> 4 5 6     t                 only cols 4,5,6 will NOT be published
> null      f                 all cols are published (flag is ignored)
> null      t                 all cols are published (flag is ignored)
>

I agree with your point; that would be a better approach. In patch
0001 a column 'prexcept' was added to pg_publication_rel. We used it
only for publications with all tables. I have reused this column for
patch 0003. If the publication is not for all tables and the
'prexcept' flag is true, it implies that the columns in 'prattrs' are
to be excluded from being published. I have included the changes for
this in the v14-0003 patch.
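
As a rough illustration of how the single list plus flag could be
interpreted, here is a standalone sketch. It is not the patch code: the
struct PubRelSketch and the function column_is_published are
hypothetical, and treating a null 'prattrs' with 'prexcept' set as "the
whole relation is excluded" is my reading of how patch 0001 uses the
flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for one pg_publication_rel row (not the real struct). */
typedef struct PubRelSketch
{
    const int  *prattrs;    /* column numbers, or NULL when no column list */
    size_t      nattrs;     /* number of entries in prattrs */
    bool        prexcept;   /* true when EXCEPT was specified */
} PubRelSketch;

/* Returns true if column 'attnum' would be published under this row. */
static bool
column_is_published(const PubRelSketch *rel, int attnum)
{
    bool        in_list = false;

    assert(rel != NULL);

    /* No column list: prexcept then refers to the whole relation. */
    if (rel->prattrs == NULL)
        return !rel->prexcept;

    for (size_t i = 0; i < rel->nattrs; i++)
    {
        if (rel->prattrs[i] == attnum)
            in_list = true;
    }

    /* With EXCEPT the list names excluded columns; otherwise included ones. */
    return rel->prexcept ? !in_list : in_list;
}
```

Under this reading, a column that is not named in an EXCEPT list is
published, while a column that is not named in a plain list is not.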

> > > 5.
> > > +  <para>
> > > +   Alter publication <structname>mypublication</structname> to add table
> > > +   <structname>users</structname> except column
> > > +   <structname>security_pin</structname>:
> > > +<programlisting>
> > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> > >
> > > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > > <structname> (???).
> > >
> > > Perhaps, every other example here is wrong too and you just copied
> > > them? Anyway, something here looks wrong to me.
> > >
> > I saw different documents and usage of tags seems not well defined.
> > For example for table we are using tags in document
> > create_publication.sgml, update.sgml <structname> is used, in document
> > table.sgml, advanced.sgml <classname> is used, and in
> > logical-replication.sgml <literal>  is used. Similarly for column
> > names <structname>, <structfield> or <literal> are used in different
> > parts of the document.
> >
> > I kept the changed tag to <structfield> for the column for this patch.
> > Do you have any suggestions?
>
> No, for this patch I think it is best that you just follow nearby code
> (as you are already doing). I plan to raise another thread to ask what
> are the guidelines for this  sort of markup which is currently used
> inconsistently in different places.
Thanks for starting a thread for it.

>
> //////////
>
> Below are a few more review comments for v13-0003
>
> ======
> Commit message
>
> 1.
> Typo /THe/The/
>
> ~~~
Fixed

> 2.
> The new syntax allows specifying excluded column list when creating or
> altering a publication. For example:
> CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list)
> or
> ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list)
>
> ~
>
> I felt since you say these "For example:" it would be better to give
> real examples.
> e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)".
>
Fixed

> ~~~
>
> 3.
> Typo /family of command/family of commands/
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 4.
> I am not sure that it was a good idea to be making a new term called
> an "exclude column list"... because it introduces a new concept of
> something that sounds like it is a different kind of list, and now you
> have to keep referring everywhere to both "column list" versus
> "exclude column list". All the doubling up adds more complication I
> think.
>
> IMO really there is just a "column list". Whether that list is for
> exclusion or not just depends on the presence of EXCEPT. So I felt
> maybe all places mentioning "exclude column list" could be rephrased.
>
> ======
> src/backend/catalog/pg_publication.c
>
> 5.
> +/*
> + * Returns true if the relation has exluded column list associated with the
> + * publication, false otherwise.
> + *
> + * If a exclude column list is found, the corresponding bitmap is returned
> + * through the cols parameter, if provided. The bitmap is constructed
> within the
> + * given memory context (mcxt).
> + */
> +
>
> Typo /exluded column/an excluded column/
> Typo /exclude column list/excluded column list/
>
Updated the comment according to the latest implementation.

> ~~~
>
> 6.
> +/*
> + * pub_exclude_collist_validate
> + * Process and validate the 'excluded columns' list and ensure the columns
> + * are all valid to exclude from publication.  Checks for and raises an
> + * ERROR for any unknown columns, system columns, duplicate columns, or
> + * generated columns.
> + *
>
> Why can't you exclude generated columns?
>
> e.g. Maybe PUBLICATION says publish_generated_columns=stored and there
> are 100s of such columns, but the user just wants to exclude one of
> them. Why say they cannot do that? Hmm. Perhaps this is being already
> handled elsewhere, in which case this comment still seems misleading.
>
I have removed this restriction. Now we can specify stored generated
columns in EXCEPT (column_list) when we use the
'publish_generated_columns' flag.

> ======
> src/backend/commands/publicationcmds.c
>
> 7.
> + * With REPLICA IDENTITY FULL, no column list and no excluded column
> + * list is allowed.
>
> Really, just "no column list is allowed." same as it said before.
>
> ======
Fixed

Thanks and Regards,
Shlok Kyal

On Thu, 26 Jun 2025 at 09:06, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Below are some review comments for v14-0003
>
> ======
> 1. GENERAL
>
> Since the new syntax uses EXCEPT, then, in my opinion, you should try
> to use that same term where possible when describing things. I
> understand it is hard to do this in text and I agree often it makes
> more sense to say "exclude" columns etc, but OTOH in the code there
> are lots of places where you could have named vars/params differently:
> e.g. 'except_collist' instead of 'exclude_collist' might have been
> better.
>
Fixed the variable names.

> ======
> Commit message
>
> 2.
> Column list specifed with EXCEPT is stored in column "prattrs" in table
> "pg_publication_rel" and also column "prexcept" is set to "true", to maintain
> the column list that user wants to exclude from the publication.
>
> ~
>
> That paragraph could do with some rewording. For example, AFAIK,
> "prattrs" is for all column lists -- not just except col-lists, but
> the way it is described here sounds different.
>
> Also, /specifed/specified/
>
Reworded the paragraph

> ======
> doc/src/sgml/catalogs.sgml
>
> 3. (52.42. pg_publication_rel)
>
>        <para>
> -       True if the relation must be excluded
> +       True if the relation or column list must be excluded. If publication is
> +       created <literal>FOR ALL TABLES</literal> and it is specified as true,
> +       the relation should be excluded. Else if it is true the columns in
> +       <literal>prattrs</literal> should be excluded from being published.
>        </para></entry>
>
> I felt this could be expressed more simply without mentioning anything
> about FOR ALL TABLES.
>
> SUGGESTION
> True if the column list or relation must be excluded from publication.
> If a column list is specified in <literal>prattrs</literal>, then
> exclude only those columns. If <literal>prattrs</literal> is NULL,
> then exclude the entire relation.
>
Fixed

> ======
> doc/src/sgml/logical-replication.sgml
>
> 4. (29.5. Column Lists)
>
>    <para>
> -   Each publication can optionally specify which columns of each table are
> -   replicated to subscribers. The table on the subscriber side must have at
> -   least all the columns that are published. If no column list is specified,
> -   then all columns on the publisher are replicated.
> +   Each publication can optionally specify which columns of each
> table should be
> +   replicated or excluded from replication. On the subscriber side, the table
> +   must include at least all the columns that are published. If no column list
> +   is provided, all columns from the publisher are replicated by default.
>     See <xref linkend="sql-createpublication"/> for details on the syntax.
>    </para>
>
> I felt this patch may have changed too much text. IMO, you only needed
> to say "... are replicated or excluded from replication.". The other
> changes did not seem necessary.
>
> ~~~
Fixed

> 5.
>    <para>
> -   If no column list is specified, any columns added to the table later are
> -   automatically replicated. This means that having a column list which names
> -   all columns is not the same as having no column list at all.
> +   If no column list or a column list with EXCEPT is specified, any columns
> +   added to the table later are automatically replicated. This means
> that having
> +   a column list which names all columns is not the same as having no
> +   column list at all. If an column list is specified, any columns added to the
> +   table later are automatically replicated.
>    </para>
>
> 5a.
> "This means that having a column list which names all columns is not
> the same as having no column list at all." -- That note does not make
> sense when you say EXCEPT. I think some rewording is needed here.
>
Fixed

> ~
>
> 5b.
> "If an column list is specified, any columns added to the table later
> are automatically replicated.".
>
> This made no sense -- some words missing?
>
This change was done by mistake. Removed it.

> ~~~
>
> 6.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> -   <literal>publish_generated_columns</literal></link>. See
> -   <xref linkend="logical-replication-gencols"/> for details.
> +   <literal>publish_generated_columns</literal></link>. Generated columns can
> +   be included in column list specified with EXCEPT clause if publication
> +   parameter
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> is not set to
> +   <literal>none</literal>. Specified generated columns will not be published.
> +   See <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> I am not so sure about this. It seemed overly strict to me.
>
> Why can't it simply say:
> "Generated columns can also be specified in a column list. This allows
> generated columns to be published or excluded, regardless of the
> publication parameter..."
>
> Specifically, I don't know why you need to say:
> Generated columns can be included in column list specified with EXCEPT
> clause if publication parameter publish_generated_columns is not set
> to none. Specified generated columns will not be published.
>
> IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named
> cols. So if param is "stored", then the named cols will be excluded.
> OTOH, if param is "none" then all generated cols will be excluded
> anyway, so why not just allow the EXCEPT (gencol,gencol2) here as
> well, because the result will be the same.
>
>
I have removed this change, and now allow specifying generated columns
in an EXCEPT column list irrespective of the value of
'publish_generated_columns'.

> ~~~
>
> 7. (29.5.1. Examples)
>
>     <para>
> -    Create a table <literal>t1</literal> to be used in the following example.
> +    Create tables <literal>t1</literal>, <literal>t2</literal> to be
> used in the
> +    following example.
>
> /Create tables t1, t2/Create tables t1 and t2/
>
Fixed

> ~~~
>
> 8.
>     <para>
>      Create a publication <literal>p1</literal>. A column list is defined for
> -    table <literal>t1</literal> to reduce the number of columns that will be
> -    replicated. Notice that the order of column names in the column list does
> -    not matter.
> +    table <literal>t1</literal> and a column list is defined for table
> +    <literal>t2</literal> with EXCEPT clause to reduce the number of
> columns that will be
> +    replicated. Notice that the order of column names in the column
> lists does not matter.
>
> BEFORE
> A column list is defined for table t1 and a column list is defined for
> table t2...
>
> SUGGESTION (added comma, etc.)
> A column list is defined for table t1, and another column list is
> defined for table t2...
>
Fixed

> ~~~
>
> 9.
> The final example still says:
> "Only data from the column list of publication p1 is replicated."
>
> That doesn't seem quite appropriate now that you also have an EXCEPT
> column list.
>
> SUGGESTION:
> Only data specified by the column lists of publication p1 is replicated.
>
Fixed

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 10.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands.
> +     </para>
>
> I felt that to be clearer the preceding paragraph should be changed as follows:
>
> /When a column list is specified, only the named columns are
> replicated./When a column list without EXCEPT is specified, only the
> named columns are replicated./
>
Fixed

> ~~~
>
> 11. CREATE PUBLICATION (NOTES section)
>
> 11a.
> The NOTES talk about replica identity columns -- should you mention EXCEPT here?
>
Added notes for EXCEPT

> ~
>
> 11b.
> The NOTES talk about generated columns -- should you mention EXCEPT here?
>
I felt it is not needed.

> ======
> src/backend/catalog/pg_publication.c
>
> 12. check_and_fetch_column_list
>
> + if (!isnull)
> + except = DatumGetBool(cfdatum);
> +
> + *except_columns = except && !pub->alltables;
>
> AFAICT, you can Assert(!pub->alltables) because you already checked
> that earlier up front.
> So you don't need 'except' var either. Just assign *except_cols up
> front and then overwrite it later if true.
>
> SUGGESTION:
>
> *except_cols = false;
>
> if (pub->alltables)
>   return false;
> ...
> if (!isnull)
>  *except_cols = DatumGetBool(cfdatum);
>
Fixed
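
For readers following along, the suggested control flow can be sketched
in standalone C as below. This is only an illustration under
assumptions: the structs PubSketch and RelRowSketch and the function
fetch_column_list_sketch are hypothetical stand-ins for the syscache
lookups, and the position of the column-list check relative to the
elided "..." in the suggestion is my guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified inputs; the real function reads the syscache. */
typedef struct PubSketch
{
    bool        alltables;          /* publication is FOR ALL TABLES */
} PubSketch;

typedef struct RelRowSketch
{
    bool        has_collist;        /* prattrs is not null */
    bool        prexcept_isnull;    /* prexcept could not be fetched */
    bool        prexcept;           /* stored prexcept value */
} RelRowSketch;

/* Returns true if a column list exists; reports EXCEPT via 'except_cols'. */
static bool
fetch_column_list_sketch(const PubSketch *pub, const RelRowSketch *row,
                         bool *except_cols)
{
    assert(pub != NULL && except_cols != NULL);

    *except_cols = false;       /* default, overwritten below if needed */

    /* FOR ALL TABLES publications cannot have column lists. */
    if (pub->alltables)
        return false;

    /* No catalog row, or a row without a column list. */
    if (row == NULL || !row->has_collist)
        return false;

    if (!row->prexcept_isnull)
        *except_cols = row->prexcept;

    return true;
}
```

Assigning the output parameter up front means every early return leaves
it in a defined state, so no separate local variable is needed.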

> ~~~
>
> 13. publication_add_relation
>
>   /* Validate and translate column names into a Bitmapset of attnums. */
> - attnums = pub_collist_validate(pri->relation, pri->columns);
> + attnums = pub_collist_validate(pri->relation, pri->columns,
> +    pri->except && !pub->alltables,
> +    pub->pubgencols_type);
>
>
> I am wondering why we are even calling a function to validate column
> lists if pub->alltables was true. AFAIK, that combination of
> column-lists and FOR ALL TABLES is not even possible, so the code
> seems strange.
>
Fixed

> ~~~
>
> 14. pub_exclude_collist_validate
> .
> + /*
> + * Check if column list specified with EXCEPT have any stored
> + * generated column and 'publish_generated_columns' is not set to
> + * 'stored'.
> + */
> + if (except_columns &&
> + TupleDescAttr(tupdesc, attnum - 1)->attgenerated == ATTRIBUTE_GENERATED_STORED &&
> + pubgencols_type != PUBLISH_GENCOLS_STORED)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> + errmsg("cannot use stored generated column \"%s\" in publication column list specified with EXCEPT when \"%s\" set to \"%s\"",
> +    colname, "publish_generated_columns", "stored"));
>
> As mentioned in the above DOCS comments, I was having doubts about why
> we have this error.
>
> If the parameter says "none", then generated columns will not be
> replicated, so why should we care if the user also says
> EXCEPT(gencol1,gencol2). Either way, the result will be the same; the
> generated column will not be published.
>
Removed this restriction.
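
A small standalone sketch of why the restriction is not needed, based
on my understanding of the discussion: a plain column list takes
precedence over 'publish_generated_columns', while with EXCEPT the
named columns are dropped and the remaining generated columns follow
the parameter. The enum GencolsSketch and the function
gencol_is_published are hypothetical names, not patch code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative subset of the publish_generated_columns settings. */
typedef enum GencolsSketch
{
    GENCOLS_NONE,
    GENCOLS_STORED
} GencolsSketch;

/* Returns true if a stored generated column would be published. */
static bool
gencol_is_published(GencolsSketch param, bool has_list, bool except, bool named)
{
    /* A column can only be named when a column list exists. */
    assert(has_list || !named);

    /* A plain column list takes precedence over the parameter. */
    if (has_list && !except)
        return named;

    /* EXCEPT removes the named columns whatever the parameter says. */
    if (has_list && except && named)
        return false;

    /* Otherwise the publication parameter decides. */
    return param == GENCOLS_STORED;
}
```

With EXCEPT and a named generated column the result is false for both
parameter values, so raising an error for 'none' adds nothing.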

> ~~~
>
> 15. GetRelationPublications
>
>   {
>   HeapTuple tup = &pubrellist->members[i]->tuple;
>   Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
> + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
> + bool is_table_excluded = ((Form_pg_publication) GETSTRUCT(pubtup))->puballtables &&
> + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept;
>
> - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
> + if (except_flag == is_table_excluded)
>   result = lappend_oid(result, pubid);
> +
> + ReleaseS
>
>
> I'm not 100% sure you need the additional 'pubtup'... Can't you just
> look at the "prattrs" field to see if a column-list was specified? If
> "prattrs" is null and "prexcept" is true, isn't that the same
> combination as what you are looking for here?
>
Yes, we can use this combination as well. Fixed it in the latest patch.

> ~~~
>
> 16. pg_get_publication_tables
>
> + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +    Anum_pg_publication_rel_prattrs,
> +    &(nulls[2]));
> +
> + /* if column list is specified with EXCEPT */
> + if (!pub->alltables && except)
> + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL);
> + else
> + values[2] = columnsDatum;
>
> 16a.
> Something seems fishy here. Isn't there a pathway where you missed
> assigning value[2] to anything?
>
Modified this change.

> ~
>
> 16b.
> Also, I feel there should be some other boolean variable used here
> instead of checking both (!pub->alltables && except) in multiple
> places.
>
Fixed
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 17. RelationSyncEntry
> +
> + /* Indicate if no column is included in the publication */
> + bool no_cols_published;
>
> Maybe this can have a more explanatory comment to explain why it is needed?
>
Fixed

> ~~~
>
> 18. check_and_init_gencol
>
> + bool found = false;
> + bool except_columns = false;
> +
> + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL,
> + NULL, &except_columns);
> +
>   /*
>   * The column list takes precedence over the
>   * 'publish_generated_columns' parameter. Those will be checked later,
> - * see pgoutput_column_list_init.
> + * see pgoutput_column_list_init. But when a column list is specified
> + * with EXCEPT, it should be checked.
>   */
> - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL))
> + if (found && !except_columns)
>   continue;
>
> The variable 'found' seems a poor name; how about 'has_column_list' or similar?
>
Fixed

> ~~~
>
> 19. pgoutput_change
>
> + /*
> + * If all columns of a table is present in column list specified with
> + * EXCEPT, skip publishing the changes.
> + */
> + if (relentry->no_cols_published)
> + return;
>
> /is present/are present/
>
fixed
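
For context, the condition behind 'no_cols_published' could be computed
along the following lines. This is a simplified standalone sketch, not
the pgoutput code: except_list_covers_all is a hypothetical name, and
it assumes columns are numbered 1..natts with no dropped columns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: returns true when an EXCEPT list names every one of
 * the table's 'natts' columns, leaving nothing to publish.
 */
static bool
except_list_covers_all(const int *except_attrs, size_t n, int natts)
{
    assert(natts > 0 && (except_attrs != NULL || n == 0));

    for (int att = 1; att <= natts; att++)
    {
        bool        excluded = false;

        for (size_t i = 0; i < n; i++)
        {
            if (except_attrs[i] == att)
                excluded = true;
        }

        if (!excluded)
            return false;       /* at least one column is still published */
    }
    return true;
}
```

When this returns true for a relation, there is nothing to send for a
change, so returning early from the change callback is sufficient.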

> ======
> src/bin/pg_dump/pg_dump.c
>
> 20. getPublicationTables
>
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
>
>   pubrinfo[j].dobj.catId.tableoid =
>   atooid(PQgetvalue(res, i, i_tableoid));
> @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
>   pubrinfo[j].pubrelqual = NULL;
>   else
>   pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
> + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0);
>
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp?
>
Fixed

> ~~~
>
> 21.
> - if (strcmp(prexcept, "t") == 0)
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp? Same also for the PQgetisnull(res, i,
> i_prattrs))...
>
Fixed
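
The idea of evaluating each condition once can be sketched as below.
This is a standalone illustration only: PubRelInfoSketch,
ObjTypeSketch and classify_pubrel are hypothetical stand-ins for the
pg_dump structures, and the two bool arguments replace the PQgetvalue
and PQgetisnull calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative stand-ins for the pg_dump object types. */
typedef enum ObjTypeSketch
{
    SKETCH_PUBLICATION_REL,
    SKETCH_PUBLICATION_EXCEPT_REL
} ObjTypeSketch;

typedef struct PubRelInfoSketch
{
    bool            pubexcept;  /* prexcept was "t" */
    ObjTypeSketch   objType;    /* how the entry will be dumped */
} PubRelInfoSketch;

/* Compute the flag once, then derive everything else from it. */
static void
classify_pubrel(PubRelInfoSketch *info, const char *prexcept,
                bool prattrs_isnull)
{
    assert(info != NULL && prexcept != NULL);

    info->pubexcept = (strcmp(prexcept, "t") == 0);

    /* EXCEPT without a column list means the whole table is excluded. */
    if (info->pubexcept && prattrs_isnull)
        info->objType = SKETCH_PUBLICATION_EXCEPT_REL;
    else
        info->objType = SKETCH_PUBLICATION_REL;
}
```

Later code can then test info->objType or info->pubexcept instead of
repeating the string comparison.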

> ~~~
>
> 22. dumpPublicationTable
>
>   if (pubrinfo->pubrattrs)
> - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + {
> + if (pubrinfo->pubexcept)
> + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs);
> + else
> + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + }
>
> SUGGESTION
> {
>   if (pubrinfo->pubexcept)
>     appendPQExpBuffer(query, " EXCEPT");
>
>   appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> }
Fixed
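
The shape of the simplified code can be shown with a standalone sketch
that uses snprintf in place of appendPQExpBuffer. The function name
append_column_list and the fixed-size buffer are assumptions for
illustration; the real code appends to a PQExpBuffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the optional column list clause into 'buf'.
 * Returns the number of characters the clause needs, as snprintf does.
 */
static int
append_column_list(char *buf, size_t buflen, const char *attrs, bool except)
{
    assert(buf != NULL && buflen > 0);

    /* No column list: emit nothing. */
    if (attrs == NULL)
    {
        buf[0] = '\0';
        return 0;
    }

    /* The keyword is the only difference between the two forms. */
    return snprintf(buf, buflen, "%s (%s)", except ? " EXCEPT" : "", attrs);
}
```

Emitting the keyword conditionally and the parenthesized list
unconditionally keeps the list formatting in one place.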

I have addressed your comments and shared the updated v15 patch set
here.

Thanks and Regards,
Shlok Kyal

On Mon, 30 Jun 2025 at 11:37, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for v15-0003.
>
> ======
> doc/src/sgml/catalogs.sgml
>
> 1.
>        <para>
> -       True if the relation must be excluded
> +       True if the column list or relation must be excluded from publication.
> +       If a column list is specified in <literal>prattrs</literal>, then
> +       exclude only those columns. If <literal>prattrs</literal> is NULL,
> +       then exclude the entire relation.
>        </para></entry>
>
> I noticed other fields on this page say "null" instead of "NULL". It
> seems like "null" is more conventional.
>
Fixed

> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>    <para>
>     If no column list is specified, any columns added to the table later are
>     automatically replicated. This means that having a column list which names
> -   all columns is not the same as having no column list at all.
> +   all columns is not the same as having no column list at all.
> Similarly, if an
> +   column list is specified with EXCEPT, any columns added to the table later
> +   are also replicated automatically.
>    </para>
>
> 2a.
> CURRENTLY
> If no column list or a column list with EXCEPT is specified, any
> columns added to the table later are automatically replicated. This
> means that having a column list which names all columns is not the
> same as having no column list at all. If an column list is specified,
> any columns added to the table later are automatically replicated.
>
> ~
>
> That still doesn't quite make sense. I think instead of saying "This
> means..." it needs to say something a bit like below:
>
> However, a normal column list (without EXCEPT) replicates only the
> specified columns and no more. Therefore, having a column list that
> names all columns is not the same as having no column list at all, as
> more columns may be added to the table later.
>
Fixed

> ~
>
> 2b.
> And the final sentence "If an column list..." looks like a cut/paste error (??)
>
Yes it was a mistake.

> ~
>
> 2c.
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed.

> ~~~
>
> 2.5A.
> The description about generated columns still says this:
>
> CURRENT:
> Generated columns can also be specified in a column list. This allows
> generated columns to be published, regardless of the publication
> parameter publish_generated_columns. See Section 29.6 for details.
>
> ~
>
> But I don't think it is quite correct. IMO gencols behaviour is much
> more subtle...
>
> e.g.
>
> a) Normal collist - these named cols are published REGARDLESS of the
> 'publish_generated_cols' parameter (same as before)
>
> b) EXCEPT collist - you can specify gencols in the list REGARDLESS of
> the 'publish_generated_cols' parameter, because since they are named
> as "except" then they will not be published anyhow....
>
> c) BUT for EXCEPT collist case, I think any gencols that are *not*
> covered by that EXCEPT collist should follow the rules according to
> the 'publish_generated_cols' parameter.
>
> So, it is much more tricky than the docs currently say:
>
Modified the documentation
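
The three cases (a), (b) and (c) above can be summarised as a small decision function. This is a minimal sketch under stated assumptions, not code from the patch: the enum and function names are hypothetical, and "generated" here means a stored generated column only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { COLLIST_NONE, COLLIST_NORMAL, COLLIST_EXCEPT } ColListKind;
typedef enum { GENCOLS_NONE, GENCOLS_STORED } GencolsSetting;

/*
 * Decide whether one column is published.
 *   kind         - which kind of column list the publication has
 *   in_list      - whether the column is named in that list
 *   is_generated - whether the column is a stored generated column
 *   setting      - value of the publish_generated_columns parameter
 */
bool
column_is_published(ColListKind kind, bool in_list, bool is_generated,
                    GencolsSetting setting)
{
    switch (kind)
    {
        case COLLIST_NORMAL:
            /* (a) named columns are published regardless of the parameter */
            return in_list;
        case COLLIST_EXCEPT:
            /* (b) columns named in EXCEPT are never published */
            if (in_list)
                return false;
            /* (c) un-listed columns follow the parameter rule below */
            break;
        case COLLIST_NONE:
            break;
    }

    /* Without an explicit mention, generated columns need the parameter. */
    return !is_generated || setting == GENCOLS_STORED;
}
```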

> Also
>
> 2.5B.
> - The text says "See Section 29.6 for details," but there are no
> examples of these combinations (e.g. EXCEPT collist and diff parameter
> setting)
>
Added documentation.

> 2.5C,
> - The regression tests also need to be more complex to cover these
>
Added tests related to these

> 2.5D.
> - You might need to add something in the CREATE PUBLICATION "NOTES"
> section after all -- even if it just refers to here.
>
Added documentation

> ~~~
>
> 3.
>     <para>
>      Create a publication <literal>p1</literal>. A column list is defined for
> -    table <literal>t1</literal> to reduce the number of columns that will be
> -    replicated. Notice that the order of column names in the column list does
> -    not matter.
> +    table <literal>t1</literal>, and another column list is defined for table
> +    <literal>t2</literal> using the EXCEPT clause to reduce the number of
> +    columns that will be replicated. Note that the order of column names in
> +    the column lists does not matter.
>  <programlisting>
> -/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d);
> +/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a);
>  </programlisting></para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 4.
>       <para>
> -      When a column list is specified, only the named columns are replicated.
> -      The column list can contain stored generated columns as well. If the
> -      column list is omitted, the publication will replicate all non-generated
> -      columns (including any added in the future) by default. Stored generated
> -      columns can also be replicated if
> <literal>publish_generated_columns</literal>
> -      is set to <literal>stored</literal>. Specifying a column list has no
> -      effect on <literal>TRUNCATE</literal> commands. See
> +      When a column list without EXCEPT is specified, only the named
> columns are
> +      replicated. The column list can contain stored generated columns as well.
> +      If the column list is omitted, the publication will replicate
> +      all non-generated columns (including any added in the future) by default.
> +      Stored generated columns can also be replicated if
> +      <literal>publish_generated_columns</literal> is set to
> +      <literal>stored</literal>. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands. See
>        <xref linkend="logical-replication-col-lists"/> for details about column
>        lists.
>       </para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed

> ~~~
>
> 5.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands.
> +     </para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>.
>
Fixed

> ** Note all the extra subtleties that I mentioned in the review
> comment #2.5 above --- e.g. IMO any *un-listed* gencols still should
> follow the parameter rules.
>
> ~~~
>
> 6.
>    <para>
>     Any column list must include the <literal>REPLICA IDENTITY</literal> columns
> -   in order for <command>UPDATE</command> or <command>DELETE</command>
> -   operations to be published. There are no column list restrictions if the
> -   publication publishes only <command>INSERT</command> operations.
> +   and any column list specified with EXCEPT must not include the
> +   <literal>REPLICA IDENTITY</literal> columns in order for
> +   <command>UPDATE</command> or <command>DELETE</command> operations to be
> +   published. There are no column list restrictions if the
> publication publishes
> +   only <command>INSERT</command> operations.
>    </para>
>
> 6a.
> CURRENT:
> Any column list must include the REPLICA IDENTITY columns, and any
> column list specified with EXCEPT must not include the REPLICA
> IDENTITY columns in order for UPDATE or DELETE operations to be
> published.
>
> ~
>
> I felt that might be better expressed the other way around. Also, it
> might be better to say "not name" instead of "not include" because
> EXCEPT + include seemed a bit contrary.
>
>
> SUGGESTION (maybe like this)
> In order for UPDATE or DELETE operations to work, all the REPLICA
> IDENTITY columns must be published. So, any column list must name all
> REPLICA IDENTITY columns, and any EXCEPT column list must not name any
> REPLICA IDENTITY columns.
>
Fixed
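
The rule in the suggestion above can be expressed as a set check. The sketch below is illustrative only: it uses a plain 32-bit mask in place of a Bitmapset, and collist_covers_replica_identity is a hypothetical name, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Each bit of a mask represents one column (an assumption made for this
 * sketch; PostgreSQL itself uses Bitmapset for this purpose).
 *
 * A normal column list must name every REPLICA IDENTITY column.
 * An EXCEPT column list must name none of them.
 */
bool
collist_covers_replica_identity(uint32_t list_cols, bool is_except,
                                uint32_t ri_cols)
{
    if (is_except)
        return (list_cols & ri_cols) == 0;

    return (list_cols & ri_cols) == ri_cols;
}
```

For example, with columns a, b, c mapped to bits 0, 1, 2 and a replica identity on (a), EXCEPT (b, c) passes the check while EXCEPT (a, b) does not.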

> ~~
>
> 6b.
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed

> ======
> src/backend/catalog/pg_publication.c
>
> check_and_fetch_column_list:
>
> 7.
> + /* Lookup the except attribute */
> + cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple,
> +   Anum_pg_publication_rel_prexcept, &isnull);
> +
> + if (!isnull)
> + {
> + Assert(!pub->alltables);
> + *except_columns = DatumGetBool(cfdatum);
> + }
> +
>
> I felt it would be safer to also assign *except_columns = false;
> up-front so the caller could be sure this flag was meaningful on
> return.
>
Fixed
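
The pattern being asked for is the usual one of initialising an output parameter before any early exit. A minimal standalone sketch, where fetch_except_flag is a hypothetical helper and a NULL pointer plays the role of a null catalog attribute:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical illustration: "prexcept" points to the stored value, or is
 * NULL when the attribute is null. The output flag is set to false up-front
 * so the caller can rely on it on every return path.
 * Returns true when a value was found.
 */
bool
fetch_except_flag(const bool *prexcept, bool *except_columns)
{
    *except_columns = false;    /* always meaningful on return */

    if (prexcept == NULL)
        return false;

    *except_columns = *prexcept;
    return true;
}
```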

> ~~~
>
> pub_form_cols_map:
>
> 8.
> Maybe use snake case like for other params, so /excepcols/except_cols/
>
Fixed

> ~~~
>
> pg_get_publication_tables:
>
> 9.
>
> I felt all the logic in this function maybe can be simpler:
>
> e.g. If you just have "Bitmapset *except_columns = NULL;" then null
> means there are no except columns; otherwise there are. This means you
> don't need a separate 'bool except_column' variable.
>
> e.g. Assign the Bitmapset *except_columns after you already have the
> values[2], instead of doing it later.
>
> e.g. The skip code if (except_columns && bms_is_member(att->attnum,
> columns)) could just check the list member, I think, without the
> additional bool.
>
> ~~~
>
Fixed

> 10.
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and not
> + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that
> + * prattrs contains columns to be excluded for replication.
> + */
> + if (!isnull)
> + except_columns = DatumGetBool(exceptDatum);
>
>
> /indicate/indicates/
>
Fixed

> ======
> src/backend/parser/gram.y
>
> 11.
> + | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->pubobjtype = PUBLICATIONOBJ_TABLE;
> + $$->pubtable = makeNode(PublicationTable);
> + $$->pubtable->relation = $2;
> + $$->pubtable->columns = $4;
> + $$->pubtable->whereClause = $5;
> + $$->pubtable->except = true;
> + $$->location = @1;
> + }
>
> I wasn't expecting you would need another 'opt_except_column_list' and
> all the code duplication that causes. AFAIK, the syntax is identical
> for 'opt_column_list' apart from the preceding EXCEPT so I thought all
> you need is to allow the 'opt_column_list' to have an optional EXCEPT
> qualifier.
>
The main reason I used a separate 'opt_except_column_list' is because
'opt_column_list' can also be NULL. But the column list specified with
EXCEPT cannot be NULL. So, 'opt_except_column_list' is defined such that
it cannot be null.

> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 12.
> +
> + /*
> + * Indicates whether no columns are published for a given relation. With
> + * the introduction of the EXCEPT clause in column lists, it is now
> + * possible to define a publication that excludes all columns of a table.
> + * However, the 'columns' attribute cannot represent this case, since a
> + * NULL value implies that all columns are published. To distinguish this
> + * scenario, the 'no_cols_published' flag is introduced.
> + */
> + bool no_cols_published;
>  } RelationSyncEntry;
>
> But, what about when Bitmapset *columns is not null, but has no bits
> set -- doesn't that mean the same as "no columns"?
>
I think this is not possible. A Bitmapset with no bits set is NULL. I
saw the following comment in bitmapset.c:
"By convention, we always represent a set with
 * the minimum possible number of words, i.e, there are never any trailing
 * zero words.  Enforcing this requires that an empty set is represented as
 * NULL.  Because an empty Bitmapset is represented as NULL, a non-NULL
 * Bitmapset always has at least 1 Bitmapword."
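
To illustrate why the extra flag is needed, here is a small standalone sketch. It is not pgoutput code: a 32-bit mask stands in for the Bitmapset, the value 0 plays the role of NULL, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant RelationSyncEntry fields. */
typedef struct
{
    uint32_t columns;           /* 0 plays the role of a NULL Bitmapset */
    bool     no_cols_published; /* true when EXCEPT removed every column */
} SyncEntrySketch;

/* Build the entry for a table whose publication uses EXCEPT (except_cols). */
SyncEntrySketch
apply_except_list(uint32_t all_cols, uint32_t except_cols)
{
    SyncEntrySketch entry;

    entry.columns = all_cols & ~except_cols;
    /* An empty result would otherwise be read as "all columns". */
    entry.no_cols_published = (entry.columns == 0);
    return entry;
}

/* attno is assumed to be in the range 0..31 for this sketch. */
bool
column_published(const SyncEntrySketch *entry, int attno)
{
    if (entry->no_cols_published)
        return false;
    if (entry->columns == 0)
        return true;            /* "NULL" set: every column is published */
    return ((entry->columns >> attno) & 1u) != 0;
}
```

Without the flag, an EXCEPT list naming every column would produce the same empty set as a table with no column list, and the two cases could not be told apart.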

> ======
> src/include/catalog/pg_publication.h
>
> 13.
>  extern Bitmapset *pub_form_cols_map(Relation relation,
> - PublishGencolsType include_gencols_type);
> + PublishGencolsType include_gencols_type,
> + Bitmapset *exceptcols);
>
> Maybe snake-case like the other params: /exceptcols/except_cols/
>
Fixed

> ======
> src/test/regress/sql/publication.sql
>
> 14.
> +-- Verify that publication is created with EXCEPT
> +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1,
> pub_sch1.pub_test_except2 EXCEPT (b, c);
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
> +
>
> I think tests should also use psql \dRp+ commands in places to show
> that the "describe" stuff is working correctly.
>
> ~~~
Fixed

>
> 15.
> +-- Check for invalid cases
> +CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1,
> TABLE pub_test_except1 EXCEPT (b, c);
> +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
>
> Should explain more about what you are testing here:
> a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA
> b) syntax error EXCEPT without a col-list
>
> ~~~
Fixed

>
> 16.
> +-- Verify that publication can be altered with EXCEPT
> +ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT
> (a, b), pub_sch1.pub_test_except2;
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
>
> The comment is a bit misleading because there are many kinds of
> "alter". Maybe say more like
> Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list)
>
> ~~~
Fixed

>
> 17.
> +-- Verify ALTER PUBLICATION ... DROP
> +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b);
> +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1;
>
> Should explain more:
> +-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list)
> +-- Verify ok - ALTER PUBLICATION ... DROP ...
>
> ~~~
Fixed

>
> 18.
> +ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d);
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
>
> Missing comment:
> +-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list)
>
> ~~~
Fixed

>
> 19.
> +-- Verify excluded columns cannot be part of REPLICA IDENTITY
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
> +DROP INDEX pub_test_except1_a_idx;
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +
> +DROP INDEX pub_test_except1_a_idx;
>
> 19a.
> IIUC, really there are multiple tests here, so I think it should all
> be split and commented separately.
>
> a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
> b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> c) Verify that so long as no clash between RI cols and the EXCEPT
> col-list, then it is ok
>
> ~
Fixed

>
> 19b.
> IMO, some index names could be better:
>
> CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
> How about 'pub_test_except1_ac_idx'?
>
> ~~~
>
Fixed

> 20.
> +DROP PUBLICATION testpub_except;
> +DROP TABLE pub_test_except1;
> +DROP TABLE pub_sch1.pub_test_except2;
>
> Add a "cleanup" comment.
>
Added

I have addressed the comments and attached the latest v16 patch.

Thanks and Regards,
Shlok Kyal

On Mon, 21 Jul 2025 at 12:17, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for patch v16-0003.
>
> ======
> Commit message
>
> 1.
> The column "prexcept" of system catalog "pg_publication_rel" is set to
> "true" when publication is created with EXCEPT table or EXCEPT column
> list. If column "prattrs" of system catalog "pg_publication_rel" is also
> set or column "puballtables" of system catalog "pg_publication" is
> "false", it indicates the column list is specified with EXCEPT clause
> and columns in "prattrs" are excluded from being published.
>
> ~
>
> Somehow, this seems to contain too much information, making it a bit
> confusing. Can't you chop this down to something like below?
>
> SUGGESTION
> When column "prexcept" of system catalog "pg_publication_rel" is set
> to "true", and column "prattrs" of system catalog "pg_publication_rel"
> is not NULL, that means the publication was created with "EXCEPT
> (column-list)", and the columns in "prattrs" will be excluded from
> being published.
>
Modified the commit message as per suggestion.

> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link>. Generated
> columns can be
> +   specified in a column list using the <literal>EXCEPT</literal> clause. This
> +   excludes the specified generated columns from being published, regardless of
> +   the <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> setting. However, for
> +   generated columns that are not listed in the <literal>EXCEPT</literal>
> +   clause, whether they are published or not still depends on the value of
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
>     <literal>publish_generated_columns</literal></link>. See
>     <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> ~~
>
> For this part:
>
> "Generated columns can be specified in a column list using the
> <literal>EXCEPT</literal> clause. This excludes the specified
> generated columns from being published, regardless of..."
>
> I think the whole paragraph already said "Generated columns can also
> be specified in a column list", so you don't need to repeat it.
> Instead, maybe say something like below.
>
> SUGGESTION
> Specifying generated columns in a column list using the
> <literal>EXCEPT</literal> clause excludes those columns from being
> published, regardless of...
>
> ~~~
>
Modified

> 3.
> -                               Publication p1
> -  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -----------+------------+---------+---------+---------+-----------+----------
> - postgres | f          | t       | t       | t       | t         | f
> +                                        Publication p1
> + Owner  | All tables | Inserts | Updates | Deletes | Truncates |
> Generated columns | Via root
> +--------+------------+---------+---------+---------+-----------+-------------------+----------
> + ubuntu | f          | t       | t       | t       | t         | none
>              | f
>  Tables:
>      "public.t1" (id, a, b, d)
> +    "public.t2" EXCEPT (a, d)
>  </programlisting></para>
>
>
> I noticed the Owner changed from "postgres" to "ubuntu". Do you think
> it is better to keep this as "postgres" for the example?
I agree that it is better to keep "postgres". I have reverted the
example to use "postgres".

>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 4.
> The tables added to a publication that publishes UPDATE and/or DELETE
> operations must have REPLICA IDENTITY defined. Otherwise those
> operations will be disallowed on those tables.
>
> In order for UPDATE or DELETE operations to work, all the REPLICA
> IDENTITY columns must be published. So, any column list must name all
> REPLICA IDENTITY columns, and any EXCEPT column list must not name any
> REPLICA IDENTITY columns.
>
> A row filter expression (i.e., the WHERE clause) must contain only
> columns that are covered by the REPLICA IDENTITY, in order for UPDATE
> and DELETE operations to be published. For publication of INSERT
> operations, any column may be used in the WHERE expression. The row
> filter allows simple expressions that don't have user-defined
> functions, user-defined operators, user-defined types, user-defined
> collations, non-immutable built-in functions, or references to system
> columns.
>
> The generated columns that are part of the column list specified with
> the EXCEPT clause are not published, regardless of the
> publish_generated_columns option. However, generated columns that are
> not part of the column list specified with the EXCEPT clause are
> published according to the value of the publish_generated_columns
> option. See Section 29.6 for details.
>
> The generated columns that are part of REPLICA IDENTITY must be
> published explicitly either by listing them in the column list or by
> enabling the publish_generated_columns option, in order for UPDATE and
> DELETE operations to be published.
>
> ~~
>
> Notice all those 5 paragraphs (above) are talking about REPLICA
> IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be
> moved to last, to keep all the REPLICA IDENTITY stuff together.
>
Fixed

> ======
> src/backend/catalog/pg_publication.c
>
> 5. pub_form_cols_map
>
>   * Returns a bitmap representing the columns of the specified table.
>   *
>   * Generated columns are included if include_gencols_type is
> - * PUBLISH_GENCOLS_STORED.
> + * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from
> + * the column list.
>   */
>  Bitmapset *
> -pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type)
> +pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type,
> +   Bitmapset *except_cols)
>
> Forgot to add the underscore in the function comment.
>
> /exceptcols/except_cols/
>
Fixed

> ~~~
>
> 6. pg_get_publication_tables
>
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and not
> + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that
> + * prattrs contains columns to be excluded for replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum) && !nulls[2])
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
>
> But, you cannot have EXCEPT for null column list, so shouldn't the
> !nulls[2] check be done to also guard the SysCacheGetAttr call?
>
Fixed

> ======
> src/backend/parser/gram.y
>
> 7.
>
> Shlok wrote [1-reply #11]
> The main reason I used a separate 'opt_except_column_list' is because
> 'opt_column_list' can also be NULL. But the column list specified with
> EXCEPT cannot be NULL. So, 'opt_except_column_list' is defined such that
> it cannot be null.
>
> ~
>
> Yeah, but IMO that leads to excessive duplicated code. I think the
> code can perhaps be a lot simpler if the grammar is written more like
> the synopsis:
>
> e.g. TABLE name opt_EXCEPT opt_column_list
>
> where - opt_EXCEPT is null, and opt_column_list is null... means no col list
> where - opt_EXCEPT is null, and opt_column_list is not null... means
> normal col list
> where - opt_EXCEPT is not null, and opt_column_list not null... means
> EXCEPT col list
> where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR
>
> So code it something like this (just adding opt_EXCEPT to the existing
> productions)
>
> %type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT
> ...
> opt_EXCEPT:
> EXCEPT { $$ = true; }
> | /*EMPTY*/ { $$ = false; }
> ;
> ...
> TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause
> {
>   $$ = makeNode(PublicationObjSpec);
>   $$->pubobjtype = PUBLICATIONOBJ_TABLE;
>   $$->pubtable = makeNode(PublicationTable);
>   $$->pubtable->relation = $2;
>   $$->pubtable->except = $3;
>   $$->pubtable->columns = $4;
>   if ($3 && !$4)
>     ereport(ERROR,
>       (errcode(ERRCODE_SYNTAX_ERROR),
>       errmsg("EXCEPT without column list"),
>       parser_errposition(@3)));
>   $$->pubtable->whereClause = $5;
>   $$->location = @1;
> }
>
> etc.
>
I have modified it. I have also created a function 'check_except_collist'
that throws the error, to avoid duplicating the error-message code.
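
The check itself is small. A standalone sketch of the idea follows; the struct and function names are hypothetical and the real function reports the problem with ereport(), which is replaced here by returning a message string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the parsed publication table node. */
typedef struct
{
    const char  *relname;
    const char **columns;   /* NULL when no column list was given */
    bool         except;    /* true when the EXCEPT keyword was present */
} PubTableSketch;

/*
 * Return an error message when EXCEPT was given without a column list,
 * or NULL when the combination is acceptable.
 */
const char *
check_except_sketch(const PubTableSketch *table)
{
    if (table->except && table->columns == NULL)
        return "EXCEPT without column list";

    return NULL;
}
```

Keeping the check in one helper means each grammar production that accepts the optional EXCEPT can share the same error text.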

> ======
> src/bin/psql/describe.c
>
> 8.
>   if (!PQgetisnull(res, i, 3))
> + {
> + if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0)
> + appendPQExpBuffer(buf, " EXCEPT");
>   appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3));
> + }
>
> This growing list of columns makes it hard to understand this function
> without looking back at the caller all the time. Maybe you can add a
> function comment that at least explains what those attributes 1,2,3,4
> represent?
>
Added a comment

> ======
> src/bin/psql/tab-complete.in.c
>
> 9.
> + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET",
> "TABLE", MatchAny))
> + COMPLETE_WITH("EXCEPT");
>
> Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

> ~~~
>
> 10.
>   else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE",
> MatchAny) && !ends_with(prev_wd, ','))
> - COMPLETE_WITH("WHERE (", "WITH (");
> + COMPLETE_WITH("EXCEPT", "WHERE (", "WITH (");
>
> Ditto. Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

>
> ======
> src/test/regress/expected/publication.out
>
> 11.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +ERROR:  index "pub_test_except1_a_idx" for table "pub_test_except1"
> does not exist
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +ERROR:  cannot update table "pub_test_except1"
> +DETAIL:  Column list used by the publication does not cover the
> replica identity.
> +DROP INDEX pub_test_except1_ac_idx;
>
>
> What's happening here? I'm not sure these are the kind of errors you
> were trying to cause.
>
Yes, it is not the error I was trying to cause. I have modified it.

> ======
> src/test/regress/sql/publication.sql
>
> 12.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 13.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +DROP INDEX pub_test_except1_ac_idx;
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 14.
> +-- Verify that so long as no clash between RI cols and the EXCEPT
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +
>
> That comment doesn't make sense. Missing words?
>
Fixed

> ======
> .../t/036_rep_changes_except_table.pl
>
> 15.
> (I haven't reviewed this file in detail yet, but here is a general comment)
>
> I know this patch currently lives in the same thread as all the EXCEPT
> TABLE stuff, but that seems just happenstance to me. IMO, this is a
> separate enhancement that just shares the keyword EXCEPT. So, I felt
> it should have quite separate tests too.
>
> e.g. How about: 037_rep_changes_except_collist.pl
>
Modified

> ======
> [1] https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Thanks,
Shlok Kyal

On Tue, 22 Jul 2025 at 07:28, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> Some review comments for patch v17-0003. I also checked the TAP test this time.
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 1.
> +   <literal>publish_generated_columns</literal></link>. Specifying generated
> +   columns in a column list using the <literal>EXCEPT</literal> clause excludes
> +   the specified generated columns from being published, regardless of the
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> setting. However, for
>
> I think that is not quite the same wording I had previously suggested.
> It sounds a bit odd/redundant saying "Specifying" and "specified" in
> the same sentence.
>
> ======
> src/backend/parser/gram.y
>
> 2. check_except_collist
>
> I'm wondering if this checking should be done within the existing
> preprocess_pubobj_list() function, alongside all the other ERROR
> checking. Care needs to be taken to make sure the pubtable->except is
> referring to an EXCEPT (col-list), instead of the other kind of EXCEPT
> tables, but in general I think it is better to keep all the
> publication combinations checking errors like this in one place.
>
Added the check in preprocess_pubobj_list(). I checked the syntax and
found that this function is not called for "FOR ALL TABLES" cases, and
EXCEPT tables can only be used with "FOR ALL TABLES" publications.
So, I think handling for "EXCEPT tables" is not required in the
function preprocess_pubobj_list().

>
> ======
> src/bin/psql/describe.c
>
> 3. addFooterToPublicationDesc
>
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (!PQgetisnull(result, i, 3) &&
> + strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> +   PQgetvalue(result, i, 2));
> + else
> + appendPQExpBuffer(&buf, " (%s)",
> +   PQgetvalue(result, i, 2));
> + }
>
> Do you really need to check !PQgetisnull(result, i, 3) here?  (e.g.
> The comment does not say that this attribute can be NULL)
>
> ======
> .../t/037_rep_changes_except_collist.pl
>
> 4.
> +# Copyright (c) 2021-2025, PostgreSQL Global Development Group
> +
> +# Logical replication tests for except table publications
>
> Comment is wrong. These tests are for EXCEPT (column-list)
>
> ~~~
>
> 5.
> +# Test for except column publications
> +# Initial setup
> +$node_publisher->safe_psql('postgres', "CREATE SCHEMA sch1");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE sch1.tab2 (a int, b int, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab3 (a int, b int, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab4 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
> c int GENERATED ALWAYS AS (a * 3) STORED)"
> +);
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab5 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
> c int GENERATED ALWAYS AS (a * 3) STORED)"
> +);
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (1, 2, 3)");
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO sch1.tab2 VALUES (1, 2, 3)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_col FOR TABLE tab2 EXCEPT (a), sch1.tab2
> EXCEPT (b, c)"
> +);
>
> 5a.
> I think you don't need to say "Test for except column publications",
> because that is the purpose of the entire file.
>
> ~
>
> 5b.
> You can combine multiple of these safe_psql calls together
>
> ~
>
> 5c.
> It might help make tests easier to read if you named those generated
> columns 'b', 'c' cols as 'bgen', 'cgen' instead.
>
> ~
> 5d.
> The table names are strange, because why does it start at tab2 when
> there is not a tab1?
> ~~~
>
> 6.
> +$node_subscriber->safe_psql('postgres', "CREATE SCHEMA sch1");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE sch1.tab2 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab3 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab4 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab5 (a int, b int, c int)");
>
> You can combine multiple of these safe_psql calls together
>
> ~~~
>
> 7.
> +# Test initial sync
> +my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2");
> +is($result, qq(|2|3),
> + 'check that initial sync for except column publication');
>
> The message seems strange. Do you mean "check initial sync for an
> 'EXCEPT (column-list)' publication"
>
> NOTE: There are many other messages where you wrote "for except column
> publication" but I think maybe all of those can be improved a bit like
> above.
>
> ~~~
>
> 8.
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (4, 5, 6)");
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO sch1.tab2 VALUES (4, 5, 6)");
> +$node_publisher->wait_for_catchup('tap_sub_col');
>
> 8a.
> You can combine multiple of these safe_psql calls together.
>
> NOTE: I won't keep repeating this review comment but I think maybe
> there are lots more places where the safe_psql calls can all be combined to
> execute multiple statements.
>
> ~
>
> 8b.
> I felt all those commands should be under the "Test incremental
> changes" comment.
>
> ~~~
>
> 9.
> +is($result, qq(1||3), 'check alter publication with EXCEPT');
>
> Maybe that should've said with 'EXCEPT (column-list)'
>
> ~~~
>
> 10.
> +# Test for publication created with publish_generated_columns as true on table
> +# with generated columns and column list specified with EXCEPT
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab4 VALUES (1)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER PUBLICATION tap_pub_col SET (publish_generated_columns)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER PUBLICATION tap_pub_col SET TABLE tab4 EXCEPT(b)");
> +$node_subscriber->safe_psql('postgres',
> + "ALTER SUBSCRIPTION tap_sub_col REFRESH PUBLICATION");
> +$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_col');
>
> 10a.
> I felt the test comments for both those generated columns parameter
> test should give more explanation to say what is the expected result
> and why.
>
> ~
>
> 10b.
> How does "ALTER PUBLICATION tap_pub_col SET
> (publish_generated_columns)" even work? I thought the
> "publish_generated_columns" is an enum but you did not specify any enum
> value here (???)
>
> ~~~
Yes, it works. It is equivalent to publish_generated_columns = stored.
E.g.:
postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 with (publish_generated_columns);
CREATE PUBLICATION
postgres=# select * from pg_publication;
  oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot | pubgencols
-------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------+------------
 16395 | pub1    |       10 | f            | t         | t         | t         | t           | f          | s
(1 row)

For this patch, I have modified the test to use
'publish_generated_columns = stored'.

>
> 11.
> + 'check publication(publish_generated_columns as false) with generated columns and EXCEPT'
>
> Hmm. I thought there is no such thing as "publish_generated_columns as
> false", and also the EXCEPT should say 'EXCEPT (column-list)'
>
> ~~~
>
> 12.
> I wonder if there should be another boundary condition test case as follows:
> - have some table with cols a,b,c.
> - create a publication 'EXCEPT (a,b,c)', so you don't publish anything at all.
> - then ALTER the TABLE to add a column 'd'.
> - now the publication should publish only 'd'.
> ======

I have fixed all the comments and added the changes in the latest v18 patch.
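
For readers following comment 12, the expected behaviour can be sketched as a set difference. This is only an illustration, not code from the patch: the bitmask representation and the name published_columns are assumptions made for the example, with bit 0 standing for column a, bit 1 for b, and so on.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: columns published by an 'EXCEPT (column-list)'
 * publication, modelled as "all table columns minus the excluded ones".
 * Each bit represents one column of the table.
 */
static uint32_t
published_columns(uint32_t table_cols, uint32_t except_cols)
{
    return table_cols & ~except_cols;
}
```

With a, b and c all excluded the result is empty; once column d is added to the table, only d's bit remains set, which matches what comment 12 expects the publication to publish.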

Thanks,
Shlok Kyal

On Mon, 4 Aug 2025 at 13:03, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 2:07 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> ...
> > > 10b.
> > > How does "ALTER PUBLICATION tap_pub_col SET
> > > (publish_generated_columns)" even work? I thought the
> > > "publish_generated_columns" is an enum but you did not specify any enum
> > > value here (???)
> > >
> > > ~~~
> > Yes, it works. It works equivalent to publish_generated_columns = stored.
> > Eg:
> > postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 with (publish_generated_columns);
> > CREATE PUBLICATION
> > postgres=# select * from pg_publication;
> >   oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot | pubgencols
> > -------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------+------------
> >  16395 | pub1    |       10 | f            | t         | t         | t         | t           | f          | s
> > (1 row)
> >
>
> Hmm -- it's not documented to behave like that, so I've created
> another thread for getting to the bottom of this topic.
>
> ~~~
>
> Meanwhile, here are my review comments for patch v18-0003
>
> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 1.
> if (nattnums > 0)
> {
> values[2] = PointerGetDatum(buildint2vector(attnums, nattnums));
> nulls[2] = false;
> }
> else
> nulls[2] = true;
>
> Is there any possibility that values[2] might not be null, but then
> nattnums skips some cols so remains 0? Then the final values[2] would
> conflict with nulls[2], which seems strange. Maybe it is safer to also
> assign values[2] = null in the else.
>
Yes. When all the columns of a table are present in 'EXCEPT
(column-list)', effectively no column should be replicated. In
such cases we should mark nulls[2] as true.
I agree with your point that values[2] should be made null. I have
used '(Datum) 0', in accordance with other places.
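
A minimal sketch of the idea, assuming a simplified stand-in for Datum and a hypothetical helper name set_attrs_slot (neither is taken from the patch): both arrays are written in every branch, so values[2] and nulls[2] can never disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's Datum; not the real definition. */
typedef uintptr_t Datum;

/*
 * Fill slot 2 of a values/nulls pair from the number of published columns.
 * When nothing is published, values[2] is reset as well as nulls[2], so a
 * stale value cannot survive next to nulls[2] = true.
 */
static void
set_attrs_slot(Datum *values, bool *nulls, Datum attrs, int nattnums)
{
    if (nattnums > 0)
    {
        values[2] = attrs;
        nulls[2] = false;
    }
    else
    {
        values[2] = (Datum) 0;
        nulls[2] = true;
    }
}
```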

> ======
> src/backend/replication/logical/tablesync.c
>
> fetch_remote_table_info:
>
> 2.
>  static void
>  fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
> - List **qual, bool *gencol_published)
> + List **qual, bool *gencol_published,
> + bool *no_cols_published)
>
> This new parameter should be documented in the function comment.
>
> ~~~
>
> 3.
> + if (server_version >= 190000)
> + *no_cols_published = DatumGetBool(slot_getattr(tslot, 2, &isnull));
> +
>
> It seems that *no_cols_published (and *gencol_published) are assigned
> false by the caller. I had to go looking for that, so IMO it would be
> better to put Assert at the top of here so it is self-documenting
>
> Assert(*gencol_published == false);
> Assert(*no_cols_published == false);
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 4.
> + /*
> + * Indicates whether no columns are published for a given relation. With
> + * the introduction of the EXCEPT clause in column lists, it is now
> + * possible to define a publication that excludes all columns of a table.
> + * However, the 'columns' attribute cannot represent this case, since a
> + * NULL value implies that all columns are published. To distinguish this
> + * scenario, the 'no_cols_published' flag is introduced.
> + */
> + bool no_cols_published;
>
> The wording of the comment seems a bit strange -- EXCEPT is not a clause.
>
> BEFORE:
> the introduction of the EXCEPT clause in column lists, ...
>
> SUGGESTION
> the introduction of the EXCEPT qualifier for column lists, ....
>
> ~~~
>
> 5.
>   Bitmapset  *cols = NULL;
> + bool except_columns = false;
> + bool no_col_published = false;
>
> There are multiple places in this patch that say:
>
> 'no_col_published'
> or 'no_cols_published'
>
> I felt this var name can be misunderstood because it is easy to read
> "no" as meaning "no." (aka number), and then misinterpret as
> "number_of_cols_published".
>
> Maybe an unambiguous name can be found, like
> - 'zero_cols_published' or
> - 'nothing_published' or
> - really make it 'num_cols_published' and check for 0.
>
> (so this comment applies to multiple places in the patch)
>
How about 'all_cols_excluded'? Or 'has_published_cols'?
I have used 'all_cols_excluded' in this patch. Thoughts?
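
To illustrate why a separate flag is needed at all, here is a rough sketch. The struct ColFilter, the function column_is_published and the 32-bit mask are assumptions for the example only (the patch uses a Bitmapset); attnum is assumed to be in the range 0..31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified entry; the real code uses a Bitmapset. */
typedef struct ColFilter
{
    const uint32_t *cols;       /* NULL means "all columns are published" */
    bool        all_cols_excluded;  /* EXCEPT (...) named every column */
} ColFilter;

/* Decide whether one column is published under the given filter. */
static bool
column_is_published(const ColFilter *f, int attnum)
{
    if (f->all_cols_excluded)
        return false;           /* checked first: cols alone cannot say this */
    if (f->cols == NULL)
        return true;            /* no column list: publish everything */
    return (*f->cols & (UINT32_C(1) << attnum)) != 0;
}
```

Since a NULL column set already means "publish everything", the flag is the only way to express "publish nothing", and the name 'all_cols_excluded' cannot be misread as a count.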

> ~~
>
> 6.
>   * of the table (including generated columns when
>   * 'publish_generated_columns' parameter is true).
>   */
> - if (!cols)
> + if (!no_col_published && !cols)
>   {
>
> The existing comment above this code fragment also needs to mention
> "EXCEPT (column-list)" where all the columns are excluded
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 7.
>   /* column list (if any) */
>   if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> +   PQgetvalue(result, i, 2));
> + else
> + appendPQExpBuffer(&buf, " (%s)",
> +   PQgetvalue(result, i, 2));
> + }
>
> Isn't this code fragment (and also surrounding code) using the same
> logic as what is already encapsulated in the function
> addFooterToPublicationDesc()?
> Superficially, it seems like a large chunk can all be replaced with a
> single call to the existing function.
>
'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
"schema_name.table_name" EXCEPT (column-list)
Whereas the code pasted above is executed when we use \d+ table_name and
the output is in the format:
"publication_name" EXCEPT (column-list)

These pieces of code are used to print different info. One is used to
print info related to tables and the other is used to print info
related to publication.
Should we use a common function for this?
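
One possible shape for such a shared helper, as a sketch only: the name format_footer_entry and its signature are hypothetical, and it uses plain snprintf rather than psql's PQExpBuffer API. The caller passes either a qualified table name or a publication name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format one footer entry as
 *   "name"                      when there is no column list
 *   "name" (col-list)           for an ordinary column list
 *   "name" EXCEPT (col-list)    when the list is an exclusion list
 */
static int
format_footer_entry(char *buf, size_t buflen, const char *name,
                    const char *collist, bool except)
{
    if (collist == NULL)
        return snprintf(buf, buflen, "\"%s\"", name);
    return snprintf(buf, buflen, "\"%s\"%s (%s)",
                    name, except ? " EXCEPT" : "", collist);
}
```

Whether this saves enough lines to be worthwhile would depend on how much of the surrounding footer code differs between the two callers.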

> ======
> src/test/regress/expected/publication.out
>
> 8.
> +-- Syntax error EXCEPT without a col-list
> +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> +ERROR:  EXCEPT clause not allowed for table without column list
> +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> +                                               ^
>
> Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> the word "TABLE" instead of "EXCEPT"?
>
In function 'preprocess_pubobj_list', the position of the marker
(^) is determined by "pubobj->location". The function handles multiple
errors, and setting "$$->location" specifically for the EXCEPT qualifier
would not be appropriate. One solution, I feel, is to not show the
"position marker (^)" in the case of EXCEPT. Or maybe we can add a new
variable to 'PublicationTable' for except_location, but I think we should
not do that. Thoughts?

For this version of patch, I have removed the "position marker (^)" in
the case of EXCEPT.

> ======
> .../t/037_rep_changes_except_collist.pl
>
> 9.
> +# Test initial sync
> +my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
> +is($result, qq(|2|3),
> + 'check that initial sync for EXCEPT (column-list) publication');
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.tab1");
> +is($result, qq(1||),
> + 'check that initial sync for EXCEPT (column-list) publication');
>
> These messages still seem to have missing or extra words: "check that
> initial sync" (??). Maybe just remove the word 'that'?
>
> ~~~
>
> 10.
> # Test for update
> $node_subscriber->safe_psql(
> 'postgres', qq(
> CREATE UNIQUE INDEX b_idx ON tab1 (b);
> ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
> ));
> $node_publisher->safe_psql(
> 'postgres', qq(
> CREATE UNIQUE INDEX b_idx ON tab1 (b);
> ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
> UPDATE tab1 SET a = 3, b = 4, c = 5 WHERE a = 1;
> ));
> $node_publisher->wait_for_catchup('tap_sub_col');
> $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
> is( $result, qq(|5|6
> |4|5),
> 'check update for EXCEPT (column-list) publication');
>
> ~
>
> 10a.
> I think the test is OK, but your chosen numbers like 1,2,3, then 4,5,6
> and then updating 1,2,3 to 3,4,5 make it quite hard to review.
> Maybe use easier numbers that are more identifiable, e.g. update 1,2,3
> => 991,992,993 or something like that.
>
> ~
>
> 10b.
> You may need to put some ORDER BY in all these queries just to make
> sure they are always reproducible, giving rows in the expected order.
>

I have also addressed the remaining comments and attached the latest
v19 patches.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From

Peter Smith

Date:

11 August 2025, 11:25:12

Hi Shlok.

On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
...
> > 5.
> >   Bitmapset  *cols = NULL;
> > + bool except_columns = false;
> > + bool no_col_published = false;
> >
> > There are multiple places in this patch that say:
> >
> > 'no_col_published'
> > or 'no_cols_published'
> >
> > I felt this var name can be misunderstood because it is easy to read
> > "no" as meaning "no." (aka number), and then misinterpret as
> > "number_of_cols_published".
> >
> > Maybe an unambiguous name can be found, like
> > - 'zero_cols_published' or
> > - 'nothing_published' or
> > - really make it 'num_cols_published' and check for 0.
> >
> > (so this comment applies to multiple places in the patch)
> >
> How about 'all_cols_excluded'? Or 'has_published_cols'?
> I have used 'all_cols_excluded' in this patch. Thoughts?

The new name is good.

> > ======
> > src/bin/psql/describe.c
> >
> > describeOneTableDetails:
> >
> > 7.
> >   /* column list (if any) */
> >   if (!PQgetisnull(result, i, 2))
> > - appendPQExpBuffer(&buf, " (%s)",
> > -   PQgetvalue(result, i, 2));
> > + {
> > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > +   PQgetvalue(result, i, 2));
> > + else
> > + appendPQExpBuffer(&buf, " (%s)",
> > +   PQgetvalue(result, i, 2));
> > + }
> >
> > Isn't this code fragment (and also surrounding code) using the same
> > logic as what is already encapsulated in the function
> > addFooterToPublicationDesc()?
> > Superficially, it seems like a large chunk can all be replaced with a
> > single call to the existing function.
> >
> 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> "schema_name.table_name" EXCEPT (column-list)
> Whereas code pasted above is executed when we use \d+ table_name and
> the output is the format:
> "publication_name" EXCEPT (column-list)
>
> These pieces of code are used to print different info. One is used to
> print info related to tables and the other is used to print info
> related to publication.
> Should we use a common function for this?

It still seems like quite a lot of overlap. e.g. I thought there were
~30 lines common. OTOH, perhaps you'll need to pass another boolean to
the function to indicate it is a "Publication:" footer. I guess you'd
have to try it out first to see if the changes required to save those
30 LOC are worthwhile or not.

>
> > ======
> > src/test/regress/expected/publication.out
> >
> > 8.
> > +-- Syntax error EXCEPT without a col-list
> > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > +ERROR:  EXCEPT clause not allowed for table without column list
> > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > +                                               ^
> >
> > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > the word "TABLE" instead of "EXCEPT"?
> >
> In function 'preprocess_pubobj_list' the position of position marker
> (^) is decided by "pubobj->location". Function handles multiple errors
> and setting "$$->location" only specific to EXCEPT qualifier would not
> be appropriate. One solution I feel is to not show "position marker
> (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> 'PublicationTable' for except_location but I think we should not do
> that. Thoughts?

In the review comments below, I suggest putting this location back,
but changing the message.

>
> For this version of patch, I have removed the "position marker (^)" in
> the case of EXCEPT.
>

//////

Here are my review comments for the patch v19-0003.

======
1. General - SGML tags in docs for table/column names.

There is nothing to change just yet, but keep an eye on the thread
[1], because if/when that gets pushed, then there will be several tags
in this patch for table/column names that will need to be updated for
consistency.

======
src/backend/catalog/pg_publication.c

pg_get_publication_tables:

2.
+
+ if (!nulls[2])
+ {
+ Datum exceptDatum;
+ bool isnull;
+
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and
+ * not FOR TABLES IN SCHEMA. So if prexcept is true, it
+ * indicates that prattrs contains columns to be excluded for
+ * replication.
+ */
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull && DatumGetBool(exceptDatum))
+ except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
+ }

Maybe this should be done a few lines earlier, to keep all the
values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
code. Indeed, there is lots of other values[2]/nulls[2] logic that
comes later in this function, so maybe it is better to do all of that
first, instead of mingling it with values[3]/nulls[3].

======
src/backend/commands/publicationcmds.c

pub_contains_invalid_column:

3.
  * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
- *    by the column list. If any column is missing, *invalid_column_list is set
+ *    by the column list and are not part of column list specified with EXCEPT.
+ *   If any column is missing, *invalid_column_list is set
  *    to true.

Whitespace problem here; there is a tab instead of spaces in this comment.

Also /part of column list/part of the column list/

~~~

AlterPublicationTables:

4.
  bool isnull = true;
  Datum whereClauseDatum;
  Datum columnListDatum;
+ Datum exceptDatum;

It's not necessary to have all these different Datum variables; they
are only temporary storage. It might be simpler to use a single "Datum
datum;" which is reused 3x.

~

5.
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull)
+ oldexcept = DatumGetBool(exceptDatum);
+

Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
(column-list)? In other words, should the change to this function be
done already in one of the earlier patches?

~

6.
  if (equal(oldrelwhereclause, newpubrel->whereClause) &&
- bms_equal(oldcolumns, newcolumns))
+ bms_equal(oldcolumns, newcolumns) &&
+ oldexcept == newpubrel->except)

The code comment about this code fragment should also mention EXCEPT.

======
src/backend/parser/gram.y

preprocess_pubobj_list:

7.
+ if (pubobj->pubtable && pubobj->pubtable->except &&
+ pubobj->pubtable->columns == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("EXCEPT clause not allowed for table without column list"));
+

Having the syntax error location (like before in v18) might be better,
but since that location is associated with the TABLE, then the error
message should also be reworded so the subject is the table.

SUGGESTION
errmsg("table without column list cannot use EXCEPT clause")

======
src/bin/psql/describe.c

describeOneTableDetails:

8.
- if (pset.sversion >= 150000)
+ if (pset.sversion >= 190000)
  {
  printfPQExpBuffer(&buf,
    "SELECT pubname\n"
    "     , NULL\n"
    "     , NULL\n"
+   " , NULL\n"
    "FROM pg_catalog.pg_publication p\n"
    "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
    "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
@@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
    "                pg_catalog.pg_attribute\n"
    "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
    "        ELSE NULL END) "
+   " , prexcept "
    "FROM pg_catalog.pg_publication p\n"
    " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
    " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
-   "WHERE pr.prrelid = '%s'\n",
-   oid, oid, oid);
-
- if (pset.sversion >= 190000)
- appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
+   "WHERE pr.prrelid = '%s' "
+   "AND  c.relnamespace NOT IN (\n "
+   " SELECT pnnspid FROM\n"
+   " pg_catalog.pg_publication_namespace)\n"

- appendPQExpBuffer(&buf,
    "UNION\n"
    "SELECT pubname\n"
    " , NULL\n"
    " , NULL\n"
+   " , NULL\n"
    "FROM pg_catalog.pg_publication p\n"
-   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
-   oid);
-
- if (pset.sversion >= 190000)
- appendPQExpBuffer(&buf,
-   "     AND NOT EXISTS (\n"
-   " SELECT 1\n"
-   " FROM pg_catalog.pg_publication_rel pr\n"
-   " JOIN pg_catalog.pg_class pc\n"
-   " ON pr.prrelid = pc.oid\n"
-   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
-   oid);
-
- appendPQExpBufferStr(&buf, "ORDER BY 1;");
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "     AND NOT EXISTS (\n"
+   " SELECT 1\n"
+   " FROM pg_catalog.pg_publication_rel pr\n"
+   " JOIN pg_catalog.pg_class pc\n"
+   " ON pr.prrelid = pc.oid\n"
+   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid, oid);
+ }
+ else if (pset.sversion >= 150000)
+ {
+ printfPQExpBuffer(&buf,
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
+   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
+   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , pg_get_expr(pr.prqual, c.oid)\n"
+   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
+   "         (SELECT string_agg(attname, ', ')\n"
+   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
+   "                pg_catalog.pg_attribute\n"
+   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
+   "        ELSE NULL END) "
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
+   "WHERE pr.prrelid = '%s'\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid);

I found these large SQL selects with 3x UNIONs difficult to read.
Maybe you can add more comments to describe the intention of each of
the UNION SELECTs?

~~~

9.
  /* column list (if any) */
  if (!PQgetisnull(result, i, 2))
- appendPQExpBuffer(&buf, " (%s)",
-   PQgetvalue(result, i, 2));
+ {
+ if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
+ appendPQExpBuffer(&buf, " EXCEPT");
+ appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
+ }

I did not find any regression test case where the "EXCEPT" col-list is
getting output for a "Publications:" footer.

======
[1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

13 August 2025, 12:21:42

On Mon, 11 Aug 2025 at 13:55, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> ...
> > > 5.
> > >   Bitmapset  *cols = NULL;
> > > + bool except_columns = false;
> > > + bool no_col_published = false;
> > >
> > > There are multiple places in this patch that say:
> > >
> > > 'no_col_published'
> > > or 'no_cols_published'
> > >
> > > I felt this var name can be misunderstood because it is easy to read
> > > "no" as meaning "no." (aka number), and then misinterpret as
> > > "number_of_cols_published".
> > >
> > > Maybe an unambiguous name can be found, like
> > > - 'zero_cols_published' or
> > > - 'nothing_published' or
> > > - really make it 'num_cols_published' and check for 0.
> > >
> > > (so this comment applies to multiple places in the patch)
> > >
> > How about 'all_cols_excluded'? Or 'has_published_cols'?
> > I have used 'all_cols_excluded' in this patch. Thoughts?
>
> The new name is good.
>
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > describeOneTableDetails:
> > >
> > > 7.
> > >   /* column list (if any) */
> > >   if (!PQgetisnull(result, i, 2))
> > > - appendPQExpBuffer(&buf, " (%s)",
> > > -   PQgetvalue(result, i, 2));
> > > + {
> > > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + else
> > > + appendPQExpBuffer(&buf, " (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + }
> > >
> > > Isn't this code fragment (and also surrounding code) using the same
> > > logic as what is already encapsulated in the function
> > > addFooterToPublicationDesc()?
> > > Superficially, it seems like a large chunk can all be replaced with a
> > > single call to the existing function.
> > >
> > 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> > "schema_name.table_name" EXCEPT (column-list)
> > Whereas code pasted above is executed when we use \d+ table_name and
> > the output is the format:
> > "publication_name" EXCEPT (column-list)
> >
> > These pieces of code are used to print different info. One is used to
> > print info related to tables and the other is used to print info
> > related to publication.
> > Should we use a common function for this?
>
> It still seems like quite a lot of overlap. e.g. I thought there were
> ~30 lines common. OTOH, perhaps you'll need to pass another boolean to
> the function to indicate it is a "Publication:" footer. I guess you'd
> have to try it out first to see if the changes required to save those
> 30 LOC are worthwhile or not.
>
I have added the code changes for the same in this patch.

> >
> > > ======
> > > src/test/regress/expected/publication.out
> > >
> > > 8.
> > > +-- Syntax error EXCEPT without a col-list
> > > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > > +ERROR:  EXCEPT clause not allowed for table without column list
> > > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > > +                                               ^
> > >
> > > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > > the word "TABLE" instead of "EXCEPT"?
> > >
> > In function 'preprocess_pubobj_list' the position of position marker
> > (^) is decided by "pubobj->location". Function handles multiple errors
> > and setting "$$->location" only specific to EXCEPT qualifier would not
> > be appropriate. One solution I feel is to not show "position marker
> > (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> > 'PublicationTable' for except_location but I think we should not do
> > that. Thoughts?
>
> In the review comments below, I suggest putting this location back,
> but changing the message.
>
> >
> > For this version of patch, I have removed the "position marker (^)" in
> > the case of EXCEPT.
> >
>
> //////
>
> Here are my review comments for the patch v19-0003.
>
> ======
> 1. General - SGML tags in docs for table/column names.
>
> There is nothing to change just yet, but keep an eye on the thread
> [1], because if/when that gets pushed, then there will be several tags
> in this patch for table/column names that will need to be updated for
> consistency.
>
Noted

> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 2.
> +
> + if (!nulls[2])
> + {
> + Datum exceptDatum;
> + bool isnull;
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and
> + * not FOR TABLES IN SCHEMA. So if prexcept is true, it
> + * indicates that prattrs contains columns to be excluded for
> + * replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum))
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
> + }
>
> Maybe this should be done a few lines earlier, to keep all the
> values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
> code. Indeed, there is lots of other values[2]/nulls[2] logic that
> comes later in this function, so maybe it is better to do all of that
> first, instead of mingling it with values[3]/nulls[3].
>
> ======
> src/backend/commands/publicationcmds.c
>
> pub_contains_invalid_column:
>
> 3.
>   * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
> - *    by the column list. If any column is missing, *invalid_column_list is set
> + *    by the column list and are not part of column list specified with EXCEPT.
> + *   If any column is missing, *invalid_column_list is set
>   *    to true.
>
> Whitespace problem here; there is some tab instead of space in this comment.
>
> Also /part of column list/part of the column list/
>
> ~~~
>
> AlterPublicationTables:
>
> 4.
>   bool isnull = true;
>   Datum whereClauseDatum;
>   Datum columnListDatum;
> + Datum exceptDatum;
>
> It's not necessary to have all these different Datum variables; they
> are only temporary storage. It might be simpler to use a single "Datum
> datum;" which is reused 3x.
>
> ~
>
> 5.
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull)
> + oldexcept = DatumGetBool(exceptDatum);
> +
>
> Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
> (column-list)? In other words, should the change to this function be
> done already in one of the earlier patches?
>
> ~
This code path is only executed when running ALTER PUBLICATION ... SET
TABLE, and running this command on an ALL TABLES publication throws an
error due to the check in function 'CheckAlterPublication'. Since EXCEPT
TABLE can only be used for ALL TABLES publications, I think it doesn't
need to be moved to the 0002 patch.

>
> 6.
>   if (equal(oldrelwhereclause, newpubrel->whereClause) &&
> - bms_equal(oldcolumns, newcolumns))
> + bms_equal(oldcolumns, newcolumns) &&
> + oldexcept == newpubrel->except)
>
> The code comment about this code fragment should also mention EXCEPT.
>
> ======
> src/backend/parser/gram.y
>
> preprocess_pubobj_list:
>
> 7.
> + if (pubobj->pubtable && pubobj->pubtable->except &&
> + pubobj->pubtable->columns == NULL)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("EXCEPT clause not allowed for table without column list"));
> +
>
> Having the syntax error location (like before in v18) might be better,
> but since that location is associated with the TABLE, then the error
> message should also be reworded so the subject is the table.
>
> SUGGESTION
> errmsg("table without column list cannot use EXCEPT clause")
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 8.
> - if (pset.sversion >= 150000)
> + if (pset.sversion >= 190000)
>   {
>   printfPQExpBuffer(&buf,
>     "SELECT pubname\n"
>     "     , NULL\n"
>     "     , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
>     "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
>     "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> @@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
>     "                pg_catalog.pg_attribute\n"
>     "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
>     "        ELSE NULL END) "
> +   " , prexcept "
>     "FROM pg_catalog.pg_publication p\n"
>     " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
>     " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> -   "WHERE pr.prrelid = '%s'\n",
> -   oid, oid, oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
> +   "WHERE pr.prrelid = '%s' "
> +   "AND  c.relnamespace NOT IN (\n "
> +   " SELECT pnnspid FROM\n"
> +   " pg_catalog.pg_publication_namespace)\n"
>
> - appendPQExpBuffer(&buf,
>     "UNION\n"
>     "SELECT pubname\n"
>     " , NULL\n"
>     " , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
> -   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
> -   oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBuffer(&buf,
> -   "     AND NOT EXISTS (\n"
> -   " SELECT 1\n"
> -   " FROM pg_catalog.pg_publication_rel pr\n"
> -   " JOIN pg_catalog.pg_class pc\n"
> -   " ON pr.prrelid = pc.oid\n"
> -   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
> -   oid);
> -
> - appendPQExpBufferStr(&buf, "ORDER BY 1;");
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "     AND NOT EXISTS (\n"
> +   " SELECT 1\n"
> +   " FROM pg_catalog.pg_publication_rel pr\n"
> +   " JOIN pg_catalog.pg_class pc\n"
> +   " ON pr.prrelid = pc.oid\n"
> +   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid, oid);
> + }
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> +   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , pg_get_expr(pr.prqual, c.oid)\n"
> +   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> +   "         (SELECT string_agg(attname, ', ')\n"
> +   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> +   "                pg_catalog.pg_attribute\n"
> +   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> +   "        ELSE NULL END) "
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid);
>
> I found these large SQL selects with 3x UNIONs are difficult to read.
> Maybe you can add more comments to describe the intention of each of
> the UNION SELECTs?
>
> ~~~
>
> 9.
>   /* column list (if any) */
>   if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT");
> + appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
> + }
>
> I did not find any regression test case where the "EXCEPT" col-list is
> getting output for a "Publications:" footer.
>
> ======
> [1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us
>

I have addressed the comments; the changes are in the v20 patch.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From

Peter Smith

Date:

15 August 2025, 03:53:10

Hi Shlok,

Here are some review comments for v20-0003.

======
src/backend/commands/publicationcmds.c

AlterPublicationTables:

1.
  bool isnull = true;
- Datum whereClauseDatum;
- Datum columnListDatum;
+ Datum datum;

I know you did not write the code, but that "isnull = true" is
redundant, and seems kind of misleading because it will always be
re-assigned before it is used.

~~~

2.
  /* Load the WHERE clause for this table. */
- whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
-    Anum_pg_publication_rel_prqual,
-    &isnull);
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &isnull);
  if (!isnull)
- oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+ oldrelwhereclause = stringToNode(TextDatumGetCString(datum));

  /* Transform the int2vector column list to a bitmap. */
- columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
-   Anum_pg_publication_rel_prattrs,
-   &isnull);
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prattrs,
+ &isnull);
+
+ if (!isnull)
+ oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
+
+ /* Load the prexcept flag for this table. */
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prexcept,
+ &isnull);

  if (!isnull)
- oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
+ oldexcept = DatumGetBool(datum);

Use consistent spacing. Either do or don't (I prefer don't) put a
blank line between the pairs of "datum =" and "if (!isnull)". Avoid
having a mixture.

======
src/bin/psql/describe.c

addFooterToPublicationOrTableDesc:

3.
+/*
+ * If is_tbl_desc is true add footer to table description else add footer to
+ * publication description.
+ */
+static bool
+addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
+   bool as_schema, printTableContent *const cont,
+   bool is_tbl_desc)

3a.
Since you are changing this anyway, I think it would be better to keep
those boolean params together (at the end).

~

3b.
It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
but having the variable 'is_tbl_desc', because it seems more natural
to me to read left to right, so the logical order of everything here
should be pub desc then table desc. In other words, use boolean
'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
thing is kind of a *subset* of the publication description, so it
makes more sense for that to come last too.

e.g.
CURRENT
addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
SUGGESTION
addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)

~

3c.
While you are changing things, maybe also consider changing that
'as_schema' name because I did not understand what "as" means. Perhaps
rename like 'pub_schemas', or 'only_show_schemas' or something better
(???).

~~~

4.
+ PGresult   *res;
+ int count = 0;
+ int i = 0;
+ int col = is_tbl_desc ? 0 : 1;
+
+ res = PSQLexec(buf->data);
+ if (!res)
+ return false;
+ else
+ count = PQntuples(res);
+

4a.
Assignment count = 0 is redundant.

~

4b.
Remove the 'i' declaration here. Declare it in the "for" loop later.

~

4c.
The "else" is not required. If 'res' was not good, you already returned.
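
The three points in comment 4 can be sketched in isolation as follows.
This is only an illustration, not code from the patch: FakeResult,
fake_ntuples() and visit_rows() are hypothetical stand-ins for PGresult,
PQntuples() and the row loop, so that the fragment compiles without libpq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PGresult; only the row count matters here. */
typedef struct FakeResult
{
	int			ntuples;
} FakeResult;

/* Hypothetical stand-in for PQntuples(). */
static int
fake_ntuples(const FakeResult *res)
{
	assert(res != NULL);
	return res->ntuples;
}

/*
 * Visit every row of a result and count the visits.  Returns false when
 * there is no result, mirroring the early "return false" in the patch.
 */
static bool
visit_rows(const FakeResult *res, int *visited)
{
	int			count;		/* 4a: no dummy initializer */

	if (!res)
		return false;		/* 4c: early return, so no "else" is needed */

	count = fake_ntuples(res);

	for (int i = 0; i < count; i++)	/* 4b: 'i' is scoped to the loop */
		(*visited)++;

	return true;
}
```

With this shape, 'count' is assigned exactly once before its first use,
and a reader does not have to check whether the initial values matter.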

~~~

5.
+ if (as_schema)
+ printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
+ else
+ {
+ if (is_tbl_desc)
+ printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
+ else
+ printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
+   PQgetvalue(res, i, col));

This function is basically either (a) a footer for a table description
or (b) a footer for a publication description. And that all hinges on
the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
main condition to be "if (is_tbl_desc)" here.

This turned everything inside out. PSA: a top-up patch to show a way
to do this. Perhaps my implementation is a bit verbose, but OTOH it
seems easier to understand. Anyway, see what you think...
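
For readers without the attachment, here is a rough sketch of the
restructuring meant above. It is not the top-up patch itself:
format_footer_entry() and its parameters are hypothetical, snprintf()
stands in for printfPQExpBuffer(), and 'first'/'second' stand for result
column 0 and result column 'col' of the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one footer entry into 'out'.
 * The outermost test is is_tbl_desc, so the table-description case and
 * the publication-description cases are visibly separate.
 */
static void
format_footer_entry(char *out, size_t outlen,
					bool is_tbl_desc, bool as_schema,
					const char *first, const char *second)
{
	if (is_tbl_desc)
	{
		/* Footer for a table description: just the publication name. */
		snprintf(out, outlen, "    \"%s\"", first);
	}
	else
	{
		/* Footer for a publication description. */
		if (as_schema)
			snprintf(out, outlen, "    \"%s\"", first);		/* schema only */
		else
			snprintf(out, outlen, "    \"%s.%s\"", first, second);	/* schema.table */
	}
}
```

The output is the same as with the original nesting, because for a table
description 'col' is 0 and both original branches print column 0.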

~~~

6.
+ /*---------------------------------------------------
+ * Publication/ table description columns:
+ * [0]: schema name (nspname)
+ * [col]: table name (relname) / publication name (pubname)
+ * [col + 1]: row filter expression (prqual), may be NULL
+ * [col + 2]: column list (comma-separated), may be NULL
+ * [col + 3]: except flag ("t" if EXCEPT, else "f")
+ *---------------------------------------------------

I've modified this comment slightly so I could understand it better.
See if you agree.

SUGGESTION
/*---------------------------------------------------
 * Description columns:
 * PUB      TBL
 * [0]      -      : schema name (nspname)
 * [col]    -      : table name (relname)
 * -        [col]  : publication name (pubname)
 * [col+1]  [col+1]: row filter expression (prqual), may be NULL
 * [col+2]  [col+2]: column list (comma-separated), may be NULL
 * [col+3]  [col+3]: except flag ("t" if EXCEPT, else "f")
 *---------------------------------------------------
 */
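
To make the layout concrete, the mapping can also be written out as a
small helper. This is only a sketch: DescColumns and desc_columns() are
hypothetical names, and it assumes the offsets from the patch's own
comment (col + 1, col + 2, col + 3) with col = is_tbl_desc ? 0 : 1.

```c
#include <stdbool.h>

/* Hypothetical: result column numbers for one kind of description. */
typedef struct DescColumns
{
	int			schema;			/* nspname, or -1 if not present */
	int			name;			/* relname (pub desc) or pubname (table desc) */
	int			rowfilter;		/* prqual, may be NULL */
	int			collist;		/* comma-separated column list, may be NULL */
	int			except_flag;	/* "t" if EXCEPT, else "f" */
} DescColumns;

static DescColumns
desc_columns(bool is_tbl_desc)
{
	int			col = is_tbl_desc ? 0 : 1;
	DescColumns c;

	/* Only the publication description has a leading schema column. */
	c.schema = is_tbl_desc ? -1 : 0;
	c.name = col;
	c.rowfilter = col + 1;
	c.collist = col + 2;
	c.except_flag = col + 3;
	return c;
}
```

For a table description this yields columns 2 and 3 for the column list
and the except flag, which matches the PQgetvalue(result, i, 2) and
PQgetvalue(result, i, 3) calls in describeOneTableDetails.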

~~~

describeOneTableDetails:

7.
+ else if (pset.sversion >= 150000)
+ {
+ printfPQExpBuffer(&buf,
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
+   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
+   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , pg_get_expr(pr.prqual, c.oid)\n"
+   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
+   "         (SELECT string_agg(attname, ', ')\n"
+   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
+   "                pg_catalog.pg_attribute\n"
+   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
+   "        ELSE NULL END) "
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
+   "WHERE pr.prrelid = '%s'\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid);

AFAICT, that >= 150000 code seems to have added another UNION at the
end that was not previously there. What's that about? How is that
related to EXCEPT (column-list)?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

PS_addFooterToPublicationOrTableDesc.diff

Re: Skipping schema changes in publication

From

Kirill Reshke

Date:

15 August 2025, 09:16:32

Hi

On Fri, 15 Aug 2025 at 05:53, Peter Smith <smithpb2250@gmail.com> wrote:

> 1.
>   bool isnull = true;
> - Datum whereClauseDatum;
> - Datum columnListDatum;
> + Datum datum;
>
> I know you did not write the code, but that "isnull = true" is
> redundant, and seems kind of misleading because it will always be
> re-assigned before it is used.

People are generally not excited about refactoring code they did not
change. It gives the patch more review cycles and makes it less likely
to actually be committed. If we are really wedded to this change,
it could go in a separate thread.


> ~~~
>
> 2.
>   /* Load the WHERE clause for this table. */
> - whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -    Anum_pg_publication_rel_prqual,
> -    &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prqual,
> + &isnull);
>   if (!isnull)
> - oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
> + oldrelwhereclause = stringToNode(TextDatumGetCString(datum));
>
>   /* Transform the int2vector column list to a bitmap. */
> - columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -   Anum_pg_publication_rel_prattrs,
> -   &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prattrs,
> + &isnull);
> +
> + if (!isnull)
> + oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
> +
> + /* Load the prexcept flag for this table. */
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
>
>   if (!isnull)
> - oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
> + oldexcept = DatumGetBool(datum);
>
> Use consistent spacing. Either do or don't (I prefer don't) put a
> blank line between the pairs of "datum =" and "if (!isnull)". Avoid
> having a mixture.
>
> ======
> src/bin/psql/describe.c
>
> addFooterToPublicationOrTableDesc:
>
> 3.
> +/*
> + * If is_tbl_desc is true add footer to table description else add footer to
> + * publication description.
> + */
> +static bool
> +addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
> +   bool as_schema, printTableContent *const cont,
> +   bool is_tbl_desc)
>
> 3a.
> Since you are changing this anyway, I think it would be better to keep
> those boolean params together (at the end).
>
> ~
>
> 3b.
> It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
> but having the variable 'is_tbl_desc', because it seems more natural
> to me to read left to right, so the logical order of everything here
> should be pub desc then table desc. In other words, use boolean
> 'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
> thing is kind of a *subset* of the publication description, so it
> makes more sense for that to come last too.
>
> e.g.
> CURRENT
> addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
> SUGGESTION
> addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)
>
> ~
>
> 3c
> While you are changing things, maybe also consider changing that
> 'as_schema' name because I did not understand what "as" means. Perhaps
> rename like 'pub_schemas', or 'only_show_schemas' or something better
> (???).
>
> ~~~
>
> 4.
> + PGresult   *res;
> + int count = 0;
> + int i = 0;
> + int col = is_tbl_desc ? 0 : 1;
> +
> + res = PSQLexec(buf->data);
> + if (!res)
> + return false;
> + else
> + count = PQntuples(res);
> +
>
> 4a.
> Assignment count = 0 is redundant.
>
> ~
>
> 4b.
> Remove the 'i' declaration here. Declare it in the "for" loop later.
>
> ~
>
> 4c.
> The "else" is not required. If 'res' was not good, you already returned.
>
> ~~~
>
> 5.
> + if (as_schema)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
> + else
> + {
> + if (is_tbl_desc)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
> + else
> + printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
> +   PQgetvalue(res, i, col));
>
> This function is basically either (a) a footer for a table description
> or (b) a footer for a publication description. And that all hinges on
> the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
> main condition to be "if (is_tbl_desc)" here.
>
> This turned everything inside out. PSA: a top-up patch to show a way
> to do this. Perhaps my implementation is a bit verbose, but OTOH it
> seems easier to understand. Anyway, see what you think...
>

+1

>
> 6.
> + /*---------------------------------------------------
> + * Publication/ table description columns:
> + * [0]: schema name (nspname)
> + * [col]: table name (relname) / publication name (pubname)
> + * [col + 1]: row filter expression (prqual), may be NULL
> + * [col + 2]: column list (comma-separated), may be NULL
> + * [col + 3]: except flag ("t" if EXCEPT, else "f")
> + *---------------------------------------------------
>
> I've modified this comment slightly so I could understand it better.
> See if you agree.

To me they are equivalent. Let's see what other people think.


-- 
Best regards,
Kirill Reshke

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

20 August 2025, 12:00:36

On Fri, 15 Aug 2025 at 06:23, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> Here are some review comments for v20-0003.
>
> ======
> src/backend/commands/publicationcmds.c
>
> AlterPublicationTables:
>
> 1.
>   bool isnull = true;
> - Datum whereClauseDatum;
> - Datum columnListDatum;
> + Datum datum;
>
> I know you did not write the code, but that "isnull = true" is
> redundant, and seems kind of misleading because it will always be
> re-assigned before it is used.
>
Since this is part of already existing code, I think this should be a
new thread. I have created a new thread for this. See [1].

> ~~~
>
> 2.
>   /* Load the WHERE clause for this table. */
> - whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -    Anum_pg_publication_rel_prqual,
> -    &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prqual,
> + &isnull);
>   if (!isnull)
> - oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
> + oldrelwhereclause = stringToNode(TextDatumGetCString(datum));
>
>   /* Transform the int2vector column list to a bitmap. */
> - columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -   Anum_pg_publication_rel_prattrs,
> -   &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prattrs,
> + &isnull);
> +
> + if (!isnull)
> + oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
> +
> + /* Load the prexcept flag for this table. */
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
>
>   if (!isnull)
> - oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
> + oldexcept = DatumGetBool(datum);
>
> Use consistent spacing. Either do or don't (I prefer don't) put a
> blank line between the pairs of "datum =" and "if (!isnull)". Avoid
> having a mixture.
>
> ======
> src/bin/psql/describe.c
>
> addFooterToPublicationOrTableDesc:
>
> 3.
> +/*
> + * If is_tbl_desc is true add footer to table description else add footer to
> + * publication description.
> + */
> +static bool
> +addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
> +   bool as_schema, printTableContent *const cont,
> +   bool is_tbl_desc)
>
> 3a.
> Since you are changing this anyway, I think it would be better to keep
> those boolean params together (at the end).
>
> ~
>
> 3b.
> It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
> but having the variable 'is_tbl_desc', because it seems more natural
> to me to read left to right, so the logical order of everything here
> should be pub desc then table desc. In other words, use boolean
> 'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
> thing is kind of a *subset* of the publication description, so it
> makes more sense for that to come last too.
>
> e.g.
> CURRENT
> addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
> SUGGESTION
> addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)
>
> ~
>
> 3c
> While you are changing things, maybe also consider changing that
> 'as_schema' name because I did not understand what "as" means. Perhaps
> rename like 'pub_schemas', or 'only_show_schemas' or something better
> (???).
>
I have used pub_schemas.
> ~~~
>
> 4.
> + PGresult   *res;
> + int count = 0;
> + int i = 0;
> + int col = is_tbl_desc ? 0 : 1;
> +
> + res = PSQLexec(buf->data);
> + if (!res)
> + return false;
> + else
> + count = PQntuples(res);
> +
>
> 4a.
> Assignment count = 0 is redundant.
>
> ~
>
> 4b.
> Remove the 'i' declaration here. Declare it in the "for" loop later.
>
> ~
>
> 4c.
> The "else" is not required. If 'res' was not good, you already returned.
>
> ~~~
>
> 5.
> + if (as_schema)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
> + else
> + {
> + if (is_tbl_desc)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
> + else
> + printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
> +   PQgetvalue(res, i, col));
>
> This function is basically either (a) a footer for a table description
> or (b) a footer for a publication description. And that all hinges on
> the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
> main condition to be "if (is_tbl_desc)" here.
>
> This turned everything inside out. PSA: a top-up patch to show a way
> to do this. Perhaps my implementation is a bit verbose, but OTOH it
> seems easier to understand. Anyway, see what you think...
>
I have also used the patch with minor changes.

> ~~~
>
> 6.
> + /*---------------------------------------------------
> + * Publication/ table description columns:
> + * [0]: schema name (nspname)
> + * [col]: table name (relname) / publication name (pubname)
> + * [col + 1]: row filter expression (prqual), may be NULL
> + * [col + 2]: column list (comma-separated), may be NULL
> + * [col + 3]: except flag ("t" if EXCEPT, else "f")
> + *---------------------------------------------------
>
> I've modified this comment slightly so I could understand it better.
> See if you agree.
>
> SUGGESTION
> /*---------------------------------------------------
>  * Description columns:
>  * PUB      TBL
>  * [0]      -      : schema name (nspname)
>  * [col]    -      : table name (relname)
>  * -        [col]  : publication name (pubname)
>  * [col+1]  [col+1]: row filter expression (prqual), may be NULL
>  * [col+2]  [col+2]: column list (comma-separated), may be NULL
>  * [col+3]  [col+3]: except flag ("t" if EXCEPT, else "f")
>  *---------------------------------------------------
>  */
>
> ~~~
>
I have used the suggested description with some modifications.

> describeOneTableDetails:
>
> 7.
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> +   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , pg_get_expr(pr.prqual, c.oid)\n"
> +   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> +   "         (SELECT string_agg(attname, ', ')\n"
> +   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> +   "                pg_catalog.pg_attribute\n"
> +   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> +   "        ELSE NULL END) "
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid);
>
> AFAICT, that >= 150000 code seems to have added another UNION at the
> end that was not previously there. What's that about? How is that
> related to EXCEPT (column-list)?
>
This patch does not add any new code to the >= 150000 branch; it is the
same as HEAD. The diff appears because of the changes in the 0002 patch:
there, I did not create a separate full query for >= 190000 because the
changes were small.

I have addressed the rest of the comments and added the changes in the
latest v21 patchset.


[1]: https://www.postgresql.org/message-id/CANhcyEXHiCbk2q8%3Dbq3boQDyc8ac9fjgK-kkp5PdTYLcAOq80Q%40mail.gmail.com

Thanks,
Shlok Kyal

On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> I reviewed your latest v20-0003 patch and have no more comments at
> this time; I only found one trivial typo.
>
> ======
> src/bin/psql/describe.c
>
> 1.
> + /*
> + * Footers entries for a publication description or a table
> + * description
> + */
>
> Typo. /Footers entries/Footer entries/
>

I have fixed it and attached the updated patches.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

05 September 2025, 09:27:23

On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Shlok,
> >
> > I reviewed your latest v20-0003 patch and have no more comments at
> > this time; I only found one trivial typo.
> >
> > ======
> > src/bin/psql/describe.c
> >
> > 1.
> > + /*
> > + * Footers entries for a publication description or a table
> > + * description
> > + */
> >
> > Typo. /Footers entries/Footer entries/
> >
>
> I have fixed it and attached the updated patches
>
The patches no longer applied on HEAD and needed a rebase. Here are the
rebased patches.

Thanks,
Shlok Kyal

On Thu, 25 Sept 2025 at 16:39, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 5 Sept 2025 at 11:57, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > Hi Shlok,
> > > >
> > > > I reviewed your latest v20-0003 patch and have no more comments at
> > > > this time; I only found one trivial typo.
> > > >
> > > > ======
> > > > src/bin/psql/describe.c
> > > >
> > > > 1.
> > > > + /*
> > > > + * Footers entries for a publication description or a table
> > > > + * description
> > > > + */
> > > >
> > > > Typo. /Footers entries/Footer entries/
> > > >
> > >
> > > I have fixed it and attached the updated patches
> > >
> > The patches were not applying on HEAD and needed a Rebase. Here is the
> > rebased patches
>
> Few comments:
> 1) Currently from pg_publication_tables it is not clear if it is
> replicating column list or replicating exclude column, can we indicate
> if it is exclude or not:
> create publication pub1 for table t1(c1);
> create publication pub2 for  table t1 except ( c1);
>
> postgres=# select * from pg_publication_tables;
>  pubname | schemaname | tablename | attnames | rowfilter
> ---------+------------+-----------+----------+-----------
>  pub1    | public     | t1        | {c1}     |
>  pub2    | public     | t1        | {c2}     |
> (2 rows)
>
> 2) Tab completion is not correct in this case:
> postgres=# alter publication pub3 add table t2 EXCEPT (
> ,        WHERE (
>
> 3) tab6 is not used anywhere, it can be removed:
> +       CREATE TABLE tab5 (a int, b int, c int);
> +       CREATE TABLE tab6 (agen int GENERATED ALWAYS AS (1) STORED,
> bgen int GENERATED ALWAYS AS (2) STORED);
> +       INSERT INTO tab1 VALUES (1, 2, 3);
>
> 4) both these tests are using same message:
> +  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1 ORDER BY a");
> +is( $result, qq(|2|3
> +|5|6),
> +       'check incremental insert for EXCEPT (column-list) publication');
> +$result = $node_subscriber->safe_psql('postgres',
> +       "SELECT * FROM sch1.tab1 ORDER BY a");
> +is( $result, qq(1||
> +4||), 'check incremental insert for EXCEPT (column-list) publication');
>
> We can include the table name in each message to differentiate the tests;
> that will make a test failure easier to identify.
>
> 5) /newly added column are is replicated/ should be "newly added
> column is replicated"
> is($result, qq(|||10), 'newly added column are is replicated');

Hi Vignesh,

Thanks for reviewing the patch.
I have addressed the comments and attached the updated version.

Thanks,
Shlok Kyal

On Thu, 30 Oct 2025 at 11:34, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Vignesh
>
> Here are some review comments for the patch v24-0002.
>
> These comments are just for the SGML docs. The patch needs a rebase so
> I was unable to review the code.
>
> ======
> Commit message
>
> 1.
> A new column "prexcept" is added to table "pg_publication_rel", to maintain
> the relations that the user wants to exclude from the publications.
>
> ~
>
> /to maintain/to flag/
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>    <para>
> -   To add tables to a publication, the user must have ownership rights on the
> -   table. To add all tables in schema to a publication, the user must be a
> -   superuser. To create a publication that publishes all tables or
> all tables in
> -   schema automatically, the user must be a superuser.
> +   To create a publication using FOR ALL TABLES or FOR ALL TABLES IN SCHEMA,
> +   the user must be a superuser. To add ALL TABLES or ALL TABLES IN SCHEMA to a
> +   publication, the user must be a superuser. To add tables to a publication,
> +   the user must have ownership rights on the table.
>    </para>
>
> Those "FOR ALL TABLES" etc are missing SGML markup.
>
> ======
> doc/src/sgml/ref/alter_publication.sgml
>
> 3.
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [ EXCEPT [ TABLE ] <replaceable
> class="parameter">exception_object</replaceable> [, ... ] ]
>
> and
>
> +
> +<phrase>where <replaceable
> class="parameter">exception_object</replaceable> is:</phrase>
> +
> +    [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
> +
>
> It is not clear from the syntax which of these is possible.
>
> ... ADD ALL TABLES EXCEPT TABLE t1,t2,t3
> ... ADD ALL TABLES EXCEPT TABLE t1, TABLE t2, TABLES t3
>
> IMO it is best to put the "[TABLE]" within the exception_object:
> [ TABLE ] [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
>
> Then both are possible, which is consistent with how "FOR TABLE" syntax works.
>
> Furthermore, you might want later to say EXCLUDE SEQUENCE, so doing it
> this way makes that possible.
>
> ~~~
>
> 4.
> -   Adding a table to a publication additionally requires owning that table.
> -   The <literal>ADD TABLES IN SCHEMA</literal>,
> +   Adding a table to or excluding a table from a publication additionally
> +   requires owning that table. The <literal>ADD ALL TABLES</literal>,
>
> This wording seems a bit awkward. How about rephrasing it like:
>
> SUGGESTION
> Adding or excluding a table from a publication requires ownership of that table.
>
> ~~~
>
> 5.
> -      name to explicitly indicate that descendant tables are included.
> +      name to explicitly indicate that descendant tables are affected. For
> +      partitioned tables, <literal>ONLY</literal> donot have any effect.
>
> typo: /donot/does not/
>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 6.
> -    [ FOR ALL TABLES
> +    [ FOR ALL TABLES [ EXCEPT [ TABLE ] <replaceable
> class="parameter">exception_object</replaceable> [, ... ] ]
>        | FOR <replaceable
> class="parameter">publication_object</replaceable> [, ... ] ]
>      [ WITH ( <replaceable
> class="parameter">publication_parameter</replaceable> [= <replaceable
> class="parameter">value</replaceable>] [, ... ] ) ]
>
> @@ -30,6 +30,10 @@ CREATE PUBLICATION <replaceable
> class="parameter">name</replaceable>
>
>      TABLE [ ONLY ] <replaceable
> class="parameter">table_name</replaceable> [ * ] [ ( <replaceable
> class="parameter">column_name</replaceable> [, ... ] ) ] [ WHERE (
> <replaceable class="parameter">expression</replaceable> ) ] [, ... ]
>      TABLES IN SCHEMA { <replaceable
> class="parameter">schema_name</replaceable> | CURRENT_SCHEMA } [, ...
> ]
> +
> +<phrase>where <replaceable
> class="parameter">exception_object</replaceable> is:</phrase>
> +
> +    [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
>
> Same review comment as #3 before.
>
> I think it is clearer (and more flexible) to change the
> exception_object to include [TABLE].
> [ TABLE ] [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
>
> It also helps pave the way for any future EXCLUDE SEQUENCE feature.
>
> ~~~
>
> 7.
> +     <para>
> +      This clause specifies a list of tables to be excluded from the
> +      publication. It can only be used with <literal>FOR ALL TABLES</literal>.
> +      If <literal>ONLY</literal> is specified before the table name, only
> +      that table is excluded from the publication. If
> <literal>ONLY</literal> is
> +      not specified, the table and all its descendant tables (if any) are
> +      excluded. Optionally, <literal>*</literal> can be specified after the
> +      table name to explicitly indicate that descendant tables are excluded.
> +      This does not apply to a partitioned table, however.  The partitioned
> +      table or its partitions are excluded from the publication based on the
> +      parameter <literal>publish_via_partition_root</literal>.
> +     </para>
> +     <para>
> +      When <literal>publish_via_partition_root</literal> is set to
> +      <literal>true</literal>, specifying a root partitioned table in
> +      <literal>EXCEPT TABLE</literal> excludes it and all its partitions from
> +      replication. Specifying a leaf partition has no effect, as its
> changes are
> +      still replicated via the root partitioned table. When
> +      <literal>publish_via_partition_root</literal> is set to
> +      <literal>false</literal>, specifying a partitioned table or non-leaf
> +      partition has no effect, as changes are replicated via the leaf
> +      partitions. Specifying a leaf partition excludes only that partition from
> +      replication.
> +     </para>
>
> I felt that the second paragraph should be started with the sentence
> "The partitioned table or its partitions are excluded...", so then
> everything related to "publish_via_partition_root" is kept together.
>
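The partition rules in the quoted paragraphs can be read as a small decision table. Below is a minimal sketch of that reading, not code from the patch: ExceptTargetKind and except_entry_has_effect are hypothetical names, and intermediate (non-leaf, non-root) partitions are left out because the quoted text only describes them for publish_via_partition_root = false.

```c
#include <stdbool.h>

/* Hypothetical classification of a table named in EXCEPT TABLE. */
typedef enum
{
    EXCEPT_PLAIN_TABLE,            /* ordinary, non-partitioned table */
    EXCEPT_ROOT_PARTITIONED_TABLE, /* root of a partition tree */
    EXCEPT_LEAF_PARTITION          /* leaf partition */
} ExceptTargetKind;

/*
 * Returns true when naming the table in EXCEPT TABLE changes what is
 * replicated, following the rules in the quoted documentation text.
 */
bool
except_entry_has_effect(ExceptTargetKind kind, bool publish_via_partition_root)
{
    switch (kind)
    {
        case EXCEPT_PLAIN_TABLE:
            /* A plain table is simply excluded. */
            return true;
        case EXCEPT_ROOT_PARTITIONED_TABLE:
            /* Changes are published via the root only when the option is true. */
            return publish_via_partition_root;
        case EXCEPT_LEAF_PARTITION:
            /* Changes are published via the leaf only when the option is false. */
            return !publish_via_partition_root;
    }
    return false;
}
```

With publish_via_partition_root = true, excluding a leaf partition is a no-op in this model, which matches the sentence "Specifying a leaf partition has no effect".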
> ~~~
>
> 8.
> +  <para>
> +   Create a publication that publishes all changes in all the tables except for
> +   the changes of <structname>users</structname> and
> +   <structname>departments</structname>:
> +<programlisting>
> +CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments;
> +</programlisting>
> +  </para>
>
> The words "the changes of" are not needed, and you did not use that
> wording in the ALTER PUBLICATION example.
>
> ======
> doc/src/sgml/ref/psql-ref.sgml
>
> 9.
>          If <literal>x</literal> is appended to the command name, the results
>          are displayed in expanded mode.
> -        If <literal>+</literal> is appended to the command name, the tables and
> -        schemas associated with each publication are shown as well.
> +        If <literal>+</literal> is appended to the command name, the tables,
> +        excluded tables and schemas associated with each publication are shown as
> +        well.
>          </para>
>
> /excluded tables and schemas/excluded tables, and schemas/
>
Hi Peter, Vignesh

Thanks for reviewing the patches.
I have rebased the patches and modified the syntax in the EXCEPT
TABLE (002) patch.
For example, to exclude tables we now need to write:
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE (t1, t2);
The table list must be enclosed in '()'.

This patchset is only a rebased version. I will address all the
comments in the next version of the patch.

Thanks,
Shlok Kyal

On Fri, 7 Nov 2025 at 11:36, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some questions for the patch v25-0002 (EXCEPT tables)
>
> ======
> doc/src/sgml/ref/alter_publication.sgml
>
> 1.
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [ EXCEPT [ TABLE ] ( <replaceable
> class="parameter">exception_object</replaceable> [, ... ] ) ]
>
> You can do both ADD/SET the <publication_object>, so really there
> should be an ADD/SET ALL TABLES command as well, right?
>
These patches only added the ADD ALL TABLES command. I think once the
ADD ALL TABLES patch is committed, we can add the syntax SET ALL
TABLES.

> ~~~
>
> 2.
> What was your reason for changing the syntax?
> AFAICT those added "( )" are not strictly necessary, so I just
> wondered your reason.
>
> For example, we do not have any "( )" for <publication_object> [,...].
> It is: ALTER PUBLICATION name ADD publication_object [, ...]
> Not:   ALTER PUBLICATION name ADD (publication_object [, ...])
>
> So in the same way we could have EXCEPT syntax like that:
> ALTER PUBLICATION name ADD ALL TABLES [EXCEPT <table_exception_object> [, ...]]
> Where table_exception_object is: [ TABLE ] [ ONLY ] table_name [ * ]
>
> Currently, if the user just wants to exclude a single table they must do:
> ALTER PUBLICATION name ADD ALL TABLES EXCEPT (t1);
> instead of just ALTER PUBLICATION name ADD ALL TABLES EXCEPT t1;
>
With a recent commit, we now support
CREATE PUBLICATION .. FOR ALL TABLES, ALL SEQUENCES.

When I try to support "FOR ALL TABLES EXCEPT t1, t2", I get a
conflict when compiling the grammar.
For example:
CREATE PUBLICATION .. FOR ALL TABLES EXCEPT t1, ...
After this comma, bison reports a conflict because it cannot tell
whether to pick an ExceptPublicationObjSpec or a
PublicationAllObjSpec.
To handle this, I introduced parentheses around the table list.
To keep ALTER PUBLICATION consistent with CREATE PUBLICATION, I have
added the same syntax there.

So the current syntax for CREATE/ALTER PUBLICATION is:
CREATE PUBLICATION ... ALL TABLES EXCEPT TABLE(t1, t2, t3);
ALTER PUBLICATION ... ADD ALL TABLES EXCEPT TABLE(t1, t2, t3);

> ~~~
>
> 3.
> BTW, I think you may need to consider a <table_exception_object>
> instead of a generic name like <exception_object>, because in the
> future if we EXCEPT SEQUENCES the <exception_object> name may be not
> appropriate because things like [ONLY] and [*] are not applicable for
> sequences.
Fixed

I have attached the latest patch here.
I have also addressed the comments for [1], [2].

[1]: https://www.postgresql.org/message-id/CALDaNm0xDv96F%2B5LzcJYV6RC3Jg%2BRtdUqpQ-zoauwq3woTFzmQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAHut+PsRD8ybC7MDBNBXXs=J2DuGiOc8kSePRyZc0s63U5f7tw@mail.gmail.com

Thanks,
Shlok Kyal

On Tue, 11 Nov 2025 at 15:50, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Fri, 7 Nov 2025 at 11:36, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Shlok.
> >
> > Some questions for the patch v25-0002 (EXCEPT tables)
> >
> > ======
> > doc/src/sgml/ref/alter_publication.sgml
> >
> > 1.
> > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> > ADD ALL TABLES [ EXCEPT [ TABLE ] ( <replaceable
> > class="parameter">exception_object</replaceable> [, ... ] ) ]
> >
> > You can do both ADD/SET the <publication_object>, so really there
> > should be an ADD/SET ALL TABLES command as well, right?
> >
> These patches only added the ADD ALL TABLES command. I think once the
> ADD ALL TABLES patch is committed, we can add the syntax SET ALL
> TABLES.
>
> > ~~~
> >
> > 2.
> > What was your reason for changing the syntax?
> > AFAICT those added "( )" are not strictly necessary, so I just
> > wondered your reason.
> >
> > For example, we do not have any "( )" for <publication_object> [,...].
> > It is: ALTER PUBLICATION name ADD publication_object [, ...]
> > Not:   ALTER PUBLICATION name ADD (publication_object [, ...])
> >
> > So in the same way we could have EXCEPT syntax like that:
> > ALTER PUBLICATION name ADD ALL TABLES [EXCEPT <table_exception_object> [, ...]]
> > Where table_exception_object is: [ TABLE ] [ ONLY ] table_name [ * ]
> >
> > Currently, if the user just wants to exclude a single table they must do:
> > ALTER PUBLICATION name ADD ALL TABLES EXCEPT (t1);
> > instead of just ALTER PUBLICATION name ADD ALL TABLES EXCEPT t1;
> >
> With recent commit now we support
> CREATE PUBLICATION .. FOR ALL TABLES, ALL SEQUENCES.
>
> Now when I am trying to support "FOR ALL TABLES EXCEPT t1, t2" , I am
> getting a conflict when compiling this grammar.
> For example
> CREATE PUBLICATION .. FOR ALL TABLES EXCEPT t1, ...
> After this comma, bison is giving conflict because it is not able to
> figure whether to pick
> ExceptPublicationObjSpec or a PublicationAllObjSpec.
> So to handle this I introduced brackets around the table list.
> And to make ALTER PUBLICATION similar to CREATE PUBLICATION, I have
> added the same syntax for it.
>
> So current syntax for CREATE/ALTER PUBLICATION is like:
> CREATE PUBLICATION ... ALL TABLES EXCEPT TABLE(t1, t2, t3);
> ALTER PUBLICATION ... ADD ALL TABLES EXCEPT TABLE(t1, t2, t3);
>
> > ~~~
> >
> > 3.
> > BTW, I think you may need to consider a <table_exception_object>
> > instead of a generic name like <exception_object>, because in the
> > future if we EXCEPT SEQUENCES the <exception_object> name may be not
> > appropriate because things like [ONLY] and [*] are not applicable for
> > sequences.
> Fixed
>
> I have attached the latest patch here.
> I have also addressed the comments for [1], [2].
>
> [1]: https://www.postgresql.org/message-id/CALDaNm0xDv96F%2B5LzcJYV6RC3Jg%2BRtdUqpQ-zoauwq3woTFzmQ%40mail.gmail.com
> [2]: https://www.postgresql.org/message-id/CAHut+PsRD8ybC7MDBNBXXs=J2DuGiOc8kSePRyZc0s63U5f7tw@mail.gmail.com
>

The patches needed a rebase. Here are the rebased patches.

Thanks,
Shlok Kyal

On Fri, 14 Nov 2025 at 12:15, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for patch v27-0001.
>
> ======
> doc/src/sgml/ref/alter_publication.sgml
>
> 1.
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the default
> +   state. This includes resetting all publication parameters, setting the
> +   <literal>ALL TABLES</literal> and <literal>ALL SEQUENCES</literal> flags to
> +   <literal>false</literal>, and removing all associated tables and schemas from
> +   the publication.
>    </para>
>
> It would be better to give references to the actual
> pg_publication.puballtables and .puballsequences flag fields [1]
> instead of vaguely calling them the "<literal>ALL TABLES</literal> and
> <literal>ALL SEQUENCES</literal> flags".
>
Fixed

> ======
> src/backend/commands/publicationcmds.c
>
> AlterPublicationReset:
>
> 2.
> + if (pubform->puballtables)
> + CacheInvalidateRelcacheAll();
>
> Does that also need to check ->puballsequences?
>
I think we call CacheInvalidateRelcacheAll to invalidate the relsync
cache for ALTER PUBLICATION. For sequences we do not build a
RelSyncEntry.
There are also other similar occurrences (such as
RemovePublicationById and AlterPublicationOptions) where we do not
invalidate the cache when an ALL SEQUENCES publication is modified.
So I think this check is not required for puballsequences.

> ======
> src/test/regress/sql/publication.sql
>
> 3.
> If you want to, you can easily combine many of these test cases and
> verify them in one go instead of separate ALTER/RESET for every kind
> of flag.
>
> ~~~
>
I agree. I have made the changes in the latest patch.

> 4.
> +-- Verify that 'ALL TABLES' flag is reset
>
> Missing test to check the 'ALL SEQUENCES' flag gets reset?
>
Added the test.

> ======
> [1] https://www.postgresql.org/docs/devel/catalog-pg-publication.html
>

I have also addressed the comments in [1], [2].

[1]: https://www.postgresql.org/message-id/CAHut%2BPtRzCD4-0894cutkU_h8cPNtosN0_oSHn2iAKEfg2ENOQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAHut+PuHn-hohA4OdEJz+Zfukfr41TvMTeTH7NwJ=wg1+94uNA@mail.gmail.com

Thanks,
Shlok Kyal

On Thu, 20 Nov 2025 at 11:54, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Thanks for splitting the patches.
>
> Here are some review comments for the new patch v28-0002 (ADD ALL TABLES).
>
> ======
> Commit Message
>
> 1.
> This patch adds support for using ADD ALL TABLES in ALTER PUBLICATION,
> allowing an existing publication to be changed into an ALL TABLES
> publication. This command is permitted only when the publication is
> in its default state, meaning it has no tables or schemas added, its
> ALL TABLES and ALL SEQUENCES flags are not set, and publication
> options such as publish_via_partition_root, publish_generated_columns,
> and publish are at their default values.
>
> ~
>
> IMO, the restrictions for this new command are too severe:
>
> e.g. If I already have a FOR ALL SEQUENCES publication, then I
> expected it should be possible to ADD ALL TABLES to that as well,
> right?
>
> Likewise, why are we enforcing that the publication parameters must be
> defaults? IOW, why is (i) below disallowed, but (ii) is allowed?
>
> (i)
> ALTER PUBLICATION pub SET (publish_generated_columns=stored);
> ALTER PUBLICATION pub ADD ALL TABLES;
>
> (ii)
> ALTER PUBLICATION pub ADD ALL TABLES;
> ALTER PUBLICATION pub SET (publish_generated_columns=stored);
>
I agree that the current restrictions were too strict. With the latest
patch, ADD ALL TABLES is rejected only when the publication already
has a list of tables or schemas.

> ======
> doc/src/sgml/ref/alter_publication.sgml
>
> Description:
>
> 2.
> The "Description" part of this page is confusing because it was
> referring to "The first three variants" and later "The fourth
> variant".  Now that the "ADD ALL TABLES" variant has been added, I
> have lost track of what "variants" this description is talking about.
> Those words should be replaced by something clearer. This could be an
> ongoing issue if it is not worded differently because the same problem
> will happen again, e.g. when more syntax gets added for ALL SEQUENCES,
> etc.
>
> ~~~
>
I have updated the description to avoid the wording "The first three
variants". Instead, I have added a list that describes each command
separately, similar to ALTER TABLE [1].

> 3.
> Note also that DROP TABLES IN SCHEMA will not drop any schema tables
> that were specified using FOR TABLE/ ADD TABLE.
>
> ~
>
> That sentence (above) is from the docs. Does that also need updating
> now that there is ADD ALL TABLES?
>
When we create a publication on a schema, we can also add specific
tables using FOR TABLE/ADD TABLE.
But for an ALL TABLES publication, we are not allowed to include
tables using FOR TABLE/ADD TABLE.

So this wording is not required for the ALL TABLES case.

> ======
> src/backend/commands/publicationcmds.c
>
> CheckPublicationDefValues:
>
> 4.
> Is this function needed?
>
It is not needed. Modified the function to give proper error messages
for each case.

> ~~~
>
> AlterPublication:
>
> 5.
> + if (stmt->for_all_tables)
> + {
> + bool isdefault = CheckPublicationDefValues(tup);
> +
> + if (!isdefault)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("adding ALL TABLES requires the publication to have default publication parameter values"),
> + errdetail("ALL TABLES or ALL SEQUENCES flag should not be set and no tables/schemas should be associated."),
> + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> +
> + AlterPublicationSetAllTables(rel, tup);
> + }
> +
>
> Why do we need this self-imposed restriction?
>
See reply to comment 1.

> ======
> src/include/nodes/parsenodes.h
>
> 6.
>   List    *pubobjects; /* Optional list of publication objects */
> + bool for_all_tables; /* Special publication for all tables in db */
>   AlterPublicationAction action; /* What action to perform with the given
>   * objects */
>  } AlterPublicationStmt;
>
>
> There is no such "FOR" syntax like ALTER PUBLICATION ... FOR ALL
> TABLES, so I felt just 'puballtables' might be a better member name.
>
We have the same variable name in CreatePublicationStmt. I feel
keeping the name as 'for_all_tables' will keep it consistent and
easier to understand.

> ======
> src/test/regress/sql/publication.sql
>
> 7.
> Don't uppercase any of the publication parameters because they never
> appear in the docs/examples like that.
>
> ~
>
> 8.
> So that the last command is the one being tested, I felt that all the
> test cases should be doing RESET *first* instead of last.
>
> ~~~
>
> 9.
> You don't always need to use RESET. There should also be some tests
> using an "empty" publication just to be sure it works. e.g
>
> CREATE PUBLICATION pub_empty;
> ALTER PUBLICATION pub_empty ADD ALL TABLES;
>
> ~~~
>
> 10.
> As commented earlier, I felt the rules were too restrictive. So I
> think some test cases can be removed.
>
> ~~~
>
> 11.
> +-- Tests for ALTER PUBLICATION ... ADD ALL TABLES
>
> ~
>
> I noticed there is a "--
> ======================================================" separator
> between the major groups of tests.
>
> 11a. Should use this separator in patch 0001 for the RESET group of tests
>
> 11b. Should use this separator in patch 0002 for the ADD ALL TABLES
> groups of tests
>
> ~~~
>
> 12.
> +-- Can't add ALL TABLES to 'ALL TABLES' publication
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES;
> +
>
> This test case seems to belong earlier, near the 'FOR TABLE' and the
> 'TABLES IN SCHEMA' tests.
>
I saw the patch needed a rebase. I have rebased it.
I have also addressed the remaining comments in this email and
comments in the email [2].

While addressing the comments, I found a couple of race conditions
when the following pairs of commands run concurrently:
 'ALTER PUBLICATION ... RESET' and 'ALTER PUBLICATION ... ADD TABLE'
 'ALTER PUBLICATION ... ADD ALL TABLES' and 'ALTER PUBLICATION ... ADD
TABLE'
I have addressed these in the v29 patch.
I will address Peter's comments on the 0003 and 0004 patches and
Shveta's comments in the next version.

[1]: https://www.postgresql.org/docs/current/sql-altertable.html
[2]: https://www.postgresql.org/message-id/CAHut%2BPv4d9EAjDQiOHiu2BrYP3ZA-oJgsgGZdygBaZnWDR7sDA%40mail.gmail.com

Thanks,
Shlok Kyal

On Mon, 8 Dec 2025 at 17:44, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 5:27 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Dec 4, 2025 at 5:21 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > I have addressed these in the v29 patch.
> > > Will address comments for 0003 and 0004 patch by Peter and comments by
> > > Shveta in next version.
> > >
> >
> > Thanks for the patch.
> >
> > I believe patch 003 (EXCEPT table) and 004 (EXCEPT column_list) should
> > be the primary focus.
> >
>
> +1. We should first try to make 0003 RFC before going further.
>
I have removed the 0001, 0002, and 0004 patches for now and will post
them once the 0003 patch is RFC.
Here is the updated patch for "EXCEPT TABLE".

Thanks,
Shlok Kyal

Attachments

v30-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

December 9, 2025, 20:48:48

On Mon, 24 Nov 2025 at 13:03, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Nov 21, 2025 at 5:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Shlok.
> >
> > Here are some review comments for your patch v28-0003 (EXCEPT TABLE ...).
> >
> > The review of this patch is a WIP. In this post I only looked at the test code.
> >
>
> Here are my remaining review comments for patch v28-0003 (EXCEPT TABLE ...).
>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 1.
> -ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [ EXCEPT [ TABLE ] ( <replaceable
> class="parameter">table_exception_object</replaceable> [, ... ] ) ]
>
> Why is that optional [TABLE] keyword needed?
>
> I know PostgreSQL commands sometimes have "noise" words in the syntax so
> the command can be more English-like, but in this case, the
> publication is a FOR ALL *TABLES* anyway, so I am not sure what the
> benefit is of the user being able to say TABLE a 2nd time?
>
I think this feature can be extended to EXCEPT SCHEMA etc., so I think
the keyword is necessary for clarity.
There is already a discussion about this [1].

> ======
> src/backend/catalog/pg_publication.c
>
> 2.
> + /*
> + * Check for partitions of partitioned table which are specified with
> + * EXCEPT clause and partitioned table is published with
> + * publish_via_partition_root = true.
> + */
>
> I think you can just say "partitions" or "table partitions", but
> "partitions of [a] partitioned table" seems overkill.
>
> Also, "... and partitioned table is published with
> publish_via_partition_root = true." seems too wordy. Isn't that just
> the same as "... and publish_via_partition_root = true"
>
> SUGGESTION
> Check for when the publication says "EXCEPT TABLE (partition)" but
> publish_via_partition_root = true.
>
> ~~~
>
Modified

> 3.
> -/* Gets list of publication oids for a relation */
> +/* Gets list of publication oids for a relation that matches the except_flag */
>  List *
> -GetRelationPublications(Oid relid)
> +GetRelationPublications(Oid relid, bool except_flag)
>  {
>   List    *result = NIL;
>   CatCList   *pubrellist;
> @@ -765,7 +791,8 @@ GetRelationPublications(Oid relid)
>   HeapTuple tup = &pubrellist->members[i]->tuple;
>   Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
>
> - result = lappend_oid(result, pubid);
> + if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
> + result = lappend_oid(result, pubid);
>   }
>
> I was wondering if it might be better to return 2 lists from this
> function (e.g. an included-list, and an excluded-list) instead of
> passing the 'except_flag' like the current code. IIUC, you are mostly
> calling this function twice to get 2 lists anyway, but returning 2
> lists instead of 1, this function might be more efficient since it
> will only process the publication loop once.
>
Modified
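
To make the single-pass idea concrete, here is a minimal standalone sketch. It is not the patch code: PubRelRow and split_relation_publications are hypothetical names, and plain arrays stand in for PostgreSQL's List type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_publication_rel row of a relation. */
typedef struct
{
    unsigned int prpubid;   /* publication OID */
    bool         prexcept;  /* true if the relation is listed in EXCEPT TABLE */
} PubRelRow;

/*
 * Split the rows into included and excluded publication OIDs in one pass.
 * Both output arrays must have room for nrows entries.
 */
void
split_relation_publications(const PubRelRow *rows, size_t nrows,
                            unsigned int *included, size_t *nincluded,
                            unsigned int *excluded, size_t *nexcluded)
{
    *nincluded = 0;
    *nexcluded = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].prexcept)
            excluded[(*nexcluded)++] = rows[i].prpubid;
        else
            included[(*nincluded)++] = rows[i].prpubid;
    }
}
```

A caller that needs only one of the two lists can ignore the other, so the rows are scanned once either way.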

> ~~~
>
> 4.
> /*
>  * Gets list of relation oids for a publication that matches the except_flag.
>  *
>  * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
>  * should use GetAllPublicationRelations().
>  */
> List *
> GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
> bool except_flag)
> Something doesn't seem right -- the function comment says we shouldn't
> be calling the function for FOR ALL TABLES, but meanwhile, EXCEPT
> TABLE is currently only implemented via FOR ALL TABLES. So it feels
> contradictory. Maybe it is just the comment that needs updating?
>
I thought more about this function and found that we can remove the
'except_flag' parameter.

The EXCEPT TABLE clause can only be used with an ALL TABLES
publication, and the FOR TABLE clause cannot be combined with ALL
TABLES.
So if this function is called for an ALL TABLES publication, it
returns the list of excluded tables.
Otherwise it returns the list of tables included in the publication.

I have added a comment describing this behaviour.

> ~~~
>
> 5.
> /*
>  * Gets list of relation oids for a publication that matches the except_flag.
>  *
>  * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
>  * should use GetAllPublicationRelations().
>  */
> List *
> GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
> bool except_flag)
> {
> List    *result;
> Relation pubrelsrel;
> ScanKeyData scankey;
> SysScanDesc scan;
> HeapTuple tup;
>
> /* Find all publications associated with the relation. */
> pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
>
> Existing bug? Isn't this a bogus comment?
> /* Find all publications associated with the relation. */
>
> Was that meant to be the other way around? -- e.g. Find all the
> relations associated with the specified publication.
>
I think you are correct. I will create a separate thread for this change.

> ======
> src/backend/commands/publicationcmds.c
>
> 6.
> + default:
> + /* shouldn't happen */
> + elog(ERROR, "invalid publication object type %d",
> + puballobj->pubobjtype);
> + break;
>
> I think the ERROR is enough of a clue that it shouldn't happen. I felt
> the comment was redundant.
>
There are multiple similar occurrences; see the function
'ObjectsInPublicationToOids' in publicationcmds.c.
There are some occurrences in the openssl.c file as well. But I also
think this comment is redundant, so I have removed it.

> ~~~
>
> ObjectsInPublicationToOids:
>
> 7.
>   case PUBLICATIONOBJ_TABLE:
> + pubobj->pubtable->except = false;
> + *rels = lappend(*rels, pubobj->pubtable);
> + break;
> + case PUBLICATIONOBJ_EXCEPT_TABLE:
> + pubobj->pubtable->except = true;
>   *rels = lappend(*rels, pubobj->pubtable);
>   break;
> Those are very similar. How about combining like below?
>
> case PUBLICATIONOBJ_TABLE:
> case PUBLICATIONOBJ_EXCEPT_TABLE:
>   pubobj->pubtable->except = (pubobj->pubobjtype == PUBLICATIONOBJ_EXCEPT_TABLE);
>   *rels = lappend(*rels, pubobj->pubtable);
>   break;
>
Modified
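
The combined case can be tried outside the backend with a small model. This is only a sketch: the Toy* types and toy_mark_except are hypothetical stand-ins for PublicationObjSpec and the switch in ObjectsInPublicationToOids.

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the publication object types. */
typedef enum
{
    TOY_PUBOBJ_TABLE,
    TOY_PUBOBJ_EXCEPT_TABLE,
    TOY_PUBOBJ_TABLES_IN_SCHEMA
} ToyPubObjType;

typedef struct
{
    const char *relname;
    bool        except;   /* set from the object type */
} ToyPubTable;

typedef struct
{
    ToyPubObjType pubobjtype;
    ToyPubTable  *pubtable;
} ToyPubObjSpec;

/*
 * Sets the except flag for table-like objects using one shared case body.
 * Returns true if the object was handled as a table.
 */
bool
toy_mark_except(ToyPubObjSpec *pubobj)
{
    switch (pubobj->pubobjtype)
    {
        case TOY_PUBOBJ_TABLE:
        case TOY_PUBOBJ_EXCEPT_TABLE:
            pubobj->pubtable->except =
                (pubobj->pubobjtype == TOY_PUBOBJ_EXCEPT_TABLE);
            return true;
        default:
            return false;
    }
}
```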

> ~~
>
> pub_contains_invalid_column:
>
> 8.
>  pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
>   bool pubviaroot, char pubgencols_type,
> - bool *invalid_column_list,
> + bool puballtables, bool *invalid_column_list,
>   bool *invalid_gen_col)
>
> The 'pub_via_root' and 'pubgencols_type' are parameters. Somehow it
> seems more natural for the 'puballtables' to be passed before those,
> because FOR ALL TABLES comes before WITH in the syntax.
>
Modified

> ~~~
>
> CreatePublication:
>
> 9.
>   else if (!stmt->for_all_sequences)
> - {
>   ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
>      &schemaidlist);
>
> AFAICT, this function is refactored a lot because of the removal of
> that '{'. It looks like mostly whitespace, but really, I think the
> logic is quite different. I wasn't sure what that was about. Is it
> related to this patch, or some other bugfix in passing or what?
>
This change is part of this patch.
It is required to get the list of tables that are excluded (for an
ALL TABLES publication).

I think the schema-related code should be inside the
'else if (!stmt->for_all_sequences)' condition.
I have made that change and also added a comment.

> ======
> src/backend/commands/tablecmds.c
>
> ATPrepChangePersistence:
>
> 10.
> - GetRelationPublications(RelationGetRelid(rel)) != NIL)
> + list_length(GetRelationPublications(RelationGetRelid(rel), false)) > 0)
>
> Isn't an empty List the same as a NIL list? Maybe that list_length()
> change was not really needed.
>
Modified
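
For reference, PostgreSQL represents the empty List as NIL (a null pointer), and list_length(NIL) returns 0, so the two spellings of the check agree. The toy model below illustrates that; ToyList, TOY_NIL and the toy_* functions are hypothetical stand-ins, not the pg_list.h definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a list header; a real List also carries its elements. */
typedef struct ToyList
{
    int length;   /* always >= 1 for a non-NULL list in this model */
} ToyList;

#define TOY_NIL ((ToyList *) NULL)

/* Length of a list, treating the null pointer as the empty list. */
int
toy_list_length(const ToyList *list)
{
    return list ? list->length : 0;
}

/* "Not empty" written as a pointer comparison. */
bool
toy_nonempty_by_pointer(const ToyList *list)
{
    return list != TOY_NIL;
}

/* "Not empty" written as a length comparison. */
bool
toy_nonempty_by_length(const ToyList *list)
{
    return toy_list_length(list) > 0;
}
```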

> ======
> src/backend/parser/gram.y
>
> 11.
>   drop_option_list pub_obj_list pub_all_obj_type_list
> + except_pub_obj_list opt_except_clause
>
> Is this name consistent with the others? Should it be pub_except_obj_list?
>
I think pub_except_obj_list is consistent with others. Modified.

> ~~~
>
> 12.
>  %type <publicationobjectspec> PublicationObjSpec
> +%type <publicationobjectspec> ExceptPublicationObjSpec
>  %type <publicationallobjectspec> PublicationAllObjSpec
>
> Is this name consistent with the others? Should it be PublicationExceptObjSpec?
>
I agree. Modified.

> ~~~
>
> CreatePublicationStmt:
>
> 13.
>   n->pubname = $3;
> + n->pubobjects = $5;
>
> I noticed that sometimes there is a cast (List *) and other times
> there is not. e.g. none here, but cast in AlterPublicationStmt. Why
> the differences?
>
This change is not required in the latest patch, due to the discussion in [2].

> ~~~
>
> PublicationObjSpec:
>
> 14.
> The comment for 'PublicationObjSpec' says "FOR TABLE and FOR TABLES IN
> SCHEMA specifications". If that comment is correct, then why is this
> patch changing this code? OTOH, if the code is correct, then does the
> comment need updating?
>
We have only added "$$->location = @1;" for PublicationObjSpec.
It sets the location of the '^' indicator that is shown when an error
is thrown. I think we don't need to update the comment for it?

> ======
> src/bin/pg_dump/pg_dump.c
>
> 15.
> Shouldn't there have already been some ALTER ... ADD ALL TABLES dump
> code and test code implemented back in patch 0002?
>
This change is not required in the latest patch, due to the discussion in [2].

> ~~~
>
> dumpPublication:
>
> 16.
>   else if (pubinfo->puballtables)
> + {
> + SimplePtrListCell *cell;
> +
>   appendPQExpBufferStr(query, " FOR ALL TABLES");
> +
> + /* Include exception tables if the publication has except tables */
> + for (cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + TableInfo  *tbinfo;
> +
> + if (pubinfo == pubrinfo->publication)
> + {
> + tbinfo = pubrinfo->pubtable;
> +
> + if (first)
> + {
> + appendPQExpBufferStr(query, " EXCEPT TABLE (");
> + first = false;
> + }
> + else
> + appendPQExpBufferStr(query, ", ");
> + appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
> + }
> + }
> + if (!first)
> + appendPQExpBufferStr(query, ")");
> + }
>
> 16a.
> SimplePtrListCell *cell can be declared as a for-loop variable.
>
> ~
>
> 16b.
> The comment should say "EXCEPT TABLES" in uppercase.
>
> ~
>
> 16c.
> I am not convinced you can use that 'first' flag like you are doing.
> Isn't that interfering with the existing usage of that flag? Perhaps
> another boolean just for this EXCEPT loop is needed.
>
> ~~~
>
Modified
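
As an illustration of the dedicated-flag approach from 16c, here is a self-contained sketch that builds the clause into a fixed buffer. It is an assumption-laden model, not pg_dump code: toy_append and toy_append_all_tables_clause are hypothetical helpers, a char buffer replaces PQExpBuffer, and the table names are passed in already qualified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append s to the NUL-terminated buf; fail if it does not fit. */
static bool
toy_append(char *buf, size_t bufsize, const char *s)
{
    size_t used = strlen(buf);
    size_t need = strlen(s);

    if (used + need + 1 > bufsize)
        return false;
    memcpy(buf + used, s, need + 1);
    return true;
}

/*
 * Append " FOR ALL TABLES" and, if any tables are given, an
 * " EXCEPT TABLE (ONLY a, ONLY b)" list. Uses its own flag so that any
 * 'first' flag owned by the caller is left alone.
 */
bool
toy_append_all_tables_clause(char *buf, size_t bufsize,
                             const char *const *except_tables, size_t ntables)
{
    bool first_except = true;
    bool ok = toy_append(buf, bufsize, " FOR ALL TABLES");

    for (size_t i = 0; ok && i < ntables; i++)
    {
        ok = toy_append(buf, bufsize,
                        first_except ? " EXCEPT TABLE (ONLY " : ", ONLY ");
        first_except = false;
        ok = ok && toy_append(buf, bufsize, except_tables[i]);
    }

    /* Close the parenthesis only if the list was opened. */
    if (ok && !first_except)
        ok = toy_append(buf, bufsize, ")");

    return ok;
}
```

When no tables are excluded, the function emits only " FOR ALL TABLES" and never opens the parenthesis.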

> getPublicationTables:
>
> 17.
> + if (strcmp(prexcept, "f") == 0)
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> +
>
> ...
>
> + if (strcmp(prexcept, "t") == 0)
> + simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
> +
>
> Here you are comparing the same 'prexcept' flag for both "f" and "t".
>
> I felt it was better if both comparisons are the same (e.g. both "t").
>
> Or better still, assign a new boolean and avoid that 2nd strcmp
> entirely -- e.g. except_flag = (strcmp(prexcept, "t") == 0);
>
Modified
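
The single-comparison variant suggested above could look roughly like this. It is a sketch only: the Toy* names are hypothetical, and the catalog value is assumed to arrive as the string "t" or "f", as in the quoted diff.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced versions of the dumpable object types. */
typedef enum
{
    TOY_DO_PUBLICATION_REL,
    TOY_DO_PUBLICATION_EXCEPT_REL
} ToyObjType;

typedef struct
{
    ToyObjType objType;
    bool       add_to_except_list;   /* whether to append to the EXCEPT list */
} ToyClassification;

/* Evaluate prexcept once and derive both decisions from that boolean. */
ToyClassification
toy_classify_pubrel(const char *prexcept)
{
    bool              is_except = (strcmp(prexcept, "t") == 0);
    ToyClassification result;

    result.objType = is_except ? TOY_DO_PUBLICATION_EXCEPT_REL
                               : TOY_DO_PUBLICATION_REL;
    result.add_to_except_list = is_except;
    return result;
}
```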

> ======
> src/bin/pg_dump/pg_dump_sort.c
>
> DOTypeNameCompare:
>
> 18.
> + else if (obj1->objType == DO_PUBLICATION_EXCEPT_REL)
> + {
> + PublicationRelInfo *probj1 = *(PublicationRelInfo *const *) p1;
> + PublicationRelInfo *probj2 = *(PublicationRelInfo *const *) p2;
> +
> + /* Sort by publication name, since (namespace, name) match the rel */
> + cmpval = strcmp(probj1->publication->dobj.name,
> + probj2->publication->dobj.name);
> + if (cmpval != 0)
> + return cmpval;
> + }
>
> Isn't this identical to the previous code block? So can't you just add
>  DO_PUBLICATION_EXCEPT_REL to that condition?
>
Modified

> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 19.
> Missing test cases for ALTER? But also.
>
This change is not required in the latest patch, due to the discussion in [2].

> ~~~
>
> 20.
> Missing test cases for EXCEPT for INHERITED tables?
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 21.
> I was wondering if the "describe" for tables (e.g. \d+) should also
> show the publications where the table is an EXCEPT TABLE? How else is
> the user going to know it has been excluded by some publication?
>
I thought it would be sufficient to show only the list of
publications that the table is part of.
Users can check the excluded tables in the publication description
using \dRp+.
Would that not be sufficient?
I am not sure why we should show a list of publications that the table
is not part of.
Am I missing something? Thoughts?

> ======
> src/bin/psql/tab-complete.in.c
>
> ALTER PUBLICATION:
>
> 22.
> The tab completion does not seem as good as it could be. e.g, there is
> missing '(' and the for EXCEPT TABLE
>
> ~~~
>
Modified

> CREATE PUBLICATION:
>
> 23.
> The tab completion does not seem as good as it could be. e.g, there is
> missing '(' and the for EXCEPT TABLE
>
Modified

> ======
> src/test/regress/sql/publication.sql
>
> 24.
> +\dRp+ testpub_foralltables_excepttable
> +\dRp+ testpub_foralltables_excepttable1
>
> As well as doing the "describes" for the publication, I think we need
> to see the test cases for the describes of those excluded tables. e.g.
> I imagine that they should also list the publications that they are
> *excluded* from, right?
>
See Reply to comment 21.

> ~~~
>
> 25.
> +CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
> +CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (ONLY testpub_tbl3);
>
> 25a.
> Needs some explanatory comments here saying these are for testing the
> EXCEPT with inherited tables (e.g. ONLY versus not).
>
> ~
>
> 25b.
> I think you should be testing the '*' syntax here too.
>
> ~~~
>
I agree. Made the changes.

> 26.
> +CREATE TABLE pub_sch1.tbl2 (a int);
>  SET client_min_messages = 'ERROR';
>  CREATE PUBLICATION testpub_reset FOR ALL TABLES, ALL SEQUENCES;
>  RESET client_min_messages;
> @@ -1344,9 +1358,15 @@ ALTER PUBLICATION testpub_reset ADD ALL TABLES;
>
>  -- Can't add ALL TABLES to 'ALL TABLES' publication
>  ALTER PUBLICATION testpub_reset ADD ALL TABLES;
> +ALTER PUBLICATION testpub_reset RESET;
> +
> +-- Verify adding EXCEPT TABLE
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE
> (pub_sch1.tbl1, pub_sch1.tbl2);
> +\dRp+ testpub_reset
>
>  DROP PUBLICATION testpub_reset;
>  DROP TABLE pub_sch1.tbl1;
> +DROP TABLE pub_sch1.tbl2;
>  DROP SCHEMA pub_sch1;
>
> It looks like that added CREATE TABLE (and RESET?) belongs more
> appropriately within the scope of the new test "Verify adding EXCEPT
> TABLE".
>
This change is not required in the latest patch, due to the discussion in [2].

I have addressed the comments and attached the latest patch.
As suggested by Shveta and Amit, I have omitted patches 0001, 0002,
and 0004 (as per [2]) and will post them once the 0003 patch is RFC.

The new 0001 patch supports EXCEPT TABLE for the CREATE PUBLICATION
.. FOR ALL TABLES syntax. It is attached in [3].

[1]: https://www.postgresql.org/message-id/CAA4eK1JEKs8qwwhRb1BCiMNduJ5ePUtFnTscrZt86UKWBkLxwg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1KZ1Sb0soHp3HH2htwJ3%3Dqka-eQjW35vOW3%2B4VeWw4VoQ%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CANhcyEXwLrQsec6g%2B1dqWTKyJQMQMh%3Dgetj28C%2BzLL14BjuumA%40mail.gmail.com

Thanks,
Shlok Kyal

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

December 9, 2025, 20:49:26

On Fri, 21 Nov 2025 at 12:26, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Here are some review comments for your patch v28-0003 (EXCEPT TABLE ...).
>
> The review of this patch is a WIP. In this post I only looked at the test code.
>
> ======
> .../t/037_rep_changes_except_table.pl
>
> 1.
> +
> +# Copyright (c) 2021-2025, PostgreSQL Global Development Group
> +
> +# Logical replication tests for except table publications
>
> Use uppercase: /except table/EXCEPT TABLE/
>
> ~~~
>
> 2.
> There are lots of test cases dedicated to partition-table testing. I
> felt a bigger comment separating these major groups might be helpful.
>
> Something like:
>
> -- ============================================
> -- EXCEPT TABLE test cases for normal tables
> -- ============================================
>
> and
>
> -- ============================================
> -- EXCEPT TABLE test cases for partition tables
> -- ============================================
>
> ~~~
>
> 3.
> +# Initialize publisher node
> ...
> +# Create subscriber node
>
> Those 2 comments should be almost alike -- e.g. both should say
> "Initialize" or both should say "Create".
>
> ~~~
>
> 4.
> +# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE
> +# clause.
> +# Create schemas and tables on publisher
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE SCHEMA sch1;
> + CREATE TABLE sch1.tab1 AS SELECT generate_series(1,10) AS a;
> + CREATE TABLE public.tab1(a int);
> +));
> +
>
> That first sentence ("Test replication with ...") is not needed here.
> This is just repeating the purpose of the entire file, so that comment
> can replace the one at the top of this file.
>
> ~~~
>
> 5.
> +# Insert some data and verify that inserted data is not replicated
>
> Be explicit that we are referring to the excluded table.
>
> SUGGESTION (e.g.)
> Verify that data inserted to the excluded table is not replicated.
>
> ~~~
>
> 6.
> +# Alter publication to exclude data changes in public.tab1 and verify that
> +# subscriber does not get the changed data for this table.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + ALTER PUBLICATION tap_pub_schema RESET;
> + ALTER PUBLICATION tap_pub_schema ADD ALL TABLES EXCEPT TABLE (sch1.tab1, public.tab1);
> + INSERT INTO public.tab1 VALUES(generate_series(1,10));
> +));
> +$node_publisher->wait_for_catchup('tap_sub_schema');
> +
>
> It is not strictly needed for these tests, but do you think it makes
> more sense to also do an ALTER SUBSCRIPTION ... REFRESH PUBLICATION;
> whenever you change the publications?
>
> ~~~
>
> 7.
> +# cleanup
> +$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_schema");
> +$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_schema");
> +
> +
>
> double-blank lines.
>
> ~~~
>
> 8.
> I think it would be more helpful if the partition table test cases say
> (in their comments) a lot more about the steps they are doing, and
> what they expect the result to be. Sure, I can read all the code to
> figure it out for each case, but it is better to know the test
> intentions/expectations then verify they are doing the right thing.
>
> ~~~
>
> 9.
> + CREATE TABLE sch1.t1(a int) PARTITION BY RANGE(a);
> + CREATE TABLE sch1.part1 PARTITION OF sch1.t1 FOR VALUES FROM (0) TO (5);
>
> Maybe create this table to have *multiple* partitions. It might be
> interesting later to see what happens when you try to EXCEPT only one
> of the partitions.
>
I have addressed all the comments.
Please find the updated patch in [1].

[1]: https://www.postgresql.org/message-id/CANhcyEXwLrQsec6g%2B1dqWTKyJQMQMh%3Dgetj28C%2BzLL14BjuumA%40mail.gmail.com

Thanks,
Shlok Kyal

Re: Skipping schema changes in publication

From

shveta malik

Date:

10 December 2025, 08:50:59

On Tue, Dec 9, 2025 at 11:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> >
> I have removed the 0001 0002 and 0004 patches for now. Will post them
> once 0003 patch is RFC.
> Here is the update patch for "EXCEPT TABLE".
>

Thanks, I have not looked at the new patch yet, but here are a few comments
for v29-003:

1)
create_publication.sgml:
Please add one more example in the example section for EXCEPT using
'all tables, and all sequences' after the last existing one. This is
needed to show that ALL TABLES EXCEPT() and ALL SEQ are still possible
in single command.

2)
+      excluded. Optionally, <literal>*</literal> can be specified after the
+      table name to explicitly indicate that descendant tables are excluded.
+     </para>

We may add: This does not apply to a partitioned table, however.
(this will make it more clear similar to how existing doc has it for
'FOR TABLE' clause ). And then start details on partition.

3)
When
+      <literal>publish_via_partition_root</literal> is set to
+      <literal>false</literal>, specifying a partitioned table or non-leaf
+      partition has no effect

Can we simply say 'specifying a root partitioned table has no effect'.
This will make it consistent as the previous sentence also uses the
same term rather than 'non-leaf'.

4)
tab_root is a partitioned table with tab_part_1 and tab_part_2 as its
partitions.
In the first case, I receive a WARNING because the user excluded
tab_part_2 but its data will still be replicated through the root
table:

postgres=# create publication pub3 for all tables except (tab_part_2)
WITH (publish_via_partition_root=true);
WARNING:  partition "tab_part_2" will be replicated as
publish_via_partition_root is "true"

But in the following case, no WARNING is shown:
postgres=# create publication pub4 for all tables except (tab_root)
WITH (publish_via_partition_root=false);
CREATE PUBLICATION

In this scenario, the user has excluded the root table, yet its data
will still be replicated because publish_via_partition_root = false.
Should we emit a warning in this case as well? Thoughts?
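
For illustration only, here is a minimal, self-contained C sketch of how the two cases above could be told apart. It is not taken from the patch: the structs `PubSketch` and `RelSketch`, the function `except_entry_warning`, and the message texts are all hypothetical stand-ins for the publication and relation fields that the real code consults.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the publication properties. */
typedef struct PubSketch
{
	bool		alltables;		/* FOR ALL TABLES publication */
	bool		pubviaroot;		/* publish_via_partition_root */
} PubSketch;

/* Hypothetical, simplified stand-in for the relation properties. */
typedef struct RelSketch
{
	bool		in_except_list;	/* named in the EXCEPT list */
	bool		is_partition;	/* relation is a partition */
	bool		is_partitioned_table;	/* relation is a partitioned table */
} RelSketch;

/*
 * Return an (assumed) warning text when an EXCEPT entry cannot take effect,
 * or NULL when no warning is needed.
 */
const char *
except_entry_warning(const PubSketch *pub, const RelSketch *rel)
{
	/* Only EXCEPT entries of FOR ALL TABLES publications are of interest. */
	if (!pub->alltables || !rel->in_except_list)
		return NULL;

	/* Case 1: a partition is excluded, but changes are published via the root. */
	if (pub->pubviaroot && rel->is_partition)
		return "partition will still be replicated via its root table";

	/* Case 2: a partitioned table is excluded, but its partitions publish. */
	if (!pub->pubviaroot && rel->is_partitioned_table)
		return "partitioned table will still be replicated via its partitions";

	return NULL;
}
```

With this shape, the pub3 example corresponds to case 1 and the pub4 example to case 2, so both would produce a warning.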

5)
publication_add_relation:
+ if (pub->alltables && pri->except && targetrel->rd_rel->relispartition &&
+ pub->pubviaroot)

Can we please bring both the 'pub' conditions together, as that seems
more understandable:
if (pub->alltables && pub->pubviaroot &&...)

6)
We have added pubid as an argument to GetAllPublicationRelations to
exclude except-list tables.
We should update the comment atop GetAllPublicationRelations() to say so,
by extending this existing comment to also mention the except-list exclusion:

 * If the publication publishes partition changes via their respective root
 * partitioned tables, we must exclude partitions in favor of including the
 * root partitioned tables.

thanks
Shveta

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

10 December 2025, 13:21:53

On Wed, 10 Dec 2025 at 11:21, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 9, 2025 at 11:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > >
> > I have removed the 0001 0002 and 0004 patches for now. Will post them
> > once 0003 patch is RFC.
> > Here is the update patch for "EXCEPT TABLE".
> >
>
> Thanks, I have not looked at the new patch yet, but here are a few comments
> for v29-003:
>
> 1)
> create_publication.sgml:
> Please add one more example in the example section for EXCEPT using
> 'all tables, and all sequences' after the last existing one. This is
> needed to show that ALL TABLES EXCEPT() and ALL SEQ are still possible
> in single command.
>
> 2)
> +      excluded. Optionally, <literal>*</literal> can be specified after the
> +      table name to explicitly indicate that descendant tables are excluded.
> +     </para>
>
> We may add: This does not apply to a partitioned table, however.
> (this will make it more clear similar to how existing doc has it for
> 'FOR TABLE' clause ). And then start details on partition.
>
> 3)
> When
> +      <literal>publish_via_partition_root</literal> is set to
> +      <literal>false</literal>, specifying a partitioned table or non-leaf
> +      partition has no effect
>
> Can we simply say 'specifying a root partitioned table has no effect'.
> This will make it consistent as the previous sentence also uses the
> same term rather than 'non-leaf'.
>
> 4)
> tab_root is a partitioned table with tab_part_1 and tab_part_2 as its
> partitions.
> In the first case, I receive a WARNING because the user excluded
> tab_part_2 but its data will still be replicated through the root
> table:
>
> postgres=# create publication pub3 for all tables except (tab_part_2)
> WITH (publish_via_partition_root=true);
> WARNING:  partition "tab_part_2" will be replicated as
> publish_via_partition_root is "true"
>
> But in the following case, no WARNING is shown:
> postgres=# create publication pub4 for all tables except (tab_root)
> WITH (publish_via_partition_root=false);
> CREATE PUBLICATION
>
> In this scenario, the user has excluded the root table, yet its data
> will still be replicated because publish_via_partition_root = false.
> Should we emit a warning in this case as well? Thoughts?
>
> 5)
> publication_add_relation:
> + if (pub->alltables && pri->except && targetrel->rd_rel->relispartition &&
> + pub->pubviaroot)
>
> Can we please bring both the 'pub' conditions together, as that seems
> more understandable:
> if (pub->alltables && pub->pubviaroot &&...)
>
> 6)
> We have added pubid as argument to GetAllPublicationRelations to
> exclude except-list tables.
> We should change comment atop GetAllPublicationRelations() to indicate
> the same. We should extend
> this existing comment to say about except-list exclusion also.
>
>  * If the publication publishes partition changes via their respective root
>  * partitioned tables, we must exclude partitions in favor of including the
>  * root partitioned tables.
>
Hi Shveta,

I have addressed the above comments and attached the updated patch.
I have also addressed a comment by Peter (comment no. 20 in [1]) which
I missed in the earlier version.

[1]: https://www.postgresql.org/message-id/CAHut%2BPudi%2B9ssBR_Q_Fd29aGEu8s18OyKUGo5w5aKJK-2_c%2B8g%40mail.gmail.com

Thanks,
Shlok Kyal

Attachments

v31-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch

Re: Skipping schema changes in publication

From

Peter Smith

Date:

11 December 2025, 01:39:33

Hi Shlok -

Here are some review comments for v31-0001 (EXCEPT (tablelist))

======
Commit message

1.
The new syntax allows specifying excluded relations when creating or altering
a publication. For example:
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE (t1,t2);

~

In v30, you removed all the ALTER PUBLICATION changes, so the "or
altering" in the message above also needs to be removed.

======
doc/src/sgml/logical-replication.sgml

2.
   <para>
-   To add tables to a publication, the user must have ownership rights on the
-   table. To add all tables in schema to a publication, the user must be a
-   superuser. To create a publication that publishes all tables, all tables in
-   schema, or all sequences automatically, the user must be a superuser.
+   To create a publication using <literal>FOR ALL TABLES</literal>,
+   <literal>FOR ALL SEQUENCES</literal> or
+   <literal>FOR TABLES IN SCHEMA</literal>, the user must be a superuser. To add
+   <literal>ALL TABLES</literal> or <literal>TABLES IN SCHEMA</literal> to a
+   publication, the user must be a superuser. To add tables to a publication,
+   the user must have ownership rights on the table.
   </para>

This is a good improvement, but I was not sure why it is in this
patch. Should it be a separate thread for a docs improvement?

======
src/backend/catalog/pg_publication.c

GetTopMostAncestorInPublication:

3.
{
  Oid ancestor = lfirst_oid(lc);
- List    *apubids = GetRelationPublications(ancestor);
- List    *aschemaPubids = NIL;
+ List    *apubids = NIL;
+ List    *aexceptpubids = NIL;
+ List    *aschemapubids = NIL;
+ bool set_top = false;
+
+ GetRelationPublications(ancestor, &apubids, &aexceptpubids);

  level++;

- if (list_member_oid(apubids, puboid))
+ /* check if member of table publications */
+ set_top = list_member_oid(apubids, puboid);
+ if (!set_top)
  {
- topmost_relid = ancestor;
+ aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));

- if (ancestor_level)
- *ancestor_level = level;
+ /* check if member of schema publications */
+ set_top = list_member_oid(aschemapubids, puboid);
+
+ /*
+ * If the publication is all tables publication and the table is
+ * not part of exception tables.
+ */
+ if (!set_top && puballtables)
+ set_top = !list_member_oid(aexceptpubids, puboid);
  }
- else
+
+ if (set_top)
  {
- aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
- if (list_member_oid(aschemaPubids, puboid))
- {
- topmost_relid = ancestor;
+ topmost_relid = ancestor;

- if (ancestor_level)
- *ancestor_level = level;
- }
+ if (ancestor_level)
+ *ancestor_level = level;
  }

  list_free(apubids);
- list_free(aschemaPubids);
+ list_free(aschemapubids);
+ list_free(aexceptpubids);
  }

That 'aschemapubids' can be declared and freed within the if block.
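
As a rough sketch of what this narrower scoping could look like, the self-contained C example below reproduces the membership check with the schema list declared and freed inside the only block that uses it. Everything here is an assumption made for illustration: `OidList` and its helpers are a toy replacement for PostgreSQL's List API, and `toy_schema_publications` is a hypothetical stand-in for GetSchemaPublications() that always reports publication 200.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;

/* Toy OID list; a stand-in for PostgreSQL's List, not the real API. */
typedef struct OidList
{
	size_t		len;
	Oid		   *items;
} OidList;

OidList *
oidlist_make(const Oid *src, size_t len)
{
	OidList    *l = malloc(sizeof(OidList));

	if (l == NULL)
		abort();
	l->len = len;
	l->items = len ? malloc(len * sizeof(Oid)) : NULL;
	if (len && l->items == NULL)
		abort();
	for (size_t i = 0; i < len; i++)
		l->items[i] = src[i];
	return l;
}

void
oidlist_free(OidList *l)
{
	if (l != NULL)
	{
		free(l->items);
		free(l);
	}
}

bool
oidlist_member(const OidList *l, Oid oid)
{
	for (size_t i = 0; i < l->len; i++)
		if (l->items[i] == oid)
			return true;
	return false;
}

/* Hypothetical schema lookup: the schema is published by publication 200. */
OidList *
toy_schema_publications(void)
{
	Oid			pub = 200;

	return oidlist_make(&pub, 1);
}

/* Decide whether an ancestor counts as published by 'puboid'. */
bool
ancestor_in_publication(const OidList *apubids, const OidList *aexceptpubids,
						Oid puboid, bool puballtables)
{
	/* check if member of table publications */
	bool		set_top = oidlist_member(apubids, puboid);

	if (!set_top)
	{
		/* Declared, used, and freed inside the only block that needs it. */
		OidList    *aschemapubids = toy_schema_publications();

		/* check if member of schema publications */
		set_top = oidlist_member(aschemapubids, puboid);
		oidlist_free(aschemapubids);

		/* all-tables publication and the table is not in the EXCEPT list */
		if (!set_top && puballtables)
			set_top = !oidlist_member(aexceptpubids, puboid);
	}

	return set_top;
}
```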

~~~

publication_add_relation:

4.
+ /*
+ * Check when a partition is excluded via EXCEPT TABLE while the
+ * publication has publish_via_partition_root = true.
+ */
+ if (pub->alltables && pub->pubviaroot && pri->except &&
+ targetrel->rd_rel->relispartition)
+ ereport(WARNING,


This comment doesn't sound quite right:

SUGGESTION
Handle the case where a partition is excluded by EXCEPT TABLE while
publish_via_partition_root = true.

~~~

5.
+ /*
+ * Check when a partitioned table is excluded via EXCEPT TABLE while the
+ * publication has publish_via_partition_root = false.
+ */
+ if (pub->alltables && !pub->pubviaroot && pri->except &&
+ targetrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ ereport(WARNING,

Ditto. Reword like suggested in the previous review comment.

~~~

6.
+/*
+ * Get the list of publication oids associated with a specified relation.
+ * pubids is filled with the list of publication oids the relation is part of.
+ * except_pubids is filled with the list of publication oids the relation is
+ * excluded from.
+ *
+ * This function returns true if the relation is part of any publication.
+ */

Maybe putting 'pubids' and 'except_pubids' in single quotes will help
readability of this comment?

Also, these are already Lists, so they are not filled with lists.

SUGGESTION
Parameter 'pubids' returns the OIDs of the publications the relation is part of.
Parameter 'except_pubids' returns the OIDs of publications the
relation is excluded from.

~~~

GetPublicationRelations:

7.
 /*
- * Gets list of relation oids for a publication.
+ * Return the list of relation OIDs for a publication.
+ *
+ * For a FOR ALL TABLES publication, this returns the list of tables that were
+ * explicitly excluded via an EXCEPT TABLE clause.
+ *
+ * For a FOR TABLE publication, this returns the list of tables explicitly
+ * included in the publication.
  *
- * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
- * should use GetAllPublicationRelations().
+ * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
+ * GetAllPublicationRelations() to obtain the complete set of tables covered by
+ * the publication.
  */
 List *
 GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)

7a.
The function is called 'GetPublicationRelations', so it seems
unintuitive that it sometimes returns the list of all the tables that
are *excluded* from the publication. If you are going to have one
single function that does everything, then IMO it might be better to
hide that behind some wrapper functions like:
GetPublicationMemberRelations
GetPublicationExcludedRelations

Consider also that all these assumptions might be OK today but they
won't be OK in the future. e.g. One day, when named FOR SEQUENCE
sq1,sq2 are supported then you will be able to write a command like
FOR ALL TABLES EXCEPT (t1), FOR SEQUENCE sq1,sq2. That's going to be a
muddle of some included and some excluded relations. So, it is better
to cater for that scenario now, rather than have to rewrite all of
this function again in the future. e.g. Maybe instead of this function
returning one list it is better to return included/excluded Lists of
relations as output parameters?

~

7b.
Also, comments like "Publications declared with FOR ALL TABLES or FOR
ALL SEQUENCES should use..." seems like too many assumptions are being
made. It would be better to enforce the calling requirements using
parameter checking and Asserts instead of hoping that callers
are going to abide by the comments.
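
To make 7a and 7b more concrete, here is a small, self-contained C sketch of two thin wrappers over one shared worker, with the calling requirement of the "excluded" variant enforced by an assertion rather than by a comment. The names (`PubRelEntry`, `get_publication_relations_internal`, `get_publication_member_relations`, `get_publication_excluded_relations`) and the flat array used in place of the catalog are hypothetical; this is only one possible shape, not what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical publication/relation mapping row: one entry per listed table. */
typedef struct PubRelEntry
{
	Oid			pubid;
	Oid			relid;
	bool		except;			/* true if named in an EXCEPT list */
} PubRelEntry;

/* Shared worker: copy matching relation OIDs into 'out', return the count. */
static size_t
get_publication_relations_internal(const PubRelEntry *entries, size_t nentries,
								   Oid pubid, bool want_except,
								   Oid *out, size_t outcap)
{
	size_t		n = 0;

	for (size_t i = 0; i < nentries; i++)
	{
		if (entries[i].pubid == pubid && entries[i].except == want_except)
		{
			assert(n < outcap);
			out[n++] = entries[i].relid;
		}
	}
	return n;
}

/* Relations explicitly included in the publication. */
size_t
get_publication_member_relations(const PubRelEntry *entries, size_t nentries,
								 Oid pubid, Oid *out, size_t outcap)
{
	return get_publication_relations_internal(entries, nentries, pubid,
											  false, out, outcap);
}

/* Relations explicitly excluded from a FOR ALL TABLES publication. */
size_t
get_publication_excluded_relations(const PubRelEntry *entries, size_t nentries,
								   Oid pubid, bool puballtables,
								   Oid *out, size_t outcap)
{
	/* An EXCEPT list is only meaningful for FOR ALL TABLES publications. */
	assert(puballtables);

	return get_publication_relations_internal(entries, nentries, pubid,
											  true, out, outcap);
}
```

Keeping the included and excluded lists behind separate entry points means a future publication that mixes both kinds of entries would not require callers to guess which kind a single returned list contains.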

~~~

GetAllPublicationRelations:

8.
+ exceptlist = GetPublicationRelations(pubid, pubviaroot ?
+ PUBLICATION_PART_ALL :
+ PUBLICATION_PART_ROOT);

This is similar to the above review comment. I'm not sure how you can
just assume that this must be the "except list" -- AFAICT this assumes
that 'GetAllPublicationRelations' can only be called by FOR ALL TABLES
(???). Seems like a lot of assumptions, that would be much better to
be enforced by Asserts in the code.

======
src/backend/commands/publicationcmds.c

pub_rf_contains_invalid_column:

9.
 bool
 pub_rf_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
-    bool pubviaroot)
+    bool pubviaroot, bool puballtables)

I felt that 'puballtables' is more "important" than 'pubviaroot' so
maybe it should come earlier in the parameter list. (e.g. make it more
similar to 'pub_contains_invalid_column')

======
src/backend/parser/gram.y

10.
+ pub_except_obj_list opt_except_clause

I felt that 'opt_except_clause' should better be called
'opt_pub_except_clause' or 'pub_opt_except_clause' because without
'pub' it is a bit vague.

~~~

11.
+/*
+ * ALL TABLES EXCEPT ( table_name [, ...] ) specification
+ */

11a
This comment should be up where all the other CREATE PUBLICATION
syntax is commented.

~

11b.
Also, there is a missing optional "[TABLE]" part.

~~~

12.
+pub_except_obj_list: PublicationExceptObjSpec
+ { $$ = list_make1($1); }
+ | pub_except_obj_list ',' PublicationExceptObjSpec
+ { $$ = lappend($1, $3); }
+ ;
+
+opt_except_clause:
+ EXCEPT opt_table '(' pub_except_obj_list ')' { $$ = $4; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;

I felt the clause should be defined before the obj list because that
seems the natural order to read these.

======
src/bin/pg_dump/pg_dump.c

13.
+static SimplePtrList exceptinfo = {NULL, NULL};

Having this as global seems a bit hacky. It has nothing in common with
all the other nearby lists, which are commented as being based on
"patterns given by command-line switches"

~~~

dumpPublication:

14.
+ /* Include exception tables if the publication has EXCEPT TABLEs */
+ for (SimplePtrListCell *cell = exceptinfo.head; cell; cell = cell->next)
+ {
+ PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
+ TableInfo  *tbinfo;
+
+ if (pubinfo == pubrinfo->publication)
+ {
+ tbinfo = pubrinfo->pubtable;

That 'tbinfo' can be declared within the "if".

~~~

15.
+ appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));

ONLY is not the default. How did you decide that "ONLY" is the correct
thing to do here?

~~~

getPublicationTables:

16.
- pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
+ if (prexcept)
+ pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
+ else
+ pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
+

Would a single assignment (ternary) make this code simpler and easier to read?

SUGGESTION
pubrinfo[j].dobj.objType = prexcept ?
  DO_PUBLICATION_EXCEPT_REL :
  DO_PUBLICATION_REL;

======
src/bin/pg_dump/t/002_pg_dump.pl

17.
+ 'CREATE PUBLICATION pub10' => {
+ create_order => 50,
+ create_sql =>
+   'CREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE (dump_test.test_table_generated);',
+ regexp => qr/^
+ \QCREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE (ONLY dump_test.test_table_generated, ONLY dump_test.test_table_generated_child2, ONLY dump_test.test_table_generated_child1) WITH (publish = 'insert, update, delete, truncate');\E
+ /xm,
+ like => { %full_runs, section_post_data => 1, },
+ },
+

These "generated" names seem unusual. I saw there are some other
tables like 'dump_test.test_inheritance_child' and
'dump_test.test_inheritance_parent'. Can you use those more normal
table names instead?

Also curious - does the order of the tests matter? I saw that the
CREATE TABLE tests seem to be coming after the CREATE PUBLICATION
tests that are using them.

~~~
18.
- if (!defined($tests{$test}->{all_runs})
+ if (   !defined($tests{$test}->{all_runs})

Why add this whitespace?

======
src/include/nodes/parsenodes.h

19.
  AP_SetObjects, /* set list of objects */
+ AP_Reset, /* reset the publication */
 } AlterPublicationAction;

AFAIK, you removed all ALTER command changes from v30-0001. So this
should not be here.

~~~

20.
+ bool for_all_tables; /* Special publication for all tables in db */
  AlterPublicationAction action; /* What action to perform with the given
  * objects */
 } AlterPublicationStmt;

AFAIK, you removed all ALTER command changes from v30-0001. So this
should not be here.

======
src/test/regress/sql/publication.sql

21.
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_foralltables_excepttable FOR ALL TABLES EXCEPT TABLE (testpub_tbl1, testpub_tbl2);
+-- specify EXCEPT without TABLE
+CREATE PUBLICATION testpub_foralltables_excepttable1 FOR ALL TABLES EXCEPT (testpub_tbl1);

Should be 2 comments here for the 2x CREATE:

# Exclude tables using FOR ALL TABLES EXCEPT TABLE (tablelist)

# Exclude tables using FOR ALL TABLES EXCEPT (tablelist)

~~~

22.
 CREATE TABLE testpub_tbl3 (a int);
 CREATE TABLE testpub_tbl3a (b text) INHERITS (testpub_tbl3);

If you rename these tables like 'testpub_tbl_parent' and
'testpub_tbl_child', it will be much easier to see what is going on.

~~~

23.
+CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);

Missing comment -- something like:
# Exclude parent table, omitting both of 'ONLY' and '*'

~~~

24.
+-- EXCEPT with wildcard: exclude table and all descendants
+CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3*);

24a.
TBH, I don't think this is a "wildcard" -- it is not doing any pattern
matching. IMO just call it an "asterisk" or a "star".

~

24b.
And put a space before the '*' here.

======
.../t/037_rep_changes_except_table.pl

25.
+# ============================================
+# EXCEPT TABLE test cases for partition tables
+# ============================================
+# Check behavior of EXCEPT TABLE together with publish_via_partition_root
+# when applied to a partitioned table and its partitions.


Really, that "Check behavior" sentence is generic for all of the
following tests, so it should also be within the "=======" block of the
previous comment.

~~~

26.
+$node_publisher->safe_psql(
+ 'postgres', qq(
+ CREATE TABLE sch1.t1(a int) PARTITION BY RANGE(a);
+ CREATE TABLE sch1.part1 PARTITION OF sch1.t1 FOR VALUES FROM (0) TO (5);
+ CREATE TABLE sch1.part2 PARTITION OF sch1.t1 FOR VALUES FROM (6) TO (10);
+ INSERT INTO sch1.t1 VALUES (1), (6);
+));
+
+$node_subscriber->safe_psql(
+ 'postgres', qq(
+ CREATE TABLE sch1.t1(a int);
+ CREATE TABLE sch1.part1(a int);
+ CREATE TABLE sch1.part2(a int);
+));

26a.
There should be a comment for this part that just says something like
"Setup partition table and partitions on the publisher that map to
normal tables on the subscriber"

~

26b.
The INSERT should be done later, after the CREATE PUBLICATION but
before the CREATE SUBSCRIPTION. The pattern will be the same for all
the test cases.

~~~

27.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.part1)"
+);

Even though the publish_via_partition_root is 'false' by default, I
think you should spell it out explicitly here for clarity.

~~~

28.
+# EXCEPT TABLE (sch1.t1) with publish_via_partition_root = false
+# Excluding the partitioned table while publish_via_partition_root = false
+# still allows rows inserted into its partitions to be replicated.

I felt you should word this differently. I don't think you should say
"inserted into its partitions" because actually, you inserted into the
partition table, and the data just ends up in the partitions.

~~~

29.
+$node_publisher->safe_psql(
+ 'postgres', qq(
+ CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.t1);
+ INSERT INTO sch1.t1 VALUES (1), (6);
+));

Ditto earlier comment. Better to explicitly say
"publish_via_partition_root=false".

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

Peter Smith

Date:

11 December 2025, 02:01:13

On Wed, Dec 10, 2025 at 4:49 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 24 Nov 2025 at 13:03, Peter Smith <smithpb2250@gmail.com> wrote:
> >
...
> > 21.
> > I was wondering if the "describe" for tables (e.g. \d+) should also
> > show the publications where the table is an EXCEPT TABLE? How else is
> > the user going to know it has been excluded by some publication?
> >
> I thought it would be sufficient to show only the list of
> publications, the table is part of.
> Users can check the excluded tables by checking the description of the
> publication using \dRp+.
> Will it not be sufficient?
> I am not sure why we should show a list of publications which it is not part of.
> Am I missing something? Thoughts?

For this comment, I was imagining a scenario where there are dozens of
publications, and the user is wondering why their table is not being
replicated to the subscriber like they expected it would be.

Yes, they could use \dRp+ to identify the publications excluding it,
but that will be quite painful if there are very many publications
they have to check. IIUC, there is no other way to check it without
digging into System Catalogs.

That's why I thought it might be useful if the \d+ could also show
publications where the table was named in an EXCEPT TABLE clause.

======
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

16 December 2025, 12:20:41

On Thu, 11 Dec 2025 at 04:10, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok -
>
> Here are some review comments for v31-0001 (EXCEPT (tablelist))
>
> ======
> Commit message
>
> 1.
> The new syntax allows specifying excluded relations when creating or altering
> a publication. For example:
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE (t1,t2);
>
> ~
>
> In v30, you removed all the ALTER PUBLICATION changes, so the "or
> altering" in the message above also needs to be removed.
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>    <para>
> -   To add tables to a publication, the user must have ownership rights on the
> -   table. To add all tables in schema to a publication, the user must be a
> -   superuser. To create a publication that publishes all tables, all tables in
> -   schema, or all sequences automatically, the user must be a superuser.
> +   To create a publication using <literal>FOR ALL TABLES</literal>,
> +   <literal>FOR ALL SEQUENCES</literal> or
> +   <literal>FOR TABLES IN SCHEMA</literal>, the user must be a superuser. To add
> +   <literal>ALL TABLES</literal> or <literal>TABLES IN SCHEMA</literal> to a
> +   publication, the user must be a superuser. To add tables to a publication,
> +   the user must have ownership rights on the table.
>    </para>
>
> This is a good improvement, but I was not sure why it is in this
> patch. Should it be a separate thread for a docs improvement?
>
> ======
> src/backend/catalog/pg_publication.c
>
> GetTopMostAncestorInPublication:
>
> 3.
> {
>   Oid ancestor = lfirst_oid(lc);
> - List    *apubids = GetRelationPublications(ancestor);
> - List    *aschemaPubids = NIL;
> + List    *apubids = NIL;
> + List    *aexceptpubids = NIL;
> + List    *aschemapubids = NIL;
> + bool set_top = false;
> +
> + GetRelationPublications(ancestor, &apubids, &aexceptpubids);
>
>   level++;
>
> - if (list_member_oid(apubids, puboid))
> + /* check if member of table publications */
> + set_top = list_member_oid(apubids, puboid);
> + if (!set_top)
>   {
> - topmost_relid = ancestor;
> + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
>
> - if (ancestor_level)
> - *ancestor_level = level;
> + /* check if member of schema publications */
> + set_top = list_member_oid(aschemapubids, puboid);
> +
> + /*
> + * If the publication is all tables publication and the table is
> + * not part of exception tables.
> + */
> + if (!set_top && puballtables)
> + set_top = !list_member_oid(aexceptpubids, puboid);
>   }
> - else
> +
> + if (set_top)
>   {
> - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> - if (list_member_oid(aschemaPubids, puboid))
> - {
> - topmost_relid = ancestor;
> + topmost_relid = ancestor;
>
> - if (ancestor_level)
> - *ancestor_level = level;
> - }
> + if (ancestor_level)
> + *ancestor_level = level;
>   }
>
>   list_free(apubids);
> - list_free(aschemaPubids);
> + list_free(aschemapubids);
> + list_free(aexceptpubids);
>   }
>
> That 'aschemapubids' can be declared and freed within the if block.
>
> ~~~
>
> publication_add_relation:
>
> 4.
> + /*
> + * Check when a partition is excluded via EXCEPT TABLE while the
> + * publication has publish_via_partition_root = true.
> + */
> + if (pub->alltables && pub->pubviaroot && pri->except &&
> + targetrel->rd_rel->relispartition)
> + ereport(WARNING,
>
>
> This comment doesn't sound quite right:
>
> SUGGESTION
> Handle the case where a partition is excluded by EXCEPT TABLE while
> publish_via_partition_root = true.
>
> ~~~
>
> 5.
> + /*
> + * Check when a partitioned table is excluded via EXCEPT TABLE while the
> + * publication has publish_via_partition_root = false.
> + */
> + if (pub->alltables && !pub->pubviaroot && pri->except &&
> + targetrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> + ereport(WARNING,
>
> Ditto. Reword like suggested in the previous review comment.
>
> ~~~
>
> 6.
> +/*
> + * Get the list of publication oids associated with a specified relation.
> + * pubids is filled with the list of publication oids the relation is part of.
> + * except_pubids is filled with the list of publication oids the relation is
> + * excluded from.
> + *
> + * This function returns true if the relation is part of any publication.
> + */
>
> Maybe putting 'pubids' and 'except_pubids' in single quotes will help
> readability of this comment?
>
> Also, these are already Lists, so they are not filled with lists.
>
> SUGGESTION
> Parameter 'pubids' returns the OIDs of the publications the relation is part of.
> Parameter 'except_pubids' returns the OIDs of publications the
> relation is excluded from.
>
> ~~~
>
> GetPublicationRelations:
>
> 7.
>  /*
> - * Gets list of relation oids for a publication.
> + * Return the list of relation OIDs for a publication.
> + *
> + * For a FOR ALL TABLES publication, this returns the list of tables that were
> + * explicitly excluded via an EXCEPT TABLE clause.
> + *
> + * For a FOR TABLE publication, this returns the list of tables explicitly
> + * included in the publication.
>   *
> - * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
> - * should use GetAllPublicationRelations().
> + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
> + * GetAllPublicationRelations() to obtain the complete set of tables covered by
> + * the publication.
>   */
>  List *
>  GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
>
> 7a.
> The function is called 'GetPublicationRelations', so it seems
> unintuitive that it sometimes returns the list of all the tables that
> are *excluded* from the publication. If you are going to have one
> single function that does everything, then IMO it might be better to
> hide that behind some wrapper functions like:
> GetPublicationMemberRelations
> GetPublicationExcludedRelations
>
> Consider also that all these assumptions might be OK today but they
> won't be OK in the future. e.g. One day, when named FOR SEQUENCE
> sq1,sq2 are supported then you will be able to write a command like
> FOR ALL TABLES EXCEPT (t1), FOR SEQUENCE sq1,sq2. That's going to be a
> muddle of some included and some excluded relations. So, it is better
> to cater for that scenario now, rather than have to rewrite all of
> this function again in the future. e.g. Maybe instead of this function
> returning one list it is better to return included/excluded Lists or
> relations as output parameters?
>
> ~
>
> 7b.
> Also, comments like "Publications declared with FOR ALL TABLES or FOR
> ALL SEQUENCES should use..." seems like too many assumptions are being
> made. It would be better to enforce the calling requirements using
> parameter checking and Asserts instead of hoping that callers
> are going to abide by the comments.
>
> ~~~
>
> GetAllPublicationRelations:
>
> 8.
> + exceptlist = GetPublicationRelations(pubid, pubviaroot ?
> + PUBLICATION_PART_ALL :
> + PUBLICATION_PART_ROOT);
>
> This is similar to the above review comment. I'm not sure how you can
> just assume that this must be the "except list" -- AFAICT this assumes
> that 'GetAllPublicationRelations' can only be called by FOR ALL TABLES
> (???). Seems like a lot of assumptions, that would be much better to
> be enforced by Asserts in the code.
>
I agree with comments 7 and 8. I have added two functions,
'GetPublicationIncludedRelations' and
'GetPublicationExcludedRelations', which return the relations that are
included in or excluded from a publication, respectively.
Both functions call 'GetPublicationRelationsInternal'. I have also
reintroduced the 'except_flag' variable.
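
A minimal standalone sketch of the wrapper arrangement described above,
not the patch code: the struct, the toy catalog array, and the
lower-case function names are hypothetical stand-ins, and the real
functions work on PostgreSQL Lists and catalog scans rather than plain
arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_publication_rel row (not the real struct). */
typedef struct PubRelRow
{
	unsigned int pubid;		/* publication OID */
	unsigned int relid;		/* relation OID */
	bool		prexcept;	/* true if the table was named in EXCEPT TABLE */
} PubRelRow;

/* Toy catalog contents, for illustration only. */
static const PubRelRow catalog[] = {
	{1, 100, false},
	{1, 101, false},
	{2, 200, true},
	{2, 201, true},
};

/*
 * Shared worker: copy into 'out' the relids of the rows that belong to
 * 'pubid' and whose prexcept matches 'except_flag'.  Returns the number
 * of relids stored (at most 'outlen').
 */
size_t
get_publication_relations_internal(unsigned int pubid, bool except_flag,
								   unsigned int *out, size_t outlen)
{
	size_t		n = 0;

	for (size_t i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
	{
		if (catalog[i].pubid == pubid &&
			catalog[i].prexcept == except_flag &&
			n < outlen)
			out[n++] = catalog[i].relid;
	}
	return n;
}

/* Wrapper: relations explicitly included in the publication. */
size_t
get_publication_included_relations(unsigned int pubid,
								   unsigned int *out, size_t outlen)
{
	return get_publication_relations_internal(pubid, false, out, outlen);
}

/* Wrapper: relations explicitly excluded from the publication. */
size_t
get_publication_excluded_relations(unsigned int pubid,
								   unsigned int *out, size_t outlen)
{
	return get_publication_relations_internal(pubid, true, out, outlen);
}
```

With this shape, callers state which list they want through the
function name, and the flag stays private to the worker.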

> ======
> src/backend/commands/publicationcmds.c
>
> pub_rf_contains_invalid_column:
>
> 9.
>  bool
>  pub_rf_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
> -    bool pubviaroot)
> +    bool pubviaroot, bool puballtables)
>
> I felt that 'puballtables' is more "important" than 'pubviaroot' so
> maybe it should come earlier in the parameter list. (e.g. make it more
> similar to 'pub_contains_invalid_column')
>
> ======
> src/backend/parser/gram.y
>
> 10.
> + pub_except_obj_list opt_except_clause
>
> I felt that 'opt_except_clause' should better be called
> 'opt_pub_except_clause' or 'pub_opt_except_clause' because without
> 'pub' it is a bit vague.
>
I agree. I prefer 'opt_pub_except_clause'. Looking at the other
variables, it makes more sense to start the name with 'opt_',
as that indicates the clause is optional.
I have made the change.

> ~~~
>
> 11.
> +/*
> + * ALL TABLES EXCEPT ( table_name [, ...] ) specification
> + */
>
> 11a
> This comment should be up where all the other CREATE PUBLICATION
> syntax is commented.
>
> ~
>
> 11b.
> Also, there is a missing optional "[TABLE]" part.
>
> ~~~
>
> 12.
> +pub_except_obj_list: PublicationExceptObjSpec
> + { $$ = list_make1($1); }
> + | pub_except_obj_list ',' PublicationExceptObjSpec
> + { $$ = lappend($1, $3); }
> + ;
> +
> +opt_except_clause:
> + EXCEPT opt_table '(' pub_except_obj_list ')' { $$ = $4; }
> + | /*EMPTY*/ { $$ = NIL; }
> + ;
>
> I felt the clause should be defined before the obj list because that
> seems the natural order to read these.
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 13.
> +static SimplePtrList exceptinfo = {NULL, NULL};
>
> Having this as global seems a bit hacky. It has nothing in common with
> all the other nearby lists, which are commented as being based on
> "patterns given by command-line switches"
>
I agree, I have added it in the PublicationInfo struct and made the
corresponding code changes.

> ~~~
>
> dumpPublication:
>
> 14.
> + /* Include exception tables if the publication has EXCEPT TABLEs */
> + for (SimplePtrListCell *cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + TableInfo  *tbinfo;
> +
> + if (pubinfo == pubrinfo->publication)
> + {
> + tbinfo = pubrinfo->pubtable;
>
> That 'tbinfo' can be declared within the "if".
>
> ~~~
>
> 15.
> + appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
>
> ONLY is not the default. How did you decide that "ONLY" is the correct
> thing to do here?
>
In pg_dump, we use "ONLY" by default when specifying a table for a publication.

We do something similar for ALTER PUBLICATION:
```
  appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
            fmtId(pubinfo->dobj.name));
```

Also, if we specify a parent table in a publication (without ONLY), all
its child tables are also added to the pg_publication_rel table.
So when we dump such a publication we get something like:
.... EXCEPT TABLE (ONLY parent_table, ONLY child_table)...

> ~~~
>
> getPublicationTables:
>
> 16.
> - pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> + if (prexcept)
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> +
>
> Would a single assignment (ternary) make this code simpler and easier to read?
>
> SUGGESTION
> pubrinfo[j].dobj.objType = prexcept ?
>   DO_PUBLICATION_EXCEPT_REL :
>   DO_PUBLICATION_REL;
>
> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 17.
> + 'CREATE PUBLICATION pub10' => {
> + create_order => 50,
> + create_sql =>
> +   'CREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE
> (dump_test.test_table_generated);',
> + regexp => qr/^
> + \QCREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE (ONLY
> dump_test.test_table_generated, ONLY
> dump_test.test_table_generated_child2, ONLY
> dump_test.test_table_generated_child1) WITH (publish = 'insert,
> update, delete, truncate');\E
> + /xm,
> + like => { %full_runs, section_post_data => 1, },
> + },
> +
>
> These "generated" names seem unusual. I saw there are some other
> tables like 'dump_test.test_inheritance_child' and
> 'dump_test.test_inheritance_parent'. Can you use those more normal
> table names instead?
>
> Also curious - does the order of the tests matter? I saw that the
> CREATE TABLE tests seem to be coming after the CREATE PUBLICATION
> tests that are using them.
>
I looked into it and found that this is controlled by the
'create_order' value specified for each test.
Tests with a lower create_order value are executed earlier.
So, to ensure 'CREATE PUBLICATION' runs correctly, we have to make sure
the 'create_order' of these statements is higher than that of the
respective 'CREATE TABLE' statements.

> ~~~
> 18.
> - if (!defined($tests{$test}->{all_runs})
> + if (   !defined($tests{$test}->{all_runs})
>
> Why add this whitespace?
>
pgperltidy makes this change. I have reverted it.

> ======
> src/include/nodes/parsenodes.h
>
> 19.
>   AP_SetObjects, /* set list of objects */
> + AP_Reset, /* reset the publication */
>  } AlterPublicationAction;
>
> AFAIK, you removed all ALTER command changes from v30-0001. So this
> should not be here.
>
> ~~~
>
> 20.
> + bool for_all_tables; /* Special publication for all tables in db */
>   AlterPublicationAction action; /* What action to perform with the given
>   * objects */
>  } AlterPublicationStmt;
>
> AFAIK, you removed all ALTER command changes from v30-0001. So this
> should not be here.
>
> ======
> src/test/regress/sql/publication.sql
>
> 21.
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub_foralltables_excepttable FOR ALL TABLES
> EXCEPT TABLE (testpub_tbl1, testpub_tbl2);
> +-- specify EXCEPT without TABLE
> +CREATE PUBLICATION testpub_foralltables_excepttable1 FOR ALL TABLES
> EXCEPT (testpub_tbl1);
>
> Should be 2 comments here for the 2x CREATE:
>
> # Exclude tables using FOR ALL TABLES EXCEPT TABLE (tablelist)
>
> # Exclude tables using FOR ALL TABLES EXCEPT (tablelist)
>
> ~~~
>
> 22.
>  CREATE TABLE testpub_tbl3 (a int);
>  CREATE TABLE testpub_tbl3a (b text) INHERITS (testpub_tbl3);
>
> If you rename these tables like 'testpub_tbl_parent' and
> 'testpub_tbl_child', it will be much easier to see what is going on.
>
> ~~~
>
> 23.
> +CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
>
> Missing comment -- something like:
> # Exclude parent table, omitting both of 'ONLY' and '*'
>
> ~~~
>
> 24.
> +-- EXCEPT with wildcard: exclude table and all descendants
> +CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3*);
>
> 24a.
> TBH, I don't think this is a "wildcard" -- it is not doing any pattern
> matching. IMO just call it an "asterisk" or a "star".
>
> ~
>
> 24b.
> And put a space before the '*' here.
>
> ======
> .../t/037_rep_changes_except_table.pl
>
> 25.
> +# ============================================
> +# EXCEPT TABLE test cases for partition tables
> +# ============================================
> +# Check behavior of EXCEPT TABLE together with publish_via_partition_root
> +# when applied to a partitioned table and its partitions.
>
>
> Really, that "Check behavior" sentence is generic for all of the
> following tests, so it should also be (within the "=======" of the
> previous comment)
>
> ~~~
>
> 26.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE TABLE sch1.t1(a int) PARTITION BY RANGE(a);
> + CREATE TABLE sch1.part1 PARTITION OF sch1.t1 FOR VALUES FROM (0) TO (5);
> + CREATE TABLE sch1.part2 PARTITION OF sch1.t1 FOR VALUES FROM (6) TO (10);
> + INSERT INTO sch1.t1 VALUES (1), (6);
> +));
> +
> +$node_subscriber->safe_psql(
> + 'postgres', qq(
> + CREATE TABLE sch1.t1(a int);
> + CREATE TABLE sch1.part1(a int);
> + CREATE TABLE sch1.part2(a int);
> +));
>
> 26a.
> There should be a comment for this part that just says something like
> "Setup partition table and partitions on the publisher that map to
> normal tables on the subscriber"
>
> ~
>
> 26b.
> The INSERT should be done later, after the CREATE PUBLICATION but
> before the CREATE SUBSCRIPTION. The pattern will be the same for all
> the test cases.
>
> ~~~
>
> 27.
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.part1)"
> +);
>
> Even though the publish_via_partition_root is 'false' by default, I
> think you should spell it out explicitly here for clarity.
>
> ~~~
>
> 28.
> +# EXCEPT TABLE (sch1.t1) with publish_via_partition_root = false
> +# Excluding the partitioned table while publish_via_partition_root = false
> +# still allows rows inserted into its partitions to be replicated.
>
> I felt you should word this differently. I don't think you should say
> "inserted into its partitions" because actually, you inserted into the
> partition table, and the data just ends up in the partitions.
>
> ~~~
>
> 29.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.t1);
> + INSERT INTO sch1.t1 VALUES (1), (6);
> +));
>
> Ditto earlier comment. Better to explicitly say
> "publish_via_partition_root=false".
>
I have also addressed the remaining comments and attached the latest patch.

Thanks,
Shlok Kyal

Attachments

v32-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

16 December 2025, 12:21:18

On Thu, 11 Dec 2025 at 04:31, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, Dec 10, 2025 at 4:49 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 24 Nov 2025 at 13:03, Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> ...
> > > 21.
> > > I was wondering if the "describe" for tables (e.g. \d+) should also
> > > show the publications where the table is an EXCEPT TABLE? How else is
> > > the user going to know it has been excluded by some publication?
> > >
> > I thought it would be sufficient to show only the list of
> > publications the table is part of.
> > Users can check the excluded tables by checking the description of the
> > publication using \dRp+.
> > Would that not be sufficient?
> > I am not sure why we should show a list of publications that the table
> > is not part of. Am I missing something? Thoughts?
>
> For this comment, I was imagining a scenario where there are dozens of
> publications, and the user is wondering why their table is not being
> replicated to the subscriber like they expected it would be.
>
> Yes, they could use \dRp+ to identify the publications excluding it,
> but that will be quite painful if there are very many publications
> they have to check. IIUC, there is no other way to check it without
> digging into System Catalogs.
>
> That's why I thought it might be useful if the \d+ could also show
> publications where the table was named in an EXCEPT TABLE clause.
>
I thought more about this point and agree it can be useful. I have added
the corresponding changes in the latest patch [1].

[1]: https://www.postgresql.org/message-id/CANhcyEWg2WbEW_fFwk0D3J2KBrUF7th6VrE%2BgvESgkUKP9VpZg%40mail.gmail.com

Thanks,
Shlok Kyal

Re: Skipping schema changes in publication

From

shveta malik

Date:

17 December 2025, 08:54:23

On Tue, Dec 16, 2025 at 2:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> I have also addressed the remaining comments and attached the latest patch.
>

Thanks. A few comments:

1)
+ if (!set_top && puballtables)
+ set_top = !list_member_oid(aexceptpubids, puboid);

In GetTopMostAncestorInPublication(), we have made the above change,
which will now get the ancestor from an all-tables publication as well,
provided the table is not part of the 'except' list. Earlier, this function
only checked pg_publication_rel and pg_publication_namespace, which
do not include all-tables publications. Won't it change the
result set for callers?

2)
+ * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
+ * GetAllPublicationRelations() to obtain the complete set of tables covered by
+ * the publication.
+ */
+List *
+GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
+{
+ return GetPublicationRelationsInternal(pubid, pub_partopt, false);
+}

We can have an Assert here that the pubid passed is not for an
ALL TABLES or ALL SEQUENCES publication.

3)
GetAllPublicationRelations:
 * If the publication publishes partition changes via their respective root
 * partitioned tables, we must exclude partitions in favor of including the
 * root partitioned tables. This is not applicable to FOR ALL SEQUENCES
 * publication.

+ * The list does not include relations that are explicitly excluded via the
+ * EXCEPT TABLE clause of the publication specified by pubid.

Suggestion:
/*
 * If the publication publishes partition changes via their respective root
 * partitioned tables, we must exclude partitions in favor of including the
 * root partitioned tables. The list also excludes tables that are
 * explicitly excluded via the EXCEPT TABLE clause of the publication
 * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
 * publications.
 */

4)
GetAllPublicationRelations:
+ if (relkind == RELKIND_RELATION)
+ exceptlist = GetPublicationExcludedRelations(pubid, pubviaroot ?
+ PUBLICATION_PART_ALL :
+ PUBLICATION_PART_ROOT);

  Assert(!(relkind == RELKIND_SEQUENCE && pubviaroot));

Generally we keep such parameters' sanity checks as the first step. We
can add new code after Assert.

5)
ObjectsInAllPublicationToOids() only has one caller which calls it
under check: 'if (stmt->for_all_tables)'

Thus IMO, we do not need a switch-case in
ObjectsInAllPublicationToOids(). We can simply have a sanity check to
see it is 'PUBLICATION_ALL_TABLES' and then do the needed operation
for this pub-type.

6)
CreatePublication():
/*
* If publication is for ALL TABLES and relations is not empty, it means
* that there are some relations to be excluded from the publication.
* Else, relations is the list of relations to be added to the
* publication.
*/

Shall we rephrase slightly to:

/*
 * If the publication is for ALL TABLES and 'relations' is not empty,
 * it indicates that some relations should be excluded from the publication.
 * Add those excluded relations to the publication with 'prexcept' set to true.
 * Otherwise, 'relations' contains the list of relations to be explicitly
 * included in the publication.
 */

7)
+ /* Associate objects with the publication. */
+ if (stmt->for_all_tables)
+ {
+ /* Invalidate relcache so that publication info is rebuilt. */
+ CacheInvalidateRelcacheAll();
+ }

I think this comment is misplaced. It belongs at the earlier place, atop:
if (stmt->for_all_tables)
Here we are only invalidating the cache, whereas the earlier place is
where the objects are associated.

thanks
Shveta

Re: Skipping schema changes in publication

From

shveta malik

Date:

18 December 2025, 09:00:30

On Wed, Dec 17, 2025 at 11:24 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 16, 2025 at 2:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > I have also addressed the remaining comments and attached the latest patch.
> >
>
> Thanks. A few comments:
>
> 1)
> + if (!set_top && puballtables)
> + set_top = !list_member_oid(aexceptpubids, puboid);
>
> In GetTopMostAncestorInPublication(), we have made the above change
> which will now get ancestor from all-tables publication as well,
> provided table is not part of 'except' List. Earlier this function was
> only checking pg_subscription_rel and pg_publication_namespace which
> does not include all-tables publication. Won't it change the
> result-set for callers?
>
> 2)
> + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
> + * GetAllPublicationRelations() to obtain the complete set of tables covered by
> + * the publication.
> + */
> +List *
> +GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
> +{
> + return GetPublicationRelationsInternal(pubid, pub_partopt, false);
> +}
>
> We can have an Assert here that pubid passed is not for ALL-Tables or
> ALL-sequences
>
> 3)
> GetAllPublicationRelations:
>  * If the publication publishes partition changes via their respective root
>  * partitioned tables, we must exclude partitions in favor of including the
>  * root partitioned tables. This is not applicable to FOR ALL SEQUENCES
>  * publication.
>
> + * The list does not include relations that are explicitly excluded via the
> + * EXCEPT TABLE clause of the publication specified by pubid.
>
> Suggestion:
> /*
>  * If the publication publishes partition changes via their respective root
>  * partitioned tables, we must exclude partitions in favor of including the
>  * root partitioned tables. The list also excludes tables that are
>  * explicitly excluded via the EXCEPT TABLE clause of the publication
>  * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
>  * publications.
>  */
>
> 4)
> GetAllPublicationRelations:
> + if (relkind == RELKIND_RELATION)
> + exceptlist = GetPublicationExcludedRelations(pubid, pubviaroot ?
> + PUBLICATION_PART_ALL :
> + PUBLICATION_PART_ROOT);
>
>   Assert(!(relkind == RELKIND_SEQUENCE && pubviaroot));
>
> Generally we keep such parameters' sanity checks as the first step. We
> can add new code after Assert.
>
> 5)
> ObjectsInAllPublicationToOids() only has one caller which calls it
> under check: 'if (stmt->for_all_tables)'
>
> Thus IMO, we do not need a switch-case in
> ObjectsInAllPublicationToOids(). We can simply have a sanity check to
> see it is 'PUBLICATION_ALL_TABLES' and then do the needed operation
> for this pub-type.
>
> 6)
> CreatePublication():
> /*
> * If publication is for ALL TABLES and relations is not empty, it means
> * that there are some relations to be excluded from the publication.
> * Else, relations is the list of relations to be added to the
> * publication.
> */
>
> Shall we rephrase slightly to:
>
> /*
>  * If the publication is for ALL TABLES and 'relations' is not empty,
>  * it indicates that some relations should be excluded from the publication.
>  * Add those excluded relations to the publication with 'prexcept' set to true.
>  * Otherwise, 'relations' contains the list of relations to be explicitly
>  * included in the publication.
>  */
>
> 7)
> + /* Associate objects with the publication. */
> + if (stmt->for_all_tables)
> + {
> + /* Invalidate relcache so that publication info is rebuilt. */
> + CacheInvalidateRelcacheAll();
> + }
>
> I think this comment is misplaced. We shall have it at previous place, atop:
> if (stmt->for_all_tables)
> This is because here we are just trying to invalidate cache while at
> previous place we are trying to associate.
>

Few more:

8)
get_rel_sync_entry()
+ List    *exceptTablePubids = NIL;

At all other places, we are using exceptpubids, shall we use the same here?

9)
ObjectsInPublicationToOids()

  case PUBLICATIONOBJ_TABLE:
+ case PUBLICATIONOBJ_EXCEPT_TABLE:
+ pubobj->pubtable->except = (pubobj->pubobjtype ==
PUBLICATIONOBJ_EXCEPT_TABLE);
  *rels = lappend(*rels, pubobj->pubtable);
  break;

It looks slightly odd that for pubobjtype case
'PUBLICATIONOBJ_EXCEPT_TABLE', we have to check pubobjtype against
PUBLICATIONOBJ_EXCEPT_TABLE itself.

Shall we make it:
case PUBLICATIONOBJ_EXCEPT_TABLE:
    pubobj->pubtable->except = true;
    /* fall through */
case PUBLICATIONOBJ_TABLE:
    *rels = lappend(*rels, pubobj->pubtable);
    break;

10)
I want to understand the usage of DO_PUBLICATION_EXCEPT_REL. Can you
give a scenario where its use in DOTypeNameCompare() will be hit?
All its other usages need some analysis and validation too.

11)
+ List    *except_objects; /* List of publication object to be excluded */

object --> objects
Currently since we exclude only tables, does it make sense to name it
as except_tables?

thanks
Shveta

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

18 December 2025, 14:45:06

On Thu, 18 Dec 2025 at 11:30, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Dec 17, 2025 at 11:24 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Dec 16, 2025 at 2:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > I have also addressed the remaining comments and attached the latest patch.
> > >
> >
> > Thanks. A few comments:
> >
> > 1)
> > + if (!set_top && puballtables)
> > + set_top = !list_member_oid(aexceptpubids, puboid);
> >
> > In GetTopMostAncestorInPublication(), we have made the above change
> > which will now get ancestor from all-tables publication as well,
> > provided table is not part of 'except' List. Earlier this function was
> > only checking pg_subscription_rel and pg_publication_namespace which
> > does not include all-tables publication. Won't it change the
> > result-set for callers?
> >
It can change the result set for callers. I analysed further and saw that
GetTopMostAncestorInPublication is called from 3 places.
1. pub_rf_contains_invalid_column: it is called only when the publication is
not ALL TABLES, so the change has no impact there.
2. pub_contains_invalid_column: it is called for all types of
publication. It calls GetTopMostAncestorInPublication like this:
```
if (pubviaroot && relation->rd_rel->relispartition)
{
    publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors,
                                                       NULL, puballtables);

    if (!OidIsValid(publish_as_relid))
        publish_as_relid = relid;
}
```
In HEAD, for an ALL TABLES publication, GetTopMostAncestorInPublication
always returns InvalidOid. With this patch it can return a valid OID,
so there is a difference in behaviour.

3. get_rel_sync_entry:
In HEAD we had:
```
if (pub->alltables)
{
    publish = true;
    if (pub->pubviaroot && am_partition)
    {
        List       *ancestors = get_partition_ancestors(relid);

        pub_relid = llast_oid(ancestors);
        ancestor_level = list_length(ancestors);
    }
}
```
With the patch this code is no longer valid, because we cannot set
'pub_relid = llast_oid(ancestors);' directly, as the table can be
excluded.
So the change in GetTopMostAncestorInPublication was also meant to
handle the "ALL TABLES" publication case.

Since there is a behaviour difference for the 2nd caller, I have
removed the 'ALL TABLES' changes from
GetTopMostAncestorInPublication and added that handling separately in
'get_rel_sync_entry'. Thoughts?
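
The fallback pattern quoted above for pub_contains_invalid_column can be
sketched as standalone C. This is only an illustration under stated
assumptions: the arrays replace PostgreSQL Lists and catalog lookups,
the helper names are hypothetical, and the ancestors are assumed to be
ordered nearest parent first, root last.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

#define InvalidOid		((Oid) 0)
#define OidIsValid(oid)	((oid) != InvalidOid)

/* Linear membership test, standing in for a list lookup. */
bool
oid_in_set(const Oid *set, size_t n, Oid oid)
{
	for (size_t i = 0; i < n; i++)
	{
		if (set[i] == oid)
			return true;
	}
	return false;
}

/*
 * Walk 'ancestors' (nearest parent first, root last) and remember the
 * last one that is a member of 'published'.  Returns InvalidOid when no
 * ancestor is published.
 */
Oid
topmost_published_ancestor(const Oid *ancestors, size_t nancestors,
						   const Oid *published, size_t npublished)
{
	Oid			topmost = InvalidOid;

	for (size_t i = 0; i < nancestors; i++)
	{
		if (oid_in_set(published, npublished, ancestors[i]))
			topmost = ancestors[i];
	}
	return topmost;
}

/*
 * Caller-side fallback: publish as the topmost published ancestor if
 * there is one, otherwise as the relation itself.
 */
Oid
choose_publish_as_relid(Oid relid,
						const Oid *ancestors, size_t nancestors,
						const Oid *published, size_t npublished)
{
	Oid			result = topmost_published_ancestor(ancestors, nancestors,
													published, npublished);

	if (!OidIsValid(result))
		result = relid;
	return result;
}
```

In this sketch, a lookup that always returns InvalidOid leaves the
caller publishing as the relation itself, so a lookup that starts
returning a valid OID changes the caller's result, which is the
behaviour difference described for the 2nd caller.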

> > 2)
> > + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
> > + * GetAllPublicationRelations() to obtain the complete set of tables covered by
> > + * the publication.
> > + */
> > +List *
> > +GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
> > +{
> > + return GetPublicationRelationsInternal(pubid, pub_partopt, false);
> > +}
> >
> > We can have an Assert here that pubid passed is not for ALL-Tables or
> > ALL-sequences
> >
Added an Assert for ALL TABLES. During testing I found that this function
can be called for ALL SEQUENCES in HEAD, so I have not added an
assertion for that.
I think it is a bug and have reported it in [1]. I will add an Assert for
ALL SEQUENCES as well once the bug is fixed.
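
A small sketch of the "sanity checks first" idea from comments 2 and 4,
assuming a simplified, hypothetical publication descriptor and the
standard C assert() in place of PostgreSQL's Assert() macro. It is not
the patch code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified publication descriptor (not PostgreSQL's struct). */
typedef struct PubDesc
{
	bool		alltables;		/* declared FOR ALL TABLES */
	bool		allsequences;	/* declared FOR ALL SEQUENCES */
	int			nincluded;		/* number of explicitly included relations */
} PubDesc;

int
included_relation_count(const PubDesc *pub)
{
	/* Enforce the calling contract up front, before doing any other work. */
	assert(pub != NULL);
	assert(!pub->alltables);

	/*
	 * Deliberately no assertion on allsequences here: per the reply
	 * above, that call path is reported to be reachable in HEAD.
	 */
	return pub->nincluded;
}
```

The point of the check is that a caller passing an ALL TABLES
publication fails immediately in an assert-enabled build instead of
relying on the function comment.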

> > 3)
> > GetAllPublicationRelations:
> >  * If the publication publishes partition changes via their respective root
> >  * partitioned tables, we must exclude partitions in favor of including the
> >  * root partitioned tables. This is not applicable to FOR ALL SEQUENCES
> >  * publication.
> >
> > + * The list does not include relations that are explicitly excluded via the
> > + * EXCEPT TABLE clause of the publication specified by pubid.
> >
> > Suggestion:
> > /*
> >  * If the publication publishes partition changes via their respective root
> >  * partitioned tables, we must exclude partitions in favor of including the
> >  * root partitioned tables. The list also excludes tables that are
> >  * explicitly excluded via the EXCEPT TABLE clause of the publication
> >  * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
> >  * publications.
> >  */
> >
> > 4)
> > GetAllPublicationRelations:
> > + if (relkind == RELKIND_RELATION)
> > + exceptlist = GetPublicationExcludedRelations(pubid, pubviaroot ?
> > + PUBLICATION_PART_ALL :
> > + PUBLICATION_PART_ROOT);
> >
> >   Assert(!(relkind == RELKIND_SEQUENCE && pubviaroot));
> >
> > Generally we keep such parameters' sanity checks as the first step. We
> > can add new code after Assert.
> >
> > 5)
> > ObjectsInAllPublicationToOids() only has one caller which calls it
> > under check: 'if (stmt->for_all_tables)'
> >
> > Thus IMO, we do not need a switch-case in
> > ObjectsInAllPublicationToOids(). We can simply have a sanity check to
> > see it is 'PUBLICATION_ALL_TABLES' and then do the needed operation
> > for this pub-type.
> >
> > 6)
> > CreatePublication():
> > /*
> > * If publication is for ALL TABLES and relations is not empty, it means
> > * that there are some relations to be excluded from the publication.
> > * Else, relations is the list of relations to be added to the
> > * publication.
> > */
> >
> > Shall we rephrase slightly to:
> >
> > /*
> >  * If the publication is for ALL TABLES and 'relations' is not empty,
> >  * it indicates that some relations should be excluded from the publication.
> >  * Add those excluded relations to the publication with 'prexcept' set to true.
> >  * Otherwise, 'relations' contains the list of relations to be explicitly
> >  * included in the publication.
> >  */
> >
> > 7)
> > + /* Associate objects with the publication. */
> > + if (stmt->for_all_tables)
> > + {
> > + /* Invalidate relcache so that publication info is rebuilt. */
> > + CacheInvalidateRelcacheAll();
> > + }
> >
> > I think this comment is misplaced. We shall have it at previous place, atop:
> > if (stmt->for_all_tables)
> > This is because here we are just trying to invalidate cache while at
> > previous place we are trying to associate.
> >
>
> Few more:
>
> 8)
> get_rel_sync_entry()
> + List    *exceptTablePubids = NIL;
>
> At all other places, we are using exceptpubids, shall we use the same here?
>
> 9)
> ObjectsInPublicationToOids()
>
>   case PUBLICATIONOBJ_TABLE:
> + case PUBLICATIONOBJ_EXCEPT_TABLE:
> + pubobj->pubtable->except = (pubobj->pubobjtype ==
> PUBLICATIONOBJ_EXCEPT_TABLE);
>   *rels = lappend(*rels, pubobj->pubtable);
>   break;
>
> It looks slightly odd that for pubobjtype case
> 'PUBLICATIONOBJ_EXCEPT_TABLE', we have to check pubobjtype against
> PUBLICATIONOBJ_EXCEPT_TABLE itself.
>
> Shall we make it:
> case PUBLICATIONOBJ_EXCEPT_TABLE:
>     pubobj->pubtable->except = true;
>     /* fall through */
> case PUBLICATIONOBJ_TABLE:
>     *rels = lappend(*rels, pubobj->pubtable);
>     break;
>
Shouldn't we also set pubobj->pubtable->except = false for PUBLICATIONOBJ_TABLE?
I have updated the code to:
case PUBLICATIONOBJ_EXCEPT_TABLE:
    pubobj->pubtable->except = true;
    *rels = lappend(*rels, pubobj->pubtable);
    break;
case PUBLICATIONOBJ_TABLE:
    pubobj->pubtable->except = false;
    *rels = lappend(*rels, pubobj->pubtable);
    break;
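
A standalone sketch of why setting the flag in both cases can matter,
assuming hypothetical enum and struct names (PUBOBJ_* and PubTable are
not PostgreSQL identifiers). If the flag were set only in the EXCEPT
case, a table object arriving with a stale 'except = true' would keep
it.

```c
#include <stdbool.h>

/* Hypothetical object kinds, loosely modelled on the cases quoted above. */
typedef enum PubObjType
{
	PUBOBJ_TABLE,
	PUBOBJ_EXCEPT_TABLE,
	PUBOBJ_OTHER
} PubObjType;

/* Hypothetical, simplified table entry. */
typedef struct PubTable
{
	unsigned int relid;
	bool		except;
} PubTable;

/*
 * Set tbl->except according to the object kind.  Returns true when the
 * table should be appended to the relation list, false for kinds this
 * sketch does not handle.
 */
bool
classify_pub_table(PubObjType type, PubTable *tbl)
{
	switch (type)
	{
		case PUBOBJ_EXCEPT_TABLE:
			tbl->except = true;
			return true;
		case PUBOBJ_TABLE:
			/* Explicit reset, so a stale value can never leak through. */
			tbl->except = false;
			return true;
		default:
			return false;
	}
}
```

The cost is one duplicated append per case in the real code, in
exchange for each case being readable on its own without a fall
through.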

> 10)
> I want to understand the usage of DO_PUBLICATION_EXCEPT_REL. Can you
> give a scenario where its usage in DOTypeNameCompare() will be hit?
> Its all other usages too need some analysis and validation.
>
In the current patch we no longer set any object type to
DO_PUBLICATION_EXCEPT_REL.
We store the list of except tables in the 'pubinfo[i].excepttbls'
list in getPublications, with "pubinfo[i].dobj.objType =
DO_PUBLICATION". So I don't see any need for
DO_PUBLICATION_EXCEPT_REL now, and I have removed it.

> 11)
> + List    *except_objects; /* List of publication object to be excluded */
>
> object --> objects
> Currently since we exclude only tables, does it make sense to name it
> as except_tables?
>
I have also addressed the remaining comments and attached the updated v33 patch.
[1]:
https://www.postgresql.org/message-id/CALDaNm0qoNtsX%2B9KPug6qb%3DuC-H2iPMYW%2BgL%3DHehx%2BNgOxga6w%40mail.gmail.com

Thanks,
Shlok Kyal

Attachments

v33-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch

Re: Skipping schema changes in publication

From

shveta malik

Date:

22 December 2025, 08:12:33

On Thu, Dec 18, 2025 at 5:15 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 18 Dec 2025 at 11:30, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Dec 17, 2025 at 11:24 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Dec 16, 2025 at 2:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > >
> > > > I have also addressed the remaining comments and attached the latest patch.
> > > >
> > >
> > > Thanks. A few comments:
> > >
> > > 1)
> > > + if (!set_top && puballtables)
> > > + set_top = !list_member_oid(aexceptpubids, puboid);
> > >
> > > In GetTopMostAncestorInPublication(), we have made the above change
> > > which will now get ancestor from all-tables publication as well,
> > > provided table is not part of 'except' List. Earlier this function was
> > > only checking pg_subscription_rel and pg_publication_namespace which
> > > does not include all-tables publication. Won't it change the
> > > result-set for callers?
> > >
> It can change the result set of callers. I analysed more and saw that
> GetTopMostAncestorInPublication is called from 3 places.
> 1. pub_rf_contains_invalid_column: it is called when publication is
> not ALL TABLES. It will have no impact with the change.
> 2. pub_contains_invalid_column : it is called for all type of
> publication. it calls GetTopMostAncestorInPublication like:
> ```
>     if (pubviaroot && relation->rd_rel->relispartition)
>   {
>     publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors,
>                                NULL, puballtables);
>
>     if (!OidIsValid(publish_as_relid))
>       publish_as_relid = relid;
>   }
> ```
> In HEAD for ALL TABLES publication GetTopMostAncestorInPublication
> will always return InvalidOid. With this patch it can have some value.
> So there is a difference in behaviour.
>
> 3. get_rel_sync_entry
> in HEAD we had
> ```
> if (pub->alltables)
>       {
>         publish = true;
>         if (pub->pubviaroot && am_partition)
>         {
>           List     *ancestors = get_partition_ancestors(relid);
>
>           pub_relid = llast_oid(ancestors);
>           ancestor_level = list_length(ancestors);
>         }
>       }
> ```
> With patch this condition is not valid because we cannot set
> 'pub_relid = llast_oid(ancestors);' directly as the table can be
> excluded.
> So, the change in GetTopMostAncestorInPublication will even handle the
> case of "ALL TABLES" publication.
>
> Since we have a behaviour difference for the 2nd function, I have
> removed the changes for 'ALL TABLES' from
> GetTopMostAncestorInPublication and added it separately
> 'get_rel_sync_entry'. Thoughts?

I find the current implementation better, the previous one was
impacting the results of different paths.

Regarding:
+ if (list_member_oid(aexceptpubids, puboid))
+ {
+ list_free(aexceptpubids);
+ continue;
+ }

IMO, if puboid is part of apubids, that check is enough. This is
because aexceptpubids and apubids are mutually exclusive lists for a
particular 'ancestor'. But if we want to have it to avoid
schema-mapping check later, we should add a comment. How about:

This step isn’t strictly necessary, but we keep it so we can skip the
table if it appears in the EXCEPT list, avoiding an expensive
schema-mapping check later.

>
> > > 2)
> > > + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
> > > + * GetAllPublicationRelations() to obtain the complete set of tables covered by
> > > + * the publication.
> > > + */
> > > +List *
> > > +GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
> > > +{
> > > + return GetPublicationRelationsInternal(pubid, pub_partopt, false);
> > > +}
> > >
> > > We can have an Assert here that pubid passed is not for ALL-Tables or
> > > ALL-sequences
> > >
> Added assert for all tables. I found during testing that this function
> can be called for ALL SEQUENCES in HEAD. So I have not added an
> assertion for it.
> I think it is a bug and shared the same in [1]. Will add assert for
> all sequences as well once the bug is fixed.
>

Okay.

> > > 3)
> > > GetAllPublicationRelations:
> > >  * If the publication publishes partition changes via their respective root
> > >  * partitioned tables, we must exclude partitions in favor of including the
> > >  * root partitioned tables. This is not applicable to FOR ALL SEQUENCES
> > >  * publication.
> > >
> > > + * The list does not include relations that are explicitly excluded via the
> > > + * EXCEPT TABLE clause of the publication specified by pubid.
> > >
> > > Suggestion:
> > > /*
> > >  * If the publication publishes partition changes via their respective root
> > >  * partitioned tables, we must exclude partitions in favor of including the
> > >  * root partitioned tables. The list also excludes tables that are
> > >  * explicitly excluded via the EXCEPT TABLE clause of the publication
> > >  * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
> > >  * publications.
> > >  */
> > >
> > > 4)
> > > GetAllPublicationRelations:
> > > + if (relkind == RELKIND_RELATION)
> > > + exceptlist = GetPublicationExcludedRelations(pubid, pubviaroot ?
> > > + PUBLICATION_PART_ALL :
> > > + PUBLICATION_PART_ROOT);
> > >
> > >   Assert(!(relkind == RELKIND_SEQUENCE && pubviaroot));
> > >
> > > Generally we keep such parameters' sanity checks as the first step. We
> > > can add new code after Assert.
> > >
> > > 5)
> > > ObjectsInAllPublicationToOids() only has one caller which calls it
> > > under check: 'if (stmt->for_all_tables)'
> > >
> > > Thus IMO, we do not need a switch-case in
> > > ObjectsInAllPublicationToOids(). We can simply have a sanity check to
> > > see it is 'PUBLICATION_ALL_TABLES' and then do the needed operation
> > > for this pub-type.
> > >
> > > 6)
> > > CreatePublication():
> > > /*
> > > * If publication is for ALL TABLES and relations is not empty, it means
> > > * that there are some relations to be excluded from the publication.
> > > * Else, relations is the list of relations to be added to the
> > > * publication.
> > > */
> > >
> > > Shall we rephrase slightly to:
> > >
> > > /*
> > >  * If the publication is for ALL TABLES and 'relations' is not empty,
> > >  * it indicates that some relations should be excluded from the publication.
> > >  * Add those excluded relations to the publication with 'prexcept' set to true.
> > >  * Otherwise, 'relations' contains the list of relations to be explicitly
> > >  * included in the publication.
> > >  */
> > >
> > > 7)
> > > + /* Associate objects with the publication. */
> > > + if (stmt->for_all_tables)
> > > + {
> > > + /* Invalidate relcache so that publication info is rebuilt. */
> > > + CacheInvalidateRelcacheAll();
> > > + }
> > >
> > > I think this comment is misplaced. We shall have it at previous place, atop:
> > > if (stmt->for_all_tables)
> > > This is because here we are just trying to invalidate cache while at
> > > previous place we are trying to associate.
> > >
> >
> > Few more:
> >
> > 8)
> > get_rel_sync_entry()
> > + List    *exceptTablePubids = NIL;
> >
> > At all other places, we are using exceptpubids, shall we use the same here?
> >
> > 9)
> > ObjectsInPublicationToOids()
> >
> >   case PUBLICATIONOBJ_TABLE:
> > + case PUBLICATIONOBJ_EXCEPT_TABLE:
> > + pubobj->pubtable->except = (pubobj->pubobjtype ==
> > PUBLICATIONOBJ_EXCEPT_TABLE);
> >   *rels = lappend(*rels, pubobj->pubtable);
> >   break;
> >
> > It looks slightly odd that for pubobjtype case
> > 'PUBLICATIONOBJ_EXCEPT_TABLE', we have to check pubobjtype against
> > PUBLICATIONOBJ_EXCEPT_TABLE itself.
> >
> > Shall we make it:
> > case PUBLICATIONOBJ_EXCEPT_TABLE:
> >     pubobj->pubtable->except = true;
> >     /* fall through */
> > case PUBLICATIONOBJ_TABLE:
> >     *rels = lappend(*rels, pubobj->pubtable);
> >     break;
> >
> We should also make pubobj->pubtable->except = false for PUBLICATIONOBJ_TABLE?

yes, right.

> Updated the condition like:
>       case PUBLICATIONOBJ_EXCEPT_TABLE:
>         pubobj->pubtable->except = true;
>         *rels = lappend(*rels, pubobj->pubtable);
>         break;
>       case PUBLICATIONOBJ_TABLE:
>         pubobj->pubtable->except = false;
>         *rels = lappend(*rels, pubobj->pubtable);
>         break;
>

Looks good.

> > 10)
> > I want to understand the usage of DO_PUBLICATION_EXCEPT_REL. Can you
> > give a scenario where its usage in DOTypeNameCompare() will be hit?
> > Its all other usages too need some analysis and validation.
> >
> In the current patch we are not setting an objecttype to
> DO_PUBLICATION_EXCEPT_REL.
> We are storing the list of except tables in 'pubinfo[i].excepttbls'
> list in function getPublications and "pubinfo[i].dobj.objType =
> DO_PUBLICATION". So, I don't see any requirement of
> DO_PUBLICATION_EXCEPT_REL now. I have removed it.
>

Yes, that was my initial thought as well, that we might not need it.
But I’ll review it further and let you know.

> > 11)
> > + List    *except_objects; /* List of publication object to be excluded */
> >
> > object --> objects
> > Currently since we exclude only tables, does it make sense to name it
> > as except_tables?
> >
> I have also addressed the remaining comments and attached the updated v33 patch.
> [1]: https://www.postgresql.org/message-id/CALDaNm0qoNtsX%2B9KPug6qb%3DuC-H2iPMYW%2BgL%3DHehx%2BNgOxga6w%40mail.gmail.com
>

Thanks, will review.

thanks
Shveta

Re: Skipping schema changes in publication

From

Peter Smith

Date:

December 22, 2025, 09:07:27

Hi Shlok.

Some review comments for patch v33-0001 (code part)

======
src/backend/catalog/pg_publication.c

GetPublicationRelationsInternal:

1.
Static function names should be snake_case.

~~~

GetPublicationIncludedRelations:

2.
+/*
+ * Return the list of relation OIDs for a publication.
+ *
+ * For a FOR TABLE publication, this returns the list of relations explicitly
+ * included in the publication.
+ *
+ * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
+ * GetAllPublicationRelations() to obtain the complete set of tables covered by
+ * the publication.
+ */
+List *
+GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
+{
+ Assert(!GetPublication(pubid)->alltables);
+
+ return GetPublicationRelationsInternal(pubid, pub_partopt, false);
+}

Why isn't the Assert also saying something about puballsequences, as
mentioned in the function comment?

~~~

GetAllPublicationRelations:

3.
+ * root partitioned tables. The list also excludes tables that are
+ * explicitly excluded via the EXCEPT TABLE clause of the publication
+ * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
+ * publications.

3.
It seems wrong to say "FOR ALL SEQUENCES" ... that seems to assume the
"FOR ALL SEQUENCES" and "FOR ALL TABLES" cannot co-exist. Did you mean
"Neither of ... to published sequences"?

~

4.
-GetAllPublicationRelations(char relkind, bool pubviaroot)
+GetAllPublicationRelations(Oid pubid, char relkind, bool pubviaroot)

There are tricky rules about relation vs sequences and the
publish_via_partition_root parameter value. It would be better if you
encapsulate all this within this function. Specifically, it would be
simpler if you passed the 'Publication' arg instead of the pubid. Then
you can get the pubviaroot value from that (within the function)
instead of passing around "fake" values of false when you are looking
at RELKIND_SEQUENCE.
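
To illustrate point 4, here is a minimal, self-contained sketch of deriving the partition-root setting inside the callee instead of passing it in. The trimmed-down `Publication` struct and the helper name `effective_pubviaroot` are assumptions made for this illustration only; they are not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

#define RELKIND_RELATION 'r'
#define RELKIND_SEQUENCE 'S'

/* Hypothetical, trimmed-down stand-in for the real Publication struct. */
typedef struct Publication
{
    Oid         oid;
    bool        alltables;
    bool        pubviaroot;
} Publication;

/*
 * Hypothetical helper: decide whether publish_via_partition_root applies
 * for the given relkind. Sequences are never partitioned, so the setting
 * is ignored for them and callers need not pass a "fake" false.
 */
bool
effective_pubviaroot(const Publication *pub, char relkind)
{
    assert(pub != NULL);

    if (relkind == RELKIND_SEQUENCE)
        return false;

    return pub->pubviaroot;
}
```

With this shape, a caller would pass the same `Publication` for both relkinds and the sequence rule would live in one place.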
======
src/backend/commands/publicationcmds.c

ObjectsInAllPublicationToOids:

5.
+ foreach_ptr(PublicationAllObjSpec, puballobj, puballobjspec_list)
+ {
+ if (puballobj->pubobjtype != PUBLICATION_ALL_TABLES)
+ continue;
+
+ foreach_ptr(PublicationObjSpec, pubobj, puballobj->except_tables)
+ {
+ pubobj->pubtable->except = true;
+ *rels = lappend(*rels, pubobj->pubtable);
+ }
+ }

I think it's tidier to code this like below:

if (puballobj->pubobjtype == PUBLICATION_ALL_TABLES)
{
  foreach_ptr...
}

~~~

pub_contains_invalid_column:

6.
 bool
 pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
  bool pubviaroot, char pubgencols_type,
- bool *invalid_column_list,
- bool *invalid_gen_col)
+ bool *invalid_column_list, bool *invalid_gen_col)

Why does this change even exist at all in this patch?

~~~

CreatePublication:

7.
+ /*
+ * If the publication is for ALL TABLES and 'relations' is not empty, it
+ * indicates that some relations should be excluded from the publication.
+ * Add those excluded relations to the publication with 'prexcept' set to
+ * true. Otherwise, 'relations' contains the list of relations to be
+ * explicitly included in the publication.
+ */
+ if (relations != NIL)
+ {
+ List    *rels;
+
+ rels = OpenTableList(relations);
+ TransformPubWhereClauses(rels, pstate->p_sourcetext,
+ publish_via_partition_root);
+
+ CheckPubRelationColumnList(stmt->pubname, rels,
+    schemaidlist != NIL,
+    publish_via_partition_root);
+
+ PublicationAddTables(puboid, rels, true, NULL);
+ CloseTableList(rels);
+ }
+

The comment and the code don't match. The comment is talking about
rules for FOR ALL TABLES, but puballtables is not part of any
condition here (??). Was all this supposed to be within the "if
(stmt->for_all_tables)" code block?

======
src/bin/pg_dump/pg_dump.c

8.
- "SELECT tableoid, oid, prpubid, prrelid, "
+ "SELECT tableoid, oid, prpubid, prrelid,\n"
  "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual, "
  "(CASE\n"
  "  WHEN pr.prattrs IS NOT NULL THEN\n"
@@ -4868,6 +4929,9 @@ getPublicationTables(Archive *fout, TableInfo
tblinfo[], int numTables)
  "      WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
  "  ELSE NULL END) prattrs "
  "FROM pg_catalog.pg_publication_rel pr");
+ if (fout->remoteVersion >= 190000)
+ appendPQExpBufferStr(query, " WHERE prexcept = false");

8a
Isn't it better to qualify everything here with the alias 'pr'?

~

8b.
Also "WHERE NOT pr.prexcept;" might be more consistent with other code
I saw in describe.c

======
src/bin/pg_dump/pg_dump.h

9.
  PublishGencolsType pubgencols_type;
+ SimplePtrList excepttbls;
 } PublicationInfo;

How about "tables" instead of "tbls" (e.g. "excepttables" or
"except_tables") here? That would also be more consistent with the
other puballtables member.

======
src/test/regress/sql/publication.sql

10.
 RESET client_min_messages;
 \dRp+ testpub3
 \dRp+ testpub4
+\dRp+ testpub5
+\dRp+ testpub6
+\dRp+ testpub7


I feel it would be better to keep each \dRp+ together with the test it
belongs to, rather than have a bunch of different tests which are then
followed by a bunch of different \dRp+. Note: this same comment
applies to other places -- not just here. Check everywhere
you do \dRp+

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

From

Shlok Kyal

Date:

December 23, 2025, 09:32:49

On Mon, 22 Dec 2025 at 11:37, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for patch v33-0001 (code part)
>
> ======
> src/backend/catalog/pg_publication.c
>
> GetPublicationRelationsInternal:
>
> 1.
> Static function names should be snake_case.
>
> ~~~
>
> GetPublicationIncludedRelations:
>
> 2.
> +/*
> + * Return the list of relation OIDs for a publication.
> + *
> + * For a FOR TABLE publication, this returns the list of relations explicitly
> + * included in the publication.
> + *
> + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
> + * GetAllPublicationRelations() to obtain the complete set of tables covered by
> + * the publication.
> + */
> +List *
> +GetPublicationIncludedRelations(Oid pubid, PublicationPartOpt pub_partopt)
> +{
> + Assert(!GetPublication(pubid)->alltables);
> +
> + return GetPublicationRelationsInternal(pubid, pub_partopt, false);
> +}
>
> Why isn't the Assert also saying something about puballsequences, as
> mentioned in the function comment?
>
I reported a similar kind of issue in HEAD in [1].
As per the latest discussion, I understood that it is ok to call this
function for ALL SEQUENCES.
I have updated the comment.

> ~~~
>
> GetAllPublicationRelations:
>
> 3.
> + * root partitioned tables. The list also excludes tables that are
> + * explicitly excluded via the EXCEPT TABLE clause of the publication
> + * identified by pubid. Neither of these rules applies to FOR ALL SEQUENCES
> + * publications.
>
> 3.
> It seems wrong to say "FOR ALL SEQUENCES" ... that seems to assume the
> "FOR ALL SEQUENCES" and "FOR ALL TABLES" cannot co-exist. Did you mean
> "Neither of ... to published sequences"?
>
I have modified the comment.

> ~
>
> 4.
> -GetAllPublicationRelations(char relkind, bool pubviaroot)
> +GetAllPublicationRelations(Oid pubid, char relkind, bool pubviaroot)
>
> There are tricky rules about relation vs sequences and the
> publish_via_partition_root parameter value. It would be better if you
> encapsulate all this within this function. Specifically, it would be
> simpler if you passed the 'Publication' arg instead of the pubid. Then
> you can get the pubviaroot value from that (within the function)
> instead of passing around "fake" values of false when you are looking
> at RELKIND_SEQUENCE.
> ======
> src/backend/commands/publicationcmds.c
>
> ObjectsInAllPublicationToOids:
>
> 5.
> + foreach_ptr(PublicationAllObjSpec, puballobj, puballobjspec_list)
> + {
> + if (puballobj->pubobjtype != PUBLICATION_ALL_TABLES)
> + continue;
> +
> + foreach_ptr(PublicationObjSpec, pubobj, puballobj->except_tables)
> + {
> + pubobj->pubtable->except = true;
> + *rels = lappend(*rels, pubobj->pubtable);
> + }
> + }
>
> I think it's tidier to code this like below:
>
> if (puballobj->pubobjtype == PUBLICATION_ALL_TABLES)
> {
>   foreach_ptr...
> }
>
> ~~~
>
> pub_contains_invalid_column:
>
> 6.
>  bool
>  pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
>   bool pubviaroot, char pubgencols_type,
> - bool *invalid_column_list,
> - bool *invalid_gen_col)
> + bool *invalid_column_list, bool *invalid_gen_col)
>
> Why does this change even exist at all in this patch?
This change is not required. I have reverted it.

>
> ~~~
>
> CreatePublication:
>
> 7.
> + /*
> + * If the publication is for ALL TABLES and 'relations' is not empty, it
> + * indicates that some relations should be excluded from the publication.
> + * Add those excluded relations to the publication with 'prexcept' set to
> + * true. Otherwise, 'relations' contains the list of relations to be
> + * explicitly included in the publication.
> + */
> + if (relations != NIL)
> + {
> + List    *rels;
> +
> + rels = OpenTableList(relations);
> + TransformPubWhereClauses(rels, pstate->p_sourcetext,
> + publish_via_partition_root);
> +
> + CheckPubRelationColumnList(stmt->pubname, rels,
> +    schemaidlist != NIL,
> +    publish_via_partition_root);
> +
> + PublicationAddTables(puboid, rels, true, NULL);
> + CloseTableList(rels);
> + }
> +
>
> The comment and the code don't match. The comment is talking about
> rules for FOR ALL TABLES, but puballtables is not part of any
> condition here (??). Was all this supposed to be within the "if
> (stmt->for_all_tables)" code block?
>
For both ALL TABLES publication and non-ALL TABLES publication we need
the same code block.
Setting of prexcept flag will be handled in PublicationAddTables.
This comment clarifies what the list 'relations' would mean for ALL
TABLES publication and non-ALL TABLES publication

> ======
> src/bin/pg_dump/pg_dump.c
>
> 8.
> - "SELECT tableoid, oid, prpubid, prrelid, "
> + "SELECT tableoid, oid, prpubid, prrelid,\n"
>   "pg_catalog.pg_get_expr(prqual, prrelid) AS prrelqual, "
>   "(CASE\n"
>   "  WHEN pr.prattrs IS NOT NULL THEN\n"
> @@ -4868,6 +4929,9 @@ getPublicationTables(Archive *fout, TableInfo
> tblinfo[], int numTables)
>   "      WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
>   "  ELSE NULL END) prattrs "
>   "FROM pg_catalog.pg_publication_rel pr");
> + if (fout->remoteVersion >= 190000)
> + appendPQExpBufferStr(query, " WHERE prexcept = false");
>
> 8a
> Isn't it better to qualify everything here with the alias 'pr'?
>
It is an existing code. So I prefer not to modify it in this patch. I
have added the alias for the column added by this patch.

> ~
>
> 8b.
> Also "WHERE NOT pr.prexcept;" might be more consistent with other code
> I saw in describe.c
>
> ======
> src/bin/pg_dump/pg_dump.h
>
> 9.
>   PublishGencolsType pubgencols_type;
> + SimplePtrList excepttbls;
>  } PublicationInfo;
>
> How about "tables" instead of "tbls" (e.g. "excepttables" or
> "except_tables") here? That would also be more consistent with the
> other puballtables member.
>
> ======
> src/test/regress/sql/publication.sql
>
> 10.
>  RESET client_min_messages;
>  \dRp+ testpub3
>  \dRp+ testpub4
> +\dRp+ testpub5
> +\dRp+ testpub6
> +\dRp+ testpub7
>
>
> I feel it would be better to keep each \dRp+ together with the test it
> belongs to, rather than have a bunch of different tests which are then
> followed by a bunch of different \dRp+. Note: this same comment
> applies to other places -- not just here. Check everywhere
> you do \dRp+
>

I have addressed the remaining comments, did some cosmetic changes and
addressed the comment shared by Shveta in [2].
[1]: https://www.postgresql.org/message-id/CAA4eK1+rnjBOvkiQC2r4LuTwuje653iVPPAXcmJZXPpKvsNbOQ@mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uCf5tXvqyVS3GQzU9J5HdSLAxX6Lxt1UKY4HJ8qnimCAw%40mail.gmail.com

Thanks,
Shlok Kyal

Attachments

v34-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch

Re: Skipping schema changes in publication

From

Peter Smith

Date:

December 24, 2025, 06:12:31

Hi Shlok

Some review comments for patch v34-0001 (code)

======
src/backend/catalog/pg_publication.c

1.
+static List *
+get_publication_relations_internal(Oid pubid, PublicationPartOpt pub_partopt,
+    bool except_flag)

No need to name this function as "_internal"; the snake_case name and
static already indicate it is internal.

======
src/bin/pg_dump/pg_dump.c

getPublications:

2.
+ if (fout->remoteVersion >= 190000)
+ {
+ int ntbls;
+ PGresult   *res_tbls;
+
+ resetPQExpBuffer(query);
+ appendPQExpBuffer(query,
+   "SELECT prrelid\n"
+   "FROM pg_catalog.pg_publication_rel\n"
+   "WHERE prpubid = %u and prexcept = true",
+   pubinfo[i].dobj.catId.oid);
+
+ res_tbls = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+ ntbls = PQntuples(res_tbls);
+ if (ntbls == 0)
+ continue;
+
+ for (int j = 0; j < ntbls; j++)
+ {
+ Oid prrelid;
+ TableInfo  *tbinfo;
+
+ prrelid = atooid(PQgetvalue(res_tbls, j, 0));
+
+ tbinfo = findTableByOid(prrelid);
+ if (tbinfo == NULL)
+ continue;
+
+ simple_ptr_list_append(&pubinfo[i].except_tables, tbinfo);
+ }
+
+ PQclear(res_tbls);
+ }

2a.
I suppose this code is for populating the list of all tables except
those excluded, but there is no explanatory comment stating the
purpose of all this.

~

2b.
BEFORE
"WHERE prpubid = %u and prexcept = true"

SUGGESTION
"WHERE prpubid = %u AND prexcept"

~~~

dumpPublication:

3.
+ {
+ bool first_tbl = true;
+
  appendPQExpBufferStr(query, " FOR ALL TABLES");
+
+ /* Include exception tables if the publication has EXCEPT TABLEs */
+ for (SimplePtrListCell *cell = pubinfo->except_tables.head; cell;
cell = cell->next)
+ {
+ TableInfo  *tbinfo = (TableInfo *) cell->ptr;
+
+ if (first_tbl)
+ {
+ appendPQExpBufferStr(query, " EXCEPT TABLE (");
+ first_tbl = false;
+ }
+ else
+ appendPQExpBufferStr(query, ", ");
+ appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
+ }
+ if (!first_tbl)
+ appendPQExpBufferStr(query, ")");
+ }

3a.
That code comment seems backwards.

BEFORE
/* Include exception tables if the publication has EXCEPT TABLEs */

SUGGESTION
/* Include EXCEPT TABLE clause if there are except_tables. */

~~~

3b.
Although it works OK, I felt the following looked strange:
+ if (!first_tbl)
+ appendPQExpBufferStr(query, ")");

IMO it would be better implemented as a counter:

Replace
bool first_tbl = true;
with
int n_excluded = 0;

Then,
+ if (first_tbl)
+ {
+ appendPQExpBufferStr(query, " EXCEPT TABLE (");
+ first_tbl = false;
+ }
becomes
+ if (++n_excluded == 1)
+ appendPQExpBufferStr(query, " EXCEPT TABLE (");

And,
+ if (!first_tbl)
+ appendPQExpBufferStr(query, ")");
becomes
+ if (n_excluded > 0)
+ appendPQExpBufferStr(query, ")");
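
For illustration, the counter-based variant from 3b can be tried outside pg_dump with a plain character buffer. This is only a sketch: `build_all_tables_clause` and its parameters are hypothetical stand-ins for the `PQExpBuffer` and `SimplePtrList` code in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the dumpPublication() loop. Writes
 * " FOR ALL TABLES", followed by " EXCEPT TABLE (ONLY a, ONLY b)" when
 * at least one excluded table is given, into buf.
 */
void
build_all_tables_clause(char *buf, size_t buflen,
                        const char *const *except_tables, int ntables)
{
    int         n_excluded = 0;
    size_t      used;

    assert(buflen > 0);
    snprintf(buf, buflen, " FOR ALL TABLES");

    for (int i = 0; i < ntables; i++)
    {
        used = strlen(buf);

        /* Open the clause on the first table, separate the rest. */
        if (++n_excluded == 1)
            snprintf(buf + used, buflen - used, " EXCEPT TABLE (");
        else
            snprintf(buf + used, buflen - used, ", ");

        used = strlen(buf);
        snprintf(buf + used, buflen - used, "ONLY %s", except_tables[i]);
    }

    /* Close the clause only if it was opened. */
    if (n_excluded > 0)
    {
        used = strlen(buf);
        snprintf(buf + used, buflen - used, ")");
    }
}
```

The counter reads a little more directly than the negated boolean, because the closing parenthesis is tied to "at least one table was emitted".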

======
src/bin/psql/describe.c

describeOneTableDetails:

4.
+ /* Print publication the relation is excluded explicitly */
+ if (pset.sversion >= 190000)

The comment doesn't seem right:

SUGGESTION
Print publications that the table is explicitly excluded from

======
src/test/regress/sql/publication.sql

5.
Missing tests.

There are no test cases to show that \d is working for printing the
"Except Publications:".

======
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Skipping schema changes in publication

From

shveta malik

Date:

December 26, 2025, 12:57:22

On Tue, Dec 23, 2025 at 12:03 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
>
> I have addressed the remaining comments, did some cosmetic changes and
> addressed the comment shared by Shveta in [2].
> [1]: https://www.postgresql.org/message-id/CAA4eK1+rnjBOvkiQC2r4LuTwuje653iVPPAXcmJZXPpKvsNbOQ@mail.gmail.com
> [2]: https://www.postgresql.org/message-id/CAJpy0uCf5tXvqyVS3GQzU9J5HdSLAxX6Lxt1UKY4HJ8qnimCAw%40mail.gmail.com
>

Thank You for the patch. Please find a few comments:

1)
GetTopMostAncestorInPublication():

+ if (list_member_oid(aexceptpubids, puboid))
+ {
+ list_free(aexceptpubids);
+ continue;
+ }

We need to do 'list_free(apubids)' as well here.

2)
GetTopMostAncestorInPublication(). Currently it has:

if (list_member_oid(aexceptpubids, puboid))
...
if (list_member_oid(apubids, puboid))
...
else
...schema mapping check

IMO more natural order of checks will be

if (list_member_oid(apubids, puboid))
..
else if (list_member_oid(aexceptpubids, puboid))
...
else
...schema mapping check
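
A toy sketch of the proposed ordering, with the three list lookups reduced to booleans. The enum and the function name `classify_ancestor` are made up for this illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of checking one ancestor against a publication. */
typedef enum AncestorMatch
{
    ANCESTOR_PUBLISHED,         /* ancestor is listed in the publication */
    ANCESTOR_EXCLUDED,          /* ancestor is in the EXCEPT list */
    ANCESTOR_VIA_SCHEMA,        /* ancestor's schema is published */
    ANCESTOR_NOT_PUBLISHED
} AncestorMatch;

/*
 * Checks run in the order suggested above: direct membership first, then
 * the EXCEPT list, and the schema mapping lookup (the expensive one in
 * the real code) only as a last resort.
 */
AncestorMatch
classify_ancestor(bool in_pubids, bool in_exceptpubids, bool schema_in_pub)
{
    if (in_pubids)
        return ANCESTOR_PUBLISHED;
    else if (in_exceptpubids)
        return ANCESTOR_EXCLUDED;
    else if (schema_in_pub)
        return ANCESTOR_VIA_SCHEMA;

    return ANCESTOR_NOT_PUBLISHED;
}
```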

3)
+/*
+ * Return the list of relation OIDs excluded from a publication.
+ * This is only applicable for FOR ALL TABLES publications.
+ */
+List *
+GetPublicationExcludedRelations(Oid pubid, PublicationPartOpt pub_partopt)

a) Since now 'Relations' term means both tables and sequences, but
here we mean only Tables, we can rename it to have 'Tables' rather
than 'Relations'

b) Similar to GetAllPublicationRelations which is for 'ALL Tables'
pub, we can rename it to have 'All'

So the name can be 'GetAllPublicationExcludedTables' to be more clear.

Also we can move this function close to GetAllPublicationRelations as
it is more related to that.

4)
ObjectsInPublicationToOids()
+ case PUBLICATIONOBJ_EXCEPT_TABLE:
+ pubobj->pubtable->except = true;
+ *rels = lappend(*rels, pubobj->pubtable);
+ break;

Let me know when this will be hit when we already have
'ObjectsInAllPublicationToOids' in place?

5)
get_rel_sync_entry():
+ level++;
+ GetRelationPublications(ancestor, NULL, &aexceptpubids);
+
+ if (!list_member_oid(aexceptpubids, pub->oid))
+ {
+ pub_relid = ancestor;
+ ancestor_level = level;
+ }
+ }

Consider the following table structure:
t1 has a partition p1, which in turn has a child partition
child_part1. When publish_via_partition_root is set to true, any
changes made to child_part1 are replicated through t1. If we add t1 to
the EXCEPT list, get_rel_sync_entry() still marks p1 as an ancestor to
publish changes of child_part1. Is it correct?
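
The scenario can be reproduced with a small standalone model of the loop. This is a simplification made for illustration: the real code asks each ancestor for the publications that exclude it, while this sketch keeps a flat array of excluded table OIDs for one publication, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Return true if relid appears in the first n entries of list. */
bool
oid_in_list(const Oid *list, size_t n, Oid relid)
{
    for (size_t i = 0; i < n; i++)
    {
        if (list[i] == relid)
            return true;
    }
    return false;
}

/*
 * ancestors[] is ordered from the immediate parent up to the root. The
 * loop remembers the highest ancestor that is not in the except list,
 * mirroring the quoted hunk. It does not stop when it meets an excluded
 * ancestor, which is exactly the behaviour questioned above.
 */
Oid
topmost_unexcluded_ancestor(const Oid *ancestors, size_t nancestors,
                            const Oid *except, size_t nexcept,
                            int *ancestor_level)
{
    Oid         pub_relid = InvalidOid;
    int         level = 0;

    assert(ancestor_level != NULL);
    *ancestor_level = 0;

    for (size_t i = 0; i < nancestors; i++)
    {
        level++;
        if (!oid_in_list(except, nexcept, ancestors[i]))
        {
            pub_relid = ancestors[i];
            *ancestor_level = level;
        }
    }

    return pub_relid;
}
```

With ancestors {p1, t1} and t1 in the except list, the function returns p1 at level 1, so changes to child_part1 would be published as p1 even though the root t1 was excluded.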

6)
RelationBuildPublicationDesc() also needs some more analysis about
getting and setting ancestor part for above case.

7)
Currently the way we deal with the except table in pg_dump.c differs
from how we deal with included-table. To explain the same, how about
adding below comment in getPublications() just before we fetch
except-list:

We process EXCEPT TABLES here instead of in getPublicationTables(),
and output them directly in dumpPublication(). This differs from the
approach used in dumpPublicationTable() and
dumpPublicationNamespace(). Following that approach would require
dumping table additions later as ALTER PUBLICATION … ADD EXCEPT, which
is currently not supported.

thanks
Shveta
