Discussion: Skipping schema changes in publication

Skipping schema changes in publication

From
vignesh C
Date:
Hi,

This feature adds an option to skip the changes of all tables in the
specified schemas when creating a publication.
It is helpful for use cases where the user wants to
subscribe to all the changes except for the changes present in a few
schemas.
Ex:
CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;

A new column pnskip is added to the catalog table "pg_publication_namespace" to
record the schemas that the user wants to skip publishing through
the publication. The output plugin (pgoutput) is modified to skip
publishing a change if the relation belongs to a schema that the
publication skips.
As a continuation of this, I will work on skipping individual tables
in an ALL TABLES IN SCHEMA publication and in an ALL TABLES
publication.
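
To make the pgoutput behaviour described above concrete, here is a minimal,
self-contained C sketch of the decision. It is not code from the patch:
SketchPublication and sketch_should_publish() are hypothetical names, and
schemas are compared by name only for brevity (the catalog identifies schemas
by OID).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a FOR ALL TABLES publication. */
typedef struct SketchPublication
{
    const char *const *skip_schemas;    /* schema names to skip */
    size_t      nskip;                  /* number of entries above */
} SketchPublication;

/* Return true if changes to a table in "schema" should be published. */
bool
sketch_should_publish(const SketchPublication *pub, const char *schema)
{
    assert(pub != NULL && schema != NULL);

    for (size_t i = 0; i < pub->nskip; i++)
    {
        if (strcmp(pub->skip_schemas[i], schema) == 0)
            return false;       /* schema is on the skip list */
    }

    return true;                /* ALL TABLES publishes everything else */
}
```

With a skip list of s1 and s2, a change to s1.tbl would be dropped while a
change to public.tbl would still be sent.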

Attached patch has the implementation for this.
This feature is for the pg16 version.
Thoughts?

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Hi,
>
> This feature adds an option to skip changes of all tables in specified
> schema while creating publication.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> schemas.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
>
> A new column pnskip is added to table "pg_publication_namespace", to
> maintain the schemas that the user wants to skip publishing through
> the publication. Modified the output plugin (pgoutput) to skip
> publishing the changes if the relation is part of skip schema
> publication.
> As a continuation to this, I will work on implementing skipping tables
> from all tables in schema and skipping tables from all tables
> publication.
>
> Attached patch has the implementation for this.

The patch was not applying on top of HEAD because of recent
commits; the attached patch is rebased on top of HEAD.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Hi,
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
> > A new column pnskip is added to table "pg_publication_namespace", to
> > maintain the schemas that the user wants to skip publishing through
> > the publication. Modified the output plugin (pgoutput) to skip
> > publishing the changes if the relation is part of skip schema
> > publication.
> > As a continuation to this, I will work on implementing skipping tables
> > from all tables in schema and skipping tables from all tables
> > publication.
> >
> > Attached patch has the implementation for this.
>
> The patch was not applying on top of HEAD because of the recent
> commits, attached patch is rebased on top of HEAD.

The patch does not apply on top of HEAD because of a recent commit;
the attached patch is rebased on top of HEAD.

I have also included the implementation for skipping a few tables from
an ALL TABLES publication; the 0002 patch has the implementation for
this.
This feature is helpful for use cases where the user wants to
subscribe to all the changes except for the changes present in a few
tables.
Ex:
CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
OR
ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, Apr 12, 2022 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > This feature adds an option to skip changes of all tables in specified
> > > schema while creating publication.
> > > This feature is helpful for use cases where the user wants to
> > > subscribe to all the changes except for the changes present in a few
> > > schemas.
> > > Ex:
> > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > > OR
> > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> > >
> > > A new column pnskip is added to table "pg_publication_namespace", to
> > > maintain the schemas that the user wants to skip publishing through
> > > the publication. Modified the output plugin (pgoutput) to skip
> > > publishing the changes if the relation is part of skip schema
> > > publication.
> > > As a continuation to this, I will work on implementing skipping tables
> > > from all tables in schema and skipping tables from all tables
> > > publication.
> > >
> > > Attached patch has the implementation for this.
> >
> > The patch was not applying on top of HEAD because of the recent
> > commits, attached patch is rebased on top of HEAD.
>
> The patch does not apply on top of HEAD because of the recent commit,
> attached patch is rebased on top of HEAD.
>
> I have also included the implementation for skipping a few tables from
> all tables publication, the 0002 patch has the implementation for the
> same.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> tables.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>

For the second syntax (Alter Publication ...), isn't it better to
avoid using ADD? It looks odd to me because we are not adding anything
in publication with this syntax.


-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 12, 2022 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > This feature adds an option to skip changes of all tables in specified
> > > > schema while creating publication.
> > > > This feature is helpful for use cases where the user wants to
> > > > subscribe to all the changes except for the changes present in a few
> > > > schemas.
> > > > Ex:
> > > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > > > OR
> > > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> > > >
> > > > A new column pnskip is added to table "pg_publication_namespace", to
> > > > maintain the schemas that the user wants to skip publishing through
> > > > the publication. Modified the output plugin (pgoutput) to skip
> > > > publishing the changes if the relation is part of skip schema
> > > > publication.
> > > > As a continuation to this, I will work on implementing skipping tables
> > > > from all tables in schema and skipping tables from all tables
> > > > publication.
> > > >
> > > > Attached patch has the implementation for this.
> > >
> > > The patch was not applying on top of HEAD because of the recent
> > > commits, attached patch is rebased on top of HEAD.
> >
> > The patch does not apply on top of HEAD because of the recent commit,
> > attached patch is rebased on top of HEAD.
> >
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
> >
>
> For the second syntax (Alter Publication ...), isn't it better to
> avoid using ADD? It looks odd to me because we are not adding anything
> in publication with this syntax.

I was thinking of the scenario where the user initially creates the
publication for all tables:
CREATE PUBLICATION pub1 FOR ALL TABLES;

Later the user decides to skip a few tables, e.g. t1 and t2:
ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;

I thought of supporting this syntax in case the user decides to add the
skipping of a few tables later.
Thoughts?

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, Apr 12, 2022 at 4:17 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > For the second syntax (Alter Publication ...), isn't it better to
> > avoid using ADD? It looks odd to me because we are not adding anything
> > in publication with this syntax.
>
> I was thinking of the scenario where user initially creates the
> publication for all tables:
> CREATE PUBLICATION pub1 FOR ALL TABLES;
>
> After that user decides to skip few tables ex: t1, t2
>  ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;
>
> I thought of supporting this syntax if incase user decides to add the
> skipping of a few tables later.
>

I understand that part but what I pointed out was that it might be
better to avoid using ADD keyword in this syntax like: ALTER
PUBLICATION pub1 SKIP TABLE t1,t2;

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 12, 2022 at 4:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > For the second syntax (Alter Publication ...), isn't it better to
> > > avoid using ADD? It looks odd to me because we are not adding anything
> > > in publication with this syntax.
> >
> > I was thinking of the scenario where user initially creates the
> > publication for all tables:
> > CREATE PUBLICATION pub1 FOR ALL TABLES;
> >
> > After that user decides to skip few tables ex: t1, t2
> >  ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;
> >
> > I thought of supporting this syntax if incase user decides to add the
> > skipping of a few tables later.
> >
>
> I understand that part but what I pointed out was that it might be
> better to avoid using ADD keyword in this syntax like: ALTER
> PUBLICATION pub1 SKIP TABLE t1,t2;

Currently we are supporting Alter publication using the following syntax:
ALTER PUBLICATION pub1 ADD TABLE t1;
ALTER PUBLICATION pub1 SET TABLE t1;
ALTER PUBLICATION pub1 DROP TABLE T1;
ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1;
ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1;
ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1;

I have extended the new syntax along similar lines:
ALTER PUBLICATION pub1 ADD SKIP TABLE t1;
ALTER PUBLICATION pub1 SET SKIP TABLE t1;
ALTER PUBLICATION pub1 DROP SKIP TABLE T1;

I did it like this to maintain consistency.
But I'm fine doing it either way to keep it simple for the user.
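
For readers comparing the three verbs, here is a minimal C sketch of the list
semantics they would have if the SKIP variants mirror the existing ADD / SET /
DROP TABLE forms: ADD extends the list, SET replaces it, DROP removes entries.
That mirroring is an assumption based on the consistency argument above; the
SketchSkipList type and sketch_* functions are hypothetical and are not taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_SKIPS 16

/* Hypothetical in-memory skip list of table names. */
typedef struct SketchSkipList
{
    const char *names[SKETCH_MAX_SKIPS];
    size_t      count;
} SketchSkipList;

/* Return true if "name" is already on the list. */
bool
sketch_contains(const SketchSkipList *list, const char *name)
{
    for (size_t i = 0; i < list->count; i++)
        if (strcmp(list->names[i], name) == 0)
            return true;
    return false;
}

/* ADD: keep existing entries and append one; duplicates are rejected. */
bool
sketch_add(SketchSkipList *list, const char *name)
{
    if (sketch_contains(list, name) || list->count == SKETCH_MAX_SKIPS)
        return false;
    list->names[list->count++] = name;
    return true;
}

/* SET: discard the existing entries and install the given ones. */
void
sketch_set(SketchSkipList *list, const char *const *names, size_t n)
{
    assert(n <= SKETCH_MAX_SKIPS);
    list->count = 0;
    for (size_t i = 0; i < n; i++)
        sketch_add(list, names[i]);
}

/* DROP: remove one entry; returns false if it was not on the list. */
bool
sketch_drop(SketchSkipList *list, const char *name)
{
    for (size_t i = 0; i < list->count; i++)
    {
        if (strcmp(list->names[i], name) == 0)
        {
            /* Order is not preserved: move the last entry into the gap. */
            list->names[i] = list->names[--list->count];
            return true;
        }
    }
    return false;
}
```

Under this reading, ADD SKIP TABLE t1 followed by SET SKIP TABLE t3 leaves only
t3 skipped, whereas two ADDs would leave both tables skipped.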

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Wed, Apr 13, 2022 at 8:45 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I understand that part but what I pointed out was that it might be
> > better to avoid using ADD keyword in this syntax like: ALTER
> > PUBLICATION pub1 SKIP TABLE t1,t2;
>
> Currently we are supporting Alter publication using the following syntax:
> ALTER PUBLICATION pub1 ADD TABLE t1;
> ALTER PUBLICATION pub1 SET TABLE t1;
> ALTER PUBLICATION pub1 DROP TABLE T1;
> ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1;
> ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1;
> ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1;
>
> I have extended the new syntax in similar lines:
> ALTER PUBLICATION pub1 ADD SKIP TABLE t1;
> ALTER PUBLICATION pub1 SET SKIP TABLE t1;
> ALTER PUBLICATION pub1 DROP SKIP TABLE T1;
>
> I did it like this to maintain consistency.
>

What is the difference between ADD and SET variants? I understand we
need some way to remove the SKIP table setting but not sure if DROP is
the best alternative.

The other ideas could be:
To set skip tables: ALTER PUBLICATION pub1 SKIP TABLE t1, t2...;
To reset skip tables: ALTER PUBLICATION pub1 SKIP TABLE; /* basically
an empty list*/
Yet another way to reset skip tables: ALTER PUBLICATION pub1 RESET
SKIP TABLE; /* Here we need to introduce RESET. */

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Wed, Apr 13, 2022 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 13, 2022 at 8:45 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I understand that part but what I pointed out was that it might be
> > > better to avoid using ADD keyword in this syntax like: ALTER
> > > PUBLICATION pub1 SKIP TABLE t1,t2;
> >
> > Currently we are supporting Alter publication using the following syntax:
> > ALTER PUBLICATION pub1 ADD TABLE t1;
> > ALTER PUBLICATION pub1 SET TABLE t1;
> > ALTER PUBLICATION pub1 DROP TABLE T1;
> > ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1;
> > ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1;
> > ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1;
> >
> > I have extended the new syntax in similar lines:
> > ALTER PUBLICATION pub1 ADD SKIP TABLE t1;
> > ALTER PUBLICATION pub1 SET SKIP TABLE t1;
> > ALTER PUBLICATION pub1 DROP SKIP TABLE T1;
> >
> > I did it like this to maintain consistency.
> >
>
> What is the difference between ADD and SET variants? I understand we
> need some way to remove the SKIP table setting but not sure if DROP is
> the best alternative.
>
> The other ideas could be:
> To set skip tables: ALTER PUBLICATION pub1 SKIP TABLE t1, t2...;
> To reset skip tables: ALTER PUBLICATION pub1 SKIP TABLE; /* basically
> an empty list*/
> Yet another way to reset skip tables: ALTER PUBLICATION pub1 RESET
> SKIP TABLE; /* Here we need to introduce RESET. */
>

When you were talking about SKIP TABLE then I liked the idea of:

ALTER ... SET SKIP TABLE; /* empty list to reset the table skips */
ALTER ... SET SKIP TABLE t1,t2; /* non-empty list to replace the table skips */

But when you apply that rule to SKIP ALL TABLES IN SCHEMA, then the
reset syntax looks too awkward.

ALTER ... SET SKIP ALL TABLES IN SCHEMA; /* empty list to reset the
schema skips */
ALTER ... SET SKIP ALL TABLES IN SCHEMA s1,s2; /* non-empty list to
replace the schema skips */

~~~

IMO it might be simpler to do it like:

ALTER ... DROP SKIP; /* reset/remove the skip */
ALTER ... SET SKIP TABLE t1,t2; /* non-empty list to replace table skips */
ALTER ... SET SKIP ALL TABLES IN SCHEMA s1,s2; /* non-empty list to
replace schema skips */

I don't really think that the ALTER ... SET SKIP empty list should be
supported (for the reason above).
I don't really think that the ALTER ... ADD SKIP should be supported.

===

More questions: what happens if the skipped table or skipped schema no
longer exists? Does that mean an error? Maybe there is a
dependency on it, but OTOH that might be annoying, e.g. disallowing a
DROP TABLE when the only dependency was that the user wanted to SKIP
it...

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Skipping schema changes in publication

From
"shiy.fnst@fujitsu.com"
Date:
On Tue, Apr 12, 2022 2:23 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> The patch does not apply on top of HEAD because of the recent commit,
> attached patch is rebased on top of HEAD.
> 

Thanks for your patch. Here are some comments for 0001 patch.

1. doc/src/sgml/catalogs.sgml
@@ -6438,6 +6438,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
        A null value indicates that all columns are published.
       </para></entry>
      </row>
+
+    <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pnskip</structfield> <type>bool</type>
+      </para>
+      <para>
+       True if the schema is skip schema
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>

This change is added to the pg_publication_rel section; I think it should be
added to pg_publication_namespace, right?

2.
postgres=# alter publication p1 add skip all tables in schema s1,s2;
ERROR:  schema "s1" is already member of publication "p1"

This error message seems odd to me, can we improve it? Something like:
schema "s1" is already skipped in publication "p1"

3.
create table tbl (a int primary key);
create schema s1;
create schema s2;
create table s1.tbl (a int);
create publication p1 for all tables skip all tables in schema s1,s2;

postgres=# \dRp+
                               Publication p1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | t          | t       | t       | t       | t         | f
Skip tables from schemas:
    "s1"
    "s2"

postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+-----------
 p1      | public     | tbl
 p1      | s1         | tbl
(2 rows)

There shouldn't be a record of s1.tbl, since all tables in schema s1 are skipped.

I found that it is caused by the following code:

src/backend/catalog/pg_publication.c
+    foreach(cell, pubschemalist)
+    {
+        PublicationSchInfo *pubsch = (PublicationSchInfo *) lfirst(cell);
+
+        skipschemaidlist = lappend_oid(result, pubsch->oid);
+    }

The first argument to lappend_oid() seems wrong, should it be:

skipschemaidlist = lappend_oid(skipschemaidlist, pubsch->oid);
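
The effect of passing the wrong list can be reproduced outside the server with
a small C sketch. Everything here is hypothetical: SketchOidList and
sketch_lappend_oid() are only loosely modeled on PostgreSQL's List API, and the
role that "result" plays in the patch is not shown in this thread, so it is
treated as just some other list.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int SketchOid;

#define SKETCH_LIST_CAP 8

/* Hypothetical fixed-capacity list of OIDs. */
typedef struct SketchOidList
{
    SketchOid   items[SKETCH_LIST_CAP];
    size_t      length;
} SketchOidList;

/* Append "oid" to "list" and return that same list. */
SketchOidList *
sketch_lappend_oid(SketchOidList *list, SketchOid oid)
{
    assert(list != NULL && list->length < SKETCH_LIST_CAP);
    list->items[list->length++] = oid;
    return list;
}

/*
 * Collect schema OIDs into the skip list.  A nonzero "wrong_target"
 * reproduces the reported mistake of appending to "result" instead.
 */
SketchOidList *
sketch_collect(SketchOidList *result, SketchOidList *skiplist,
               const SketchOid *schemas, size_t n, int wrong_target)
{
    for (size_t i = 0; i < n; i++)
    {
        if (wrong_target)
            skiplist = sketch_lappend_oid(result, schemas[i]);      /* bug */
        else
            skiplist = sketch_lappend_oid(skiplist, schemas[i]);    /* fix */
    }
    return skiplist;
}
```

In the buggy variant the skip list itself stays empty and the OIDs land in the
other list, which would be consistent with s1.tbl still showing up in
pg_publication_tables.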


4. src/backend/commands/publicationcmds.c

/*
 * Convert the PublicationObjSpecType list into schema oid list and
 * PublicationTable list.
 */
static void
ObjectsInPublicationToOids(List *pubobjspec_list, ParseState *pstate,
                           List **rels, List **schemas)

Should we modify the comment of ObjectsInPublicationToOids()?
"schema oid list" should be "PublicationSchInfo list".

Regards,
Shi yu


RE: Skipping schema changes in publication

From
"wangw.fnst@fujitsu.com"
Date:
On Tue, Apr 12, 2022 at 2:23 PM vignesh C <vignesh21@gmail.com> wrote:
> The patch does not apply on top of HEAD because of the recent commit,
> attached patch is rebased on top of HEAD.
Thanks for your patches.

Here are some comments for v1-0001:
1.
I found that the patch adds the following two new functions in gram.y:
preprocess_alltables_pubobj_list and check_skip_in_pubobj_list.
These two functions look similar, so could we just add one new function?
Besides, do we need the `location` parameter in the new function
preprocess_alltables_pubobj_list? It seems that "location" is not used in this
new function.
In addition, the location of the error cursor in the messages seems slightly
off. For example:
postgres=# create publication pub for all TABLES skip all tables in schema public, table test;
ERROR:  only SKIP ALL TABLES IN SCHEMA or SKIP TABLE can be specified with ALL TABLES option
LINE 1: create publication pub for all TABLES skip all tables in sch...
        ^
(The location of error cursor is under 'create')

2. I think there is a minor omission in the functions
preprocess_alltables_pubobj_list and check_skip_in_pubobj_list:
we seem to be missing the CURRENT_SCHEMA case.
For example (in function preprocess_alltables_pubobj_list):
+        /* Only SKIP ALL TABLES IN SCHEMA option supported with ALL TABLES */
+        if (pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_SCHEMA ||
+            !pubobj->skip)
maybe need to be changed like this:
+        /* Only SKIP ALL TABLES IN SCHEMA option supported with ALL TABLES */
+        if ((pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_SCHEMA &&
+            pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA) &&
+            pubobj->skip)
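
One possible reading of this comment, as a standalone C sketch: an object may
follow FOR ALL TABLES only if it is a skipped schema entry, whether the schema
is named or given as CURRENT_SCHEMA. This is an assumption covering the 0001
patch only; the enum, struct, and function below are hypothetical stand-ins for
the parser types, and the predicate states the accepted case rather than
copying either condition above verbatim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the publication object kinds named in the review. */
typedef enum SketchPubObjType
{
    SKETCH_TABLE,
    SKETCH_TABLES_IN_SCHEMA,
    SKETCH_TABLES_IN_CUR_SCHEMA
} SketchPubObjType;

/* Hypothetical stand-in for one parsed publication object. */
typedef struct SketchPubObj
{
    SketchPubObjType type;
    bool        skip;
} SketchPubObj;

/* True if the object is acceptable after FOR ALL TABLES in this sketch. */
bool
sketch_allowed_with_all_tables(const SketchPubObj *obj)
{
    bool        is_schema_obj;

    assert(obj != NULL);

    /* Treat a named schema and CURRENT_SCHEMA the same way. */
    is_schema_obj = obj->type == SKETCH_TABLES_IN_SCHEMA ||
        obj->type == SKETCH_TABLES_IN_CUR_SCHEMA;

    return is_schema_obj && obj->skip;
}
```

An error would then be raised for any object where this returns false, for
example a schema entry without SKIP or a plain table entry.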

3. I think something minor is missing in create_publication.sgml.
+    [ FOR ALL TABLES [SKIP ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA}]
 
maybe this needs to be changed to:
+    [ FOR ALL TABLES [SKIP ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA} [, ... ]]
 

4. The error message of function CreatePublication.
Does the message below need to be modified like the comment?
In addition, I think maybe "FOR/SKIP" is better.
@@ -835,18 +843,21 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
-        /* FOR ALL TABLES IN SCHEMA requires superuser */
+        /* FOR [SKIP] ALL TABLES IN SCHEMA requires superuser */
         if (list_length(schemaidlist) > 0 && !superuser())
             ereport(ERROR,
                     errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));

5.
I think something minor is missing in tab-complete.c.
+             Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "SKIP", "ALL", "TABLES", "IN", "SCHEMA"))
maybe this needs to be changed to:
+             Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "SKIP", "ALL", "TABLES", "IN", "SCHEMA"))

+              Matches("CREATE", "PUBLICATION", MatchAny, "SKIP", "FOR", "ALL", "TABLES", "IN", "SCHEMA", MatchAny)) &&
maybe this needs to be changed to:
+              Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "SKIP", "ALL", "TABLES", "IN", "SCHEMA", MatchAny)) &&
 

6.
In function get_rel_sync_entry, do we need `if (!publish)` in below code?
I think `publish` is always false here, as we delete the check for
"pub->alltables".
```
-            /*
-             * If this is a FOR ALL TABLES publication, pick the partition root
-             * and set the ancestor level accordingly.
-             */
-            if (pub->alltables)
-            {
-                ......
-            }
-
             if (!publish)
```

Regards,
Wang wei

Re: Skipping schema changes in publication

From
Peter Eisentraut
Date:
On 12.04.22 08:23, vignesh C wrote:
> I have also included the implementation for skipping a few tables from
> all tables publication, the 0002 patch has the implementation for the
> same.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> tables.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;

We have already allocated the "skip" terminology for skipping 
transactions, which is a dynamic run-time action.  We are also using the 
term "skip" elsewhere to skip locked rows, which is similarly a run-time 
action.  I think it would be confusing to use the term SKIP for DDL 
construction.

Let's find another term like "omit", "except", etc.

I would also think about this in broader terms.  For example, sometimes 
people want features like "all columns except these" in certain places. 
The syntax for those things should be similar.

That said, I'm not sure this feature is worth the trouble.  If this is 
useful, what about "whole database except these schemas"?  What about 
"create this database from this template except these schemas".  This 
could get out of hand.  I think we should encourage users to group their 
objects the way they want and not offer these complicated negative 
selection mechanisms.



Re: Skipping schema changes in publication

From
"Euler Taveira"
Date:
On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.

I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
SCHEMA and if I were to suggest a keyword, it would be EXCEPT.

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.

The questions are:
What kind of issues does it solve?
Do we have a workaround for it?

> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.

I have the same impression too. We already provide a way to:

* include individual tables;
* include all tables;
* include all tables in a certain schema.

Doesn't it cover the majority of the use cases? We don't need to cover all
possible cases in one DDL command. IMO the current grammar for CREATE
PUBLICATION is already complicated after the ALL TABLES IN SCHEMA. You are
proposing to add "ALL TABLES SKIP ALL TABLES" that sounds repetitive but it is
not; doesn't seem well-thought-out. I'm also concerned about possible gotchas
for this proposal. The first command above suggests that it skips all tables in a
certain schema. What happens if I decide to include a particular table of the
skipped schema (second command)?

ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
ALTER PUBLICATION pub1 ADD TABLE s1.foo;

Having said that I'm not wedded to this proposal. Unless someone provides
compelling use cases for this additional syntax, I think we should leave the
publication syntax as is.


--
Euler Taveira

Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
>
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.
>
> I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
> SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
>

+1 for EXCEPT.

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.
>
> The questions are:
> What kind of issues does it solve?

As far as I understand, it is for usability; otherwise, users need to
list all the required column names even if they want to hide only a few
of the columns in the table. Consider a user who doesn't want to publish the
'salary' or other sensitive information of executives/employees but
would like to publish all other columns. I feel that in such cases it will
be a lot of work for the user, especially when the table has many
columns. I see that Oracle has a similar feature [1]. I think that without
this it will be difficult for users to use this feature in some cases.

> Do we have a workaround for it?
>

I can't think of any, except that the user manually inputs all the
required columns. Can you think of any other workaround?

> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.
>
> I have the same impression too. We already provide a way to:
>
> * include individual tables;
> * include all tables;
> * include all tables in a certain schema.
>
> Doesn't it cover the majority of the use cases?
>

Similar to columns, the same applies to tables. Users need to manually
add all tables of a database even when they want to exclude only a
handful of tables, say because those tables contain sensitive
information or are not required. I think we don't need to cover all
possible exceptions, but covering a few where users can exclude some
tables would be useful. If not, what kind of alternative do users have
except for listing all columns or all tables that are required?


[1] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Thu, Apr 14, 2022 at 7:18 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.
>
> Let's find another term like "omit", "except", etc.

+1 for EXCEPT

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.
>
> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.

I thought this feature would help when there are very many tables in
the database and the user wants to exclude only certain confidential
tables, such as credit card information. In this case, instead of
specifying the whole table list, it would be better to specify
"ALL TABLES EXCEPT cred_info_tbl".
I have seen that MySQL also has a similar option, replicate-ignore-table,
to ignore the changes on specific tables, as mentioned in [1].
A similar use case exists in pg_dump too. pg_dump has an option,
--exclude-table, for not dumping any tables that match the specified
pattern, as described in [2].

[1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html
[2] - https://www.postgresql.org/docs/devel/app-pgdump.html

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, Apr 18, 2022 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote:
> >
> > On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
> >
> > On 12.04.22 08:23, vignesh C wrote:
> > > I have also included the implementation for skipping a few tables from
> > > all tables publication, the 0002 patch has the implementation for the
> > > same.
> > > This feature is helpful for use cases where the user wants to
> > > subscribe to all the changes except for the changes present in a few
> > > tables.
> > > Ex:
> > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > > OR
> > > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;
> >
> > We have already allocated the "skip" terminology for skipping
> > transactions, which is a dynamic run-time action.  We are also using the
> > term "skip" elsewhere to skip locked rows, which is similarly a run-time
> > action.  I think it would be confusing to use the term SKIP for DDL
> > construction.
> >
> > I didn't like the SKIP choice either. We already have EXCEPT for IMPORT FOREIGN
> > SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
> >
>
> +1 for EXCEPT.

Updated patch by changing the syntax to use EXCEPT instead of SKIP.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Bharath Rupireddy
Date:
On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Hi,
>
> This feature adds an option to skip changes of all tables in specified
> schema while creating publication.
> This feature is helpful for use cases where the user wants to
> subscribe to all the changes except for the changes present in a few
> schemas.
> Ex:
> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> OR
> ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
>
> A new column pnskip is added to table "pg_publication_namespace", to
> maintain the schemas that the user wants to skip publishing through
> the publication. Modified the output plugin (pgoutput) to skip
> publishing the changes if the relation is part of skip schema
> publication.
> As a continuation to this, I will work on implementing skipping tables
> from all tables in schema and skipping tables from all tables
> publication.
>
> Attached patch has the implementation for this.
> This feature is for the pg16 version.
> Thoughts?

The feature seems to be useful, especially when there are lots of
schemas in a database. However, I don't quite like the syntax. Do we
have a 'SKIP' keyword in any SQL statement in the SQL standard?
Can we think of adding skip_schema_list as an option, something like
below?

CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); -- to set
ALTER PUBLICATION foo SET (skip_schema_list = ''); -- to reset

Regards,
Bharath Rupireddy.



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Sat, Apr 23, 2022 at 2:09 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Hi,
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
> > A new column pnskip is added to table "pg_publication_namespace", to
> > maintain the schemas that the user wants to skip publishing through
> > the publication. Modified the output plugin (pgoutput) to skip
> > publishing the changes if the relation is part of skip schema
> > publication.
> > As a continuation to this, I will work on implementing skipping tables
> > from all tables in schema and skipping tables from all tables
> > publication.
> >
> > Attached patch has the implementation for this.
> > This feature is for the pg16 version.
> > Thoughts?
>
> The feature seems to be useful especially when there are lots of
> schemas in a database. However, I don't quite like the syntax. Do we
> have 'SKIP' identifier in any of the SQL statements in SQL standard?
> Can we think of adding skip_schema_list as an option, something like
> below?
>
> CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
> ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
> ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
>

I had been wondering for some time if there was any way to introduce
more flexible pattern matching into PUBLICATION without bloating
the syntax. Maybe your idea of using an option for the "skip" gives a
way to do it...

For example, if we could use regex (for <schemaname>.<tablename>
patterns) for the option value then....

~~

e.g.1. Exclude certain tables:

// do NOT publish any tables of schemas s1,s2
CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(s1\..*)|(s2\..*)');

// do NOT publish my secret tables (those called "mysecretXXX")
CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(.*\.mysecret.*)');

~~

e.g.2. Only allow certain tables.

// ONLY publish my tables (those called "mytableXXX")
CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(.*\.mytable.*)');

// So following is equivalent to FOR ALL TABLES IN SCHEMA s1
CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(s1\..*)');
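
A minimal sketch of how such pattern options might be evaluated, assuming POSIX extended regular expressions and whole-name matching against <schemaname>.<tablename>. The option names exclude_match and subset_match come from the examples above; name_matches and is_published are hypothetical helpers for illustration, not anything in PostgreSQL or in the patch.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: true if the whole qualified name ("schema.table")
 * matches the POSIX extended regular expression in pattern.
 */
static bool
name_matches(const char *pattern, const char *qualified_name)
{
    char        anchored[512];
    regex_t     re;
    bool        matched;
    int         len;

    /* Anchor the pattern so it must cover the entire name. */
    len = snprintf(anchored, sizeof(anchored), "^(%s)$", pattern);
    if (len < 0 || (size_t) len >= sizeof(anchored))
        return false;           /* pattern too long for this sketch */

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;           /* treat an invalid pattern as "no match" */

    matched = (regexec(&re, qualified_name, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/*
 * Hypothetical decision for a FOR ALL TABLES publication.  Either pattern
 * may be NULL, meaning the option was not given.
 */
bool
is_published(const char *exclude_match, const char *subset_match,
             const char *qualified_name)
{
    /* subset_match: only names that match are candidates at all. */
    if (subset_match != NULL && !name_matches(subset_match, qualified_name))
        return false;

    /* exclude_match: matching names are dropped from the publication. */
    if (exclude_match != NULL && name_matches(exclude_match, qualified_name))
        return false;

    return true;
}
```

Note that in a C string literal each backslash in the pattern has to be doubled, so the first exclude pattern above would be written as "(s1\\..*)|(s2\\..*)".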

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote:
> Updated patch by changing the syntax to use EXCEPT instead of SKIP.
Hi


Here are my review comments on the v2 patch.

(1) gram.y

I think we can make a unified function that merges
preprocess_alltables_pubobj_list with check_except_in_pubobj_list.

With regard to preprocess_alltables_pubobj_list,
we don't use the 2nd argument "location" in this function.

(2) create_publication.sgml

+  <para>
+   Create a publication that publishes all changes in all the tables except for
+   the changes of <structname>users</structname> and
+   <structname>departments</structname> table;

This sentence should end with ":", not ";".

(3) publication.out & publication.sql

+-- fail - can't set except table to schema  publication
+ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1;

There is one unnecessary space in the comment.
Kindly change from "schema  publication" to "schema publication".

(4) pg_dump.c & describe.c

In your first email of this thread, you explained that this feature
is for PG16. Don't we need an additional branch for PG16?

@@ -6322,6 +6328,21 @@ describePublications(const char *pattern)
                        }
                }

+               if (pset.sversion >= 150000)
+               {


@@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
        /* Collect all publication membership info. */
        if (fout->remoteVersion >= 150000)
                appendPQExpBufferStr(query,
-                                                        "SELECT tableoid, oid, prpubid, prrelid, "
+                                                        "SELECT tableoid, oid, prpubid, prrelid, prexcept,"
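
A minimal sketch of the kind of version branch being asked for, assuming the prexcept column first appears in PG16 (version number 160000). The helper name build_pubrel_query, the macro, and the fallback of selecting a constant false on older servers are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdio.h>

/* Assumption: first server version that has the prexcept column (PG16). */
#define FIRST_VERSION_WITH_PREXCEPT 160000

/*
 * Hypothetical helper: write into buf a query on pg_publication_rel that is
 * valid for a server of the given version.  Older servers have no prexcept
 * column, so a constant false is selected under the same name instead.
 * Returns the snprintf() result.
 */
int
build_pubrel_query(char *buf, size_t buflen, int server_version)
{
    const char *except_col;

    if (server_version >= FIRST_VERSION_WITH_PREXCEPT)
        except_col = "prexcept";
    else
        except_col = "false AS prexcept";

    return snprintf(buf, buflen,
                    "SELECT tableoid, oid, prpubid, prrelid, %s "
                    "FROM pg_catalog.pg_publication_rel",
                    except_col);
}
```

With this shape the result set has a prexcept column for every server version, so the caller can read it the same way regardless of which branch was taken.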


(5) psql-ref.sgml

+        If <literal>+</literal> is appended to the command name, the tables,
+        except tables and schemas associated with each publication are shown as
+        well.

I'm not sure if "except tables" is a good description.
I suggest "excluded tables". If this suggestion is reasonable,
it applies to the entire patch.


Best Regards,
    Takamichi Osumi


Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, Apr 26, 2022 at 11:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote:
> > Updated patch by changing the syntax to use EXCEPT instead of SKIP.
> Hi
>
>
> This is my review comments on the v2 patch.
>
> (1) gram.y
>
> I think we can make a unified function that merges
> preprocess_alltables_pubobj_list with check_except_in_pubobj_list.
>
> With regard to preprocess_alltables_pubobj_list,
> we don't use the 2nd argument "location" in this function.

Removed location and made a unified function.

> (2) create_publication.sgml
>
> +  <para>
> +   Create a publication that publishes all changes in all the tables except for
> +   the changes of <structname>users</structname> and
> +   <structname>departments</structname> table;
>
> This sentence should end ":" not ";".

Modified

> (3) publication.out & publication.sql
>
> +-- fail - can't set except table to schema  publication
> +ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1;
>
> There is one unnecessary space in the comment.
> Kindly change from "schema  publication" to "schema publication".

Modified

> (4) pg_dump.c & describe.c
>
> In your first email of this thread, you explained this feature
> is for PG16. Don't we need additional branch for PG16 ?
>
> @@ -6322,6 +6328,21 @@ describePublications(const char *pattern)
>                         }
>                 }
>
> +               if (pset.sversion >= 150000)
> +               {
>
>
> @@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
>         /* Collect all publication membership info. */
>         if (fout->remoteVersion >= 150000)
>                 appendPQExpBufferStr(query,
> -                                                        "SELECT tableoid, oid, prpubid, prrelid, "
> +                                                        "SELECT tableoid, oid, prpubid, prrelid, prexcept,"
>

Modified by adding a comment saying "FIXME: 150000 should be changed
to 160000 later for PG16."

> (5) psql-ref.sgml
>
> +        If <literal>+</literal> is appended to the command name, the tables,
> +        except tables and schemas associated with each publication are shown as
> +        well.
>
> I'm not sure if "except tables" is a good description.
> I suggest "excluded tables". This applies to the entire patch,
> in case if this is reasonable suggestion.

Modified it in most of the places where it was applicable. I felt the
usage was ok in a few places.

Thanks for the comments, the attached v3 patch has the changes for the same.

Regards,
Vignesh

Attachments

RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the comments, the attached v3 patch has the changes for the same.
Hi

Thank you for updating the patch. Several minor comments on v3.

(1) commit message

The new syntax allows specifying schemas. For example:
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
OR
ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2;

We have the above sentence, but it would be better
to make the description a bit more accurate.

Kindly change
From :
"The new syntax allows specifying schemas"
To :
"The new syntax allows specifying excluded relations"

Also, kindly change "OR" to "or",
because this description is not syntax.

(2) publication_add_relation

@@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
                ObjectIdGetDatum(pubid);
        values[Anum_pg_publication_rel_prrelid - 1] =
                ObjectIdGetDatum(relid);
+       values[Anum_pg_publication_rel_prexcept - 1] =
+               BoolGetDatum(pri->except);
+

        /* Add qualifications, if available */

It would be better to remove the blank line,
because with this change, we'll have two blank
lines in a row.

(3) pg_dump.h & pg_dump_sort.c

@@ -80,6 +80,7 @@ typedef enum
        DO_REFRESH_MATVIEW,
        DO_POLICY,
        DO_PUBLICATION,
+       DO_PUBLICATION_EXCEPT_REL,
        DO_PUBLICATION_REL,
        DO_PUBLICATION_TABLE_IN_SCHEMA,
        DO_SUBSCRIPTION

@@ -90,6 +90,7 @@ enum dbObjectTypePriorities
        PRIO_FK_CONSTRAINT,
        PRIO_POLICY,
        PRIO_PUBLICATION,
+       PRIO_PUBLICATION_EXCEPT_REL,
        PRIO_PUBLICATION_REL,
        PRIO_PUBLICATION_TABLE_IN_SCHEMA,
        PRIO_SUBSCRIPTION,
@@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] =
        PRIO_REFRESH_MATVIEW,           /* DO_REFRESH_MATVIEW */
        PRIO_POLICY,                            /* DO_POLICY */
        PRIO_PUBLICATION,                       /* DO_PUBLICATION */
+       PRIO_PUBLICATION_EXCEPT_REL,    /* DO_PUBLICATION_EXCEPT_REL */
        PRIO_PUBLICATION_REL,           /* DO_PUBLICATION_REL */
        PRIO_PUBLICATION_TABLE_IN_SCHEMA,       /* DO_PUBLICATION_TABLE_IN_SCHEMA */
        PRIO_SUBSCRIPTION                       /* DO_SUBSCRIPTION */

How about keeping the order consistent between
pg_dump.h and pg_dump_sort.c, e.g. by adding
DO_PUBLICATION_EXCEPT_REL after DO_PUBLICATION_REL
in pg_dump.h?
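
The priority array quoted above is positional: each entry is tied to an enum member only by its position and a trailing comment. A minimal sketch of one way to make such a table independent of declaration order, using C99 designated initializers. All names here (ObjType, OBJ_*, SKETCH_PRIO_*, obj_type_priority, priority_of) are hypothetical, shortened stand-ins and the priority values are made up; this is not how pg_dump_sort.c is written.

```c
#include <stddef.h>

/* Hypothetical, shortened stand-in for an object-type enum. */
typedef enum
{
    OBJ_PUBLICATION,
    OBJ_PUBLICATION_REL,
    OBJ_PUBLICATION_EXCEPT_REL,
    OBJ_SUBSCRIPTION,
    OBJ_TYPE_COUNT              /* must stay last */
} ObjType;

/* Hypothetical sort priorities: a lower value sorts earlier. */
enum
{
    SKETCH_PRIO_PUBLICATION = 1,
    SKETCH_PRIO_PUBLICATION_EXCEPT_REL,
    SKETCH_PRIO_PUBLICATION_REL,
    SKETCH_PRIO_SUBSCRIPTION
};

/* Designated initializers bind each priority to its enum member by name. */
static const int obj_type_priority[] = {
    [OBJ_PUBLICATION] = SKETCH_PRIO_PUBLICATION,
    [OBJ_PUBLICATION_REL] = SKETCH_PRIO_PUBLICATION_REL,
    [OBJ_PUBLICATION_EXCEPT_REL] = SKETCH_PRIO_PUBLICATION_EXCEPT_REL,
    [OBJ_SUBSCRIPTION] = SKETCH_PRIO_SUBSCRIPTION
};

/* Fail the build if the table length and the enum ever disagree. */
_Static_assert(sizeof(obj_type_priority) / sizeof(obj_type_priority[0]) ==
               OBJ_TYPE_COUNT,
               "priority table must cover every object type");

/* Look up the sort priority for an object type. */
int
priority_of(ObjType type)
{
    return obj_type_priority[type];
}
```

In this form the enum members can be declared in either order without changing which priority each one receives.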


(4) GetAllTablesPublicationRelations

+       /*
+        * pg_publication_rel and pg_publication_namespace  will only have except
+        * tables in case of all tables publication, no need to pass except flag
+        * to get the relations.
+        */
+       List       *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
+

There is one unnecessary space in a comment
"...pg_publication_namespace  will only have...". Kindly remove it.

Then, how about separating the variable declaration from
the assignment of the return value of GetPublicationRelations?
That would be aligned with other places in this file.

(5) GetTopMostAncestorInPublication


@@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
        foreach(lc, ancestors)
        {
                Oid                     ancestor = lfirst_oid(lc);
-               List       *apubids = GetRelationPublications(ancestor);
+               List       *apubids = GetRelationPublications(ancestor, false);
                List       *aschemaPubids = NIL;
+               List       *aexceptpubids;

                level++;

@@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
                else
                {
                        aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
-                       if (list_member_oid(aschemaPubids, puboid))
+                       aexceptpubids = GetRelationPublications(ancestor, true);
+                       if (list_member_oid(aschemaPubids, puboid) ||
+                               (puballtables && !list_member_oid(aexceptpubids, puboid)))
                        {
                                topmost_relid = ancestor;

It seems we forgot to call list_free for "aexceptpubids".


Best Regards,
    Takamichi Osumi


Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Fri, Apr 22, 2022 at 9:39 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > This feature adds an option to skip changes of all tables in specified
> > schema while creating publication.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > schemas.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
> >
>
> The feature seems to be useful especially when there are lots of
> schemas in a database. However, I don't quite like the syntax. Do we
> have 'SKIP' identifier in any of the SQL statements in SQL standard?
>

After discussion, it seems EXCEPT is a preferred choice and the same
is used in the other existing syntax as well.

> Can we think of adding skip_schema_list as an option, something like
> below?
>
> CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2');
> ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set
> ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset
>

Yeah, that is also an option, but it seems it will be difficult to
extend if we want to support "all columns except (c1, ..)" for the
column list feature.

The other thing to decide is which objects should support the EXCEPT
clause, as it may not be useful for everything, as indicated by
Peter E. and Euler. We have seen that Oracle supports "all columns
except (c1, ..)" [1] and MySQL seems to support it for tables [2]. I
guess we should restrict ourselves to those two cases for now and then
we can extend it later for schemas if required or if people agree. Also,
we should make sure the syntax we choose here is extensible.

Another idea that occurred to me today for tables is as follows:
1. Allow EXCEPT to be specified in CREATE PUBLICATION ... FOR ALL TABLES.
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
2. Allow resetting it. This new syntax will reset all objects in the
publication.
ALTER PUBLICATION ... RESET;
3. Allow adding it to an existing publication.
ALTER PUBLICATION ... ADD ALL TABLES [EXCEPT TABLE t1,t2];

I think it can be extended in a similar way for schema syntax as well.

[1] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20
[2] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Thu, Apr 28, 2022 at 4:50 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, the attached v3 patch has the changes for the same.
> Hi
>
> Thank you for updating the patch. Several minor comments on v3.
>
> (1) commit message
>
> The new syntax allows specifying schemas. For example:
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> OR
> ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2;
>
> We have above sentence, but it looks better
> to make the description a bit more accurate.
>
> Kindly change
> From :
> "The new syntax allows specifying schemas"
> To :
> "The new syntax allows specifying excluded relations"
>
> Also, kindly change "OR" to "or",
> because this description is not syntax.

Slightly reworded and modified

> (2) publication_add_relation
>
> @@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri,
>                 ObjectIdGetDatum(pubid);
>         values[Anum_pg_publication_rel_prrelid - 1] =
>                 ObjectIdGetDatum(relid);
> +       values[Anum_pg_publication_rel_prexcept - 1] =
> +               BoolGetDatum(pri->except);
> +
>
>         /* Add qualifications, if available */
>
> It would be better to remove the blank line,
> because with this change, we'll have two blank
> lines in a row.

Modified

> (3) pg_dump.h & pg_dump_sort.c
>
> @@ -80,6 +80,7 @@ typedef enum
>         DO_REFRESH_MATVIEW,
>         DO_POLICY,
>         DO_PUBLICATION,
> +       DO_PUBLICATION_EXCEPT_REL,
>         DO_PUBLICATION_REL,
>         DO_PUBLICATION_TABLE_IN_SCHEMA,
>         DO_SUBSCRIPTION
>
> @@ -90,6 +90,7 @@ enum dbObjectTypePriorities
>         PRIO_FK_CONSTRAINT,
>         PRIO_POLICY,
>         PRIO_PUBLICATION,
> +       PRIO_PUBLICATION_EXCEPT_REL,
>         PRIO_PUBLICATION_REL,
>         PRIO_PUBLICATION_TABLE_IN_SCHEMA,
>         PRIO_SUBSCRIPTION,
> @@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] =
>         PRIO_REFRESH_MATVIEW,           /* DO_REFRESH_MATVIEW */
>         PRIO_POLICY,                            /* DO_POLICY */
>         PRIO_PUBLICATION,                       /* DO_PUBLICATION */
> +       PRIO_PUBLICATION_EXCEPT_REL,    /* DO_PUBLICATION_EXCEPT_REL */
>         PRIO_PUBLICATION_REL,           /* DO_PUBLICATION_REL */
>         PRIO_PUBLICATION_TABLE_IN_SCHEMA,       /* DO_PUBLICATION_TABLE_IN_SCHEMA */
>         PRIO_SUBSCRIPTION                       /* DO_SUBSCRIPTION */
>
> How about having similar order between
> pg_dump.h and pg_dump_sort.c, like
> we'll add DO_PUBLICATION_EXCEPT_REL
> after DO_PUBLICATION_REL in pg_dump.h ?
>

Modified

> (4) GetAllTablesPublicationRelations
>
> +       /*
> +        * pg_publication_rel and pg_publication_namespace  will only have except
> +        * tables in case of all tables publication, no need to pass except flag
> +        * to get the relations.
> +        */
> +       List       *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> +
>
> There is one unnecessary space in a comment
> "...pg_publication_namespace  will only have...". Kindly remove it.
>
> Then, how about diving the variable declaration and
> the insertion of the return value of GetPublicationRelations ?
> That might be aligned with other places in this file.

Modified

> (5) GetTopMostAncestorInPublication
>
>
> @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
>         foreach(lc, ancestors)
>         {
>                 Oid                     ancestor = lfirst_oid(lc);
> -               List       *apubids = GetRelationPublications(ancestor);
> +               List       *apubids = GetRelationPublications(ancestor, false);
>                 List       *aschemaPubids = NIL;
> +               List       *aexceptpubids;
>
>                 level++;
>
> @@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
>                 else
>                 {
>                         aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> -                       if (list_member_oid(aschemaPubids, puboid))
> +                       aexceptpubids = GetRelationPublications(ancestor, true);
> +                       if (list_member_oid(aschemaPubids, puboid) ||
> +                               (puballtables && !list_member_oid(aexceptpubids, puboid)))
>                         {
>                                 topmost_relid = ancestor;
>
> It seems we forgot to call list_free for "aexceptpubids".

Modified

The attached v4 patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
> Another idea that occurred to me today for tables this is as follows:
> 1. Allow to mention except during create publication ... For All Tables.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> 2. Allow to Reset it. This new syntax will reset all objects in the
> publications.
> Alter Publication ... RESET;
> 3. Allow to add it to an existing publication
> Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
>
> I think it can be extended in a similar way for schema syntax as well.
>

Consider if the user does
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4;

What does it mean?
e.g. Is there only one exception list that is modified? Or did the ADD
ALL TABLES override all meaning of the original list?
e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4?

~~~

Here is a similar example, where the ADD TABLE seems confusing to me
when it intersects with a prior EXCEPT
e.g.
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok
ALTER PUBLICATION pub1 ADD TABLE t1; ???

What does it mean?
e.g. Does the explicit ADD TABLE override the original exception list?
e.g. Is t1 published now or should that ALTER have caused an error?

~~

It feels like there are too many tricky rules when using EXCEPT with
ALTER PUBLICATION. I guess complexities can be described in the
documentation but IMO it would be better if the ALTER syntax could be
unambiguous in the first place. So perhaps the rules should be more
restrictive (e.g. just disallow ALTER ... ADD any table that overlaps
the existing EXCEPT list ??)

------
Kind Regards,
Peter Smith.
Fujitsu Australia.



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, May 3, 2022 at 2:24 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> ...
> > Another idea that occurred to me today for tables this is as follows:
> > 1. Allow to mention except during create publication ... For All Tables.
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > 2. Allow to Reset it. This new syntax will reset all objects in the
> > publications.
> > Alter Publication ... RESET;
> > 3. Allow to add it to an existing publication
> > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> >
> > I think it can be extended in a similar way for schema syntax as well.
> >
>
> Consider if the user does
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4;
>
> What does it mean?
> e.g. Is there only one exception list that is modified? Or did the ADD
> ALL TABLES override all meaning of the original list?
> e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4?
>

This won't be allowed. We won't allow changing an ALL TABLES publication
unless the user first performs RESET. This is the purpose of providing
the RESET variant.

> ~~~
>
> Here is a similar example, where the ADD TABLE seems confusing to me
> when it intersects with a prior EXCEPT
> e.g.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok
> ALTER PUBLICATION pub1 ADD TABLE t1; ???
>
> What does it mean?
> e.g. Does the explicit ADD TABLE override the original exception list?
> e.g. Is t1 published now or should that ALTER have caused an error?
>

This won't be allowed either. We don't allow ADD/DROP on an ALL TABLES
publication unless the user performs a RESET. This is true even
today, except that we don't have a RESET syntax yet.
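
A minimal sketch of the rule described in the two answers above, using a hypothetical in-memory model rather than PostgreSQL's catalogs. The Publication struct and the publication_* functions are invented for illustration. The sketch also assumes that ADD ALL TABLES is only accepted on an empty publication, which the discussion does not spell out.

```c
#include <stdbool.h>

#define MAX_TABLES 8

/* Hypothetical in-memory model of a publication; not a PostgreSQL struct. */
typedef struct
{
    bool        all_tables;     /* defined as ALL TABLES? */
    int         nexcept;        /* entries used in except_tables */
    const char *except_tables[MAX_TABLES];
    int         ntables;        /* entries used in tables */
    const char *tables[MAX_TABLES];
} Publication;

/* ALTER PUBLICATION ... RESET: forget every object in the publication. */
void
publication_reset(Publication *pub)
{
    pub->all_tables = false;
    pub->nexcept = 0;
    pub->ntables = 0;
}

/* ALTER PUBLICATION ... ADD TABLE: rejected for an ALL TABLES publication. */
bool
publication_add_table(Publication *pub, const char *table)
{
    if (pub->all_tables || pub->ntables >= MAX_TABLES)
        return false;
    pub->tables[pub->ntables++] = table;
    return true;
}

/*
 * ALTER PUBLICATION ... ADD ALL TABLES [EXCEPT TABLE ...].
 * Rejected if the publication is already ALL TABLES; the user must RESET
 * first.  Assumption of this sketch: also rejected if explicit tables exist.
 */
bool
publication_add_all_tables(Publication *pub, const char **except_list,
                           int nexcept)
{
    if (pub->all_tables || pub->ntables > 0)
        return false;
    if (nexcept < 0 || nexcept > MAX_TABLES)
        return false;

    for (int i = 0; i < nexcept; i++)
        pub->except_tables[i] = except_list[i];
    pub->nexcept = nexcept;
    pub->all_tables = true;
    return true;
}
```

Under this model the exception list can only be replaced as a whole after a RESET, so the question of merging t1,t2 with t3,t4 never arises.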

> ~~
>
> It feels like there are too many tricky rules when using EXCEPT with
> ALTER PUBLICATION. I guess complexities can be described in the
> documentation but IMO it would be better if the ALTER syntax could be
> unambiguous in the first place.
>

Agreed.

> So perhaps the rules should be more
> restrictive (e.g. just disallow ALTER ... ADD any table that overlaps
> the existing EXCEPT list ??)
>

I think the current proposal seems to be restrictive enough to avoid
any tricky issues. Do you see any other problem?


-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Peter Eisentraut
Date:
On 14.04.22 15:47, Peter Eisentraut wrote:
> That said, I'm not sure this feature is worth the trouble.  If this is 
> useful, what about "whole database except these schemas"?  What about 
> "create this database from this template except these schemas".  This 
> could get out of hand.  I think we should encourage users to group their 
> objects the way they want and not offer these complicated negative
> selection mechanisms.

Another problem in general with this "all except these" way of 
specifying things is that you need to track negative dependencies.

For example, assume you can't add a table to a publication unless it has 
a replica identity.  Now, if you have a publication p1 that
includes "all tables except t1", you now have to check p1 whenever a new
table is created, even though the new table has no direct dependency 
link with p1.  So in more general cases, you would have to check all 
existing objects to see whether their specification is in conflict with 
the new object being created.

Now publications don't actually work that way, so it's not a real 
problem right now, but similar things could work like that.  So I think 
it's worth thinking this through a bit.
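
A minimal sketch of the negative-dependency concern, using a hypothetical model in which a publication is either an explicit table list or "all tables except" a list. PubSpec, pub_covers and pubs_covering_new_table are invented names; as noted above, publications do not actually perform such a check today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a publication's table specification. */
typedef struct
{
    const char *name;
    bool        all_tables_except;  /* true: names[] is the exclusion list */
    const char **names;
    size_t      nnames;
} PubSpec;

/* True if table appears in the given list of names. */
static bool
list_contains(const char **names, size_t n, const char *table)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], table) == 0)
            return true;
    return false;
}

/* Does the publication's specification cover the table? */
bool
pub_covers(const PubSpec *pub, const char *table)
{
    bool        listed = list_contains(pub->names, pub->nnames, table);

    return pub->all_tables_except ? !listed : listed;
}

/*
 * Count the publications that cover a newly created table.  The new table
 * has no link to any publication, so every publication must be visited.
 */
size_t
pubs_covering_new_table(const PubSpec *pubs, size_t npubs,
                        const char *new_table)
{
    size_t      hits = 0;

    for (size_t i = 0; i < npubs; i++)
        if (pub_covers(&pubs[i], new_table))
            hits++;
    return hits;
}
```

The explicit-list publication is unaffected by the new table, while the "all tables except t1" publication picks it up implicitly, which is the case that would force a scan if a creation-time check existed.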




Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 14.04.22 15:47, Peter Eisentraut wrote:
> > That said, I'm not sure this feature is worth the trouble.  If this is
> > useful, what about "whole database except these schemas"?  What about
> > "create this database from this template except these schemas".  This
> > could get out of hand.  I think we should encourage users to group their
> > objects the way they want and not offer these complicated negative
> > selection mechanisms.
>
> Another problem in general with this "all except these" way of
> specifying things is that you need to track negative dependencies.
>
> For example, assume you can't add a table to a publication unless it has
> a replica identity.  Now, if you have a publication p1 that says
> includes "all tables except t1", you now have to check p1 whenever a new
> table is created, even though the new table has no direct dependency
> link with p1.  So in more general cases, you would have to check all
> existing objects to see whether their specification is in conflict with
> the new object being created.
>

Yes, I think we should avoid adding such negative dependencies. We
carefully avoided them during the row filter and column list work,
where we don't try to perform DDL-time verification. However, it is
not clear to me how this proposal relates to this example, or to
tracking negative dependencies in general. AFAIR, we currently have
such a check when changing the persistence of a logged table (logged
to unlogged, see ATPrepChangePersistence), where we do not allow the
change if the relation is part of some publication. But as per my
understanding, this feature shouldn't add any such new dependencies. I
agree that we have to ensure that existing checks don't break due to
this feature.

> Now publications don't actually work that way, so it's not a real
> problem right now, but similar things could work like that.  So I think
> it's worth thinking this through a bit.
>

This is a good point and I agree that we should be careful to not add
some new negative dependencies unless it is really required but I
can't see how this proposal will make it more prone to such checks.

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Thu, May 5, 2022 at 9:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 14.04.22 15:47, Peter Eisentraut wrote:
> > > That said, I'm not sure this feature is worth the trouble.  If this is
> > > useful, what about "whole database except these schemas"?  What about
> > > "create this database from this template except these schemas".  This
> > > could get out of hand.  I think we should encourage users to group their
> > > objects the way they want and not offer these complicated negative
> > > selection mechanisms.
> >
> > Another problem in general with this "all except these" way of
> > specifying things is that you need to track negative dependencies.
> >
> > For example, assume you can't add a table to a publication unless it has
> > a replica identity.  Now, if you have a publication p1 that says
> > includes "all tables except t1", you now have to check p1 whenever a new
> > table is created, even though the new table has no direct dependency
> > link with p1.  So in more general cases, you would have to check all
> > existing objects to see whether their specification is in conflict with
> > the new object being created.
> >
>
> Yes, I think we should avoid adding such negative dependencies. We
> have carefully avoided such dependencies during row filter, column
> list work where we don't try to perform DDL time verification.
> However, it is not clear to me how this proposal is related to this
> example or in general about tracking negative dependencies?
>

I mean to say that even if we have such a restriction, it would apply
to "for all tables" or other publications as well. In your example,
if one wants to alter a table and remove its replica identity, we
would have to check whether the table is part of any publication,
similar to what we do for relation persistence in
ATPrepChangePersistence.
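
To make the dependency question concrete, here is a minimal C sketch. It is not PostgreSQL code: the struct AllTablesExceptPub and the function pub_publishes_table are hypothetical names used only for illustration. It models a publication that covers every table except a listed few, so a table created later is covered even though nothing links it to the publication.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a "FOR ALL TABLES EXCEPT ..." publication. */
typedef struct AllTablesExceptPub
{
    const char *const *except_tables;   /* names listed after EXCEPT */
    size_t      n_except;               /* 0 models plain FOR ALL TABLES */
} AllTablesExceptPub;

/* A table is published unless its name appears in the except list. */
bool
pub_publishes_table(const AllTablesExceptPub *pub, const char *table)
{
    assert(pub != NULL && table != NULL);

    for (size_t i = 0; i < pub->n_except; i++)
    {
        if (strcmp(pub->except_tables[i], table) == 0)
            return false;
    }
    return true;
}
```

With an except list of t1 and t2, a table t3 created afterwards is published without any per-table entry. The same holds when n_except is 0, which is the plain "for all tables" case mentioned above, so any DDL-time rule would have to consult the publication either way.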

> AFAIR, we
> currently have such a check while changing persistence of logged table
> (logged to unlogged, see ATPrepChangePersistence) where we cannot
> allow changing persistence if that relation is part of some
> publication. But as per my understanding, this feature shouldn't add
> any such new dependencies. I agree that we have to ensure that
> existing checks shouldn't break due to this feature.
>
> > Now publications don't actually work that way, so it's not a real
> > problem right now, but similar things could work like that.  So I think
> > it's worth thinking this through a bit.
> >
>
> This is a good point and I agree that we should be careful to not add
> some new negative dependencies unless it is really required but I
> can't see how this proposal will make it more prone to such checks.
>

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
>
> Another idea that occurred to me today for tables this is as follows:
> 1. Allow to mention except during create publication ... For All Tables.
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> 2. Allow to Reset it. This new syntax will reset all objects in the
> publications.
> Alter Publication ... RESET;
> 3. Allow to add it to an existing publication
> Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
>
> I think it can be extended in a similar way for schema syntax as well.
>

If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
objects in the publication then there still seems to be no simple way
to remove only the EXCEPT list but leave everything else intact. IIUC
to clear just the EXCEPT list would require a 2 step process - 1)
ALTER ... RESET then 2) ALTER ... ADD ALL TABLES again.

I was wondering if it might be useful to have a variation that *only*
resets the EXCEPT list, but still leaves everything else as-is?

So, instead of:
ALTER PUBLICATION pubname RESET

use a syntax something like:
ALTER PUBLICATION pubname RESET {ALL | EXCEPT}
or
ALTER PUBLICATION pubname RESET [EXCEPT]

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> ...
> >
> > Another idea that occurred to me today for tables this is as follows:
> > 1. Allow to mention except during create publication ... For All Tables.
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > 2. Allow to Reset it. This new syntax will reset all objects in the
> > publications.
> > Alter Publication ... RESET;
> > 3. Allow to add it to an existing publication
> > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> >
> > I think it can be extended in a similar way for schema syntax as well.
> >
>
> If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
> objects in the publication then there still seems to be no simple way to remove
> only the EXCEPT list but leave everything else intact. IIUC to clear
> just the EXCEPT list would require a 2 step process - 1) ALTER ...
> RESET then 2) ALTER ... ADD ALL TABLES again.
>
> I was wondering if it might be useful to have a variation that *only*
> resets the EXCEPT list, but still leaves everything else as-is?
>
> So, instead of:
> ALTER PUBLICATION pubname RESET

+1 for this syntax, as it can be extended later to include options
such as EXCEPT or ALL.
For now we can support the plain RESET form and extend it later based
on the requirements.

The new feature will handle the various use cases based on the
behavior given below:
-- CREATE Publication with EXCEPT TABLE syntax
CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; -- ok
Alter Publication pub1 RESET;
-- All Tables and options are reset similar to creating publication
-- without any publication object and publication option (create
-- publication pub1)
\dRp+ pub1
                              Publication pub1
  Owner  | All tables | Inserts | Updates | Deletes | Truncates | Via root
---------+------------+---------+---------+---------+-----------+----------
 vignesh | f          | t       | t       | t       | t         | f
(1 row)

-- Can add except table after reset of publication
ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2; -- ok

-- Cannot add except table without reset of publication
ALTER PUBLICATION pub1 Add EXCEPT TABLE t3,t4; -- not ok, need to be reset

Alter Publication pub1 RESET;
-- Cannot add table to ALL TABLES publication
ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2, t3, t4, TABLE t5;
-- not ok, ALL TABLES publications do not support including TABLES

Alter Publication pub1 RESET;
-- Cannot add table to ALL TABLES publication
ALTER PUBLICATION pub1 Add ALL TABLES TABLE t1,t2;
-- not ok, ALL TABLES publications do not support including TABLES

-- Cannot add ALL TABLES IN SCHEMA to ALL TABLES publication
ALTER PUBLICATION pub1 Add ALL TABLES ALL TABLES IN SCHEMA sch1, sch2;
-- not ok, ALL TABLES publications do not support including ALL TABLES IN SCHEMA

-- Existing syntax should work as it is
CREATE PUBLICATION pub1 FOR TABLE t1;
ALTER PUBLICATION pub1 ADD TABLE t1;
-- ok, existing ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1;
-- ok, existing ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 DROP TABLE t1;
-- ok, existing ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1;
-- ok, existing ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 SET TABLE t1;
-- ok, existing ALTER should work as it is (ok without reset)
ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1;
-- ok, existing ALTER should work as it is (ok without reset)

I will modify the patch to handle this.

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, May 10, 2022 at 9:08 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > ...
> > >
> > > Another idea that occurred to me today for tables this is as follows:
> > > 1. Allow to mention except during create publication ... For All Tables.
> > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > > 2. Allow to Reset it. This new syntax will reset all objects in the
> > > publications.
> > > Alter Publication ... RESET;
> > > 3. Allow to add it to an existing publication
> > > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2];
> > >
> > > I think it can be extended in a similar way for schema syntax as well.
> > >
> >
> > If the proposed syntax ALTER PUBLICATION ... RESET will reset all the
> > objects in the publication then there still seems to be no simple way to remove
> > only the EXCEPT list but leave everything else intact. IIUC to clear
> > just the EXCEPT list would require a 2 step process - 1) ALTER ...
> > RESET then 2) ALTER ... ADD ALL TABLES again.
> >
> > I was wondering if it might be useful to have a variation that *only*
> > resets the EXCEPT list, but still leaves everything else as-is?
> >
> > So, instead of:
> > ALTER PUBLICATION pubname RESET
>
> +1 for this syntax as this syntax can be extendable to include options
> like (except/all/etc) later.
> Currently we can support this syntax and can be extended later based
> on the requirements.

The attached patch has the implementation for "ALTER PUBLICATION
pubname RESET". This command will reset the publication to default
state which includes resetting the publication options, setting ALL
TABLES option to false and dropping the relations and schemas that are
associated with the publication.
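
The effect of RESET described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the patch itself: PublicationSketch and publication_reset_sketch are hypothetical names, and the relation and schema lists are reduced to simple counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a publication's state. */
typedef struct PublicationSketch
{
    bool        alltables;      /* FOR ALL TABLES flag */
    bool        pubinsert;      /* publish = 'insert' */
    bool        pubupdate;      /* publish = 'update' */
    bool        pubdelete;      /* publish = 'delete' */
    bool        pubtruncate;    /* publish = 'truncate' */
    bool        pubviaroot;     /* publish_via_partition_root */
    size_t      n_relations;    /* tables attached to the publication */
    size_t      n_schemas;      /* schemas attached to the publication */
} PublicationSketch;

/* Put the publication back into the state of a bare CREATE PUBLICATION. */
void
publication_reset_sketch(PublicationSketch *pub)
{
    assert(pub != NULL);

    /* Publication options return to their defaults. */
    pub->pubinsert = true;
    pub->pubupdate = true;
    pub->pubdelete = true;
    pub->pubtruncate = true;
    pub->pubviaroot = false;

    /* The ALL TABLES flag is cleared. */
    pub->alltables = false;

    /* All associated relations and schemas are dropped. */
    pub->n_relations = 0;
    pub->n_schemas = 0;
}
```

The options are assigned unconditionally here, without first comparing against the current values, which keeps the function short.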

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
...
> The attached patch has the implementation for "ALTER PUBLICATION
> pubname RESET". This command will reset the publication to default
> state which includes resetting the publication options, setting ALL
> TABLES option to false and dropping the relations and schemas that are
> associated with the publication.
>

Please see below my review comments for the v1-0001 (RESET) patch

======

1. Commit message

This patch adds a new RESET option to ALTER PUBLICATION which

Wording: "RESET option" -> "RESET clause"

~~~

2. doc/src/sgml/ref/alter_publication.sgml

+  <para>
+   The <literal>RESET</literal> clause will reset the publication to default
+   state which includes resetting the publication options, setting
+   <literal>ALL TABLES</literal> option to <literal>false</literal> and drop the
+   relations and schemas that are associated with the publication.
   </para>

2a. Wording: "to default state" -> "to the default state"

2b. Wording: "and drop the relations..." -> "and dropping all relations..."

~~~

3. doc/src/sgml/ref/alter_publication.sgml

+   invoking user to be a superuser.  <literal>RESET</literal> of publication
+   requires invoking user to be a superuser. To alter the owner, you must also

Wording: "requires invoking user" -> "requires the invoking user"

~~~

4. doc/src/sgml/ref/alter_publication.sgml - Example

@@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales;
    <structname>production_publication</structname>:
 <programlisting>
 ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production;
+</programlisting></para>
+
+  <para>
+   Resetting the publication <structname>production_publication</structname>:
+<programlisting>
+ALTER PUBLICATION production_publication RESET;

Wording: "Resetting the publication" -> "Reset the publication"

~~~

5. src/backend/commands/publicationcmds.c

+ /* Check and reset the options */

IMO the code can just reset all these options unconditionally. I did
not see the point to check for existing option values first. I feel
the simpler code outweighs any negligible performance difference in
this case.

~~~

6. src/backend/commands/publicationcmds.c

+ /* Check and reset the options */

Somehow it seemed a pity having to hardcode all these default values
true/false in multiple places; e.g. the same is already hardcoded in
the parse_publication_options function.

To avoid multiple hard coded bools you could just call the
parse_publication_options with an empty options list. That would set
the defaults which you can then use:
values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert);

Alternatively, maybe there should be #defines to use instead of having
the scattered hardcoded bool defaults:
#define PUBACTION_DEFAULT_INSERT true
#define PUBACTION_DEFAULT_UPDATE true
etc
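
A minimal sketch of the idea in comment 6, assuming a simplified option parser: the defaults live in one set of #defines, and a parse call with no "publish" option simply yields them, so a reset path could reuse the same function. PubActionsSketch, token_is and pubactions_from_publish_sketch are hypothetical names, not the actual parse_publication_options code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Single place for the default publish actions. */
#define PUBACTION_DEFAULT_INSERT   true
#define PUBACTION_DEFAULT_UPDATE   true
#define PUBACTION_DEFAULT_DELETE   true
#define PUBACTION_DEFAULT_TRUNCATE true

/* Hypothetical, reduced set of publish actions. */
typedef struct PubActionsSketch
{
    bool        pubinsert;
    bool        pubupdate;
    bool        pubdelete;
    bool        pubtruncate;
} PubActionsSketch;

/* True if the token of length len is exactly the keyword kw. */
static bool
token_is(const char *tok, size_t len, const char *kw)
{
    return strlen(kw) == len && strncmp(tok, kw, len) == 0;
}

/*
 * Fill *out from a comma-separated "publish" list.  NULL means the option
 * was not given, so the defaults apply.  Returns false on an unknown token.
 */
bool
pubactions_from_publish_sketch(const char *publish, PubActionsSketch *out)
{
    const char *p = publish;

    assert(out != NULL);

    out->pubinsert = PUBACTION_DEFAULT_INSERT;
    out->pubupdate = PUBACTION_DEFAULT_UPDATE;
    out->pubdelete = PUBACTION_DEFAULT_DELETE;
    out->pubtruncate = PUBACTION_DEFAULT_TRUNCATE;

    if (publish == NULL)
        return true;

    /* An explicit list replaces the defaults, so start from all-false. */
    out->pubinsert = out->pubupdate = false;
    out->pubdelete = out->pubtruncate = false;

    for (;;)
    {
        size_t      len;

        p += strspn(p, ", ");       /* skip separators */
        len = strcspn(p, ", ");     /* length of the next token */
        if (len == 0)
            break;

        if (token_is(p, len, "insert"))
            out->pubinsert = true;
        else if (token_is(p, len, "update"))
            out->pubupdate = true;
        else if (token_is(p, len, "delete"))
            out->pubdelete = true;
        else if (token_is(p, len, "truncate"))
            out->pubtruncate = true;
        else
            return false;

        p += len;
    }
    return true;
}
```

A reset path could then call pubactions_from_publish_sketch(NULL, &actions) and copy the result into the catalog tuple, instead of repeating true/false literals.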

~~~

7. src/include/nodes/parsenodes.h

@@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction
 {
  AP_AddObjects, /* add objects to publication */
  AP_DropObjects, /* remove objects from publication */
- AP_SetObjects /* set list of objects */
+ AP_SetObjects, /* set list of objects */
+ AP_ReSetPublication /* reset the publication */
 } AlterPublicationAction;

Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication"

~~~

8. src/test/regress/sql/publication.sql

8a.
+-- Test for RESET PUBLICATION
SUGGESTED
+-- Tests for ALTER PUBLICATION ... RESET

8b.
+-- Verify that 'ALL TABLES' option is reset
SUGGESTED:
+-- Verify that 'ALL TABLES' flag is reset

8c.
+-- Verify that publish option and publish via root option is reset
SUGGESTED:
+-- Verify that publish options and publish_via_partition_root option are reset

8d.
+-- Verify that only superuser can execute RESET publication
SUGGESTED
+-- Verify that only superuser can reset a publication

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, May 13, 2022 at 9:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> ...
> > The attached patch has the implementation for "ALTER PUBLICATION
> > pubname RESET". This command will reset the publication to default
> > state which includes resetting the publication options, setting ALL
> > TABLES option to false and dropping the relations and schemas that are
> > associated with the publication.
> >
>
> Please see below my review comments for the v1-0001 (RESET) patch
>
> ======
>
> 1. Commit message
>
> This patch adds a new RESET option to ALTER PUBLICATION which
>
> Wording: "RESET option" -> "RESET clause"

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to default
> +   state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> option to <literal>false</literal>
> and drop the
> +   relations and schemas that are associated with the publication.
>    </para>
>
> 2a. Wording: "to default state" -> "to the default state"

Modified

> 2b. Wording: "and drop the relations..." -> "and dropping all relations..."

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml
>
> +   invoking user to be a superuser.  <literal>RESET</literal> of publication
> +   requires invoking user to be a superuser. To alter the owner, you must also
>
> Wording: "requires invoking user" -> "requires the invoking user"

Modified

> ~~~
>
> 4. doc/src/sgml/ref/alter_publication.sgml - Example
>
> @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL
> TABLES IN SCHEMA marketing, sales;
>     <structname>production_publication</structname>:
>  <programlisting>
>  ALTER PUBLICATION production_publication ADD TABLE users,
> departments, ALL TABLES IN SCHEMA production;
> +</programlisting></para>
> +
> +  <para>
> +   Resetting the publication <structname>production_publication</structname>:
> +<programlisting>
> +ALTER PUBLICATION production_publication RESET;
>
> Wording: "Resetting the publication" -> "Reset the publication"

Modified

> ~~~
>
> 5. src/backend/commands/publicationcmds.c
>
> + /* Check and reset the options */
>
> IMO the code can just reset all these options unconditionally. I did
> not see the point to check for existing option values first. I feel
> the simpler code outweighs any negligible performance difference in
> this case.

Modified

> ~~~
>
> 6. src/backend/commands/publicationcmds.c
>
> + /* Check and reset the options */
>
> Somehow it seemed a pity having to hardcode all these default values
> true/false in multiple places; e.g. the same is already hardcoded in
> the parse_publication_options function.
>
> To avoid multiple hard coded bools you could just call the
> parse_publication_options with an empty options list. That would set
> the defaults which you can then use:
> values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert);
>
> Alternatively, maybe there should be #defines to use instead of having
> the scattered hardcoded bool defaults:
> #define PUBACTION_DEFAULT_INSERT true
> #define PUBACTION_DEFAULT_UPDATE true
> etc

I have added #defines for the default values and used them in both functions.

> ~~~
>
> 7. src/include/nodes/parsenodes.h
>
> @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction
>  {
>   AP_AddObjects, /* add objects to publication */
>   AP_DropObjects, /* remove objects from publication */
> - AP_SetObjects /* set list of objects */
> + AP_SetObjects, /* set list of objects */
> + AP_ReSetPublication /* reset the publication */
>  } AlterPublicationAction;
>
> Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication"

Modified

> ~~~
>
> 8. src/test/regress/sql/publication.sql
>
> 8a.
> +-- Test for RESET PUBLICATION
> SUGGESTED
> +-- Tests for ALTER PUBLICATION ... RESET

Modified

> 8b.
> +-- Verify that 'ALL TABLES' option is reset
> SUGGESTED:
> +-- Verify that 'ALL TABLES' flag is reset

Modified

> 8c.
> +-- Verify that publish option and publish via root option is reset
> SUGGESTED:
> +-- Verify that publish options and publish_via_partition_root option are reset

Modified

> 8d.
> +-- Verify that only superuser can execute RESET publication
> SUGGESTED
> +-- Verify that only superuser can reset a publication

Modified

Thanks for the comments. The attached v5 patch has the changes for
them. I have also reworked the SKIP TABLE changes to use the new
syntax; those changes are available in
v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.

Regards,
Vignesh

Attachments

RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the comments, the attached v5 patch has the changes for the same.
> Also I have made the changes for SKIP Table based on the new syntax, the
> changes for the same are available in
> v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
Hi,


Thank you for updating the patch.
I'll share a few minor review comments on v5-0001.


(1) doc/src/sgml/ref/alter_publication.sgml

@@ -73,12 +85,13 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
    Adding a table to a publication additionally requires owning that table.
    The <literal>ADD ALL TABLES IN SCHEMA</literal> and
    <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
-   invoking user to be a superuser.  To alter the owner, you must also be a
-   direct or indirect member of the new owning role. The new owner must have
-   <literal>CREATE</literal> privilege on the database.  Also, the new owner
-   of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
-   SCHEMA</literal> publication must be a superuser. However, a superuser can
-   change the ownership of a publication regardless of these restrictions.
+   invoking user to be a superuser.  <literal>RESET</literal> of publication
+   requires the invoking user to be a superuser. To alter the owner, you must
...


I suggest combining the first part of your change with the existing sentence
before it, to keep the description concise.

FROM:
"The <literal>ADD ALL TABLES IN SCHEMA</literal> and
<literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
invoking user to be a superuser.  <literal>RESET</literal> of publication
requires the invoking user to be a superuser."

TO:
"The <literal>ADD ALL TABLES IN SCHEMA</literal>,
<literal>SET ALL TABLES IN SCHEMA</literal> to a publication and
<literal>RESET</literal> of publication requires the invoking user to be a superuser."


(2) typo

+++ b/src/backend/commands/publicationcmds.c
@@ -53,6 +53,13 @@
 #include "utils/syscache.h"
 #include "utils/varlena.h"

+#define PUB_ATION_INSERT_DEFAULT true
+#define PUB_ACTION_UPDATE_DEFAULT true


Kindly change
FROM:
"PUB_ATION_INSERT_DEFAULT"
TO:
"PUB_ACTION_INSERT_DEFAULT"


(3) src/test/regress/expected/publication.out

+-- Verify that only superuser can reset a publication
+ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
+SET ROLE regress_publication_user2;
+ALTER PUBLICATION testpub_reset RESET; -- fail


We have "-- fail" for one case in this patch.
On the other hand, wouldn't it be better to add "-- ok" (or "-- success") to the
other successful statements,
to keep the test descriptions consistent?


Best Regards,
    Takamichi Osumi


RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote:
> Thanks for the comments, the attached v5 patch has the changes for the same.
> Also I have made the changes for SKIP Table based on the new syntax, the
> changes for the same are available in
> v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
Hi,



Several comments on v5-0002.

(1) One unnecessary space before "except_pub_obj_list" syntax definition

+ except_pub_obj_list:  ExceptPublicationObjSpec
+                                       { $$ = list_make1($1); }
+                       | except_pub_obj_list ',' ExceptPublicationObjSpec
+                                       { $$ = lappend($1, $3); }
+                       |  /*EMPTY*/                                                            { $$ = NULL; }
+       ;
+

From above part, kindly change
FROM:
" except_pub_obj_list:  ExceptPublicationObjSpec"
TO:
"except_pub_obj_list:  ExceptPublicationObjSpec"


(2) doc/src/sgml/ref/create_publication.sgml

(2-1)

@@ -22,7 +22,7 @@ PostgreSQL documentation
  <refsynopsisdiv>
 <synopsis>
 CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
-    [ FOR ALL TABLES
+    [ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]
       | FOR <replaceable class="parameter">publication_object</replaceable> [, ... ] ]
     [ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>][, ... ] ) ]
 


Here I think we need to add two more spaces just inside the square brackets.
Please change
FROM:
"[ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]"
TO:
"[ FOR ALL TABLES [ EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ] ]"

When I check other documentation pages, I see spaces just inside the square brackets.

(2-2)
This whitespace alignment applies to alter_publication.sgml as well.

(3)


@@ -156,6 +156,24 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
     </listitem>
    </varlistentry>

+
+   <varlistentry>
+    <term><literal>EXCEPT TABLE</literal></term>
+    <listitem>
+     <para>
+      Marks the publication as one that excludes replicating changes for the
+      specified tables.
+     </para>
+
+     <para>
+      <literal>EXCEPT TABLE</literal> can be specified only for
+      <literal>FOR ALL TABLES</literal> publication. It is not supported for
+      <literal>FOR ALL TABLES IN SCHEMA </literal> publication and
+      <literal>FOR TABLE</literal> publication.
+     </para>
+    </listitem>
+   </varlistentry>
+

This EXCEPT TABLE clause is only for FOR ALL TABLES.
So, how about extracting the main message from the above part and
moving it to an existing paragraph below, instead of having one independent paragraph?

   <varlistentry>
    <term><literal>FOR ALL TABLES</literal></term>
    <listitem>
     <para>
      Marks the publication as one that replicates changes for all tables in
      the database, including tables created in the future.
     </para>
    </listitem>
   </varlistentry>

Something like
"Marks the publication as one that replicates changes for all tables in
the database, including tables created in the future. EXCEPT TABLE indicates
excluded tables for the defined publication.
"


(4) One minor confirmation about the syntax

Currently, the grammar allows excluded tables to be written as below.

(example) CREATE PUBLICATION mypub FOR ALL TABLES EXCEPT TABLE tab3, tab4, EXCEPT TABLE tab5;

This is because we define ExceptPublicationObjSpec with EXCEPT TABLE.
Is it OK to leave room for writing duplicate "EXCEPT TABLE" clauses?
I think there is no harm in having this,
but I'd like to confirm whether this syntax should be adjusted or not.


(5) CheckAlterPublication

+
+       if (excepttable && !stmt->for_all_tables)
+               ereport(ERROR,
+                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                errmsg("publication \"%s\" is not defined as FOR ALL TABLES",
+                                               NameStr(pubform->pubname)),
+                                errdetail("except table cannot be added to, dropped from, or set on NON ALL TABLES publications.")));

Could you please add a test for this ?
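
As a rough picture of what such a test would exercise, the quoted condition can be reduced to a small C sketch. The names AlterPubStmtSketch and check_except_table_sketch are hypothetical, and the returned string is only a stand-in for the ereport(ERROR, ...) call in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of an ALTER PUBLICATION statement. */
typedef struct AlterPubStmtSketch
{
    bool        has_except_table;   /* an EXCEPT TABLE list was given */
    bool        for_all_tables;     /* the statement says ALL TABLES */
} AlterPubStmtSketch;

/*
 * Mirrors the quoted condition: an EXCEPT TABLE list is accepted only
 * together with ALL TABLES.  Returns NULL when the statement passes,
 * otherwise a static message describing the failure.
 */
const char *
check_except_table_sketch(const AlterPubStmtSketch *stmt)
{
    assert(stmt != NULL);

    if (stmt->has_except_table && !stmt->for_all_tables)
        return "publication is not defined as FOR ALL TABLES";

    return NULL;
}
```

A regression test would then need at least one failing statement (EXCEPT TABLE without ALL TABLES) alongside the passing forms.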



Best Regards,
    Takamichi Osumi


Re: Skipping schema changes in publication

From
Peter Smith
Date:
Below are my review comments for v5-0001.

There is some overlap with comments recently posted by Osumi-san [1].

(I also have review comments for v5-0002; will post them tomorrow)

======

1. Commit message

This patch adds a new RESET clause to ALTER PUBLICATION which will reset
the publication to default state which includes resetting the publication
options, setting ALL TABLES option to false and dropping the relations and
schemas that are associated with the publication.

SUGGEST
"to default state" -> "to the default state"
"ALL TABLES option" -> "ALL TABLES flag"

~~~

2. doc/src/sgml/ref/alter_publication.sgml

+  <para>
+   The <literal>RESET</literal> clause will reset the publication to the
+   default state which includes resetting the publication options, setting
+   <literal>ALL TABLES</literal> option to <literal>false</literal> and
+   dropping all relations and schemas that are associated with the publication.
   </para>

"ALL TABLES option" -> "ALL TABLES flag"

~~~

3. doc/src/sgml/ref/alter_publication.sgml

+   invoking user to be a superuser.  <literal>RESET</literal> of publication
+   requires the invoking user to be a superuser. To alter the owner, you must

SUGGESTION
To <literal>RESET</literal> a publication requires the invoking user
to be a superuser.

~~~

4. src/backend/commands/publicationcmds.c

@@ -53,6 +53,13 @@
 #include "utils/syscache.h"
 #include "utils/varlena.h"

+#define PUB_ATION_INSERT_DEFAULT true
+#define PUB_ACTION_UPDATE_DEFAULT true
+#define PUB_ACTION_DELETE_DEFAULT true
+#define PUB_ACTION_TRUNCATE_DEFAULT true
+#define PUB_VIA_ROOT_DEFAULT false
+#define PUB_ALL_TABLES_DEFAULT false

4a.
Typo: "ATION" -> "ACTION"

4b.
I think these #defines deserve a 1 line comment.
e.g.
/* CREATE PUBLICATION default values for flags and options */

4c.
Since the "_DEFAULT" is a common part of all the names, maybe it is
tidier if it comes first.
e.g.
#define PUB_DEFAULT_ACTION_INSERT true
#define PUB_DEFAULT_ACTION_UPDATE true
#define PUB_DEFAULT_ACTION_DELETE true
#define PUB_DEFAULT_ACTION_TRUNCATE true
#define PUB_DEFAULT_VIA_ROOT false
#define PUB_DEFAULT_ALL_TABLES false

------
[1]
https://www.postgresql.org/message-id/TYCPR01MB8373C3120C2B3112001ED6F1EDCF9%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Below are my review comments for v5-0002.

There may be an overlap with comments recently posted by Osumi-san [1].

======

1. General

Is it really necessary to have to say "EXCEPT TABLE" instead of just
"EXCEPT"? It seems unnecessarily verbose and redundant when you write
"FOR ALL TABLES EXCEPT TABLE...".

If you want to keep this TABLE keyword (maybe you have plans for other
kinds of except?) then IMO perhaps at least it can be the optional
default except type. e.g. EXCEPT [TABLE].

~~~

2. General

(I was unsure whether to even mention this one).

I understand the "EXCEPT" is chosen as the user-facing syntax, but it
still seems strange when reading the patch to see attribute members
and column names called 'except'. I think the problem is that "except"
is not a verb, so saying except=t/f just does not make much sense.
Sometimes I feel that for all the internal usage
(code/comments/catalog) using "skip" and "skip-list" etc would be a
much better choice of names. OTOH I can see that having consistency
with the outside syntax might also be good. Anyway, please consider -
maybe other people feel the same?

~~~

3. General

The ONLY keyword seems supported by the syntax for tables of the
except-list (more on this in later comments) but:
a) I am not sure if the patch code is accounting for that, and
b) There are no test cases using ONLY.

~~~

4. Commit message

A new option "EXCEPT TABLE" in Create/Alter Publication allows
one or more tables to be excluded, publisher will exclude sending the data
of the excluded tables to the subscriber.

SUGGESTION
A new "EXCEPT TABLE" clause for CREATE/ALTER PUBLICATION allows one or
more tables to be excluded. The publisher will not send the data of
excluded tables to the subscriber.

~~

5. Commit message

The new syntax allows specifying exclude relations while creating a publication
or exclude relations in alter publication. For example:

SUGGESTION
The new syntax allows specifying excluded relations when creating or
altering a publication. For example:

~~~

6. Commit message

A new column prexcept is added to table "pg_publication_rel", to maintain
the relations that the user wants to exclude publishing through the publication.

SUGGESTION
A new column "prexcept" is added to table "pg_publication_rel", to
maintain the relations that the user wants to exclude from the
publications.

~~~

7. Commit message

Modified the output plugin (pgoutput) to exclude publishing the changes of the
excluded tables.

I did not feel it was necessary to say this. It is already said above
that the data is not sent, so that seems enough.

~~~

8. Commit message

Updates pg_dump to identify and dump the excluded tables of the publications.
Updates the \d family of commands to display excluded tables of the
publications and \dRp+ variant will now display associated except tables if any.

SUGGESTION
pg_dump is updated to identify and dump the excluded tables of the publications.

The psql \d family of commands now displays excluded tables. e.g. the psql
\dRp+ variant will now display associated "except tables" if any.

~~~

9. doc/src/sgml/catalogs.sgml

@@ -6426,6 +6426,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
       if there is no publication qualifying condition.</para></entry>
      </row>

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+      <structfield>prexcept</structfield> <type>bool</type>
+      </para>
+      <para>
+       True if the table must be excluded
+      </para></entry>
+     </row>

Other descriptions on this page refer to "relation" instead of
"table". Probably this should do the same to be consistent.

~~~

10. doc/src/sgml/logical-replication.sgml

@@ -1167,8 +1167,9 @@ CONTEXT:  processing remote data for replication origin "pg_16395" during "INSER
   <para>
    To add tables to a publication, the user must have ownership rights on the
    table. To add all tables in schema to a publication, the user must be a
-   superuser. To create a publication that publishes all tables or all tables in
-   schema automatically, the user must be a superuser.
+   superuser. To add all tables to a publication, the user must be a superuser.
+   To create a publication that publishes all tables or all tables in schema
+   automatically, the user must be a superuser.
   </para>

It seems like a valid change, but how is this related to this EXCEPT
patch? Maybe this fix should be patched separately?

~~~

11. doc/src/sgml/ref/alter_publication.sgml

@@ -22,6 +22,7 @@ PostgreSQL documentation
  <refsynopsisdiv>
 <synopsis>
 ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD <replaceable class="parameter">publication_object</replaceable> [,
...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable
class="parameter">table_name</replaceable> [ * ] [, ... ]]

The [ONLY] looks misplaced when the syntax is described like this. For
example, in practice it is possible to write "EXCEPT TABLE ONLY t1,
ONLY t2, t3, ONLY t4" but it doesn't seem that way by looking at these
PG DOCS.

IMO it would be better described like this:

[ FOR ALL TABLES [ EXCEPT TABLE exception_object [,...] ]]

where exception_object is:

    [ ONLY ] table_name [ * ]

~~~

12. doc/src/sgml/ref/alter_publication.sgml

@@ -82,8 +83,8 @@ ALTER PUBLICATION <replaceable
class="parameter">name</replaceable> RESET

   <para>
    You must own the publication to use <command>ALTER PUBLICATION</command>.
-   Adding a table to a publication additionally requires owning that table.
-   The <literal>ADD ALL TABLES IN SCHEMA</literal> and
+   Adding a table or excluding a table to a publication additionally requires
+   owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal> and

SUGGESTION
Adding a table to or excluding a table from a publication additionally
requires owning that table.

~~~

13. doc/src/sgml/ref/alter_publication.sgml

@@ -213,6 +214,14 @@ ALTER PUBLICATION sales_publication ADD ALL
TABLES IN SCHEMA marketing, sales;
 </programlisting>
   </para>

+  <para>
+   Alter publication <structname>production_publication</structname> that
+   publishes all tables except <structname>users</structname> and
+   <structname>departments</structname> tables:
+<programlisting>

"that publishes" -> "to publish"

~~~

14. doc/src/sgml/ref/create_publication.sgml

(Same comment about the ONLY syntax as #11)

~~~

15. doc/src/sgml/ref/create_publication.sgml

+   <varlistentry>
+    <term><literal>EXCEPT TABLE</literal></term>
+    <listitem>
+     <para>
+      Marks the publication as one that excludes replicating changes for the
+      specified tables.
+     </para>
+
+     <para>
+      <literal>EXCEPT TABLE</literal> can be specified only for
+      <literal>FOR ALL TABLES</literal> publication. It is not supported for
+      <literal>FOR ALL TABLES IN SCHEMA </literal> publication and
+      <literal>FOR TABLE</literal> publication.
+     </para>
+    </listitem>
+   </varlistentry>

IMO you can remove all that "It is not supported for..." sentence. You
don't need to spell that out again when it is already clear from the
syntax.

~~~

16. doc/src/sgml/ref/psql-ref.sgml

@@ -1868,8 +1868,9 @@ testdb=>
         If <replaceable class="parameter">pattern</replaceable> is
         specified, only those publications whose names match the pattern are
         listed.
-        If <literal>+</literal> is appended to the command name, the tables and
-        schemas associated with each publication are shown as well.
+        If <literal>+</literal> is appended to the command name, the tables,
+        excluded tables and schemas associated with each publication
are shown as
+        well.
         </para>

Perhaps this is OK just as-is, but OTOH I felt that the change was
almost unnecessary because saying it displays "the tables" kind of
implies it would also have to account for the "excluded tables" too.

~~~

17. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication

@@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List
*ancestors, int *ancestor_level
  foreach(lc, ancestors)
  {
  Oid ancestor = lfirst_oid(lc);
- List    *apubids = GetRelationPublications(ancestor);
+ List    *apubids = GetRelationPublications(ancestor, false);
  List    *aschemaPubids = NIL;
+ List    *aexceptpubids = NIL;

17a.
I think the vars "aschemaPubids" and "aexceptpubids" are only used in
the 'else' block, so it seems better to declare and free them in that
block instead of always.

17b.
Also, the camel-case of those variables is inconsistent, so maybe fix
that at the same time.

~~~

18. src/backend/catalog/pg_publication.c - GetRelationPublications

@@ -666,7 +673,7 @@ publication_add_schema(Oid pubid, Oid schemaid,
bool if_not_exists)

 /* Gets list of publication oids for a relation */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, bool bexcept)

18a.
I felt that "except_flag" is a better name than "bexcept" for this param.

18b.
The function comment should be updated to say only relations matching
this except_flag are returned in the list.

~~~

19. src/backend/catalog/pg_publication.c - GetAllTablesPublicationRelations

@@ -787,6 +795,15 @@ GetAllTablesPublicationRelations(bool pubviaroot)
  HeapTuple tuple;
  List    *result = NIL;

+ /*
+ * pg_publication_rel and pg_publication_namespace will only have excluded
+ * tables in case of all tables publication, no need to pass except flag
+ * to get the relations.
+ */
+ List    *exceptpubtablelist;
+
+ exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
+

19a.
I wasn't very sure of the meaning/intent of the comment, but IIUC it
seems to be explaining why it is not necessary to use an "except_flag"
parameter in this code. Is it necessary/helpful to explain parameters
that do NOT exist?

19b.
The var name "exceptpubtablelist" seems a bit overkill. (e.g.
"excepttablelist" or "exceptlist" etc... are shorter but seem equally
informative).

~~~

20. src/backend/commands/publicationcmds.c  - CreatePublication

@@ -843,54 +849,52 @@ CreatePublication(ParseState *pstate,
CreatePublicationStmt *stmt)
  /* Make the changes visible. */
  CommandCounterIncrement();

- /* Associate objects with the publication. */
- if (stmt->for_all_tables)
- {
- /* Invalidate relcache so that publication info is rebuilt. */
- CacheInvalidateRelcacheAll();
- }
- else
- {
- ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
-    &schemaidlist);
+ ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
+ &schemaidlist);

- /* FOR ALL TABLES IN SCHEMA requires superuser */
- if (list_length(schemaidlist) > 0 && !superuser())
- ereport(ERROR,
- errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
+ /* FOR ALL TABLES IN SCHEMA requires superuser */
+ if (list_length(schemaidlist) > 0 && !superuser())
+ ereport(ERROR,
+ errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));

- if (list_length(relations) > 0)
- {
- List    *rels;
+ if (list_length(relations) > 0)
+ {
+ List    *rels;

- rels = OpenTableList(relations);
- CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
-   PUBLICATIONOBJ_TABLE);
+ rels = OpenTableList(relations);
+ CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
+ PUBLICATIONOBJ_TABLE);

- TransformPubWhereClauses(rels, pstate->p_sourcetext,
- publish_via_partition_root);
+ TransformPubWhereClauses(rels, pstate->p_sourcetext,
+ publish_via_partition_root);

- CheckPubRelationColumnList(rels, pstate->p_sourcetext,
-    publish_via_partition_root);
+ CheckPubRelationColumnList(rels, pstate->p_sourcetext,
+ publish_via_partition_root);

- PublicationAddTables(puboid, rels, true, NULL);
- CloseTableList(rels);
- }
+ PublicationAddTables(puboid, rels, true, NULL);
+ CloseTableList(rels);
+ }

- if (list_length(schemaidlist) > 0)
- {
- /*
- * Schema lock is held until the publication is created to prevent
- * concurrent schema deletion.
- */
- LockSchemaList(schemaidlist);
- PublicationAddSchemas(puboid, schemaidlist, true, NULL);
- }
+ if (list_length(schemaidlist) > 0)
+ {
+ /*
+ * Schema lock is held until the publication is created to prevent
+ * concurrent schema deletion.
+ */
+ LockSchemaList(schemaidlist);
+ PublicationAddSchemas(puboid, schemaidlist, true, NULL);
  }

  table_close(rel, RowExclusiveLock);

+ /* Associate objects with the publication. */
+ if (stmt->for_all_tables)
+ {
+ /* Invalidate relcache so that publication info is rebuilt. */
+ CacheInvalidateRelcacheAll();
+ }
+

This function is refactored a lot to not use "if/else" as it did
before. But AFAIK (maybe I misunderstood) this refactor doesn't seem
to actually have anything to do with the EXCEPT patch. If it really is
unrelated maybe it should not be part of this patch.

~~~

21. src/backend/commands/publicationcmds.c - CheckPublicationDefValues

+ if (pubform->puballtables)
+ return false;
+
+ if (!pubform->pubinsert || !pubform->pubupdate || !pubform->pubdelete ||
+ !pubform->pubtruncate || pubform->pubviaroot)
+ return false;

Now that you have all the #defines for the PUB_DEFAULT_XXX values, perhaps
this function should be using them instead of hardcoded assumptions
about what the default values are.

e.g.

if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES) return false;
if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT) return false;
...
etc.
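
For illustration only, here is a minimal standalone sketch of that idea.
The PubFormSketch struct and pub_has_default_values() are hypothetical
stand-ins (not names from the patch) for the Form_pg_publication fields
and CheckPublicationDefValues; the PUB_DEFAULT_XXX names follow the ones
suggested in the v5-0001 review. The real function presumably has further
checks (e.g. for associated tables/schemas), which are omitted here.

```c
#include <stdbool.h>

/* CREATE PUBLICATION default values for flags and options */
#define PUB_DEFAULT_ACTION_INSERT   true
#define PUB_DEFAULT_ACTION_UPDATE   true
#define PUB_DEFAULT_ACTION_DELETE   true
#define PUB_DEFAULT_ACTION_TRUNCATE true
#define PUB_DEFAULT_VIA_ROOT        false
#define PUB_DEFAULT_ALL_TABLES      false

/* Hypothetical stand-in for the relevant Form_pg_publication fields. */
typedef struct PubFormSketch
{
    bool puballtables;
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
    bool pubtruncate;
    bool pubviaroot;
} PubFormSketch;

/* Returns true only if every flag still has its default value. */
static bool
pub_has_default_values(const PubFormSketch *pubform)
{
    if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES)
        return false;
    if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT)
        return false;
    if (pubform->pubupdate != PUB_DEFAULT_ACTION_UPDATE)
        return false;
    if (pubform->pubdelete != PUB_DEFAULT_ACTION_DELETE)
        return false;
    if (pubform->pubtruncate != PUB_DEFAULT_ACTION_TRUNCATE)
        return false;
    if (pubform->pubviaroot != PUB_DEFAULT_VIA_ROOT)
        return false;

    return true;
}
```

Written this way, a future change to one of the defaults only needs to
touch the #define and not the checking code.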

~~~

22. src/backend/commands/publicationcmds.c -  CheckAlterPublication


@@ -1442,6 +1516,19 @@ CheckAlterPublication(AlterPublicationStmt
*stmt, HeapTuple tup,
    List *tables, List *schemaidlist)
 {
  Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
+ ListCell   *lc;
+ bool nonexcepttable = false;
+ bool excepttable = false;
+
+ foreach(lc, tables)
+ {
+ PublicationTable *pub_table = lfirst_node(PublicationTable, lc);
+
+ if (!pub_table->except)
+ nonexcepttable = true;
+ else
+ excepttable = true;
+ }

22a.
The names are very confusing. e.g. "nonexcepttable" is like a double-negative.

SUGGEST:
bool has_tables = false;
bool has_except_tables = false;

22b.
Reverse the "if" condition to be positive instead of negative (remove !)
e.g.
if (pub_table->except)
has_except_tables = true;
else
has_tables = true;
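
Putting 22a and 22b together, a minimal standalone sketch might look
like the following. PubTableSketch and classify_tables() are hypothetical
names used only for this illustration; in the patch the loop iterates
over PublicationTable nodes with foreach().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parser's PublicationTable node. */
typedef struct PubTableSketch
{
    const char *name;
    bool        except;     /* true if listed under EXCEPT TABLE */
} PubTableSketch;

/*
 * Scan the table list once and record whether it contains ordinary
 * tables, excluded tables, or both.
 */
static void
classify_tables(const PubTableSketch *tables, size_t ntables,
                bool *has_tables, bool *has_except_tables)
{
    *has_tables = false;
    *has_except_tables = false;

    for (size_t i = 0; i < ntables; i++)
    {
        if (tables[i].except)
            *has_except_tables = true;
        else
            *has_tables = true;
    }
}
```

With positive names, the later checks read naturally, e.g.
"if (has_tables && pubform->puballtables)".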

~~~

23. src/backend/commands/publicationcmds.c -  CheckAlterPublication

@@ -1461,12 +1548,19 @@ CheckAlterPublication(AlterPublicationStmt
*stmt, HeapTuple tup,
  errdetail("Tables from schema cannot be added to, dropped from, or
set on FOR ALL TABLES publications.")));

  /* Check that user is allowed to manipulate the publication tables. */
- if (tables && pubform->puballtables)
+ if (nonexcepttable && tables && pubform->puballtables)
  ereport(ERROR,

Seems no reason for "tables" to be in the condition since
"nonexcepttable" can't be true if "tables" is NIL.

~~~

24. src/backend/commands/publicationcmds.c -  CheckAlterPublication

+
+ if (excepttable && !stmt->for_all_tables)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("publication \"%s\" is not defined as FOR ALL TABLES",
+ NameStr(pubform->pubname)),
+ errdetail("except table cannot be added to, dropped from, or set on
NON ALL TABLES publications.")));

The errdetail message seems over-complex.

SUGGESTION
"EXCEPT TABLE clause is only allowed for FOR ALL TABLES publications."

~~~

25. src/backend/commands/publicationcmds.c - AlterPublication

@@ -1500,6 +1594,20 @@ AlterPublication(ParseState *pstate,
AlterPublicationStmt *stmt)
  aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
     stmt->pubname);

+ if (stmt->for_all_tables)
+ {
+ bool isdefault = CheckPublicationDefValues(tup);
+
+ if (!isdefault)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("Setting ALL TABLES requires publication \"%s\" to have
default values",
+    stmt->pubname),
+ errhint("Either the publication has tables/schemas associated or
does not have default publication options or ALL TABLES option is
set."));

The errhint message seems over-complex.

SUGGESTION
"Use ALTER PUBLICATION ... RESET"

~~~

26. src/bin/pg_dump/pg_dump.c - dumpPublication

@@ -3980,8 +3982,34 @@ dumpPublication(Archive *fout, const
PublicationInfo *pubinfo)
    qpubname);

  if (pubinfo->puballtables)
+ {
+ SimplePtrListCell *cell;
+ bool first = true;
  appendPQExpBufferStr(query, " FOR ALL TABLES");

+ /* Include exception tables if the publication has except tables */
+ for (cell = exceptinfo.head; cell; cell = cell->next)
+ {
+ PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
+ PublicationInfo *relpubinfo = pubrinfo->publication;
+ TableInfo  *tbinfo;
+
+ if (pubinfo == relpubinfo)
+ {
+ tbinfo = pubrinfo->pubtable;
+
+ if (first)
+ {
+ appendPQExpBufferStr(query, " EXCEPT TABLE ONLY");
+ first = false;
+ }
+ else
+ appendPQExpBufferStr(query, ", ");
+ appendPQExpBuffer(query, " %s", fmtQualifiedDumpable(tbinfo));
+ }
+ }
+ }
+

IIUC this usage of ONLY looks incorrect.

26a.
Firstly, if you want to hardwire ONLY then shouldn't it apply to every
table in the except-list, not just the first one? e.g. "EXCEPT TABLE
ONLY t1, ONLY t2, ONLY t3..."

26b.
Secondly, is it even correct to unconditionally hardwire the ONLY? How
do you know that is how the user wanted it?
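
To make 26a concrete, here is a rough standalone sketch that emits ONLY
for every entry. It uses a plain char buffer and snprintf() instead of
PQExpBuffer, and append_except_list() is a hypothetical helper, not
something from the patch. It does not address 26b (whether ONLY should
be hardwired at all).

```c
#include <stdio.h>
#include <string.h>

/*
 * Appends " EXCEPT TABLE ONLY t1, ONLY t2, ..." to the string in buf.
 * buf must already hold a NUL-terminated string shorter than bufsize.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
append_except_list(char *buf, size_t bufsize,
                   const char *const *tables, size_t ntables)
{
    size_t used = strlen(buf);

    for (size_t i = 0; i < ntables; i++)
    {
        /* The first entry gets the clause keyword, later ones a comma. */
        int n = snprintf(buf + used, bufsize - used, "%s ONLY %s",
                         (i == 0) ? " EXCEPT TABLE" : ",", tables[i]);

        if (n < 0 || (size_t) n >= bufsize - used)
            return -1;
        used += (size_t) n;
    }
    return 0;
}
```

Starting from " FOR ALL TABLES" and the tables public.t1 and public.t2,
this yields " FOR ALL TABLES EXCEPT TABLE ONLY public.t1, ONLY public.t2".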

~~~

27. src/bin/pg_dump/pg_dump.c

@@ -127,6 +127,8 @@ static SimpleOidList foreign_servers_include_oids
= {NULL, NULL};
 static SimpleStringList extension_include_patterns = {NULL, NULL};
 static SimpleOidList extension_include_oids = {NULL, NULL};

+static SimplePtrList exceptinfo = {NULL, NULL};
+

Probably I just did not understand how this logic works, but how does
this static work properly if there are multiple publications and 2
different EXCEPT lists? E.g. where is it clearing the "exceptinfo" so
that multiple EXCEPT TABLE lists don't become muddled?

~~~

28. src/bin/pg_dump/pg_dump.c - dumpPublicationTable

@@ -4330,8 +4378,11 @@ dumpPublicationTable(Archive *fout, const
PublicationRelInfo *pubrinfo)

  query = createPQExpBuffer();

- appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
+ appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD ",
    fmtId(pubinfo->dobj.name));
+
+ appendPQExpBufferStr(query, "TABLE ONLY");
+

That code refactor does not seem necessary for this patch.

~~~

29. src/bin/pg_dump/pg_dump_sort.c

@@ -90,6 +90,7 @@ enum dbObjectTypePriorities
  PRIO_FK_CONSTRAINT,
  PRIO_POLICY,
  PRIO_PUBLICATION,
+ PRIO_PUBLICATION_EXCEPT_REL,
  PRIO_PUBLICATION_REL,
  PRIO_PUBLICATION_TABLE_IN_SCHEMA,
  PRIO_SUBSCRIPTION,

I'm not sure how this enum is used (so perhaps this makes no
difference) but judging by the enum comment, why did you put the sort
priority PRIO_PUBLICATION_EXCEPT_REL before PRIO_PUBLICATION_REL?
Wouldn't it make more sense the other way around?

~~~

30. src/bin/psql/describe.c

@@ -2950,17 +2950,34 @@ describeOneTableDetails(const char *schemaname,
    "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
    "        ELSE NULL END) "
    "FROM pg_catalog.pg_publication p\n"
-   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
-   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
-   "WHERE pr.prrelid = '%s'\n"
-   "UNION\n"
+   " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+   " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
+   "WHERE pr.prrelid = '%s'",
+   oid, oid, oid);

I feel that trailing "\n" ("WHERE pr.prrelid = '%s'\n") should not
have been removed.

~~~

31. src/bin/psql/describe.c

+ /* FIXME: 150000 should be changed to 160000 later for PG16. */
+ if (pset.sversion >= 150000)
+ appendPQExpBufferStr(&buf, " AND pr.prexcept = 'f'\n");
+
+ appendPQExpBuffer(&buf, "UNION\n"

The "UNION\n" param might be better wrapped onto the next line like it
used to be.

~~~

32. src/bin/psql/describe.c

+ /* FIXME: 150000 should be changed to 160000 later for PG16. */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+   " AND NOT EXISTS (SELECT 1\n"
+   " FROM pg_catalog.pg_publication_rel pr\n"
+   " JOIN pg_catalog.pg_class pc\n"
+   "   ON pr.prrelid = pc.oid\n"
+   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
+   oid);

The whitespace indents in the SQL seem excessive here.

~~~

33. src/bin/psql/describe.c - describePublications

@@ -6322,6 +6344,22 @@ describePublications(const char *pattern)
  }
  }

+ /* FIXME: 150000 should be changed to 160000 later for PG16. */
+ if (pset.sversion >= 150000)
+ {
+ /* Get the excluded tables for the specified publication */
+ printfPQExpBuffer(&buf,
+   "SELECT concat(c.relnamespace::regnamespace, '.', c.relname)\n"
+   "FROM pg_catalog.pg_class c\n"
+   "     JOIN pg_catalog.pg_publication_rel pr ON c.oid = pr.prrelid\n"
+   "WHERE pr.prpubid = '%s'\n"
+   "  AND pr.prexcept = 't'\n"
+   "ORDER BY 1", pubid);
+ if (!addFooterToPublicationDesc(&buf, "Except tables:",
+ true, &cont))
+ goto error_return;
+ }
+

I think this code is misplaced. Shouldn't it be if/else and be above
the other 150000 check, otherwise when you change this to PG16 it may
not work as expected?

~~~

34. src/bin/psql/describe.c - describePublications

+ if (!addFooterToPublicationDesc(&buf, "Except tables:",
+ true, &cont))
+ goto error_return;
+ }

Should this be using the _() macro, same as the other prompts, for translation?

~~~

35. src/include/catalog/pg_publication.h

I thought the param "bexcept" should be "except_flag".

(same comment as #18a)

~~~

36. src/include/catalog/pg_publication_rel.h

@@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
  Oid oid; /* oid */
  Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
  Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
+ bool prexcept BKI_DEFAULT(f); /* except the relation */

SUGGEST (comment)
/* skip the relation */

~~~

37. src/include/commands/publicationcmds.h

@@ -32,8 +32,8 @@ extern ObjectAddress AlterPublicationOwner(const
char *name, Oid newOwnerId);
 extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId);
 extern void InvalidatePublicationRels(List *relids);
 extern bool pub_rf_contains_invalid_column(Oid pubid, Relation relation,
-    List *ancestors, bool pubviaroot);
+    List *ancestors, bool pubviaroot, bool alltables);
 extern bool pub_collist_contains_invalid_column(Oid pubid, Relation relation,
- List *ancestors, bool pubviaroot);
+ List *ancestors, bool pubviaroot, bool alltables);

Elsewhere in this patch, a similarly added param is called
"puballtables" (not "alltables"). Please check all places and use a
consistent param name for all of them.

~~~

38. src/test/regress/sql/publication.sql

There don't seem to be any tests for more than one EXCEPT TABLE (e.g.
no list tests?)

~~~

38. src/test/regress/sql/publication.sql

Maybe adjust all the below comments (a-d) to say "EXCEPT TABLES"
instead of "except tables"

38a.
+-- can't add except table to 'FOR ALL TABLES' publication

38b.
+-- can't add except table to 'FOR TABLE' publication

38c.
+-- can't add except table to 'FOR ALL TABLES IN SCHEMA' publication

38d.
+-- can't add except table when publish_via_partition_root option does not
+-- have default value

38e.
+-- can't add except table when the publication options does not have default
+-- values

SUGGESTION
can't add EXCEPT TABLE when the publication options are not the default values

~~~

39. .../t/032_rep_changes_except_table.pl

39a.
+# Check the table data does not sync for excluded table
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM sch1.tab1");
+is($result, qq(0||), 'check tablesync is excluded for excluded tables');

Maybe the "is" message should say "check there is no initial data
copied for the excluded table"

~~~


40. .../t/032_rep_changes_except_table.pl

+# Insert some data into few tables and verify that inserted data is not
+# replicated
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))");

The comment is not quite correct. You are inserting into only one
table here - not "few tables".

~~~

41. .../t/032_rep_changes_except_table.pl

+# Alter publication to exclude data changes in public.tab1 and verify that
+# subscriber does not get the new table data.

"new table data" -> "changed data for this table"

------
[1]
https://www.postgresql.org/message-id/TYCPR01MB83737C28187A6E0BADAE98F0EDCF9%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v5-0002.
>
> There may be an overlap with comments recently posted by Osumi-san [1].
>
> (I also have review comments for v5-0002; will post them tomorrow)
>
> ======
>
> 1. General
>
> Is it really necessary to have to say "EXCEPT TABLE" instead of just
> "EXCEPT". It seems unnecessarily verbose and redundant when you write
> "FOR ALL TABLES EXCEPT TABLE...".
>
> If you want to keep this TABLE keyword (maybe you have plans for other
> kinds of except?)
>

I don't think there is an immediate plan but one can imagine using
EXCEPT SCHEMA. Then for column lists, one may want to use the syntax
Create Publication pub1 For Table t1 Except Cols (c1, ..);

> then IMO perhaps at least it can be the optional
> default except type. e.g. EXCEPT [TABLE].
>

Yeah, that might be okay, so, even if we plan to extend this in the
future, by default we will consider the list of tables after EXCEPT
but if the user mentions EXCEPT SCHEMA or something else then we can
use a different object. Does that sound okay?

>
> 3. General
>
> The ONLY keyword seems supported by the syntax for tables of the
> except-list (more on this in later comments) but:
> a) I am not sure if the patch code is accounting for that, and
> b) There are no test cases using ONLY.
>
> ~~~
>

Isn't it better to map ONLY with the way it can already be specified
in CREATE PUBLICATION? I am not sure what exactly is proposed and what
is your suggestion? Can you please explain if it is different from the
way we use it for CREATE PUBLICATION?

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Tue, May 17, 2022 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Below are my review comments for v5-0002.
> >
> > There may be an overlap with comments recently posted by Osumi-san [1].
> >
> > (I also have review comments for v5-0002; will post them tomorrow)
> >
> > ======
> >
> > 1. General
> >
> > Is it really necessary to have to say "EXCEPT TABLE" instead of just
> > "EXCEPT". It seems unnecessarily verbose and redundant when you write
> > "FOR ALL TABLES EXCEPT TABLE...".
> >
> > If you want to keep this TABLE keyword (maybe you have plans for other
> > kinds of except?)
> >
>
> I don't think there is an immediate plan but one can imagine using
> EXCEPT SCHEMA. Then for column lists, one may want to use the syntax
> Create Publication pub1 For Table t1 Except Cols (c1, ..);
>
> > then IMO perhaps at least it can be the optional
> > default except type. e.g. EXCEPT [TABLE].
> >
>
> Yeah, that might be okay, so, even if we plan to extend this in the
> future, by default we will consider the list of tables after EXCEPT
> but if the user mentions EXCEPT SCHEMA or something else then we can
> use a different object. Does that sound okay?

Yes. That is what I meant.

>
> >
> > 3. General
> >
> > The ONLY keyword seems supported by the syntax for tables of the
> > except-list (more on this in later comments) but:
> > a) I am not sure if the patch code is accounting for that, and
> > b) There are no test cases using ONLY.
> >
> > ~~~
> >
>
> Isn't it better to map ONLY with the way it can already be specified
> in CREATE PUBLICATION? I am not sure what exactly is proposed and what
> is your suggestion? Can you please explain if it is different from the
> way we use it for CREATE PUBLICATION?
>

Yes, I am not proposing anything different to how ONLY already works
for published tables. I was only questioning whether the patch behaves
correctly when ONLY is specified for the tables of the EXCEPT list. I
had some doubt about it because there are a few other review comments
I wrote (e.g. in pg_dump.c), and also I did not find any ONLY tests.
------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Skipping schema changes in publication

From
"shiy.fnst@fujitsu.com"
Date:
On Sat, May 14, 2022 9:33 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> Thanks for the comments, the attached v5 patch has the changes for the
> same. Also I have made the changes for SKIP Table based on the new
> syntax, the changes for the same are available in
> v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
>

Thanks for your patch. Here are some comments on v5-0001 patch.

+        Oid            relid = lfirst_oid(lc);
+
+        prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
+                               ObjectIdGetDatum(relid),
+                               ObjectIdGetDatum(pubid));
+        if (!OidIsValid(prid))
+            ereport(ERROR,
+                    (errcode(ERRCODE_UNDEFINED_OBJECT),
+                     errmsg("relation \"%s\" is not part of the publication",
+                            RelationGetRelationName(rel))));

I think the relation in the error message should be the one whose oid is
"relid", instead of relation "rel".

Besides, I think it might be better not to report an error in this case. If
"prid" is invalid, just ignore this relation. Because in RESET cases, we want to
drop all tables in the publications, and there is no specific table.
(If you agree with that, similarly missing_ok should be set to true when calling
PublicationDropSchemas().)

Regards,
Shi yu

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, May 16, 2022 at 8:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, the attached v5 patch has the changes for the same.
> > Also I have made the changes for SKIP Table based on the new syntax, the
> > changes for the same are available in
> > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
> Hi,
>
>
> Thank you for updating the patch.
> I'll share a few minor review comments on v5-0001.
>
>
> (1) doc/src/sgml/ref/alter_publication.sgml
>
> @@ -73,12 +85,13 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
>     Adding a table to a publication additionally requires owning that table.
>     The <literal>ADD ALL TABLES IN SCHEMA</literal> and
>     <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
> -   invoking user to be a superuser.  To alter the owner, you must also be a
> -   direct or indirect member of the new owning role. The new owner must have
> -   <literal>CREATE</literal> privilege on the database.  Also, the new owner
> -   of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN
> -   SCHEMA</literal> publication must be a superuser. However, a superuser can
> -   change the ownership of a publication regardless of these restrictions.
> +   invoking user to be a superuser.  <literal>RESET</literal> of publication
> +   requires the invoking user to be a superuser. To alter the owner, you must
> ...
>
>
> I suggest combining the first part of your change with one existing sentence
> before your change, to make our description concise.
>
> FROM:
> "The <literal>ADD ALL TABLES IN SCHEMA</literal> and
> <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the
> invoking user to be a superuser.  <literal>RESET</literal> of publication
> requires the invoking user to be a superuser."
>
> TO:
> "The <literal>ADD ALL TABLES IN SCHEMA</literal>,
> <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and
> <literal>RESET</literal> of publication requires the invoking user to be a superuser."

Modified

>
> (2) typo
>
> +++ b/src/backend/commands/publicationcmds.c
> @@ -53,6 +53,13 @@
>  #include "utils/syscache.h"
>  #include "utils/varlena.h"
>
> +#define PUB_ATION_INSERT_DEFAULT true
> +#define PUB_ACTION_UPDATE_DEFAULT true
>
>
> Kindly change
> FROM:
> "PUB_ATION_INSERT_DEFAULT"
> TO:
> "PUB_ACTION_INSERT_DEFAULT"

Modified

>
> (3) src/test/regress/expected/publication.out
>
> +-- Verify that only superuser can reset a publication
> +ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
> +SET ROLE regress_publication_user2;
> +ALTER PUBLICATION testpub_reset RESET; -- fail
>
>
> We have "-- fail" for one case in this patch.
> On the other hand, isn't it better to add "-- ok" (or "-- success") for
> other successful statements,
> when we consider the consistency of the test descriptions as a whole?

We generally do not add success comments for all the successful cases,
as that might be overkill. I felt it is better to keep it as it is.
Thoughts?

The attached v6 patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, May 16, 2022 at 2:53 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v5-0001.
>
> There is some overlap with comments recently posted by Osumi-san [1].
>
> (I also have review comments for v5-0002; will post them tomorrow)
>
> ======
>
> 1. Commit message
>
> This patch adds a new RESET clause to ALTER PUBLICATION which will reset
> the publication to default state which includes resetting the publication
> options, setting ALL TABLES option to false and dropping the relations and
> schemas that are associated with the publication.
>
> SUGGEST
> "to default state" -> "to the default state"
> "ALL TABLES option" -> "ALL TABLES flag"

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the
> +   default state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> option to <literal>false</literal> and
> +   dropping all relations and schemas that are associated with the publication.
>    </para>
>
> "ALL TABLES option" -> "ALL TABLES flag"

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml
>
> +   invoking user to be a superuser.  <literal>RESET</literal> of publication
> +   requires the invoking user to be a superuser. To alter the owner, you must
>
> SUGGESTION
> To <literal>RESET</literal> a publication requires the invoking user
> to be a superuser.

I have combined it with the earlier sentence.

> ~~~
>
> 4. src/backend/commands/publicationcmds.c
>
> @@ -53,6 +53,13 @@
>  #include "utils/syscache.h"
>  #include "utils/varlena.h"
>
> +#define PUB_ATION_INSERT_DEFAULT true
> +#define PUB_ACTION_UPDATE_DEFAULT true
> +#define PUB_ACTION_DELETE_DEFAULT true
> +#define PUB_ACTION_TRUNCATE_DEFAULT true
> +#define PUB_VIA_ROOT_DEFAULT false
> +#define PUB_ALL_TABLES_DEFAULT false
>
> 4a.
> Typo: "ATION" -> "ACTION"

Modified

> 4b.
> I think these #defines deserve a 1 line comment.
> e.g.
> /* CREATE PUBLICATION default values for flags and options */

Added comment

> 4c.
> Since the "_DEFAULT" is a common part of all the names, maybe it is
> tidier if it comes first.
> e.g.
> #define PUB_DEFAULT_ACTION_INSERT true
> #define PUB_DEFAULT_ACTION_UPDATE true
> #define PUB_DEFAULT_ACTION_DELETE true
> #define PUB_DEFAULT_ACTION_TRUNCATE true
> #define PUB_DEFAULT_VIA_ROOT false
> #define PUB_DEFAULT_ALL_TABLES false

Modified

The v6 patch attached at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Wed, May 18, 2022 at 8:30 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Sat, May 14, 2022 9:33 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v5 patch has the changes for the
> > same. Also I have made the changes for SKIP Table based on the new
> > syntax, the changes for the same are available in
> > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
> >
>
> Thanks for your patch. Here are some comments on v5-0001 patch.
>
> +               Oid                     relid = lfirst_oid(lc);
> +
> +               prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
> +                                                          ObjectIdGetDatum(relid),
> +                                                          ObjectIdGetDatum(pubid));
> +               if (!OidIsValid(prid))
> +                       ereport(ERROR,
> +                                       (errcode(ERRCODE_UNDEFINED_OBJECT),
> +                                        errmsg("relation \"%s\" is not part of the publication",
> +                                                       RelationGetRelationName(rel))));
>
> I think the relation in the error message should be the one whose oid is
> "relid", instead of relation "rel".

Modified it

> Besides, I think it might be better not to report an error in this case. If
> "prid" is invalid, just ignore this relation. Because in RESET cases, we want to
> drop all tables in the publications, and there is no specific table.
> (If you agree with that, similarly missing_ok should be set to true when calling
> PublicationDropSchemas().)

This scenario should not normally happen, but if it does, I felt we
should throw an error.

The v6 patch attached at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, May 16, 2022 at 2:00 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the comments, the attached v5 patch has the changes for the same.
> > Also I have made the changes for SKIP Table based on the new syntax, the
> > changes for the same are available in
> > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.
> Hi,
>
>
>
> Several comments on v5-0002.
>
> (1) One unnecessary space before "except_pub_obj_list" syntax definition
>
> + except_pub_obj_list:  ExceptPublicationObjSpec
> +                                       { $$ = list_make1($1); }
> +                       | except_pub_obj_list ',' ExceptPublicationObjSpec
> +                                       { $$ = lappend($1, $3); }
> +                       |  /*EMPTY*/                                                            { $$ = NULL; }
> +       ;
> +
>
> From above part, kindly change
> FROM:
> " except_pub_obj_list:  ExceptPublicationObjSpec"
> TO:
> "except_pub_obj_list:  ExceptPublicationObjSpec"
>

Modified

> (2) doc/src/sgml/ref/create_publication.sgml
>
> (2-1)
>
> @@ -22,7 +22,7 @@ PostgreSQL documentation
>   <refsynopsisdiv>
>  <synopsis>
>  CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
> -    [ FOR ALL TABLES
> +    [ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]
>        | FOR <replaceable class="parameter">publication_object</replaceable> [, ... ] ]
>      [ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>][, ... ] ) ]
>
>
> Here I think we need to add two more whitespaces around square brackets.
> Please change
> FROM:
> "[ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]"
> TO:
> "[ FOR ALL TABLES [ EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ] ]"
>
> When I check other documentations, I see whitespaces before/after square brackets.
>
> (2-2)
> This whitespace alignment applies to alter_publication.sgml as well.

Modified

> (3)
>
>
> @@ -156,6 +156,24 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
>      </listitem>
>     </varlistentry>
>
> +
> +   <varlistentry>
> +    <term><literal>EXCEPT TABLE</literal></term>
> +    <listitem>
> +     <para>
> +      Marks the publication as one that excludes replicating changes for the
> +      specified tables.
> +     </para>
> +
> +     <para>
> +      <literal>EXCEPT TABLE</literal> can be specified only for
> +      <literal>FOR ALL TABLES</literal> publication. It is not supported for
> +      <literal>FOR ALL TABLES IN SCHEMA </literal> publication and
> +      <literal>FOR TABLE</literal> publication.
> +     </para>
> +    </listitem>
> +   </varlistentry>
> +
>
> This EXCEPT TABLE clause is only for FOR ALL TABLES.
> So, how about extracting the main message from above part and
> moving it to an existing paragraph below, instead of having one independent paragraph?
>
>    <varlistentry>
>     <term><literal>FOR ALL TABLES</literal></term>
>     <listitem>
>      <para>
>       Marks the publication as one that replicates changes for all tables in
>       the database, including tables created in the future.
>      </para>
>     </listitem>
>    </varlistentry>
>
> Something like
> "Marks the publication as one that replicates changes for all tables in
> the database, including tables created in the future. EXCEPT TABLE indicates
> excluded tables for the defined publication.
> "
>

Modified

> (4) One minor confirmation about the syntax
>
> Currently, we allow one way of writing to indicate excluded tables like below.
>
> (example) CREATE PUBLICATION mypub FOR ALL TABLES EXCEPT TABLE tab3, tab4, EXCEPT TABLE tab5;
>
> This is because we define ExceptPublicationObjSpec with EXCEPT TABLE.
> Is it OK to have a room to write duplicate "EXCEPT TABLE" clauses ?
> I think there is no harm in having this,
> but I'd like to confirm whether this syntax might be better to be adjusted or not.

Changed it to allow EXCEPT TABLE to be specified only once.

>
> (5) CheckAlterPublication
>
> +
> +       if (excepttable && !stmt->for_all_tables)
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                                errmsg("publication \"%s\" is not defined as FOR ALL TABLES",
> +                                               NameStr(pubform->pubname)),
>                                errdetail("except table cannot be added to, dropped from, or set on NON ALL TABLES publications.")));
>
> Could you please add a test for this ?

This code is no longer needed: after the grammar optimization, the
parser does not allow except tables without "ALL TABLES". Removed this code.

The v6 patch attached at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com

Regards,
Vignesh



RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Thursday, May 19, 2022 2:45 AM vignesh C <vignesh21@gmail.com> wrote:
> On Mon, May 16, 2022 at 8:32 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> > (3) src/test/regress/expected/publication.out
> >
> > +-- Verify that only superuser can reset a publication ALTER
> > +PUBLICATION testpub_reset OWNER TO regress_publication_user2; SET
> > +ROLE regress_publication_user2; ALTER PUBLICATION testpub_reset
> > +RESET; -- fail
> >
> >
> > We have "-- fail" for one case in this patch.
> > On the other hand, isn't it better to add "-- ok" (or "-- success") for
> > other successful statements, when we consider the consistency of the
> > test descriptions as a whole?
> 
> We generally do not mention success comments for all the success cases as
> that might be an overkill. I felt it is better to keep it as it is.
> Thoughts?
Thank you for updating the patches!

On this point, I meant that we would add "-- ok" to each successful
"ALTER PUBLICATION testpub_reset RESET;" statement.
That would be just three places to add "-- ok",
so I thought it was not overkill.

*But* I'm also OK with your idea.
Please leave the comments as they are in v6.


Best Regards,
    Takamichi Osumi


Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v5-0002.
>
> There may be an overlap with comments recently posted by Osumi-san [1].
>
> (I also have review comments for v5-0002; will post them tomorrow)
>
> ======
>
> 1. General
>
> Is it really necessary to have to say "EXCEPT TABLE" instead of just
> "EXCEPT". It seems unnecessarily verbose and redundant when you write
> "FOR ALL TABLES EXCEPT TABLE...".
>
> If you want to keep this TABLE keyword (maybe you have plans for other
> kinds of except?) then IMO perhaps at least it can be the optional
> default except type. e.g. EXCEPT [TABLE].

I have made TABLE optional.

> ~~~
>
> 2. General
>
> (I was unsure whether to even mention this one).
>
> I understand the "EXCEPT" is chosen as the user-facing syntax, but it
> still seems strange when reading the patch to see attribute members
> and column names called 'except'. I think the problem is that "except"
> is not a verb, so saying except=t/f just does not make much sense.
> Sometimes I feel that for all the internal usage
> (code/comments/catalog) using "skip" and "skip-list" etc would be a
> much better choice of names. OTOH I can see that having consistency
> with the outside syntax might also be good. Anyway, please consider -
> maybe other people feel the same?

Earlier we had discussed whether to use SKIP, but felt SKIP was not
appropriate and planned to use except as in [1]. Let's use except
unless we find a better alternative.

> ~~~
>
> 3. General
>
> The ONLY keyword seems supported by the syntax for tables of the
> except-list (more on this in later comments) but:
> a) I am not sure if the patch code is accounting for that, and

I have kept the behavior similar to FOR TABLE

> b) There are no test cases using ONLY.

Added tests for the same

> ~~~
>
> 4. Commit message
>
> A new option "EXCEPT TABLE" in Create/Alter Publication allows
> one or more tables to be excluded, publisher will exclude sending the data
> of the excluded tables to the subscriber.
>
> SUGGESTION
> A new "EXCEPT TABLE" clause for CREATE/ALTER PUBLICATION allows one or
> more tables to be excluded. The publisher will not send the data of
> excluded tables to the subscriber.

Modified

> ~~
>
> 5. Commit message
>
> The new syntax allows specifying exclude relations while creating a publication
> or exclude relations in alter publication. For example:
>
> SUGGESTION
> The new syntax allows specifying excluded relations when creating or
> altering a publication. For example:

Modified

> ~~~
>
> 6. Commit message
>
> A new column prexcept is added to table "pg_publication_rel", to maintain
> the relations that the user wants to exclude publishing through the publication.
>
> SUGGESTION
> A new column "prexcept" is added to table "pg_publication_rel", to
> maintain the relations that the user wants to exclude from the
> publications.

Modified

> ~~~
>
> 7. Commit message
>
> Modified the output plugin (pgoutput) to exclude publishing the changes of the
> excluded tables.
>
> I did not feel it was necessary to say this. It is already said above
> that the data is not sent, so that seems enough.

Modified

> ~~~
>
> 8. Commit message
>
> Updates pg_dump to identify and dump the excluded tables of the publications.
> Updates the \d family of commands to display excluded tables of the
> publications and \dRp+ variant will now display associated except tables if any.
>
> SUGGESTION
> pg_dump is updated to identify and dump the excluded tables of the publications.
>
> The psql \d family of commands to display excluded tables. e.g. psql
> \dRp+ variant will now display associated "except tables" if any.

Modified

> ~~~
>
> 9. doc/src/sgml/catalogs.sgml
>
> @@ -6426,6 +6426,15 @@ SCRAM-SHA-256$<replaceable><iteration
> count></replaceable>:<replaceable>&l
>        if there is no publication qualifying condition.</para></entry>
>       </row>
>
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +      <structfield>prexcept</structfield> <type>bool</type>
> +      </para>
> +      <para>
> +       True if the table must be excluded
> +      </para></entry>
> +     </row>
>
> Other descriptions on this page refer to "relation" instead of
> "table". Probably this should do the same to be consistent.

Modified

> ~~~
>
> 10. doc/src/sgml/logical-replication.sgml
>
> @@ -1167,8 +1167,9 @@ CONTEXT:  processing remote data for replication
> origin "pg_16395" during "INSER
>    <para>
>     To add tables to a publication, the user must have ownership rights on the
>     table. To add all tables in schema to a publication, the user must be a
> -   superuser. To create a publication that publishes all tables or
> all tables in
> -   schema automatically, the user must be a superuser.
> +   superuser. To add all tables to a publication, the user must be a superuser.
> +   To create a publication that publishes all tables or all tables in schema
> +   automatically, the user must be a superuser.
>    </para>
>
> It seems like a valid change but how is this related to this EXCEPT
> patch. Maybe this fix should be patched separately?

Earlier, it was not possible to add ALL TABLES while altering a
publication. The documentation change is part of this patch because the
patch adds support for the following syntax:
ALTER PUBLICATION pubname ADD ALL TABLES

> ~~~
>
> 11. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -22,6 +22,7 @@ PostgreSQL documentation
>   <refsynopsisdiv>
>  <synopsis>
>  ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD <replaceable class="parameter">publication_object</replaceable> [,
> ...]
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable
> class="parameter">table_name</replaceable> [ * ] [, ... ]]
>
> The [ONLY] looks misplaced when the syntax is described like this. For
> example, in practice it is possible to write "EXCEPT TABLE ONLY t1,
> ONLY t2, t3, ONLY t4" but it doesn't seem that way by looking at these
> PG DOCS.
>
> IMO would be better described like this:
>
> [ FOR ALL TABLES [ EXCEPT TABLE exception_object [,...] ]]
>
> where exception_object is:
>
>     [ ONLY ] table_name [ * ]

Modified

> ~~~
>
> 12. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -82,8 +83,8 @@ ALTER PUBLICATION <replaceable
> class="parameter">name</replaceable> RESET
>
>    <para>
>     You must own the publication to use <command>ALTER PUBLICATION</command>.
> -   Adding a table to a publication additionally requires owning that table.
> -   The <literal>ADD ALL TABLES IN SCHEMA</literal> and
> +   Adding a table or excluding a table to a publication additionally requires
> +   owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal> and
>
> SUGGESTION
> Adding a table to or excluding a table from a publication additionally
> requires owning that table.

Modified

> ~~~
>
> 13. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -213,6 +214,14 @@ ALTER PUBLICATION sales_publication ADD ALL
> TABLES IN SCHEMA marketing, sales;
>  </programlisting>
>    </para>
>
> +  <para>
> +   Alter publication <structname>production_publication</structname> that
> +   publishes all tables except <structname>users</structname> and
> +   <structname>departments</structname> tables:
> +<programlisting>
>
> "that publishes" -> "to publish"

Modified

> ~~~
>
> 14. doc/src/sgml/ref/create_publication.sgml
>
> (Same comment about the ONLY syntax as #11)

Modified

> ~~~
>
> 15. doc/src/sgml/ref/create_publication.sgml
>
> +   <varlistentry>
> +    <term><literal>EXCEPT TABLE</literal></term>
> +    <listitem>
> +     <para>
> +      Marks the publication as one that excludes replicating changes for the
> +      specified tables.
> +     </para>
> +
> +     <para>
> +      <literal>EXCEPT TABLE</literal> can be specified only for
> +      <literal>FOR ALL TABLES</literal> publication. It is not supported for
> +      <literal>FOR ALL TABLES IN SCHEMA </literal> publication and
> +      <literal>FOR TABLE</literal> publication.
> +     </para>
> +    </listitem>
> +   </varlistentry>
>
> IMO you can remove all that "It is not supported for..." sentence. You
> don't need to spell that out again when it is already clear from the
> syntax.

Modified

> ~~~
>
> 16. doc/src/sgml/ref/psql-ref.sgml
>
> @@ -1868,8 +1868,9 @@ testdb=>
>          If <replaceable class="parameter">pattern</replaceable> is
>          specified, only those publications whose names match the pattern are
>          listed.
> -        If <literal>+</literal> is appended to the command name, the tables and
> -        schemas associated with each publication are shown as well.
> +        If <literal>+</literal> is appended to the command name, the tables,
> +        excluded tables and schemas associated with each publication
> are shown as
> +        well.
>          </para>
>
> Perhaps this is OK just as-is, but OTOH I felt that the change was
> almost unnecessary because saying it displays "the tables" kind of
> implies it would also have to account for the "excluded tables" too.

I worded it that way to make it explicit and to avoid questions from
other reviewers later. I would prefer to keep it as is.

> ~~~
>
> 17. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
>
> @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List
> *ancestors, int *ancestor_level
>   foreach(lc, ancestors)
>   {
>   Oid ancestor = lfirst_oid(lc);
> - List    *apubids = GetRelationPublications(ancestor);
> + List    *apubids = GetRelationPublications(ancestor, false);
>   List    *aschemaPubids = NIL;
> + List    *aexceptpubids = NIL;
>
> 17a.
> I think the var "aschemaPubids" and "aexceptpubids" are only used in
> the 'else' block so it seems better they can be declared and freed in
> that block too instead of always.

Modified

> 17b.
> Also, the camel-case of those variables is inconsistent so may fix
> that at the same time.

Modified

> ~~~
>
> 18. src/backend/catalog/pg_publication.c - GetRelationPublications
>
> @@ -666,7 +673,7 @@ publication_add_schema(Oid pubid, Oid schemaid,
> bool if_not_exists)
>
>  /* Gets list of publication oids for a relation */
>  List *
> -GetRelationPublications(Oid relid)
> +GetRelationPublications(Oid relid, bool bexcept)
>
> 18a.
> I felt that "except_flag" is a better name than "bexcept" for this param.

Modified

> 18b.
> The function comment should be updated to say only relations matching
> this except_flag are returned in the list.

Modified

> ~~~
>
> 19. src/backend/catalog/pg_publication.c - GetAllTablesPublicationRelations
>
> @@ -787,6 +795,15 @@ GetAllTablesPublicationRelations(bool pubviaroot)
>   HeapTuple tuple;
>   List    *result = NIL;
>
> + /*
> + * pg_publication_rel and pg_publication_namespace will only have excluded
> + * tables in case of all tables publication, no need to pass except flag
> + * to get the relations.
> + */
> + List    *exceptpubtablelist;
> +
> + exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> +
>
> 19a.
> I wasn't very sure of the meaning/intent of the comment, but IIUC it
> seems to be explaining why it is not necessary to use an "except_flag"
> parameter in this code. Is it necessary/helpful to explain parameters
> that do NOT exist?

I have removed it

> 19b.
> The var name "exceptpubtablelist" seems a bit overkill. (e.g.
> "excepttablelist" or "exceptlist" etc... are shorter but seem equally
> informative).

Modified

> ~~~
>
> 20. src/backend/commands/publicationcmds.c  - CreatePublication
>
> @@ -843,54 +849,52 @@ CreatePublication(ParseState *pstate,
> CreatePublicationStmt *stmt)
>   /* Make the changes visible. */
>   CommandCounterIncrement();
>
> - /* Associate objects with the publication. */
> - if (stmt->for_all_tables)
> - {
> - /* Invalidate relcache so that publication info is rebuilt. */
> - CacheInvalidateRelcacheAll();
> - }
> - else
> - {
> - ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
> -    &schemaidlist);
> + ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
> + &schemaidlist);
>
> - /* FOR ALL TABLES IN SCHEMA requires superuser */
> - if (list_length(schemaidlist) > 0 && !superuser())
> - ereport(ERROR,
> - errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> - errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
> + /* FOR ALL TABLES IN SCHEMA requires superuser */
> + if (list_length(schemaidlist) > 0 && !superuser())
> + ereport(ERROR,
> + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
>
> - if (list_length(relations) > 0)
> - {
> - List    *rels;
> + if (list_length(relations) > 0)
> + {
> + List    *rels;
>
> - rels = OpenTableList(relations);
> - CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
> -   PUBLICATIONOBJ_TABLE);
> + rels = OpenTableList(relations);
> + CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
> + PUBLICATIONOBJ_TABLE);
>
> - TransformPubWhereClauses(rels, pstate->p_sourcetext,
> - publish_via_partition_root);
> + TransformPubWhereClauses(rels, pstate->p_sourcetext,
> + publish_via_partition_root);
>
> - CheckPubRelationColumnList(rels, pstate->p_sourcetext,
> -    publish_via_partition_root);
> + CheckPubRelationColumnList(rels, pstate->p_sourcetext,
> + publish_via_partition_root);
>
> - PublicationAddTables(puboid, rels, true, NULL);
> - CloseTableList(rels);
> - }
> + PublicationAddTables(puboid, rels, true, NULL);
> + CloseTableList(rels);
> + }
>
> - if (list_length(schemaidlist) > 0)
> - {
> - /*
> - * Schema lock is held until the publication is created to prevent
> - * concurrent schema deletion.
> - */
> - LockSchemaList(schemaidlist);
> - PublicationAddSchemas(puboid, schemaidlist, true, NULL);
> - }
> + if (list_length(schemaidlist) > 0)
> + {
> + /*
> + * Schema lock is held until the publication is created to prevent
> + * concurrent schema deletion.
> + */
> + LockSchemaList(schemaidlist);
> + PublicationAddSchemas(puboid, schemaidlist, true, NULL);
>   }
>
>   table_close(rel, RowExclusiveLock);
>
> + /* Associate objects with the publication. */
> + if (stmt->for_all_tables)
> + {
> + /* Invalidate relcache so that publication info is rebuilt. */
> + CacheInvalidateRelcacheAll();
> + }
> +
>
> This function is refactored a lot to not use "if/else" as it did
> before. But AFAIK (maybe I misunderstood) this refactor doesn't seem
> to actually have anything to do with the EXCEPT patch. If it really is
> unrelated maybe it should not be part of this patch.

Earlier, tables could not be specified together with ALL TABLES. Now
except tables can be specified with ALL TABLES, and they must be added to
pg_publication_rel. The code changes are required to handle that.

> ~~~
>
> 21. src/backend/commands/publicationcmds.c - CheckPublicationDefValues
>
> + if (pubform->puballtables)
> + return false;
> +
> + if (!pubform->pubinsert || !pubform->pubupdate || !pubform->pubdelete ||
> + !pubform->pubtruncate || pubform->pubviaroot)
> + return false;
>
> Now you have all the #define for the PUB_DEFAULT_XXX values, perhaps
> this function should be using them instead of the hardcoded
> assumptions what the default values are.
>
> e.g.
>
> if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES) return false;
> if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT) return false;
> ...
> etc.

Modified
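
For illustration, the macro-based check suggested in comment 21 might look roughly like the sketch below. It is not the code from the patch: PubFormSketch is a hypothetical stand-in for Form_pg_publication, the function name is invented for this example, and only the flag comparison is shown (any check for associated tables or schemas is left out).

```c
#include <stdbool.h>

/* CREATE PUBLICATION default values for flags and options */
#define PUB_DEFAULT_ACTION_INSERT   true
#define PUB_DEFAULT_ACTION_UPDATE   true
#define PUB_DEFAULT_ACTION_DELETE   true
#define PUB_DEFAULT_ACTION_TRUNCATE true
#define PUB_DEFAULT_VIA_ROOT        false
#define PUB_DEFAULT_ALL_TABLES      false

/* Hypothetical stand-in for the relevant pg_publication columns. */
typedef struct PubFormSketch
{
    bool puballtables;
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
    bool pubtruncate;
    bool pubviaroot;
} PubFormSketch;

/*
 * Return true only if every flag still has its CREATE PUBLICATION
 * default. Each field is compared against its macro, so changing a
 * default later does not require touching this function.
 */
static bool
pub_flags_are_default(const PubFormSketch *pubform)
{
    if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES)
        return false;
    if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT)
        return false;
    if (pubform->pubupdate != PUB_DEFAULT_ACTION_UPDATE)
        return false;
    if (pubform->pubdelete != PUB_DEFAULT_ACTION_DELETE)
        return false;
    if (pubform->pubtruncate != PUB_DEFAULT_ACTION_TRUNCATE)
        return false;
    if (pubform->pubviaroot != PUB_DEFAULT_VIA_ROOT)
        return false;
    return true;
}
```

Comparing against the macros rather than writing !pubform->pubinsert keeps the check correct even if one of the defaults is changed in the future.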

> ~~~
>
> 22. src/backend/commands/publicationcmds.c -  CheckAlterPublication
>
>
> @@ -1442,6 +1516,19 @@ CheckAlterPublication(AlterPublicationStmt
> *stmt, HeapTuple tup,
>     List *tables, List *schemaidlist)
>  {
>   Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
> + ListCell   *lc;
> + bool nonexcepttable = false;
> + bool excepttable = false;
> +
> + foreach(lc, tables)
> + {
> + PublicationTable *pub_table = lfirst_node(PublicationTable, lc);
> +
> + if (!pub_table->except)
> + nonexcepttable = true;
> + else
> + excepttable = true;
> + }
>
> 22a.
> The names are very confusing. e.g. "nonexcepttable" is like a double-negative.
>
> SUGGEST:
> bool has_tables = false;
> bool has_except_tables = false;
>
> 22b.
> Reverse the "if" condition to be positive instead of negative (remove !)
> e.g.
> if (pub_table->except)
> has_except_tables = true;
> else
> has_tables = true;

This code is no longer needed: after the grammar optimization, the
parser does not allow except table without "ALL TABLES". Removed these changes.

> ~~~
>
> 23. src/backend/commands/publicationcmds.c -  CheckAlterPublication
>
> @@ -1461,12 +1548,19 @@ CheckAlterPublication(AlterPublicationStmt
> *stmt, HeapTuple tup,
>   errdetail("Tables from schema cannot be added to, dropped from, or
> set on FOR ALL TABLES publications.")));
>
>   /* Check that user is allowed to manipulate the publication tables. */
> - if (tables && pubform->puballtables)
> + if (nonexcepttable && tables && pubform->puballtables)
>   ereport(ERROR,
>
> Seems no reason for "tables" to be in the condition since
> "nonexcepttable" can't be true if "tables" is NIL.

This code is no longer needed: after the grammar optimization, the
parser does not allow except table without "ALL TABLES". Removed these changes.

> ~~~
>
> 24. src/backend/commands/publicationcmds.c -  CheckAlterPublication
>
> +
> + if (excepttable && !stmt->for_all_tables)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("publication \"%s\" is not defined as FOR ALL TABLES",
> + NameStr(pubform->pubname)),
> + errdetail("except table cannot be added to, dropped from, or set on
> NON ALL TABLES publications.")));
>
> The errdetail message seems over-complex.
>
> SUGGESTION
> "EXCEPT TABLE clause is only allowed for FOR ALL TABLES publications."

This code is no longer needed: after the grammar optimization, the
parser does not allow except table without "ALL TABLES". Removed this code.

> ~~~
>
> 25. src/backend/commands/publicationcmds.c - AlterPublication
>
> @@ -1500,6 +1594,20 @@ AlterPublication(ParseState *pstate,
> AlterPublicationStmt *stmt)
>   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
>      stmt->pubname);
>
> + if (stmt->for_all_tables)
> + {
> + bool isdefault = CheckPublicationDefValues(tup);
> +
> + if (!isdefault)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> default values",
> +    stmt->pubname),
> + errhint("Either the publication has tables/schemas associated or
> does not have default publication options or ALL TABLES option is
> set."));
>
> The errhint message seems over-complex.
>
> SUGGESTION
> "Use ALTER PUBLICATION ... RESET"

Modified

> ~~~
>
> 26. src/bin/pg_dump/pg_dump.c - dumpPublication
>
> @@ -3980,8 +3982,34 @@ dumpPublication(Archive *fout, const
> PublicationInfo *pubinfo)
>     qpubname);
>
>   if (pubinfo->puballtables)
> + {
> + SimplePtrListCell *cell;
> + bool first = true;
>   appendPQExpBufferStr(query, " FOR ALL TABLES");
>
> + /* Include exception tables if the publication has except tables */
> + for (cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + PublicationInfo *relpubinfo = pubrinfo->publication;
> + TableInfo  *tbinfo;
> +
> + if (pubinfo == relpubinfo)
> + {
> + tbinfo = pubrinfo->pubtable;
> +
> + if (first)
> + {
> + appendPQExpBufferStr(query, " EXCEPT TABLE ONLY");
> + first = false;
> + }
> + else
> + appendPQExpBufferStr(query, ", ");
> + appendPQExpBuffer(query, " %s", fmtQualifiedDumpable(tbinfo));
> + }
> + }
> + }
> +
>
> IIUC this usage of ONLY looks incorrect.
>
> 26a.
> Firstly, if you want to hardwire ONLY then shouldn't it apply to every
> of the except-list table, not just the first one? e.g. "EXCEPT TABLE
> ONLY t1, ONLY t2, ONLY t3..."

Modified, included ONLY for all the tables

> 26b.
> Secondly, is it even correct to unconditionally hardwire the ONLY? How
> do you know that is how the user wanted it?

Whether ONLY was specified is handled when the publication is created,
and the resulting tables are stored in pg_publication_rel. At dump time
every parent and child table is listed individually, so specifying ONLY
for each of them handles both scenarios, with and without ONLY. This is
the same behavior as for FOR TABLE publications.
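
As a rough illustration of the point in 26a, the sketch below builds the clause with ONLY in front of every excluded table. It is a self-contained example using plain snprintf, not pg_dump's PQExpBuffer API; the function name, its parameters, and the table names in the usage note are all hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not pg_dump code): write " FOR ALL TABLES" into
 * buf and, if any tables are excluded, append an EXCEPT TABLE list with
 * ONLY in front of every table name.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
build_all_tables_clause(char *buf, size_t buflen,
                        const char *const *except_tables, size_t ntables)
{
    size_t used;
    int    n;

    n = snprintf(buf, buflen, " FOR ALL TABLES");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (size_t i = 0; i < ntables; i++)
    {
        /* The first entry opens the clause; later ones are comma-separated. */
        n = snprintf(buf + used, buflen - used, "%s ONLY %s",
                     i == 0 ? " EXCEPT TABLE" : ",", except_tables[i]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }
    return 0;
}
```

Called with the two names public.t1 and public.t2, this produces " FOR ALL TABLES EXCEPT TABLE ONLY public.t1, ONLY public.t2".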

> ~~~
>
> 27. src/bin/pg_dump/pg_dump.c
>
> @@ -127,6 +127,8 @@ static SimpleOidList foreign_servers_include_oids
> = {NULL, NULL};
>  static SimpleStringList extension_include_patterns = {NULL, NULL};
>  static SimpleOidList extension_include_oids = {NULL, NULL};
>
> +static SimplePtrList exceptinfo = {NULL, NULL};
> +
>
> Probably I just did not understand how this logic works, but how does
> this static work properly if there are multiple publications and 2
> different EXCEPT lists? E.g. where is it clearing the "exceptinfo" so
> that multiple EXCEPT TABLE lists don't become muddled?

Currently exceptinfo holds all the exception tables together with their
corresponding publications. When a publication is dumped, only the
exception tables that belong to that publication are selected and
dumped with it. Because "CREATE PUBLICATION FOR ALL TABLES EXCEPT TABLE
tb1 .." is a special syntax, all the except tables must be specified in
a single statement, unlike the other publication objects.
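
The selection step described above can be sketched as follows. This is only an illustration under simplifying assumptions: ExceptEntry is a hypothetical stand-in for the SimplePtrList cells, and an integer id stands in for the PublicationInfo pointer that the patch compares.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of the shared exception list. */
typedef struct ExceptEntry
{
    int                 pub_id;  /* stands in for the owning publication */
    const char         *table;   /* excluded table name */
    struct ExceptEntry *next;
} ExceptEntry;

/*
 * Walk the single shared list and copy out the tables that belong to
 * pub_id. Entries of other publications are skipped, so one list can
 * serve several publications without being cleared in between.
 * Returns the number of matches; at most outcap are stored in out.
 */
static size_t
collect_except_tables(const ExceptEntry *head, int pub_id,
                      const char **out, size_t outcap)
{
    size_t n = 0;

    for (const ExceptEntry *cell = head; cell != NULL; cell = cell->next)
    {
        if (cell->pub_id != pub_id)
            continue;
        if (n < outcap)
            out[n] = cell->table;
        n++;
    }
    return n;
}
```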

> ~~~
>
> 28. src/bin/pg_dump/pg_dump.c - dumpPublicationTable
>
> @@ -4330,8 +4378,11 @@ dumpPublicationTable(Archive *fout, const
> PublicationRelInfo *pubrinfo)
>
>   query = createPQExpBuffer();
>
> - appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
> + appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD ",
>     fmtId(pubinfo->dobj.name));
> +
> + appendPQExpBufferStr(query, "TABLE ONLY");
> +
>
> That code refactor does not seem necessary for this patch.

Modified

> ~~~
>
> 29. src/bin/pg_dump/pg_dump_sort.c
>
> @@ -90,6 +90,7 @@ enum dbObjectTypePriorities
>   PRIO_FK_CONSTRAINT,
>   PRIO_POLICY,
>   PRIO_PUBLICATION,
> + PRIO_PUBLICATION_EXCEPT_REL,
>   PRIO_PUBLICATION_REL,
>   PRIO_PUBLICATION_TABLE_IN_SCHEMA,
>   PRIO_SUBSCRIPTION,
>
> I'm not sure how this enum is used (so perhaps this makes no
> difference) but judging by the enum comment why did you put the sort
> priority order PRIO_PUBLICATION_EXCEPT_REL before
> PRIO_PUBLICATION_REL. Wouldn’t it make more sense the other way
> around?

This order does not matter. Since the new syntax is like "CREATE
PUBLICATION.. FOR ALL TABLES EXCEPT TABLE ....", all the except tables
need to be accumulated and handled when the publication is dumped. These
code changes take care of accumulating the exception tables, which are
used later when dumping the publication.

> ~~~
>
> 30. src/bin/psql/describe.c
>
> @@ -2950,17 +2950,34 @@ describeOneTableDetails(const char *schemaname,
>     "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
>     "        ELSE NULL END) "
>     "FROM pg_catalog.pg_publication p\n"
> -   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> -   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> -   "WHERE pr.prrelid = '%s'\n"
> -   "UNION\n"
> +   " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'",
> +   oid, oid, oid);
>
> I feel that trailing "\n" ("WHERE pr.prrelid = '%s'\n") should not
> have been removed.

Modified

> ~~~
>
> 31. src/bin/psql/describe.c
>
> + /* FIXME: 150000 should be changed to 160000 later for PG16. */
> + if (pset.sversion >= 150000)
> + appendPQExpBufferStr(&buf, " AND pr.prexcept = 'f'\n");
> +
> + appendPQExpBuffer(&buf, "UNION\n"
>
> The "UNION\n" param might be better wrapped onto the next line like it
> used to be.

Modified

> ~~~
>
> 32. src/bin/psql/describe.c
>
> + /* FIXME: 150000 should be changed to 160000 later for PG16. */
> + if (pset.sversion >= 150000)
> + appendPQExpBuffer(&buf,
> +   " AND NOT EXISTS (SELECT 1\n"
> +   " FROM pg_catalog.pg_publication_rel pr\n"
> +   " JOIN pg_catalog.pg_class pc\n"
> +   "   ON pr.prrelid = pc.oid\n"
> +   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
> +   oid);
>
> The whitespace indents in the SQL seem excessive here.

Modified

> ~~~
>
> 33. src/bin/psql/describe.c - describePublications
>
> @@ -6322,6 +6344,22 @@ describePublications(const char *pattern)
>   }
>   }
>
> + /* FIXME: 150000 should be changed to 160000 later for PG16. */
> + if (pset.sversion >= 150000)
> + {
> + /* Get the excluded tables for the specified publication */
> + printfPQExpBuffer(&buf,
> +   "SELECT concat(c.relnamespace::regnamespace, '.', c.relname)\n"
> +   "FROM pg_catalog.pg_class c\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prpubid = '%s'\n"
> +   "  AND pr.prexcept = 't'\n"
> +   "ORDER BY 1", pubid);
> + if (!addFooterToPublicationDesc(&buf, "Except tables:",
> + true, &cont))
> + goto error_return;
> + }
> +
>
> I think this code is misplaced. Shouldn't it be if/else and be above
> the other 150000 check, otherwise when you change this to PG16 it may
> not work as expected?

I moved this to the else branch. I felt this is applicable only to an
ALL TABLES publication, so keeping it in the else branch is fine.

> ~~~
>
> 34. src/bin/psql/describe.c - describePublications
>
> + if (!addFooterToPublicationDesc(&buf, "Except tables:",
> + true, &cont))
> + goto error_return;
> + }
>
> Should this be using the _T() macros same as the other prompts for translation?

Modified

> ~~~
>
> 35. src/include/catalog/pg_publication.h
>
> I thought the param "bexpect" should be "except_flag".
>
> (same comment as #18a)

Modified

> ~~~
>
> 36. src/include/catalog/pg_publication_rel.h
>
> @@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
>   Oid oid; /* oid */
>   Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */
>   Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */
> + bool prexcept BKI_DEFAULT(f); /* except the relation */
>
> SUGGEST (comment)
> /* skip the relation */

Changed it to "exclude the relation"

> ~~~
>
> 37. src/include/commands/publicationcmds.h
>
> @@ -32,8 +32,8 @@ extern ObjectAddress AlterPublicationOwner(const
> char *name, Oid newOwnerId);
>  extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId);
>  extern void InvalidatePublicationRels(List *relids);
>  extern bool pub_rf_contains_invalid_column(Oid pubid, Relation relation,
> -    List *ancestors, bool pubviaroot);
> +    List *ancestors, bool pubviaroot, bool alltables);
>  extern bool pub_collist_contains_invalid_column(Oid pubid, Relation relation,
> - List *ancestors, bool pubviaroot);
> + List *ancestors, bool pubviaroot, bool alltables);
>
> Elsewhere in this patch, a similarly added param is called
> "puballtables" (not "alltables"). Please check all places and use a
> consistent param name for all of them.

Modified

> ~~~
>
> 38. src/test/regress/sql/publication.sql
>
> There don't seem to be any tests for more than one EXCEPT TABLE (e.g.
> no list tests?)

Modified

> ~~~
>
> 38. src/test/regress/sql/publication.sql
>
> Maybe adjust all the below comments (a-d) to say "EXCEPT TABLES"
> instead of "except tables"
>
> 38a.
> +-- can't add except table to 'FOR ALL TABLES' publication
>
> 38b.
> +-- can't add except table to 'FOR TABLE' publication
>
> 38c.
> +-- can't add except table to 'FOR ALL TABLES IN SCHEMA' publication
>
> 38d.
> +-- can't add except table when publish_via_partition_root option does not
> +-- have default value
>
> 38e.
> +-- can't add except table when the publication options does not have default
> +-- values
>
> SUGGESTION
> can't add EXCEPT TABLE when the publication options are not the default values

Modified

> ~~~
>
> 39. .../t/032_rep_changes_except_table.pl
>
> 39a.
> +# Check the table data does not sync for excluded table
> +my $result = $node_subscriber->safe_psql('postgres',
> + "SELECT count(*), min(a), max(a) FROM sch1.tab1");
> +is($result, qq(0||), 'check tablesync is excluded for excluded tables');
>
> Maybe the "is" message should say "check there is no initial data
> copied for the excluded table"

Modified

> ~~~
>
>
> 40. .../t/032_rep_changes_except_table.pl
>
> +# Insert some data into few tables and verify that inserted data is not
> +# replicated
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))");
>
> The comment is not quite correct. You are inserting into only one
> table here - not "few tables".

Modified

> ~~~
>
> 41. .../t/032_rep_changes_except_table.pl
>
> +# Alter publication to exclude data changes in public.tab1 and verify that
> +# subscriber does not get the new table data.
>
> "new table data" -> "changed data for this table"

Modified

Thanks for the comments. The v6 patch attached at [2] has these
changes.
[1] - https://www.postgresql.org/message-id/a2004f08-eb2f-b124-115c-f8f18667e585%40enterprisedb.com
[2] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Below are my review comments for v6-0001.

======

1. General.

The patch failed 'publication' tests in the make check phase.

Please add this work to the commit-fest so that the 'cfbot' can report
such errors sooner.

~~~

2. src/backend/commands/publicationcmds.c - AlterPublicationReset

+/*
+ * Reset the publication.
+ *
+ * Reset the publication options, publication relations and
publication schemas.
+ */
+static void
+AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
+ Relation rel, HeapTuple tup)

SUGGESTION (Make the comment similar to the sgml text instead of
repeating "publication" 4x !)
/*
 * Reset the publication options, set the ALL TABLES flag to false, and
 * drop all relations and schemas that are associated with the publication.
 */

~~~

3. src/test/regress/expected/publication.out

make check failed. The diff is below:

@@ -1716,7 +1716,7 @@
 -- Verify that only superuser can reset a publication
 ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
 SET ROLE regress_publication_user2;
-ALTER PUBLICATION testpub_reset RESET; -- fail
+ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser
 ERROR:  must be superuser to RESET publication
 SET ROLE regress_publication_user;
 DROP PUBLICATION testpub_reset;


------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
FYI, although the v6-0002 patch applied cleanly, I found that the SGML
was malformed and so the pg docs build fails.

~~~
e.g.

[postgres@CentOS7-x64 sgml]$ make STYLE=website html
{ \
  echo "<!ENTITY version \"15beta1\">"; \
  echo "<!ENTITY majorversion \"15\">"; \
} > version.sgml
'/usr/bin/perl' ./mk_feature_tables.pl YES
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt >
features-supported.sgml
'/usr/bin/perl' ./mk_feature_tables.pl NO
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt >
features-unsupported.sgml
'/usr/bin/perl' ./generate-errcodes-table.pl
../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
'/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
/usr/bin/xmllint --path . --noout --valid postgres.sgml
ref/create_publication.sgml:171: parser error : Opening and ending tag
mismatch: varlistentry line 166 and listitem
    </listitem>
               ^
ref/create_publication.sgml:172: parser error : Opening and ending tag
mismatch: variablelist line 60 and varlistentry
   </varlistentry>
                  ^
ref/create_publication.sgml:226: parser error : Opening and ending tag
mismatch: refsect1 line 57 and variablelist
  </variablelist>
                 ^
...

I will work around it locally, but for future patches please check the
SGML builds ok before posting.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Fri, May 20, 2022 at 10:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI, although the v6-0002 patch applied cleanly, I found that the SGML
> was malformed and so the pg docs build fails.
>
> ~~~
> e.g.
>
> [postgres@CentOS7-x64 sgml]$ make STYLE=website html
> { \
>   echo "<!ENTITY version \"15beta1\">"; \
>   echo "<!ENTITY majorversion \"15\">"; \
> } > version.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl YES
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-supported.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl NO
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-unsupported.sgml
> '/usr/bin/perl' ./generate-errcodes-table.pl
> ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
> '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> ref/create_publication.sgml:171: parser error : Opening and ending tag
> mismatch: varlistentry line 166 and listitem
>     </listitem>
>                ^
> ref/create_publication.sgml:172: parser error : Opening and ending tag
> mismatch: variablelist line 60 and varlistentry
>    </varlistentry>
>                   ^
> ref/create_publication.sgml:226: parser error : Opening and ending tag
> mismatch: refsect1 line 57 and variablelist
>   </variablelist>
>                  ^
> ...
>
> I will work around it locally, but for future patches please check the
> SGML builds ok before posting.

FYI, I rewrote the bad SGML fragment like this:

   <varlistentry>
    <term><literal>EXCEPT TABLE</literal></term>
    <listitem>
     <para>
      This clause specifies a list of tables to exclude from the publication. It
      can only be used with <literal>FOR ALL TABLES</literal>.
     </para>
    </listitem>
   </varlistentry>

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Below are my review comments for v6-0002.

======

1. Commit message.
The psql \d family of commands to display excluded tables.

SUGGESTION
The psql \d family of commands can now display excluded tables.

~~~

2. doc/src/sgml/ref/alter_publication.sgml

@@ -22,6 +22,7 @@ PostgreSQL documentation
  <refsynopsisdiv>
 <synopsis>
 ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD <replaceable class="parameter">publication_object</replaceable> [,
...]
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]

The "exception_object" font is wrong. Should look the same as
"publication_object"

~~~

3. doc/src/sgml/ref/alter_publication.sgml - Examples

@@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL
TABLES IN SCHEMA marketing, sales;
 </programlisting>
   </para>

+  <para>
+   Alter publication <structname>production_publication</structname> to publish
+   all tables except <structname>users</structname> and
+   <structname>departments</structname> tables:
+<programlisting>
+ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE
users, departments;
+</programlisting></para>

Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
show that the TABLE keyword is optional.

~~~

4. doc/src/sgml/ref/create_publication.sgml

An SGML tag error caused building the docs to fail. My fix was
previously reported [1].

~~~

5. doc/src/sgml/ref/create_publication.sgml

@@ -22,7 +22,7 @@ PostgreSQL documentation
  <refsynopsisdiv>
 <synopsis>
 CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
-    [ FOR ALL TABLES
+    [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]

The "exception_object" font is wrong. Should look the same as
"publication_object"

~~~

6. doc/src/sgml/ref/create_publication.sgml - Examples

@@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR
TABLE users, departments, ALL TABL
 CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
 </programlisting></para>

+  <para>
+   Create a publication that publishes all changes in all the tables except for
+   the changes of <structname>users</structname> and
+   <structname>departments</structname> table:
+<programlisting>
+CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments;
+</programlisting>
+  </para>
+

6a.
Typo: "FOR ALL TABLE" -> "FOR ALL TABLES"

6b.
Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
show that the TABLE keyword is optional.

~~~

7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication

@@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List
*ancestors, int *ancestor_level
  }
  else
  {
- aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
- if (list_member_oid(aschemaPubids, puboid))
+ List    *aschemapubids = NIL;
+ List    *aexceptpubids = NIL;
+
+ aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
+ aexceptpubids = GetRelationPublications(ancestor, true);
+ if (list_member_oid(aschemapubids, puboid) ||
+ (puballtables && !list_member_oid(aexceptpubids, puboid)))
  {

You could re-write this as multiple conditions instead of one. That
could avoid always assigning the 'aexceptpubids', so it might be a
more efficient way to write this logic.
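
One way the split could look, as a minimal sketch rather than the actual
patch code: OidSet, oidset_member, lookup_except_pubs and
ancestor_is_published are hypothetical stand-ins for the PostgreSQL list
and catalog lookup routines. The point is only that the except lookup
runs after the cheaper checks have failed to decide the answer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an OID list; not the PostgreSQL List type. */
typedef struct OidSet
{
    const unsigned int *oids;
    size_t      count;
} OidSet;

/* Counts how often the (hypothetical) except lookup is actually performed. */
static int  except_lookups = 0;

static bool
oidset_member(const OidSet *set, unsigned int oid)
{
    for (size_t i = 0; i < set->count; i++)
        if (set->oids[i] == oid)
            return true;
    return false;
}

/* Stand-in for fetching the publications that exclude the ancestor. */
static const OidSet *
lookup_except_pubs(const OidSet *except_pubs)
{
    except_lookups++;
    return except_pubs;
}

static bool
ancestor_is_published(const OidSet *schema_pubs, const OidSet *except_pubs,
                      unsigned int puboid, bool puballtables)
{
    /* First condition: published through its schema. */
    if (oidset_member(schema_pubs, puboid))
        return true;

    /* Not an ALL TABLES publication: the except list is irrelevant. */
    if (!puballtables)
        return false;

    /* Only now pay for the except-list lookup. */
    return !oidset_member(lookup_except_pubs(except_pubs), puboid);
}
```

With this shape the except list is fetched only for ALL TABLES
publications whose ancestor was not already matched through its schema.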

~~~

8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues

+/*
+ * Check if the publication has default values
+ *
+ * Check the following:
+ * Publication is having default options
+ *  Publication is not associated with relations
+ *  Publication is not associated with schemas
+ *  Publication is not set with "FOR ALL TABLES"
+ */
+static bool
+CheckPublicationDefValues(HeapTuple tup)

8a.
Remove the tab. Replace with spaces.

8b.
It might be better if this comment order is the same as the logic order.
e.g.

* Check the following:
*  Publication is not set with "FOR ALL TABLES"
*  Publication is having default options
*  Publication is not associated with schemas
*  Publication is not associated with relations
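
For illustration only, a sketch of a check whose comment lists the
conditions in the same order the code tests them. PubState and
pub_has_default_values are hypothetical; the real function works on a
catalog tuple, not on a struct like this.

```c
#include <stdbool.h>

/* Hypothetical flattened view of a publication; not a PostgreSQL struct. */
typedef struct PubState
{
    bool        alltables;          /* set with FOR ALL TABLES? */
    bool        default_options;    /* all options at default values? */
    int         nschemas;           /* number of associated schemas */
    int         nrelations;         /* number of associated relations */
} PubState;

/*
 * Check if the publication has default values.
 *
 * Check the following, in this order:
 *  Publication is not set with "FOR ALL TABLES"
 *  Publication is having default options
 *  Publication is not associated with schemas
 *  Publication is not associated with relations
 */
static bool
pub_has_default_values(const PubState *pub)
{
    if (pub->alltables)
        return false;
    if (!pub->default_options)
        return false;
    if (pub->nschemas > 0)
        return false;
    if (pub->nrelations > 0)
        return false;
    return true;
}
```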

~~~

9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables

+/*
+ * Reset the publication.
+ *
+ * Reset the publication options, publication relations and
publication schemas.
+ */
+static void
+AlterPublicationSetAllTables(Relation rel, HeapTuple tup)

The function comment and the function name do not seem to match here;
something looks like a cut/paste error ??

~~~

10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables

+ /* set all tables option */
+ values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true);
+ replaces[Anum_pg_publication_puballtables - 1] = true;

SUGGEST (comment)
/* set all ALL TABLES flag */

~~~

11. src/backend/catalog/pg_publication.c - AlterPublication

@@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
AlterPublicationStmt *stmt)
  aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
     stmt->pubname);

+ if (stmt->for_all_tables)
+ {
+ bool isdefault = CheckPublicationDefValues(tup);
+
+ if (!isdefault)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("Setting ALL TABLES requires publication \"%s\" to have
default values",
+    stmt->pubname),
+ errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));

The errmsg should start with a lowercase letter.

~~~

12. src/backend/catalog/pg_publication.c - AlterPublication

@@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
AlterPublicationStmt *stmt)
  aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
     stmt->pubname);

+ if (stmt->for_all_tables)
+ {
+ bool isdefault = CheckPublicationDefValues(tup);
+
+ if (!isdefault)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("Setting ALL TABLES requires publication \"%s\" to have
default values",
+    stmt->pubname),
+ errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));

Example test:

postgres=# create table t1(a int);
CREATE TABLE
postgres=# create publication p1 for table t1;
CREATE PUBLICATION
postgres=# alter publication p1 add all tables except t1;
2022-05-20 14:34:49.301 AEST [21802] ERROR:  Setting ALL TABLES
requires publication "p1" to have default values
2022-05-20 14:34:49.301 AEST [21802] HINT:  Use ALTER PUBLICATION ...
RESET to reset the publication
2022-05-20 14:34:49.301 AEST [21802] STATEMENT:  alter publication p1
add all tables except t1;
ERROR:  Setting ALL TABLES requires publication "p1" to have default values
HINT:  Use ALTER PUBLICATION ... RESET to reset the publication
postgres=# alter publication p1 set all tables except t1;

That error message does not quite match what the user was doing.
Firstly, they were adding ALL TABLES, not setting it. Secondly,
all the values of the publication were already defaults (there was
only an existing table t1 in the publication). Maybe some minor changes
to the message wording would better reflect what the user is doing
here.

~~~

13. src/backend/parser/gram.y

@@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE
aggregate_with_argtypes OWNER TO RoleSpec
  *
  * CREATE PUBLICATION name [WITH options]
  *
- * CREATE PUBLICATION FOR ALL TABLES [WITH options]
+ * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]]
[WITH options]

Comment should show the "TABLE" keyword is optional

~~~

14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable

@@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const
PublicationRelInfo *pubrinfo)

  appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
    fmtId(pubinfo->dobj.name));
+
  appendPQExpBuffer(query, " %s",
    fmtQualifiedDumpable(tbinfo));

This additional whitespace seems unrelated to this patch

~~~

15. src/include/nodes/parsenodes.h

15a.
@@ -3999,6 +3999,7 @@ typedef struct PublicationTable
  RangeVar   *relation; /* relation to be published */
  Node    *whereClause; /* qualifications */
  List    *columns; /* List of columns in a publication table */
+ bool except; /* except relation */
 } PublicationTable;

Maybe the comment should be more like similar ones:
/* exclude the relation */

15b.
@@ -4007,6 +4008,7 @@ typedef struct PublicationTable
 typedef enum PublicationObjSpecType
 {
  PUBLICATIONOBJ_TABLE, /* A table */
+ PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */
  PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */
  PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of

Maybe the comment should be more like:
/* A table to be excluded */

~~~

16. src/test/regress/sql/publication.sql

I did not see any test cases using EXCEPT when the optional TABLE
keyword is omitted.


------
[1] https://www.postgresql.org/message-id/CAHut%2BPtZDfBJ1d%3D3kSexgM5m%2BP_ok8sdsJXKimsXycaMyqXsNA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Thu, May 19, 2022 at 1:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v6-0001.
>
> ======
>
> 1. General.
>
> The patch failed 'publication' tests in the make check phase.
>
> Please add this work to the commit-fest so that the 'cfbot' can report
> such errors sooner.

Added a commitfest entry [1].

> ~~~
>
> 2. src/backend/commands/publicationcmds.c - AlterPublicationReset
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, publication relations and
> publication schemas.
> + */
> +static void
> +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
> + Relation rel, HeapTuple tup)
>
> SUGGESTION (Make the comment similar to the sgml text instead of
> repeating "publication" 4x !)
> /*
>  * Reset the publication options, set the ALL TABLES flag to false, and
>  * drop all relations and schemas that are associated with the publication.
>  */

Modified

> ~~~
>
> 3. src/test/regress/expected/publication.out
>
> make check failed. The diff is below:
>
> @@ -1716,7 +1716,7 @@
>  -- Verify that only superuser can reset a publication
>  ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
>  SET ROLE regress_publication_user2;
> -ALTER PUBLICATION testpub_reset RESET; -- fail
> +ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser
>  ERROR:  must be superuser to RESET publication
>  SET ROLE regress_publication_user;
>  DROP PUBLICATION testpub_reset;

It passed for me locally because the change was present in the 002
patch. I have moved the change to 001.

The attached v7 patch has these changes.
[1] - https://commitfest.postgresql.org/38/3646/

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, May 20, 2022 at 5:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI, although the v6-0002 patch applied cleanly, I found that the SGML
> was malformed and so the pg docs build fails.
>
> ~~~
> e.g.
>
> [postgres@CentOS7-x64 sgml]$ make STYLE=website html
> { \
>   echo "<!ENTITY version \"15beta1\">"; \
>   echo "<!ENTITY majorversion \"15\">"; \
> } > version.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl YES
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-supported.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl NO
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-unsupported.sgml
> '/usr/bin/perl' ./generate-errcodes-table.pl
> ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
> '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> ref/create_publication.sgml:171: parser error : Opening and ending tag
> mismatch: varlistentry line 166 and listitem
>     </listitem>
>                ^
> ref/create_publication.sgml:172: parser error : Opening and ending tag
> mismatch: variablelist line 60 and varlistentry
>    </varlistentry>
>                   ^
> ref/create_publication.sgml:226: parser error : Opening and ending tag
> mismatch: refsect1 line 57 and variablelist
>   </variablelist>
>                  ^
> ...
>
> I will work around it locally, but for future patches please check the
> SGML builds ok before posting.

Thanks for reporting this; I have fixed it.
The v7 patch attached at [1] includes the fix.

[1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, May 20, 2022 at 11:23 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are my review comments for v6-0002.
>
> ======
>
> 1. Commit message.
> The psql \d family of commands to display excluded tables.
>
> SUGGESTION
> The psql \d family of commands can now display excluded tables.

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -22,6 +22,7 @@ PostgreSQL documentation
>   <refsynopsisdiv>
>  <synopsis>
>  ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD <replaceable class="parameter">publication_object</replaceable> [,
> ...]
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
>
> The "exception_object" font is wrong. Should look the same as
> "publication_object"

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml - Examples
>
> @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL
> TABLES IN SCHEMA marketing, sales;
>  </programlisting>
>    </para>
>
> +  <para>
> +   Alter publication <structname>production_publication</structname> to publish
> +   all tables except <structname>users</structname> and
> +   <structname>departments</structname> tables:
> +<programlisting>
> +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE
> users, departments;
> +</programlisting></para>
>
> Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> show that the TABLE keyword is optional.

Modified

> ~~~
>
> 4. doc/src/sgml/ref/create_publication.sgml
>
> An SGML tag error caused building the docs to fail. My fix was
> previously reported [1].

Modified

> ~~~
>
> 5. doc/src/sgml/ref/create_publication.sgml
>
> @@ -22,7 +22,7 @@ PostgreSQL documentation
>   <refsynopsisdiv>
>  <synopsis>
>  CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
> -    [ FOR ALL TABLES
> +    [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
>
> The "exception_object" font is wrong. Should look the same as
> "publication_object"

Modified

> ~~~
>
> 6. doc/src/sgml/ref/create_publication.sgml - Examples
>
> @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR
> TABLE users, departments, ALL TABL
>  CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
>  </programlisting></para>
>
> +  <para>
> +   Create a publication that publishes all changes in all the tables except for
> +   the changes of <structname>users</structname> and
> +   <structname>departments</structname> table:
> +<programlisting>
> +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments;
> +</programlisting>
> +  </para>
> +
>
> 6a.
> Typo: "FOR ALL TABLE" -> "FOR ALL TABLES"

Modified

> 6b.
> Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> show that the TABLE keyword is optional.

Modified

> ~~~
>
> 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
>
> @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List
> *ancestors, int *ancestor_level
>   }
>   else
>   {
> - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> - if (list_member_oid(aschemaPubids, puboid))
> + List    *aschemapubids = NIL;
> + List    *aexceptpubids = NIL;
> +
> + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
> + aexceptpubids = GetRelationPublications(ancestor, true);
> + if (list_member_oid(aschemapubids, puboid) ||
> + (puballtables && !list_member_oid(aexceptpubids, puboid)))
>   {
>
> You could re-write this as multiple conditions instead of one. That
> could avoid always assigning the 'aexceptpubids', so it might be a
> more efficient way to write this logic.

Modified

> ~~~
>
> 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues
>
> +/*
> + * Check if the publication has default values
> + *
> + * Check the following:
> + * Publication is having default options
> + *  Publication is not associated with relations
> + *  Publication is not associated with schemas
> + *  Publication is not set with "FOR ALL TABLES"
> + */
> +static bool
> +CheckPublicationDefValues(HeapTuple tup)
>
> 8a.
> Remove the tab. Replace with spaces.

Modified

> 8b.
> It might be better if this comment order is the same as the logic order.
> e.g.
>
> * Check the following:
> *  Publication is not set with "FOR ALL TABLES"
> *  Publication is having default options
> *  Publication is not associated with schemas
> *  Publication is not associated with relations

Modified

> ~~~
>
> 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, publication relations and
> publication schemas.
> + */
> +static void
> +AlterPublicationSetAllTables(Relation rel, HeapTuple tup)
>
> The function comment and the function name do not seem to match here;
> something looks like a cut/paste error ??

Modified

> ~~~
>
> 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
>
> + /* set all tables option */
> + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true);
> + replaces[Anum_pg_publication_puballtables - 1] = true;
>
> SUGGEST (comment)
> /* set all ALL TABLES flag */

Modified

> ~~~
>
> 11. src/backend/catalog/pg_publication.c - AlterPublication
>
> @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> AlterPublicationStmt *stmt)
>   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
>      stmt->pubname);
>
> + if (stmt->for_all_tables)
> + {
> + bool isdefault = CheckPublicationDefValues(tup);
> +
> + if (!isdefault)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> default values",
> +    stmt->pubname),
> + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
>
> The errmsg should start with a lowercase letter.

Modified

> ~~~
>
> 12. src/backend/catalog/pg_publication.c - AlterPublication
>
> @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> AlterPublicationStmt *stmt)
>   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
>      stmt->pubname);
>
> + if (stmt->for_all_tables)
> + {
> + bool isdefault = CheckPublicationDefValues(tup);
> +
> + if (!isdefault)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> default values",
> +    stmt->pubname),
> + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
>
> Example test:
>
> postgres=# create table t1(a int);
> CREATE TABLE
> postgres=# create publication p1 for table t1;
> CREATE PUBLICATION
> postgres=# alter publication p1 add all tables except t1;
> 2022-05-20 14:34:49.301 AEST [21802] ERROR:  Setting ALL TABLES
> requires publication "p1" to have default values
> 2022-05-20 14:34:49.301 AEST [21802] HINT:  Use ALTER PUBLICATION ...
> RESET to reset the publication
> 2022-05-20 14:34:49.301 AEST [21802] STATEMENT:  alter publication p1
> add all tables except t1;
> ERROR:  Setting ALL TABLES requires publication "p1" to have default values
> HINT:  Use ALTER PUBLICATION ... RESET to reset the publication
> postgres=# alter publication p1 set all tables except t1;
>
> That error message does not quite match what the user was doing.
> Firstly, they were adding ALL TABLES, not setting it. Secondly,
> all the values of the publication were already defaults (there was
> only an existing table t1 in the publication). Maybe some minor changes
> to the message wording would better reflect what the user is doing
> here.

Modified

> ~~~
>
> 13. src/backend/parser/gram.y
>
> @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE
> aggregate_with_argtypes OWNER TO RoleSpec
>   *
>   * CREATE PUBLICATION name [WITH options]
>   *
> - * CREATE PUBLICATION FOR ALL TABLES [WITH options]
> + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]]
> [WITH options]
>
> Comment should show the "TABLE" keyword is optional

Modified

> ~~~
>
> 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable
>
> @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const
> PublicationRelInfo *pubrinfo)
>
>   appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
>     fmtId(pubinfo->dobj.name));
> +
>   appendPQExpBuffer(query, " %s",
>     fmtQualifiedDumpable(tbinfo));
>
> This additional whitespace seems unrelated to this patch

Modified

> ~~~
>
> 15. src/include/nodes/parsenodes.h
>
> 15a.
> @@ -3999,6 +3999,7 @@ typedef struct PublicationTable
>   RangeVar   *relation; /* relation to be published */
>   Node    *whereClause; /* qualifications */
>   List    *columns; /* List of columns in a publication table */
> + bool except; /* except relation */
>  } PublicationTable;
>
> Maybe the comment should be more like similar ones:
> /* exclude the relation */

Modified

> 15b.
> @@ -4007,6 +4008,7 @@ typedef struct PublicationTable
>  typedef enum PublicationObjSpecType
>  {
>   PUBLICATIONOBJ_TABLE, /* A table */
> + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */
>   PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */
>   PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of
>
> Maybe the comment should be more like:
> /* A table to be excluded */

Modified

> ~~~
>
> 16. src/test/regress/sql/publication.sql
>
> I did not see any test cases using EXCEPT when the optional TABLE
> keyword is omitted.

Added a test

Thanks for the comments, the v7 patch attached at [1] has the changes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Sat, May 21, 2022 at 11:06 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, May 20, 2022 at 11:23 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Below are my review comments for v6-0002.
> >
> > ======
> >
> > 1. Commit message.
> > The psql \d family of commands to display excluded tables.
> >
> > SUGGESTION
> > The psql \d family of commands can now display excluded tables.
>
> Modified
>
> > ~~~
> >
> > 2. doc/src/sgml/ref/alter_publication.sgml
> >
> > @@ -22,6 +22,7 @@ PostgreSQL documentation
> >   <refsynopsisdiv>
> >  <synopsis>
> >  ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> > ADD <replaceable class="parameter">publication_object</replaceable> [,
> > ...]
> > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> > ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
> >
> > The "exception_object" font is wrong. Should look the same as
> > "publication_object"
>
> Modified
>
> > ~~~
> >
> > 3. doc/src/sgml/ref/alter_publication.sgml - Examples
> >
> > @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL
> > TABLES IN SCHEMA marketing, sales;
> >  </programlisting>
> >    </para>
> >
> > +  <para>
> > +   Alter publication <structname>production_publication</structname> to publish
> > +   all tables except <structname>users</structname> and
> > +   <structname>departments</structname> tables:
> > +<programlisting>
> > +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE
> > users, departments;
> > +</programlisting></para>
> >
> > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> > show TABLE keyword is optional.
>
> Modified
>
> > ~~~
> >
> > 4. doc/src/sgml/ref/create_publication.sgml
> >
> > An SGML tag error caused building the docs to fail. My fix was
> > previously reported [1].
>
> Modified
>
> > ~~~
> >
> > 5. doc/src/sgml/ref/create_publication.sgml
> >
> > @@ -22,7 +22,7 @@ PostgreSQL documentation
> >   <refsynopsisdiv>
> >  <synopsis>
> >  CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
> > -    [ FOR ALL TABLES
> > +    [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
> >
> > The "exception_object" font is wrong. Should look the same as
> > "publication_object"
>
> Modified
>
> > ~~~
> >
> > 6. doc/src/sgml/ref/create_publication.sgml - Examples
> >
> > @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR
> > TABLE users, departments, ALL TABL
> >  CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales;
> >  </programlisting></para>
> >
> > +  <para>
> > +   Create a publication that publishes all changes in all the tables except for
> > +   the changes of <structname>users</structname> and
> > +   <structname>departments</structname> table:
> > +<programlisting>
> > +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments;
> > +</programlisting>
> > +  </para>
> > +
> >
> > 6a.
> > Typo: "FOR ALL TABLE" -> "FOR ALL TABLES"
>
> Modified
>
> > 6b.
> > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> > show TABLE keyword is optional.
>
> Modified
>
> > ~~~
> >
> > 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
> >
> > @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List
> > *ancestors, int *ancestor_level
> >   }
> >   else
> >   {
> > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> > - if (list_member_oid(aschemaPubids, puboid))
> > + List    *aschemapubids = NIL;
> > + List    *aexceptpubids = NIL;
> > +
> > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
> > + aexceptpubids = GetRelationPublications(ancestor, true);
> > + if (list_member_oid(aschemapubids, puboid) ||
> > + (puballtables && !list_member_oid(aexceptpubids, puboid)))
> >   {
> >
> > You could re-write this as multiple conditions instead of one. That
> > could avoid always assigning the 'aexceptpubids', so it might be a
> > more efficient way to write this logic.
>
> Modified
>
> > ~~~
> >
> > 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues
> >
> > +/*
> > + * Check if the publication has default values
> > + *
> > + * Check the following:
> > + * Publication is having default options
> > + *  Publication is not associated with relations
> > + *  Publication is not associated with schemas
> > + *  Publication is not set with "FOR ALL TABLES"
> > + */
> > +static bool
> > +CheckPublicationDefValues(HeapTuple tup)
> >
> > 8a.
> > Remove the tab. Replace with spaces.
>
> Modified
>
> > 8b.
> > It might be better if this comment order is the same as the logic order.
> > e.g.
> >
> > * Check the following:
> > *  Publication is not set with "FOR ALL TABLES"
> > *  Publication is having default options
> > *  Publication is not associated with schemas
> > *  Publication is not associated with relations
>
> Modified
>
> > ~~~
> >
> > 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
> >
> > +/*
> > + * Reset the publication.
> > + *
> > + * Reset the publication options, publication relations and
> > publication schemas.
> > + */
> > +static void
> > +AlterPublicationSetAllTables(Relation rel, HeapTuple tup)
> >
> > The function comment and the function name do not seem to match here;
> > something looks like a cut/paste error ??
>
> Modified
>
> > ~~~
> >
> > 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
> >
> > + /* set all tables option */
> > + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true);
> > + replaces[Anum_pg_publication_puballtables - 1] = true;
> >
> > SUGGEST (comment)
> > /* set all ALL TABLES flag */
>
> Modified
>
> > ~~~
> >
> > 11. src/backend/catalog/pg_publication.c - AlterPublication
> >
> > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt)
> >   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
> >      stmt->pubname);
> >
> > + if (stmt->for_all_tables)
> > + {
> > + bool isdefault = CheckPublicationDefValues(tup);
> > +
> > + if (!isdefault)
> > + ereport(ERROR,
> > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> > default values",
> > +    stmt->pubname),
> > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> > The errmsg should start with a lowercase letter.
>
> Modified
>
> > ~~~
> >
> > 12. src/backend/catalog/pg_publication.c - AlterPublication
> >
> > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> > AlterPublicationStmt *stmt)
> >   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
> >      stmt->pubname);
> >
> > + if (stmt->for_all_tables)
> > + {
> > + bool isdefault = CheckPublicationDefValues(tup);
> > +
> > + if (!isdefault)
> > + ereport(ERROR,
> > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > + errmsg("Setting ALL TABLES requires publication \"%s\" to have
> > default values",
> > +    stmt->pubname),
> > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> > Example test:
> >
> > postgres=# create table t1(a int);
> > CREATE TABLE
> > postgres=# create publication p1 for table t1;
> > CREATE PUBLICATION
> > postgres=# alter publication p1 add all tables except t1;
> > 2022-05-20 14:34:49.301 AEST [21802] ERROR:  Setting ALL TABLES
> > requires publication "p1" to have default values
> > 2022-05-20 14:34:49.301 AEST [21802] HINT:  Use ALTER PUBLICATION ...
> > RESET to reset the publication
> > 2022-05-20 14:34:49.301 AEST [21802] STATEMENT:  alter publication p1
> > add all tables except t1;
> > ERROR:  Setting ALL TABLES requires publication "p1" to have default values
> > HINT:  Use ALTER PUBLICATION ... RESET to reset the publication
> > postgres=# alter publication p1 set all tables except t1;
> >
> > That error message does not quite match what the user was doing.
> > Firstly, they were adding the ALL TABLES, not setting it. Secondly,
> > all the values of the publication were already defaults (only there
> > was an existing table t1 in the publication). Maybe some minor changes
> > to the message wording can better reflect what the user is doing
> > here.
>
> Modified
>
> > ~~~
> >
> > 13. src/backend/parser/gram.y
> >
> > @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE
> > aggregate_with_argtypes OWNER TO RoleSpec
> >   *
> >   * CREATE PUBLICATION name [WITH options]
> >   *
> > - * CREATE PUBLICATION FOR ALL TABLES [WITH options]
> > + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]]
> > [WITH options]
> >
> > Comment should show the "TABLE" keyword is optional
>
> Modified
>
> > ~~~
> >
> > 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable
> >
> > @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const
> > PublicationRelInfo *pubrinfo)
> >
> >   appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
> >     fmtId(pubinfo->dobj.name));
> > +
> >   appendPQExpBuffer(query, " %s",
> >     fmtQualifiedDumpable(tbinfo));
> >
> > This additional whitespace seems unrelated to this patch
>
> Modified
>
> > ~~~
> >
> > 15. src/include/nodes/parsenodes.h
> >
> > 15a.
> > @@ -3999,6 +3999,7 @@ typedef struct PublicationTable
> >   RangeVar   *relation; /* relation to be published */
> >   Node    *whereClause; /* qualifications */
> >   List    *columns; /* List of columns in a publication table */
> > + bool except; /* except relation */
> >  } PublicationTable;
> >
> > Maybe the comment should be more like similar ones:
> > /* exclude the relation */
>
> Modified
>
> > 15b.
> > @@ -4007,6 +4008,7 @@ typedef struct PublicationTable
> >  typedef enum PublicationObjSpecType
> >  {
> >   PUBLICATIONOBJ_TABLE, /* A table */
> > + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */
> >   PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */
> >   PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of
> >
> > Maybe the comment should be more like:
> > /* A table to be excluded */
>
> Modified
>
> > ~~~
> >
> > 16. src/test/regress/sql/publication.sql
> >
> > I did not see any test cases using EXCEPT when the optional TABLE
> > keyword is omitted.
>
> Added a test
>
> Thanks for the comments, the v7 patch attached at [1] has the changes
> for the same.
> [1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com

Attached is the v7 patch, which fixes the buildfarm warning for an unused
variable in release mode, as reported in [1].
[1] - https://cirrus-ci.com/task/6220288017825792

Regards,
Vignesh

Attachments

RE: Skipping schema changes in publication

From
"osumi.takamichi@fujitsu.com"
Date:
On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> Attached v7 patch which fixes the buildfarm warning for an unused variable in
> release mode as in [1].
Hi, thank you for the patches.


I'll share several review comments.

For v7-0001.

(1) I'll suggest some minor rewording.

+  <para>
+   The <literal>RESET</literal> clause will reset the publication to the
+   default state which includes resetting the publication options, setting
+   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
+   dropping all relations and schemas that are associated with the publication.

My suggestion is
"The RESET clause will reset the publication to the
default state. It resets the publication operations,
sets ALL TABLES flag to false and drops all relations
and schemas associated with the publication."

(2) typo and rewording

+/*
+ * Reset the publication.
+ *
+ * Reset the publication options, setting ALL TABLES flag to false and drop
+ * all relations and schemas that are associated with the publication.
+ */

The "setting" in this sentence should be "set".

How about changing like below ?
FROM:
"Reset the publication options, setting ALL TABLES flag to false and drop
all relations and schemas that are associated with the publication."
TO:
"Reset the publication operations, set ALL TABLES flag to false and drop
all relations and schemas associated with the publication."

(3) AlterPublicationReset

Do we need to call CacheInvalidateRelcacheAll() or
InvalidatePublicationRels() at the end of
AlterPublicationReset() like AlterPublicationOptions() ?


For v7-0002.

(4)

+       if (stmt->for_all_tables)
+       {
+               bool            isdefault = CheckPublicationDefValues(tup);
+
+               if (!isdefault)
+                       ereport(ERROR,
+                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
+                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));


The errmsg string has three messages for user and is a bit long
(we have two sentences there connected by 'and').
Can't we make it concise and split it into a couple of lines for code readability ?

I'll suggest a change below.
FROM:
"adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
TO:
"adding ALL TABLES requires the publication defined not for ALL TABLES"
"to have default publish actions without any associated tables/schemas"

(5) typo

   <varlistentry>
+    <term><literal>EXCEPT TABLE</literal></term>
+    <listitem>
+     <para>
+      This clause specifies a list of tables to exclude from the publication.
+      It can only be used with <literal>FOR ALL TABLES</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+

Kindly change
FROM:
This clause specifies a list of tables to exclude from the publication.
TO:
This clause specifies a list of tables to be excluded from the publication.
or
This clause specifies a list of tables excluded from the publication.

(6) Minor suggestion for an expression change

       Marks the publication as one that replicates changes for all tables in
-      the database, including tables created in the future.
+      the database, including tables created in the future. If
+      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
+      the changes for the specified tables.


I'll suggest a minor rewording.
FROM:
...exclude replicating the changes for the specified tables
TO:
...exclude replication changes for the specified tables

(7)
(7-1)

+/*
+ * Check if the publication has default values
+ *
+ * Check the following:
+ * a) Publication is not set with "FOR ALL TABLES"
+ * b) Publication is having default options
+ * c) Publication is not associated with schemas
+ * d) Publication is not associated with relations
+ */
+static bool
+CheckPublicationDefValues(HeapTuple tup)


I think this header comment can be improved.
FROM:
Check the following:
TO:
Returns true if the publication satisfies all the following conditions:

(7-2)

b) should be changed as well
FROM:
Publication is having default options
TO:
Publication has the default publish operations



Best Regards,
    Takamichi Osumi


Re: Skipping schema changes in publication

From
Peter Smith
Date:
Here are some minor review comments for v7-0001.

======

1. General

Probably the commit message and all the PG docs and code comments
should be changed to refer to "publication parameters" instead of
(currently) "publication options". This is because these things are
really called "publication_parameters" in the PG docs [1].

All the following review comments are just examples of this suggestion.

~~~

2. Commit message

"includes resetting the publication options," -> "includes resetting
the publication parameters,"

~~~

3. doc/src/sgml/ref/alter_publication.sgml

+  <para>
+   The <literal>RESET</literal> clause will reset the publication to the
+   default state which includes resetting the publication options, setting
+   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
+   dropping all relations and schemas that are associated with the publication.
   </para>


"resetting the publication options," -> "resetting the publication parameters,"

~~~

4. src/backend/commands/publicationcmds.c

@@ -53,6 +53,14 @@
 #include "utils/syscache.h"
 #include "utils/varlena.h"

+/* CREATE PUBLICATION default values for flags and options */
+#define PUB_DEFAULT_ACTION_INSERT true
+#define PUB_DEFAULT_ACTION_UPDATE true
+#define PUB_DEFAULT_ACTION_DELETE true
+#define PUB_DEFAULT_ACTION_TRUNCATE true
+#define PUB_DEFAULT_VIA_ROOT false
+#define PUB_DEFAULT_ALL_TABLES false

"flags and options" -> "flags and publication parameters"

~~~

5. src/backend/commands/publicationcmds.c

+/*
+ * Reset the publication.
+ *
+ * Reset the publication options, setting ALL TABLES flag to false and drop
+ * all relations and schemas that are associated with the publication.
+ */
+static void
+AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
+   Relation rel, HeapTuple tup)

"Reset the publication options," -> "Reset the publication parameters,"

~~~

6. src/test/regress/sql/publication.sql

+-- Verify that publish options and publish_via_partition_root option are reset
+\dRp+ testpub_reset
+ALTER PUBLICATION testpub_reset RESET;
+\dRp+ testpub_reset

SUGGESTION
-- Verify that 'publish' and 'publish_via_partition_root' publication
parameters are reset

------
[1] https://www.postgresql.org/docs/current/sql-createpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Here are my review comments for patch v7-0002.

======

1. doc/src/sgml/logical-replication.sgml

@@ -1167,8 +1167,9 @@ CONTEXT:  processing remote data for replication
origin "pg_16395" during "INSER
   <para>
    To add tables to a publication, the user must have ownership rights on the
    table. To add all tables in schema to a publication, the user must be a
-   superuser. To create a publication that publishes all tables or all tables in
-   schema automatically, the user must be a superuser.
+   superuser. To add all tables to a publication, the user must be a superuser.
+   To create a publication that publishes all tables or all tables in schema
+   automatically, the user must be a superuser.
   </para>

I felt that maybe this whole paragraph should be rearranged. Put the
"create publication" parts before the "alter publication" parts;
Re-word the sentences more similarly. I also felt the ALL TABLES and
ALL TABLES IN SCHEMA etc should be written uppercase/literal since
that is what was meant.

SUGGESTION
To create a publication using FOR ALL TABLES or FOR ALL TABLES IN
SCHEMA, the user must be a superuser. To add ALL TABLES or ALL TABLES
IN SCHEMA to a publication, the user must be a superuser. To add
tables to a publication, the user must have ownership rights on the
table.

~~~

2. doc/src/sgml/ref/alter_publication.sgml

@@ -82,8 +88,8 @@ ALTER PUBLICATION <replaceable
class="parameter">name</replaceable> RESET

   <para>
    You must own the publication to use <command>ALTER PUBLICATION</command>.
-   Adding a table to a publication additionally requires owning that table.
-   The <literal>ADD ALL TABLES IN SCHEMA</literal>,
+   Adding a table to or excluding a table from a publication additionally
+   requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal>,
    <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and

Isn't this missing some information that says ADD ALL TABLES requires
the invoking user to be a superuser?

~~~

3. doc/src/sgml/ref/alter_publication.sgml - examples

+  <para>
+   Alter publication <structname>production_publication</structname> to publish
+   all tables except <structname>users</structname> and
+   <structname>departments</structname> tables:
+<programlisting>
+ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT users,
departments;
+</programlisting></para>
+

I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

~~~

4. doc/src/sgml/ref/create_publication.sgml - examples

+  <para>
+   Create a publication that publishes all changes in all the tables except for
+   the changes of <structname>users</structname> and
+   <structname>departments</structname> tables:
+<programlisting>
+CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments;
+</programlisting>
+  </para>

I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

~~~

5. src/backend/catalog/pg_publication.c

  foreach(lc, ancestors)
  {
  Oid ancestor = lfirst_oid(lc);
- List    *apubids = GetRelationPublications(ancestor);
- List    *aschemaPubids = NIL;
+ List    *apubids = GetRelationPublications(ancestor, false);
+ List    *aschemapubids = NIL;
+ List    *aexceptpubids = NIL;

  level++;

- if (list_member_oid(apubids, puboid))
+ /* check if member of table publications */
+ if (!list_member_oid(apubids, puboid))
  {
- topmost_relid = ancestor;
-
- if (ancestor_level)
- *ancestor_level = level;
- }
- else
- {
- aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
- if (list_member_oid(aschemaPubids, puboid))
+ /* check if member of schema publications */
+ aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
+ if (!list_member_oid(aschemapubids, puboid))
  {
- topmost_relid = ancestor;
-
- if (ancestor_level)
- *ancestor_level = level;
+ /*
+ * If the publication is all tables publication and the table
+ * is not part of exception tables.
+ */
+ if (puballtables)
+ {
+ aexceptpubids = GetRelationPublications(ancestor, true);
+ if (list_member_oid(aexceptpubids, puboid))
+ goto next;
+ }
+ else
+ goto next;
  }
  }

+ topmost_relid = ancestor;
+
+ if (ancestor_level)
+ *ancestor_level = level;
+
+next:
  list_free(apubids);
- list_free(aschemaPubids);
+ list_free(aschemapubids);
+ list_free(aexceptpubids);
  }


I felt those negative (!) conditions and those goto are making this
logic hard to understand. Can’t it be simplified more than this? Even
just having another bool flag might help make it easier.

e.g. Perhaps something a bit like this (but add some comments)

foreach(lc, ancestors)
{
    Oid         ancestor = lfirst_oid(lc);
    List       *apubids = GetRelationPublications(ancestor, false);
    List       *aschemapubids = NIL;
    List       *aexceptpubids = NIL;
    bool        set_top = false;

    level++;

    /* check if member of table publications */
    set_top = list_member_oid(apubids, puboid);
    if (!set_top)
    {
        /* check if member of schema publications */
        aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
        set_top = list_member_oid(aschemapubids, puboid);

        /* for ALL TABLES, check that the table is not an exception table */
        if (!set_top && puballtables)
        {
            aexceptpubids = GetRelationPublications(ancestor, true);
            set_top = !list_member_oid(aexceptpubids, puboid);
        }
    }

    if (set_top)
    {
        topmost_relid = ancestor;

        if (ancestor_level)
            *ancestor_level = level;
    }

    list_free(apubids);
    list_free(aschemapubids);
    list_free(aexceptpubids);
}

------

6. src/backend/commands/publicationcmds.c - CheckPublicationDefValues

+/*
+ * Check if the publication has default values
+ *
+ * Check the following:
+ * a) Publication is not set with "FOR ALL TABLES"
+ * b) Publication is having default options
+ * c) Publication is not associated with schemas
+ * d) Publication is not associated with relations
+ */
+static bool
+CheckPublicationDefValues(HeapTuple tup)

I think Osumi-san already gave a review [1] about this same comment.

So I only wanted to add that it should not say "options" here:
"default options" -> "default publication parameter values"

~~~

7. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables

+#ifdef USE_ASSERT_CHECKING
+ Assert(!pubform->puballtables);
+#endif

Why is this #ifdef needed? Isn't that logic built into the Assert macro already?

~~~

8. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables

+ /* set ALL TABLES flag */

Use uppercase 'S' to match other comments.

~~~

9. src/backend/commands/publicationcmds.c - AlterPublication

+ if (!isdefault)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("adding ALL TABLES requires the publication to have default
publication options, no tables/schemas associated and ALL TABLES flag
should not be set"),
+ errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));

IMO this errmsg text is not very good but I think Osumi-san [1] has
also given a review comment about the same errmsg.

So I only wanted to add that should not say "options" here:
"default publication options" -> "default publication parameter values"

~~~

10. src/backend/parser/gram.y

/*****************************************************************************
 *
 * ALTER PUBLICATION name SET ( options )
 *
 * ALTER PUBLICATION name ADD pub_obj [, ...]
 *
 * ALTER PUBLICATION name DROP pub_obj [, ...]
 *
 * ALTER PUBLICATION name SET pub_obj [, ...]
 *
 * ALTER PUBLICATION name RESET
 *
 * pub_obj is one of:
 *
 * TABLE table_name [, ...]
 * ALL TABLES IN SCHEMA schema_name [, ...]
 *
 *****************************************************************************/

-

 Should the above comment be updated to mention also ADD ALL TABLES
... EXCEPT [TABLE] ...

~~~

11. src/bin/pg_dump/pg_dump.c - dumpPublication

+ /* Include exception tables if the publication has except tables */
+ for (cell = exceptinfo.head; cell; cell = cell->next)
+ {
+ PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
+ PublicationInfo *relpubinfo = pubrinfo->publication;
+ TableInfo  *tbinfo;
+
+ if (pubinfo == relpubinfo)

I am unsure if that variable 'relpubinfo' is of much use; it is only
used one time.

~~~

12. src/bin/pg_dump/t/002_pg_dump.pl

I think there should be more test cases here:

E.g.1. EXCEPT TABLE should also test a list of tables

E.g.2. EXCEPT with optional TABLE keyword omitted

~~~

13. src/bin/psql/describe.c - question about the SQL

Since the new 'except' is a boolean column, wouldn't it be more
natural if all the SQL was treating it as one?

e.g. should the SQL be saying "IS prexcept", "IS NOT prexcept";
instead of comparing prexcept to the 't' and 'f' characters.

~~~

14. .../t/032_rep_changes_except_table.pl

+# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE
+# option.
+# Create schemas and tables on publisher

"option" -> "clause"

------
[1]
https://www.postgresql.org/message-id/TYCPR01MB83730A2F1D6A5303E9C1416AEDD99%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > Attached v7 patch which fixes the buildfarm warning for an unused variable in
> > release mode as in [1].
> Hi, thank you for the patches.
>
>
> I'll share several review comments.
>
> For v7-0001.
>
> (1) I'll suggest some minor rewording.
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the
> +   default state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> +   dropping all relations and schemas that are associated with the publication.
>
> My suggestion is
> "The RESET clause will reset the publication to the
> default state. It resets the publication operations,
> sets ALL TABLES flag to false and drops all relations
> and schemas associated with the publication."

I felt the existing wording looks better. I would prefer to keep it that way.

> (2) typo and rewording
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, setting ALL TABLES flag to false and drop
> + * all relations and schemas that are associated with the publication.
> + */
>
> The "setting" in this sentence should be "set".
>
> How about changing like below ?
> FROM:
> "Reset the publication options, setting ALL TABLES flag to false and drop
> all relations and schemas that are associated with the publication."
> TO:
> "Reset the publication operations, set ALL TABLES flag to false and drop
> all relations and schemas associated with the publication."

I felt the existing wording looks better. I would prefer to keep it that way.

> (3) AlterPublicationReset
>
> Do we need to call CacheInvalidateRelcacheAll() or
> InvalidatePublicationRels() at the end of
> AlterPublicationReset() like AlterPublicationOptions() ?

CacheInvalidateRelcacheAll() should be called if we change the ALL TABLES
flag from true to false; otherwise the cache will not be invalidated. Modified.

>
> For v7-0002.
>
> (4)
>
> +       if (stmt->for_all_tables)
> +       {
> +               bool            isdefault = CheckPublicationDefValues(tup);
> +
> +               if (!isdefault)
> +                       ereport(ERROR,
> +                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> +                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
> +                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
>
>
> The errmsg string has three messages for user and is a bit long
> (we have two sentences there connected by 'and').
> Can't we make it concise and split it into a couple of lines for code readability ?
>
> I'll suggest a change below.
> FROM:
> "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
> TO:
> "adding ALL TABLES requires the publication defined not for ALL TABLES"
> "to have default publish actions without any associated tables/schemas"

Added errdetail and split it
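
One detail to watch when a message is split across source lines as in the TO: suggestion quoted above: adjacent C string literals are joined with no separator, so the first literal needs a trailing space. A minimal standalone sketch of that behavior follows; the macro and function names are hypothetical and this is not the patch's code.

```c
#include <assert.h>
#include <string.h>

/* Adjacent literals are concatenated as-is; no space is inserted. */
#define MSG_NO_SPACE \
    "adding ALL TABLES requires the publication defined not for ALL TABLES" \
    "to have default publish actions without any associated tables/schemas"

/* A trailing space on the first literal keeps the words apart. */
#define MSG_WITH_SPACE \
    "adding ALL TABLES requires the publication defined not for ALL TABLES " \
    "to have default publish actions without any associated tables/schemas"

/* Returns 1 if the message contains the run-together text "TABLESto". */
static int
has_run_together_words(const char *msg)
{
    assert(msg != NULL);
    return strstr(msg, "TABLESto") != NULL;
}
```

Keeping the primary message short in errmsg and moving the conditions into errdetail avoids the need to split one long literal at all.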

> (5) typo
>
>    <varlistentry>
> +    <term><literal>EXCEPT TABLE</literal></term>
> +    <listitem>
> +     <para>
> +      This clause specifies a list of tables to exclude from the publication.
> +      It can only be used with <literal>FOR ALL TABLES</literal>.
> +     </para>
> +    </listitem>
> +   </varlistentry>
> +
>
> Kindly change
> FROM:
> This clause specifies a list of tables to exclude from the publication.
> TO:
> This clause specifies a list of tables to be excluded from the publication.
> or
> This clause specifies a list of tables excluded from the publication.

Modified

> (6) Minor suggestion for an expression change
>
>        Marks the publication as one that replicates changes for all tables in
> -      the database, including tables created in the future.
> +      the database, including tables created in the future. If
> +      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
> +      the changes for the specified tables.
>
>
> I'll suggest a minor rewording.
> FROM:
> ...exclude replicating the changes for the specified tables
> TO:
> ...exclude replication changes for the specified tables

I felt the existing wording is better.

> (7)
> (7-1)
>
> +/*
> + * Check if the publication has default values
> + *
> + * Check the following:
> + * a) Publication is not set with "FOR ALL TABLES"
> + * b) Publication is having default options
> + * c) Publication is not associated with schemas
> + * d) Publication is not associated with relations
> + */
> +static bool
> +CheckPublicationDefValues(HeapTuple tup)
>
>
> I think this header comment can be improved.
> FROM:
> Check the following:
> TO:
> Returns true if the publication satisfies all the following conditions:

Modified

> (7-2)
>
> b) should be changed as well
> FROM:
> Publication is having default options
> TO:
> Publication has the default publish operations

Changed it to "Publication is having default publication parameter values"
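
For readers following along, conditions (a) to (d) in that header comment can be sketched in isolation. This is a simplified, self-contained illustration and not the patch's code: DemoPubState and demo_pub_has_default_values are hypothetical names, and the two counters stand in for the catalog lookups the real function would need. Only the PUB_DEFAULT_* macros are taken from the patch hunk quoted in this thread.

```c
#include <stdbool.h>

/* Default values as quoted from the patch under review. */
#define PUB_DEFAULT_ACTION_INSERT true
#define PUB_DEFAULT_ACTION_UPDATE true
#define PUB_DEFAULT_ACTION_DELETE true
#define PUB_DEFAULT_ACTION_TRUNCATE true
#define PUB_DEFAULT_VIA_ROOT false
#define PUB_DEFAULT_ALL_TABLES false

/* Hypothetical, flattened view of a publication (assumption for this sketch). */
typedef struct DemoPubState
{
    bool puballtables;
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
    bool pubtruncate;
    bool pubviaroot;
    int  nschemas;      /* schemas associated with the publication */
    int  nrelations;    /* relations associated with the publication */
} DemoPubState;

/*
 * Returns true if the publication satisfies all the following conditions:
 * a) it is not set with FOR ALL TABLES
 * b) it has the default publication parameter values
 * c) it is not associated with schemas
 * d) it is not associated with relations
 */
static bool
demo_pub_has_default_values(const DemoPubState *pub)
{
    if (pub->puballtables != PUB_DEFAULT_ALL_TABLES)
        return false;

    if (pub->pubinsert != PUB_DEFAULT_ACTION_INSERT ||
        pub->pubupdate != PUB_DEFAULT_ACTION_UPDATE ||
        pub->pubdelete != PUB_DEFAULT_ACTION_DELETE ||
        pub->pubtruncate != PUB_DEFAULT_ACTION_TRUNCATE ||
        pub->pubviaroot != PUB_DEFAULT_VIA_ROOT)
        return false;

    if (pub->nschemas > 0)
        return false;

    if (pub->nrelations > 0)
        return false;

    return true;
}
```

A publication fresh from CREATE PUBLICATION with no objects passes the check; changing any single parameter or attaching one relation makes it fail, which is the case where the error asks the user to run ALTER PUBLICATION ... RESET first.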

Thanks for the comments, the attached v8 patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, May 30, 2022 at 1:51 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some minor review comments for v7-0001.
>
> ======
>
> 1. General
>
> Probably the commit message and all the PG docs and code comments
> should be changed to refer to "publication parameters" instead of
> (currently) "publication options". This is because these things are
> really called "publication_parameters" in the PG docs [1].
>
> All the following review comments are just examples of this suggestion.

Modified

> ~~~
>
> 2. Commit message
>
> "includes resetting the publication options," -> "includes resetting
> the publication parameters,"

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml
>
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the
> +   default state which includes resetting the publication options, setting
> +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> +   dropping all relations and schemas that are associated with the publication.
>    </para>
>
>
> "resetting the publication options," -> "resetting the publication parameters,"

Modified

> ~~~
>
> 4. src/backend/commands/publicationcmds.c
>
> @@ -53,6 +53,14 @@
>  #include "utils/syscache.h"
>  #include "utils/varlena.h"
>
> +/* CREATE PUBLICATION default values for flags and options */
> +#define PUB_DEFAULT_ACTION_INSERT true
> +#define PUB_DEFAULT_ACTION_UPDATE true
> +#define PUB_DEFAULT_ACTION_DELETE true
> +#define PUB_DEFAULT_ACTION_TRUNCATE true
> +#define PUB_DEFAULT_VIA_ROOT false
> +#define PUB_DEFAULT_ALL_TABLES false
>
> "flags and options" -> "flags and publication parameters"

Modified

> ~~~
>
> 5. src/backend/commands/publicationcmds.c
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, setting ALL TABLES flag to false and drop
> + * all relations and schemas that are associated with the publication.
> + */
> +static void
> +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
> +   Relation rel, HeapTuple tup)
>
> "Reset the publication options," -> "Reset the publication parameters,"

Modified

> ~~~
>
> 6. src/test/regress/sql/publication.sql
>
> +-- Verify that publish options and publish_via_partition_root option are reset
> +\dRp+ testpub_reset
> +ALTER PUBLICATION testpub_reset RESET;
> +\dRp+ testpub_reset
>
> SUGGESTION
> -- Verify that 'publish' and 'publish_via_partition_root' publication
> parameters are reset

Modified. I have split this into two tests, as that will help the 0002
patch add a few tests on top of the existing steps for the 'publish'
and 'publish_via_partition_root' publication parameters.

Thanks for the comments. The v8 patch attached at [1] has the fixes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0sAU4s1KTLOEWv%3DrYo5dQK6uFTJn_0FKj3XG1Nv4D-qw%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Tue, May 31, 2022 at 11:51 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are my review comments for patch v7-0002.
>
> ======
>
> 1. doc/src/sgml/logical-replication.sgml
>
> @@ -1167,8 +1167,9 @@ CONTEXT:  processing remote data for replication
> origin "pg_16395" during "INSER
>    <para>
>     To add tables to a publication, the user must have ownership rights on the
>     table. To add all tables in schema to a publication, the user must be a
> -   superuser. To create a publication that publishes all tables or all tables in
> -   schema automatically, the user must be a superuser.
> +   superuser. To add all tables to a publication, the user must be a superuser.
> +   To create a publication that publishes all tables or all tables in schema
> +   automatically, the user must be a superuser.
>    </para>
>
> I felt that maybe this whole paragraph should be rearranged. Put the
> "create publication" parts before the "alter publication" parts;
> Re-word the sentences more similarly. I also felt the ALL TABLES and
> ALL TABLES IN SCHEMA etc should be written uppercase/literal since
> that is what was meant.
>
> SUGGESTION
> To create a publication using FOR ALL TABLES or FOR ALL TABLES IN
> SCHEMA, the user must be a superuser. To add ALL TABLES or ALL TABLES
> IN SCHEMA to a publication, the user must be a superuser. To add
> tables to a publication, the user must have ownership rights on the
> table.

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -82,8 +88,8 @@ ALTER PUBLICATION <replaceable
> class="parameter">name</replaceable> RESET
>
>    <para>
>     You must own the publication to use <command>ALTER PUBLICATION</command>.
> -   Adding a table to a publication additionally requires owning that table.
> -   The <literal>ADD ALL TABLES IN SCHEMA</literal>,
> +   Adding a table to or excluding a table from a publication additionally
> +   requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal>,
>     <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and
>
> Isn't this missing some information that says ADD ALL TABLES requires
> the invoking user to be a superuser?

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml - examples
>
> +  <para>
> +   Alter publication <structname>production_publication</structname> to publish
> +   all tables except <structname>users</structname> and
> +   <structname>departments</structname> tables:
> +<programlisting>
> +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT users, departments;
> +</programlisting></para>
> +
>
> I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

Modified

> ~~~
>
> 4. doc/src/sgml/ref/create_publication.sgml - examples
>
> +  <para>
> +   Create a publication that publishes all changes in all the tables except for
> +   the changes of <structname>users</structname> and
> +   <structname>departments</structname> tables:
> +<programlisting>
> +CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments;
> +</programlisting>
> +  </para>
>
> I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

Modified

> ~~~
>
> 5. src/backend/catalog/pg_publication.c
>
>   foreach(lc, ancestors)
>   {
>   Oid ancestor = lfirst_oid(lc);
> - List    *apubids = GetRelationPublications(ancestor);
> - List    *aschemaPubids = NIL;
> + List    *apubids = GetRelationPublications(ancestor, false);
> + List    *aschemapubids = NIL;
> + List    *aexceptpubids = NIL;
>
>   level++;
>
> - if (list_member_oid(apubids, puboid))
> + /* check if member of table publications */
> + if (!list_member_oid(apubids, puboid))
>   {
> - topmost_relid = ancestor;
> -
> - if (ancestor_level)
> - *ancestor_level = level;
> - }
> - else
> - {
> - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> - if (list_member_oid(aschemaPubids, puboid))
> + /* check if member of schema publications */
> + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
> + if (!list_member_oid(aschemapubids, puboid))
>   {
> - topmost_relid = ancestor;
> -
> - if (ancestor_level)
> - *ancestor_level = level;
> + /*
> + * If the publication is all tables publication and the table
> + * is not part of exception tables.
> + */
> + if (puballtables)
> + {
> + aexceptpubids = GetRelationPublications(ancestor, true);
> + if (list_member_oid(aexceptpubids, puboid))
> + goto next;
> + }
> + else
> + goto next;
>   }
>   }
>
> + topmost_relid = ancestor;
> +
> + if (ancestor_level)
> + *ancestor_level = level;
> +
> +next:
>   list_free(apubids);
> - list_free(aschemaPubids);
> + list_free(aschemapubids);
> + list_free(aexceptpubids);
>   }
>
>
> I felt those negative (!) conditions and those goto are making this
> logic hard to understand. Can’t it be simplified more than this? Even
> just having another bool flag might help make it easier.
>
> e.g. Perhaps something a bit like this (but add some comments)
>
> foreach(lc, ancestors)
> {
>     Oid ancestor = lfirst_oid(lc);
>     List    *apubids = GetRelationPublications(ancestor, false);
>     List    *aschemapubids = NIL;
>     List    *aexceptpubids = NIL;
>     bool set_top = false;
>
>     level++;
>
>     /* member of the table publications? */
>     set_top = list_member_oid(apubids, puboid);
>     if (!set_top)
>     {
>         /* member of the schema publications? */
>         aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
>         set_top = list_member_oid(aschemapubids, puboid);
>
>         /* in an ALL TABLES publication and not in the except list? */
>         if (!set_top && puballtables)
>         {
>             aexceptpubids = GetRelationPublications(ancestor, true);
>             set_top = !list_member_oid(aexceptpubids, puboid);
>         }
>     }
>
>     if (set_top)
>     {
>         topmost_relid = ancestor;
>
>         if (ancestor_level)
>             *ancestor_level = level;
>     }
>
>     list_free(apubids);
>     list_free(aschemapubids);
>     list_free(aexceptpubids);
> }

Modified
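
To make the flag-based control flow easy to try outside the backend, here is a self-contained sketch of the same decision order (table membership, then schema membership, then ALL TABLES minus the except list). It is only an illustration under stated assumptions: DemoAncestor and demo_topmost_ancestor are hypothetical names, and the three booleans replace the list_member_oid() lookups against the catalog lists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what the catalog lookups would report per ancestor. */
typedef struct DemoAncestor
{
    unsigned int relid;
    bool in_table_pubs;     /* ancestor is listed in the publication */
    bool in_schema_pubs;    /* ancestor's schema is in the publication */
    bool in_except_list;    /* ancestor is named in EXCEPT TABLE */
} DemoAncestor;

/*
 * Walks the ancestors in order and returns the relid of the last one that is
 * published, or 0 if none is. If ancestor_level is not NULL, it receives the
 * 1-based position of that ancestor.
 */
static unsigned int
demo_topmost_ancestor(const DemoAncestor *ancestors, size_t n,
                      bool puballtables, int *ancestor_level)
{
    unsigned int topmost_relid = 0;
    int level = 0;

    for (size_t i = 0; i < n; i++)
    {
        const DemoAncestor *a = &ancestors[i];
        bool set_top;

        level++;

        /* member of the table publications? */
        set_top = a->in_table_pubs;

        /* member of the schema publications? */
        if (!set_top)
            set_top = a->in_schema_pubs;

        /* in an ALL TABLES publication and not in the except list? */
        if (!set_top && puballtables)
            set_top = !a->in_except_list;

        if (set_top)
        {
            topmost_relid = a->relid;
            if (ancestor_level)
                *ancestor_level = level;
        }
    }

    return topmost_relid;
}
```

With a single bool there is one place where topmost_relid is assigned and no label to jump to, so the cleanup at the end of each iteration in the real loop runs unconditionally.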

> ------
>
> 6. src/backend/commands/publicationcmds.c - CheckPublicationDefValues
>
> +/*
> + * Check if the publication has default values
> + *
> + * Check the following:
> + * a) Publication is not set with "FOR ALL TABLES"
> + * b) Publication is having default options
> + * c) Publication is not associated with schemas
> + * d) Publication is not associated with relations
> + */
> +static bool
> +CheckPublicationDefValues(HeapTuple tup)
>
> I think Osumi-san already gave a review [1] about this same comment.
>
> So I only wanted to add that it should not say "options" here:
> "default options" -> "default publication parameter values"

Modified

> ~~~
>
> 7. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables
>
> +#ifdef USE_ASSERT_CHECKING
> + Assert(!pubform->puballtables);
> +#endif
>
> Why is this #ifdef needed? Isn't that logic built into the Assert macro already?

pubform is used only in the assert. If we don't wrap it in #ifdef or
mark it PG_USED_FOR_ASSERTS_ONLY, a build without --enable-cassert
fails with an unused variable error like:

publicationcmds.c: In function ‘AlterPublicationSetAllTables’:
publicationcmds.c:1250:29: error: unused variable ‘pubform’ [-Werror=unused-variable]
 1250 |         Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
      |                             ^~~~~~~
cc1: all warnings being treated as errors
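
The alternative mentioned above, marking the variable instead of wrapping the Assert in #ifdef, can be sketched outside PostgreSQL as follows. This is a rough stand-alone analogue and rests on assumptions: USED_FOR_ASSERTS_ONLY is a hypothetical macro that imitates PG_USED_FOR_ASSERTS_ONLY using the standard NDEBUG switch and the GCC/Clang unused attribute, and DemoPublication is a made-up struct, not Form_pg_publication.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * When assertions are compiled out (NDEBUG), tell the compiler that the
 * variable is intentionally unused, so -Werror=unused-variable does not fire.
 * With assertions enabled the macro expands to nothing.
 */
#if defined(NDEBUG) && (defined(__GNUC__) || defined(__clang__))
#define USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define USED_FOR_ASSERTS_ONLY
#endif

/* Hypothetical, simplified publication row. */
typedef struct DemoPublication
{
    bool puballtables;
} DemoPublication;

/* Sets the ALL TABLES flag; callers must pass a publication that lacks it. */
static void
demo_set_all_tables(DemoPublication *pub)
{
    /* Read only by the assertion below. */
    bool was_set USED_FOR_ASSERTS_ONLY = pub->puballtables;

    assert(!was_set);

    pub->puballtables = true;
}
```

Either approach avoids the warning; the annotation keeps the assertion readable at the cost of evaluating the initializer in non-assert builds as well.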

> ~~~
>
> 8. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables
>
> + /* set ALL TABLES flag */
>
> Use uppercase 'S' to match other comments.

Modified

> ~~~
>
> 9. src/backend/commands/publicationcmds.c - AlterPublication
>
> + if (!isdefault)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"),
> + errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
>
> IMO this errmsg text is not very good but I think Osumi-san [1] has
> also given a review comment about the same errmsg.
>
> So I only wanted to add that should not say "options" here:
> "default publication options" -> "default publication parameter values"

Modified

> ~~~
>
> 10. src/backend/parser/gram.y
>
> /*****************************************************************************
>  *
>  * ALTER PUBLICATION name SET ( options )
>  *
>  * ALTER PUBLICATION name ADD pub_obj [, ...]
>  *
>  * ALTER PUBLICATION name DROP pub_obj [, ...]
>  *
>  * ALTER PUBLICATION name SET pub_obj [, ...]
>  *
>  * ALTER PUBLICATION name RESET
>  *
>  * pub_obj is one of:
>  *
>  * TABLE table_name [, ...]
>  * ALL TABLES IN SCHEMA schema_name [, ...]
>  *
>  *****************************************************************************/
>
> -
>
>  Should the above comment be updated to mention also ADD ALL TABLES
> ... EXCEPT [TABLE] ...

Modified

> ~~~
>
> 11. src/bin/pg_dump/pg_dump.c - dumpPublication
>
> + /* Include exception tables if the publication has except tables */
> + for (cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + PublicationInfo *relpubinfo = pubrinfo->publication;
> + TableInfo  *tbinfo;
> +
> + if (pubinfo == relpubinfo)
>
> I am unsure if that variable 'relpubinfo' is of much use; it is only
> used one time.

Removed relpubinfo

> ~~~
>
> 12. src/bin/pg_dump/t/002_pg_dump.pl
>
> I think there should be more test cases here:
>
> E.g.1. EXCEPT TABLE should also test a list of tables
>
> E.g.2. EXCEPT with optional TABLE keyword omitted

Added a test for list of tables and modified one of the test to remove TABLE.

> ~~~
>
> 13. src/bin/psql/describe.c - question about the SQL
>
> Since the new 'except' is a boolean column, wouldn't it be more
> natural if all the SQL was treating it as one?
>
> e.g. should the SQL be saying "IS prexcept", "IS NOT prexcept";
> instead of comparing prexcept to 't' and 'f' character.

Modified

> ~~~
>
> 14. .../t/032_rep_changes_except_table.pl
>
> +# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE
> +# option.
> +# Create schemas and tables on publisher
>
> "option" -> "clause"

Modified.

Thanks for the comments. The v8 patch attached at [1] has the fixes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0sAU4s1KTLOEWv%3DrYo5dQK6uFTJn_0FKj3XG1Nv4D-qw%40mail.gmail.com

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v8 patch has the changes for the same.
>

AFAICS, the summary of this proposal is that we want to support
excluding certain objects from a publication, in two variants. The
first variant adds support for excluding specific tables from an ALL
TABLES publication. Without this feature, a user needs to manually add
all tables of a database even when they want to leave out only a
handful of tables, say because they contain sensitive information or
are not required. We have seen that other databases like MySQL also
provide a similar feature [1] (see REPLICATE_WILD_IGNORE_TABLE). The
proposed syntax for this is as follows:

CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
or
ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;

This will allow us to publish all the tables in the current database
except t1 and t2. Now, I see that pg_dump has a similar option
provided by switch --exclude-table but that allows tables matching
patterns which is not the case here. I am not sure if we need a
similar variant here.

Then users will be allowed to reset the publication by:
ALTER PUBLICATION pub1 RESET;

This will reset the publication to the default state which includes
resetting the publication parameters, setting the ALL TABLES flag to
false, and dropping the relations and schemas that are associated with
the publication. I don't know if we want to go further with allowing
to RESET specific parameters and if so which parameters and what would
its syntax be?

The second variant is to add support to exclude certain columns of a
table while publishing a particular table. Currently, users need to
list all the required column names even if they want to hide only a
few of the columns in the table (for example Create Publication pub For
Table t1 (c1, c2)). Consider a user who doesn't want to publish the
'salary' or other sensitive information of executives/employees but
would like to publish all other columns. I feel that in such cases it
will be a lot of work for the user, especially when the table has many
columns. I see that Oracle has a similar feature [2]. I think without
this it will be difficult for users to use the column list feature in
some cases. The patch for this is not proposed but I would imagine the
syntax for it to be something like "Create Publication pub For Table t1
Except (c3)" and similar variants for Alter Publication.

Have I missed anything?

Thoughts on the proposal/syntax would be appreciated.

[1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html
[2] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20

-- 
With Regards,
Amit Kapila.



RE: Skipping schema changes in publication

From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v8 patch has the changes for the
> same.
> >
> 
> AFAICS, the summary of this proposal is that we want to support
> exclude of certain objects from publication with two kinds of
> variants. The first variant is to add support to exclude specific
> tables from ALL TABLES PUBLICATION. Without this feature, users need
> to manually add all tables for a database even when she wants to avoid
> only a handful of tables from the database say because they contain
> sensitive information or are not required. We have seen that other
> database like MySQL also provides similar feature [1] (See
> REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
> follows:
> 
> CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> or
> ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;
> 
> This will allow us to publish all the tables in the current database
> except t1 and t2. Now, I see that pg_dump has a similar option
> provided by switch --exclude-table but that allows tables matching
> patterns which is not the case here. I am not sure if we need a
> similar variant here.
> 
> Then users will be allowed to reset the publication by:
> ALTER PUBLICATION pub1 RESET;
> 
> This will reset the publication to the default state which includes
> resetting the publication parameters, setting the ALL TABLES flag to
> false, and dropping the relations and schemas that are associated with
> the publication. I don't know if we want to go further with allowing
> to RESET specific parameters and if so which parameters and what would
> its syntax be?
> 
> The second variant is to add support to exclude certain columns of a
> table while publishing a particular table. Currently, users need to
> list all required columns' names even if they don't want to hide most
> of the columns in the table (for example Create Publication pub For
> Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary'
> or other sensitive information of executives/employees but would like
> to publish all other columns. I feel in such cases it will be a lot of
> work for the user especially when the table has many columns. I see
> that Oracle has a similar feature [2]. I think without this it will be
> difficult for users to use this feature in some cases. The patch for
> this is not proposed but I would imagine syntax for it to be something
> like "Create Publication pub For Table t1 Except (c3)" and similar
> variants for Alter Publication.

I think the feature to exclude certain columns of a table would be useful.

In some production scenarios, we do not want to replicate the
sensitive fields (columns) of a table. We can already achieve this by
specifying all the replicated columns in the column list[1], but that
is hard work when the table has hundreds of columns.

[1]
CREATE TABLE test(a int, b int, c int,..., sensitive text);
CREATE PUBLICATION pub FOR TABLE test(a,b,c,...);

In addition, a column list like the one above is not easy to maintain,
because we sometimes need to add or delete fields due to business
needs. Every time we add a column (or delete a column that is in the
column list), we need to update the column list.

If we support EXCEPT:
CREATE PUBLICATION pub FOR TABLE test EXCEPT (sensitive);

We don't need to update the column list in most cases.
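
The maintenance argument can be illustrated with a small stand-alone sketch: with an EXCEPT list the published set is derived as "all current columns minus the excluded ones", so a newly added column is picked up without touching the list. This only models the idea; it is not how the proposed feature would be implemented, and demo_is_excepted and demo_published_columns are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if col appears in the except list. */
static bool
demo_is_excepted(const char *col, const char *const *except, size_t nexcept)
{
    for (size_t i = 0; i < nexcept; i++)
    {
        if (strcmp(col, except[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Copies every column that is not in the except list into out, preserving
 * order, and returns how many were copied. out must have room for ncols
 * entries.
 */
static size_t
demo_published_columns(const char *const *cols, size_t ncols,
                       const char *const *except, size_t nexcept,
                       const char **out)
{
    size_t n = 0;

    for (size_t i = 0; i < ncols; i++)
    {
        if (!demo_is_excepted(cols[i], except, nexcept))
            out[n++] = cols[i];
    }
    return n;
}
```

For a table (a, b, c, sensitive) with the except list (sensitive), three columns are published; after adding a column d to the table, the same except list yields four, whereas an explicit column list would have to be altered to include d.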

Thanks to "hametan" for providing the use case off-list.

Best regards,
Hou zj




Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, Jun 14, 2022 at 9:10 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for the comments, the attached v8 patch has the changes for the
> > same.
> > >
> >
> > AFAICS, the summary of this proposal is that we want to support
> > exclude of certain objects from publication with two kinds of
> > variants. The first variant is to add support to exclude specific
> > tables from ALL TABLES PUBLICATION. Without this feature, users need
> > to manually add all tables for a database even when she wants to avoid
> > only a handful of tables from the database say because they contain
> > sensitive information or are not required. We have seen that other
> > database like MySQL also provides similar feature [1] (See
> > REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
> > follows:
> >
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > or
> > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;
> >
> > This will allow us to publish all the tables in the current database
> > except t1 and t2. Now, I see that pg_dump has a similar option
> > provided by switch --exclude-table but that allows tables matching
> > patterns which is not the case here. I am not sure if we need a
> > similar variant here.
> >
> > Then users will be allowed to reset the publication by:
> > ALTER PUBLICATION pub1 RESET;
> >
> > This will reset the publication to the default state which includes
> > resetting the publication parameters, setting the ALL TABLES flag to
> > false, and dropping the relations and schemas that are associated with
> > the publication. I don't know if we want to go further with allowing
> > to RESET specific parameters and if so which parameters and what would
> > its syntax be?
> >
> > The second variant is to add support to exclude certain columns of a
> > table while publishing a particular table. Currently, users need to
> > list all required columns' names even if they don't want to hide most
> > of the columns in the table (for example Create Publication pub For
> > Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary'
> > or other sensitive information of executives/employees but would like
> > to publish all other columns. I feel in such cases it will be a lot of
> > work for the user especially when the table has many columns. I see
> > that Oracle has a similar feature [2]. I think without this it will be
> > difficult for users to use this feature in some cases. The patch for
> > this is not proposed but I would imagine syntax for it to be something
> > like "Create Publication pub For Table t1 Except (c3)" and similar
> > variants for Alter Publication.
>
> I think the feature to exclude certain columns of a table would be useful.
>
> In some production scenarios, we usually do not want to replicate
> sensitive fields(column) in the table. Although we already can achieve
> this by specify all replicated columns in the list[1], but that seems a
> hard work when the table has hundreds of columns.
>
> [1]
> CREATE TABLE test(a int, b int, c int,..., sensitive text);
> CREATE PUBLICATION pub FOR TABLE test(a,b,c,...);
>
> In addition, it's not easy to maintain the column list like above. Because
> we sometimes need to add new fields or delete fields due to business
> needs. Every time we add a column(or delete a column in column list), we
> need to update the column list.
>
> If we support Except:
> CREATE PUBLICATION pub FOR TABLE test EXCEPT (sensitive);
>
> We don't need to update the column list in most cases.
>

Right, this is a valid point, and I think it makes sense to support
such a feature for column lists and also to allow excluding particular
tables from an ALL TABLES publication.

Peter E., Euler, and others, do you have any objections to supporting
the above-mentioned two cases?

-- 
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > > Attached v7 patch which fixes the buildfarm warning for an unused warning in
> > > release mode as in  [1].
> > Hi, thank you for the patches.
> >
> >
> > I'll share several review comments.
> >
> > For v7-0001.
> >
> > (1) I'll suggest some minor rewording.
> >
> > +  <para>
> > +   The <literal>RESET</literal> clause will reset the publication to the
> > +   default state which includes resetting the publication options, setting
> > +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> > +   dropping all relations and schemas that are associated with the publication.
> >
> > My suggestion is
> > "The RESET clause will reset the publication to the
> > default state. It resets the publication operations,
> > sets ALL TABLES flag to false and drops all relations
> > and schemas associated with the publication."
>
> I felt the existing looks better. I would prefer to keep it that way.
>
> > (2) typo and rewording
> >
> > +/*
> > + * Reset the publication.
> > + *
> > + * Reset the publication options, setting ALL TABLES flag to false and drop
> > + * all relations and schemas that are associated with the publication.
> > + */
> >
> > The "setting" in this sentence should be "set".
> >
> > How about changing like below ?
> > FROM:
> > "Reset the publication options, setting ALL TABLES flag to false and drop
> > all relations and schemas that are associated with the publication."
> > TO:
> > "Reset the publication operations, set ALL TABLES flag to false and drop
> > all relations and schemas associated with the publication."
>
>  I felt the existing looks better. I would prefer to keep it that way.
>
> > (3) AlterPublicationReset
> >
> > Do we need to call CacheInvalidateRelcacheAll() or
> > InvalidatePublicationRels() at the end of
> > AlterPublicationReset() like AlterPublicationOptions() ?
>
> CacheInvalidateRelcacheAll should be called if we change all tables
> from true to false, else the cache will not be invalidated. Modified
>
> >
> > For v7-0002.
> >
> > (4)
> >
> > +       if (stmt->for_all_tables)
> > +       {
> > +               bool            isdefault = CheckPublicationDefValues(tup);
> > +
> > +               if (!isdefault)
> > +                       ereport(ERROR,
> > +                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > +                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
> > +                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> >
> >
> > The errmsg string has three messages for user and is a bit long
> > (we have two sentences there connected by 'and').
> > Can't we make it concise and split it into a couple of lines for code readability ?
> >
> > I'll suggest a change below.
> > FROM:
> > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
> > TO:
> > "adding ALL TABLES requires the publication defined not for ALL TABLES"
> > "to have default publish actions without any associated tables/schemas"
>
> Added errdetail and split it
>
> > (5) typo
> >
> >    <varlistentry>
> > +    <term><literal>EXCEPT TABLE</literal></term>
> > +    <listitem>
> > +     <para>
> > +      This clause specifies a list of tables to exclude from the publication.
> > +      It can only be used with <literal>FOR ALL TABLES</literal>.
> > +     </para>
> > +    </listitem>
> > +   </varlistentry>
> > +
> >
> > Kindly change
> > FROM:
> > This clause specifies a list of tables to exclude from the publication.
> > TO:
> > This clause specifies a list of tables to be excluded from the publication.
> > or
> > This clause specifies a list of tables excluded from the publication.
>
> Modified
>
> > (6) Minor suggestion for an expression change
> >
> >        Marks the publication as one that replicates changes for all tables in
> > -      the database, including tables created in the future.
> > +      the database, including tables created in the future. If
> > +      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
> > +      the changes for the specified tables.
> >
> >
> > I'll suggest a minor rewording.
> > FROM:
> > ...exclude replicating the changes for the specified tables
> > TO:
> > ...exclude replication changes for the specified tables
>
> I felt the existing is better.
>
> > (7)
> > (7-1)
> >
> > +/*
> > + * Check if the publication has default values
> > + *
> > + * Check the following:
> > + * a) Publication is not set with "FOR ALL TABLES"
> > + * b) Publication is having default options
> > + * c) Publication is not associated with schemas
> > + * d) Publication is not associated with relations
> > + */
> > +static bool
> > +CheckPublicationDefValues(HeapTuple tup)
> >
> >
> > I think this header comment can be improved.
> > FROM:
> > Check the following:
> > TO:
> > Returns true if the publication satisfies all the following conditions:
>
> Modified
>
> > (7-2)
> >
> > b) should be changed as well
> > FROM:
> > Publication is having default options
> > TO:
> > Publication has the default publish operations
>
> Changed it to "Publication is having default publication parameter values"
>
> Thanks for the comments, the attached v8 patch has the changes for the same.

The patch needed to be rebased on top of HEAD because of commit
"0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8
version for the changes of the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> > >
> > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in
> > > > release mode as in  [1].
> > > Hi, thank you for the patches.
> > >
> > >
> > > I'll share several review comments.
> > >
> > > For v7-0001.
> > >
> > > (1) I'll suggest some minor rewording.
> > >
> > > +  <para>
> > > +   The <literal>RESET</literal> clause will reset the publication to the
> > > +   default state which includes resetting the publication options, setting
> > > +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> > > +   dropping all relations and schemas that are associated with the publication.
> > >
> > > My suggestion is
> > > "The RESET clause will reset the publication to the
> > > default state. It resets the publication operations,
> > > sets ALL TABLES flag to false and drops all relations
> > > and schemas associated with the publication."
> >
> > I felt the existing looks better. I would prefer to keep it that way.
> >
> > > (2) typo and rewording
> > >
> > > +/*
> > > + * Reset the publication.
> > > + *
> > > + * Reset the publication options, setting ALL TABLES flag to false and drop
> > > + * all relations and schemas that are associated with the publication.
> > > + */
> > >
> > > The "setting" in this sentence should be "set".
> > >
> > > How about changing like below ?
> > > FROM:
> > > "Reset the publication options, setting ALL TABLES flag to false and drop
> > > all relations and schemas that are associated with the publication."
> > > TO:
> > > "Reset the publication operations, set ALL TABLES flag to false and drop
> > > all relations and schemas associated with the publication."
> >
> >  I felt the existing looks better. I would prefer to keep it that way.
> >
> > > (3) AlterPublicationReset
> > >
> > > Do we need to call CacheInvalidateRelcacheAll() or
> > > InvalidatePublicationRels() at the end of
> > > AlterPublicationReset() like AlterPublicationOptions() ?
> >
> > CacheInvalidateRelcacheAll should be called if we change all tables
> > from true to false, else the cache will not be invalidated. Modified
> >
> > >
> > > For v7-0002.
> > >
> > > (4)
> > >
> > > +       if (stmt->for_all_tables)
> > > +       {
> > > +               bool            isdefault = CheckPublicationDefValues(tup);
> > > +
> > > +               if (!isdefault)
> > > +                       ereport(ERROR,
> > > +                                       errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> > > > +                                       errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/....
> > > +                                       errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));
> > >
> > >
> > > The errmsg string has three messages for user and is a bit long
> > > (we have two sentences there connected by 'and').
> > > Can't we make it concise and split it into a couple of lines for code readability ?
> > >
> > > I'll suggest a change below.
> > > FROM:
> > > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"
> > > TO:
> > > "adding ALL TABLES requires the publication defined not for ALL TABLES"
> > > "to have default publish actions without any associated tables/schemas"
> >
> > Added errdetail and split it
> >
> > > (5) typo
> > >
> > >    <varlistentry>
> > > +    <term><literal>EXCEPT TABLE</literal></term>
> > > +    <listitem>
> > > +     <para>
> > > +      This clause specifies a list of tables to exclude from the publication.
> > > +      It can only be used with <literal>FOR ALL TABLES</literal>.
> > > +     </para>
> > > +    </listitem>
> > > +   </varlistentry>
> > > +
> > >
> > > Kindly change
> > > FROM:
> > > This clause specifies a list of tables to exclude from the publication.
> > > TO:
> > > This clause specifies a list of tables to be excluded from the publication.
> > > or
> > > This clause specifies a list of tables excluded from the publication.
> >
> > Modified
> >
> > > (6) Minor suggestion for an expression change
> > >
> > >        Marks the publication as one that replicates changes for all tables in
> > > -      the database, including tables created in the future.
> > > +      the database, including tables created in the future. If
> > > +      <literal>EXCEPT TABLE</literal> is specified, then exclude replicating
> > > +      the changes for the specified tables.
> > >
> > >
> > > I'll suggest a minor rewording.
> > > FROM:
> > > ...exclude replicating the changes for the specified tables
> > > TO:
> > > ...exclude replication changes for the specified tables
> >
> > I felt the existing is better.
> >
> > > (7)
> > > (7-1)
> > >
> > > +/*
> > > + * Check if the publication has default values
> > > + *
> > > + * Check the following:
> > > + * a) Publication is not set with "FOR ALL TABLES"
> > > + * b) Publication is having default options
> > > + * c) Publication is not associated with schemas
> > > + * d) Publication is not associated with relations
> > > + */
> > > +static bool
> > > +CheckPublicationDefValues(HeapTuple tup)
> > >
> > >
> > > I think this header comment can be improved.
> > > FROM:
> > > Check the following:
> > > TO:
> > > Returns true if the publication satisfies all the following conditions:
> >
> > Modified
> >
> > > (7-2)
> > >
> > > b) should be changed as well
> > > FROM:
> > > Publication is having default options
> > > TO:
> > > Publication has the default publish operations
> >
> > Changed it to "Publication is having default publication parameter values"
> >
> > Thanks for the comments, the attached v8 patch has the changes for the same.
>
> The patch needed to be rebased on top of HEAD because of commit
> "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8
> version for the changes of the same.

I had missed attaching one of the changes that was present locally.
The updated patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Nitin Jadhav
Date:
I spent some time understanding the proposal and the patch. Here
are a few comments on the test cases.

> +ALTER PUBLICATION testpub_reset ADD TABLE pub_sch1.tbl1;
> +
> +-- Verify that tables associated with the publication are dropped after RESET
> +\dRp+ testpub_reset
> +ALTER PUBLICATION testpub_reset RESET;
> +\dRp+ testpub_reset
>
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
> +
> +-- Verify that schemas associated with the publication are dropped after RESET
> +\dRp+ testpub_reset
> +ALTER PUBLICATION testpub_reset RESET;
> +\dRp+ testpub_reset

The results for the above two cases are the same before and after the
reset. Is there any way to verify that?
---

> +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> +
>
> +-- Can't add EXCEPT TABLE to 'FOR TABLE' publication
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> +
>
> +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES IN SCHEMA' publication
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> +

I did not understand the objective of these tests. I think we need to
improve the comments.

Thanks & Regards,






Re: Skipping schema changes in publication

From
vignesh C
Date:
On Thu, Aug 18, 2022 at 12:33 PM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> I spent some time on understanding the proposal and the patch. Here
> are a few comments wrt the test cases.
>
> > +ALTER PUBLICATION testpub_reset ADD TABLE pub_sch1.tbl1;
> > +
> > +-- Verify that tables associated with the publication are dropped after RESET
> > +\dRp+ testpub_reset
> > +ALTER PUBLICATION testpub_reset RESET;
> > +\dRp+ testpub_reset
> >
> > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
> > +
> > +-- Verify that schemas associated with the publication are dropped after RESET
> > +\dRp+ testpub_reset
> > +ALTER PUBLICATION testpub_reset RESET;
> > +\dRp+ testpub_reset
>
> The results for the above two cases are the same before and after the
> reset. Is there any way to verify that?

If you look at the expected output, the first \dRp+ command includes:
+Tables:
+    "pub_sch1.tbl1"
The second \dRp+ does not include the Tables section.
We are trying to verify that after RESET, the tables are removed
from the publication.
The second test is similar to the first; the only difference is
that we test schemas instead of tables, i.e., we verify that the schemas
are removed from the publication.
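
As an illustration of the RESET behaviour these tests check, here is a minimal C sketch. It is not the patch's code: the PubState struct and pub_reset() are hypothetical stand-ins for the catalog state, and the defaults assumed are the stock publication parameters (publish = insert, update, delete, truncate; publish_via_partition_root = false). The return value models the point raised earlier in the thread that a relcache invalidation for all relations is only needed when the ALL TABLES flag changes from true to false.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a publication's catalog state. */
typedef struct PubState
{
    bool all_tables;   /* FOR ALL TABLES flag */
    bool pubinsert;    /* publish inserts */
    bool pubupdate;    /* publish updates */
    bool pubdelete;    /* publish deletes */
    bool pubtruncate;  /* publish truncates */
    bool pubviaroot;   /* publish_via_partition_root */
    int  ntables;      /* number of tables associated with the publication */
    int  nschemas;     /* number of schemas associated with the publication */
} PubState;

/*
 * Reset the publication to its default state (illustrative only).
 *
 * Returns true when the ALL TABLES flag changed from true to false, which
 * is the case where the caller would need to invalidate the relcache for
 * all relations.
 */
bool
pub_reset(PubState *pub)
{
    bool was_all_tables = pub->all_tables;

    /* Restore the default publication parameter values. */
    pub->pubinsert = true;
    pub->pubupdate = true;
    pub->pubdelete = true;
    pub->pubtruncate = true;
    pub->pubviaroot = false;

    /* Clear the ALL TABLES flag. */
    pub->all_tables = false;

    /* Drop every table and schema associated with the publication. */
    pub->ntables = 0;
    pub->nschemas = 0;

    return was_all_tables;
}
```

In this model, the two \dRp+ calls in each test correspond to inspecting ntables (or nschemas) before and after pub_reset(): non-zero before, zero after.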

> ---
>
> > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication
> > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> > +
> >
> > +-- Can't add EXCEPT TABLE to 'FOR TABLE' publication
> > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> > +
> >
> > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES IN SCHEMA' publication
> > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
> > +
>
> I did not understand the objective of these tests. I think we need to
> improve the comments.

There are different kinds of publications: "ALL TABLES", "TABLE", and "ALL
TABLES IN SCHEMA". Here we are trying to verify that EXCEPT TABLE cannot
be added to any of these publications.
If you look at the expected output file, you will see the following error:
+-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication
+ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1;
+ERROR:  adding ALL TABLES requires the publication to have default publication parameter values
+DETAIL:  ALL TABLES flag should not be set and no tables/schemas should be associated.
+HINT:  Use ALTER PUBLICATION ... RESET to reset the publication

I felt the existing comment is ok. Let me know if you still feel any
change is required.
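
To make the precondition behind that error concrete, here is a minimal C sketch of the check described in the CheckPublicationDefValues header comment (conditions a to d). It is only an assumption-laden model: PubState and pub_has_default_values() are hypothetical names, the real function takes a HeapTuple and reads the catalogs, and the default parameter values assumed are publish = insert, update, delete, truncate with publish_via_partition_root = false.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a publication's catalog state. */
typedef struct PubState
{
    bool all_tables;   /* FOR ALL TABLES flag */
    bool pubinsert;    /* publish inserts */
    bool pubupdate;    /* publish updates */
    bool pubdelete;    /* publish deletes */
    bool pubtruncate;  /* publish truncates */
    bool pubviaroot;   /* publish_via_partition_root */
    int  ntables;      /* number of tables associated with the publication */
    int  nschemas;     /* number of schemas associated with the publication */
} PubState;

/*
 * Returns true if the publication satisfies all the following conditions
 * (illustrative only):
 * a) Publication is not set with "FOR ALL TABLES"
 * b) Publication has the default publication parameter values
 * c) Publication is not associated with schemas
 * d) Publication is not associated with relations
 */
bool
pub_has_default_values(const PubState *pub)
{
    if (pub->all_tables)
        return false;

    if (!pub->pubinsert || !pub->pubupdate ||
        !pub->pubdelete || !pub->pubtruncate || pub->pubviaroot)
        return false;

    if (pub->nschemas > 0)
        return false;

    if (pub->ntables > 0)
        return false;

    return true;
}
```

Under this model, each of the three "Can't add EXCEPT TABLE" tests starts from a publication that fails one condition (a, d, or c respectively), so ADD ALL TABLES EXCEPT TABLE is rejected until the publication is reset.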

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, Aug 8, 2022 at 2:53 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I had missed attaching one of the changes that was present locally.
> The updated patch has the changes for the same.

The patch needed to be rebased on top of HEAD because of a recent
commit. The updated v8 patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Ian Lawrence Barwick
Date:
On Fri, Aug 19, 2022 at 2:41 vignesh C <vignesh21@gmail.com> wrote:
>
> The patch needed to be rebased on top of HEAD because of a recent
> commit. The updated v8 patch has the changes for the same.

Hi

cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

[1] http://cfbot.cputube.org/patch_40_3646.log

Thanks

Ian Barwick



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> Hi
>
> cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.
>
> [1] http://cfbot.cputube.org/patch_40_3646.log

Here is an updated patch which is rebased on top of HEAD.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
Ian Lawrence Barwick
Date:
On Mon, Nov 7, 2022 at 22:39 vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >
> > Hi
> >
> > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > currently underway, this would be an excellent time to update the patch.
> >
> > [1] http://cfbot.cputube.org/patch_40_3646.log
>
> Here is an updated patch which is rebased on top of HEAD.

Thanks for the updated patch.

While reviewing the patch backlog, we have determined that this patch adds
one or more TAP tests but has not added the test to the "meson.build" file.

To do this, locate the relevant "meson.build" file for each test and add it
in the 'tests' dictionary, which will look something like this:

  'tap': {
    'tests': [
      't/001_basic.pl',
    ],
  },

For some additional details please see this Wiki article:

  https://wiki.postgresql.org/wiki/Meson_for_patch_authors

For more information on the meson build system for PostgreSQL see:

  https://wiki.postgresql.org/wiki/Meson


Regards

Ian Barwick



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> On Mon, Nov 7, 2022 at 22:39 vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > >
> > > Hi
> > >
> > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > > currently underway, this would be an excellent time to update the patch.
> > >
> > > [1] http://cfbot.cputube.org/patch_40_3646.log
> >
> > Here is an updated patch which is rebased on top of HEAD.
>
> Thanks for the updated patch.
>
> While reviewing the patch backlog, we have determined that this patch adds
> one or more TAP tests but has not added the test to the "meson.build" file.

Thanks, I have updated the meson.build to include the TAP test. The
attached patch has the changes for the same.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Wed, 16 Nov 2022 at 15:35, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >
> > On Mon, 7 Nov 2022 at 22:39, vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > > >
> > > > Hi
> > > >
> > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > > > currently underway, this would be an excellent time to update the patch.
> > > >
> > > > [1] http://cfbot.cputube.org/patch_40_3646.log
> > >
> > > Here is an updated patch which is rebased on top of HEAD.
> >
> > Thanks for the updated patch.
> >
> > While reviewing the patch backlog, we have determined that this patch adds
> > one or more TAP tests but has not added the test to the "meson.build" file.
>
> Thanks, I have updated the meson.build to include the TAP test. The
> attached patch has the changes for the same.

The patch no longer applied on top of HEAD; attached is a rebased version.

Regards,
Vignesh

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, 20 Jan 2023 at 15:30, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 16 Nov 2022 at 15:35, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > >
> > > On Mon, 7 Nov 2022 at 22:39, vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> > > > >
> > > > > Hi
> > > > >
> > > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> > > > > currently underway, this would be an excellent time to update the patch.
> > > > >
> > > > > [1] http://cfbot.cputube.org/patch_40_3646.log
> > > >
> > > > Here is an updated patch which is rebased on top of HEAD.
> > >
> > > Thanks for the updated patch.
> > >
> > > While reviewing the patch backlog, we have determined that this patch adds
> > > one or more TAP tests but has not added the test to the "meson.build" file.
> >
> > Thanks, I have updated the meson.build to include the TAP test. The
> > attached patch has the changes for the same.
>
> The patch was not applying on top of HEAD, attached a rebased version.

As I did not see much interest from others, I'm withdrawing this patch
for now. But if there is any interest from others in the future, I would
be more than happy to work on this feature.

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote:
>
> As I did not see much interest from others, I'm withdrawing this patch
> for now. But if there is any interest others in future, I would be
> more than happy to work on this feature.
>

Just FYI, I noticed a use case for this patch in email [1]. Users
would like to replicate all except a few columns having sensitive
information. The challenge with the current column list feature is that
adding new columns to tables would require users to change the
respective publications as well.

[1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com

--
With Regards,
Amit Kapila.



RE: Skipping schema changes in publication

From
"Zhijie Hou (Fujitsu)"
Date:
On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote:
> 
> On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > As I did not see much interest from others, I'm withdrawing this patch
> > for now. But if there is any interest others in future, I would be
> > more than happy to work on this feature.
> >
> 
> Just FYI, I noticed a use case for this patch in email [1]. Users would like to
> replicate all except a few columns having sensitive information. The challenge
> with current column list features is that adding new tables to columns would
> lead users to change the respective publications as well.
> 
> [1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com

BTW, I noticed that debezium, an open source distributed platform for change
data capture that relies on logical decoding, also supports specifying a
column exclusion list [1]. So, this indicates that there could be some use cases
for this feature.

[1] https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-property-column-exclude-list

Best Regards,
Hou zj

Re: Skipping schema changes in publication

From
Amit Kapila
Date:
On Wed, Apr 16, 2025 at 8:22 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote:
> >
> > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > As I did not see much interest from others, I'm withdrawing this patch
> > > for now. But if there is any interest others in future, I would be
> > > more than happy to work on this feature.
> > >
> >
> > Just FYI, I noticed a use case for this patch in email [1]. Users would like to
> > replicate all except a few columns having sensitive information. The challenge
> > with current column list features is that adding new tables to columns would
> > lead users to change the respective publications as well.
> >
> > [1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com
>
> BTW, I noticed that debezium, an open source distributed platform for change
> data capture that replies on logical decoding, also support specifying the
> column exclusion list[1]. So, this indicates that there could be some use cases
> for this feature.
>

Thanks for sharing the link. I see that they support both the include
and exclude lists for columns and tables.

--
With Regards,
Amit Kapila.



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 17 Apr 2025 at 09:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 16, 2025 at 8:22 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote:
> > >
> > > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > As I did not see much interest from others, I'm withdrawing this patch
> > > > for now. But if there is any interest others in future, I would be
> > > > more than happy to work on this feature.
> > > >
> > >
> > > Just FYI, I noticed a use case for this patch in email [1]. Users would like to
> > > replicate all except a few columns having sensitive information. The challenge
> > > with current column list features is that adding new tables to columns would
> > > lead users to change the respective publications as well.
> > >
> > > [1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com
> >
> > BTW, I noticed that debezium, an open source distributed platform for change
> > data capture that replies on logical decoding, also support specifying the
> > column exclusion list[1]. So, this indicates that there could be some use cases
> > for this feature.
> >
>
> Thanks for sharing the link. I see that they support both the include
> and exclude lists for columns and tables.
>

Hi Hackers,

I see there is some interest in the functionality added by this patch.
I have rebased the patches in [1]. I saw a new column 'pubgencols' was
added in pg_publication in PG 18. So, I have modified v11-0001 to
RESET this as well.
I am also working on a patch to exclude columns in a
publication, as per the suggestion in [2].

[1]: https://www.postgresql.org/message-id/CALDaNm3dWZCYDih55qTNAYsjCvYXMFv%3D46UsDWmfCnXMt3kPCg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1KRdAPC%3D5%3D7tQ1GW0cRwD%3DzaDMi%2BT4u_k4GxPhPY6e8BQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Tue, Jun 17, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
...
> I have attached a patch support excluding columns for publication.
>
> I have added a syntax: "FOR TABLE table_name EXCEPT (c1, c2, ..)"
> It can be used with CREATE or ALTER PUBLICATION.
>
> v12-0003 patch contains the changes for the same.
>

Hi Shlok,

I was interested in your new EXCEPT (col-list) so I had a quick look
at your patch v12-0003 (only looked at the documentation).

Below are some comments:

======

1. Chapter 29.5 "Column Lists".

I think the new EXCEPT syntax needs a mention here as well.

======

doc/src/sgml/catalogs.sgml

2.
+      <para>
+       This is an array of values that indicates which table columns are
+       excluded from the publication.  For example, a value of
+       <literal>1 3</literal> would mean that the columns except the first and
+       the third columns are published.
+       A null value indicates that no columns are excluded from being published.
+      </para></entry>

The sentence "A null value indicates that no columns are excluded from
being published" seems confusing: if the user has a "normal" column
list, nothing is *explicitly* excluded (using EXCEPT), yet any columns
not named are *implicitly* excluded from being published.

~

3.
TBH, I was wondering why a new catalog attribute was necessary...

Can't you simply re-use the existing "prattrs" attribute?
e.g. let's just define that negative means exclude.

e.g. a value of 1 3 means only the 1st and 3rd columns are published
e.g. a value of -1 -3 means all columns except the 1st and 3rd columns are published
e.g. a value of null means all columns are published

(mixes of negative and positive will not be possible)
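
As a rough illustration of the signed encoding suggested in comment 3, the following Python sketch decodes such a value into the set of published attribute numbers. The function name published_attnums and the plain-list representation of "prattrs" are assumptions made purely for this illustration; nothing here is taken from the patch or from PostgreSQL code.

```python
from typing import Optional, Sequence, Set


def published_attnums(prattrs: Optional[Sequence[int]],
                      all_attnums: Sequence[int]) -> Set[int]:
    """Decode a hypothetical signed 'prattrs' value.

    None          -> every column is published
    all positive  -> only the listed columns are published
    all negative  -> every column except the listed ones is published
    """
    if prattrs is None:
        return set(all_attnums)

    included = {a for a in prattrs if a > 0}
    excluded = {-a for a in prattrs if a < 0}

    # The proposal rules out empty lists of this kind and mixed signs.
    if not included and not excluded:
        raise ValueError("column list must name at least one column")
    if included and excluded:
        raise ValueError("cannot mix included and excluded columns")

    if included:
        return included
    return set(all_attnums) - excluded


# Table with four columns, numbered 1..4.
print(sorted(published_attnums([1, 3], [1, 2, 3, 4])))    # [1, 3]
print(sorted(published_attnums([-1, -3], [1, 2, 3, 4])))  # [2, 4]
print(sorted(published_attnums(None, [1, 2, 3, 4])))      # [1, 2, 3, 4]
```

The sketch also shows the cost of this encoding: every reader of the value has to inspect the signs before it can interpret the list.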

======

doc/src/sgml/ref/alter_publication.sgml

4. ALTER PUBLICATION syntax

The syntax is currently written as:
TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
]

Can't this be more simply written as:
TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
] [ WHERE ( expression ) ] [, ... ]

~~~

5.
+  <para>
+   Alter publication <structname>mypublication</structname> to add table
+   <structname>users</structname> except column
+   <structname>security_pin</structname>:
+<programlisting>
+ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);

Those tags don't seem correct. e.g. "users" and "security_pin" are not
<structname> (???).

Perhaps, every other example here is wrong too and you just copied
them? Anyway, something here looks wrong to me.

======
doc/src/sgml/ref/create_publication.sgml

6. CREATE PUBLICATION syntax

The syntax is currently written as:
TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
]

Can't this be more simply written as:
TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
] [ WHERE ( expression ) ] [, ... ]

~~~

7.
+     <para>
+      When a column list is specified with EXCEPT, the named columns are not
+      replicated. The excluded column list cannot contain generated columns. The
+      column list and excluded column list cannot be specified together.
+      Specifying a column list has no effect on <literal>TRUNCATE</literal>
+      commands.
+     </para>

IMO you don't need to say "The column list and excluded column list
cannot be specified together." because AFAIK the syntax makes that
impossible to do anyhow.

~~~

8.
+  <para>
+   Create a publication that publishes all changes for table <structname>users</structname>
+   except changes for columns <structname>security_pin</structname>:
+<programlisting>
+CREATE PUBLICATION users_safe FOR TABLE users EXCEPT (security_pin);
+</programlisting>
+  </para>

8a.
Same review comment as previously -- Those tags don't seem correct.
e.g. "users" and "security_pin" are not <structname> (???).
Again, are all the other existing tags also wrong? Maybe a new thread
is needed to address these?

~

8b.
Plural?  /except changes for columns/except changes for column/

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Wed, 18 Jun 2025 at 06:34, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, Jun 17, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ...
> > I have attached a patch support excluding columns for publication.
> >
> > I have added a syntax: "FOR TABLE table_name EXCEPT (c1, c2, ..)"
> > It can be used with CREATE or ALTER PUBLICATION.
> >
> > v12-0003 patch contains the changes for the same.
> >
>
> Hi Shlok,
>
> I was interested in your new EXCEPT (col-list) so I had a quick look
> at your patch v12-0003 (only looked at the documentation).
>
> Below are some comments:
>
> ======
>
> 1. Chapter 29.5 "Column Lists".
>
> I think new EXCEPT syntax needs a mention here as well.
>
Added

> ======
>
> doc/src/sgml/catalogs.sgml
>
> 2.
> +      <para>
> +       This is an array of values that indicates which table columns are
> +       excluded from the publication.  For example, a value of
> +       <literal>1 3</literal> would mean that the columns except the first and
> +       the third columns are published.
> +       A null value indicates that no columns are excluded from being published.
> +      </para></entry>
>
> The sentence "A null value indicates that no columns are excluded from
> being published" seems kind of confusing, because if the user has a
> "normal" column-list  although nothing was being *explicitly* excluded
> (using EXCEPT), any columns not named are *implicitly* excluded from
> being published.
>
I have removed this line.

> ~
>
> 3.
> TBH, I was wondering why a new catalog attribute was necessary...
>
> Can't you simply re-use the existing attribute "prattrs" attribute.
> e.g. let's just define negative means exclude.
>
> e.g. a value of 1 3 means only the 1st and 3rd columns are published
> e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> e.g. a value of null mean all columns are published
>
> (mixes of negative and positive will not be possible)
>

Currently I have added a new attribute 'prexcludeattrs' to the
pg_publication_rel table. I used this approach because it makes it
easier for users to get the excluded column list, and the code needs
no extra processing to obtain it.

The approach of using negative numbers for excluded columns has the
advantage that we do not need to introduce a new column in
pg_publication_rel. But in the code, each time we want to get a column
list or an excluded column list we would need extra processing of the
'prattrs' column. Also, I don't see any existing catalog table using
negative attribute numbers for a column list.

Based on the above observations, I feel that the current approach is better.

Please correct me if I missed an advantage of the approach you suggested.

> ======
>
> doc/src/sgml/ref/alter_publication.sgml
>
> 4. ALTER PUBLICATION syntax
>
> The syntax is currently written as:
> TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
> EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
> ]
>
> Can't this be more simply written as:
> TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
> ] [ WHERE ( expression ) ] [, ... ]
>
> ~~~
Fixed

>
> 5.
> +  <para>
> +   Alter publication <structname>mypublication</structname> to add table
> +   <structname>users</structname> except column
> +   <structname>security_pin</structname>:
> +<programlisting>
> +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
>
> Those tags don't seem correct. e.g. "users" and "security_pin" are not
> <structname> (???).
>
> Perhaps, every other example here is wrong too and you just copied
> them? Anyway, something here looks wrong to me.
>
I looked at different documents, and the usage of tags does not seem well
defined. For example, for table names, create_publication.sgml and
update.sgml use <structname>; table.sgml and advanced.sgml use
<classname>; and logical-replication.sgml uses <literal>. Similarly, for
column names, <structname>, <structfield> or <literal> are used in
different parts of the documentation.

I have changed the tag to <structfield> for the column in this patch.
Do you have any suggestions?

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 6. CREATE PUBLICATION syntax
>
> The syntax is currently written as:
> TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [
> EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ...
> ]
>
> Can't this be more simply written as:
> TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] )
> ] [ WHERE ( expression ) ] [, ... ]
>
> ~~~
Fixed

>
> 7.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. The excluded column list cannot contain generated columns. The
> +      column list and excluded column list cannot be specified together.
> +      Specifying a column list has no effect on <literal>TRUNCATE</literal>
> +      commands.
> +     </para>
>
> IMO you don't need to say "The column list and excluded column list
> cannot be specified together." because AFAIK the syntax makes that
> impossible to do anyhow.
>
Removed this line

> ~~~
>
> 8.
> +  <para>
> +   Create a publication that publishes all changes for table <structname>users</structname>
> +   except changes for columns <structname>security_pin</structname>:
> +<programlisting>
> +CREATE PUBLICATION users_safe FOR TABLE users EXCEPT (security_pin);
> +</programlisting>
> +  </para>
>
> 8a.
> Same review comment as previously -- Those tags don't seem correct.
> e.g. "users" and "security_pin" are not <structname> (???).
> Again, are all the other existing tags also wrong? Maybe a new thread
> needed to address these?
>
> ~
Same as point 5.
I also feel this should be addressed in a new thread.

> 8b.
> Plural?  /except changes for columns/except changes for column/
Fixed

Also, in this patch I made the \dRp+ and \d table_name psql commands
display "EXCEPT (column_list)".

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
...
> > 3.
> > TBH, I was wondering why a new catalog attribute was necessary...
> >
> > Can't you simply re-use the existing attribute "prattrs" attribute.
> > e.g. let's just define negative means exclude.
> >
> > e.g. a value of 1 3 means only the 1st and 3rd columns are published
> > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> > e.g. a value of null mean all columns are published
> >
> > (mixes of negative and positive will not be possible)
> >
>
> Currently I have added a new attribute 'prexcludeattrs' in
> pg_publication_rel table. I used this approach because it will be
> easier for user to get the exclude column list, in code no extra
> processing is required to get the exclude column list.
>
> For an approach to use negative numbers for exclude columns. I see an
> advantage that we do not need to introduce a new column for
> pg_publication_rel. But in code, each time we want to get a column
> list or exclude column list we need an extra processing of 'prattrs'
> columns. Also I don't see any existing catalog table using a negative
> attribute for column list.
>
> Based on above observations, I feel that the current is better.
>
> Please correct me if I missed an advantage for the approach you suggested.
>

OK. Maybe using negative numbers was a bridge too far...

But IMO it is not good to have 2 separate attributes for the lists.
Doing so implies they can coexist, but that is not true. I felt there
are not really 2 "kinds" of column list anyway -- there is just a
"column list" which defines columns that are either included or
excluded from the publication, as determined by EXCEPT.

Having dual lists gets weird/confusing to describe -- you end up
continually having to refer to the other one to clarify behaviour.

e.g. Does 'prattrs' value NULL mean publish everything? Well, no...
that depends if there is a non null 'prexcludeattrs'
e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well,
no... that depends if there is a non null 'prattrs'

Furthermore, all the code is doubling up referring to "column list"
and "exclude column list"  -- code / docs / comments / error messages.
There are quite a lot of places the patch touches that I thought were
not really needed if you don't have 2 different kinds of column-lists.

To summarise, I felt it would be better to just keep the existing
'prattrs' as the one-and-only column list, but add another BOOLEAN
attribute to flag whether 'prattrs' columns should be included or
excluded.

prattrs  prattrs_exclude  Means
-----------------------------------------------------------------
1 2 3    f                only cols 1,2,3 will be published
4 5 6    t                only cols 4,5,6 will NOT be published
null     f                all cols are published (flag is ignored)
null     t                all cols are published (flag is ignored)
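
To make the table above concrete, here is a minimal Python sketch of how a single column list plus a boolean flag could be interpreted. The function name columns_to_publish and the list-of-integers representation are hypothetical and chosen only for illustration; the actual patch may handle this differently.

```python
from typing import Optional, Sequence, Set


def columns_to_publish(prattrs: Optional[Sequence[int]],
                       prattrs_exclude: bool,
                       all_attnums: Sequence[int]) -> Set[int]:
    """Interpret one column list together with an include/exclude flag."""
    if prattrs is None:
        # No column list at all: publish everything, the flag is ignored.
        return set(all_attnums)

    listed = set(prattrs)
    if prattrs_exclude:
        # EXCEPT (...): publish every column that is NOT listed.
        return set(all_attnums) - listed
    # Plain column list: publish only the listed columns.
    return listed


# The four rows of the table, for a table with columns 1..6.
cols = [1, 2, 3, 4, 5, 6]
print(sorted(columns_to_publish([1, 2, 3], False, cols)))  # [1, 2, 3]
print(sorted(columns_to_publish([4, 5, 6], True, cols)))   # [1, 2, 3]
print(sorted(columns_to_publish(None, False, cols)))       # [1, 2, 3, 4, 5, 6]
print(sorted(columns_to_publish(None, True, cols)))        # [1, 2, 3, 4, 5, 6]
```

With this shape there is only one list to describe, and the question "does NULL mean publish everything?" has a single answer regardless of the flag.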

> > 5.
> > +  <para>
> > +   Alter publication <structname>mypublication</structname> to add table
> > +   <structname>users</structname> except column
> > +   <structname>security_pin</structname>:
> > +<programlisting>
> > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> >
> > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > <structname> (???).
> >
> > Perhaps, every other example here is wrong too and you just copied
> > them? Anyway, something here looks wrong to me.
> >
> I saw different documents and usage of tags seems not well defined.
> For example for table we are using tags in document
> create_publication.sgml, update.sgml <structname> is used, in document
> table.sgml, advanced.sgml <classname> is used, and in
> logical-replication.sgml <literal>  is used. Similarly for column
> names <structname>, <structfield> or <literal> are used in different
> parts of the document.
>
> I kept the changed tag to <structfield> for the column for this patch.
> Do you have any suggestions?

No, for this patch I think it is best that you just follow nearby code
(as you are already doing). I plan to raise another thread to ask what
the guidelines are for this sort of markup, which is currently used
inconsistently in different places.

//////////

Below are a few more review comments for v13-0003

======
Commit message

1.
Typo /THe/The/

~~~

2.
The new syntax allows specifying excluded column list when creating or
altering a publication. For example:
CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list)
or
ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list)

~

Since you introduce these with "For example:", I felt it would be better
to give real examples.
e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)".

~~~

3.
Typo /family of command/family of commands/

======
doc/src/sgml/logical-replication.sgml

4.
I am not sure that it was a good idea to make a new term called
an "exclude column list"... because it introduces a new concept of
something that sounds like it is a different kind of list, and now you
have to keep referring everywhere to both "column list" and
"exclude column list". All the doubling up adds more complication, I
think.

IMO really there is just a "column list". Whether that list is for
exclusion or not just depends on the presence of EXCEPT. So I felt
maybe all places mentioning "exclude column list" could be rephrased.

======
src/backend/catalog/pg_publication.c

5.
+/*
+ * Returns true if the relation has exluded column list associated with the
+ * publication, false otherwise.
+ *
+ * If a exclude column list is found, the corresponding bitmap is returned
+ * through the cols parameter, if provided. The bitmap is constructed within the
+ * given memory context (mcxt).
+ */
+

Typo /exluded column/an excluded column/
Typo /exclude column list/excluded column list/

~~~

6.
+/*
+ * pub_exclude_collist_validate
+ * Process and validate the 'excluded columns' list and ensure the columns
+ * are all valid to exclude from publication.  Checks for and raises an
+ * ERROR for any unknown columns, system columns, duplicate columns, or
+ * generated columns.
+ *

Why can't you exclude generated columns?

e.g. Maybe the PUBLICATION says publish_generated_columns=stored and there
are 100s of such columns, but the user just wants to exclude one of
them. Why say they cannot do that? Hmm. Perhaps this is already being
handled elsewhere, in which case this comment still seems misleading.

======
src/backend/commands/publicationcmds.c

7.
+ * With REPLICA IDENTITY FULL, no column list and no excluded column
+ * list is allowed.

Really, just "no column list is allowed." same as it said before.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
> > > 5.
> > > +  <para>
> > > +   Alter publication <structname>mypublication</structname> to add table
> > > +   <structname>users</structname> except column
> > > +   <structname>security_pin</structname>:
> > > +<programlisting>
> > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> > >
> > > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > > <structname> (???).
> > >
> > > Perhaps, every other example here is wrong too and you just copied
> > > them? Anyway, something here looks wrong to me.
> > >
> > I saw different documents and usage of tags seems not well defined.
> > For example for table we are using tags in document
> > create_publication.sgml, update.sgml <structname> is used, in document
> > table.sgml, advanced.sgml <classname> is used, and in
> > logical-replication.sgml <literal>  is used. Similarly for column
> > names <structname>, <structfield> or <literal> are used in different
> > parts of the document.
> >
> > I kept the changed tag to <structfield> for the column for this patch.
> > Do you have any suggestions?
>
> No, for this patch I think it is best that you just follow nearby code
> (as you are already doing). I plan to raise another thread to ask what
> are the guidelines for this  sort of markup which is currently used
> inconsistently in different places.
>

FYI - I created a new thread asking this markup question [1].

======
[1] https://www.postgresql.org/message-id/CAHut%2BPvtf24r%2BbdPgBind84dBLPvgNL7aB%2B%3DHxAUupdPuo2gRg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Fri, 20 Jun 2025 at 09:28, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ...
> > > 3.
> > > TBH, I was wondering why a new catalog attribute was necessary...
> > >
> > > Can't you simply re-use the existing attribute "prattrs" attribute.
> > > e.g. let's just define negative means exclude.
> > >
> > > e.g. a value of 1 3 means only the 1st and 3rd columns are published
> > > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> > > e.g. a value of null mean all columns are published
> > >
> > > (mixes of negative and positive will not be possible)
> > >
> >
> > Currently I have added a new attribute 'prexcludeattrs' in
> > pg_publication_rel table. I used this approach because it will be
> > easier for user to get the exclude column list, in code no extra
> > processing is required to get the exclude column list.
> >
> > For an approach to use negative numbers for exclude columns. I see an
> > advantage that we do not need to introduce a new column for
> > pg_publication_rel. But in code, each time we want to get a column
> > list or exclude column list we need an extra processing of 'prattrs'
> > columns. Also I don't see any existing catalog table using a negative
> > attribute for column list.
> >
> > Based on above observations, I feel that the current is better.
> >
> > Please correct me if I missed an advantage for the approach you suggested.
> >
>
> OK. Maybe using negative numbers was a bridge too far...
>
> But IMO it is not good to have 2 separate attributes for the lists.
> Doing so implies they can coexist, but that is not true. I felt there
> are not really 2 "kinds" of columns list anyway -- there is just a
> "column list" which defines columns that are either included or
> excluded from the publication determined by EXCEPT.
>
> Having  dual lists gets weird/confusing to describe them -- you end up
> continually having to refer to the other one to clarify behaviour.
>
> e.g. Does 'prattrs' value NULL mean publish everything? Well, no...
> that depends if there is a non null 'prexcludeattrs'
> e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well,
> no... that depends if there is a non null 'prattrs'
>
> Furthermore, all the code is doubling up referring to "column list"
> and "exclude column list"  -- code / docs / comments / error messages.
> There are quite a lot of places the patch touches that I thought were
> not really needed if you don't have 2 different kinds of column-lists.
>
> To summarise, I felt it would be better to just keep the existing
> 'prattrs' as the one-and-only column list, but add another BOOLEAN
> attribute to flag whether 'prattrs' columns should be included or
> excluded.
>
> prattrs;   prattrs_exclude;  Means
> --------------------------------------------
> 1 2 3     f                          only cols 1,2,3 will be published
> 4 5 6     t                          only cols 4,5,6 will NOT be published
> null       f                          all cols are published (flag is ignored)
> null       t                          all cols are published (flag is ignored)
>

I agree with your point; that would be a better approach. In
patch 0001 a column 'prexcept' was added to pg_publication_rel. We
use it only for publications on all tables. I have reused this
column for patch 0003. If the publication is not for all tables and the
'prexcept' flag is true, it implies that the columns in 'prattrs' are
to be excluded from being published. I have included the changes for
this in the v14-0003 patch.

> > > 5.
> > > +  <para>
> > > +   Alter publication <structname>mypublication</structname> to add table
> > > +   <structname>users</structname> except column
> > > +   <structname>security_pin</structname>:
> > > +<programlisting>
> > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> > >
> > > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > > <structname> (???).
> > >
> > > Perhaps, every other example here is wrong too and you just copied
> > > them? Anyway, something here looks wrong to me.
> > >
> > I saw different documents and usage of tags seems not well defined.
> > For example for table we are using tags in document
> > create_publication.sgml, update.sgml <structname> is used, in document
> > table.sgml, advanced.sgml <classname> is used, and in
> > logical-replication.sgml <literal>  is used. Similarly for column
> > names <structname>, <structfield> or <literal> are used in different
> > parts of the document.
> >
> > I kept the changed tag to <structfield> for the column for this patch.
> > Do you have any suggestions?
>
> No, for this patch I think it is best that you just follow nearby code
> (as you are already doing). I plan to raise another thread to ask what
> are the guidelines for this  sort of markup which is currently used
> inconsistently in different places.
Thanks for starting a thread for it.

>
> //////////
>
> Below are a few more review comments for v13-0003
>
> ======
> Commit message
>
> 1.
> Typo /THe/The/
>
> ~~~
Fixed

> 2.
> The new syntax allows specifying excluded column list when creating or
> altering a publication. For example:
> CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list)
> or
> ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list)
>
> ~
>
> I felt since you say these "For example:" it would be better to give
> real examples.
> e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)".
>
Fixed

> ~~~
>
> 3.
> Typo /family of command/family of commands/
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 4.
> I am not sure that it was a good idea to be making a new term called
> an "exclude column list"... because in introduces a new concept of
> something that sounds like it is a different kind of list, and now you
> have to keep referring everywhere to both to "column list" versus
> "exclude column list". All the doubling up add more complication I
> think.
>
> IMO really there is just a "column list". Whether that list is for
> exclusion or not just depends on the presence of EXCEPT. So I felt
> maybe all places mentioning "exclude column list" could be rephrased.
>
> ======
> src/backend/catalog/pg_publication.c
>
> 5.
> +/*
> + * Returns true if the relation has exluded column list associated with the
> + * publication, false otherwise.
> + *
> + * If a exclude column list is found, the corresponding bitmap is returned
> + * through the cols parameter, if provided. The bitmap is constructed within the
> + * given memory context (mcxt).
> + */
> +
>
> Typo /exluded column/an excluded column/
> Typo /exclude column list/excluded column list/
>
Updated the comment according to the latest implementation.

> ~~~
>
> 6.
> +/*
> + * pub_exclude_collist_validate
> + * Process and validate the 'excluded columns' list and ensure the columns
> + * are all valid to exclude from publication.  Checks for and raises an
> + * ERROR for any unknown columns, system columns, duplicate columns, or
> + * generated columns.
> + *
>
> Why can't you exclude generated columns?
>
> e.g. Maybe PUBLICATION says publish_generated_columns=stored and there
> are 100s of such columns, but the user just wants to exclude one of
> them. Why say they cannot do that? Hmm. Perhaps this is being already
> handled elsewhere, in which case this comment still seems misleading.
>
I have removed this restriction. Now we can specify stored generated
columns in EXCEPT (column_list) when we use the
'publish_generated_columns' flag.

> ======
> src/backend/commands/publicationcmds.c
>
> 7.
> + * With REPLICA IDENTITY FULL, no column list and no excluded column
> + * list is allowed.
>
> Really, just "no column list is allowed." same as it said before.
>
> ======
Fixed

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok.

Below are some review comments for v14-0003

======
1. GENERAL

Since the new syntax uses EXCEPT, then, in my opinion, you should try
to use that same term where possible when describing things. I
understand it is hard to do this in text and I agree often it makes
more sense to say "exclude" columns etc, but OTOH in the code there
are lots of places where you could have named vars/params differently:
e.g. 'except_collist' instead of 'exclude_collist' might have been
better.

======
Commit message

2.
Column list specifed with EXCEPT is stored in column "prattrs" in table
"pg_publication_rel" and also column "prexcept" is set to "true", to maintain
the column list that user wants to exclude from the publication.

~

That paragraph could do with some rewording. For example, AFAIK,
"prattrs" is for all column lists -- not just except col-lists, but
the way it is described here sounds different.

Also, /specifed/specified/

======
doc/src/sgml/catalogs.sgml

3. (52.42. pg_publication_rel)

       <para>
-       True if the relation must be excluded
+       True if the relation or column list must be excluded. If publication is
+       created <literal>FOR ALL TABLES</literal> and it is specified as true,
+       the relation should be excluded. Else if it is true the columns in
+       <literal>prattrs</literal> should be excluded from being published.
       </para></entry>

I felt this could be expressed more simply without mentioning anything
about FOR ALL TABLES.

SUGGESTION
True if the column list or relation must be excluded from publication.
If a column list is specified in <literal>prattrs</literal>, then
exclude only those columns. If <literal>prattrs</literal> is NULL,
then exclude the entire relation.
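
To make the suggested wording concrete, here is a minimal sketch of the four combinations of "prexcept" and "prattrs" it implies. The enum `PubRelKind`, the struct `PubRelRow`, and the function `classify_pubrel` are hypothetical names used only for illustration; the real catalog code may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a pg_publication_rel row. */
typedef struct PubRelRow
{
    bool        prexcept;        /* value of the prexcept column */
    bool        prattrs_isnull;  /* true when prattrs is NULL */
} PubRelRow;

typedef enum PubRelKind
{
    PUBREL_WHOLE_TABLE,      /* prexcept = false, prattrs IS NULL */
    PUBREL_INCLUDE_COLUMNS,  /* prexcept = false, prattrs set */
    PUBREL_EXCLUDE_TABLE,    /* prexcept = true,  prattrs IS NULL */
    PUBREL_EXCLUDE_COLUMNS   /* prexcept = true,  prattrs set */
} PubRelKind;

/* Decide what the row means using only the two catalog fields. */
static PubRelKind
classify_pubrel(const PubRelRow *row)
{
    assert(row != NULL);

    if (row->prexcept)
        return row->prattrs_isnull ? PUBREL_EXCLUDE_TABLE
                                   : PUBREL_EXCLUDE_COLUMNS;

    return row->prattrs_isnull ? PUBREL_WHOLE_TABLE
                               : PUBREL_INCLUDE_COLUMNS;
}
```

Note that nothing here needs to know whether the publication is FOR ALL TABLES, which is the point of the simpler wording.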

======
doc/src/sgml/logical-replication.sgml

4. (29.5. Column Lists)

   <para>
-   Each publication can optionally specify which columns of each table are
-   replicated to subscribers. The table on the subscriber side must have at
-   least all the columns that are published. If no column list is specified,
-   then all columns on the publisher are replicated.
+   Each publication can optionally specify which columns of each table should be
+   replicated or excluded from replication. On the subscriber side, the table
+   must include at least all the columns that are published. If no column list
+   is provided, all columns from the publisher are replicated by default.
    See <xref linkend="sql-createpublication"/> for details on the syntax.
   </para>

I felt this patch may have changed too much text. IMO, you only needed
to say "... are replicated or excluded from replication.". The other
changes did not seem necessary.

~~~

5.
   <para>
-   If no column list is specified, any columns added to the table later are
-   automatically replicated. This means that having a column list which names
-   all columns is not the same as having no column list at all.
+   If no column list or a column list with EXCEPT is specified, any columns
+   added to the table later are automatically replicated. This means that having
+   a column list which names all columns is not the same as having no
+   column list at all. If an column list is specified, any columns added to the
+   table later are automatically replicated.
   </para>

5a.
"This means that having a column list which names all columns is not
the same as having no column list at all." -- That note does not make
sense when you say EXCEPT. I think some rewording is needed here.

~

5b.
"If an column list is specified, any columns added to the table later
are automatically replicated.".

This made no sense -- some words missing?

~~~

6.
    Generated columns can also be specified in a column list. This allows
    generated columns to be published, regardless of the publication parameter
    <link linkend="sql-createpublication-params-with-publish-generated-columns">
-   <literal>publish_generated_columns</literal></link>. See
-   <xref linkend="logical-replication-gencols"/> for details.
+   <literal>publish_generated_columns</literal></link>. Generated columns can
+   be included in column list specified with EXCEPT clause if publication
+   parameter
+   <link linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link> is not set to
+   <literal>none</literal>. Specified generated columns will not be published.
+   See <xref linkend="logical-replication-gencols"/> for details.
   </para>

I am not so sure about this. It seemed overly strict to me.

Why can't it simply say:
"Generated columns can also be specified in a column list. This allows
generated columns to be published or excluded, regardless of the
publication parameter..."

Specifically, I don't know why you need to say:
Generated columns can be included in column list specified with EXCEPT
clause if publication parameter publish_generated_columns is not set
to none. Specified generated columns will not be published.

IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named
cols. So if param is "stored", then the named cols will be excluded.
OTOH, if param is "none" then all generated cols will be excluded
anyway, so why not just allow the EXCEPT (gencol,gencol2) here as
well, because the result will be the same.


~~~

7. (29.5.1. Examples)

    <para>
-    Create a table <literal>t1</literal> to be used in the following example.
+    Create tables <literal>t1</literal>, <literal>t2</literal> to be used in the
+    following example.

/Create tables t1, t2/Create tables t1 and t2/

~~~

8.
    <para>
     Create a publication <literal>p1</literal>. A column list is defined for
-    table <literal>t1</literal> to reduce the number of columns that will be
-    replicated. Notice that the order of column names in the column list does
-    not matter.
+    table <literal>t1</literal> and a column list is defined for table
+    <literal>t2</literal> with EXCEPT clause to reduce the number of columns that will be
+    replicated. Notice that the order of column names in the column lists does not matter.

BEFORE
A column list is defined for table t1 and a column list is defined for
table t2...

SUGGESTION (added comma, etc.)
A column list is defined for table t1, and another column list is
defined for table t2...

~~~

9.
The final example still says:
"Only data from the column list of publication p1 is replicated."

That doesn't seem quite appropriate now that you also have an EXCEPT
column list.

SUGGESTION:
Only data specified by the column lists of publication p1 is replicated.

======
doc/src/sgml/ref/create_publication.sgml

10.
+     <para>
+      When a column list is specified with EXCEPT, the named columns are not
+      replicated. Specifying a column list has no effect on
+      <literal>TRUNCATE</literal> commands.
+     </para>

I felt that to be clearer the preceding paragraph should be changed as follows:

/When a column list is specified, only the named columns are
replicated./When a column list without EXCEPT is specified, only the
named columns are replicated./

~~~

11. CREATE PUBLICATION (NOTES section)

11a.
The NOTES talk about replica identity columns -- should you mention EXCEPT here?

~

11b.
The NOTES talk about generated columns -- should you mention EXCEPT here?

======
src/backend/catalog/pg_publication.c

12. check_and_fetch_column_list

+ if (!isnull)
+ except = DatumGetBool(cfdatum);
+
+ *except_columns = except && !pub->alltables;

AFAICT, you can Assert(!pub->alltables) because you already checked
that earlier up front.
So you don't need 'except' var either. Just assign *except_cols up
front and then overwrite it later if true.

SUGGESTION:

*except_cols = false;

if (pub->alltables)
  return false;
...
if (!isnull)
 *except_cols = DatumGetBool(cfdatum);
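
The control flow suggested above can be sketched as a small standalone model. Everything here is an assumption made for illustration: `MockPublication`, `MockPubRel`, and `mock_check_column_list` are hypothetical stand-ins, not the real `check_and_fetch_column_list`, and the model assumes the flag is only meaningful when a column list is stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the publication and catalog row. */
typedef struct MockPublication
{
    bool        alltables;
} MockPublication;

typedef struct MockPubRel
{
    bool        prattrs_isnull; /* no column list stored */
    bool        prexcept;
} MockPubRel;

/*
 * Return true if the row carries a column list; *except_cols reports
 * whether that list was specified with EXCEPT.
 */
static bool
mock_check_column_list(const MockPublication *pub, const MockPubRel *row,
                       bool *except_cols)
{
    assert(pub != NULL && except_cols != NULL);

    /* Assign the default up front so every early return leaves it defined. */
    *except_cols = false;

    /* FOR ALL TABLES publications cannot have column lists. */
    if (pub->alltables)
        return false;

    if (row == NULL || row->prattrs_isnull)
        return false;

    /* Past this point pub->alltables is known to be false. */
    *except_cols = row->prexcept;
    return true;
}
```

Setting the output parameter first removes the need for a local 'except' variable and for re-testing pub->alltables later.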

~~~

13. publication_add_relation

  /* Validate and translate column names into a Bitmapset of attnums. */
- attnums = pub_collist_validate(pri->relation, pri->columns);
+ attnums = pub_collist_validate(pri->relation, pri->columns,
+    pri->except && !pub->alltables,
+    pub->pubgencols_type);


I am wondering why we are even calling a function to validate column
lists if pub->alltables was true. AFAIK, that combination of
column-lists and FOR ALL TABLES is not even possible, so the code
seems strange.

~~~

14. pub_exclude_collist_validate
.
+ /*
+ * Check if column list specified with EXCEPT have any stored
+ * generated column and 'publish_generated_columns' is not set to
+ * 'stored'.
+ */
+ if (except_columns &&
+ TupleDescAttr(tupdesc, attnum - 1)->attgenerated == ATTRIBUTE_GENERATED_STORED &&
+ pubgencols_type != PUBLISH_GENCOLS_STORED)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot use stored generated column \"%s\" in publication column list specified with EXCEPT when \"%s\" set to \"%s\"",
+    colname, "publish_generated_columns", "stored"));

As mentioned in the above DOCS comments, I was having doubts about why
we have this error.

If the parameter says "none", then generated columns will not be
replicated, so why should we care if the user also says
EXCEPT(gencol1,gencol2). Either way, the result will be the same; the
generated column will not be published.
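
The argument above can be checked with a tiny truth-table model. This is only a sketch under stated assumptions: `GenColsSetting` and `stored_gencol_is_published` are hypothetical names, only the "none" and "stored" settings mentioned in this thread are modelled, and the table is assumed to have either no column list or an EXCEPT column list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the publish_generated_columns settings discussed here. */
typedef enum GenColsSetting
{
    GENCOLS_NONE,
    GENCOLS_STORED
} GenColsSetting;

/*
 * Would a stored generated column be published, given the publication
 * parameter and whether the column is named in an EXCEPT column list?
 */
static bool
stored_gencol_is_published(GenColsSetting setting, bool named_in_except)
{
    assert(setting == GENCOLS_NONE || setting == GENCOLS_STORED);

    /* Naming the column in EXCEPT always withholds it. */
    if (named_in_except)
        return false;

    /* Otherwise the publication parameter decides. */
    return setting == GENCOLS_STORED;
}
```

With the setting at "none" the function returns false whether or not the column is named in EXCEPT, so raising an error for that combination protects against nothing.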

~~~

15. GetRelationPublications

  {
  HeapTuple tup = &pubrellist->members[i]->tuple;
  Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
+ HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
+ bool is_table_excluded = ((Form_pg_publication) GETSTRUCT(pubtup))->puballtables &&
+ ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept;

- if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
+ if (except_flag == is_table_excluded)
  result = lappend_oid(result, pubid);
+
+ ReleaseS


I'm not 100% sure you need the additional 'pubtup'... Can't you just
look at the "prattrs" field to see if a column-list was specified? If
"prattrs" is null and "prexcept" is true, isn't that the same
combination as what you are looking for here?

~~~

16. pg_get_publication_tables

+ columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
+    Anum_pg_publication_rel_prattrs,
+    &(nulls[2]));
+
+ /* if column list is specified with EXCEPT */
+ if (!pub->alltables && except)
+ columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL);
+ else
+ values[2] = columnsDatum;

16a.
Something seems fishy here. Isn't there a pathway where you missed
assigning values[2] to anything?

~

16b.
Also, I feel there should be some other boolean variable used here
instead of checking both (!pub->alltables && except) in multiple
places.


======
src/backend/replication/pgoutput/pgoutput.c

17. RelationSyncEntry
+
+ /* Indicate if no column is included in the publication */
+ bool no_cols_published;

Maybe this can have a more explanatory comment to explain why it is needed?

~~~

18. check_and_init_gencol

+ bool found = false;
+ bool except_columns = false;
+
+ found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL,
+ NULL, &except_columns);
+
  /*
  * The column list takes precedence over the
  * 'publish_generated_columns' parameter. Those will be checked later,
- * see pgoutput_column_list_init.
+ * see pgoutput_column_list_init. But when a column list is specified
+ * with EXCEPT, it should be checked.
  */
- if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL))
+ if (found && !except_columns)
  continue;

The variable 'found' seems a poor name; how about 'has_column_list' or similar?

~~~

19. pgoutput_change

+ /*
+ * If all columns of a table is present in column list specified with
+ * EXCEPT, skip publishing the changes.
+ */
+ if (relentry->no_cols_published)
+ return;

/is present/are present/
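
For readers wondering why such a flag is needed at all, here is a minimal sketch of the condition it presumably caches: every column of the table is named in the EXCEPT list, so nothing is left to send. The function `all_columns_excluded` and its boolean-array representation are hypothetical; the patch may compute this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * is_excepted[i] is true when column i+1 is named in the EXCEPT list.
 * Returns true when no column remains to be published.
 */
static bool
all_columns_excluded(const bool *is_excepted, size_t natts)
{
    assert(is_excepted != NULL || natts == 0);

    for (size_t i = 0; i < natts; i++)
    {
        /* One surviving column is enough to keep publishing changes. */
        if (!is_excepted[i])
            return false;
    }

    return true;
}
```

Computing this once per relation entry and storing the result avoids rescanning the column list for every change.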

======
src/bin/pg_dump/pg_dump.c

20. getPublicationTables

+ if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
  pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
+ else
+ pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;

  pubrinfo[j].dobj.catId.tableoid =
  atooid(PQgetvalue(res, i, i_tableoid));
@@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
  pubrinfo[j].pubrelqual = NULL;
  else
  pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
+ pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0);


Why not assign pubrinfo[j].pubexcept earlier so you don't have to
repeat the strcmp?

~~~

21.
- if (strcmp(prexcept, "t") == 0)
+ if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
  simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);

Why not assign pubrinfo[j].pubexcept earlier so you don't have to
repeat the strcmp? Same also for the PQgetisnull(res, i,
i_prattrs))...
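
One way to read comments 20 and 21 together is to evaluate both conditions once and reuse the results. The sketch below shows that shape with hypothetical names (`MockPubRelInfo`, `except_whole_rel`, `fill_except_flags`); only `pubexcept` appears in the patch hunk above, and plain arguments stand in for the `PQgetvalue`/`PQgetisnull` results.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for pg_dump's per-relation info. */
typedef struct MockPubRelInfo
{
    bool        pubexcept;          /* prexcept column was 't' */
    bool        except_whole_rel;   /* EXCEPT with no column list */
} MockPubRelInfo;

static void
fill_except_flags(MockPubRelInfo *info, const char *prexcept,
                  bool prattrs_isnull)
{
    assert(info != NULL && prexcept != NULL);

    /* Evaluate each condition once and reuse the results afterwards. */
    info->pubexcept = (strcmp(prexcept, "t") == 0);
    info->except_whole_rel = info->pubexcept && prattrs_isnull;
}
```

Later code could then test `except_whole_rel` both when choosing the object type and when appending to the except list, instead of repeating the strcmp.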

~~~

22. dumpPublicationTable

  if (pubrinfo->pubrattrs)
- appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
+ {
+ if (pubrinfo->pubexcept)
+ appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs);
+ else
+ appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
+ }

SUGGESTION
{
  if (pubrinfo->pubexcept)
    appendPQExpBuffer(query, " EXCEPT");

  appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
}
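
The effect of the suggested shape can be tried outside pg_dump with an ordinary character buffer. This is a sketch only: `append_column_clause` is a hypothetical helper, and `snprintf` on a fixed buffer stands in for `appendPQExpBuffer`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Append the optional column-list clause to 'buf'.
 * A NULL 'attrs' means the table has no column list.
 */
static void
append_column_clause(char *buf, size_t buflen, const char *attrs, bool except)
{
    size_t      used;

    assert(buf != NULL);

    if (attrs == NULL)
        return;

    used = strlen(buf);
    assert(used < buflen);

    /* The parenthesised list is emitted once; only the keyword is optional. */
    snprintf(buf + used, buflen - used, "%s (%s)",
             except ? " EXCEPT" : "", attrs);
}
```

Starting from "ADD TABLE users", calling it with "security_pin" and except = true yields "ADD TABLE users EXCEPT (security_pin)", while except = false yields the plain column-list form.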

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
shveta malik
Date:
On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
>  I have included the changes for
> it in v14-0003 patch.
>
Thanks for the patches. I have reviewed patch 0001 alone; please find
a few comments below:

1)
+  <para>
+   The <literal>RESET</literal> clause will reset the publication to the
+   default state which includes resetting the publication parameters, setting
+   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
+   dropping all relations and schemas that are associated with the
+   publication.
   </para>

This is misleading. As far as I understand, we do not drop the tables
or schemas associated with the publication; we just remove them from
the publication's object list. See the existing doc:
"The ADD and DROP clauses will add and remove one or more
tables/schemas from the publication"

Perhaps we should describe the 'drop' aspect of RESET in the same
way.

2)
AlterPublicationReset():

+ if (!OidIsValid(prid))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("relation \"%s\" is not part of the publication",
+ get_rel_name(relid))));

Can you please help me understand which scenario will give this error?

Another question is do we really need this error? IIUC, we generally
give errors if a user has explicitly called out a name of an object
and that object is not found. Example:

postgres=# alter publication pubnew drop table t1,tab2;
ERROR:  relation "t1" is not part of the publication

While in a few other cases, we pass missing_okay as true and do not
give errors. Please see other callers of performDeletion in
publicationcmds.c itself. There we have usage of missing_okay=true. I
have not researched myself, but please analyze the cases where
missing_okay is passed as true to figure out if those match our RESET
case. Try to reproduce if possible and then take a call.

3)
+ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
+ERROR:  syntax error at or near "ALL"
+LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub...

There is a syntax problem here. I think the intention of the test case
was for this query to run successfully.

thanks
Shveta



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 26 Jun 2025 at 09:06, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Below are some review comments for v14-0003
>
> ======
> 1. GENERAL
>
> Since the new syntax uses EXCEPT, then, in my opinion, you should try
> to use that same term where possible when describing things. I
> understand it is hard to do this in text and I agree often it makes
> more sense to say "exclude" columns etc, but OTOH in the code there
> are lots of places where you could have named vars/params differently:
> e.g. 'except_collist' instead of 'exclude_collist' might have been
> better.
>
Fixed the variable names.

> ======
> Commit message
>
> 2.
> Column list specifed with EXCEPT is stored in column "prattrs" in table
> "pg_publication_rel" and also column "prexcept" is set to "true", to maintain
> the column list that user wants to exclude from the publication.
>
> ~
>
> That paragraph could do with some rewording. For example, AFAIK,
> "prattrs" is for all column lists -- not just except col-lists, but
> the way it is described here sounds different.
>
> Also, /specifed/specified/
>
Reworded the paragraph

> ======
> doc/src/sgml/catalogs.sgml
>
> 3. (52.42. pg_publication_rel)
>
>        <para>
> -       True if the relation must be excluded
> +       True if the relation or column list must be excluded. If publication is
> +       created <literal>FOR ALL TABLES</literal> and it is specified as true,
> +       the relation should be excluded. Else if it is true the columns in
> +       <literal>prattrs</literal> should be excluded from being published.
>        </para></entry>
>
> I felt this could be expressed more simply without mentioning anything
> about FOR ALL TABLES.
>
> SUGGESTION
> True if the column list or relation must be excluded from publication.
> If a column list is specified in <literal>prattrs</literal>, then
> exclude only those columns. If <literal>prattrs</literal> is NULL,
> then exclude the entire relation.
>
Fixed

> ======
> doc/src/sgml/logical-replication.sgml
>
> 4. (29.5. Column Lists)
>
>    <para>
> -   Each publication can optionally specify which columns of each table are
> -   replicated to subscribers. The table on the subscriber side must have at
> -   least all the columns that are published. If no column list is specified,
> -   then all columns on the publisher are replicated.
> +   Each publication can optionally specify which columns of each
> table should be
> +   replicated or excluded from replication. On the subscriber side, the table
> +   must include at least all the columns that are published. If no column list
> +   is provided, all columns from the publisher are replicated by default.
>     See <xref linkend="sql-createpublication"/> for details on the syntax.
>    </para>
>
> I felt this patch may have changed too much text. IMO, you only needed
> to say "... are replicated or excluded from replication.". The other
> changes did not seem necessary.
>
> ~~~
Fixed

> 5.
>    <para>
> -   If no column list is specified, any columns added to the table later are
> -   automatically replicated. This means that having a column list which names
> -   all columns is not the same as having no column list at all.
> +   If no column list or a column list with EXCEPT is specified, any columns
> +   added to the table later are automatically replicated. This means
> that having
> +   a column list which names all columns is not the same as having no
> +   column list at all. If an column list is specified, any columns added to the
> +   table later are automatically replicated.
>    </para>
>
> 5a.
> "This means that having a column list which names all columns is not
> the same as having no column list at all." -- That note does not make
> sense when you say EXCEPT. I think some rewording is needed here.
>
Fixed

> ~
>
> 5b.
> "If an column list is specified, any columns added to the table later
> are automatically replicated.".
>
> This made no sense -- some words missing?
>
This change was done by mistake. Removed it.

> ~~~
>
> 6.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> -   <literal>publish_generated_columns</literal></link>. See
> -   <xref linkend="logical-replication-gencols"/> for details.
> +   <literal>publish_generated_columns</literal></link>. Generated columns can
> +   be included in column list specified with EXCEPT clause if publication
> +   parameter
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> is not set to
> +   <literal>none</literal>. Specified generated columns will not be published.
> +   See <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> I am not so sure about this. It seemed overly strict to me.
>
> Why can't it simply say:
> "Generated columns can also be specified in a column list. This allows
> generated columns to be published or excluded, regardless of the
> publication parameter..."
>
> Specifically, I don't know why you need to say:
> Generated columns can be included in column list specified with EXCEPT
> clause if publication parameter publish_generated_columns is not set
> to none. Specified generated columns will not be published.
>
> IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named
> cols. So if param is "stored", then the named cols will be excluded.
> OTOH, if param is "none" then all generated cols will be excluded
> anyway, so why not just allow the EXCEPT (gencol,gencol2) here as
> well, because the result will be the same.
>
>
I have removed this change and now allow generated columns to be
specified in an EXCEPT column list as well, irrespective of the value
of 'publish_generated_columns'.

> ~~~
>
> 7. (29.5.1. Examples)
>
>     <para>
> -    Create a table <literal>t1</literal> to be used in the following example.
> +    Create tables <literal>t1</literal>, <literal>t2</literal> to be
> used in the
> +    following example.
>
> /Create tables t1, t2/Create tables t1 and t2/
>
Fixed

> ~~~
>
> 8.
>     <para>
>      Create a publication <literal>p1</literal>. A column list is defined for
> -    table <literal>t1</literal> to reduce the number of columns that will be
> -    replicated. Notice that the order of column names in the column list does
> -    not matter.
> +    table <literal>t1</literal> and a column list is defined for table
> +    <literal>t2</literal> with EXCEPT clause to reduce the number of
> columns that will be
> +    replicated. Notice that the order of column names in the column
> lists does not matter.
>
> BEFORE
> A column list is defined for table t1 and a column list is defined for
> table t2...
>
> SUGGESTION (added comma, etc.)
> A column list is defined for table t1, and another column list is
> defined for table t2...
>
Fixed

> ~~~
>
> 9.
> The final example still says:
> "Only data from the column list of publication p1 is replicated."
>
> That doesn't seem quite appropriate now that you also have an EXCEPT
> column list.
>
> SUGGESTION:
> Only data specified by the column lists of publication p1 is replicated.
>
Fixed

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 10.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands.
> +     </para>
>
> I felt that to be clearer the preceding paragraph should be changed as follows:
>
> /When a column list is specified, only the named columns are
> replicated./When a column list without EXCEPT is specified, only the
> named columns are replicated./
>
Fixed

> ~~~
>
> 11. CREATE PUBLICATION (NOTES section)
>
> 11a.
> The NOTES talk about replica identity columns -- should you mention EXCEPT here?
>
Added notes for EXCEPT

> ~
>
> 11b.
> The NOTES talk about generated columns -- should you mention EXCEPT here?
>
I felt it is not needed.

> ======
> src/backend/catalog/pg_publication.c
>
> 12. check_and_fetch_column_list
>
> + if (!isnull)
> + except = DatumGetBool(cfdatum);
> +
> + *except_columns = except && !pub->alltables;
>
> AFAICT, you can Assert(!pub->alltables) because you already checked
> that earlier up front.
> So you don't need 'except' var either. Just assign *except_cols up
> front and then overwrite it later if true.
>
> SUGGESTION:
>
> *except_cols = false;
>
> if (pub->alltables)
>   return false;
> ...
> if (!isnull)
>  *except_cols = DatumGetBool(cfdatum);
>
Fixed

> ~~~
>
> 13. publication_add_relation
>
>   /* Validate and translate column names into a Bitmapset of attnums. */
> - attnums = pub_collist_validate(pri->relation, pri->columns);
> + attnums = pub_collist_validate(pri->relation, pri->columns,
> +    pri->except && !pub->alltables,
> +    pub->pubgencols_type);
>
>
> I am wondering why we are even calling a function to validate column
> lists if pub->alltables was true. AFAIK, that combination of
> column-lists and FOR ALL TABLES is not even possible, so the code
> seems strange.
>
Fixed

> ~~~
>
> 14. pub_exclude_collist_validate
> .
> + /*
> + * Check if column list specified with EXCEPT have any stored
> + * generated column and 'publish_generated_columns' is not set to
> + * 'stored'.
> + */
> + if (except_columns &&
> + TupleDescAttr(tupdesc, attnum - 1)->attgenerated ==
> ATTRIBUTE_GENERATED_STORED &&
> + pubgencols_type != PUBLISH_GENCOLS_STORED)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> + errmsg("cannot use stored generated column \"%s\" in publication
> column list specified with EXCEPT when \"%s\" set to \"%s\"",
> +    colname, "publish_generated_columns", "stored"));
>
> As mentioned in the above DOCS comments, I was having doubts about why
> we have this error.
>
> If the parameter says "none", then generated columns will not be
> replicated, so why should we care if the user also says
> EXCEPT(gencol1,gencol2). Either way, the result will be the same; the
> generated column will not be published.
>
Removed this restriction.

> ~~~
>
> 15. GetRelationPublications
>
>   {
>   HeapTuple tup = &pubrellist->members[i]->tuple;
>   Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
> + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid));
> + bool is_table_excluded = ((Form_pg_publication)
> GETSTRUCT(pubtup))->puballtables &&
> + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept;
>
> - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
> + if (except_flag == is_table_excluded)
>   result = lappend_oid(result, pubid);
> +
> + ReleaseS
>
>
> I'm not 100% sure you need the additional 'pubtup'... Can't you just
> look at the "prattrs" field to see if a column-list was specified? If
> "prattrs" is null and "prexcept" is true, isn't that the same
> combination as what you are looking for here?
>
Yes, we can use this combination as well. Fixed it in latest patch.

> ~~~
>
> 16. pg_get_publication_tables
>
> + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +    Anum_pg_publication_rel_prattrs,
> +    &(nulls[2]));
> +
> + /* if column list is specified with EXCEPT */
> + if (!pub->alltables && except)
> + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL);
> + else
> + values[2] = columnsDatum;
>
> 16a.
> Something seems fishy here. Isn't there a pathway where you missed
> assigning values[2] to anything?
>
Modified this change.

> ~
>
> 16b.
> Also, I feel there should be some other boolean variable used here
> instead of checking both (!pub->alltables && except) in multiple
> places.
>
Fixed
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 17. RelationSyncEntry
> +
> + /* Indicate if no column is included in the publication */
> + bool no_cols_published;
>
> Maybe this can have a more explanatory comment to explain why it is needed?
>
Fixed

> ~~~
>
> 18. check_and_init_gencol
>
> + bool found = false;
> + bool except_columns = false;
> +
> + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL,
> + NULL, &except_columns);
> +
>   /*
>   * The column list takes precedence over the
>   * 'publish_generated_columns' parameter. Those will be checked later,
> - * see pgoutput_column_list_init.
> + * see pgoutput_column_list_init. But when a column list is specified
> + * with EXCEPT, it should be checked.
>   */
> - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL))
> + if (found && !except_columns)
>   continue;
>
> The variable 'found' seems a poor name; how about 'has_column_list' or similar?
>
Fixed

> ~~~
>
> 19. pgoutput_change
>
> + /*
> + * If all columns of a table is present in column list specified with
> + * EXCEPT, skip publishing the changes.
> + */
> + if (relentry->no_cols_published)
> + return;
>
> /is present/are present/
>
Fixed

> ======
> src/bin/pg_dump/pg_dump.c
>
> 20. getPublicationTables
>
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
>
>   pubrinfo[j].dobj.catId.tableoid =
>   atooid(PQgetvalue(res, i, i_tableoid));
> @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo
> tblinfo[], int numTables)
>   pubrinfo[j].pubrelqual = NULL;
>   else
>   pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual));
> + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0);
>
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp?
>
Fixed

> ~~~
>
> 21.
> - if (strcmp(prexcept, "t") == 0)
> + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs))
>   simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
>
> Why not assign pubrinfo[j].pubexcept earlier so you don't have to
> repeat the strcmp? Same also for the PQgetisnull(res, i,
> i_prattrs))...
>
Fixed

> ~~~
>
> 22. dumpPublicationTable
>
>   if (pubrinfo->pubrattrs)
> - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + {
> + if (pubrinfo->pubexcept)
> + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs);
> + else
> + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> + }
>
> SUGGESTION
> {
>   if (pubrinfo->pubexcept)
>     appendPQExpBuffer(query, " EXCEPT");
>
>   appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs);
> }
Fixed

I have addressed your comments and attached the updated v15
patch set here.

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 26 Jun 2025 at 15:27, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> >  I have included the changes for
> > it in v14-0003 patch.
> >
> Thanks for the patches. I have reviewed patch001 alone, please find
> few comments:
>
> 1)
> +  <para>
> +   The <literal>RESET</literal> clause will reset the publication to the
> +   default state which includes resetting the publication parameters, setting
> +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> +   dropping all relations and schemas that are associated with the
> +   publication.
>    </para>
>
> It is misleading, as far as I have understood, we do not drop the
> tables or schemas associated with the pub; we just remove those from
> the publication's object list. See previous doc:
> "The ADD and DROP clauses will add and remove one or more
> tables/schemas from the publication"
>
> Perhaps we want to say the same thing when we speak about the 'drop'
> aspect of RESET.
I have updated the document.

> 2)
> AlterPublicationReset():
>
> + if (!OidIsValid(prid))
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("relation \"%s\" is not part of the publication",
> + get_rel_name(relid))));
>
> Can you please help me understand which scenario will give this error?
>
> Another question is do we really need this error? IIUC, we generally
> give errors if a user has explicitly called out a name of an object
> and that object is not found. Example:
>
> postgres=# alter publication pubnew drop table t1,tab2;
> ERROR:  relation "t1" is not part of the publication
>
> While in a few other cases, we pass missing_okay as true and do not
> give errors. Please see other callers of performDeletion in
> publicationcmds.c itself. There we have usage of missing_okay=true. I
> have not researched myself, but please analyze the cases where
> missing_okay is passed as true to figure out if those match our RESET
> case. Try to reproduce if possible and then take a call.
I thought about the above point and I also think this check is not
required. Also, the function was calling PublicationDropSchemas with
missing_ok as false. I have changed it to be true.

> 3)
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
> +ERROR:  syntax error at or near "ALL"
> +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub...
>
> There is a problem in syntax, I think the intention of testcase was to
> run this query successfully.

I have fixed it.

Thanks Shveta for reviewing the patch. I have addressed the comments
and posted an updated version v15 in [1].

[1]: https://www.postgresql.org/message-id/CANhcyEU%2BaPu6iAH2cTA0cDtn3pd6c_njBONCt3FubYZoEEnm8Q%40mail.gmail.com

Thanks and Regards,
Shlok Kyal



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok.

Some review comments for v15-0003.

======
doc/src/sgml/catalogs.sgml

1.
       <para>
-       True if the relation must be excluded
+       True if the column list or relation must be excluded from publication.
+       If a column list is specified in <literal>prattrs</literal>, then
+       exclude only those columns. If <literal>prattrs</literal> is NULL,
+       then exclude the entire relation.
       </para></entry>

I noticed other fields on this page say "null" instead of "NULL". It
seems like "null" is more conventional.

======
doc/src/sgml/logical-replication.sgml

2.
   <para>
    If no column list is specified, any columns added to the table later are
    automatically replicated. This means that having a column list which names
-   all columns is not the same as having no column list at all.
+   all columns is not the same as having no column list at all. Similarly, if an
+   column list is specified with EXCEPT, any columns added to the table later
+   are also replicated automatically.
   </para>

2a.
CURRENTLY
If no column list or a column list with EXCEPT is specified, any
columns added to the table later are automatically replicated. This
means that having a column list which names all columns is not the
same as having no column list at all. If an column list is specified,
any columns added to the table later are automatically replicated.

~

That still doesn't quite make sense. I think instead of saying "This
means..." it needs to say something a bit like below:

However, a normal column list (without EXCEPT) only replicates the
specified columns and no more. Therefore, having a column list that
names all columns is not the same as having no column list at all, as
more columns may be added to the table later.

~

2b.
And the final sentence "If an column list..." looks like a cut/paste error (??)

~

2c.
Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

~~~

2.5A.
The description about generated columns still says this:

CURRENT:
Generated columns can also be specified in a column list. This allows
generated columns to be published, regardless of the publication
parameter publish_generated_columns. See Section 29.6 for details.

~

But I don't think it is quite correct. IMO gencols behaviour is much
more subtle...

e.g.

a) Normal collist - these named cols are published REGARDLESS of the
'publish_generated_cols' parameter (same as before)

b) EXCEPT collist - you can specify gencols in the list REGARDLESS of
the 'publish_generated_cols' parameter, because since they are named
as "except" then they will not be published anyhow....

c) BUT for EXCEPT collist case, I think any gencols that are *not*
covered by that EXCEPT collist should follow the rules according to
the 'publish_generated_cols' parameter.
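
To make cases (a) to (c) above concrete, here is a small decision function. It is only a sketch of the rules as described in this comment, not code from the patch: the enum, its values and column_is_published() are hypothetical names, and the publish_generated_columns parameter is reduced to a boolean (true standing in for "stored").

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the kind of column list attached to a table. */
typedef enum
{
    COLLIST_NONE,               /* no column list */
    COLLIST_NORMAL,             /* (a, b, ...) */
    COLLIST_EXCEPT              /* EXCEPT (a, b, ...) */
} CollistKind;

/*
 * Decide whether one column is published.
 *
 * in_list:         column is named in the list (ignored for COLLIST_NONE)
 * is_generated:    column is a stored generated column
 * publish_gencols: stands in for publish_generated_columns = stored
 */
static bool
column_is_published(CollistKind kind, bool in_list,
                    bool is_generated, bool publish_gencols)
{
    switch (kind)
    {
        case COLLIST_NORMAL:
            /* (a) named columns are published regardless of the parameter */
            return in_list;
        case COLLIST_EXCEPT:
            /* (b) named columns are never published */
            if (in_list)
                return false;
            /* (c) unlisted generated columns follow the parameter */
            return is_generated ? publish_gencols : true;
        case COLLIST_NONE:
            break;
    }

    /* No column list: generated columns follow the parameter. */
    assert(kind == COLLIST_NONE);
    return is_generated ? publish_gencols : true;
}
```

The point of the sketch is that the EXCEPT branch needs the parameter check for unlisted columns, while the normal branch never consults it.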

So, it is much more tricky than the docs currently say:

Also

2.5B.
- The text says "See Section 29.6 for details," but there are no
examples of these combinations (e.g. EXCEPT collist and diff parameter
setting)

2.5C.
- The regression tests also need to be more complex to cover these

2.5D.
- You might need to add something in the CREATE PUBLICATION "NOTES"
section after all -- even if it just refers to here.

~~~

3.
    <para>
     Create a publication <literal>p1</literal>. A column list is defined for
-    table <literal>t1</literal> to reduce the number of columns that will be
-    replicated. Notice that the order of column names in the column list does
-    not matter.
+    table <literal>t1</literal>, and another column list is defined for table
+    <literal>t2</literal> using the EXCEPT clause to reduce the number of
+    columns that will be replicated. Note that the order of column names in
+    the column lists does not matter.
 <programlisting>
-/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d);
+/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a);
 </programlisting></para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

======
doc/src/sgml/ref/create_publication.sgml

4.
      <para>
-      When a column list is specified, only the named columns are replicated.
-      The column list can contain stored generated columns as well. If the
-      column list is omitted, the publication will replicate all non-generated
-      columns (including any added in the future) by default. Stored generated
-      columns can also be replicated if <literal>publish_generated_columns</literal>
-      is set to <literal>stored</literal>. Specifying a column list has no
-      effect on <literal>TRUNCATE</literal> commands. See
+      When a column list without EXCEPT is specified, only the named columns are
+      replicated. The column list can contain stored generated columns as well.
+      If the column list is omitted, the publication will replicate
+      all non-generated columns (including any added in the future) by default.
+      Stored generated columns can also be replicated if
+      <literal>publish_generated_columns</literal> is set to
+      <literal>stored</literal>. Specifying a column list has no effect on
+      <literal>TRUNCATE</literal> commands. See
       <xref linkend="logical-replication-col-lists"/> for details about column
       lists.
      </para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

~~~

5.
+     <para>
+      When a column list is specified with EXCEPT, the named columns are not
+      replicated. Specifying a column list has no effect on
+      <literal>TRUNCATE</literal> commands.
+     </para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>.

** Note all the extra subtleties that I mentioned in the review
comment #2.5 above --- e.g. IMO any *un-listed* gencols still should
follow the parameter rules.

~~~

6.
   <para>
    Any column list must include the <literal>REPLICA IDENTITY</literal> columns
-   in order for <command>UPDATE</command> or <command>DELETE</command>
-   operations to be published. There are no column list restrictions if the
-   publication publishes only <command>INSERT</command> operations.
+   and any column list specified with EXCEPT must not include the
+   <literal>REPLICA IDENTITY</literal> columns in order for
+   <command>UPDATE</command> or <command>DELETE</command> operations to be
+   published. There are no column list restrictions if the publication publishes
+   only <command>INSERT</command> operations.
   </para>

6a.
CURRENT:
Any column list must include the REPLICA IDENTITY columns, and any
column list specified with EXCEPT must not include the REPLICA
IDENTITY columns in order for UPDATE or DELETE operations to be
published.

~

I felt that might be better expressed the other way around. Also, it
might be better to say "not name" instead of "not include" because
EXCEPT + include seemed a bit contrary.


SUGGESTION (maybe like this)
In order for UPDATE or DELETE operations to work, all the REPLICA
IDENTITY columns must be published. So, any column list must name all
REPLICA IDENTITY columns, and any EXCEPT column list must not name any
REPLICA IDENTITY columns.
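
The suggested wording boils down to one condition: every REPLICA IDENTITY column must end up in the published set. A minimal sketch of that check follows, assuming a plain bitmask as a stand-in for a Bitmapset. ColSet and collist_covers_replica_identity() are hypothetical names and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit N set means column N is named. */
typedef uint32_t ColSet;

/*
 * Return true if UPDATE/DELETE can be published, i.e. every REPLICA
 * IDENTITY column is part of the published column set.
 */
static bool
collist_covers_replica_identity(ColSet all_cols, ColSet list_cols,
                                bool is_except, ColSet ri_cols)
{
    ColSet      published;

    /* Both sets may only name columns that exist in the table. */
    assert((list_cols & ~all_cols) == 0);
    assert((ri_cols & ~all_cols) == 0);

    /* EXCEPT publishes the complement of the list; otherwise the list itself. */
    published = is_except ? (all_cols & ~list_cols) : list_cols;

    /* No REPLICA IDENTITY column may be missing from the published set. */
    return (ri_cols & ~published) == 0;
}
```

With this formulation, "must name all RI columns" and "must not name any RI columns" are the same test applied to the list and to its complement.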

~~

6b.
Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

======
src/backend/catalog/pg_publication.c

check_and_fetch_column_list:

7.
+ /* Lookup the except attribute */
+ cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple,
+   Anum_pg_publication_rel_prexcept, &isnull);
+
+ if (!isnull)
+ {
+ Assert(!pub->alltables);
+ *except_columns = DatumGetBool(cfdatum);
+ }
+

I felt it would be safer to also assign *except_columns = false;
up-front so the caller could be sure this flag was meaningful on
return.
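
Roughly what I mean, as a standalone sketch. PubRelRow and fetch_column_list_flag() are hypothetical simplifications and not the real check_and_fetch_column_list() signature; the only point is that the output flag gets a defined value on every return path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one pg_publication_rel row. */
typedef struct
{
    bool        has_column_list;    /* prattrs is not null */
    bool        prexcept;           /* row was created with EXCEPT */
} PubRelRow;

/*
 * Return true if the row has a column list.  *except_columns is always
 * assigned, even when no row is found, so callers can read it
 * unconditionally after the call.
 */
static bool
fetch_column_list_flag(const PubRelRow *row, bool *except_columns)
{
    assert(except_columns != NULL);

    /* Default first, so every early return leaves a defined value. */
    *except_columns = false;

    if (row == NULL || !row->has_column_list)
        return false;

    *except_columns = row->prexcept;
    return true;
}
```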

~~~

pub_form_cols_map:

8.
Maybe use snake case like the other params, so /exceptcols/except_cols/

~~~

pg_get_publication_tables:

9.

I felt all the logic in this function maybe can be simpler:

e.g. If you just have "Bitmapset *except_columns = NULL;" then null
means there are no except columns; otherwise there are. This means you
don't need a separate 'bool except_column' variable.

e.g. Assign the Bitmapset *except_columns after you already have the
values[2], instead of doing it later.

e.g. The skip code if (except_columns && bms_is_member(att->attnum,
columns)) could just check the list member, I think, without the
additional bool.
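
A sketch of the pointer-as-flag idea, using a toy set type in place of Bitmapset. AttrSet, attrset_is_member() and column_is_skipped() are hypothetical names. The toy membership test returns false for a NULL set, which is also how bms_is_member() behaves, so a single call can replace the (bool && membership) pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: a set of attribute numbers 1..63. */
typedef struct
{
    uint64_t    bits;
} AttrSet;

/* Membership test; a NULL set is treated as empty. */
static bool
attrset_is_member(int attnum, const AttrSet *set)
{
    assert(attnum > 0 && attnum < 64);
    return set != NULL && (set->bits & (UINT64_C(1) << attnum)) != 0;
}

/*
 * A NULL except_columns means "no EXCEPT list", so no separate boolean
 * is needed to decide whether a column should be skipped.
 */
static bool
column_is_skipped(int attnum, const AttrSet *except_columns)
{
    return attrset_is_member(attnum, except_columns);
}
```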

~~~

10.
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and not
+ * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that
+ * prattrs contains columns to be excluded for replication.
+ */
+ if (!isnull)
+ except_columns = DatumGetBool(exceptDatum);


/indicate/indicates/

======
src/backend/parser/gram.y

11.
+ | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause
+ {
+ $$ = makeNode(PublicationObjSpec);
+ $$->pubobjtype = PUBLICATIONOBJ_TABLE;
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = $2;
+ $$->pubtable->columns = $4;
+ $$->pubtable->whereClause = $5;
+ $$->pubtable->except = true;
+ $$->location = @1;
+ }

I wasn't expecting you would need another 'opt_except_column_list' and
all the code duplication that causes. AFAIK, the syntax is identical
for 'opt_column_list' apart from the preceding EXCEPT so I thought all
you need is to allow the 'opt_column_list' to have an optional EXCEPT
qualifier.

======
src/backend/replication/pgoutput/pgoutput.c

12.
+
+ /*
+ * Indicates whether no columns are published for a given relation. With
+ * the introduction of the EXCEPT clause in column lists, it is now
+ * possible to define a publication that excludes all columns of a table.
+ * However, the 'columns' attribute cannot represent this case, since a
+ * NULL value implies that all columns are published. To distinguish this
+ * scenario, the 'no_cols_published' flag is introduced.
+ */
+ bool no_cols_published;
 } RelationSyncEntry;

But, what about when Bitmapset *columns is not null, but has no bits
set -- doesn't that mean the same as "no columns"?

======
src/include/catalog/pg_publication.h

13.
 extern Bitmapset *pub_form_cols_map(Relation relation,
- PublishGencolsType include_gencols_type);
+ PublishGencolsType include_gencols_type,
+ Bitmapset *exceptcols);

Maybe snake-case like the other params: /exceptcols/except_cols/

======
src/test/regress/sql/publication.sql

14.
+-- Verify that publication is created with EXCEPT
+CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1, pub_sch1.pub_test_except2 EXCEPT (b, c);
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
+

I think tests should also use psql \dRp+ commands in places to show
that the "describe" stuff is working correctly.

~~~

15.
+-- Check for invalid cases
+CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1, TABLE pub_test_except1 EXCEPT (b, c);
+CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;

Should explain more about what you are testing here:
a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA
b) syntax error EXCEPT without a col-list

~~~

16.
+-- Verify that publication can be altered with EXCEPT
+ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT (a, b), pub_sch1.pub_test_except2;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';

The comment is a bit misleading because there are many kinds of
"alter". Maybe say more like
Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list)

~~~

17.
+-- Verify ALTER PUBLICATION ... DROP
+ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b);
+ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1;

Should explain more:
+-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list)
+-- Verify ok - ALTER PUBLICATION ... DROP ...

~~~

18.
+ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d);
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';

Missing comment:
+-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list)

~~~

19.
+-- Verify excluded columns cannot be part of REPLICA IDENTITY
+ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;

+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;

+DROP INDEX pub_test_except1_a_idx;
+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+
+DROP INDEX pub_test_except1_a_idx;

19a.
IIUC, really there are multiple tests here, so I think it should all
be split and commented separately.

a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
c) Verify that it is ok so long as there is no clash between RI cols and the
EXCEPT col-list

~

19b.
IMO, some index names could be better:

CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
How about 'pub_test_except1_ac_idx'?

~~~

20.
+DROP PUBLICATION testpub_except;
+DROP TABLE pub_test_except1;
+DROP TABLE pub_sch1.pub_test_except2;

Add a "cleanup" comment.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok,

One more thing, I noticed there is no tab-completion code yet for this
new EXCEPT (column_list) syntax.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
shveta malik
Date:
On Fri, Jun 27, 2025 at 3:44 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 26 Jun 2025 at 15:27, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > >  I have included the changes for
> > > it in v14-0003 patch.
> > >
> > Thanks for the patches. I have reviewed patch001 alone, please find
> > few comments:
> >
> > 1)
> > +  <para>
> > +   The <literal>RESET</literal> clause will reset the publication to the
> > +   default state which includes resetting the publication parameters, setting
> > +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> > +   dropping all relations and schemas that are associated with the
> > +   publication.
> >    </para>
> >
> > It is misleading, as far as I have understood, we do not drop the
> > tables or schemas associated with the pub; we just remove those from
> > the publication's object list. See previous doc:
> > "The ADD and DROP clauses will add and remove one or more
> > tables/schemas from the publication"
> >
> > Perhaps we want to say the same thing when we speak about the 'drop'
> > aspect of RESET.
> I have updated the document.
>
> > 2)
> > AlterPublicationReset():
> >
> > + if (!OidIsValid(prid))
> > + ereport(ERROR,
> > + (errcode(ERRCODE_UNDEFINED_OBJECT),
> > + errmsg("relation \"%s\" is not part of the publication",
> > + get_rel_name(relid))));
> >
> > Can you please help me understand which scenario will give this error?
> >
> > Another question is do we really need this error? IIUC, we generally
> > give errors if a user has explicitly called out a name of an object
> > and that object is not found. Example:
> >
> > postgres=# alter publication pubnew drop table t1,tab2;
> > ERROR:  relation "t1" is not part of the publication
> >
> > While in a few other cases, we pass missing_okay as true and do not
> > give errors. Please see other callers of performDeletion in
> > publicationcmds.c itself. There we have usage of missing_okay=true. I
> > have not researched myself, but please analyze the cases where
> > missing_okay is passed as true to figure out if those match our RESET
> > case. Try to reproduce if possible and then take a call.
> I thought about the above point and I also think this check is not
> required. Also, the function was calling PublicationDropSchemas with
> missing_ok as false. I have changed it to be true.
>

Okay. Is there a reason for not using PublicationDropTables() here? We
have rewritten similar code in the Reset flow.

> > 3)
> > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
> > +ERROR:  syntax error at or near "ALL"
> > +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub...
> >
> > There is a problem in syntax, I think the intention of testcase was to
> > run this query successfully.
>
> I have fixed it.
>
> Thanks Shveta for reviewing the patch. I have addressed the comments
> and posted an updated version v15 in [1].

Thanks for the patches. My review is in progress, but please find a few
comments on 002:

1)
where exception_object is:
    [ ONLY ] table_name [ * ]

We have the above in CREATE and ALTER pub docs, but we do not explain
ONLY with EXCEPT. We do have an explanation of ONLY under 'FOR TABLE'.
But since 'FOR TABLE' and 'EXCEPT' do not go together, it is somewhat
difficult to connect the dots and find the information about ONLY in the
context of EXCEPT. We should explain ONLY for EXCEPT as well. Or
we can define ONLY in a way that both 'FOR TABLE' and 'EXCEPT'
can refer to it.

2)
We get tab-completion options in this command:
postgres=# create publication pub5 for TABLE tab1 W
WHERE (  WITH (

Similarly in this command:
create publication pub5 for TABLES IN SCHEMA s1

But once we have 'EXCEPT TABLE', we do not get further tab-completion
options like WITH(...):
create publication pub5 for ALL TABLES EXCEPT TABLE tab1

3)
During tab-expansion, 'EXCEPT TABLE' and 'WITH (' in the below
command look as if they are one connected phrase. Can the gap be increased,
similar to the tab-expansion of the next command shown below:

postgres=# create publication pub4 for ALL TABLES
EXCEPT TABLE  WITH (

postgres=# create publication pub4 for
ALL TABLES        TABLE             TABLES IN SCHEMA

4)
alter_publication.sgml.orig is a left-over in patch002.

thanks
Shveta



Re: Skipping schema changes in publication

From
shveta malik
Date:
Few more comments on 002:

5)
+GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
 {

+ List    *exceptlist;
+
+ exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);


a) Here, we are assuming that the list provided by
GetPublicationRelations() will be the except-tables list only, but there
is no validation of that.
b) We are using GetPublicationRelations() to get the relations which
are excluded from the publication. The function name and the comments
atop the function are not in alignment with this usage.

Suggestion:
We can have a new GetPublicationExcludeRelations() function for the
concerned usage. The existing logic of GetPublicationRelations() can
be shifted to a new internal-logic function which will accept an
'except-flag' as well. Both GetPublicationRelations() and
GetPublicationExcludeRelations() can call that new function by passing
'except-flag' as false and true respectively. The new internal
function will validate 'prexcept' against that except-flag passed and
will return the results.
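
A rough standalone sketch of the shape I have in mind. The row struct, the array-based catalog and all three function names here are hypothetical simplifications; the real functions would scan pg_publication_rel and return a List of OIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified pg_publication_rel row. */
typedef struct
{
    unsigned int pubid;
    unsigned int relid;
    bool        prexcept;
} PubRelRow;

/*
 * Shared worker: collect relids of rows that belong to 'pubid' and whose
 * prexcept matches 'except_flag'.  Returns the number of matches; at most
 * 'out_cap' entries are written to 'out'.
 */
static size_t
get_pubrels_internal(const PubRelRow *rows, size_t nrows, unsigned int pubid,
                     bool except_flag, unsigned int *out, size_t out_cap)
{
    size_t      n = 0;

    assert(rows != NULL || nrows == 0);

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].pubid != pubid || rows[i].prexcept != except_flag)
            continue;
        if (out != NULL && n < out_cap)
            out[n] = rows[i].relid;
        n++;
    }
    return n;
}

/* Relations that are included in the publication. */
static size_t
get_publication_relations(const PubRelRow *rows, size_t nrows,
                          unsigned int pubid, unsigned int *out, size_t out_cap)
{
    return get_pubrels_internal(rows, nrows, pubid, false, out, out_cap);
}

/* Relations that are excluded (EXCEPT) from the publication. */
static size_t
get_publication_exclude_relations(const PubRelRow *rows, size_t nrows,
                                  unsigned int pubid, unsigned int *out,
                                  size_t out_cap)
{
    return get_pubrels_internal(rows, nrows, pubid, true, out, out_cap);
}
```

This way each caller states which kind of list it wants, and the prexcept filtering lives in one place.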

6)
Before your patch002, GetTopMostAncestorInPublication() was checking
pg_publication_rel and pg_publication_namespace to find out if the
table in the ancestor-list is part of a given publication. Neither
pg_publication_rel nor pg_publication_namespace had entries for
"for all tables" publications. That means
GetTopMostAncestorInPublication() was originally not checking whether
the given puboid is a "for all tables" publication to see if a rel
belongs to that particular pub or not.

But now with the current change, we do check if the pub is an all-tables pub
and, if so, return the relid and mark ancestor_level (provided the table is not
part of the except list).  IIUC, the result in the 2 cases may be
different. Is that the intention? Let me know if my understanding is
wrong.

thanks
Shveta



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 30 Jun 2025 at 11:37, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for v15-0003.
>
> ======
> doc/src/sgml/catalogs.sgml
>
> 1.
>        <para>
> -       True if the relation must be excluded
> +       True if the column list or relation must be excluded from publication.
> +       If a column list is specified in <literal>prattrs</literal>, then
> +       exclude only those columns. If <literal>prattrs</literal> is NULL,
> +       then exclude the entire relation.
>        </para></entry>
>
> I noticed other fields on this page say "null" instead of "NULL". It
> seems like "null" is more conventional.
>
Fixed

> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>    <para>
>     If no column list is specified, any columns added to the table later are
>     automatically replicated. This means that having a column list which names
> -   all columns is not the same as having no column list at all.
> +   all columns is not the same as having no column list at all.
> Similarly, if an
> +   column list is specified with EXCEPT, any columns added to the table later
> +   are also replicated automatically.
>    </para>
>
> 2a.
> CURRENTLY
> If no column list or a column list with EXCEPT is specified, any
> columns added to the table later are automatically replicated. This
> means that having a column list which names all columns is not the
> same as having no column list at all. If an column list is specified,
> any columns added to the table later are automatically replicated.
>
> ~
>
> That still doesn't quite make sense. I think instead of saying "This
> means..." it needs to say something a bit like below:
>
> However, a normal column list (without EXCEPT) only replicates the
> specified columns and no more. Therefore, having a column list that
> names all columns is not the same as having no column list at all, as
> more columns may be added to the table later.
>
Fixed

> ~
>
> 2b.
> And the final sentence "If an column list..." looks like a cut/paste error (??)
>
Yes it was a mistake.

> ~
>
> 2c.
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed.

> ~~~
>
> 2.5A.
> The description about generated columns still says this:
>
> CURRENT:
> Generated columns can also be specified in a column list. This allows
> generated columns to be published, regardless of the publication
> parameter publish_generated_columns. See Section 29.6 for details.
>
> ~
>
> But I don't think it is quite correct. IMO gencols behaviour is much
> more subtle...
>
> e.g.
>
> a) Normal collist - these named cols are published REGARDLESS of the
> 'publish_generated_cols' parameter (same as before)
>
> b) EXCEPT collist - you can specify gencols in the list REGARDLESS of
> the 'publish_generated_cols' parameter, because since they are named
> as "except" then they will not be published anyhow....
>
> c) BUT for EXCEPT collist case, I think any gencols that are *not*
> covered by that EXCEPT collist should follow the rules according to
> the 'publish_generated_cols' parameter.
>
> So, it is much more tricky than the docs currently say:
>
Modified the documentation

> Also
>
> 2.5B.
> - The text says "See Section 29.6 for details," but there are no
> examples of these combinations (e.g. EXCEPT collist and diff parameter
> setting)
>
Added documentation.

> 2.5C,
> - The regression tests also need to be more complex to cover these
>
Added tests related to these

> 2.5D.
> - You might need to add something in the CREATE PUBLICATION "NOTES"
> section after all -- even if it just refers to here.
>
Added documentation

> ~~~
>
> 3.
>     <para>
>      Create a publication <literal>p1</literal>. A column list is defined for
> -    table <literal>t1</literal> to reduce the number of columns that will be
> -    replicated. Notice that the order of column names in the column list does
> -    not matter.
> +    table <literal>t1</literal>, and another column list is defined for table
> +    <literal>t2</literal> using the EXCEPT clause to reduce the number of
> +    columns that will be replicated. Note that the order of column names in
> +    the column lists does not matter.
>  <programlisting>
> -/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d);
> +/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a);
>  </programlisting></para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed

> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 4.
>       <para>
> -      When a column list is specified, only the named columns are replicated.
> -      The column list can contain stored generated columns as well. If the
> -      column list is omitted, the publication will replicate all non-generated
> -      columns (including any added in the future) by default. Stored generated
> -      columns can also be replicated if
> <literal>publish_generated_columns</literal>
> -      is set to <literal>stored</literal>. Specifying a column list has no
> -      effect on <literal>TRUNCATE</literal> commands. See
> +      When a column list without EXCEPT is specified, only the named
> columns are
> +      replicated. The column list can contain stored generated columns as well.
> +      If the column list is omitted, the publication will replicate
> +      all non-generated columns (including any added in the future) by default.
> +      Stored generated columns can also be replicated if
> +      <literal>publish_generated_columns</literal> is set to
> +      <literal>stored</literal>. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands. See
>        <xref linkend="logical-replication-col-lists"/> for details about column
>        lists.
>       </para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed
> ~~~
>
> 5.
> +     <para>
> +      When a column list is specified with EXCEPT, the named columns are not
> +      replicated. Specifying a column list has no effect on
> +      <literal>TRUNCATE</literal> commands.
> +     </para>
>
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>.
>
Fixed

> ** Note all the extra subtleties that I mentioned in the review
> comment #2.5 above --- e.g. IMO any *un-listed* gencols still should
> follow the parameter rules.
>
> ~~~
>
> 6.
>    <para>
>     Any column list must include the <literal>REPLICA IDENTITY</literal> columns
> -   in order for <command>UPDATE</command> or <command>DELETE</command>
> -   operations to be published. There are no column list restrictions if the
> -   publication publishes only <command>INSERT</command> operations.
> +   and any column list specified with EXCEPT must not include the
> +   <literal>REPLICA IDENTITY</literal> columns in order for
> +   <command>UPDATE</command> or <command>DELETE</command> operations to be
> +   published. There are no column list restrictions if the
> publication publishes
> +   only <command>INSERT</command> operations.
>    </para>
>
> 6a.
> CURRENT:
> Any column list must include the REPLICA IDENTITY columns, and any
> column list specified with EXCEPT must not include the REPLICA
> IDENTITY columns in order for UPDATE or DELETE operations to be
> published.
>
> ~
>
> I felt that might be better expressed the other way around. Also, it
> might be better to say "not name" instead of "not include" because
> EXCEPT + include seemed a bit contrary.
>
>
> SUGGESTION (maybe like this)
> In order for UPDATE or DELETE operations to work, all the REPLICA
> IDENTITY columns must be published. So, any column list must name all
> REPLICA IDENTITY columns, and any EXCEPT column list must not name any
> REPLICA IDENTITY columns.
>
Fixed
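
To make the rule in 6a concrete, here is a minimal Python model of it. This is only a sketch and not code from the patch: the function name and its arguments are made up for illustration, and it ignores generated columns entirely.

```python
def replica_identity_is_published(ri_cols, col_list=None, is_except=False):
    """Return True if every REPLICA IDENTITY column would be published.

    Hypothetical helper: ri_cols is the set of REPLICA IDENTITY columns,
    col_list is the publication's column list (None means no list), and
    is_except says whether the list was given with EXCEPT.
    """
    ri = set(ri_cols)
    if col_list is None:
        # No column list at all: every column is published.
        return True
    listed = set(col_list)
    if is_except:
        # EXCEPT list: it must not name any REPLICA IDENTITY column.
        return ri.isdisjoint(listed)
    # Ordinary list: it must name all REPLICA IDENTITY columns.
    return ri <= listed
```

With REPLICA IDENTITY FULL every column is an RI column, so in this model any non-empty EXCEPT list makes UPDATE and DELETE unpublishable, which is what test 19 is trying to show.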

> ~~
>
> 6b.
> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>
>
Fixed

> ======
> src/backend/catalog/pg_publication.c
>
> check_and_fetch_column_list:
>
> 7.
> + /* Lookup the except attribute */
> + cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple,
> +   Anum_pg_publication_rel_prexcept, &isnull);
> +
> + if (!isnull)
> + {
> + Assert(!pub->alltables);
> + *except_columns = DatumGetBool(cfdatum);
> + }
> +
>
> I felt it would be safer to also assign *except_columns = false;
> up-front so the caller could be sure this flag was meaningful on
> return.
>
Fixed

> ~~~
>
> pub_form_cols_map:
>
> 8.
> Maybe use snake case like for other params, so /excepcols/except_cols/
>
Fixed

> ~~~
>
> pg_get_publication_tables:
>
> 9.
>
> I felt all the logic in this function maybe can be simpler:
>
> e.g. If you just have "Bitmapset *except_columns = NULL;" then null
> means there are no except columns; otherwise there are. This means you
> don't need a separate 'bool except_column' variable.
>
> e.g. Assign the Bitmapset *except_columns after you already have the
> values[2], instead of doing it later.
>
> e.g. The skip code if (except_columns && bms_is_member(att->attnum,
> columns)) could just check the list member, I think, without the
> additional bool.
>
> ~~~
>
Fixed

> 10.
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and not
> + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that
> + * prattrs contains columns to be excluded for replication.
> + */
> + if (!isnull)
> + except_columns = DatumGetBool(exceptDatum);
>
>
> /indicate/indicates/
>
Fixed

> ======
> src/backend/parser/gram.y
>
> 11.
> + | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->pubobjtype = PUBLICATIONOBJ_TABLE;
> + $$->pubtable = makeNode(PublicationTable);
> + $$->pubtable->relation = $2;
> + $$->pubtable->columns = $4;
> + $$->pubtable->whereClause = $5;
> + $$->pubtable->except = true;
> + $$->location = @1;
> + }
>
> I wasn't expecting you would need another 'opt_except_column_list' and
> all the code duplication that causes. AFAIK, the syntax is identical
> for 'opt_column_list' apart from the preceding EXCEPT so I thought all
> you need is to allow the 'opt_column_list' to have an optional EXCEPT
> qualifier.
>
The main reason I used a separate 'opt_except_column_list' is that
'opt_column_list' can also be NULL, but a column list specified with
EXCEPT cannot be NULL. So, 'opt_except_column_list' is defined such that
it cannot be NULL.

> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 12.
> +
> + /*
> + * Indicates whether no columns are published for a given relation. With
> + * the introduction of the EXCEPT clause in column lists, it is now
> + * possible to define a publication that excludes all columns of a table.
> + * However, the 'columns' attribute cannot represent this case, since a
> + * NULL value implies that all columns are published. To distinguish this
> + * scenario, the 'no_cols_published' flag is introduced.
> + */
> + bool no_cols_published;
>  } RelationSyncEntry;
>
> But, what about when Bitmapset *columns is not null, but has no bits
> set -- doesn't that mean the same as "no columns"?
>
I don't think this is possible. A Bitmapset with no bits set is NULL. I
saw the following comment in bitmapset.c:
"By convention, we always represent a set with
 * the minimum possible number of words, i.e, there are never any trailing
 * zero words.  Enforcing this requires that an empty set is represented as
 * NULL.  Because an empty Bitmapset is represented as NULL, a non-NULL
 * Bitmapset always has at least 1 Bitmapword."
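
The consequence of that convention can be shown with a small Python model. This is a sketch under stated assumptions, not PostgreSQL code: make_bitmapset and columns_after_except are hypothetical names, and None stands in for a NULL Bitmapset.

```python
def make_bitmapset(members):
    """Model of the bitmapset.c convention: an empty set is represented as None."""
    members = frozenset(members)
    return members if members else None


def columns_after_except(all_cols, except_cols):
    """Columns left to publish after removing the EXCEPT columns (hypothetical)."""
    return make_bitmapset(set(all_cols) - set(except_cols))
```

In this model, excluding every column yields None, which is the same value that means "no column list, publish everything". That ambiguity is the reason given for a separate 'no_cols_published' flag.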

> ======
> src/include/catalog/pg_publication.h
>
> 13.
>  extern Bitmapset *pub_form_cols_map(Relation relation,
> - PublishGencolsType include_gencols_type);
> + PublishGencolsType include_gencols_type,
> + Bitmapset *exceptcols);
>
> Maybe snake-case like the other params: /exceptcols/except_cols/
>
Fixed

> ======
> src/test/regress/sql/publication.sql
>
> 14.
> +-- Verify that publication is created with EXCEPT
> +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1,
> pub_sch1.pub_test_except2 EXCEPT (b, c);
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
> +
>
> I think tests should also use psql \dRp+ commands in places to show
> that the "describe" stuff is working correctly.
>
> ~~~
Fixed

>
> 15.
> +-- Check for invalid cases
> +CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1,
> TABLE pub_test_except1 EXCEPT (b, c);
> +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
>
> Should explain more about what you are testing here:
> a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA
> b) syntax error EXCEPT without a col-list
>
> ~~~
Fixed

>
> 16.
> +-- Verify that publication can be altered with EXCEPT
> +ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT
> (a, b), pub_sch1.pub_test_except2;
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
>
> The comment is a bit misleading because there are many kinds of
> "alter". Maybe say more like
> Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list)
>
> ~~~
Fixed

>
> 17.
> +-- Verify ALTER PUBLICATION ... DROP
> +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b);
> +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1;
>
> Should explain more:
> +-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list)
> +-- Verify ok - ALTER PUBLICATION ... DROP ...
>
> ~~~
Fixed

>
> 18.
> +ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d);
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
>
> Missing comment:
> +-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list)
>
> ~~~
Fixed

>
> 19.
> +-- Verify excluded columns cannot be part of REPLICA IDENTITY
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
> +DROP INDEX pub_test_except1_a_idx;
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +
> +DROP INDEX pub_test_except1_a_idx;
>
> 19a.
> IIUC, really there are multiple tests here, so I think it should all
> be split and commented separately.
>
> a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
> b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> c) Verify that so long as no clash between RI cols and the EXCEPT
> col-list, then it is ok
>
> ~
Fixed

>
> 19b.
> IMO, some index names could be better:
>
> CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
> How about 'pub_test_except1_ac_idx'?
>
> ~~~
>
Fixed

> 20.
> +DROP PUBLICATION testpub_except;
> +DROP TABLE pub_test_except1;
> +DROP TABLE pub_sch1.pub_test_except2;
>
> Add a "cleanup" comment.
>
Added

I have addressed the comments and added the latest v16.

Thanks and Regards,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 30 Jun 2025 at 12:28, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jun 27, 2025 at 3:44 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Thu, 26 Jun 2025 at 15:27, shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > >
> > > >  I have included the changes for
> > > > it in v14-0003 patch.
> > > >
> > > Thanks for the patches. I have reviewed patch001 alone, please find
> > > few comments:
> > >
> > > 1)
> > > +  <para>
> > > +   The <literal>RESET</literal> clause will reset the publication to the
> > > +   default state which includes resetting the publication parameters, setting
> > > +   <literal>ALL TABLES</literal> flag to <literal>false</literal> and
> > > +   dropping all relations and schemas that are associated with the
> > > +   publication.
> > >    </para>
> > >
> > > It is misleading, as far as I have understood, we do not drop the
> > > tables or schemas associated with the pub; we just remove those from
> > > the publication's object list. See previous doc:
> > > "The ADD and DROP clauses will add and remove one or more
> > > tables/schemas from the publication"
> > >
> > > Perhaps we want to say the same thing when we speak about the 'drop'
> > > aspect of RESET.
> > I have updated the document.
> >
> > > 2)
> > > AlterPublicationReset():
> > >
> > > + if (!OidIsValid(prid))
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_UNDEFINED_OBJECT),
> > > + errmsg("relation \"%s\" is not part of the publication",
> > > + get_rel_name(relid))));
> > >
> > > Can you please help me understand which scenario will give this error?
> > >
> > > Another question is do we really need this error? IIUC, we generally
> > > give errors if a user has explicitly called out a name of an object
> > > and that object is not found. Example:
> > >
> > > postgres=# alter publication pubnew drop table t1,tab2;
> > > ERROR:  relation "t1" is not part of the publication
> > >
> > > While in a few other cases, we pass missing_okay as true and do not
> > > give errors. Please see other callers of performDeletion in
> > > publicationcmds.c itself. There we have usage of missing_okay=true. I
> > > have not researched myself, but please analyze the cases where
> > > missing_okay is passed as true to figure out if those match our RESET
> > > case. Try to reproduce if possible and then take a call.
> > I thought about the above point and I also think this check is not
> > required. Also, the function was calling PublicationDropSchemas with
> > missing_ok as false. I have changed it to be true.
> >
>
> Okay. Is there a reason for not using PublicationDropTables() here? We
> have rewritten similar code in the Reset flow.
>
I feel it's better to use the function PublicationDropTables(). Also
proper locking would be required on tables while dropping them from
publication.
Made changes for the same.

> > > 3)
> > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public;
> > > +ERROR:  syntax error at or near "ALL"
> > > +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub...
> > >
> > > There is a problem in syntax, I think the intention of testcase was to
> > > run this query successfully.
> >
> > I have fixed it.
> >
> > Thanks Shveta for reviewing the patch. I have addressed the comments
> > and posted an updated version v15 in [1].
>
> Thanks for the patches. My review is in progress but please find few
> comments on 002:
>
> 1)
> where exception_object is:
>     [ ONLY ] table_name [ * ]
>
> We have the above in CREATE and ALTER pub docs, but we do not explain
> ONLY with EXCEPT. We do have an explanation of ONLY under 'FOR TABLE'.
> But since 'FOR TABLE' and 'EXCEPT' do not go together, it is somewhat
> difficult to connect the dots and find the information ONLY in the
> context of EXCEPT. We shall have ONLY explained for EXCEPT as well. Or
> we can have ONLY defined in a way that both 'FOR TABLE' and 'EXCEPT'
> can refer to it.
>
In create_publication.sgml, I added it under "EXCEPT_TABLE". In
alter_publication.sgml, I modified the text for item 'table_name'
under "<title>Parameters</title>".

> 2)
> We get tab-completion options in this command:
> postgres=# create publication pub5 for TABLE tab1 W
> WHERE (  WITH (
>
> Similarly in this command:
> create publication pub5 for TABLES IN SCHEMA s1
>
> But once we have 'EXCEPT TABLE', we do not get further tab-completion
> option like WITH(...)
> create publication pub5 for ALL TABLES EXCEPT TABLE tab1
Fixed

> 3)
> During tab-expansion, 'EXCEPT TABLE' and  'WITH (' in the below
> command looks like they are connecting words. Can the gap be increased
> similar to tab-expansion of next command shown below:
>
> postgres=# create publication pub4 for ALL TABLES
> EXCEPT TABLE  WITH (
>
I did not find a place to add any custom space. It is default
behaviour to add 2 spaces between different words. See similar:
postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 W
WHERE (  WITH (

> postgres=# create publication pub4 for
> ALL TABLES        TABLE             TABLES IN SCHEMA
>
I observed that the space between words depends on the length of the
longest word. Here the longest word is "TABLES IN SCHEMA", so the space
between the words is quite noticeable.

> 4)
> alter_publication.sgml.orig is a left-over in patch002.
Fixed

I have added the changes in the latest v16 patch [1].
[1]: https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 30 Jun 2025 at 11:54, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> One more thing, I noticed there is no tab-completion code yet for this
> new EXCEPT (column_list) syntax.
>

I have added the tab-completion code in the latest v16 patch [1].
[1]: https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta.malik@gmail.com> wrote:
>
> Few more comments on 002:
>
> 5)
> +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
>  {
>
> + List    *exceptlist;
> +
> + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
>
>
> a) Here, we are assuming that the list provided by
> GetPublicationRelations() will be except-tables list only, but there
> is no validation of that.
> b) We are using GetPublicationRelations() to get the relations which
> are excluded from the publication. The name of function and comments
> atop function are not in alignment with this usage.
>
> Suggestion:
> We can have a new GetPublicationExcludeRelations() function for the
> concerned usage. The existing logic of GetPublicationRelations() can
> be shifted to a new internal-logic function which will accept a
> 'except-flag' as well. Both GetPublicationRelations() and
> GetPublicationExcludeRelations() can call that new function by passing
> 'except-flag' as false and true respectively. The new internal
> function will validate 'prexcept' against that except-flag passed and
> will return the results.
>
I have made the above change.
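
For readers following along, the shape of the suggested refactoring can be sketched in Python. This is only an illustration of the idea, not the C code in the patch: the PubRel record, the in-memory catalog list, and all function names here are assumptions.

```python
from typing import NamedTuple


class PubRel(NamedTuple):
    """Hypothetical stand-in for a pg_publication_rel row."""
    pubid: int
    relid: int
    prexcept: bool


def _get_publication_relations(catalog, pubid, except_flag):
    """Shared internal logic: keep only rows whose prexcept matches the flag."""
    return [r.relid for r in catalog
            if r.pubid == pubid and r.prexcept == except_flag]


def get_publication_relations(catalog, pubid):
    """Relations that are part of the publication."""
    return _get_publication_relations(catalog, pubid, False)


def get_publication_exclude_relations(catalog, pubid):
    """Relations listed in the publication's EXCEPT clause."""
    return _get_publication_relations(catalog, pubid, True)
```

The point of the split is that each public function states its intent in its name, while the prexcept check lives in one place.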


> 6)
> Before your patch002, GetTopMostAncestorInPublication() was checking
> pg_publication_rel and pg_publication_namespace to find out if the
> table in the ancestor-list is part of the given publication. Neither
> pg_publication_rel nor pg_publication_namespace had an entry for
> "for all tables" publications. That means
> GetTopMostAncestorInPublication() was originally not checking whether
> the given puboid is a "for all tables" publication to see if a rel
> belongs to that particular pub or not.
>
> But now with the current change, we do check if pub is all-tables pub,
> if so, return relid and mark ancestor_level (provided table is not
> part of the except list).  IIUC, the result in 2 cases may be
> different. Is that the intention? Let me know if my understanding is
> wrong.
>
This is intentional. In function get_rel_sync_entry, we set
pub_relid to the topmost published ancestor. In HEAD, we set it directly
using:
            /*
             * If this is a FOR ALL TABLES publication, pick the partition
             * root and set the ancestor level accordingly.
             */
            if (pub->alltables)
            {
                publish = true;
                if (pub->pubviaroot && am_partition)
                {
                    List       *ancestors = get_partition_ancestors(relid);

                    pub_relid = llast_oid(ancestors);
                    ancestor_level = list_length(ancestors);
                }
            }
In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
ancestor in the FOR ALL TABLES case.
But with this proposal, that will no longer be valid, as
'llast_oid(ancestors)' may be excluded from the publication. So, to
handle this, the change was made in GetTopMostAncestorInPublication.


Also, during testing with a partitioned table and
publish_via_partition_root, the behaviour of the current patch is as
below.
For example, suppose we have a partitioned table t1 with partitions part1
and part2. Now consider the following cases:
1. with publish_via_partition_root = true
     I. If we create publication on all tables with EXCEPT t1, no data
for t1, part1 or part2 is replicated.
     II.  If we create publication on all tables with EXCEPT part1,
data for all tables t1, part1 and part2 is replicated.
2. with publish_via_partition_root = false
     I. If we create publication on all tables with EXCEPT t1, no data
for t1, part1 or part2 is replicated.
     II. If we create publication on all tables with EXCEPT part1,
data for part1 is not replicated

Is this behaviour fine?
I checked other databases such as MySQL and SQL Server. They do not
have such cases, as either the whole partitioned table is replicated
or it is not replicated at all; there is no partition-level control.
For Oracle, I found that we can include or exclude partitions using
'PARTITIONEXCLUDE' [2], but did not find something similar to
publish_via_partition_root or where partitions are published as
separate tables.
What are your thoughts on the above behaviour?
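
The four cases above can be summarised with a small Python model. This is a sketch that only reproduces the observed results, not the logic in the patch: the function name and arguments are hypothetical, and the result for part2 in case 2.II (replicated) is an assumption, since only part1 is mentioned above.

```python
def is_change_replicated(leaf, ancestors, excepted, via_root):
    """Model whether a change made to partition `leaf` is replicated.

    ancestors: parents of `leaf`, nearest first, partition root last.
    excepted:  set of table names listed in the EXCEPT clause.
    via_root:  value of publish_via_partition_root.
    """
    if via_root and ancestors:
        # Changes are published as the root table, so only the root's
        # presence in the EXCEPT list matters (cases 1.I and 1.II).
        return ancestors[-1] not in excepted
    # Otherwise the leaf itself or any excluded ancestor blocks the change
    # (cases 2.I and 2.II).
    return leaf not in excepted and not any(a in excepted for a in ancestors)
```

For example, is_change_replicated("part1", ["t1"], {"part1"}, True) is True in this model, which corresponds to case 1.II.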

I have addressed the comments and added the changes in the latest v16 patch [1].
[1]:https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com
[2]: https://docs.oracle.com/en/middleware/goldengate/core/23/reference/partition-partitionexclude.html
Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok.

Some review comments for patch v16-0003.

======
Commit message

1.
The column "prexcept" of system catalog "pg_publication_rel" is set to
"true" when publication is created with EXCEPT table or EXCEPT column
list. If column "prattrs" of system catalog "pg_publication_rel" is also
set or column "puballtables" of system catalog "pg_publication" is
"false", it indicates the column list is specified with EXCEPT clause
and columns in "prattrs" are excluded from being published.

~

Somehow, this seems to contain too much information, making it a bit
confusing. Can't you chop this down to something like below?

SUGGESTION
When column "prexcept" of system catalog "pg_publication_rel" is set
to "true", and column "prattrs" of system catalog "pg_publication_rel"
is not NULL, that means the publication was created with "EXCEPT
(column-list)", and the columns in "prattrs" will be excluded from
being published.
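
Put another way, the two catalog columns give four combinations. The Python sketch below is just a reading of the commit message text quoted above, not code from the patch; the function name and the returned labels are made up for illustration.

```python
def describe_pub_rel(prexcept, prattrs):
    """Interpret a pg_publication_rel row (hypothetical helper).

    prexcept: value of the "prexcept" column.
    prattrs:  value of the "prattrs" column, or None when it is NULL.
    """
    if prexcept and prattrs is not None:
        # EXCEPT (column-list): the listed columns are not published.
        return "EXCEPT (column-list)"
    if prexcept:
        # EXCEPT TABLE: the whole table is excluded from the publication.
        return "EXCEPT TABLE"
    if prattrs is not None:
        # Ordinary column list: only the listed columns are published.
        return "column list"
    return "all columns"
```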

======
doc/src/sgml/logical-replication.sgml

2.
    Generated columns can also be specified in a column list. This allows
    generated columns to be published, regardless of the publication parameter
    <link linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link>. Generated columns can be
+   specified in a column list using the <literal>EXCEPT</literal> clause. This
+   excludes the specified generated columns from being published, regardless of
+   the <link linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link> setting. However, for
+   generated columns that are not listed in the <literal>EXCEPT</literal>
+   clause, whether they are published or not still depends on the value of
+   <link linkend="sql-createpublication-params-with-publish-generated-columns">
    <literal>publish_generated_columns</literal></link>. See
    <xref linkend="logical-replication-gencols"/> for details.
   </para>

~~

For this part:

"Generated columns can be specified in a column list using the
<literal>EXCEPT</literal> clause. This excludes the specified
generated columns from being published, regardless of..."

I think the whole paragraph already said "Generated columns can also
be specified in a column list", so you don't need to repeat it.
Instead, maybe say something like below.

SUGGESTION
Specifying generated columns in a column list using the
<literal>EXCEPT</literal> clause excludes those columns from being
published, regardless of...

~~~

3.
-                               Publication p1
-  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
-----------+------------+---------+---------+---------+-----------+----------
- postgres | f          | t       | t       | t       | t         | f
+                                        Publication p1
+ Owner  | All tables | Inserts | Updates | Deletes | Truncates | Generated columns | Via root
+--------+------------+---------+---------+---------+-----------+-------------------+----------
+ ubuntu | f          | t       | t       | t       | t         | none              | f
 Tables:
     "public.t1" (id, a, b, d)
+    "public.t2" EXCEPT (a, d)
 </programlisting></para>


I noticed the Owner changed from "postgres" to "ubuntu". Do you think
it is better to keep this as "postgres" for the example?

======
doc/src/sgml/ref/create_publication.sgml

4.
The tables added to a publication that publishes UPDATE and/or DELETE
operations must have REPLICA IDENTITY defined. Otherwise those
operations will be disallowed on those tables.

In order for UPDATE or DELETE operations to work, all the REPLICA
IDENTITY columns must be published. So, any column list must name all
REPLICA IDENTITY columns, and any EXCEPT column list must not name any
REPLICA IDENTITY columns.

A row filter expression (i.e., the WHERE clause) must contain only
columns that are covered by the REPLICA IDENTITY, in order for UPDATE
and DELETE operations to be published. For publication of INSERT
operations, any column may be used in the WHERE expression. The row
filter allows simple expressions that don't have user-defined
functions, user-defined operators, user-defined types, user-defined
collations, non-immutable built-in functions, or references to system
columns.

The generated columns that are part of the column list specified with
the EXCEPT clause are not published, regardless of the
publish_generated_columns option. However, generated columns that are
not part of the column list specified with the EXCEPT clause are
published according to the value of the publish_generated_columns
option. See Section 29.6 for details.

The generated columns that are part of REPLICA IDENTITY must be
published explicitly either by listing them in the column list or by
enabling the publish_generated_columns option, in order for UPDATE and
DELETE operations to be published.

~~

Notice all those 5 paragraphs (above) are talking about REPLICA
IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be
moved to last, to keep all the REPLICA IDENTITY stuff together.

======
src/backend/catalog/pg_publication.c

5. pub_form_cols_map

  * Returns a bitmap representing the columns of the specified table.
  *
  * Generated columns are included if include_gencols_type is
- * PUBLISH_GENCOLS_STORED.
+ * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from
+ * the column list.
  */
 Bitmapset *
-pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type)
+pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type,
+   Bitmapset *except_cols)

Forgot to add the underscore in the function comment.

/exceptcols/except_cols/

~~~

6. pg_get_publication_tables

+
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and not
+ * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that
+ * prattrs contains columns to be excluded for replication.
+ */
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull && DatumGetBool(exceptDatum) && !nulls[2])
+ except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);

But, you cannot have EXCEPT for null column list, so shouldn't the
!nulls[2] check be done to also guard the SysCacheGetAttr call?

======
src/backend/parser/gram.y

7.

Shlok wrote [1-reply #11]
The main reason I used a separate 'opt_except_column_list' is that
'opt_column_list' can also be NULL, but a column list specified with
EXCEPT cannot be NULL. So, 'opt_except_column_list' is defined such that
it cannot be NULL.

~

Yeah, but IMO that leads to excessive duplicated code. I think the
code can perhaps be a lot simpler if the grammar is written more like
the synopsis:

e.g. TABLE name opt_EXCEPT opt_column_list

where - opt_EXCEPT is null, and opt_column_list is null... means no col list
where - opt_EXCEPT is null, and opt_column_list is not null... means
normal col list
where - opt_EXCEPT is not null, and opt_column_list not null... means
EXCEPT col list
where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR

So code it something like this (just adding opt_EXCEPT to the existing
productions)

%type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT
...
opt_EXCEPT:
EXCEPT { $$ = true; }
| /*EMPTY*/ { $$ = false; }
;
...
TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause
{
  $$ = makeNode(PublicationObjSpec);
  $$->pubobjtype = PUBLICATIONOBJ_TABLE;
  $$->pubtable = makeNode(PublicationTable);
  $$->pubtable->relation = $2;
  $$->pubtable->except = $3;
  $$->pubtable->columns = $4;
  if ($3 && !$4)
    ereport(ERROR,
      (errcode(ERRCODE_SYNTAX_ERROR),
      errmsg("EXCEPT without column list"),
      parser_errposition(@3)));
  $$->pubtable->whereClause = $5;
  $$->location = @1;
}

etc.
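
The four combinations listed above can also be written as a tiny Python decision function. This is only a sketch of the intended outcomes, not grammar code: the function name and return labels are hypothetical, and Python's SyntaxError stands in for the ereport(ERROR) call.

```python
def classify_table_spec(has_except, column_list):
    """Classify 'TABLE name opt_EXCEPT opt_column_list' (hypothetical helper).

    has_except:  True when the EXCEPT keyword was present.
    column_list: list of column names, or None when no list was given.
    """
    if has_except and not column_list:
        # EXCEPT must always be followed by a column list.
        raise SyntaxError("EXCEPT without column list")
    if has_except:
        return "EXCEPT column list"
    if column_list:
        return "normal column list"
    return "no column list"
```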

======
src/bin/psql/describe.c

8.
  if (!PQgetisnull(res, i, 3))
+ {
+ if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0)
+ appendPQExpBuffer(buf, " EXCEPT");
  appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3));
+ }

This growing list of columns makes it hard to understand this function
without looking back at the caller all the time. Maybe you can add a
function comment that at least explains what those attributes 1,2,3,4
represent?

======
src/bin/psql/tab-complete.in.c

9.
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET", "TABLE", MatchAny))
+ COMPLETE_WITH("EXCEPT");

Since it is not allowed to have an EXCEPT with no column list,
shouldn't this say "EXCEPT ("?

~~~

10.
  else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE", MatchAny) && !ends_with(prev_wd, ','))
- COMPLETE_WITH("WHERE (", "WITH (");
+ COMPLETE_WITH("EXCEPT", "WHERE (", "WITH (");

Ditto. Since it is not allowed to have an EXCEPT with no column list,
shouldn't this say "EXCEPT ("?


======
src/test/regress/expected/publication.out

11.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
+CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx;
+ERROR:  index "pub_test_except1_a_idx" for table "pub_test_except1" does not exist
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+ERROR:  cannot update table "pub_test_except1"
+DETAIL:  Column list used by the publication does not cover the replica identity.
+DROP INDEX pub_test_except1_ac_idx;


What's happening here? I'm not sure these are the kind of errors you
were trying to cause.

======
src/test/regress/sql/publication.sql

12.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
+ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;


SUGGESTION. Change that comment to:
Verify fails - EXCEPT col-list cannot...

~~~

13.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
+CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+DROP INDEX pub_test_except1_ac_idx;

SUGGESTION. Change that comment to:
Verify fails - EXCEPT col-list cannot...

~~~

14.
+-- Verify that so long as no clash between RI cols and the EXCEPT
+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+

That comment doesn't make sense. Missing words?

======
.../t/036_rep_changes_except_table.pl

15.
(I haven't reviewed this file in detail yet, but here is a general comment)

I know this patch currently lives in the same thread as all the EXCEPT
TABLE stuff, but that seems just happenstance to me. IMO, this is a
separate enhancement that just shares the keyword EXCEPT. So, I felt
it should have quite separate tests too.

e.g. How about: 037_rep_changes_except_collist.pl

======
[1] https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
shveta malik
Date:
On Sat, Jul 19, 2025 at 4:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Few more comments on 002:
> >
> > 5)
> > +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
> >  {
> >
> > + List    *exceptlist;
> > +
> > + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> >
> >
> > a) Here, we are assuming that the list provided by
> > GetPublicationRelations() will be except-tables list only, but there
> > is no validation of that.
> > b) We are using GetPublicationRelations() to get the relations which
> > are excluded from the publication. The name of function and comments
> > atop function are not in alignment with this usage.
> >
> > Suggestion:
> > We can have a new GetPublicationExcludeRelations() function for the
> > concerned usage. The existing logic of GetPublicationRelations() can
> > be shifted to a new internal-logic function which will accept a
> > 'except-flag' as well. Both GetPublicationRelations() and
> > GetPublicationExcludeRelations() can call that new function by passing
> > 'except-flag' as false and true respectively. The new internal
> > function will validate 'prexcept' against that except-flag passed and
> > will return the results.
> >
> I have made the above change.
>
>
> > 6)
> > Before your patch002, GetTopMostAncestorInPublication() was checking
> > pg_publication_rel and pg_publication_namespace to find out if the
> > table in the ancestor-list is part of the given publication. Neither
> > pg_publication_rel nor pg_publication_namespace had an entry for
> > "for all tables" publications. That means
> > GetTopMostAncestorInPublication() was originally not checking whether
> > the given puboid is a "for all tables" publication to see if a rel
> > belongs to that particular pub or not.
> >
> > But now with the current change, we do check if pub is all-tables pub,
> > if so, return relid and mark ancestor_level (provided table is not
> > part of the except list).  IIUC, the result in 2 cases may be
> > different. Is that the intention? Let me know if my understanding is
> > wrong.
> >
> This is intentional, in function get_rel_sync_entry, we are setting
> pub_relid to the topmost published ancestor. In HEAD we are directly
> setting using:
>             /*
>              * If this is a FOR ALL TABLES publication, pick the partition
>              * root and set the ancestor level accordingly.
>              */
>             if (pub->alltables)
>             {
>                 publish = true;
>                 if (pub->pubviaroot && am_partition)
>                 {
>                     List       *ancestors = get_partition_ancestors(relid);
>
>                     pub_relid = llast_oid(ancestors);
>                     ancestor_level = list_length(ancestors);
>                 }
>             }
> In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
> ancestor for the FOR ALL TABLES case.
> But with this proposal that is no longer valid, as
> 'llast_oid(ancestors)' may be excluded from the publication. To
> handle this, the change was made in GetTopMostAncestorInPublication.
>
>
> Also, during testing with a partitioned table and
> publish_via_partition_root, the behaviour of the current patch is as
> below:
> For example, we have a partitioned table t1 with partitions part1
> and part2. Now consider the following cases:
> 1. with publish_via_partition_root = true
>      I. If we create a publication on all tables with EXCEPT t1, no data
> for t1, part1 or part2 is replicated.
>      II. If we create a publication on all tables with EXCEPT part1,
> data for all tables t1, part1 and part2 is replicated.
> 2. with publish_via_partition_root = false
>      I. If we create a publication on all tables with EXCEPT t1, no data
> for t1, part1 or part2 is replicated.
>      II. If we create a publication on all tables with EXCEPT part1,
> data for part1 is not replicated.
>
> Is this behaviour fine?
> I checked other databases such as MySQL and SQL Server. They do
> not have such cases, as either the whole partitioned table is
> replicated or it is not replicated at all; there is no partition-level
> control.
> For Oracle, I found that we can include or exclude partitions using
> 'PARTITIONEXCLUDE' [2], but did not find anything similar to
> publish_via_partition_root or where partitions are published as
> separate tables.
> What are your thoughts on the above behaviour?
>
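
The four reported cases can be summarised as one small decision function.
This is only a minimal sketch of the behaviour described above, not code
from the patch: the function and parameter names are hypothetical, and it
assumes that with publish_via_partition_root = false and EXCEPT part1 the
other partition (part2) is still replicated, which the report implies but
does not state.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the reported behaviour for a change made to one
 * partition of t1 in a FOR ALL TABLES publication with an EXCEPT list.
 *
 * via_root            value of publish_via_partition_root
 * root_excepted       the partitioned table t1 is in the EXCEPT list
 * partition_excepted  the partition itself is in the EXCEPT list
 */
bool
partition_change_replicated(bool via_root, bool root_excepted,
                            bool partition_excepted)
{
    /* Cases 1.I and 2.I: excluding the root excludes every partition. */
    if (root_excepted)
        return false;

    /* Case 1.II: changes are published via t1, so EXCEPT part1 has no effect. */
    if (via_root)
        return true;

    /* Case 2.II: partitions are published individually. */
    return !partition_excepted;
}
```

A check such as assert(partition_change_replicated(true, false, true))
corresponds to case 1.II: part1 is named in EXCEPT but is still replicated
through t1.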

Thank You for the details. I will review this behaviour soon and will
let you know my comments. Meanwhile, please find a few comments on
v16-0001:

1)
We call LockSchemaList() everywhere before we call
PublicationDropSchemas() to prevent concurrent schema deletion. Do we
need that in the reset flow as well?

2)
+ /* Drop the schemas associated with the publication */
+ schemas = GetPublicationSchemas(pubid);
+ PublicationDropSchemas(pubid, schemas, true);
+
+ /* Get all relations associated with the publication */
+ relids = GetPublicationRelations(pubid, PUBLICATION_PART_ROOT);

We can rename schemas to schemaids, similar to relids, as
GetPublicationSchemas returns OIDs.

3)
+ /* Drop the relations associated with the publication */
+ PublicationDropTables(pubform->oid, rels, true);

We can pass 'pubid' here instead of pubform->oid.

4)
Shall we modify the comments:
'Drop the relations associated with the publication' to 'Remove the
associated relations from the publication'
'Drop the schemas associated with the publication'  to 'Remove the
associated schemas from the publication'

Similar changes can be done in test file's comments as well
--Verify that tables associated with the publication are dropped after
RESET
--Verify that schemas associated with the publication are dropped after RESET

thanks
Shveta



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 21 Jul 2025 at 12:17, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> Some review comments for patch v16-0003.
>
> ======
> Commit message
>
> 1.
> The column "prexcept" of system catalog "pg_publication_rel" is set to
> "true" when publication is created with EXCEPT table or EXCEPT column
> list. If column "prattrs" of system catalog "pg_publication_rel" is also
> set or column "puballtables" of system catalog "pg_publication" is
> "false", it indicates the column list is specified with EXCEPT clause
> and columns in "prattrs" are excluded from being published.
>
> ~
>
> Somehow, this seems to contain too much information, making it a bit
> confusing. Can't you chop this down to something like below?
>
> SUGGESTION
> When column "prexcept" of system catalog "pg_publication_rel" is set
> to "true", and column "prattrs" of system catalog "pg_publication_rel"
> is not NULL, that means the publication was created with "EXCEPT
> (column-list)", and the columns in "prattrs" will be excluded from
> being published.
>
Modified the commit message as per suggestion.

> ======
> doc/src/sgml/logical-replication.sgml
>
> 2.
>     Generated columns can also be specified in a column list. This allows
>     generated columns to be published, regardless of the publication parameter
>     <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link>. Generated
> columns can be
> +   specified in a column list using the <literal>EXCEPT</literal> clause. This
> +   excludes the specified generated columns from being published, regardless of
> +   the <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> setting. However, for
> +   generated columns that are not listed in the <literal>EXCEPT</literal>
> +   clause, whether they are published or not still depends on the value of
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
>     <literal>publish_generated_columns</literal></link>. See
>     <xref linkend="logical-replication-gencols"/> for details.
>    </para>
>
> ~~
>
> For this part:
>
> "Generated columns can be specified in a column list using the
> <literal>EXCEPT</literal> clause. This excludes the specified
> generated columns from being published, regardless of..."
>
> I think the whole paragraph already said "Generated columns can also
> be specified in a column list", so you don't need to repeat it.
> Instead, maybe say something like below.
>
> SUGGESTION
> Specifying generated columns in a column list using the
> <literal>EXCEPT</literal> clause excludes those columns from being
> published, regardless of...
>
> ~~~
>
Modified

> 3.
> -                               Publication p1
> -  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> -----------+------------+---------+---------+---------+-----------+----------
> - postgres | f          | t       | t       | t       | t         | f
> +                                        Publication p1
> + Owner  | All tables | Inserts | Updates | Deletes | Truncates | Generated columns | Via root
> +--------+------------+---------+---------+---------+-----------+-------------------+----------
> + ubuntu | f          | t       | t       | t       | t         | none              | f
>  Tables:
>      "public.t1" (id, a, b, d)
> +    "public.t2" EXCEPT (a, d)
>  </programlisting></para>
>
>
> I noticed the Owner changed from "postgres" to "ubuntu". Do you think
> it is better to keep this as "postgres" for the example?
I agree that it is better to keep "postgres". I have reverted the example
to use "postgres".

>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 4.
> The tables added to a publication that publishes UPDATE and/or DELETE
> operations must have REPLICA IDENTITY defined. Otherwise those
> operations will be disallowed on those tables.
>
> In order for UPDATE or DELETE operations to work, all the REPLICA
> IDENTITY columns must be published. So, any column list must name all
> REPLICA IDENTITY columns, and any EXCEPT column list must not name any
> REPLICA IDENTITY columns.
>
> A row filter expression (i.e., the WHERE clause) must contain only
> columns that are covered by the REPLICA IDENTITY, in order for UPDATE
> and DELETE operations to be published. For publication of INSERT
> operations, any column may be used in the WHERE expression. The row
> filter allows simple expressions that don't have user-defined
> functions, user-defined operators, user-defined types, user-defined
> collations, non-immutable built-in functions, or references to system
> columns.
>
> The generated columns that are part of the column list specified with
> the EXCEPT clause are not published, regardless of the
> publish_generated_columns option. However, generated columns that are
> not part of the column list specified with the EXCEPT clause are
> published according to the value of the publish_generated_columns
> option. See Section 29.6 for details.
>
> The generated columns that are part of REPLICA IDENTITY must be
> published explicitly either by listing them in the column list or by
> enabling the publish_generated_columns option, in order for UPDATE and
> DELETE operations to be published.
>
> ~~
>
> Notice all those 5 paragraphs (above) are talking about REPLICA
> IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be
> moved to last, to keep all the REPLICA IDENTITY stuff together.
>
Fixed

> ======
> src/backend/catalog/pg_publication.c
>
> 5. pub_form_cols_map
>
>   * Returns a bitmap representing the columns of the specified table.
>   *
>   * Generated columns are included if include_gencols_type is
> - * PUBLISH_GENCOLS_STORED.
> + * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from
> + * the column list.
>   */
>  Bitmapset *
> -pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type)
> +pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type,
> +   Bitmapset *except_cols)
>
> Forgot to add the underscore in the function comment.
>
> /exceptcols/except_cols/
>
Fixed

> ~~~
>
> 6. pg_get_publication_tables
>
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and not
> + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that
> + * prattrs contains columns to be excluded for replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum) && !nulls[2])
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
>
> But, you cannot have EXCEPT for null column list, so shouldn't the
> !nulls[2] check be done to also guard the SysCacheGetAttr call?
>
Fixed

> ======
> src/backend/parser/gram.y
>
> 7.
>
> Shlok wrote [1-reply #11]
> The main reason I used a separate 'opt_except_column_list' is because
> 'opt_column_list' can also be NULL. But the column list specified with
> EXCEPT cannot be NULL. So, 'opt_except_column_list' is defined such that
> it cannot be null.
>
> ~
>
> Yeah, but IMO that leads to excessive duplicated code. I think the
> code can perhaps be a lot simpler if the grammar is written more like
> the synopsis:
>
> e.g. TABLE name opt_EXCEPT opt_column_list
>
> where - opt_EXCEPT is null, and opt_column_list is null... means no col list
> where - opt_EXCEPT is null, and opt_column_list is not null... means
> normal col list
> where - opt_EXCEPT is not null, and opt_column_list not null... means
> EXCEPT col list
> where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR
>
> So code it something like this (just adding opt_EXCEPT to the existing
> productions)
>
> %type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT
> ...
> opt_EXCEPT:
> EXCEPT { $$ = true; }
> | /*EMPTY*/ { $$ = false; }
> ;
> ...
> TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause
> {
>   $$ = makeNode(PublicationObjSpec);
>   $$->pubobjtype = PUBLICATIONOBJ_TABLE;
>   $$->pubtable = makeNode(PublicationTable);
>   $$->pubtable->relation = $2;
>   $$->pubtable->except = $3;
>   $$->pubtable->columns = $4;
>   if ($3 && !$4)
>     ereport(ERROR,
>       (errcode(ERRCODE_SYNTAX_ERROR),
>       errmsg("EXCEPT without column list"),
>       parser_errposition(@3)));
>   $$->pubtable->whereClause = $5;
>   $$->location = @1;
> }
>
> etc.
>
I have modified it. I have created a function 'check_except_collist'
to throw the error, to avoid duplicating the error-message code.
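
The four combinations listed in the review comment above can be sketched as
a standalone C function. This is only an illustration of the decision table:
CollistKind and classify_collist are hypothetical names, and this is not the
body of 'check_except_collist' or any code from gram.y.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of parsing "TABLE name [EXCEPT] [(column-list)]". */
typedef enum
{
    COLLIST_NONE,           /* no EXCEPT, no column list */
    COLLIST_NORMAL,         /* no EXCEPT, column list given */
    COLLIST_EXCEPT,         /* EXCEPT followed by a column list */
    COLLIST_SYNTAX_ERROR    /* EXCEPT without a column list */
} CollistKind;

/*
 * has_except  stands for opt_EXCEPT being true
 * has_columns stands for opt_column_list being non-NULL
 */
CollistKind
classify_collist(bool has_except, bool has_columns)
{
    if (has_except)
        return has_columns ? COLLIST_EXCEPT : COLLIST_SYNTAX_ERROR;

    return has_columns ? COLLIST_NORMAL : COLLIST_NONE;
}
```

Only one of the four combinations is rejected, for example
assert(classify_collist(true, false) == COLLIST_SYNTAX_ERROR), which is why
a single shared error check is enough.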

> ======
> src/bin/psql/describe.c
>
> 8.
>   if (!PQgetisnull(res, i, 3))
> + {
> + if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0)
> + appendPQExpBuffer(buf, " EXCEPT");
>   appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3));
> + }
>
> This growing list of columns makes it hard to understand this function
> without looking back at the caller all the time. Maybe you can add a
> function comment that at least explains what those attributes 1,2,3,4
> represent?
>
Added a comment

> ======
> src/bin/psql/tab-complete.in.c
>
> 9.
> + else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET",
> "TABLE", MatchAny))
> + COMPLETE_WITH("EXCEPT");
>
> Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

> ~~~
>
> 10.
>   else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE",
> MatchAny) && !ends_with(prev_wd, ','))
> - COMPLETE_WITH("WHERE (", "WITH (");
> + COMPLETE_WITH("EXCEPT", "WHERE (", "WITH (");
>
> Ditto. Since it is not allowed to have an EXCEPT with no column list,
> shouldn't this say "EXCEPT ("?
>
Fixed

>
> ======
> src/test/regress/expected/publication.out
>
> 11.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +ERROR:  index "pub_test_except1_a_idx" for table "pub_test_except1"
> does not exist
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +ERROR:  cannot update table "pub_test_except1"
> +DETAIL:  Column list used by the publication does not cover the
> replica identity.
> +DROP INDEX pub_test_except1_ac_idx;
>
>
> What's happening here? I'm not sure these are the kind of errors you
> were trying to cause.
>
Yes, it is not the error I was trying to cause. I have modified it.

> ======
> src/test/regress/sql/publication.sql
>
> 12.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
>
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 13.
> +-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
> +CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +DROP INDEX pub_test_except1_ac_idx;
>
> SUGGESTION. Change that comment to:
> Verify fails - EXCEPT col-list cannot...
>
Fixed

> ~~~
>
> 14.
> +-- Verify that so long as no clash between RI cols and the EXCEPT
> +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
> +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
> pub_test_except1_a_idx;
> +UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
> +
>
> That comment doesn't make sense. Missing words?
>
Fixed

> ======
> .../t/036_rep_changes_except_table.pl
>
> 15.
> (I haven't reviewed this file in detail yet, but here is a general comment)
>
> I know this patch currently lives in the same thread as all the EXCEPT
> TABLE stuff, but that seems just happenstance to me. IMO, this is a
> separate enhancement that just shares the keyword EXCEPT. So, I felt
> it should have quite separate tests too.
>
> e.g. How about: 037_rep_changes_except_collist.pl
>
Modified

> ======
> [1] https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 21 Jul 2025 at 16:22, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sat, Jul 19, 2025 at 4:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Few more comments on 002:
> > >
> > > 5)
> > > +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
> > >  {
> > >
> > > + List    *exceptlist;
> > > +
> > > + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> > >
> > >
> > > a) Here, we are assuming that the list provided by
> > > GetPublicationRelations() will be except-tables list only, but there
> > > is no validation of that.
> > > b) We are using GetPublicationRelations() to get the relations which
> > > are excluded from the publication. The name of function and comments
> > > atop function are not in alignment with this usage.
> > >
> > > Suggestion:
> > > We can have a new GetPublicationExcludeRelations() function for the
> > > concerned usage. The existing logic of GetPublicationRelations() can
> > > be shifted to a new internal-logic function which will accept a
> > > 'except-flag' as well. Both GetPublicationRelations() and
> > > GetPublicationExcludeRelations() can call that new function by passing
> > > 'except-flag' as false and true respectively. The new internal
> > > function will validate 'prexcept' against that except-flag passed and
> > > will return the results.
> > >
> > I have made the above change.
> >
> >
> > > 6)
> > > Before your patch002, GetTopMostAncestorInPublication() was checking
> > > pg_publication_rel and pg_publication_namespace to find out if the
> > > table in the ancestor-list is part of a given publication. Neither
> > > pg_publication_rel nor pg_publication_namespace has entries for
> > > "for all tables" publications. That means
> > > GetTopMostAncestorInPublication() was originally not checking whether
> > > the given puboid is a "for all tables" publication to see if a rel
> > > belongs to that particular pub or not.
> > >
> > > But now with the current change, we do check if pub is all-tables pub,
> > > if so, return relid and mark ancestor_level (provided table is not
> > > part of the except list).  IIUC, the result in 2 cases may be
> > > different. Is that the intention? Let me know if my understanding is
> > > wrong.
> > >
> > This is intentional. In function get_rel_sync_entry, we set
> > pub_relid to the topmost published ancestor. In HEAD we set it
> > directly using:
> >             /*
> >              * If this is a FOR ALL TABLES publication, pick the partition
> >              * root and set the ancestor level accordingly.
> >              */
> >             if (pub->alltables)
> >             {
> >                 publish = true;
> >                 if (pub->pubviaroot && am_partition)
> >                 {
> >                     List       *ancestors = get_partition_ancestors(relid);
> >
> >                     pub_relid = llast_oid(ancestors);
> >                     ancestor_level = list_length(ancestors);
> >                 }
> >             }
> > In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
> > ancestor for the FOR ALL TABLES case.
> > But with this proposal that is no longer valid, as
> > 'llast_oid(ancestors)' may be excluded from the publication. To
> > handle this, the change was made in GetTopMostAncestorInPublication.
> >
> >
> > Also, during testing with a partitioned table and
> > publish_via_partition_root, the behaviour of the current patch is as
> > below:
> > For example, we have a partitioned table t1 with partitions part1
> > and part2. Now consider the following cases:
> > 1. with publish_via_partition_root = true
> >      I. If we create a publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
> >      II. If we create a publication on all tables with EXCEPT part1,
> > data for all tables t1, part1 and part2 is replicated.
> > 2. with publish_via_partition_root = false
> >      I. If we create a publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
> >      II. If we create a publication on all tables with EXCEPT part1,
> > data for part1 is not replicated.
> >
> > Is this behaviour fine?
> > I checked other databases such as MySQL and SQL Server. They do
> > not have such cases, as either the whole partitioned table is
> > replicated or it is not replicated at all; there is no partition-level
> > control.
> > For Oracle, I found that we can include or exclude partitions using
> > 'PARTITIONEXCLUDE' [2], but did not find anything similar to
> > publish_via_partition_root or where partitions are published as
> > separate tables.
> > What are your thoughts on the above behaviour?
> >
>
> Thank You for the details. I will review this behaviour soon and will
> let you know my comments. Meanwhile, please find a few comments on
> v16-0001:
>
> 1)
> we do LockSchemaList() everywhere before we call
> PublicationDropSchemas() to prevent concurrent schema deletion. Do we
> need that in reset flow as well?
Added

>
> 2)
> + /* Drop the schemas associated with the publication */
> + schemas = GetPublicationSchemas(pubid);
> + PublicationDropSchemas(pubid, schemas, true);
> +
> + /* Get all relations associated with the publication */
> + relids = GetPublicationRelations(pubid, PUBLICATION_PART_ROOT);
>
> We can rename schemas to schemaids similar to relids, as
> GetPublicationSchemas return oids.
>
Fixed

> 3)
> + /* Drop the relations associated with the publication */
> + PublicationDropTables(pubform->oid, rels, true);
>
> we can pass 'pubid' here instead of pubform->oid
>
Modified

> 4)
> Shall we modify the comments:
> 'Drop the relations associated with the publication' to 'Remove the
> associated relations from the publication'
> 'Drop the schemas associated with the publication'  to 'Remove the
> associated schemas from the publication'
>
> Similar changes can be done in test file's comments as well
> --Verify that tables associated with the publication are dropped after
> RESET
> --Verify that schemas associated with the publication are dropped after RESET
>
Fixed

I have made the changes in the latest v17 patch [1].
[1]: https://www.postgresql.org/message-id/CANhcyEUtYV-9ujtxLasnxN_peT%2B3LuZjcRx1xUECh1CCmANB8w%40mail.gmail.com

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok,

Some review comments for patch v17-0003. I also checked the TAP test this time.

======
doc/src/sgml/logical-replication.sgml

1.
+   <literal>publish_generated_columns</literal></link>. Specifying generated
+   columns in a column list using the <literal>EXCEPT</literal> clause excludes
+   the specified generated columns from being published, regardless of the
+   <link linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link> setting. However, for

I think that is not quite the same wording I had previously suggested.
It sounds a bit odd/redundant saying "Specifying" and "specified" in
the same sentence.

======
src/backend/parser/gram.y

2. check_except_collist

I'm wondering if this checking should be done within the existing
preprocess_pubobj_list() function, alongside all the other ERROR
checking. Care needs to be taken to make sure the pubtable->except is
referring to an EXCEPT (col-list), instead of the other kind of EXCEPT
tables, but in general I think it is better to keep all the
publication combinations checking errors like this in one place.


======
src/bin/psql/describe.c

3. addFooterToPublicationDesc

- appendPQExpBuffer(&buf, " (%s)",
-   PQgetvalue(result, i, 2));
+ {
+ if (!PQgetisnull(result, i, 3) &&
+ strcmp(PQgetvalue(result, i, 3), "t") == 0)
+ appendPQExpBuffer(&buf, " EXCEPT (%s)",
+   PQgetvalue(result, i, 2));
+ else
+ appendPQExpBuffer(&buf, " (%s)",
+   PQgetvalue(result, i, 2));
+ }

Do you really need to check !PQgetisnull(result, i, 3) here?  (e.g.
The comment does not say that this attribute can be NULL)
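
For illustration, the branch in the hunk above reduces to choosing one of
two format strings. The sketch below is standalone C that uses snprintf
rather than psql's PQExpBuffer, and it assumes the except attribute has
already been reduced to a plain boolean; format_collist is a hypothetical
name, not a function in describe.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Write the column-list suffix for one table line of the publication
 * footer into buf. Returns the length of the full suffix, following
 * snprintf semantics (the return value does not depend on buflen).
 */
int
format_collist(char *buf, size_t buflen, const char *cols, bool is_except)
{
    if (is_except)
        return snprintf(buf, buflen, " EXCEPT (%s)", cols);

    return snprintf(buf, buflen, " (%s)", cols);
}
```

Called with a 64-byte buffer, the column list "a, d" and is_except = true,
this writes the suffix " EXCEPT (a, d)"; with is_except = false it writes
" (a, d)".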

======
.../t/037_rep_changes_except_collist.pl

4.
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+# Logical replication tests for except table publications

Comment is wrong. These tests are for EXCEPT (column-list)

~~~

5.
+# Test for except column publications
+# Initial setup
+$node_publisher->safe_psql('postgres', "CREATE SCHEMA sch1");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE sch1.tab2 (a int, b int, c int)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab3 (a int, b int, c int)");
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab4 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
c int GENERATED ALWAYS AS (a * 3) STORED)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab5 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
c int GENERATED ALWAYS AS (a * 3) STORED)"
+);
+$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO sch1.tab2 VALUES (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_col FOR TABLE tab2 EXCEPT (a), sch1.tab2
EXCEPT (b, c)"
+);

5a.
I think you don't need to say "Test for except column publications",
because that is the purpose of the entire file.

~

5b.
You can combine multiple of these safe_psql calls together

~

5c.
It might help make the tests easier to read if you named those generated
columns 'b' and 'c' as 'bgen' and 'cgen' instead.

~

5d.
The table names are strange, because why does it start at tab2 when
there is not a tab1?

~~~

6.
+$node_subscriber->safe_psql('postgres', "CREATE SCHEMA sch1");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE sch1.tab2 (a int, b int, c int)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab3 (a int, b int, c int)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab4 (a int, b int, c int)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab5 (a int, b int, c int)");

You can combine multiple of these safe_psql calls together

~~~

7.
+# Test initial sync
+my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2");
+is($result, qq(|2|3),
+ 'check that initial sync for except column publication');

The message seems strange. Do you mean "check initial sync for an
'EXCEPT (column-list)' publication"?

NOTE: There are many other messages where you wrote "for except column
publication" but I think maybe all of those can be improved a bit like
above.

~~~

8.
+$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (4, 5, 6)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO sch1.tab2 VALUES (4, 5, 6)");
+$node_publisher->wait_for_catchup('tap_sub_col');

8a.
You can combine multiple of these safe_psql calls together.

NOTE: I won't keep repeating this review comment, but I think there are
probably many more places where the safe_psql calls can be combined into
one call that executes multiple statements.

~

8b.
I felt all those commands should be under the "Test incremental
changes" comment.

~~~

9.
+is($result, qq(1||3), 'check alter publication with EXCEPT');

Maybe that should've said with 'EXCEPT (column-list)'

~~~

10.
+# Test for publication created with publish_generated_columns as true on table
+# with generated columns and column list specified with EXCEPT
+$node_publisher->safe_psql('postgres', "INSERT INTO tab4 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_col SET (publish_generated_columns)");
+$node_publisher->safe_psql('postgres',
+ "ALTER PUBLICATION tap_pub_col SET TABLE tab4 EXCEPT(b)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_col REFRESH PUBLICATION");
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_col');

10a.
I felt the test comments for both of those generated columns parameter
tests should give more explanation of what the expected result is
and why.

~

10b.
How does "ALTER PUBLICATION tap_pub_col SET
(publish_generated_columns)" even work? I thought
"publish_generated_columns" is an enum, but you did not specify any enum
value here (???)

~~~

11.
+ 'check publication(publish_generated_columns as false) with
generated columns and EXCEPT'

Hmm. I thought there is no such thing as "publish_generated_columns as
false", and also the EXCEPT should say 'EXCEPT (column-list)'

~~~

12.
I wonder if there should be another boundary condition test case as follows:
- have some table with cols a,b,c.
- create a publication 'EXCEPT (a,b,c)', so you don't publish anything at all.
- then ALTER the TABLE to add a column 'd'.
- now the publication should publish only 'd'.
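
The expected outcome of that boundary case can be sketched with plain bit
masks. This is a simplified stand-in for the Bitmapset handling in
pub_form_cols_map, under the assumption that the EXCEPT list is stored per
column and is not changed by ALTER TABLE ... ADD COLUMN; the COL_* macros
and published_columns are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per column of the hypothetical test table. */
#define COL_A (UINT32_C(1) << 0)
#define COL_B (UINT32_C(1) << 1)
#define COL_C (UINT32_C(1) << 2)
#define COL_D (UINT32_C(1) << 3)

/* Published columns are the table's columns minus the EXCEPT columns. */
uint32_t
published_columns(uint32_t table_cols, uint32_t except_cols)
{
    return table_cols & ~except_cols;
}
```

With EXCEPT (a, b, c) on a table that has only a, b and c, the result is
the empty set. After column d is added, the same EXCEPT list leaves exactly
COL_D.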

======

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
shveta malik
Date:
On Sat, Jul 19, 2025 at 4:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Few more comments on 002:
> >
> > 5)
> > +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
> >  {
> >
> > + List    *exceptlist;
> > +
> > + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> >
> >
> > a) Here, we are assuming that the list provided by
> > GetPublicationRelations() will be except-tables list only, but there
> > is no validation of that.
> > b) We are using GetPublicationRelations() to get the relations which
> > are excluded from the publication. The name of function and comments
> > atop function are not in alignment with this usage.
> >
> > Suggestion:
> > We can have a new GetPublicationExcludeRelations() function for the
> > concerned usage. The existing logic of GetPublicationRelations() can
> > be shifted to a new internal-logic function which will accept a
> > 'except-flag' as well. Both GetPublicationRelations() and
> > GetPublicationExcludeRelations() can call that new function by passing
> > 'except-flag' as false and true respectively. The new internal
> > function will validate 'prexcept' against that except-flag passed and
> > will return the results.
> >
> I have made the above change.

Thank You for the changes.

1)
But on rethinking, shall we make GetPublicationRelations() similar to :

/* Gets list of publication oids for a relation that matches the except_flag */
GetRelationPublications(Oid relid, bool except_flag)

i.e. we can have a single function GetPublicationRelations() taking
except_flag and comment can say: 'Gets list of relation oids for a
publication that matches the except_flag.'

We can get rid of GetPubIncludedOrExcludedRels() and
GetPublicationExcludeRelations().

Thoughts?


2)
We can rename except_table to except_flag to be consistent with
GetRelationPublications().

3)
+ if ((except_table && pubrel->prexcept) || !except_table)
+ result = GetPubPartitionOptionRelations(result, pub_partopt,
+ pubrel->prrelid);

3a)
In the case of '!except_table', we are not matching it with
'pubrel->prexcept', is that intentional?

3b)
Shall we simplify this similar to the changes in GetRelationPublications() i.e.
if (except_table/flag == pubrel->prexcept)
   result = GetPubPartitionOptionRelations(...)
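
To make the difference raised in 3a and 3b concrete, here are the two
conditions side by side as standalone C predicates. The function names are
hypothetical; only the boolean expressions are taken from the hunk and the
suggestion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as written in the quoted patch hunk. */
bool
keep_rel_current(bool except_table, bool prexcept)
{
    return (except_table && prexcept) || !except_table;
}

/* Condition suggested in 3b: keep only rows whose prexcept matches the flag. */
bool
keep_rel_suggested(bool except_flag, bool prexcept)
{
    return except_flag == prexcept;
}
```

The two predicates agree except when the flag is false and prexcept is
true: the current condition keeps that row and the suggested one drops it.
That is the case 3a asks about.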


>
> > 6)
> > Before your patch002, GetTopMostAncestorInPublication() was checking
> > pg_publication_rel and pg_publication_namespace to find out if the
> > table in the ancestor-list is part of a given publication. Neither
> > pg_publication_rel nor pg_publication_namespace has entries for
> > "for all tables" publications. That means
> > GetTopMostAncestorInPublication() was originally not checking whether
> > the given puboid is a "for all tables" publication to see if a rel
> > belongs to that particular pub or not.
> >
> > But now with the current change, we do check if pub is all-tables pub,
> > if so, return relid and mark ancestor_level (provided table is not
> > part of the except list).  IIUC, the result in 2 cases may be
> > different. Is that the intention? Let me know if my understanding is
> > wrong.
> >
> This is intentional, in function get_rel_sync_entry, we are setting
> pub_relid to the topmost published ancestor. In HEAD we are directly
> setting using:
>             /*
>              * If this is a FOR ALL TABLES publication, pick the partition
>              * root and set the ancestor level accordingly.
>              */
>             if (pub->alltables)
>             {
>                 publish = true;
>                 if (pub->pubviaroot && am_partition)
>                 {
>                     List       *ancestors = get_partition_ancestors(relid);
>
>                     pub_relid = llast_oid(ancestors);
>                     ancestor_level = list_length(ancestors);
>                 }
>             }
> In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
> ancestor for case of FOR ALL TABLES.
> But with this proposal. This change will no longer be valid as the
> 'llast_oid(ancestors)' may be excluded in the publication. So, to
> handle this change was made in GetTopMostAncestorInPublication.
>
>
> Also, during testing with the partitioned table and
> publish_via_partition_root the behaviour of the current patch is  as
> below:
> For example we have a partitioned table t1. It has partitions part1
> and part2. Now consider the following cases:
> 1. with publish_via_partition_root = true
>      I. If we create publication on all tables with EXCEPT t1, no data
> for t1, part1 or part2 is replicated.

Okay. Agreed.

>      II.  If we create publication on all tables with EXCEPT part1,
> data for all tables t1, part1 and part2 is replicated.

Okay. Is this because part1 changes are replicated through t1 and
since t1 changes are not restricted, part1 changes will also not be
restricted? In other words, part1 was never published directly in the
first place and thus 'EXCEPT part1' has no meaning when
'publish_via_partition_root' = true? IMO, it is in alignment with the
'publish_via_partition_root' definition but it might not be that
intuitive for users. So shall we emit a WARNING:

WARNING: Partition "part1" is excluded, but publish_via_partition_root
= true, so this will have no effect.
Thoughts?

> 2. with publish_via_partition_root = false
>      I. If we create publication on all tables with EXCEPT t1, no data
> for t1, part1 or part2 is replicated.

I think we should still publish the partitions here. Since
publish_via_partition_root is false, part1 and part2 are published
individually, so shall we allow publishing of part1 and part2
here? Thoughts?

>      II. If we create publication on all tables with EXCEPT part1,
> data for part1 is not replicated
>

Agreed.

thanks
Shveta



Re: Skipping schema changes in publication

From
shveta malik
Date:
Shlok, I was trying to validate the interaction of
'publish_via_partition_root' with 'EXCEPT' and found some unexpected
behaviour. Can you please review?

Pub:
---------
CREATE TABLE tab_root (range_col int,i int,j int) PARTITION BY RANGE
(range_col);
CREATE TABLE tab_part_1 PARTITION OF tab_root FOR VALUES FROM (1) to (1000);
CREATE TABLE tab_part_2 PARTITION OF tab_root FOR VALUES FROM (1000) to (2000);
create publication pub2 for all tables except tab_part_2 WITH
(publish_via_partition_root=true);

Sub (tables without partition):
--------
CREATE TABLE tab_root (range_col int,i int,j int);
CREATE TABLE tab_part_1(range_col int,i int,j int);
CREATE TABLE tab_part_2(range_col int,i int,j int);
create subscription sub2 connection '...' publication pub2;

Pub:
--------
insert into tab_part_2 values(1001,1,1);

On Sub, the above row is replicated as expected in tab_root due to
publish_via_partition_root=true on pub.

Now on Pub:
--------
alter publication pub2 set (publish_via_partition_root=false);
insert into tab_part_2 values(1002,2,2);

Now with publish_via_partition_root=false and 'except tab_part_2', the
above row is correctly ignored and not replicated on sub.

But when I try this:
insert into tab_part_1 values(1,1,1);
insert into tab_root values(5,5,5);

Expectation was that the above rows are replicated but that is not the
case. Can you please review? Please let me know if my understanding is
wrong.

thanks
Shveta



Re: Skipping schema changes in publication

From
shveta malik
Date:
I further tested the inherited-tables flow as well wrt ONLY and EXCEPT, and
it works well. But while reading the docs for the same, I have a few concerns.

1)
While explaining ONLY for EXCEPT, create-publication doc says this

+      This does not apply to a partitioned table, however.  The partitions of
+      a partitioned table are always implicitly considered part of the
+      publication, so they are never explicitly excluded from the publication.

I do not understand the last line: "so they are never explicitly
excluded from the publication" . But we can explicitly exclude them
using EXCEPT <partition_name>. Do you mean to say something else here?

2)
alter-publication doc says (in context of EXCEPT):

"If ONLY is specified before the table name, only that table is
affected. If ONLY is not specified, the table and all its descendant
tables (if any) are affected. Optionally, * can be specified after
the table name to explicitly indicate that descendant tables are
affected."

But it does not mention anything for partitions. I think we shall
mention here as well that this does not apply to a partitioned table.
(I tested ONLY and EXCEPT for partition-root. UNLIKE inherited tables,
ONLY has no impact on partitioned tables.)

3)
Shall we explain the relation of 'publish_via_partition_root' with
EXCEPT briefly in docs(once we conclude that design)?

Please note that I have performed all the tests (mentioned here and in
previous emails) on patch001 and patch002. patch003 is not applied in
these tests.

thanks
Shveta



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Tue, 22 Jul 2025 at 07:28, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> Some review comments for patch v17-0003. I also checked the TAP test this time.
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 1.
> +   <literal>publish_generated_columns</literal></link>. Specifying generated
> +   columns in a column list using the <literal>EXCEPT</literal> clause excludes
> +   the specified generated columns from being published, regardless of the
> +   <link linkend="sql-createpublication-params-with-publish-generated-columns">
> +   <literal>publish_generated_columns</literal></link> setting. However, for
>
> I think that is not quite the same wording I had previously suggested.
> It sounds a bit odd/redundant saying "Specifying" and "specified" in
> the same sentence.
>
> ======
> src/backend/parser/gram.y
>
> 2. check_except_collist
>
> I'm wondering if this checking should be done within the existing
> preprocess_pubobj_list() function, alongside all the other ERROR
> checking. Care needs to be taken to make sure the pubtable->except is
> referring to an EXCEPT (col-list), instead of the other kind of EXCEPT
> tables, but in general I think it is better to keep all the
> publication combinations checking errors like this in one place.
>
I have added the check in preprocess_pubobj_list(). I checked the syntax
and found that this function is not called for "FOR ALL TABLES" publications,
and EXCEPT tables can only be used with "FOR ALL TABLES" publications.
So I think no handling for "EXCEPT tables" is required in
preprocess_pubobj_list().

>
> ======
> src/bin/psql/describe.c
>
> 3. addFooterToPublicationDesc
>
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (!PQgetisnull(result, i, 3) &&
> + strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> +   PQgetvalue(result, i, 2));
> + else
> + appendPQExpBuffer(&buf, " (%s)",
> +   PQgetvalue(result, i, 2));
> + }
>
> Do you really need to check !PQgetisnull(result, i, 3) here?  (e.g.
> The comment does not say that this attribute can be NULL)
>
> ======
> .../t/037_rep_changes_except_collist.pl
>
> 4.
> +# Copyright (c) 2021-2025, PostgreSQL Global Development Group
> +
> +# Logical replication tests for except table publications
>
> Comment is wrong. These tests are for EXCEPT (column-list)
>
> ~~~
>
> 5.
> +# Test for except column publications
> +# Initial setup
> +$node_publisher->safe_psql('postgres', "CREATE SCHEMA sch1");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE sch1.tab2 (a int, b int, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab3 (a int, b int, c int)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab4 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
> c int GENERATED ALWAYS AS (a * 3) STORED)"
> +);
> +$node_publisher->safe_psql('postgres',
> + "CREATE TABLE tab5 (a int, b int GENERATED ALWAYS AS (a * 2) STORED,
> c int GENERATED ALWAYS AS (a * 3) STORED)"
> +);
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (1, 2, 3)");
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO sch1.tab2 VALUES (1, 2, 3)");
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_col FOR TABLE tab2 EXCEPT (a), sch1.tab2
> EXCEPT (b, c)"
> +);
>
> 5a.
> I think you don't need to say "Test for except column publications",
> because that is the purpose of thie entire file.
>
> ~
>
> 5b.
> You can combine multiple of these safe_psql calls together
>
> ~
>
> 5c.
> It might help make tests easier to read if you named those generated
> columns 'b', 'c' cols as 'bgen', 'cgen' instead.
>
> ~
> 5d.
> The table names are strange, because why does it start at tab2 when
> there is not a tab1?
> ~~~
>
> 6.
> +$node_subscriber->safe_psql('postgres', "CREATE SCHEMA sch1");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab2 (a int, b int NOT NULL, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE sch1.tab2 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab3 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab4 (a int, b int, c int)");
> +$node_subscriber->safe_psql('postgres',
> + "CREATE TABLE tab5 (a int, b int, c int)");
>
> You can combine multiple of these safe_psql calls together
>
> ~~~
>
> 7.
> +# Test initial sync
> +my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2");
> +is($result, qq(|2|3),
> + 'check that initial sync for except column publication');
>
> The message seems strange. Do you mean "check initial sync for an
> 'EXCEPT (column-list)' publication"
>
> NOTE: There are many other messages where you wrote "for except column
> publication" but I think maybe all of those can be improved a bit like
> above.
>
> ~~~
>
> 8.
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab2 VALUES (4, 5, 6)");
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO sch1.tab2 VALUES (4, 5, 6)");
> +$node_publisher->wait_for_catchup('tap_sub_col');
>
> 8a.
> You can combine multiple of these safe_psql calls together.
>
> NOTE: I won't keep repeating this review comment but I think maybe
> there are lots more places where the safe_psql can all be combined to
> expected multiple statements.
>
> ~
>
> 8b.
> I felt all those commands should be under the "Test incremental
> changes" comment.
>
> ~~~
>
> 9.
> +is($result, qq(1||3), 'check alter publication with EXCEPT');
>
> Maybe that should've said with 'EXCEPT (column-list)'
>
> ~~~
>
> 10.
> +# Test for publication created with publish_generated_columns as true on table
> +# with generated columns and column list specified with EXCEPT
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab4 VALUES (1)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER PUBLICATION tap_pub_col SET (publish_generated_columns)");
> +$node_publisher->safe_psql('postgres',
> + "ALTER PUBLICATION tap_pub_col SET TABLE tab4 EXCEPT(b)");
> +$node_subscriber->safe_psql('postgres',
> + "ALTER SUBSCRIPTION tap_sub_col REFRESH PUBLICATION");
> +$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_col');
>
> 10a.
> I felt the test comments for both those generated columns parameter
> test should give more explanation to say what is the expected result
> and why.
>
> ~
>
> 10b.
> How does "ALTER PUBLICATION tap_pub_col SET
> (publish_generated_columns)" even work? I thought the
> "pubish_generated_columns" is an enum but you did not specify any enum
> value here (???)
>
> ~~~
Yes, it works. It behaves the same as publish_generated_columns = stored.
Eg:
postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 with (publish_generated_columns);
CREATE PUBLICATION
postgres=# select * from pg_publication;
  oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot | pubgencols
-------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------+------------
 16395 | pub1    |       10 | f            | t         | t         | t         | t           | f          | s
(1 row)

For this patch, I have modified the test to use
'publish_generated_columns = stored'.

>
> 11.
> + 'check publication(publish_generated_columns as false) with
> generated columns and EXCEPT'
>
> Hmm. I thought there is no such thing as "publish_generated_columns as
> false", and also the EXCEPT should say 'EXCEPT (column-list)'
>
> ~~~
>
> 12.
> I wonder if there should be another boundary condition test case as follows:
> - have some table with cols a,b,c.
> - create a publication 'EXCEPT (a,b,c)', so you don't publish anything at all.
> - then ALTER the TABLE to add a column 'd'.
> - now the publication should publish only 'd'.
> ======

I have addressed all the comments and included the changes in the latest v18 patch.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Tue, 22 Jul 2025 at 14:29, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sat, Jul 19, 2025 at 4:17 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Few more comments on 002:
> > >
> > > 5)
> > > +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
> > >  {
> > >
> > > + List    *exceptlist;
> > > +
> > > + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> > >
> > >
> > > a) Here, we are assuming that the list provided by
> > > GetPublicationRelations() will be except-tables list only, but there
> > > is no validation of that.
> > > b) We are using GetPublicationRelations() to get the relations which
> > > are excluded from the publication. The name of function and comments
> > > atop function are not in alignment with this usage.
> > >
> > > Suggestion:
> > > We can have a new GetPublicationExcludeRelations() function for the
> > > concerned usage. The existing logic of GetPublicationRelations() can
> > > be shifted to a new internal-logic function which will accept a
> > > 'except-flag' as well. Both GetPublicationRelations() and
> > > GetPublicationExcludeRelations() can call that new function by passing
> > > 'except-flag' as false and true respectively. The new internal
> > > function will validate 'prexcept' against that except-flag passed and
> > > will return the results.
> > >
> > I have made the above change.
>
> Thank You for the changes.
>
> 1)
> But on rethinking, shall we make GetPublicationRelations() similar to :
>
> /* Gets list of publication oids for a relation that matches the except_flag */
> GetRelationPublications(Oid relid, bool except_flag)
>
> i.e. we can have a single function GetPublicationRelations() taking
> except_flag and comment can say: 'Gets list of relation oids for a
> publication that matches the except_flag.'
>
> We can get rid of GetPubIncludedOrExcludedRels() and
> GetPublicationExcludeRelations().
>
> Thoughts?
>
This seems reasonable to me. I have made the changes for the same.

>
> 2)
> we can rename except_table to except_flag to be consistent with
> GetRelationPublications()
>
> 3)
> + if ((except_table && pubrel->prexcept) || !except_table)
> + result = GetPubPartitionOptionRelations(result, pub_partopt,
> + pubrel->prrelid);
>
> 3a)
> In the case of '!except_table', we are not matching it with
> 'pubrel->prexcept', is that intentional?
>
> 3 b)
> Shall we simplify this similar to the changes in GetRelationPublications() i.e.
> if (except_table/flag == pubrel->prexcept)
>    result = GetPubPartitionOptionRelations(...)
>
>
> >
> > > 6)
> > > Before your patch002, GetTopMostAncestorInPublication() was checking
> > > pg_publication_rel and pg_publication_namespace to find out if the
> > > table in the ancestor-list is part of a given particular. Both
> > > pg_publication_rel and pg_publication_namespace did not have the entry
> > > "for all tables" publications. That means
> > > GetTopMostAncestorInPublication() was originally not checking whether
> > > the given puboid is an "for all tables" publication to see if a rel
> > > belongs to that particular pub or not. I
> > >
> > > But now with the current change, we do check if pub is all-tables pub,
> > > if so, return relid and mark ancestor_level (provided table is not
> > > part of the except list).  IIUC, the result in 2 cases may be
> > > different. Is that the intention? Let me know if my understanding is
> > > wrong.
> > >
> > This is intentional, in function get_rel_sync_entry, we are setting
> > pub_relid to the topmost published ancestor. In HEAD we are directly
> > setting using:
> >             /*
> >              * If this is a FOR ALL TABLES publication, pick the partition
> >              * root and set the ancestor level accordingly.
> >              */
> >             if (pub->alltables)
> >             {
> >                 publish = true;
> >                 if (pub->pubviaroot && am_partition)
> >                 {
> >                     List       *ancestors = get_partition_ancestors(relid);
> >
> >                     pub_relid = llast_oid(ancestors);
> >                     ancestor_level = list_length(ancestors);
> >                 }
> >             }
> > In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
> > ancestor for case of FOR ALL TABLES.
> > But with this proposal. This change will no longer be valid as the
> > 'llast_oid(ancestors)' may be excluded in the publication. So, to
> > handle this change was made in GetTopMostAncestorInPublication.
> >
> >
> > Also, during testing with the partitioned table and
> > publish_via_partition_root the behaviour of the current patch is  as
> > below:
> > For example we have a partitioned table t1. It has partitions part1
> > and part2. Now consider the following cases:
> > 1. with publish_via_partition_root = true
> >      I. If we create publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
>
> Okay. Agreed.
>
> >      II.  If we create publication on all tables with EXCEPT part1,
> > data for all tables t1, part1 and part2 is replicated.
>
> Okay. Is this because part1 changes are replicated through t1 and
> since t1 changes are not restricted, part1 changes will also not be
> restricted? In other words, part1 was never published directly in the
> first place and thus 'EXCEPT part1' has no meaning when
> 'publish_via_partition_root' = true? IMO, it is in alignment with the
> 'publish_via_partition_root' definition but it might not be that
> intuitive for users. So shall we emit a WARNING:
>
> WARNING: Partition "part1" is excluded, but publish_via_partition_root
> = true, so this will have no effect.
> Thoughts?
Your understanding is correct. I have added a WARNING for this case.

>
> > 2. with publish_via_partition_root = false
> >      I. If we create publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
>
> I think we shall still publish partitions here. Since
> publish_via_partition_root is false, part1 and part2 are published
> individually and thus shall we allow publishing of part1 and part 2
> here? Thoughts?
I made a mistake in explaining this point. Yes, your point is correct:
changes for partitions part1 and part2 will be replicated.
I have documented this behaviour in the docs.

>
> >      II. If we create publication on all tables with EXCEPT part1,
> > data for part1 is not replicated
> >
>
> Agreed.
>
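
To summarise the four cases above, here is a toy model. It is only a sketch of the behaviour as described in this thread for a single partitioned table under a FOR ALL TABLES ... EXCEPT publication; it is not code from the patch, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model: one partitioned root (t1) with leaf partitions (part1, part2).
 * Returns whether a change made to a leaf partition reaches the subscriber.
 *
 *   root_excluded: the root appears in the EXCEPT list
 *   leaf_excluded: the leaf being modified appears in the EXCEPT list
 */
static bool
leaf_change_replicated(bool pubviaroot, bool root_excluded, bool leaf_excluded)
{
    /* At least one of the inputs must describe a real configuration. */
    assert(pubviaroot == true || pubviaroot == false);

    if (pubviaroot)
    {
        /* Changes are published as the root, so only the root's status matters. */
        return !root_excluded;
    }

    /* Changes are published as the leaf, so only the leaf's status matters. */
    return !leaf_excluded;
}
```

In this model, case 1.II (publish_via_partition_root = true, EXCEPT part1) is the one where the EXCEPT entry has no effect, which is what the WARNING is meant to flag.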

I have addressed the comments and have attached the updated patch in [1].
[1]: https://www.postgresql.org/message-id/CANhcyEXkeg3sjkS3DS9yU1ckz4ozUBNZ%2BRmrWaRNSSVCR8RquA%40mail.gmail.com

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Tue, 22 Jul 2025 at 15:57, shveta malik <shveta.malik@gmail.com> wrote:
>
> Shlok, I was trying to validate the interaction of
> 'publish_via_partition_root' with 'EXCEPT". Found some unexpected
> behaviour, can you please review:
>
> Pub:
> ---------
> CREATE TABLE tab_root (range_col int,i int,j int) PARTITION BY RANGE
> (range_col);
> CREATE TABLE tab_part_1 PARTITION OF tab_root FOR VALUES FROM (1) to (1000);
> CREATE TABLE tab_part_2 PARTITION OF tab_root FOR VALUES FROM (1000) to (2000);
> create publication pub2 for all tables except tab_part_2 WITH
> (publish_via_partition_root=true);
>
> Sub (tables without partition):
> --------
> CREATE TABLE tab_root (range_col int,i int,j int);
> CREATE TABLE tab_part_1(range_col int,i int,j int);
> CREATE TABLE tab_part_2(range_col int,i int,j int);
> create subscription sub2 connection '...' publication pub2;
>
> Pub:
> --------
> insert into tab_part_2 values(1001,1,1);
>
> On Sub, the above row is replicated as expected in tab_root due to
> publish_via_partition_root=true on pub.
>
> Now on Pub:
> --------
> alter publication pub2 set (publish_via_partition_root=false);
> insert into tab_part_2 values(1002,2,2);
>
> Now with publish_via_partition_root=false and 'except tab_part_2', the
> above row is correctly ignored and not replicated on sub.
>
> But when I try this:
> insert into tab_part_1 values(1,1,1);
> insert into tab_root values(5,5,5);
>
> Expectation was that the above rows are replicated but that is not the
> case. Can you please review? Please let me know if my understanding is
> wrong.

Hi Shveta,

I checked this on HEAD and found the same behaviour there. I think
that if we alter the 'publish_via_partition_root' parameter, we should
run ALTER SUBSCRIPTION .. REFRESH PUBLICATION on the subscriber.
I reviewed your scenario and saw that after 'alter publication
pub2 set (publish_via_partition_root=false)', the changes are still
being replicated to 'tab_root' on the subscriber. This is the same as
on HEAD.

For example:
Pub:
---------
CREATE TABLE tab_root (range_col int,i int,j int) PARTITION BY RANGE
(range_col);
CREATE TABLE tab_part_1 PARTITION OF tab_root FOR VALUES FROM (1) to (1000);
CREATE TABLE tab_part_2 PARTITION OF tab_root FOR VALUES FROM (1000) to (2000);
create publication pub2 for table tab_root WITH
(publish_via_partition_root=true);

Sub (tables without partition):
--------
CREATE TABLE tab_root (range_col int,i int,j int);
CREATE TABLE tab_part_1(range_col int,i int,j int);
CREATE TABLE tab_part_2(range_col int,i int,j int);
create subscription sub2 connection '...' publication pub2;

Pub:
--------
insert into tab_part_2 values(1001,1,1);

On Sub, the above row is replicated as expected in tab_root.

Now on Pub:
--------
alter publication pub2 set (publish_via_partition_root=false);

When I try this:
insert into tab_part_2 values(1002,2,2);
insert into tab_part_1 values(1,1,1);
insert into tab_root values(5,5,5);

The data is being replicated to tab_root on the subscriber.

After I do ALTER SUBSCRIPTION .. REFRESH PUBLICATION on subscriber,
replication happens as expected.

Also, I found the following in the documentation:
"Altering the <literal>publish_via_partition_root</literal> parameter can
lead to data loss or duplication at the subscriber because it changes
the identity and schema of the published tables. Note this happens only
when a partition root table is specified as the replication target."

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Wed, 23 Jul 2025 at 10:08, shveta malik <shveta.malik@gmail.com> wrote:
>
> I further tested inherited tables flow as well wrt ONLY and EXCEPT, it
> works well. But while reading docs for the saem, I have few concerns.
>
> 1)
> While explaining ONLY for EXCEPT, create-publication doc says this
>
> +      This does not apply to a partitioned table, however.  The partitions of
> +      a partitioned table are always implicitly considered part of the
> +      publication, so they are never explicitly excluded from the publication.
>
> I do not understand the last line: "so they are never explicitly
> excluded from the publication" . But we can explicitly exclude them
> using EXCEPT <partition_name>. Do you mean to say something else here?
>
> 2)
> alter-publication doc says (in context of EXCEPT):
>
> "If ONLY is specified before the table name, only that table is
> affected. If ONLY is not specified, the table and all its descendant
> tables (if any) are affected. Optionally, * can be specified after
> the table name to explicitly indicate that descendant tables are
> affected."
>
> But it does not mention anything for partitions. I think we shall
> mention here as well that this does not apply to a partitioned table.
> (I tested ONLY and EXCEPT for partition-root. UNLIKE inherited tables,
> ONLY has no impact on partitioned tables.)
>
> 3)
> Shall we explain the relation of 'publish_via_partition_root' with
> EXCEPT briefly in docs(once we conclude that design)?
>
> Please note that I have performed all the tests (mentioned here and in
> previous emails) on patch001 and patch002. patch003 is not applied in
> these tests.
>
I have added/modified the documentation as per the comments. The
changes are present in patch [1].
[1]: https://www.postgresql.org/message-id/CANhcyEXkeg3sjkS3DS9yU1ckz4ozUBNZ%2BRmrWaRNSSVCR8RquA%40mail.gmail.com

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Peter Smith
Date:
On Mon, Aug 4, 2025 at 2:07 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
...
> > 10b.
> > How does "ALTER PUBLICATION tap_pub_col SET
> > (publish_generated_columns)" even work? I thought the
> > "pubish_generated_columns" is an enum but you did not specify any enum
> > value here (???)
> >
> > ~~~
> Yes, it works. It works equivalent to publish_generated_columns = stored.
> Eg:
> postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 with (publish_generated_columns);
> CREATE PUBLICATION
> postgres=# select * from pg_publication;
>   oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot | pubgencols
> -------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------+------------
>  16395 | pub1    |       10 | f            | t         | t         | t         | t           | f          | s
> (1 row)
>

Hmm -- it's not documented to behave like that, so I've created
another thread for getting to the bottom of this topic.

~~~

Meanwhile, here are my review comments for patch v18-0003

======
src/backend/catalog/pg_publication.c

pg_get_publication_tables:

1.
if (nattnums > 0)
{
values[2] = PointerGetDatum(buildint2vector(attnums, nattnums));
nulls[2] = false;
}
else
nulls[2] = true;

Is there any possibility that values[2] might not be null, but then
nattnums skips some cols and so remains 0? Then the final values[2] would
conflict with nulls[2], which seems strange. Maybe it is safer to also
assign values[2] = null in the else.
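
A minimal sketch of that suggestion follows. It uses a stand-in Datum typedef and a hypothetical helper, store_attrs_column(), which is not a PostgreSQL function; in the real code the non-null value comes from buildint2vector(). The only point is that both arrays are written on every path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum, so the sketch compiles on its own. */
typedef uintptr_t Datum;

/*
 * Hypothetical helper: fill output column 2 so that values[2] and nulls[2]
 * can never disagree, whatever values[2] held before.
 */
static void
store_attrs_column(Datum *values, bool *nulls, Datum attrs, int nattnums)
{
    assert(values != NULL && nulls != NULL);

    if (nattnums > 0)
    {
        values[2] = attrs;
        nulls[2] = false;
    }
    else
    {
        /* Reset the value as well, instead of leaving a stale Datum behind. */
        values[2] = (Datum) 0;
        nulls[2] = true;
    }
}
```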

======
src/backend/replication/logical/tablesync.c

fetch_remote_table_info:

2.
 static void
 fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
- List **qual, bool *gencol_published)
+ List **qual, bool *gencol_published,
+ bool *no_cols_published)

This new parameter should be documented in the function comment.

~~~

3.
+ if (server_version >= 190000)
+ *no_cols_published = DatumGetBool(slot_getattr(tslot, 2, &isnull));
+

It seems that *no_cols_published (and *gencol_published) are assigned
false by the caller. I had to go looking for that, so IMO it would be
better to put Asserts at the top of this function so it is self-documenting:

Assert(*gencol_published == false);
Assert(*no_cols_published == false);

======
src/backend/replication/pgoutput/pgoutput.c

4.
+ /*
+ * Indicates whether no columns are published for a given relation. With
+ * the introduction of the EXCEPT clause in column lists, it is now
+ * possible to define a publication that excludes all columns of a table.
+ * However, the 'columns' attribute cannot represent this case, since a
+ * NULL value implies that all columns are published. To distinguish this
+ * scenario, the 'no_cols_published' flag is introduced.
+ */
+ bool no_cols_published;

The wording of the comment seems a bit strange -- EXCEPT is not a clause.

BEFORE:
the introduction of the EXCEPT clause in column lists, ...

SUGGESTION
the introduction of the EXCEPT qualifier for column lists, ....

~~~

5.
  Bitmapset  *cols = NULL;
+ bool except_columns = false;
+ bool no_col_published = false;

There are multiple places in this patch that say:

'no_col_published'
or 'no_cols_published'

I felt this var name can be misunderstood because it is easy to read
"no" as meaning "no." (aka number), and then misinterpret as
"number_of_cols_published".

Maybe an unambiguous name can be found, like
- 'zero_cols_published' or
- 'nothing_published' or
- really make it 'num_cols_published' and check for 0.

(so this comment applies to multiple places in the patch)

~~

6.
  * of the table (including generated columns when
  * 'publish_generated_columns' parameter is true).
  */
- if (!cols)
+ if (!no_col_published && !cols)
  {

The existing comment above this code fragment also needs to mention
"EXCEPT (column-list)" where all the columns are excluded

======
src/bin/psql/describe.c

describeOneTableDetails:

7.
  /* column list (if any) */
  if (!PQgetisnull(result, i, 2))
- appendPQExpBuffer(&buf, " (%s)",
-   PQgetvalue(result, i, 2));
+ {
+ if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
+ appendPQExpBuffer(&buf, " EXCEPT (%s)",
+   PQgetvalue(result, i, 2));
+ else
+ appendPQExpBuffer(&buf, " (%s)",
+   PQgetvalue(result, i, 2));
+ }

Isn't this code fragment (and also surrounding code) using the same
logic as what is already encapsulated in the function
addFooterToPublicationDesc()?
Superficially, it seems like a large chunk can all be replaced with a
single call to the existing function.

======
src/test/regress/expected/publication.out

8.
+-- Syntax error EXCEPT without a col-list
+CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
+ERROR:  EXCEPT clause not allowed for table without column list
+LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
+                                               ^

Is that a bad syntax position marker (^)? e.g. Why is it pointed at
the word "TABLE" instead of "EXCEPT"?

======
.../t/037_rep_changes_except_collist.pl

9.
+# Test initial sync
+my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
+is($result, qq(|2|3),
+ 'check that initial sync for EXCEPT (column-list) publication');
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.tab1");
+is($result, qq(1||),
+ 'check that initial sync for EXCEPT (column-list) publication');

These messages still seem to have missing or extra words: "check that
initial sync" (??). Maybe just remove the word 'that'?

~~~

10.
# Test for update
$node_subscriber->safe_psql(
'postgres', qq(
CREATE UNIQUE INDEX b_idx ON tab1 (b);
ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
));
$node_publisher->safe_psql(
'postgres', qq(
CREATE UNIQUE INDEX b_idx ON tab1 (b);
ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
UPDATE tab1 SET a = 3, b = 4, c = 5 WHERE a = 1;
));
$node_publisher->wait_for_catchup('tap_sub_col');
$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
is( $result, qq(|5|6
|4|5),
'check update for EXCEPT (column-list) publication');

~

10a.
I think the test is OK, but your chosen numbers like 1,2,3, then 4,5,6
and then updating 1,2,3 to 3,4,5 make it quite hard to review.
Maybe use easier numbers that are more identifiable, e.g. update 1,2,3
=> 991,992,993 or something like that.

~

10b.
You may need to put some ORDER BY in all these queries just to make
sure they are always reproducible, giving rows in the expected order.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 4 Aug 2025 at 13:03, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 2:07 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> ...
> > > 10b.
> > > How does "ALTER PUBLICATION tap_pub_col SET
> > > (publish_generated_columns)" even work? I thought the
> > > "publish_generated_columns" is an enum but you did not specify any enum
> > > value here (???)
> > >
> > > ~~~
> > Yes, it works. It works equivalent to publish_generated_columns = stored.
> > Eg:
> > postgres=# CREATE PUBLICATION pub1 FOR TABLE t1 with
> > (publish_generated_columns);
> > CREATE PUBLICATION
> > postgres=# select * from pg_publication;
> >   oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot | pubgencols
> > -------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------+------------
> >  16395 | pub1    |       10 | f            | t         | t         | t         | t           | f          | s
> > (1 row)
> >
>
> Hmm -- it's not documented to behave like that, so I've created
> another thread for getting to the bottom of this topic.
>
> ~~~
>
> Meanwhile, here are my review comments for patch v18-0003
>
> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 1.
> if (nattnums > 0)
> {
> values[2] = PointerGetDatum(buildint2vector(attnums, nattnums));
> nulls[2] = false;
> }
> else
> nulls[2] = true;
>
> Is there any possibility that values[2] might not be null, but then
> nattnums skips some cols so remains 0? Then the final values[2] would
> conflict with nulls[2], which seems strange. Maybe it is safer to also
> assign values[2] = null in the else.
>
Yes. When all the columns of a table are present in 'EXCEPT
(column-list)', effectively no column should be replicated. In
such cases we should mark nulls[2] as true.
I agree with your point that values[2] should be made null. I have
used '(Datum) 0', in accordance with other places.
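
To illustrate keeping values[2] and nulls[2] in agreement, here is a
minimal sketch. It does not use the PostgreSQL headers: the Datum typedef
and the function set_collist_column are stand-ins invented for this
example, and the real code builds an int2vector rather than storing a raw
pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum; the real type comes from postgres.h. */
typedef uintptr_t Datum;

/*
 * Fill one output column of a result row (hypothetical helper).
 * When there are no attribute numbers to report, mark the column NULL and
 * also reset the value, so a stale value stored earlier cannot disagree
 * with the null flag.
 */
void
set_collist_column(Datum *values, bool *nulls, int col,
                   const int16_t *attnums, int nattnums)
{
    assert(values != NULL && nulls != NULL);

    if (nattnums > 0)
    {
        /* Assumption: the real code would build an int2vector here. */
        values[col] = (Datum) attnums;
        nulls[col] = false;
    }
    else
    {
        values[col] = (Datum) 0;
        nulls[col] = true;
    }
}
```

With this shape, a caller that had already stored a column list in
values[2] ends up with a zero value and a true null flag once every column
is excluded.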

> ======
> src/backend/replication/logical/tablesync.c
>
> fetch_remote_table_info:
>
> 2.
>  static void
>  fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
> - List **qual, bool *gencol_published)
> + List **qual, bool *gencol_published,
> + bool *no_cols_published)
>
> This new parameter should be documented in the function comment.
>
> ~~~
>
> 3.
> + if (server_version >= 190000)
> + *no_cols_published = DatumGetBool(slot_getattr(tslot, 2, &isnull));
> +
>
> It seems that *no_cols_published (and *gencol_published) are assigned
> false by the caller. I had to go looking for that, so IMO it would be
> better to put Assert at the top of here so it is self-documenting
>
> Assert(*gencol_published == false);
> Assert(*no_cols_published == false);
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 4.
> + /*
> + * Indicates whether no columns are published for a given relation. With
> + * the introduction of the EXCEPT clause in column lists, it is now
> + * possible to define a publication that excludes all columns of a table.
> + * However, the 'columns' attribute cannot represent this case, since a
> + * NULL value implies that all columns are published. To distinguish this
> + * scenario, the 'no_cols_published' flag is introduced.
> + */
> + bool no_cols_published;
>
> The wording of the comment seems a bit strange -- EXCEPT is not a clause.
>
> BEFORE:
> the introduction of the EXCEPT clause in column lists, ...
>
> SUGGESTION
> the introduction of the EXCEPT qualifier for column lists, ....
>
> ~~~
>
> 5.
>   Bitmapset  *cols = NULL;
> + bool except_columns = false;
> + bool no_col_published = false;
>
> There are multiple places in this patch that say:
>
> 'no_col_published'
> or 'no_cols_published'
>
> I felt this var name can be misunderstood because it is easy to read
> "no" as meaning "no." (aka number), and then misinterpret as
> "number_of_cols_published".
>
> Maybe an unambiguous name can be found, like
> - 'zero_cols_published' or
> - 'nothing_published' or
> - really make it 'num_cols_published' and check for 0.
>
> (so this comment applies to multiple places in the patch)
>
How about 'all_cols_excluded'? Or 'has_published_cols'?
I have used 'all_cols_excluded' in this patch. Thoughts?

> ~~
>
> 6.
>   * of the table (including generated columns when
>   * 'publish_generated_columns' parameter is true).
>   */
> - if (!cols)
> + if (!no_col_published && !cols)
>   {
>
> The existing comment above this code fragment also needs to mention
> "EXCEPT (column-list)" where all the columns are excluded
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 7.
>   /* column list (if any) */
>   if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> +   PQgetvalue(result, i, 2));
> + else
> + appendPQExpBuffer(&buf, " (%s)",
> +   PQgetvalue(result, i, 2));
> + }
>
> Isn't this code fragment (and also surrounding code) using the same
> logic as what is already encapsulated in the function
> addFooterToPublicationDesc()?
> Superficially, it seems like a large chunk can all be replaced with a
> single call to the existing function.
>
'addFooterToPublicationDesc' is called when we use \dRp+, and it prints in
the format:
"schema_name.table_name" EXCEPT (column-list)
Whereas the code pasted above is executed when we use \d+ table_name, and
the output is in the format:
"publication_name" EXCEPT (column-list)

These pieces of code print different info: one prints info related to
tables, and the other prints info related to publications.
Should we use a common function for this?

> ======
> src/test/regress/expected/publication.out
>
> 8.
> +-- Syntax error EXCEPT without a col-list
> +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> +ERROR:  EXCEPT clause not allowed for table without column list
> +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> +                                               ^
>
> Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> the word "TABLE" instead of "EXCEPT"?
>
In function 'preprocess_pubobj_list', the position of the position marker
(^) is decided by "pubobj->location". The function handles multiple errors,
so setting "$$->location" specifically for the EXCEPT qualifier would not
be appropriate. One solution, I feel, is to not show the position marker
(^) in the case of EXCEPT. Or maybe we can add a new variable to
'PublicationTable' for except_location, but I think we should not do
that. Thoughts?

For this version of the patch, I have removed the position marker (^) in
the case of EXCEPT.

> ======
> .../t/037_rep_changes_except_collist.pl
>
> 9.
> +# Test initial sync
> +my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
> +is($result, qq(|2|3),
> + 'check that initial sync for EXCEPT (column-list) publication');
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.tab1");
> +is($result, qq(1||),
> + 'check that initial sync for EXCEPT (column-list) publication');
>
> These messages still seem to have missing or extra words: "check that
> initial sync" (??). Maybe just remove the word 'that'?
>
> ~~~
>
> 10.
> # Test for update
> $node_subscriber->safe_psql(
> 'postgres', qq(
> CREATE UNIQUE INDEX b_idx ON tab1 (b);
> ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
> ));
> $node_publisher->safe_psql(
> 'postgres', qq(
> CREATE UNIQUE INDEX b_idx ON tab1 (b);
> ALTER TABLE tab1 REPLICA IDENTITY USING INDEX b_idx;
> UPDATE tab1 SET a = 3, b = 4, c = 5 WHERE a = 1;
> ));
> $node_publisher->wait_for_catchup('tap_sub_col');
> $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1");
> is( $result, qq(|5|6
> |4|5),
> 'check update for EXCEPT (column-list) publication');
>
> ~
>
> 10a.
> I think the test is OK, but your chosen numbers like 1,2,3, then 4,5,6
> and then updating to 1,2,3 to 3,4,5 make it quite hard to review.
> Maybe use easier numbers that are more identifiable, e.g. update 1,2,3
> => 991,992,993 or something like that.
>
> ~
>
> 10b.
> You may need to put some ORDER BY in all these queries just to make
> sure they are always reproducible, giving rows in the expected order.
>

I have also addressed the remaining comments and attached the latest
v19 patches.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok.

On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
...
> > 5.
> >   Bitmapset  *cols = NULL;
> > + bool except_columns = false;
> > + bool no_col_published = false;
> >
> > There are multiple places in this patch that say:
> >
> > 'no_col_published'
> > or 'no_cols_published'
> >
> > I felt this var name can be misunderstood because it is easy to read
> > "no" as meaning "no." (aka number), and then misinterpret as
> > "number_of_cols_published".
> >
> > Maybe an unambiguous name can be found, like
> > - 'zero_cols_published' or
> > - 'nothing_published' or
> > - really make it 'num_cols_published' and check for 0.
> >
> > (so this comment applies to multiple places in the patch)
> >
> How about 'all_cols_excluded'? Or 'has_published_cols'?
> I have used 'all_cols_excluded' in this patch. Thoughts?

The new name is good.

> > ======
> > src/bin/psql/describe.c
> >
> > describeOneTableDetails:
> >
> > 7.
> >   /* column list (if any) */
> >   if (!PQgetisnull(result, i, 2))
> > - appendPQExpBuffer(&buf, " (%s)",
> > -   PQgetvalue(result, i, 2));
> > + {
> > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > +   PQgetvalue(result, i, 2));
> > + else
> > + appendPQExpBuffer(&buf, " (%s)",
> > +   PQgetvalue(result, i, 2));
> > + }
> >
> > Isn't this code fragment (and also surrounding code) using the same
> > logic as what is already encapsulated in the function
> > addFooterToPublicationDesc()?
> > Superficially, it seems like a large chunk can all be replaced with a
> > single call to the existing function.
> >
> 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> "schema_name.table_name" EXCEPT (column-list)
> Whereas code pasted above is executed when we use \d+ table_name and
> the output is the format:
> "publication_name" EXCEPT (column-list)
>
> These pieces of code are used to print different info. One is used to
> print info related to tables and the other is used to print info
> related to publication.
> Should we use a common function for this?

It still seems like quite a lot of overlap. e.g. I thought there were
~30 lines common. OTOH, perhaps you'll need to pass another boolean to
the function to indicate it is a "Publication:" footer. I guess you'd
have to try it out first to see if the changes required to save those
30 LOC are worthwhile or not.
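
As a rough sketch of the kind of sharing being discussed, the EXCEPT and
column-list formatting could sit behind one helper that takes either a
publication name or a qualified table name. The function
format_footer_entry and its signature are assumptions for illustration;
it uses plain snprintf rather than psql's PQExpBuffer API and leaves out
the footer indentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one footer entry into buf (hypothetical helper).
 * 'name' is either a publication name (the \d+ table case) or a
 * schema-qualified table name (the \dRp+ case); the EXCEPT and
 * column-list handling is identical for both.
 * Returns the snprintf result: the length the full text needs.
 */
int
format_footer_entry(char *buf, size_t buflen, const char *name,
                    const char *collist, bool is_except)
{
    assert(buf != NULL && name != NULL);

    /* No column list: only the quoted name is printed. */
    if (collist == NULL)
        return snprintf(buf, buflen, "\"%s\"", name);

    return snprintf(buf, buflen, "\"%s\"%s (%s)",
                    name, is_except ? " EXCEPT" : "", collist);
}
```

Whether the saving is worthwhile still depends on how much of the
surrounding query and title handling can be shared, which this sketch
does not cover.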

>
> > ======
> > src/test/regress/expected/publication.out
> >
> > 8.
> > +-- Syntax error EXCEPT without a col-list
> > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > +ERROR:  EXCEPT clause not allowed for table without column list
> > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > +                                               ^
> >
> > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > the word "TABLE" instead of "EXCEPT"?
> >
> In function 'preprocess_pubobj_list' the position of position marker
> (^) is decided by "pubobj->location". Function handles multiple errors
> and setting "$$->location" only specific to EXCEPT qualifier would not
> be appropriate. One solution I feel is to not show "position marker
> (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> 'PublicationTable' for except_location but I think we should not do
> that. Thoughts?

In the review comments below, I suggest putting this location back,
but changing the message.

>
> For this version of patch, I have removed the "position marker (^)" in
> the case of EXCEPT.
>

//////

Here are my review comments for the patch v19-0003.

======
1. General - SGML tags in docs for table/column names.

There is nothing to change just yet, but keep an eye on the thread
[1], because if/when that gets pushed, there will be several tags
in this patch for table/column names that will need to be updated for
consistency.

======
src/backend/catalog/pg_publication.c

pg_get_publication_tables:

2.
+
+ if (!nulls[2])
+ {
+ Datum exceptDatum;
+ bool isnull;
+
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and
+ * not FOR TABLES IN SCHEMA. So if prexcept is true, it
+ * indicates that prattrs contains columns to be excluded for
+ * replication.
+ */
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull && DatumGetBool(exceptDatum))
+ except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
+ }

Maybe this should be done a few lines earlier, to keep all the
values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
code. Indeed, there is lots of other values[2]/nulls[2] logic that
comes later in this function, so maybe it is better to do all of that
first, instead of mingling it with values[3]/nulls[3].

======
src/backend/commands/publicationcmds.c

pub_contains_invalid_column:

3.
  * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
- *    by the column list. If any column is missing, *invalid_column_list is set
+ *    by the column list and are not part of column list specified with EXCEPT.
+ *   If any column is missing, *invalid_column_list is set
  *    to true.

Whitespace problem here; there is some tab instead of space in this comment.

Also /part of column list/part of the column list/

~~~

AlterPublicationTables:

4.
  bool isnull = true;
  Datum whereClauseDatum;
  Datum columnListDatum;
+ Datum exceptDatum;

It's not necessary to have all these different Datum variables; they
are only temporary storage. It might be simpler to use a single "Datum
datum;" which is reused 3x.

~

5.
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull)
+ oldexcept = DatumGetBool(exceptDatum);
+

Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
(column-list)? In other words, should the change to this function be
done already in one of the earlier patches?

~

6.
  if (equal(oldrelwhereclause, newpubrel->whereClause) &&
- bms_equal(oldcolumns, newcolumns))
+ bms_equal(oldcolumns, newcolumns) &&
+ oldexcept == newpubrel->except)

The code comment about this code fragment should also mention EXCEPT.
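
A small sketch of why the EXCEPT flag has to take part in this comparison:
"(a, b)" and "EXCEPT (a, b)" carry the same column set but opposite
meaning. PubRelSketch and pubrel_unchanged are hypothetical names, and the
bitmask and string fields are simplified stand-ins for Bitmapset and the
WHERE-clause node tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of one publication/table mapping. */
typedef struct PubRelSketch
{
    const char *where_clause;   /* NULL when there is no row filter */
    uint32_t    columns;        /* bit N set = attribute N is listed */
    bool        except;         /* true when the list is EXCEPT (...) */
} PubRelSketch;

/* NULL-safe string comparison for the optional row filter. */
static bool
same_text(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* True only if row filter, column list, and EXCEPT flag all match. */
bool
pubrel_unchanged(const PubRelSketch *oldrel, const PubRelSketch *newrel)
{
    assert(oldrel != NULL && newrel != NULL);

    return same_text(oldrel->where_clause, newrel->where_clause) &&
           oldrel->columns == newrel->columns &&
           oldrel->except == newrel->except;
}
```

Two entries with identical columns but different except values compare as
changed, so the existing mapping would not be kept as-is.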

======
src/backend/parser/gram.y

preprocess_pubobj_list:

7.
+ if (pubobj->pubtable && pubobj->pubtable->except &&
+ pubobj->pubtable->columns == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("EXCEPT clause not allowed for table without column list"));
+

Having the syntax error location (like before in v18) might be better,
but since that location is associated with the TABLE, then the error
message should also be reworded so the subject is the table.

SUGGESTION
errmsg("table without column list cannot use EXCEPT clause")

======
src/bin/psql/describe.c

describeOneTableDetails:

8.
- if (pset.sversion >= 150000)
+ if (pset.sversion >= 190000)
  {
  printfPQExpBuffer(&buf,
    "SELECT pubname\n"
    "     , NULL\n"
    "     , NULL\n"
+   " , NULL\n"
    "FROM pg_catalog.pg_publication p\n"
    "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
    "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
@@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
    "                pg_catalog.pg_attribute\n"
    "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
    "        ELSE NULL END) "
+   " , prexcept "
    "FROM pg_catalog.pg_publication p\n"
    " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
    " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
-   "WHERE pr.prrelid = '%s'\n",
-   oid, oid, oid);
-
- if (pset.sversion >= 190000)
- appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
+   "WHERE pr.prrelid = '%s' "
+   "AND  c.relnamespace NOT IN (\n "
+   " SELECT pnnspid FROM\n"
+   " pg_catalog.pg_publication_namespace)\n"

- appendPQExpBuffer(&buf,
    "UNION\n"
    "SELECT pubname\n"
    " , NULL\n"
    " , NULL\n"
+   " , NULL\n"
    "FROM pg_catalog.pg_publication p\n"
-   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
-   oid);
-
- if (pset.sversion >= 190000)
- appendPQExpBuffer(&buf,
-   "     AND NOT EXISTS (\n"
-   " SELECT 1\n"
-   " FROM pg_catalog.pg_publication_rel pr\n"
-   " JOIN pg_catalog.pg_class pc\n"
-   " ON pr.prrelid = pc.oid\n"
-   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
-   oid);
-
- appendPQExpBufferStr(&buf, "ORDER BY 1;");
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "     AND NOT EXISTS (\n"
+   " SELECT 1\n"
+   " FROM pg_catalog.pg_publication_rel pr\n"
+   " JOIN pg_catalog.pg_class pc\n"
+   " ON pr.prrelid = pc.oid\n"
+   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid, oid);
+ }
+ else if (pset.sversion >= 150000)
+ {
+ printfPQExpBuffer(&buf,
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
+   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
+   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , pg_get_expr(pr.prqual, c.oid)\n"
+   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
+   "         (SELECT string_agg(attname, ', ')\n"
+   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
+   "                pg_catalog.pg_attribute\n"
+   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
+   "        ELSE NULL END) "
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
+   "WHERE pr.prrelid = '%s'\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid);

I found that these large SQL selects with 3x UNIONs are difficult to read.
Maybe you can add more comments to describe the intention of each of
the UNION SELECTs?

~~~

9.
  /* column list (if any) */
  if (!PQgetisnull(result, i, 2))
- appendPQExpBuffer(&buf, " (%s)",
-   PQgetvalue(result, i, 2));
+ {
+ if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
+ appendPQExpBuffer(&buf, " EXCEPT");
+ appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
+ }

I did not find any regression test case where the "EXCEPT" col-list is
getting output for a "Publications:" footer.

======
[1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 11 Aug 2025 at 13:55, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok.
>
> On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> ...
> > > 5.
> > >   Bitmapset  *cols = NULL;
> > > + bool except_columns = false;
> > > + bool no_col_published = false;
> > >
> > > There are multiple places in this patch that say:
> > >
> > > 'no_col_published'
> > > or 'no_cols_published'
> > >
> > > I felt this var name can be misunderstood because it is easy to read
> > > "no" as meaning "no." (aka number), and then misinterpret as
> > > "number_of_cols_published".
> > >
> > > Maybe an unambiguous name can be found, like
> > > - 'zero_cols_published' or
> > > - 'nothing_published' or
> > > - really make it 'num_cols_published' and check for 0.
> > >
> > > (so this comment applies to multiple places in the patch)
> > >
> > How about 'all_cols_excluded'? Or 'has_published_cols'?
> > I have used 'all_cols_excluded' in this patch. Thoughts?
>
> The new name is good.
>
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > describeOneTableDetails:
> > >
> > > 7.
> > >   /* column list (if any) */
> > >   if (!PQgetisnull(result, i, 2))
> > > - appendPQExpBuffer(&buf, " (%s)",
> > > -   PQgetvalue(result, i, 2));
> > > + {
> > > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + else
> > > + appendPQExpBuffer(&buf, " (%s)",
> > > +   PQgetvalue(result, i, 2));
> > > + }
> > >
> > > Isn't this code fragment (and also surrounding code) using the same
> > > logic as what is already encapsulated in the function
> > > addFooterToPublicationDesc()?
> > > Superficially, it seems like a large chunk can all be replaced with a
> > > single call to the existing function.
> > >
> > 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> > "schema_name.table_name" EXCEPT (column-list)
> > Whereas code pasted above is executed when we use \d+ table_name and
> > the output is the format:
> > "publication_name" EXCEPT (column-list)
> >
> > These pieces of code are used to print different info. One is used to
> > print info related to tables and the other is used to print info
> > related to publication.
> > Should we use a common function for this?
>
> It still seems like quite a lot of overlap. e.g. I thought there were
> ~30 lines common. OTOH, perhaps you'll need to pass another boolean to
> the function to indicate it is a "Publication:" footer. I guess you'd
> have to try it out first to see if the changes required to save those
> 30 LOC are worthwhile or not.
>
I have added the code changes for the same in this patch.

> >
> > > ======
> > > src/test/regress/expected/publication.out
> > >
> > > 8.
> > > +-- Syntax error EXCEPT without a col-list
> > > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > > +ERROR:  EXCEPT clause not allowed for table without column list
> > > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > > +                                               ^
> > >
> > > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > > the word "TABLE" instead of "EXCEPT"?
> > >
> > In function 'preprocess_pubobj_list' the position of position marker
> > (^) is decided by "pubobj->location". Function handles multiple errors
> > and setting "$$->location" only specific to EXCEPT qualifier would not
> > be appropriate. One solution I feel is to not show "position marker
> > (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> > 'PublicationTable' for except_location but I think we should not do
> > that. Thoughts?
>
> In the review comments below, I suggest putting this location back,
> but changing the message.
>
> >
> > For this version of patch, I have removed the "position marker (^)" in
> > the case of EXCEPT.
> >
>
> //////
>
> Here are my review comments for the patch v19-0003.
>
> ======
> 1. General - SGML tags in docs for table/column names.
>
> There is nothing to change just yet, but keep an eye on the thread
> [1],  because if/when that gets pushed, then there will several tags
> in this patch for table/column names that will need to be updated for
> consistency.
>
Noted

> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 2.
> +
> + if (!nulls[2])
> + {
> + Datum exceptDatum;
> + bool isnull;
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and
> + * not FOR TABLES IN SCHEMA. So if prexcept is true, it
> + * indicates that prattrs contains columns to be excluded for
> + * replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum))
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
> + }
>
> Maybe this should be done a few lines earlier, to keep all the
> values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
> code. Indeed, there is lots of other values[2]/nulls[2] logic that
> comes later in this function, so maybe it is better to do all of that
> first, instead of mingling it with values[3]/nulls[3].
>
> ======
> src/backend/commands/publicationcmds.c
>
> pub_contains_invalid_column:
>
> 3.
>   * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
> - *    by the column list. If any column is missing, *invalid_column_list is set
> + *    by the column list and are not part of column list specified with EXCEPT.
> + *   If any column is missing, *invalid_column_list is set
>   *    to true.
>
> Whitespace problem here; there is some tab instead of space in this comment.
>
> Also /part of column list/part of the column list/
>
> ~~~
>
> AlterPublicationTables:
>
> 4.
>   bool isnull = true;
>   Datum whereClauseDatum;
>   Datum columnListDatum;
> + Datum exceptDatum;
>
> It's not necessary to have all these different Datum variables; they
> are only temporary storage. It might be simpler to use a single "Datum
> datum;" which is reused 3x.
>
> ~
>
> 5.
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> +   Anum_pg_publication_rel_prexcept,
> +   &isnull);
> +
> + if (!isnull)
> + oldexcept = DatumGetBool(exceptDatum);
> +
>
> Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
> (column-list)? In other words, should the change to this function be
> done already in one of the earlier patches?
>
> ~
This code path is only executed when running ALTER PUBLICATION ... SET
TABLE, and running this command on an ALL TABLES publication throws an
error due to the check in function 'CheckAlterPublication'. Since EXCEPT
TABLE can only be used for ALL TABLES publications, I think it doesn't
need to be moved to the 0002 patch.

>
> 6.
>   if (equal(oldrelwhereclause, newpubrel->whereClause) &&
> - bms_equal(oldcolumns, newcolumns))
> + bms_equal(oldcolumns, newcolumns) &&
> + oldexcept == newpubrel->except)
>
> The code comment about this code fragment should also mention EXCEPT.
>
> ======
> src/backend/parser/gram.y
>
> preprocess_pubobj_list:
>
> 7.
> + if (pubobj->pubtable && pubobj->pubtable->except &&
> + pubobj->pubtable->columns == NULL)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("EXCEPT clause not allowed for table without column list"));
> +
>
> Having the syntax error location (like before in v18) might be better,
> but since that location is associated with the TABLE, then the error
> message should also be reworded so the subject is the table.
>
> SUGGESTION
> errmsg("table without column list cannot use EXCEPT clause")
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 8.
> - if (pset.sversion >= 150000)
> + if (pset.sversion >= 190000)
>   {
>   printfPQExpBuffer(&buf,
>     "SELECT pubname\n"
>     "     , NULL\n"
>     "     , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
>     "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
>     "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> @@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
>     "                pg_catalog.pg_attribute\n"
>     "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
>     "        ELSE NULL END) "
> +   " , prexcept "
>     "FROM pg_catalog.pg_publication p\n"
>     " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
>     " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> -   "WHERE pr.prrelid = '%s'\n",
> -   oid, oid, oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
> +   "WHERE pr.prrelid = '%s' "
> +   "AND  c.relnamespace NOT IN (\n "
> +   " SELECT pnnspid FROM\n"
> +   " pg_catalog.pg_publication_namespace)\n"
>
> - appendPQExpBuffer(&buf,
>     "UNION\n"
>     "SELECT pubname\n"
>     " , NULL\n"
>     " , NULL\n"
> +   " , NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
> -   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
> -   oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBuffer(&buf,
> -   "     AND NOT EXISTS (\n"
> -   " SELECT 1\n"
> -   " FROM pg_catalog.pg_publication_rel pr\n"
> -   " JOIN pg_catalog.pg_class pc\n"
> -   " ON pr.prrelid = pc.oid\n"
> -   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
> -   oid);
> -
> - appendPQExpBufferStr(&buf, "ORDER BY 1;");
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "     AND NOT EXISTS (\n"
> +   " SELECT 1\n"
> +   " FROM pg_catalog.pg_publication_rel pr\n"
> +   " JOIN pg_catalog.pg_class pc\n"
> +   " ON pr.prrelid = pc.oid\n"
> +   " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid, oid);
> + }
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> +   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , pg_get_expr(pr.prqual, c.oid)\n"
> +   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> +   "         (SELECT string_agg(attname, ', ')\n"
> +   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> +   "                pg_catalog.pg_attribute\n"
> +   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> +   "        ELSE NULL END) "
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid);
>
> I found these large SQL selects with 3x UNIONs are difficult to read.
> Maybe you can add more comments to describe the intention of each of
> the UNION SELECTs?
>
> ~~~
>
> 9.
>   /* column list (if any) */
>   if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> -   PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT");
> + appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
> + }
>
> I did not find any regression test case where the "EXCEPT" col-list is
> getting output for a "Publications:" footer.
>
> ======
> [1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us
>

I have addressed the comments; the changes are in the v20 patch.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok,

Here are some review comments for v20-0003.

======
src/backend/commands/publicationcmds.c

AlterPublicationTables:

1.
  bool isnull = true;
- Datum whereClauseDatum;
- Datum columnListDatum;
+ Datum datum;

I know you did not write the code, but that "isnull = true" is
redundant, and seems kind of misleading because it will always be
re-assigned before it is used.

~~~

2.
  /* Load the WHERE clause for this table. */
- whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
-    Anum_pg_publication_rel_prqual,
-    &isnull);
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prqual,
+ &isnull);
  if (!isnull)
- oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
+ oldrelwhereclause = stringToNode(TextDatumGetCString(datum));

  /* Transform the int2vector column list to a bitmap. */
- columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
-   Anum_pg_publication_rel_prattrs,
-   &isnull);
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prattrs,
+ &isnull);
+
+ if (!isnull)
+ oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
+
+ /* Load the prexcept flag for this table. */
+ datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+ Anum_pg_publication_rel_prexcept,
+ &isnull);

  if (!isnull)
- oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
+ oldexcept = DatumGetBool(datum);

Use consistent spacing. Either do or don't (I prefer don't) put a
blank line between the pairs of "datum =" and "if (!isnull)". Avoid
having a mixture.

======
src/bin/psql/describe.c

addFooterToPublicationOrTableDesc:

3.
+/*
+ * If is_tbl_desc is true add footer to table description else add footer to
+ * publication description.
+ */
+static bool
+addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
+   bool as_schema, printTableContent *const cont,
+   bool is_tbl_desc)

3a.
Since you are changing this anyway, I think it would be better to keep
those boolean params together (at the end).

~

3b.
It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
but having the variable 'is_tbl_desc', because it seems more natural
to me to read left to right, so the logical order of everything here
should be pub desc then table desc. In other words, use boolean
'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
thing is kind of a *subset* of the publication description, so it
makes more sense for that to come last too.

e.g.
CURRENT
addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
SUGGESTION
addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)

~

3c
While you are changing things, maybe also consider changing that
'as_schema' name because I did not understand what "as" means. Perhaps
rename like 'pub_schemas', or 'only_show_schemas' or something better
(???).

~~~

4.
+ PGresult   *res;
+ int count = 0;
+ int i = 0;
+ int col = is_tbl_desc ? 0 : 1;
+
+ res = PSQLexec(buf->data);
+ if (!res)
+ return false;
+ else
+ count = PQntuples(res);
+

4a.
Assignment count = 0 is redundant.

~

4b.
Remove the 'i' declaration here. Declare it in the "for" loop later.

~

4c.
The "else" is not required. If 'res' was not good, you already returned.
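
For illustration, points 4a to 4c could be applied roughly as follows. This is only a self-contained sketch, not the patch code: FakeResult, fake_exec() and count_footer_rows() are hypothetical stand-ins (for PGresult, PSQLexec() and the reviewed function) so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PGresult; only the row count matters here. */
typedef struct FakeResult
{
	int			ntuples;
} FakeResult;

/* Hypothetical stand-in for PSQLexec(): returns NULL on failure. */
static FakeResult *
fake_exec(FakeResult *canned)
{
	return canned;
}

/*
 * Visits every row of the result and stores the number of rows visited in
 * *nrows. Returns false (leaving *nrows untouched) if the query failed.
 */
static bool
count_footer_rows(FakeResult *canned, int *nrows)
{
	FakeResult *res;
	int			count;			/* 4a: no redundant "= 0" initializer */

	assert(nrows != NULL);

	res = fake_exec(canned);
	if (!res)
		return false;
	count = res->ntuples;		/* 4c: no "else" after the early return */

	*nrows = 0;
	for (int i = 0; i < count; i++)	/* 4b: 'i' is declared in the loop */
		(*nrows)++;

	return true;
}
```

The early return makes the "else" unnecessary, and 'count' is always assigned before it is read, so the initializer adds nothing.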

~~~

5.
+ if (as_schema)
+ printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
+ else
+ {
+ if (is_tbl_desc)
+ printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
+ else
+ printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
+   PQgetvalue(res, i, col));

This function is basically either (a) a footer for a table description
or (b) a footer for a publication description. And that all hinges on
the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
main condition to be "if (is_tbl_desc)" here.

This turned everything inside out. PSA: a top-up patch to show a way
to do this. Perhaps my implementation is a bit verbose, but OTOH it
seems easier to understand. Anyway, see what you think...
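
The top-up patch itself is not reproduced in this message, but the idea can be sketched as below, assuming the renamed flags 'is_pub_desc' and 'pub_schemas' from comments 3b and 3c. format_footer_entry() and the row[] array are hypothetical: row[n] stands in for PQgetvalue(res, i, n), and snprintf() stands in for printfPQExpBuffer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats one footer entry from a result row and
 * returns the length snprintf() reports.
 *
 * Publication description (is_pub_desc): row[0] = schema, row[1] = table.
 * Table description: row[0] = publication name.
 */
static int
format_footer_entry(char *out, size_t outlen, const char *const *row,
					bool is_pub_desc, bool pub_schemas)
{
	assert(out != NULL && row != NULL);

	if (is_pub_desc)
	{
		/* Schema-only footer: just the schema name. */
		if (pub_schemas)
			return snprintf(out, outlen, "    \"%s\"", row[0]);

		/* Table footer of a publication: schema-qualified table name. */
		return snprintf(out, outlen, "    \"%s.%s\"", row[0], row[1]);
	}

	/* Table description: only the publication name is shown. */
	return snprintf(out, outlen, "    \"%s\"", row[0]);
}
```

With the publication/table distinction as the outer condition, each branch states which result columns it reads, and the three output formats of the original code are preserved.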

~~~

6.
+ /*---------------------------------------------------
+ * Publication/ table description columns:
+ * [0]: schema name (nspname)
+ * [col]: table name (relname) / publication name (pubname)
+ * [col + 1]: row filter expression (prqual), may be NULL
+ * [col + 2]: column list (comma-separated), may be NULL
+ * [col + 3]: except flag ("t" if EXCEPT, else "f")
+ *---------------------------------------------------

I've modified this comment slightly so I could understand it better.
See if you agree.

SUGGESTION
/*---------------------------------------------------
 * Description columns:
 * PUB      TBL
 * [0]      -      : schema name (nspname)
 * [col]    -      : table name (relname)
 * -        [col]  : publication name (pubname)
 * [col+1]  [col+1]: row filter expression (prqual), may be NULL
 * [col+2]  [col+2]: column list (comma-separated), may be NULL
 * [col+3]  [col+3]: except flag ("t" if EXCEPT, else "f")
 *---------------------------------------------------
 */
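
Assuming the patch's 'col = is_tbl_desc ? 0 : 1' offset and the [col] .. [col + 3] layout from the original comment, the mapping can also be spelled out in code. FooterField and footer_field_column() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical field identifiers, in the order the comment lists them. */
typedef enum FooterField
{
	FOOTER_NAME = 0,			/* relname (PUB) or pubname (TBL) */
	FOOTER_ROW_FILTER,			/* prqual, may be NULL */
	FOOTER_COLUMN_LIST,			/* comma-separated, may be NULL */
	FOOTER_EXCEPT_FLAG			/* "t" if EXCEPT, else "f" */
} FooterField;

/*
 * Returns the result column holding 'field'. A publication description has
 * the schema name in column 0, so all other fields shift right by one.
 */
static int
footer_field_column(bool is_pub_desc, FooterField field)
{
	int			col = is_pub_desc ? 1 : 0;

	assert(field <= FOOTER_EXCEPT_FLAG);
	return col + (int) field;
}
```

For a table description this yields columns 0 to 3, and for a publication description columns 1 to 4, with column 0 reserved for the schema name.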

~~~

describeOneTableDetails:

7.
+ else if (pset.sversion >= 150000)
+ {
+ printfPQExpBuffer(&buf,
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
+   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
+   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , pg_get_expr(pr.prqual, c.oid)\n"
+   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
+   "         (SELECT string_agg(attname, ', ')\n"
+   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
+   "                pg_catalog.pg_attribute\n"
+   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
+   "        ELSE NULL END) "
+   "FROM pg_catalog.pg_publication p\n"
+   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
+   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
+   "WHERE pr.prrelid = '%s'\n"
+   "UNION\n"
+   "SELECT pubname\n"
+   "     , NULL\n"
+   "     , NULL\n"
+   "FROM pg_catalog.pg_publication p\n"
+   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
+   "ORDER BY 1;",
+   oid, oid, oid, oid);

AFAICT, that >= 150000 code seems to have added another UNION at the
end that was not previously there. What's that about? How is that
related to EXCEPT (column-list)?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: Skipping schema changes in publication

From
Kirill Reshke
Date:
Hi

On Fri, 15 Aug 2025 at 05:53, Peter Smith <smithpb2250@gmail.com> wrote:

> 1.
>   bool isnull = true;
> - Datum whereClauseDatum;
> - Datum columnListDatum;
> + Datum datum;
>
> I know you did not write the code, but that "isnull = true" is
> redundant, and seems kind of misleading because it will always be
> re-assigned before it is used.

People are generally not excited about refactoring code they did not
change. It makes a patch go through more review cycles and makes it
less likely to actually be committed. If we are really wedded to this
change, it could be a separate thread.


> ~~~
>
> 2.
>   /* Load the WHERE clause for this table. */
> - whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -    Anum_pg_publication_rel_prqual,
> -    &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prqual,
> + &isnull);
>   if (!isnull)
> - oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
> + oldrelwhereclause = stringToNode(TextDatumGetCString(datum));
>
>   /* Transform the int2vector column list to a bitmap. */
> - columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -   Anum_pg_publication_rel_prattrs,
> -   &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prattrs,
> + &isnull);
> +
> + if (!isnull)
> + oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
> +
> + /* Load the prexcept flag for this table. */
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
>
>   if (!isnull)
> - oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
> + oldexcept = DatumGetBool(datum);
>
> Use consistent spacing. Either do or don't (I prefer don't) put a
> blank line between the pairs of "datum =" and "if (!isnull)". Avoid
> having a mixture.
>
> ======
> src/bin/psql/describe.c
>
> addFooterToPublicationOrTableDesc:
>
> 3.
> +/*
> + * If is_tbl_desc is true add footer to table description else add footer to
> + * publication description.
> + */
> +static bool
> +addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
> +   bool as_schema, printTableContent *const cont,
> +   bool is_tbl_desc)
>
> 3a.
> Since you are changing this anyway, I think it would be better to keep
> those boolean params together (at the end).
>
> ~
>
> 3b.
> It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
> but having the variable 'is_tbl_desc', because it seems more natural
> to me to read left to right, so the logical order of everything here
> should be pub desc then table desc. In other words, use boolean
> 'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
> thing is kind of a *subset* of the publication description, so it
> makes more sense for that to come last too.
>
> e.g.
> CURRENT
> addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
> SUGGESTION
> addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)
>
> ~
>
> 3c
> While you are changing things, maybe also consider changing that
> 'as_schema' name because I did not understand what "as" means. Perhaps
> rename like 'pub_schemas', or 'only_show_schemas' or something better
> (???).
>
> ~~~
>
> 4.
> + PGresult   *res;
> + int count = 0;
> + int i = 0;
> + int col = is_tbl_desc ? 0 : 1;
> +
> + res = PSQLexec(buf->data);
> + if (!res)
> + return false;
> + else
> + count = PQntuples(res);
> +
>
> 4a.
> Assignment count = 0 is redundant.
>
> ~
>
> 4b.
> Remove the 'i' declaration here. Declare it in the "for" loop later.
>
> ~
>
> 4c.
> The "else" is not required. If 'res' was not good, you already returned.
>
> ~~~
>
> 5.
> + if (as_schema)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
> + else
> + {
> + if (is_tbl_desc)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
> + else
> + printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
> +   PQgetvalue(res, i, col));
>
> This function is basically either (a) a footer for a table description
> or (b) a footer for a publication description. And that all hinges on
> the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
> main condition to be "if (is_tbl_desc)" here.
>
> This turned everything inside out. PSA: a top-up patch to show a way
> to do this. Perhaps my implementation is a bit verbose, but OTOH it
> seems easier to understand. Anyway, see what you think...
>

+1

>
> 6.
> + /*---------------------------------------------------
> + * Publication/ table description columns:
> + * [0]: schema name (nspname)
> + * [col]: table name (relname) / publication name (pubname)
> + * [col + 1]: row filter expression (prqual), may be NULL
> + * [col + 2]: column list (comma-separated), may be NULL
> + * [col + 3]: except flag ("t" if EXCEPT, else "f")
> + *---------------------------------------------------
>
> I've modified this comment slightly so I could understand it better.
> See if you agree.

To me the two versions are equivalent. Let's see what other people think.


-- 
Best regards,
Kirill Reshke



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Fri, 15 Aug 2025 at 06:23, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> Here are some review comments for v20-0003.
>
> ======
> src/backend/commands/publicationcmds.c
>
> AlterPublicationTables:
>
> 1.
>   bool isnull = true;
> - Datum whereClauseDatum;
> - Datum columnListDatum;
> + Datum datum;
>
> I know you did not write the code, but that "isnull = true" is
> redundant, and seems kind of misleading because it will always be
> re-assigned before it is used.
>
Since this is part of already existing code, I think this should be a
new thread. I have created a new thread for this. See [1].

> ~~~
>
> 2.
>   /* Load the WHERE clause for this table. */
> - whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -    Anum_pg_publication_rel_prqual,
> -    &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prqual,
> + &isnull);
>   if (!isnull)
> - oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
> + oldrelwhereclause = stringToNode(TextDatumGetCString(datum));
>
>   /* Transform the int2vector column list to a bitmap. */
> - columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> -   Anum_pg_publication_rel_prattrs,
> -   &isnull);
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prattrs,
> + &isnull);
> +
> + if (!isnull)
> + oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
> +
> + /* Load the prexcept flag for this table. */
> + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
>
>   if (!isnull)
> - oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
> + oldexcept = DatumGetBool(datum);
>
> Use consistent spacing. Either do or don't (I prefer don't) put a
> blank line between the pairs of "datum =" and "if (!isnull)". Avoid
> having a mixture.
>
> ======
> src/bin/psql/describe.c
>
> addFooterToPublicationOrTableDesc:
>
> 3.
> +/*
> + * If is_tbl_desc is true add footer to table description else add footer to
> + * publication description.
> + */
> +static bool
> +addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
> +   bool as_schema, printTableContent *const cont,
> +   bool is_tbl_desc)
>
> 3a.
> Since you are changing this anyway, I think it would be better to keep
> those boolean params together (at the end).
>
> ~
>
> 3b.
> It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
> but having the variable 'is_tbl_desc', because it seems more natural
> to me to read left to right, so the logical order of everything here
> should be pub desc then table desc. In other words, use boolean
> 'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
> thing is kind of a *subset* of the publication description, so it
> makes more sense for that to come last too.
>
> e.g.
> CURRENT
> addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
> SUGGESTION
> addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)
>
> ~
>
> 3c
> While you are changing things, maybe also consider changing that
> 'as_schema' name because I did not understand what "as" means. Perhaps
> rename like 'pub_schemas', or 'only_show_schemas' or something better
> (???).
>
I have used pub_schemas.
> ~~~
>
> 4.
> + PGresult   *res;
> + int count = 0;
> + int i = 0;
> + int col = is_tbl_desc ? 0 : 1;
> +
> + res = PSQLexec(buf->data);
> + if (!res)
> + return false;
> + else
> + count = PQntuples(res);
> +
>
> 4a.
> Assignment count = 0 is redundant.
>
> ~
>
> 4b.
> Remove the 'i' declaration here. Declare it in the "for" loop later.
>
> ~
>
> 4c.
> The "else" is not required. If 'res' was not good, you already returned.
>
> ~~~
>
> 5.
> + if (as_schema)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
> + else
> + {
> + if (is_tbl_desc)
> + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
> + else
> + printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
> +   PQgetvalue(res, i, col));
>
> This function is basically either (a) a footer for a table description
> or (b) a footer for a publication description. And that all hinges on
> the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
> main condition to be "if (is_tbl_desc)" here.
>
> This turned everything inside out. PSA: a top-up patch to show a way
> to do this. Perhaps my implementation is a bit verbose, but OTOH it
> seems easier to understand. Anyway, see what you think...
>
I have also used the patch with minor changes.

> ~~~
>
> 6.
> + /*---------------------------------------------------
> + * Publication/ table description columns:
> + * [0]: schema name (nspname)
> + * [col]: table name (relname) / publication name (pubname)
> + * [col + 1]: row filter expression (prqual), may be NULL
> + * [col + 2]: column list (comma-separated), may be NULL
> + * [col + 3]: except flag ("t" if EXCEPT, else "f")
> + *---------------------------------------------------
>
> I've modified this comment slightly so I could understand it better.
> See if you agree.
>
> SUGGESTION
> /*---------------------------------------------------
>  * Description columns:
>  * PUB      TBL
>  * [0]      -      : schema name (nspname)
>  * [col]    -      : table name (relname)
>  * -        [col]  : publication name (pubname)
>  * [col+1]  [col+1]: row filter expression (prqual), may be NULL
>  * [col+2]  [col+2]: column list (comma-separated), may be NULL
>  * [col+3]  [col+3]: except flag ("t" if EXCEPT, else "f")
>  *---------------------------------------------------
>  */
>
> ~~~
>
I have used the suggested description with some modifications.

> describeOneTableDetails:
>
> 7.
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> +   "     JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> +   "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , pg_get_expr(pr.prqual, c.oid)\n"
> +   "     , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> +   "         (SELECT string_agg(attname, ', ')\n"
> +   "           FROM pg_catalog.generate_series(0, pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> +   "                pg_catalog.pg_attribute\n"
> +   "          WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> +   "        ELSE NULL END) "
> +   "FROM pg_catalog.pg_publication p\n"
> +   "     JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> +   "     JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> +   "WHERE pr.prrelid = '%s'\n"
> +   "UNION\n"
> +   "SELECT pubname\n"
> +   "     , NULL\n"
> +   "     , NULL\n"
> +   "FROM pg_catalog.pg_publication p\n"
> +   "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> +   "ORDER BY 1;",
> +   oid, oid, oid, oid);
>
> AFAICT, that >= 150000 code seems to have added another UNION at the
> end that was not previously there. What's that about? How is that
> related to EXCEPT (column-list)?
>
This patch does not add any new code to the >= 150000 branch; that
branch is the same as on HEAD. The diff appears because of changes in
the 0002 patch. In patch 0002, I did not create a separate full query
for >= 190000 because the changes were small.

I have addressed the rest of the comments and added the changes in the
latest v21 patchset.


[1]: https://www.postgresql.org/message-id/CANhcyEXHiCbk2q8%3Dbq3boQDyc8ac9fjgK-kkp5PdTYLcAOq80Q%40mail.gmail.com

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
Hi Kirill,

Thanks for reviewing the patch.

On Fri, 15 Aug 2025 at 11:46, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> Hi
>
> On Fri, 15 Aug 2025 at 05:53, Peter Smith <smithpb2250@gmail.com> wrote:
>
> > 1.
> >   bool isnull = true;
> > - Datum whereClauseDatum;
> > - Datum columnListDatum;
> > + Datum datum;
> >
> > I know you did not write the code, but that "isnull = true" is
> > redundant, and seems kind of misleading because it will always be
> > re-assigned before it is used.
>
> People are generally not excited about refactoring code they did not
> change. It makes a patch go through more review cycles and makes it
> less likely to actually be committed. If we are really wedded to this
> change, it could be a separate thread.
>
I agree that this belongs in a separate thread. I have created one;
see [1].

>
> > ~~~
> >
> > 2.
> >   /* Load the WHERE clause for this table. */
> > - whereClauseDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > -    Anum_pg_publication_rel_prqual,
> > -    &isnull);
> > + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > + Anum_pg_publication_rel_prqual,
> > + &isnull);
> >   if (!isnull)
> > - oldrelwhereclause = stringToNode(TextDatumGetCString(whereClauseDatum));
> > + oldrelwhereclause = stringToNode(TextDatumGetCString(datum));
> >
> >   /* Transform the int2vector column list to a bitmap. */
> > - columnListDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > -   Anum_pg_publication_rel_prattrs,
> > -   &isnull);
> > + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > + Anum_pg_publication_rel_prattrs,
> > + &isnull);
> > +
> > + if (!isnull)
> > + oldcolumns = pub_collist_to_bitmapset(NULL, datum, NULL);
> > +
> > + /* Load the prexcept flag for this table. */
> > + datum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > + Anum_pg_publication_rel_prexcept,
> > + &isnull);
> >
> >   if (!isnull)
> > - oldcolumns = pub_collist_to_bitmapset(NULL, columnListDatum, NULL);
> > + oldexcept = DatumGetBool(datum);
> >
> > Use consistent spacing. Either do or don't (I prefer don't) put a
> > blank line between the pairs of "datum =" and "if (!isnull)". Avoid
> > having a mixture.
> >
> > ======
> > src/bin/psql/describe.c
> >
> > addFooterToPublicationOrTableDesc:
> >
> > 3.
> > +/*
> > + * If is_tbl_desc is true add footer to table description else add footer to
> > + * publication description.
> > + */
> > +static bool
> > +addFooterToPublicationOrTableDesc(PQExpBuffer buf, const char *footermsg,
> > +   bool as_schema, printTableContent *const cont,
> > +   bool is_tbl_desc)
> >
> > 3a.
> > Since you are changing this anyway, I think it would be better to keep
> > those boolean params together (at the end).
> >
> > ~
> >
> > 3b.
> > It seems a bit mixed up calling this addFooterToPublicationOrTableDesc
> > but having the variable 'is_tbl_desc', because it seems more natural
> > to me to read left to right, so the logical order of everything here
> > should be pub desc then table desc. In other words, use boolean
> > 'is_pub_desc' instead of 'is_tbl_desc'. Also, I think that 'as_schema'
> > thing is kind of a *subset* of the publication description, so it
> > makes more sense for that to come last too.
> >
> > e.g.
> > CURRENT
> > addFooterToPublicationOrTableDesc(buf, footermsg, as_schema, cont, is_tbl_desc)
> > SUGGESTION
> > addFooterToPublicationOrTableDesc(buf, cont, footermsg, is_pub_desc, as_schema)
> >
> > ~
> >
> > 3c
> > While you are changing things, maybe also consider changing that
> > 'as_schema' name because I did not understand what "as" means. Perhaps
> > rename like 'pub_schemas', or 'only_show_schemas' or something better
> > (???).
> >
> > ~~~
> >
> > 4.
> > + PGresult   *res;
> > + int count = 0;
> > + int i = 0;
> > + int col = is_tbl_desc ? 0 : 1;
> > +
> > + res = PSQLexec(buf->data);
> > + if (!res)
> > + return false;
> > + else
> > + count = PQntuples(res);
> > +
> >
> > 4a.
> > Assignment count = 0 is redundant.
> >
> > ~
> >
> > 4b.
> > Remove the 'i' declaration here. Declare it in the "for" loop later.
> >
> > ~
> >
> > 4c.
> > The "else" is not required. If 'res' was not good, you already returned.
> >
> > ~~~
> >
> > 5.
> > + if (as_schema)
> > + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, 0));
> > + else
> > + {
> > + if (is_tbl_desc)
> > + printfPQExpBuffer(buf, "    \"%s\"", PQgetvalue(res, i, col));
> > + else
> > + printfPQExpBuffer(buf, "    \"%s.%s\"", PQgetvalue(res, i, 0),
> > +   PQgetvalue(res, i, col));
> >
> > This function is basically either (a) a footer for a table description
> > or (b) a footer for a publication description. And that all hinges on
> > the boolean 'is_tbl_desc'. Therefore, it seems more natural for the
> > main condition to be "if (is_tbl_desc)" here.
> >
> > This turned everything inside out. PSA: a top-up patch to show a way
> > to do this. Perhaps my implementation is a bit verbose, but OTOH it
> > seems easier to understand. Anyway, see what you think...
> >
>
> + 1
>
Included these changes in the latest patch [2].

> >
> > 6.
> > + /*---------------------------------------------------
> > + * Publication/ table description columns:
> > + * [0]: schema name (nspname)
> > + * [col]: table name (relname) / publication name (pubname)
> > + * [col + 1]: row filter expression (prqual), may be NULL
> > + * [col + 2]: column list (comma-separated), may be NULL
> > + * [col + 3]: except flag ("t" if EXCEPT, else "f")
> > + *---------------------------------------------------
> >
> > I've modified this comment slightly so I could understand it better.
> > See if you agree.
>
> For me that's equal. lets see what other people think
>
For now I have used the version shared by Peter. I felt it was more descriptive.

[1] : https://www.postgresql.org/message-id/CANhcyEXHiCbk2q8%3Dbq3boQDyc8ac9fjgK-kkp5PdTYLcAOq80Q%40mail.gmail.com
[2] : https://www.postgresql.org/message-id/CANhcyEUEMWSkTfGc7Q3B%2BUiOzSiOmOGLgK-%2BC5DXwtCGOnDBhg%40mail.gmail.com

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok,

I reviewed your latest v20-0003 patch and have no more comments at
this time; I only found one trivial typo.

======
src/bin/psql/describe.c

1.
+ /*
+ * Footers entries for a publication description or a table
+ * description
+ */

Typo. /Footers entries/Footer entries/

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> I reviewed your latest v20-0003 patch and have no more comments at
> this time; I only found one trivial typo.
>
> ======
> src/bin/psql/describe.c
>
> 1.
> + /*
> + * Footers entries for a publication description or a table
> + * description
> + */
>
> Typo. /Footers entries/Footer entries/
>

I have fixed it and attached the updated patches.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Shlok,
> >
> > I reviewed your latest v20-0003 patch and have no more comments at
> > this time; I only found one trivial typo.
> >
> > ======
> > src/bin/psql/describe.c
> >
> > 1.
> > + /*
> > + * Footers entries for a publication description or a table
> > + * description
> > + */
> >
> > Typo. /Footers entries/Footer entries/
> >
>
> I have fixed it and attached the updated patches
>
The patches no longer applied on HEAD and needed a rebase. Here are the
rebased patches.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, 5 Sept 2025 at 11:57, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > Hi Shlok,
> > >
> > > I reviewed your latest v20-0003 patch and have no more comments at
> > > this time; I only found one trivial typo.
> > >
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > 1.
> > > + /*
> > > + * Footers entries for a publication description or a table
> > > + * description
> > > + */
> > >
> > > Typo. /Footers entries/Footer entries/
> > >
> >
> > I have fixed it and attached the updated patches
> >
> The patches were not applying on HEAD and needed a Rebase. Here is the
> rebased patches

Consider the following scenario:
create table t1(c1 int, c2 int);
create publication pub1 for table t1 except (c1, c2);

In this case, the publication is created in such a way that no columns
are included, so effectively no data will be replicated to the
subscriber.
However, when attempting an UPDATE, the following error occurs:
postgres=# update t1 set c1 = 2;
ERROR:  cannot update table "t1" because it does not have a replica
identity and publishes updates
HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

Is this behavior expected?

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Fri, 5 Sept 2025 at 11:57, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > Hi Shlok,
> > >
> > > I reviewed your latest v20-0003 patch and have no more comments at
> > > this time; I only found one trivial typo.
> > >
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > 1.
> > > + /*
> > > + * Footers entries for a publication description or a table
> > > + * description
> > > + */
> > >
> > > Typo. /Footers entries/Footer entries/
> > >
> >
> > I have fixed it and attached the updated patches
> >
> The patches were not applying on HEAD and needed a Rebase. Here is the
> rebased patches

A few comments:
1) Currently, pg_publication_tables does not make it clear whether a
publication uses a column list or an EXCEPT column list. Can we
indicate whether the list is an EXCEPT list or not?
create publication pub1 for table t1(c1);
create publication pub2 for  table t1 except ( c1);

postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename | attnames | rowfilter
---------+------------+-----------+----------+-----------
 pub1    | public     | t1        | {c1}     |
 pub2    | public     | t1        | {c2}     |
(2 rows)

2) Tab completion is not correct in this case:
postgres=# alter publication pub3 add table t2 EXCEPT (
,        WHERE (

3) tab6 is not used anywhere; it can be removed:
+       CREATE TABLE tab5 (a int, b int, c int);
+       CREATE TABLE tab6 (agen int GENERATED ALWAYS AS (1) STORED, bgen int GENERATED ALWAYS AS (2) STORED);
+       INSERT INTO tab1 VALUES (1, 2, 3);

4) Both of these tests use the same message:
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1 ORDER BY a");
+is( $result, qq(|2|3
+|5|6),
+       'check incremental insert for EXCEPT (column-list) publication');
+$result = $node_subscriber->safe_psql('postgres',
+       "SELECT * FROM sch1.tab1 ORDER BY a");
+is( $result, qq(1||
+4||), 'check incremental insert for EXCEPT (column-list) publication');

We can include the table name in each message to differentiate the
tests, which will make a test failure easier to identify.

5) /newly added column are is replicated/ should be "newly added
column is replicated"
is($result, qq(|||10), 'newly added column are is replicated');

Regards,
Vignesh



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 25 Sept 2025 at 14:18, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 5 Sept 2025 at 11:57, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > Hi Shlok,
> > > >
> > > > I reviewed your latest v20-0003 patch and have no more comments at
> > > > this time; I only found one trivial typo.
> > > >
> > > > ======
> > > > src/bin/psql/describe.c
> > > >
> > > > 1.
> > > > + /*
> > > > + * Footers entries for a publication description or a table
> > > > + * description
> > > > + */
> > > >
> > > > Typo. /Footers entries/Footer entries/
> > > >
> > >
> > > I have fixed it and attached the updated patches
> > >
> > The patches were not applying on HEAD and needed a Rebase. Here is the
> > rebased patches
>
> Consider the following scenario:
> create table t1(c1 int, c2 int);
> create publication pub1 for table t1 except (c1, c2);
>
> In this case, the publication is created in such a way that no columns
> are included, so effectively no data will be replicated to the
> subscriber.
> However, when attempting an UPDATE, the following error occurs:
> postgres=# update t1 set c1 = 2;
> ERROR:  cannot update table "t1" because it does not have a replica
> identity and publishes updates
> HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
>
> Is this behavior expected?

Hi Vignesh,

I think this behaviour is the same as in other similar cases, such as:

1. Publication on a table with no columns:
CREATE TABLE t1();
CREATE PUBLICATION pub1 FOR TABLE t1;

postgres=# DELETE FROM t1;
ERROR:  cannot delete from table "t1" because it does not have a
replica identity and publishes deletes
HINT:  To enable deleting from the table, set REPLICA IDENTITY using
ALTER TABLE.

2. All the columns in a table are generated columns:
CREATE TABLE t2(a int GENERATED ALWAYS AS (2*2) STORED);
CREATE PUBLICATION pub2 FOR TABLE t2 WITH (publish_generated_columns='none');

In this case, since "publish_generated_columns=none", no changes should
be published for table t2. But we get the following:
postgres=# DELETE FROM t2;
ERROR:  cannot delete from table "t2" because it does not have a
replica identity and publishes deletes
HINT:  To enable deleting from the table, set REPLICA IDENTITY using
ALTER TABLE.

In the above cases as well, no columns are published, but we get a
similar error. Given this behaviour in HEAD, I think it is okay for
EXCEPT column_list to behave similarly when all columns are
excluded. Thoughts?
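
For reference, a minimal sketch of the two usual ways to avoid this
error, using only syntax available in released PostgreSQL (it does not
use the proposed EXCEPT clause). The names demo_t and demo_pub are
hypothetical and not taken from the patch or its tests:

```sql
-- Hypothetical names; plain publication without any column list.
CREATE TABLE demo_t (c1 int, c2 int);
CREATE PUBLICATION demo_pub FOR TABLE demo_t;

-- At this point "UPDATE demo_t SET c1 = 2;" is rejected, because the
-- publication publishes updates and demo_t has no replica identity.

-- Option 1: give the table a replica identity.
ALTER TABLE demo_t REPLICA IDENTITY FULL;
UPDATE demo_t SET c1 = 2;

-- Option 2: publish only inserts, so UPDATE and DELETE no longer
-- require a replica identity.
ALTER TABLE demo_t REPLICA IDENTITY DEFAULT;
ALTER PUBLICATION demo_pub SET (publish = 'insert');
DELETE FROM demo_t;
```

Whether either option is appropriate when every column is excluded is
part of the open question above; the sketch only shows how the existing
check behaves.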

Thanks,
Shlok Kyal



Re: Skipping schema changes in publication

From
Shlok Kyal
Date:
On Thu, 25 Sept 2025 at 16:39, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 5 Sept 2025 at 11:57, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > On Mon, 25 Aug 2025 at 13:38, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > On Thu, 21 Aug 2025 at 05:33, Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > Hi Shlok,
> > > >
> > > > I reviewed your latest v20-0003 patch and have no more comments at
> > > > this time; I only found one trivial typo.
> > > >
> > > > ======
> > > > src/bin/psql/describe.c
> > > >
> > > > 1.
> > > > + /*
> > > > + * Footers entries for a publication description or a table
> > > > + * description
> > > > + */
> > > >
> > > > Typo. /Footers entries/Footer entries/
> > > >
> > >
> > > I have fixed it and attached the updated patches
> > >
> > The patches were not applying on HEAD and needed a Rebase. Here is the
> > rebased patches
>
> Few comments:
> 1) Currently it is not clear from pg_publication_tables whether the
> listed columns come from a column list or from an EXCEPT column list.
> Can we indicate whether it is an exclude list or not:
> create publication pub1 for table t1(c1);
> create publication pub2 for  table t1 except ( c1);
>
> postgres=# select * from pg_publication_tables;
>  pubname | schemaname | tablename | attnames | rowfilter
> ---------+------------+-----------+----------+-----------
>  pub1    | public     | t1        | {c1}     |
>  pub2    | public     | t1        | {c2}     |
> (2 rows)
>
> 2) Tab completion is not correct in this case:
> postgres=# alter publication pub3 add table t2 EXCEPT (
> ,        WHERE (
>
> 3) tab6 is not used anywhere, it can be removed:
> +       CREATE TABLE tab5 (a int, b int, c int);
> +       CREATE TABLE tab6 (agen int GENERATED ALWAYS AS (1) STORED,
> bgen int GENERATED ALWAYS AS (2) STORED);
> +       INSERT INTO tab1 VALUES (1, 2, 3);
>
> 4) both these tests are using same message:
> +  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1 ORDER BY a");
> +is( $result, qq(|2|3
> +|5|6),
> +       'check incremental insert for EXCEPT (column-list) publication');
> +$result = $node_subscriber->safe_psql('postgres',
> +       "SELECT * FROM sch1.tab1 ORDER BY a");
> +is( $result, qq(1||
> +4||), 'check incremental insert for EXCEPT (column-list) publication');
>
> We can include the table name here to differentiate the tests, which
> will help in identifying test failures easily.
>
> 5) /newly added column are is replicated/ should be "newly added
> column is replicated"
> is($result, qq(|||10), 'newly added column are is replicated');

Hi Vignesh,

Thanks for reviewing the patch.
I have addressed the comments and attached the updated version.

Thanks,
Shlok Kyal

Attachments

Re: Skipping schema changes in publication

From
Peter Smith
Date:
Hi Shlok,

I was looking at the recent v24 changes.

======
GENERAL.

I saw that you modified the system view to add a new flag:

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>exceptcol</structfield> <type>bool</type>
+      </para>
+      <para>
+       True if a column list with <literal>EXCEPT</literal> clause is specified
+       for the table in the publication.
+      </para></entry>
+     </row>

So output now might look like this:

+CREATE TABLE pub_test_except1 (a int NOT NULL, b int, c int NOT NULL, d int);
+CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1,
pub_sch1.pub_test_except2 EXCEPT (b, c);
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
+    pubname     | schemaname |    tablename     | attnames  |
rowfilter | exceptcol
+----------------+------------+------------------+-----------+-----------+-----------
+ testpub_except | public     | pub_test_except1 | {a,b,c,d} |           | f
+ testpub_except | pub_sch1   | pub_test_except2 | {a,d}     |           | t
+(2 rows)

~~~

I think this was done in response to a comment from Vignesh [1], but
it did not get implemented in the way that I had imagined. e.g. I
imagined the view might be more like this:

+    pubname     | schemaname |    tablename     | attnames  |
rowfilter | exceptcols
+----------------+------------+------------------+-----------+-----------+-----------
+ testpub_except | public     | pub_test_except1 | {a,b,c,d} |           |
+ testpub_except | pub_sch1   | pub_test_except2 | {a,d}     |           | {b,c}

I don't know if broadcasting to the user what the unpublished/hidden
columns' names are is very wise (e.g. "{password,internal_notes,
salary}"), but OTOH just having a boolean flag saying that "something"
was excluded didn't seem useful.

~

Furthermore, having a Boolean seemed strangely incompatible with a
normal column list. e.g. Let's say there is a table T1 with cols
c1,c2,c3,c4.

I could publish that as "FOR TABLE T1(c1,c2,c3)"
Or as "FOR TABLE T1 EXCEPT (c4)"

In the v24 implementation, AFAIK, the view will show both of those as
"attnames = {c1,c2,c3}", with exceptcol being "f" and "t" respectively.
That seemed odd.
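
To make the comparison concrete, here is a small sketch of what the
existing view reports for an ordinary column list. It uses only released
syntax (column lists and the attnames column, PostgreSQL 15 or later);
the names demo_t1 and demo_collist are hypothetical. The EXCEPT (c4)
variant is left out because that syntax exists only in the patch under
review.

```sql
-- Hypothetical names; ordinary column list on a four-column table.
CREATE TABLE demo_t1 (c1 int, c2 int, c3 int, c4 int);
CREATE PUBLICATION demo_collist FOR TABLE demo_t1 (c1, c2, c3);

-- attnames lists the columns that will actually be published.
SELECT pubname, schemaname, tablename, attnames
FROM pg_publication_tables
WHERE pubname = 'demo_collist';
```

If the patched server reports the same attnames for the EXCEPT (c4)
form, as described above, then the only difference between the two rows
would be the Boolean flag.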

~

Lastly, I think the EXCEPT (col-list) feature was mostly added just
to help users with 100s of columns to write their CREATE PUBLICATION
statement more easily. The view already shows all the columns
that will be published, so I'm kind of -0.5 on this idea of changing
the view to show how they typed their statement.

======
[1] https://www.postgresql.org/message-id/CALDaNm32XQDR4qsOhPQeophVbZ8r%2BShJSSssoVfdPcwG6joPHQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Sat, 27 Sept 2025 at 01:20, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> Thanks for reviewing the patch.
> I have addressed the comments and attached the updated version.

If all columns are excluded, we do not publish the changes. However,
when a table has no columns, the data is still replicated. Should we
make this behavior consistent?
@@ -1482,6 +1525,13 @@ pgoutput_change(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
        relentry = get_rel_sync_entry(data, relation);
+       /*
+        * If all columns of a table are present in column list specified with
+        * EXCEPT, skip publishing the changes.
+        */
+       if (relentry->all_cols_excluded)
+               return;

Steps to check the above issue:
-- pub
create table t1();
create table t2(c1 int, c2 int);
create publication pub1 FOR table t1;
create publication pub2 FOR table t2 except(c1, c2);

--sub
create table t1(c1 int);
create table t2(c1 int, c2 int);
create subscription sub1 connection 'dbname=postgres host=localhost
port=5432' publication pub1,pub2;

--pub
postgres=# insert into t1 default values ;
INSERT 0 1
postgres=# insert into t2 default values;
INSERT 0 1

--sub
-- In case of table having no columns, data is replicated
postgres=# select * from t1;
 c1
----

(1 row)

-- In case of table having all columns excluded, data is not replicated
postgres=# select * from t2;
 c1 | c2
----+----
(0 rows)

Thoughts?

Regards,
Vignesh



Re: Skipping schema changes in publication

From
vignesh C
Date:
On Mon, 29 Sept 2025 at 08:58, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Shlok,
>
> I was looking at the recent v24 changes.
>
> ======
> GENERAL.
>
> I saw that you modified the system view to add a new flag:
>
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>exceptcol</structfield> <type>bool</type>
> +      </para>
> +      <para>
> +       True if a column list with <literal>EXCEPT</literal> clause is specified
> +       for the table in the publication.
> +      </para></entry>
> +     </row>
>
> So output now might look like this:
>
> +CREATE TABLE pub_test_except1 (a int NOT NULL, b int, c int NOT NULL, d int);
> +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1,
> pub_sch1.pub_test_except2 EXCEPT (b, c);
> +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
> +    pubname     | schemaname |    tablename     | attnames  |
> rowfilter | exceptcol
> +----------------+------------+------------------+-----------+-----------+-----------
> + testpub_except | public     | pub_test_except1 | {a,b,c,d} |           | f
> + testpub_except | pub_sch1   | pub_test_except2 | {a,d}     |           | t
> +(2 rows)
>
> ~~~
>
> I think this was done in response to a comment from Vignesh [1], but
> it did not get implemented in the way that I had imagined. e.g. I
> imagined the view might be more like this:
>
> +    pubname     | schemaname |    tablename     | attnames  |
> rowfilter | exceptcols
> +----------------+------------+------------------+-----------+-----------+-----------
> + testpub_except | public     | pub_test_except1 | {a,b,c,d} |           |
> + testpub_except | pub_sch1   | pub_test_except2 | {a,d}     |           | {b,c}
>
> I don't know if broadcasting to the user what the unpublished/hidden
> columns' names are is very wise (e.g. "{password,internal_notes,
> salary}"), but OTOH just having a boolean flag saying that "something"
> was excluded didn't seem useful.
>
> ~
>
> Furthermore, having a Boolean seemed strangely incompatible with a
> normal column list. e.g. Let's say there is a table T1 with cols
> c1,c2,c3,c4.
>
> I could publish that as "FOR TABLE T1(c1,c2,c3)"
> Or as "FOR TABLE T1 EXCEPT (c4)"
>
> In the v24 implementation, AFAIK, the view will show both of those as
> "attnames = {c1,c2,c3}", with exceptcol being "f" and "t" respectively.
> That seemed odd.
>
> ~
>
> Lastly, I think the EXCEPT (col-list) feature was mostly added just
> to help users with 100s of columns to write their CREATE PUBLICATION
> statement more easily. The view already shows all the columns
> that will be published, so I'm kind of -0.5 on this idea of changing
> the view to show how they typed their statement.

On further consideration, I’m ok with removing this column to avoid
potential confusion.

Regards,
Vignesh