Re: Data is copied twice when specifying both child and parent table in publication

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Data is copied twice when specifying both child and parent table in publication
Дата
Msg-id CAHut+Ptk6oANo2-mw5191aU0HHP9YKjYhWgTZYU0g0Y1Co_5SQ@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Data is copied twice when specifying both child and parent table in publication  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Ответы RE: Data is copied twice when specifying both child and parent table in publication  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Список pgsql-hackers
Here are some review comments for patch code of HEAD_v19-0001

======
doc/src/sgml/ref/create_publication.sgml

1.
+         <para>
+          There can be a case where a subscription combines multiple
+          publications. If a root partitioned table is published by any
+          subscribed publications which set
+          <literal>publish_via_partition_root</literal> = true, changes on this
+          root partitioned table (or on its partitions) will be published using
+          the identity and schema of this root partitioned table rather than
+          that of the individual partitions.
+         </para>

1a.
The paragraph prior to this one just refers to "partitioned tables"
instead of "root partitioned table", so IMO we should continue with
the same terminology.

I also modified the remaining text slightly. AFAIK my suggestion
conveys exactly the same information but is shorter.

SUGGESTION
There can be a case where one subscription combines multiple
publications. If any of those publications has set
<literal>publish_via_partition_root</literal> = true, then changes in
a partitioned table (or on its partitions) will be published using the
identity and schema of the partitioned table.

~

1b.
Shouldn't that paragraph (or possibly somewhere in the CREATE
SUBSCRIPTION notes?) also explain that in this scenario the logical
replication will only publish one set of changes? After all, this is
the whole point of the patch, but I am not sure if the user will know
of this behaviour from the current documentation.

======
src/backend/catalog/pg_publication.c

2. filter_partitions

BEFORE:
static void
filter_partitions(List *table_infos)
{
ListCell   *lc;

foreach(lc, table_infos)
{
bool skip = false;
List    *ancestors = NIL;
ListCell    *lc2;
published_rel    *table_info = (published_rel *) lfirst(lc);

if (get_rel_relispartition(table_info->relid))
ancestors = get_partition_ancestors(table_info->relid);

foreach(lc2, ancestors)
{
Oid ancestor = lfirst_oid(lc2);

/* Is ancestor exists in the published table list? */
if (is_ancestor_member_tableinfos(ancestor, table_infos))
{
skip = true;
break;
}
}

if (skip)
table_infos = foreach_delete_current(table_infos, lc);
}
}

~

2a.
My previous review [1] (see #1) suggested putting most code within the
condition. AFAICT my comment still is applicable but was not yet
addressed.

2b.
IMO the comment "/* Is ancestor exists in the published table list?
*/" is unnecessary because it is already clear what is the purpose of
the function named "is_ancestor_member_tableinfos".


SUGGESTION
static void
filter_partitions(List *table_infos)
{
    ListCell   *lc;

    foreach(lc, table_infos)
    {
        if (get_rel_relispartition(table_info->relid))
        {
            bool skip = false;
            ListCell    *lc2;
            published_rel    *table_info = (published_rel *) lfirst(lc);
            List *ancestors = get_partition_ancestors(table_info->relid);

            foreach(lc2, ancestors)
            {
                Oid ancestor = lfirst_oid(lc2);

                if (is_ancestor_member_tableinfos(ancestor, table_infos))
                {
                    skip = true;
                    break;
                }
            }

            if (skip)
                table_infos = foreach_delete_current(table_infos, lc);
        }
    }
}

======
src/backend/commands/subscriptioncmds.c

3.
 fetch_table_list(WalReceiverConn *wrconn, List *publications)
 {
  WalRcvExecResult *res;
- StringInfoData cmd;
+ StringInfoData cmd,
+ pub_names;
  TupleTableSlot *slot;
  Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
  List    *tablelist = NIL;
- bool check_columnlist = (walrcv_server_version(wrconn) >= 150000);
+ int server_version = walrcv_server_version(wrconn);
+ bool check_columnlist = (server_version >= 150000);
+
+ initStringInfo(&pub_names);
+ get_publications_str(publications, &pub_names, true);

  initStringInfo(&cmd);
- appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n");

- /* Get column lists for each relation if the publisher supports it */
- if (check_columnlist)
- appendStringInfoString(&cmd, ", t.attnames\n");
+ /* Get the list of tables from the publisher. */
+ if (server_version >= 160000)
+ {

~

I think the 'pub_names' is only needed within that ">= 160000" condition.

So all the below code can be moved into that scope can't it?

+ StringInfoData pub_names;
+ initStringInfo(&pub_names);
+ get_publications_str(publications, &pub_names, true);

+ pfree(pub_names.data);

------
[1] https://www.postgresql.org/message-id/CAHut%2BPuNsvO9o9XzeJuSLsAsndgCKVphDPBRqYuOTy2bR28E%2Bg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
Следующее
От: Andres Freund
Дата:
Сообщение: Re: CREATE DATABASE ... STRATEGY WAL_LOG issues