Re: Data is copied twice when specifying both child and parent table in publication

From: Greg Nancarrow
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date:
Msg-id: CAJcOf-fv7tEv=N+LZo9H1fp1A7NB9wsWDDMw048XNy2fyESgnw@mail.gmail.com
In reply to: Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Wed, Oct 20, 2021 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > Actually, at least with the scenario I gave steps for, after looking
> > > at it again and debugging, I think that the behavior is understandable
> > > and not a bug.
> > > The reason is that the INSERTed data is first published through the
> > > partitions, since initially there is no partitioned table in the
> > > publication (so publish_via_partition_root=true doesn't have any
> > > effect). But then adding the partitioned table to the publication and
> > > refreshing the publication in the subscriber, the data is then
> > > published "using the identity and schema of the partitioned table" due
> > > to publish_via_partition_root=true. Note that the corresponding table
> > > in the subscriber may well be a non-partitioned table (or the
> > > partitions arranged differently) so the data does need to be
> > > replicated again.
> >
>
> Even if the partitions are arranged differently why would the user
> expect the same data to be replicated twice?
>

It's the same data, but published in different ways because of changes
the user made to the publication.
I am not talking in general, I am specifically referring to the
scenario I gave steps for.
In the example scenario I gave, initially when the subscription was
made, the publication just explicitly included the partitions, but
publish_via_partition_root was true. So in this case it publishes
through the individual partitions (as no partitioned table is present
in the publication). Then, on the publisher side, the partitioned table
was added to the publication, and ALTER SUBSCRIPTION ...
REFRESH PUBLICATION was run on the subscriber side. Now that the
partitioned table is present in the publication and
publish_via_partition_root is true, it is "published using the
identity and schema of the partitioned table rather than that of the
individual partitions that are actually changed". So the data is
replicated again.
This scenario didn't use initial table data, so initial table sync
didn't come into play (although as I previously posted, I can see a
double-publish issue on initial sync if data is put in the table prior
to subscription and partitions have been explicitly added to the
publication).
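
For anyone following along, the sequence described above can be sketched
roughly as below. This is only an illustration of the steps, not the
exact script from my earlier post: the table names (sales, sales_east,
sales_west), the publication/subscription names (pub, sub) and the
connection string are all made up for the example.

```sql
-- Hypothetical names throughout; a sketch of the steps, not the original script.

-- On the publisher: a partitioned table with two partitions.
CREATE TABLE sales (id int, region text) PARTITION BY LIST (region);
CREATE TABLE sales_east PARTITION OF sales FOR VALUES IN ('east');
CREATE TABLE sales_west PARTITION OF sales FOR VALUES IN ('west');

-- Publish only the partitions, with publish_via_partition_root enabled.
-- The partitioned table itself is not in the publication at this point.
CREATE PUBLICATION pub FOR TABLE sales_east, sales_west
    WITH (publish_via_partition_root = true);

-- On the subscriber: create matching tables first, then subscribe.
-- (The connection string is a placeholder.)
CREATE SUBSCRIPTION sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub;

-- On the publisher: insert data; it is published through the partitions.
INSERT INTO sales VALUES (1, 'east'), (2, 'west');

-- On the publisher: now add the partitioned table to the publication.
ALTER PUBLICATION pub ADD TABLE sales;

-- On the subscriber: pick up the change to the publication.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
```

After the last step, checking the row count of sales on the subscriber
should show whether the rows inserted earlier have arrived a second time.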

Regards,
Greg Nancarrow
Fujitsu Australia


