Обсуждение: Re: create subscription with (origin = none, copy_data = on)
On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> wrote: > > Hi, hackers! > > I am looking at subscription creation command: > > CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin = > none, copy_data = on); > > For now we log a warning if the publisher has subscribed to the same > table from some other publisher. > However, in case of publication with publish_via_partition_root option, > we will not raise such warinigs > because SQL command in check_publications_origin() checks only directly > published tables. Yes, I agree that we are checking only the directly published tables which is why there is no warning in this case. I'm working on a fix to change the check_publications_origin to check accordingly. > For example: > > CREATE TABLE t(id int) PARTITION BY RANGE(id); > CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5); > CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10); > -- subscribe to part2 > CREATE SUBSCRIPTION sub_part2 CONNECTION '...' PUBLICATION pub_part2; > CREATE PUBLICATION pub_t FOR TABLE t; > CREATE PUBLICATION pub_t_via_root FOR TABLE t WITH > (publish_via_partition_root); > > and now this command will raise a warning: > CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t WITH (origin > = none, copy_data = on); > > but not this: > CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t_via_root > WITH (origin = none, copy_data = on); > > We also do not take into account cases of foreign partitions: > CREATE TABLE t(id int) PARTITION BY RANGE(id); > CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5); > CREATE FOREIGN TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10) > SERVER fdw_server; > CREATE PUBLICATION pub_t FOR TABLE t; > > Maybe we should raise WARNING (or even ERROR) in such cases? Currently we do not support replication of foreign tables. This is mentioned in logical replication restriction sections at [1]. > I would also note that the (origin = none) will work as expected, but in > case of (origin = any) > it will lead to inappropriate behavior - we will perform an initial sync > of "t", but we unable to > replicate further updates for "part2". I noticed the same behavior with both origins as none and any. i.e initial sync is ok and then replication of foreign table part2 will not work which is because of the above restriction that I mentioned. Just to be sure that I'm not checking a different scenario, could you share the test for this case. [1] - https://www.postgresql.org/docs/devel/logical-replication-restrictions.html Regards, Vignesh
17.01.2025 23:00, vignesh C пишет: > On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev > <s.tatarintsev@postgrespro.ru> wrote: >> Hi, hackers! >> >> I am looking at subscription creation command: >> >> CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin = >> none, copy_data = on); >> >> For now we log a warning if the publisher has subscribed to the same >> table from some other publisher. >> However, in case of publication with publish_via_partition_root option, >> we will not raise such warinigs >> because SQL command in check_publications_origin() checks only directly >> published tables. > Yes, I agree that we are checking only the directly published tables > which is why there is no warning in this case. I'm working on a fix to > change the check_publications_origin to check accordingly. seems promising. I would like to see this patch >> For example: >> >> CREATE TABLE t(id int) PARTITION BY RANGE(id); >> CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5); >> CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10); >> -- subscribe to part2 >> CREATE SUBSCRIPTION sub_part2 CONNECTION '...' PUBLICATION pub_part2; >> CREATE PUBLICATION pub_t FOR TABLE t; >> CREATE PUBLICATION pub_t_via_root FOR TABLE t WITH >> (publish_via_partition_root); >> >> and now this command will raise a warning: >> CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t WITH (origin >> = none, copy_data = on); >> >> but not this: >> CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t_via_root >> WITH (origin = none, copy_data = on); >> >> We also do not take into account cases of foreign partitions: >> CREATE TABLE t(id int) PARTITION BY RANGE(id); >> CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5); >> CREATE FOREIGN TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10) >> SERVER fdw_server; >> CREATE PUBLICATION pub_t FOR TABLE t; >> >> Maybe we should raise WARNING (or even ERROR) in such cases? > Currently we do not support replication of foreign tables. This is > mentioned in logical replication restriction sections at [1]. Yes of course, but we must raise an ERROR in such cases >> I would also note that the (origin = none) will work as expected, but in >> case of (origin = any) >> it will lead to inappropriate behavior - we will perform an initial sync >> of "t", but we unable to >> replicate further updates for "part2". > I noticed the same behavior with both origins as none and any. i.e > initial sync is ok and then replication of foreign table part2 will > not work which is because of the above restriction that I mentioned. > Just to be sure that I'm not checking a different scenario, could you > share the test for this case. That's right, but i think we must tell user about inappropriate usage. check_publication_add_relation() checks only publication creation for foreign tables directly, but not partitioned tables structure. My test case just shows that we can try to replicate partitioned table with foreign partitions, but I think we should disallow such cases. I would also like to show an interesting subscription creation scenario that I found: 1. subscriber: calls check_publications_origin() 2. publisher: executes the create/attach foreign partition command 3. the subscriber is sure he checked the origin and performing COPY t TO STDOUT i.e. between the check and the start of copying the publication has changed the problem is that check_publications_origin() and COPY t TO STDOUT are performed in different transactions > > [1] - https://www.postgresql.org/docs/devel/logical-replication-restrictions.html > > Regards, > Vignesh
On Fri, 17 Jan 2025 at 21:30, vignesh C <vignesh21@gmail.com> wrote:On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> wrote:Hi, hackers! I am looking at subscription creation command: CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin = none, copy_data = on); For now we log a warning if the publisher has subscribed to the same table from some other publisher. However, in case of publication with publish_via_partition_root option, we will not raise such warinigs because SQL command in check_publications_origin() checks only directly published tables.Yes, I agree that we are checking only the directly published tables which is why there is no warning in this case. I'm working on a fix to change the check_publications_origin to check accordingly.Attached patch has the fix for this issue which includes the partition tables also for the publication now and throws a warning appropriately. Regards, Vignesh
Thanks for patch!
I think we must take into account whole inheritance tree of partitioned table.
For example:
node_A:
CREATE TABLE t(id int);
CREATE PUBLICATION pub_b FOR TABLE t;
node_A:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part PARTITION OF t FOR VALUES FROM (0) TO (10) PARTITION BY RANGE(id);
CREATE TABLE subpart PARTITION OF part FOR VALUES FROM (0) TO (5);
CREATE SUBSCRIPTION sub_c CONNECTION '$node_B_connstr' PUBLICATION pub_b;
CREATE PUBLICATION pub_t FOR TABLE t WITH (publish_via_partition_root);
CREATE PUBLICATION pub_part FOR TABLE part WITH (publish_via_partition_root);
node_C:
-- this command will raise a warning CREATE SUBSCRIPTION sub_t CONNECTION '$node_A_connstr' PUBLICATION pub_t WITH (origin = none, copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_t;
-- here we got silence, but "part" is in tree of upper level replicated table
CREATE SUBSCRIPTION sub_part CONNECTION '$node_A_connstr' PUBLICATION pub_part WITH (origin = none, copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_part;
I think that for each partition/partitioned table in the publication we can use something like
select relid from pg_partition_tree('part'::regclass)
union
select relid from pg_partition_ancestors('part'::regclass);
In this case we don't care about publish_via_partition_root option, because we already check all inheritance tree, and there is no need to change pg_class
What are you thinking about it?
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote: > > Attached patch has the fix for this issue which includes the partition > tables also for the publication now and throws a warning > appropriately. > The corresponding query (see "To find which tables might potentially include non-local origins .." on [1]) on the create_subscription doc page. * @@ -1147,10 +1151,12 @@ pg_get_publication_tables(PG_FUNCTION_ARGS) *schemarelids; relids = GetPublicationRelations(pub_elem->oid, + allparttables ? PUBLICATION_PART_ALL : pub_elem->pubviaroot ? PUBLICATION_PART_ROOT : PUBLICATION_PART_LEAF); schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid, + allparttables ? PUBLICATION_PART_ALL : pub_elem->pubviaroot ? PUBLICATION_PART_ROOT : PUBLICATION_PART_LEAF); Don't we need to add similar handling FOR ALL TABLES case? If not, why? BTW, the proposed fix is not backpatcheable as it changes the catalog which requires catversion bump. However, as this is a WARNING case, if we can't find a fix that can't be backpatched, we can fix it in HEAD-only. [1] - https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-NOTES -- With Regards, Amit Kapila.
22.01.2025 18:41, Shlok Kyal пишет: > On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu) > <houzj.fnst@fujitsu.com> wrote: >> On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote: >> >> Hi, >> >>> On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote: >>>> On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote: >>>>> Attached patch has the fix for this issue which includes the >>>>> partition tables also for the publication now and throws a warning >>>>> appropriately. >>>>> >>>> The corresponding query (see "To find which tables might potentially >>>> include non-local origins .." on [1]) on the create_subscription doc >>>> page. >>>> BTW, the proposed fix is not backpatcheable as it changes the catalog >>>> which requires catversion bump. However, as this is a WARNING case, if >>>> we can't find a fix that can't be backpatched, we can fix it in >>>> HEAD-only. >>> I could not find a way to fix the back version without changing the catalog >>> version. >>> >>> The attached v3 version has the changes for the same. >> Thanks for the patch. >> >> I agree that covering the partitioned table case when checking the non-local >> origin data on publisher is an improvement. But I think adding or extending the >> SQL functions may not be the appropriate way to fix because the new functions >> cannot be used in older PG version and is also not backpatchable. >> >> I am thinking it would be better to use the existing pg_partition_ancestors() >> and pg_partition_tree() to verify the same, which can be used in all supported >> PG versions and is also backpatchable. >> >> And here is another version which fixed the issue like that. I have not added >> tests for it, but I think it's doable to write the something like the testcases >> provided by Sergey. This patch does not fix the foreign tabel as that seems to >> be a separate issue which can be fixed independtly. >> >> Hi Sergey, if you have the time, could you please verify whether this patch >> resolves the partition issue you reported? I've confirmed that it passes the >> partitioned tests in the scripts, but I would appreciate your confirmation for >> the same. >> > Hi Hou-san, > > I have created a patch to add a test for the patch. > > v5-0001 : same as v4-0001 > v5-0002: adds the testcase > > Thanks and Regards, > Shlok Kyal Hi! Sorry, I can't do it right now, but I think we need add testcase where ancestor of published table have different origin. Also we still don't care about foreign partitions (as I wrote earlier we should raise an ERROR for such publications). This checking must be done at publication creation in check_publication_add_relation(), but I not sure about publication for all tables/for tables in schema because one foreign table will block publication creation Thoughts?
On Wed, Jan 22, 2025 at 9:44 PM Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> wrote: > > 22.01.2025 18:41, Shlok Kyal пишет: > > Also we still don't care about foreign partitions (as I wrote earlier > we should raise an ERROR for such publications). > I think dealing with this separately from the origin vs. partitioned table issue is better. -- With Regards, Amit Kapila.
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote: > > Hi, > > > > > On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Attached patch has the fix for this issue which includes the > > > > partition tables also for the publication now and throws a warning > > > > appropriately. > > > > > > > > > > The corresponding query (see "To find which tables might potentially > > > include non-local origins .." on [1]) on the create_subscription doc > > > page. > > > > > BTW, the proposed fix is not backpatcheable as it changes the catalog > > > which requires catversion bump. However, as this is a WARNING case, if > > > we can't find a fix that can't be backpatched, we can fix it in > > > HEAD-only. > > > > I could not find a way to fix the back version without changing the catalog > > version. > > > > The attached v3 version has the changes for the same. > > Thanks for the patch. > > I agree that covering the partitioned table case when checking the non-local > origin data on publisher is an improvement. But I think adding or extending the > SQL functions may not be the appropriate way to fix because the new functions > cannot be used in older PG version and is also not backpatchable. > > I am thinking it would be better to use the existing pg_partition_ancestors() > and pg_partition_tree() to verify the same, which can be used in all supported > PG versions and is also backpatchable. > > And here is another version which fixed the issue like that. I have not added > tests for it, but I think it's doable to write the something like the testcases > provided by Sergey. This patch does not fix the foreign tabel as that seems to > be a separate issue which can be fixed independtly. > > Hi Sergey, if you have the time, could you please verify whether this patch > resolves the partition issue you reported? I've confirmed that it passes the > partitioned tests in the scripts, but I would appreciate your confirmation for > the same. Hi Hou-san, I tested the patch, and it is working fine on HEAD. I also tried to apply the patches to back branches PG17 and PG 16. But the patch does not apply. This 'origin' option was added in PG 16. So, this patch will not be required for PG 15 and back branches. Thanks and Regards, Shlok Kyal
On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu) > > <houzj.fnst@fujitsu.com> wrote: > > > > > > Thanks for the patch. > > > > > > I agree that covering the partitioned table case when checking the > > > non-local origin data on publisher is an improvement. But I think > > > adding or extending the SQL functions may not be the appropriate way > > > to fix because the new functions cannot be used in older PG version and is > also not backpatchable. > > > > > > I am thinking it would be better to use the existing > > > pg_partition_ancestors() and pg_partition_tree() to verify the same, > > > which can be used in all supported PG versions and is also backpatchable. > > > > > > And here is another version which fixed the issue like that. I have > > > not added tests for it, but I think it's doable to write the > > > something like the testcases provided by Sergey. This patch does not > > > fix the foreign tabel as that seems to be a separate issue which can be fixed > independtly. > > > > > > Hi Sergey, if you have the time, could you please verify whether > > > this patch resolves the partition issue you reported? I've confirmed > > > that it passes the partitioned tests in the scripts, but I would > > > appreciate your confirmation for the same. > > > > Hi Hou-san, > > > > I tested the patch, and it is working fine on HEAD. > > I also tried to apply the patches to back branches PG17 and PG 16. But > > the patch does not apply. > > > > This 'origin' option was added in PG 16. So, this patch will not be > > required for PG 15 and back branches. > > > I have created a patch which applies to both PG17 and PG 16. The > v6-0002 is the test patch. It applies to all the branches (HEAD, PG17, > PG16) correctly. Thanks for the patch. I think the testcases could be improved. It's not clear why a separate schema is created for tables. I assume it was initially intended to test TABLES IN SCHEMA but was later modified. If the separate schema is still necessary, could you please add comments to clarify its purpose? Besides, the new table name 'ts' seems a bit unconventional. It would be better to align with the naming style of existing test cases for consistency and clarity. Also, Sergey had suggested adding an more test to verify scenarios where the table's ancestors are subscribed. It appears this hasn't been added yet. I think it would be better to add it. Best Regards, Hou zj
23.01.2025 15:24, Zhijie Hou (Fujitsu) пишет: > On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: >> On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: >>> On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu) >>> <houzj.fnst@fujitsu.com> wrote: >>>> Thanks for the patch. >>>> >>>> I agree that covering the partitioned table case when checking the >>>> non-local origin data on publisher is an improvement. But I think >>>> adding or extending the SQL functions may not be the appropriate way >>>> to fix because the new functions cannot be used in older PG version and is >> also not backpatchable. >>>> I am thinking it would be better to use the existing >>>> pg_partition_ancestors() and pg_partition_tree() to verify the same, >>>> which can be used in all supported PG versions and is also backpatchable. >>>> >>>> And here is another version which fixed the issue like that. I have >>>> not added tests for it, but I think it's doable to write the >>>> something like the testcases provided by Sergey. This patch does not >>>> fix the foreign tabel as that seems to be a separate issue which can be fixed >> independtly. >>>> Hi Sergey, if you have the time, could you please verify whether >>>> this patch resolves the partition issue you reported? I've confirmed >>>> that it passes the partitioned tests in the scripts, but I would >>>> appreciate your confirmation for the same. >>> Hi Hou-san, >>> >>> I tested the patch, and it is working fine on HEAD. >>> I also tried to apply the patches to back branches PG17 and PG 16. But >>> the patch does not apply. >>> >>> This 'origin' option was added in PG 16. So, this patch will not be >>> required for PG 15 and back branches. >>> >> I have created a patch which applies to both PG17 and PG 16. The >> v6-0002 is the test patch. It applies to all the branches (HEAD, PG17, >> PG16) correctly. > Thanks for the patch. I think the testcases could be improved. > > It's not clear why a separate schema is created for tables. I assume it was > initially intended to test TABLES IN SCHEMA but was later modified. If the > separate schema is still necessary, could you please add comments to clarify > its purpose? > > Besides, the new table name 'ts' seems a bit unconventional. It would be better > to align with the naming style of existing test cases for consistency and > clarity. > > Also, Sergey had suggested adding an more test to verify scenarios where the > table's ancestors are subscribed. It appears this hasn't been added yet. I > think it would be better to add it. > > Best Regards, > Hou zj Hi! That's right, separate schema was used to test "CREATE PUBLICATION FOR TABLES IN SCHEMA". My first patch with test cases contains 3 scenarios: CREATE PUBLICATION pub_a FOR TABLE ts.t; CREATE PUBLICATION pub_a_via_root FOR TABLE ts.t WITH (publish_via_partition_root); CREATE PUBLICATION pub_a_schema FOR TABLES IN SCHEMA ts WITH (publish_via_partition_root); (but not scenario where the table's ancestors are subscribed)
24.01.2025 07:22, Shlok Kyal пишет: > On Thu, 23 Jan 2025 at 17:54, Zhijie Hou (Fujitsu) > <houzj.fnst@fujitsu.com> wrote: >> On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: >>> On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: >>>> On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu) >>>> <houzj.fnst@fujitsu.com> wrote: >>>>> Thanks for the patch. >>>>> >>>>> I agree that covering the partitioned table case when checking the >>>>> non-local origin data on publisher is an improvement. But I think >>>>> adding or extending the SQL functions may not be the appropriate way >>>>> to fix because the new functions cannot be used in older PG version and is >>> also not backpatchable. >>>>> I am thinking it would be better to use the existing >>>>> pg_partition_ancestors() and pg_partition_tree() to verify the same, >>>>> which can be used in all supported PG versions and is also backpatchable. >>>>> >>>>> And here is another version which fixed the issue like that. I have >>>>> not added tests for it, but I think it's doable to write the >>>>> something like the testcases provided by Sergey. This patch does not >>>>> fix the foreign tabel as that seems to be a separate issue which can be fixed >>> independtly. >>>>> Hi Sergey, if you have the time, could you please verify whether >>>>> this patch resolves the partition issue you reported? I've confirmed >>>>> that it passes the partitioned tests in the scripts, but I would >>>>> appreciate your confirmation for the same. >>>> Hi Hou-san, >>>> >>>> I tested the patch, and it is working fine on HEAD. >>>> I also tried to apply the patches to back branches PG17 and PG 16. But >>>> the patch does not apply. >>>> >>>> This 'origin' option was added in PG 16. So, this patch will not be >>>> required for PG 15 and back branches. >>>> >>> I have created a patch which applies to both PG17 and PG 16. The >>> v6-0002 is the test patch. It applies to all the branches (HEAD, PG17, >>> PG16) correctly. >> Thanks for the patch. I think the testcases could be improved. >> >> It's not clear why a separate schema is created for tables. I assume it was >> initially intended to test TABLES IN SCHEMA but was later modified. If the >> separate schema is still necessary, could you please add comments to clarify >> its purpose? >> Besides, the new table name 'ts' seems a bit unconventional. It would be better >> to align with the naming style of existing test cases for consistency and >> clarity. > I think the schema is not required. I have removed it. > >> Also, Sergey had suggested adding an more test to verify scenarios where the >> table's ancestors are subscribed. It appears this hasn't been added yet. I >> think it would be better to add it. > I have added the test in the latest patch. > > Thanks and Regards, > Shlok Kyal Hello! I think there is another problem - publisher can modify publication during the execution of "CREATE SUBSCRIPTION" (Rarely, this situation occurred in my tests.) I.e.: 1. Publisher: CREATE PUBLICATION ... FOR TABLE t1; 2. Subscriber (starts CREATE SUBSCRIPTION ...): check_publications_origin() 3. Publisher: ALTER PUBLICATION ... ADD TABLE t2; 4. Subscriber (still process CREATE SUBSCRIPTION ...): fetch_table_list() So, we check publication with only t1, but fetch t1,t2 I think we must start transaction on publisher while executing CreateSubscription() on subscriber (Or may be take an lock on publisher ) Thoughts?