Re: long-standing data loss bug in initial sync of logical replication
From | Nitin Motiani |
---|---|
Subject | Re: long-standing data loss bug in initial sync of logical replication |
Date | |
Msg-id | CAH5HC95UsGrD1ez0vfNPCLP6vE3ptmaOkUECB5JsqLOCYjRW3w@mail.gmail.com |
In reply to | Re: long-standing data loss bug in initial sync of logical replication (vignesh C <vignesh21@gmail.com>) |
Responses | Re: long-standing data loss bug in initial sync of logical replication |
List | pgsql-hackers |
On Wed, Jul 10, 2024 at 10:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 10 Jul 2024 at 12:28, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > The patch missed to use the ShareRowExclusiveLock for partitions, see
> > attached. I haven't tested it but they should also face the same
> > problem. Apart from that, I have changed the comments in a few places
> > in the patch.
>
> I could not hit the updated ShareRowExclusiveLock changes through the
> partition table, instead I could verify it using the inheritance
> table. Added a test for the same and also attaching the backbranch
> patch.
>

Hi,

I tested alternative-experimental-fix-lock.patch provided by Tomas
(it replaces SUE, i.e. ShareUpdateExclusiveLock, with SRE, i.e.
ShareRowExclusiveLock, in OpenTableList). I believe there are a couple
of scenarios the patch does not cover.

1. It doesn't handle the case of "ALTER PUBLICATION <pub> ADD TABLES IN
SCHEMA <schema>".

I took crash-test.sh provided by Tomas and modified it to add all
tables in the schema to the publication using the following command:

    ALTER PUBLICATION p ADD TABLES IN SCHEMA public

The modified script is attached (crash-test-with-schema.sh). With this
script, I can reproduce the issue even with the patch applied. This is
because the code path that adds a schema to the publication doesn't go
through OpenTableList.

I have also attached a script, run-test-with-schema.sh, to run
crash-test-with-schema.sh in a loop with randomly generated parameters
(modified from run.sh provided by Tomas).

2. The second issue is a deadlock which happens when the ALTER
PUBLICATION command is run for a comma-separated list of tables.

I created another script, create-test-tables-order-reverse.sh. This
script runs a command like the following:

    ALTER PUBLICATION p ADD TABLE test_2,test_1

Running the above script, I was able to get a deadlock error (the
output is attached in deadlock.txt). In the ALTER PUBLICATION command,
I added the tables in reverse order to increase the probability of the
deadlock, but it should happen with any order of tables.

I am not sure if the deadlock is a major issue, because detecting the
deadlock is better than data loss. The schema issue is probably more
important. I didn't test it with the latest patches sent by Vignesh,
but since the code changes in that patch are also in OpenTableList, I
think the schema scenario won't be covered by those.

Thanks & Regards,
Nitin Motiani
Google
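
The deadlock in scenario 2 can be illustrated with a minimal Python sketch. It is a toy model, not PostgreSQL code: it assumes a hypothetical concurrent writer transaction that writes to test_1 and then test_2 (the workload of the attached scripts is not reproduced here), and it encodes only the part of PostgreSQL's table-level lock conflict matrix needed for this case. The names CONFLICTS, conflicts, plan and would_deadlock are illustrative.

```python
# Subset of PostgreSQL's table-level lock conflict matrix:
# CONFLICTS[requested] is the set of held modes that block the request.
CONFLICTS = {
    "RowExclusiveLock": {
        "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    "ShareUpdateExclusiveLock": {
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    "ShareRowExclusiveLock": {
        "RowExclusiveLock", "ShareUpdateExclusiveLock", "ShareLock",
        "ShareRowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock",
    },
}


def conflicts(requested, held):
    """Return True if a request for `requested` must wait on `held`."""
    return held in CONFLICTS[requested]


def plan(mode, first_table, second_table):
    """A session that already holds `mode` on first_table and now
    requests `mode` on second_table (hypothetical two-step plan)."""
    return ((first_table, mode), (second_table, mode))


def would_deadlock(session_a, session_b):
    """True if each session's pending request is blocked by the lock
    the other session already holds (a two-node wait-for cycle)."""
    (held_a, want_a), (held_b, want_b) = session_a, session_b
    a_waits = want_a[0] == held_b[0] and conflicts(want_a[1], held_b[1])
    b_waits = want_b[0] == held_a[0] and conflicts(want_b[1], held_a[1])
    return a_waits and b_waits


# ALTER PUBLICATION p ADD TABLE test_2,test_1 with the patched lock mode.
alter_sre = plan("ShareRowExclusiveLock", "test_2", "test_1")
# The same command with the unpatched lock mode.
alter_sue = plan("ShareUpdateExclusiveLock", "test_2", "test_1")
# Assumed writer: touches test_1 first, then test_2.
writer = plan("RowExclusiveLock", "test_1", "test_2")

if __name__ == "__main__":
    print("SRE vs writer deadlocks:", would_deadlock(alter_sre, writer))
    print("SUE vs writer deadlocks:", would_deadlock(alter_sue, writer))
```

In this model the unpatched mode never waits on the writer, because ShareUpdateExclusiveLock does not conflict with RowExclusiveLock; that same property is why it offers no protection against concurrent writes. The patched mode does conflict, so a writer that takes the tables in the opposite order from the ALTER PUBLICATION list can form a wait cycle. Listing the tables in the other order only moves the problem to writers that use the reverse order, which fits the observation that any order can deadlock.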