Re: Skipping logical replication transactions on subscriber side
От | Noah Misch |
---|---|
Тема | Re: Skipping logical replication transactions on subscriber side |
Дата | |
Msg-id | 20220402015943.GB3719101@rfd.leadboat.com обсуждение исходный текст |
Ответ на | Re: Skipping logical replication transactions on subscriber side (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: Skipping logical replication transactions on subscriber side
|
Список | pgsql-hackers |
On Sat, Apr 02, 2022 at 06:49:20AM +0530, Amit Kapila wrote: > On Sat, Apr 2, 2022 at 5:41 AM Noah Misch <noah@leadboat.com> wrote: > > > > On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote: > > > > On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote: > > > > > src/test/subscription/t/029_on_error.pl has been failing reliably on the five > > > > > AIX buildfarm members: > > > > > > > > > > # poll_query_until timed out executing this query: > > > > > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub' > > > > > # expecting this output: > > > > > # t > > > > > # last actual query output: > > > > > # f > > > > > # with stderr: > > > > > timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) att/029_on_error.pl line 50. > > > > > > > > > > I've posted five sets of logs (2.7 MiB compressed) here: > > > > > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing > > > > > Given that "SELECT subskiplsn = '0/0' > > > FROM pg_subscription WHERE subname = 'sub’” didn't return true, some > > > value was set to subskiplsn even after the unique key error. > > > > > > So I'm guessing that the apply worker could not get the updated value > > > of the subskiplsn or its MySubscription->skiplsn could not match with > > > the transaction's finish LSN. Also, given that the test is failing on > > > all AIX buildfarm members, there might be something specific to AIX. > > > > > > Noah, to investigate this issue further, is it possible for you to > > > apply the attached patch and run the 029_on_error.pl test? The patch > > > adds some logs to get additional information. > > > > Logs attached. > > Thank you. > > By seeing the below Logs: > ---- > .... > 2022-04-01 18:19:34.710 CUT [58327402] LOG: not started skipping > changes: my_skiplsn 14EB7D8/B0706F72 finish_lsn 0/14EB7D8 > ... > ---- > > It seems that the value of skiplsn read in GetSubscription is wrong > which makes the apply worker think it doesn't need to skip the > transaction. Now, in Alter/Create Subscription, we are using > LSNGetDatum() to store skiplsn value in pg_subscription but while > reading it in GetSubscription(), we are not converting back the datum > to LSN by using DatumGetLSN(). Is it possible that on this machine it > might be leading to not getting the right value for skiplsn? I think > it is worth trying to see if this fixes the problem. After applying datum_to_lsn_skiplsn_1.patch, I get another failure. Logs attached.
Вложения
В списке pgsql-hackers по дате отправления: