Re: Skipping logical replication transactions on subscriber side
From | Masahiko Sawada |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAD21AoAqDVCVzSpv6DOMLqLeO+m4Jh=dKv9a8Cc4qthdpYzqFw@mail.gmail.com |
In response to | Re: Skipping logical replication transactions on subscriber side (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | RE: Skipping logical replication transactions on subscriber side |
List | pgsql-hackers |
On Wed, Dec 1, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > I have a question about the testcase (I could be wrong here).
> > > >
> > > > Is it possible that the race condition happen between apply worker(test_tab1)
> > > > and table sync worker(test_tab2) ? If so, it seems the error("replication
> > > > origin with OID") could happen randomly until we resolve the conflict.
> > > > Based on this, for the following code:
> > > > -----
> > > >         # Wait for the error statistics to be updated.
> > > >         my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
> > > >         $node->poll_query_until(
> > > >                 'postgres', $check_sql,
> > > >         ) or die "Timed out while waiting for statistics to be updated";
> > > >
> > > >         * [1] *
> > > >
> > > >         $check_sql =
> > > >                 qq[
> > > > SELECT subname, last_error_command, last_error_relid::regclass,
> > > > last_error_count > 0 ] . $part_sql;
> > > >         my $result = $node->safe_psql('postgres', $check_sql);
> > > >         is($result, $expected, $msg);
> > > > -----
> > > >
> > > > Is it possible that the error("replication origin with OID") happen again at the
> > > > place [1]. In this case, the error message we have checked could be replaced by
> > > > another error("replication origin ...") and then the test fail ?
> > > >
> > >
> > > Once we get the "duplicate key violation ..." error before * [1] * via
> > > apply_worker then we shouldn't get replication origin-specific error
> > > because the origin set up is done before starting to apply changes.
> >
> > Right.
> >
> > > Also, even if that or some other happens after * [1] * because of
> > > errmsg_prefix check it should still succeed.
> >
> > In this case, the old error ("duplicate key violation ...") is
> > overwritten by a new error (e.g., connection error. not sure how
> > possible it is)
> >
>
> Yeah, or probably some memory allocation failure. I think the
> probability of such failures is very low but OTOH why take chance.
>
> > and the test fails because the query returns no
> > entries, no?
> >
>
> Right.
>
> > If so, the result from the second check_sql is unstable
> > and it's probably better to check the result only once. That is, the
> > first check_sql includes the command and we exit from the function
> > once we confirm the error entry is expectedly updated.
> >
>
> Yeah, I think that should be fine.

Okay, I've attached an updated patch. Please review it.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
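
The difference between the two-query check and the single check discussed above can be illustrated with a small stand-alone simulation. This is only a sketch and not the attached patch: MockNode, latest_error, entry_matches, two_step_check and single_check are hypothetical names, the error rows are made-up values, and no PostgreSQL server or test framework is involved. Each read of the mock returns the next scripted state of the error entry, so an overwrite between two reads can be reproduced deterministically.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the test node. Each call to latest_error()
# returns the next scripted state of the subscription error entry; the
# last state repeats once the script is exhausted.
package MockNode;

sub new {
    my ($class, @snapshots) = @_;
    return bless { snapshots => [@snapshots] }, $class;
}

sub latest_error {
    my ($self) = @_;
    my $snaps = $self->{snapshots};
    return @$snaps > 1 ? shift @$snaps : $snaps->[0];
}

package main;

# True if the entry exists, its message starts with $prefix and, when a
# command is given, the recorded command matches it as well.
sub entry_matches {
    my ($row, $command, $prefix) = @_;
    return 0 unless defined $row;
    return 0 unless index($row->{message}, $prefix) == 0;
    return 1 unless defined $command;
    return $row->{command} eq $command ? 1 : 0;
}

# Two reads: poll until an entry with the expected message prefix exists,
# then read the entry a second time to verify the details. A new error
# arriving between the two reads makes the second read come back empty.
sub two_step_check {
    my ($node, $command, $prefix) = @_;
    my $seen = 0;
    for (1 .. 10) {
        if (entry_matches($node->latest_error, undef, $prefix)) {
            $seen = 1;
            last;
        }
    }
    return 0 unless $seen;
    return entry_matches($node->latest_error, $command, $prefix);
}

# One read per poll iteration: the polled condition already includes the
# command, and the function returns as soon as it holds.
sub single_check {
    my ($node, $command, $prefix) = @_;
    for (1 .. 10) {
        return 1 if entry_matches($node->latest_error, $command, $prefix);
    }
    return 0;
}

# Scripted history (illustrative values only): no entry yet, then the
# expected error, then an unrelated error that overwrites it.
my @script = (
    undef,
    { command => 'INSERT', message => 'duplicate key violation ...' },
    { command => '',       message => 'connection error' },
);

print "two-step: ", two_step_check(MockNode->new(@script), 'INSERT', 'duplicate key'), "\n";
print "single:   ", single_check(MockNode->new(@script), 'INSERT', 'duplicate key'), "\n";
```

With this scripted history the two-step variant reports 0, because its second read sees the overwritten entry, while the single check reports 1. In a real TAP test the single check would presumably correspond to one poll_query_until call whose query carries all the conditions, but that mapping is an assumption here.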