Re: Excessive number of replication slots for 12->14 logical replication

From: Amit Kapila
Subject: Re: Excessive number of replication slots for 12->14 logical replication
Msg-id: CAA4eK1KpXQRLswLkqLiWx61DBbL4x1NBSRxpLYmSJzr3gRYc7A@mail.gmail.com
In response to: Re: Excessive number of replication slots for 12->14 logical replication (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-bugs

On Sat, Sep 10, 2022 at 11:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Sep 10, 2022 at 11:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 10, 2022 at 3:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > One thing that is not clear to me is how the first error, "could not
> > find record while sending logically-decoded data ...", can happen due
> > to this commit. Also, based on the origin, even if the client sends a
> > prior location (0/0 in this case), the server will still start from
> > the location where the client has confirmed the commit (aka the
> > confirmed_flush location).
> >
>
> I missed the point that if the 'origin_lsn' is ahead of the
> 'confirmed_flush' location, then it will start from some prior
> location, which I think will be problematic.
>

I was able to reproduce the behavior seen in the BF failure with the
help of a debugger, by introducing an artificial error in
libpqrcv_endstreaming and by ensuring that the apply worker skips the
transaction that performs an operation on a table that the sync
worker is still copying. I also had to suppress keepalive messages
from the publisher; otherwise, they move the confirmed_flush location
ahead of origin_lsn. So, it is clear that this commit caused the BF
failure, even though the first error seen, "ERROR:  could not find
record while sending logically-decoded data: missing contrecord at
0/1CCF9F0", was not due to this commit.
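
To make the ordering problem concrete, here is a rough sketch of how I
understand the start position to be chosen on the publisher side. It
is a toy model, not the server code: the function names and the LSN
values are made up for illustration, and it assumes that a requested
location of 0/0, or one behind confirmed_flush, is replaced by
confirmed_flush.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back into the 'hi/lo' hex notation."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def effective_start(requested: str, confirmed_flush: str) -> str:
    """Hypothetical model of where decoding starts.

    Assumption: the server never starts before confirmed_flush, so a
    request of 0/0 or of any earlier location is moved forward to it.
    """
    req = parse_lsn(requested)
    conf = parse_lsn(confirmed_flush)
    if req == 0 or req < conf:
        return format_lsn(conf)
    return format_lsn(req)


def resends_applied_changes(origin_lsn: str, confirmed_flush: str) -> bool:
    """True if a restart from 0/0 would decode changes the subscriber
    already applied, i.e. origin_lsn is ahead of confirmed_flush."""
    start = parse_lsn(effective_start("0/0", confirmed_flush))
    return start < parse_lsn(origin_lsn)


# Made-up values: the origin has advanced past confirmed_flush because
# no keepalive reply moved confirmed_flush forward.
print(effective_start("0/0", "0/1000"))           # start requested as 0/0
print(resends_applied_changes("0/2000", "0/1000"))
```

With keepalive messages flowing, confirmed_flush would normally catch
up with or pass origin_lsn, and the second call would return False,
which is why I had to suppress them to hit the failure.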

I don't have any better ideas to solve this at this stage than what
Hou-San has mentioned in his email [1]. What do you think?

[1] -
https://www.postgresql.org/message-id/OS0PR01MB5716E128E78C6CECD15C718394429%40OS0PR01MB5716.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.


