Re: Logical replication timeout problem

From	Masahiko Sawada
Subject	Re: Logical replication timeout problem
Date	29 March 2022, 05:07:17
Msg-id	CAD21AoAo6x3rAQ7VuzPT9paA4Y7uuWPqdQn_XYk1bWpCMF_N5g@mail.gmail.com
In reply to	Re: Logical replication timeout problem (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On Fri, Mar 25, 2022 at 5:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Mar 25, 2022 at 11:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 2:23 PM wangw.fnst@fujitsu.com
> > <wangw.fnst@fujitsu.com> wrote:
> >
> > Since commit 75b1521 added decoding of sequence to logical
> > replication, the patch needs to have pgoutput_sequence() call
> > update_progress().
> >
>
> Yeah, I also think this needs to be addressed. But apart from this, I
> want to know your and others' opinions on the following two points:
> a. Both this and the patch discussed in the nearby thread [1] add an
> additional parameter to
> WalSndUpdateProgress/OutputPluginUpdateProgress and it seems to me
> that both are required. The additional parameter 'last_write' added by
> this patch indicates: "If the last write is skipped then try (if we
> are close to wal_sender_timeout) to send a keepalive message to the
> receiver to avoid timeouts.". This means it can be used after any
> 'write' message. OTOH, the parameter 'skipped_xact' added by another
> patch [1] indicates: if we have skipped sending anything for a
> transaction, then send a keepalive for synchronous replication to avoid
> any delays in such a transaction. Does this sound reasonable or can
> you think of a better way to deal with it?

These current approaches look good to me.
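
For illustration, the 'last_write' behavior quoted above could be modelled
roughly as follows. This is a standalone sketch, not code from the patch:
need_keepalive() and millis_t are hypothetical names, and treating "close to
wal_sender_timeout" as "half of it has elapsed since the last reply" is an
assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp type; plain milliseconds here. */
typedef int64_t millis_t;

/*
 * Decide whether to send a keepalive after a change has been processed.
 *
 * skipped_last_write  true if nothing was sent for the last change
 * now                 current time
 * last_reply          time of the last message received from the receiver
 * timeout             wal_sender_timeout; 0 means the timeout is disabled
 *
 * Assumption: "close to the timeout" means at least half of it has elapsed.
 */
bool
need_keepalive(bool skipped_last_write, millis_t now,
               millis_t last_reply, millis_t timeout)
{
    /* Something was written, or timeouts are off: no keepalive needed. */
    if (!skipped_last_write || timeout <= 0)
        return false;

    return (now - last_reply) >= timeout / 2;
}
```

With a 60 s timeout, a skipped write 10 s after the last reply sends nothing,
while a skipped write 30 s after it triggers a keepalive.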

> b. Do we want to backpatch the patch in this thread? I am reluctant to
> backpatch because it changes the exposed API which can have an impact
> and second there exists a workaround (user can increase
> wal_sender_timeout/wal_receiver_timeout).

Yeah, we should avoid API changes between minor versions. I feel it's
better to fix this in the back-branches as well, but we would probably
need a different fix for them. The issue reported on this thread is
quite confusing; it looks like a network problem but it is not one.
Also, a user who hits this issue has to increase wal_sender_timeout
because of the decoded data size, which also delays detection of real
network problems. That seems like an unrelated trade-off.
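
To make the trade-off concrete, here is a deliberately simplified model in C.
All names are hypothetical and it ignores everything except the timeout
itself: it assumes the sender gives up once the connection has been silent
for wal_sender_timeout, and that a dead link is noticed no sooner than that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a duration type; plain milliseconds here. */
typedef int64_t millis_t;

/* True if the sender would give up after this much silence (0 disables). */
bool
sender_times_out(millis_t silent_for, millis_t timeout)
{
    return timeout > 0 && silent_for >= timeout;
}

/* The workaround: a timeout just long enough to survive a silent decode. */
millis_t
workaround_timeout(millis_t silent_decode_ms)
{
    return silent_decode_ms + 1;
}

/* In this model, a really dead link is noticed only after the timeout. */
millis_t
dead_link_detection_delay(millis_t timeout)
{
    return timeout;
}
```

In this model, a 90 s decode that sends nothing trips a 60 s timeout; raising
the timeout past 90 s avoids that, but a genuinely dead connection then also
goes unnoticed for more than 90 s.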

Regards,
-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
