Re: Logical replication timeout problem
From | Masahiko Sawada |
---|---|
Subject | Re: Logical replication timeout problem |
Date | |
Msg-id | CAD21AoAo6x3rAQ7VuzPT9paA4Y7uuWPqdQn_XYk1bWpCMF_N5g@mail.gmail.com |
In response to | Re: Logical replication timeout problem (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Fri, Mar 25, 2022 at 5:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Mar 25, 2022 at 11:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 2:23 PM wangw.fnst@fujitsu.com
> > <wangw.fnst@fujitsu.com> wrote:
> >
> > Since commit 75b1521 added decoding of sequence to logical
> > replication, the patch needs to have pgoutput_sequence() call
> > update_progress().
> >
>
> Yeah, I also think this needs to be addressed. But apart from this, I
> want to know your and other's opinion on the following two points:
> a. Both this and the patch discussed in the nearby thread [1] add an
> additional parameter to
> WalSndUpdateProgress/OutputPluginUpdateProgress and it seems to me
> that both are required. The additional parameter 'last_write' added by
> this patch indicates: "If the last write is skipped then try (if we
> are close to wal_sender_timeout) to send a keepalive message to the
> receiver to avoid timeouts.". This means it can be used after any
> 'write' message. OTOH, the parameter 'skipped_xact' added by another
> patch [1] indicates if we have skipped sending anything for a
> transaction then send keepalive for synchronous replication to avoid
> any delays in such a transaction. Does this sound reasonable or can
> you think of a better way to deal with it?

These current approaches look good to me.

> b. Do we want to backpatch the patch in this thread? I am reluctant to
> backpatch because it changes the exposed API which can have an impact
> and second there exists a workaround (user can increase
> wal_sender_timeout/wal_receiver_timeout).

Yeah, we should avoid API changes between minor versions. I feel it's better to fix it for the back-branches as well, but we probably need a different fix for them. The issue reported on this thread seems quite confusing; it looks like a network problem, but it is not. Also, a user who faces this issue has to increase wal_sender_timeout because of the decoded data size, which also means delaying the detection of real network problems. It seems like an unrelated trade-off.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
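
To illustrate the 'last_write' behaviour quoted above ("if the last write is skipped then try, if we are close to wal_sender_timeout, to send a keepalive"), here is a minimal Python sketch of the decision only. It is not the patch itself: `SenderState`, `should_send_keepalive`, the millisecond clock, and the choice of half of wal_sender_timeout as the meaning of "close to" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    """Hypothetical stand-in for the walsender state this check needs."""
    wal_sender_timeout_ms: int  # 0 is treated as "timeout disabled"
    last_reply_ms: int          # assumed: time the receiver last replied


def should_send_keepalive(state: SenderState, now_ms: int,
                          last_write_skipped: bool) -> bool:
    """Decide whether to send a keepalive after an output-plugin write.

    Assumption: "close to wal_sender_timeout" means that at least half of
    the timeout has elapsed since the receiver last replied.
    """
    if not last_write_skipped:
        # Data was actually sent, so the receiver has something to answer.
        return False
    if state.wal_sender_timeout_ms <= 0:
        # No timeout configured, so there is nothing to guard against.
        return False
    elapsed_ms = now_ms - state.last_reply_ms
    return elapsed_ms >= state.wal_sender_timeout_ms // 2
```

Under these assumptions, a sender that keeps skipping changes while decoding a large transaction would still contact the receiver before the timeout expires, so wal_sender_timeout could stay small enough to detect real network problems.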