Re: logical decoding and replication of sequences, take 2
От | Ashutosh Bapat |
---|---|
Тема | Re: logical decoding and replication of sequences, take 2 |
Дата | |
Msg-id | CAExHW5vN=aVYOFikt6cqpsezwYErGW6aqaDJ0jUdu++B0eA0Dw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: logical decoding and replication of sequences, take 2 (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Ответы |
Re: logical decoding and replication of sequences, take 2
(Tomas Vondra <tomas.vondra@enterprisedb.com>)
|
Список | pgsql-hackers |
On Tue, Aug 1, 2023 at 8:46 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > Anyway, I think this is "just" a matter of efficiency, not correctness. > IMHO there are bigger questions regarding the "going back" behavior > after apply restart. sequence_decode() has the following code /* Skip the change if already processed (per the snapshot). */ if (transactional && !SnapBuildProcessChange(builder, xid, buf->origptr)) return; else if (!transactional && (SnapBuildCurrentState(builder) != SNAPBUILD_CONSISTENT || SnapBuildXactNeedsSkip(builder, buf->origptr))) return; This means that if the subscription restarts, the upstream will *not* send any non-transactional sequence changes with LSN prior to the LSN specified by START_REPLICATION command. That should avoid replicating all the non-transactional sequence changes since ReplicationSlot::restart_lsn if the subscription restarts. But in apply_handle_sequence(), we do not update the replorigin_session_origin_lsn with LSN of the non-transactional sequence change when it's applied. This means that if a subscription restarts while it is half way through applying a transaction, those changes will be replicated again. This will move the sequence backward. If the subscription keeps restarting again and again while applying that transaction, we will see the sequence "rubber banding" [1] on subscription. So untill the transaction is completely applied, the other users of the sequence may see duplicate values during this time. I think this is undesirable. But I am not able to find a case where this can lead to conflicting values after failover. If there's only one transaction which is repeatedly being applied, the rows which use sequence values were never committed so there's no conflicting value present on the subscription. The same reasoning can be extended to multiple in-flight transactions. If another transaction (T2) uses the sequence values changed by in-flight transaction T1 and if T2 commits before T1, the sequence changes used by T2 must have LSNs before commit of T2 and thus they will never be replicated. (See example below). T1 insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q1 T2 insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q2 COMMIT; T1 insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q13 COMMIT; So I am not able to imagine a case when a sequence going backward can cause conflicting values. But whether or not that's the case, downstream should not request (and hence receive) any changes that have been already applied (and committed) downstream as a principle. I think a way to achieve this is to update the replorigin_session_origin_lsn so that a sequence change applied once is not requested (and hence sent) again. [1] https://en.wikipedia.org/wiki/Rubber_banding -- Best Wishes, Ashutosh Bapat
В списке pgsql-hackers по дате отправления: