An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
From | Bharath Rupireddy |
---|---|
Subject | An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication |
Date | |
Msg-id | CALj2ACUrOB59QaE6=jF2cFAyv1MR7fzD8tr4YM5+OwEYG1SNzA@mail.gmail.com |
Responses |
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication |
List | pgsql-hackers |
Hi,

With synchronous replication, a transaction (txn) is typically committed locally first and then streamed to the sync standbys, and the backend that generated the transaction waits for an ack from the sync standbys. While waiting for the ack, the query or the txn may get canceled (QueryCancelPending is true), or the waiting backend may be asked to exit (ProcDiePending is true). In either case, the wait for the ack is canceled and the txn is left in an inconsistent state (the client may assume that the txn was replicated to the sync standbys): "The transaction has already committed locally, but might not have been replicated to the standby.".

Upon restart after a crash, or in the next txn after the old locally committed txn was canceled, the client will be able to see txns that weren't actually streamed to the sync standbys. Also, if the client fails over to one of the sync standbys after the crash (either by choice or because of automatic failover management), the txns committed locally on the crashed primary would be lost, which isn't good in a true HA solution.

Here's a proposal (mentioned previously by Satya [1]) to avoid the above problems:

1) Wait a configurable amount of time before the backends cancel the sync replication wait, i.e. delay processing of QueryCancelPending and ProcDiePending. The patch introduces a new timeout GUC, synchronous_replication_naptime_before_cancel; when set, it lets the backends wait for the ack before canceling synchronous replication, so that the transaction can be available on the sync standbys as well. If the ack isn't received within this time frame, the backend cancels the wait and goes ahead as it does today. In production HA environments, the GUC can be set to a reasonable value to avoid missing transactions during failovers.

2) Wait for the sync standbys to catch up upon restart after a crash, or in the next txn after the old locally committed txn was canceled. One way to achieve this is to let the backend that's making the first connection wait for the sync standbys to catch up in ClientAuthentication, right after successful authentication. However, I'm not sure this is the best way to do it at this point.

Thoughts?

Here's a WIP patch implementing (1); I'm yet to code (2). I haven't added tests yet; I still need to figure out how to add one, as there's no way to delay the WAL sender so that we can reliably hit this code. I will think more about this.

[1] https://www.postgresql.org/message-id/CAHg%2BQDdTdPsqtu0QLG8rMg3Xo%3D6Xo23TwHPYsUgGNEK13wTY5g%40mail.gmail.com

Regards,
Bharath Rupireddy.
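
Editor's illustration of point (1): the delayed-cancel behavior can be sketched as a small standalone simulation. This is not the WIP patch and does not use PostgreSQL internals; the function simulate_sync_rep_wait, the WaitResult enum, and the millisecond tick model are all hypothetical names and assumptions made for illustration. Only the idea comes from the mail: once a cancel or exit request is pending, the backend keeps waiting for the ack for up to the configured naptime before giving up.

```c
#include <assert.h>

/* Hypothetical outcome of a backend's wait for the sync standby ack. */
typedef enum
{
    WAIT_ACKED,         /* ack arrived; the txn is on the sync standby */
    WAIT_CANCELED,      /* wait abandoned; txn is committed only locally */
    WAIT_STILL_WAITING  /* simulation ended with neither ack nor cancel */
} WaitResult;

/*
 * Simulate the wait loop in 1 ms ticks (an assumption of this sketch).
 *
 *   ack_at_ms     time at which the standby ack arrives, or -1 for never
 *   cancel_at_ms  time at which a cancel/exit request arrives, or -1 for none
 *   naptime_ms    grace period after the request; 0 models today's behavior
 *   max_ms        how long to run the simulation
 */
WaitResult
simulate_sync_rep_wait(int ack_at_ms, int cancel_at_ms,
                       int naptime_ms, int max_ms)
{
    assert(naptime_ms >= 0);

    for (int now = 0; now <= max_ms; now++)
    {
        /* The ack always wins if it has arrived by this tick. */
        if (ack_at_ms >= 0 && now >= ack_at_ms)
            return WAIT_ACKED;

        /* Honor the pending request only after the grace period. */
        if (cancel_at_ms >= 0 && now >= cancel_at_ms + naptime_ms)
            return WAIT_CANCELED;
    }

    return WAIT_STILL_WAITING;
}
```

For example, with a cancel request at 10 ms and an ack at 50 ms, a naptime of 0 gives WAIT_CANCELED, while a naptime of 100 ms gives WAIT_ACKED. If the ack never arrives, the wait is still canceled once the naptime expires, matching the "goes ahead as it does today" fallback.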