Re: Synchronous commit behavior during network outage
From | Jeff Davis |
---|---|
Subject | Re: Synchronous commit behavior during network outage |
Date | |
Msg-id | 6a052e81060824a8286148b1165bafedbd7c86cd.camel@j-davis.com |
In response to | Re: Synchronous commit behavior during network outage (SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>) |
Responses | Re: Synchronous commit behavior during network outage |
List | pgsql-hackers |
On Tue, 2021-04-20 at 14:19 -0700, SATYANARAYANA NARLAPURAM wrote:
> One idea here is to make the backend ignore query
> cancellation/backend termination while waiting for the synchronous
> commit ACK. This way client never reads the data that was never
> flushed remotely. The problem with this approach is that your
> backends get stuck until your commit log record is flushed on the
> remote side. Also, the client can see the data not flushed remotely
> if the server crashes and comes back online. You can prevent the
> latter case by making a SyncRepWaitForLSN before opening up the
> connections to the non-superusers. I have a working prototype of this
> logic, if there is enough interest I can post the patch.

I didn't see a patch here yet, so I wrote a simple one for consideration (attached).

The problem exists for both cancellation and termination requests. The patch adds a GUC that makes SyncRepWaitForLSN keep waiting. It does not ignore the requests; for instance, a termination request will still be honored when it's done waiting for sync rep.

The idea of this GUC is not to wait forever (obviously), but to allow the administrator (or an automated network agent) to be in control of the logic:

If the primary is non-responsive, the administrator can decide to fail over, knowing that all visible transactions on the primary are durable on the standby (because any transaction that didn't make it to the standby also didn't release locks yet).

If the standby is non-responsive, the administrator can intervene with something like:

    ALTER SYSTEM SET synchronous_standby_names = '';
    SELECT pg_reload_conf();

which will disable sync rep, allowing the primary to complete the query and continue on without the standby; but in that case the admin must be sure not to fail over until there's a new standby fully caught-up.

The patch may be somewhat controversial, so I'll wait for feedback before documenting it properly.

Regards,
Jeff Davis
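
For illustration only, the administrator logic described in the message could be sketched roughly as follows. This is not part of the attached patch: the names `Action`, `decide`, and `failover_is_safe` are hypothetical, how an agent detects a non-responsive node is left out, and the "both nodes unreachable" branch is an assumption the message does not cover. The sketch also assumes the proposed GUC is enabled, so that a waiting commit has not yet released its locks.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for an administrator or automated agent."""
    NONE = "no action needed"
    FAIL_OVER = "fail over to the standby"
    DISABLE_SYNC_REP = "disable sync rep on the primary"
    INVESTIGATE = "both nodes unreachable (case not covered by the message)"


# The two statements quoted in the message for disabling sync rep.
DISABLE_SYNC_REP_SQL = (
    "ALTER SYSTEM SET synchronous_standby_names = '';",
    "SELECT pg_reload_conf();",
)


def decide(primary_responsive: bool, standby_responsive: bool) -> Action:
    """Map node health to the intervention described in the message."""
    if primary_responsive and standby_responsive:
        return Action.NONE
    if standby_responsive:
        # Primary is down: every transaction that was visible on the
        # primary is already durable on the standby.
        return Action.FAIL_OVER
    if primary_responsive:
        # Standby is down: let the primary continue without it.
        return Action.DISABLE_SYNC_REP
    # Assumption: the message does not say what to do here.
    return Action.INVESTIGATE


def failover_is_safe(sync_rep_disabled: bool, new_standby_caught_up: bool) -> bool:
    """After sync rep has been disabled, failing over is only safe
    once a new standby is fully caught up."""
    return (not sync_rep_disabled) or new_standby_caught_up
```

The second function captures the caveat at the end of the message: once `synchronous_standby_names` has been cleared, commits on the primary are no longer guaranteed to be on any standby, so `failover_is_safe(True, False)` is false until a replacement standby has caught up.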