Hi!
I cannot figure out the proper way to implement a safe HA upsert. I would be very grateful if someone could help me.
Imagine we have a primary server after failover. It is network-partitioned. We run an INSERT ... ON CONFLICT DO NOTHING
that eventually times out.
az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
INSERT INTO t(
pk,
v,
dt
)
VALUES
(
5,
'text',
now()
)
ON CONFLICT (pk) DO NOTHING
RETURNING pk,
v,
dt)
SELECT new_doc.pk from new_doc;
^CCancel request sent
WARNING: 01000: canceling wait for synchronous replication due to user request
DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.
LOCATION: SyncRepWaitForLSN, syncrep.c:264
Time: 2173.770 ms (00:02.174)
Here our driver decided that something went wrong, and we retry the query.
az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
INSERT INTO t(
pk,
v,
dt
)
VALUES
(
5,
'text',
now()
)
ON CONFLICT (pk) DO NOTHING
RETURNING pk,
v,
dt)
SELECT new_doc.pk from new_doc;
pk
----
(0 rows)
Time: 4.785 ms
Now we have a split-brain, because we acknowledged that row to the client.
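To make the sequence concrete, here is a minimal Python sketch of the retry logic as I understand it. It is only a toy model: FakePrimary, SyncRepTimeout and upsert_with_retry are hypothetical names, not part of any real driver, and the "database" is just two dictionaries. The assumption is that the driver treats a cancelled synchronous-replication wait as a failure and treats any normal return as success.

```python
class SyncRepTimeout(Exception):
    """Raised when the wait for synchronous replication is cancelled."""


class FakePrimary:
    """Toy stand-in for a network-partitioned primary (hypothetical, not a real driver)."""

    def __init__(self):
        self.local_rows = {}        # rows committed locally
        self.replicated_rows = {}   # rows confirmed by a standby (stays empty: partitioned)

    def insert_do_nothing(self, pk, v):
        """Mimic INSERT ... ON CONFLICT (pk) DO NOTHING RETURNING pk."""
        if pk in self.local_rows:
            return []               # conflict: nothing inserted, so nothing to wait for
        self.local_rows[pk] = v     # the commit happens locally first
        # The standby is unreachable, so the wait is eventually cancelled.
        raise SyncRepTimeout("committed locally, but might not have been replicated")


def upsert_with_retry(db, pk, v, attempts=2):
    """Naive driver logic: retry on timeout, treat any normal return as success."""
    for _ in range(attempts):
        try:
            db.insert_do_nothing(pk, v)
            return True             # acknowledged to the client
        except SyncRepTimeout:
            continue
    return False


db = FakePrimary()
acknowledged = upsert_with_retry(db, 5, "text")
print(acknowledged, 5 in db.local_rows, 5 in db.replicated_rows)
```

In this model the second attempt returns zero rows without waiting for anything, so the client gets an acknowledgement for a row that exists only on the partitioned primary.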
How can I fix this?
There must be some obvious trick, but I cannot see it... Or maybe cancelling the wait for synchronous replication should be
disallowed, and termination should be treated as a system failure?
Best regards, Andrey Borodin.