Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
From | BharatDB |
---|---|
Subject | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL |
Date | |
Msg-id | CAAh00ERqqAhgA_BJJccwE0BXxUWMk+FHzMoLo1kWcsm+qdNVjw@mail.gmail.com |
In reply to | [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL (Srinath Reddy Sadipiralla <srinath2133@gmail.com>) |
Responses | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL |
List | pgsql-hackers |
Dear Srinath,
Subject: [PATCH] pg_rewind: Ignore shutdown checkpoints when determining rewind necessity.
While working with pg_rewind, I noticed that it can sometimes request a rewind even when no real changes exist after a failover. This happens because pg_rewind currently determines the end-of-WAL on the target using the last shutdown checkpoint (or minRecoveryPoint for a standby). In a clean failover scenario, where a standby is promoted and the old primary is later shut down, the only WAL record generated after divergence may be a shutdown checkpoint. Although the data on both nodes is identical, pg_rewind treats this shutdown record as meaningful and unnecessarily forces a rewind.
The proposed patch fixes this by ignoring shutdown checkpoints (XLOG_CHECKPOINT_SHUTDOWN) when determining the end-of-WAL, scanning backward until a non-shutdown record is found. This ensures that rewinds are triggered only when actual modifications exist after divergence, avoiding unnecessary rewinds in clean failover situations.
With the proposed fix applied, my local test script gives the following results:
1) Old primary shuts down cleanly.
2) Standby is promoted successfully.
3) pg_rewind correctly detects that no rewind is needed.
4) Data on both clusters matches perfectly.
Thank you for your consideration.
Best regards,
Soumya.
Hi all,
While working with pg_rewind, I noticed that it can sometimes request a rewind even when no actual changes exist after a failover.
Problem:
Currently, pg_rewind determines the end-of-WAL on the target by using the last shutdown checkpoint (or minRecoveryPoint for a standby). This creates a false positive scenario:
1) Suppose a standby is promoted to become the new primary.
2) Later, the old primary is cleanly shut down.
3) The only WAL record generated on the old primary after divergence is a shutdown checkpoint.
At this point, the old primary and new primary contain identical data. However, since the shutdown checkpoint extends the WAL past the divergence point, pg_rewind concludes:
    if (target_wal_endrec > divergerec)
        rewind_needed = true;
That forces a rewind even though there are no meaningful changes.
To reproduce this scenario, use the attached script.
Fix:
The attached patch changes the logic so that pg_rewind no longer treats shutdown checkpoints as meaningful records when determining the end-of-WAL. Instead, we scan backward from the last checkpoint until we find the most recent valid WAL record that is not a shutdown checkpoint.
This ensures a rewind is only triggered when there are actual modifications after divergence, avoiding unnecessary rewinds in clean failover scenarios.
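The idea can be illustrated with a small standalone sketch. This is not the attached patch and does not use pg_rewind's real data structures: the `Lsn` typedef, the `WalRec` struct, the `RecKind` enum, and all three function names are hypothetical, and the WAL is modeled as a plain in-memory array of record start positions rather than being read with the WAL reader. It only shows the difference between comparing the raw end-of-WAL and comparing the last record that is not a shutdown checkpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT pg_rewind's real types. */
typedef uint64_t Lsn;

typedef enum
{
    REC_DATA,
    REC_CHECKPOINT_ONLINE,
    REC_CHECKPOINT_SHUTDOWN
} RecKind;

typedef struct
{
    Lsn     lsn;    /* simplified: position of the record in the WAL */
    RecKind kind;
} WalRec;

/*
 * Walk backward from the newest record and return the position of the
 * most recent record that is not a shutdown checkpoint. Returns 0 if
 * the array is empty or contains only shutdown checkpoints.
 */
static Lsn
last_meaningful_lsn(const WalRec *recs, size_t nrecs)
{
    assert(recs != NULL || nrecs == 0);

    for (size_t i = nrecs; i > 0; i--)
    {
        if (recs[i - 1].kind != REC_CHECKPOINT_SHUTDOWN)
            return recs[i - 1].lsn;
    }
    return 0;
}

/* Behaviour described in the problem statement: use the raw end-of-WAL. */
static bool
rewind_needed_current(const WalRec *recs, size_t nrecs, Lsn divergerec)
{
    Lsn end = (nrecs > 0) ? recs[nrecs - 1].lsn : 0;

    return end > divergerec;
}

/* Behaviour described in the fix: skip trailing shutdown checkpoints. */
static bool
rewind_needed_proposed(const WalRec *recs, size_t nrecs, Lsn divergerec)
{
    return last_meaningful_lsn(recs, nrecs) > divergerec;
}
```

For example, with records at positions 100 and 200 followed by a shutdown checkpoint at 300, and a divergence point of 200, `rewind_needed_current` returns true while `rewind_needed_proposed` returns false. If a data record at 250 sits between the divergence point and the shutdown checkpoint, both return true.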