Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
От | BharatDB |
---|---|
Тема | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL |
Дата | |
Msg-id | CAAh00EQjzwShaJkmWaLmQ4GmMPJEfmsNeLj3XStqyjeeDWoCbw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
|
Список | pgsql-hackers |
1. Control file changes:
You are correct that the control file always changes on shutdown, and that pg_rewind cannot simply ignore those updates. My earlier patch proposal did not address that, and I now understand why the server itself would reject a mismatch here.2. Other WAL records (
RUNNING_XACTS
):
I see now that a clean shutdown generates bothRUNNING_XACTS
andCHECKPOINT_SHUTDOWN
. My patch only skipped over the latter, so in practice rewind would still be triggered incorrectly. I will extend the logic to also consider this sequence properly.3. Server-side consistency:
As noted, even if pg_rewind skips shutdown-only WAL records, the restarted old primary can still fail due to control file divergence (infinite loop issue). That means it needs a more holistic fix that considers both pg_rewind and server startup behavior.4. RMID verification:
I did not guard the filtering with anXLogRecGetRmid()
check. I’ll fix this to avoid misclassification, following thewalsummarizer.c
example as you suggested.Plan forward:
Revise the patch so that pg_rewind correctly checks RMIDs and handles both
RUNNING_XACTS
+CHECKPOINT_SHUTDOWN
sequences, not just shutdown checkpoints.Investigate whether control file normalization is required (or whether server-side startup logic also needs adjustments) so that an old primary can rejoin cleanly without looping.
Ensure consistent patch authorship (my name + email will match the commit and submission).
Add regression coverage under
src/bin/pg_rewind/t/
to reproduce this clean-shutdown failover scenario automatically.
- I’ll prepare and post a new version of the patch with these corrections. Looking forward for more suggestions from you.
- Thank you for carefully reviewing and pointing out both the technical and process issues.
Best regards,
Soumya
On Mon, Sep 29, 2025 at 6:30 AM BharatDB <bharatdbpg@gmail.com> wrote:
> Dear Srinath,
This is a really weird email. First, it doesn't address my objections
from September 25th. Second, the email is from BharatDB
<bharatdbpg@gmail.com>, the email signature says it's from "Soumya",
and the patch within is from manimurali1993
<manimurali1993@gamil.com>.
--
Robert Haas
EDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: