Re: pg_rewind race condition just after promotion
От | Heikki Linnakangas |
---|---|
Тема | Re: pg_rewind race condition just after promotion |
Дата | |
Msg-id | 22b18d17-0556-da06-3e55-b7311ac39d3d@iki.fi обсуждение исходный текст |
Ответ на | Re: pg_rewind race condition just after promotion (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Список | pgsql-hackers |
On 08/12/2020 06:45, Kyotaro Horiguchi wrote: > At Mon, 7 Dec 2020 20:13:25 +0200, Heikki Linnakangas <hlinnaka@iki.fi> wrote in >> I think we should fix this properly. I'm not sure if it can lead to a >> broken cluster, but at least it can cause pg_rewind to fail >> unnecessarily and in a user-unfriendly way. But this is actually >> pretty simple to fix. pg_rewind looks at the control file to find out >> the timeline the server is on. When promotion happens, the startup >> process updates minRecoveryPoint and minRecoveryPointTLI fields in the >> control file. We just need to read it from there. Patch attached. > > Looks fine to me. A bit concerned about making sourceHistory > needlessly file-local but on the other hand unifying sourceHistory and > targetHistory looks better. Looking closer, findCommonAncestorTimeline() was freeing sourceHistory, which was pretty horrible when it's a file-local variable. I changed it so that both the source and target histories are passed to findCommonAncestorTimeline() as arguments. That seems more clear. > For the test part, that change doesn't necessariry catch the failure > of the current version, but I *believe* the prevous code is the result > of an actual failure in the past so the test probablistically (or > dependently on platforms?) hits the failure if it happned. Right. I think the current test coverage is good enough. We've been bitten by this a few times by now, when we've forgotten to add the manual checkpoint commands to new tests, and the buildfarm has caught it pretty quickly. >> I think we should also backpatch this. Back in 2015, we decided that >> we can live with this, but it's always been a bit bogus, and seems >> simple enough to fix. > > I don't think this changes any successful behavior and it just saves > the failure case so +1 for back-patching. Thanks for the review! New patch version attached. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: