Re: prevent immature WAL streaming
From | Amul Sul |
---|---|
Subject | Re: prevent immature WAL streaming |
Date | |
Msg-id | CAAJ_b97KyJ6X9uO8KH31zn1vrcNscmHFUeE8+AFAzPqQPAmszw@mail.gmail.com |
In response to | Re: prevent immature WAL streaming (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Responses |
Re: prevent immature WAL streaming
|
List | pgsql-hackers |
On Wed, Nov 24, 2021 at 2:10 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Nov-23, Tom Lane wrote:
>
> > We're *still* not out of the woods with 026_overwrite_contrecord.pl,
> > as we are continuing to see occasional "mismatching overwritten LSN"
> > failures, further down in the test where it tries to start up the
> > standby:
>
> Augh.
>
> > Looking at adjacent successful runs, it seems that the exact point
> > where the "missing contrecord" starts varies substantially, even after
> > our previous fix to disable autovacuum in this test. How could that be?
>
> Well, there is intentionally some variability. Maybe not as much as one
> would wish, but I expect that that should explain why that point is not
> always the same.
>
> > It's probably for the best though, because I think this is exposing
> > an actual bug that we would not have seen if the start point were
> > completely consistent. I have not dug into the code, but it looks to
> > me like if the "consistent recovery state" is reached exactly at a
> > page boundary (0/1FFE000 in all these cases), then the standby expects
> > that to be what the OVERWRITE_CONTRECORD record will point at. But
> > actually it points to the first WAL record on that page, resulting
> > in a bogus failure.
>
> So what is happening is that we set state->overwrittenRecPtr to the LSN
> of page start, ignoring the page header. Is that the LSN of the first
> record in a page? I'll see if I can reproduce the problem.
>

In XLogReadRecord(), the two variables being compared are assigned
inconsistently: one is assigned from state->currRecPtr, while the other
is assigned from RecPtr.

.....
state->overwrittenRecPtr = state->currRecPtr;
.....
state->abortedRecPtr = RecPtr;
.....

Before the point where the assembled flag is set, there is a bunch of
code that adjusts RecPtr. I think that, instead of RecPtr, the latter
assignment should use state->currRecPtr as well.

Regards,
Amul
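
To make the mismatch described in this message concrete, here is a minimal standalone sketch. It is a toy model, not PostgreSQL code: the ToyLsn type, the ToyReaderState struct, the toy_* functions, and the page and header sizes are all assumptions made for illustration. The single adjustment it models is skipping the page header when the position is exactly at a page start. For brevity both pointers are set in one call, whereas in the scenario discussed in the thread they are recorded at different times and only compared later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are NOT PostgreSQL definitions. */
typedef uint64_t ToyLsn;

#define TOY_PAGE_SIZE   8192u   /* assumed WAL page size */
#define TOY_PAGE_HEADER 24u     /* assumed page header size */

typedef struct ToyReaderState
{
    ToyLsn currRecPtr;          /* position remembered before any adjustment */
    ToyLsn abortedRecPtr;       /* where the incomplete record was noticed */
    ToyLsn overwrittenRecPtr;   /* value the overwrite marker is checked against */
} ToyReaderState;

/* Models the adjustment: at a page start, step over the page header. */
static ToyLsn
toy_skip_page_header(ToyLsn ptr)
{
    if (ptr % TOY_PAGE_SIZE == 0)
        ptr += TOY_PAGE_HEADER;
    return ptr;
}

/*
 * Models the two assignments quoted in the mail.  With use_curr == 0 the
 * aborted pointer is taken from the adjusted local RecPtr; with
 * use_curr != 0 it is taken from currRecPtr, as the mail suggests.
 */
static void
toy_read(ToyReaderState *state, ToyLsn start, int use_curr)
{
    ToyLsn RecPtr = start;

    assert(state != NULL);

    state->currRecPtr = RecPtr;             /* saved before the adjustment */
    RecPtr = toy_skip_page_header(RecPtr);  /* local copy moves forward */

    state->overwrittenRecPtr = state->currRecPtr;
    state->abortedRecPtr = use_curr ? state->currRecPtr : RecPtr;
}

/* Returns nonzero when the two pointers agree. */
static int
toy_pointers_match(const ToyReaderState *state)
{
    return state->abortedRecPtr == state->overwrittenRecPtr;
}
```

In this model, calling toy_read() with a start position on a page boundary (for example 0x1FFE000) and use_curr set to 0 leaves the two pointers differing by the assumed header size, while use_curr set to 1 makes them equal. A start position in the middle of a page matches either way, which would fit the failure showing up only occasionally.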