Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
| From | Kyotaro HORIGUCHI |
|---|---|
| Subject | Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)? |
| Date | |
| Msg-id | 20170907.123347.101584520.horiguchi.kyotaro@lab.ntt.co.jp |
| In response to | Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)? (Andres Freund <andres@anarazel.de>) |
| Responses | Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)? |
| List | pgsql-hackers |
Hello,

At Wed, 6 Sep 2017 12:23:53 -0700, Andres Freund <andres@anarazel.de> wrote in <20170906192353.ufp2dq7wm5fd6qa7@alap3.anarazel.de>
> On 2017-09-06 17:36:02 +0900, Kyotaro HORIGUCHI wrote:
> > The problem is that the current ReadRecord needs the first one of
> > a series of continuation records from the same source with the
> > other part, the master in the case.
>
> What's the problem with that? We can easily keep track of the beginning
> of a record, and only confirm the address before that.

After a failure while reading a record locally, ReadRecord tries streaming to read from the beginning of the record, which is not on the master, then retries locally, and so on. This loops forever.

> > A (or the) solution closed in the standby side is allowing to
> > read a series of continuation records from multiple sources.
>
> I'm not following. All we need to use is the beginning of the relevant
> records, that's easy enough to keep track of. We don't need to read the
> WAL or anything.

The beginning is already tracked and there is nothing more to do there.

I reconsidered that approach and found that it doesn't need such destructive refactoring.

The first *problem* was that WaitForWALToBecomeAvailable requests the beginning of a record, which is not on the page the function has been told to fetch. tliRecPtr is still required to determine the TLI to request, but the function should request RecPtr to be streamed.

The rest to do is to let XLogPageRead retry other sources immediately. To do this I made ValidXLogPageHeader in xlogreader.c public (and renamed it to XLogReaderValidatePageHeader).

The attached patch fixes the problem and passes the recovery tests. However, a test for this problem is not added. Such a test needs to advance to the last page in a segment, write a record that continues into the next segment, and then kill the standby after it has received the previous segment but before it has received the whole record.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
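The retry loop described in the message can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code and not the attached patch: `ToyCluster`, `toy_read_local`, `toy_stream_from`, and `toy_fetch_page` are hypothetical names, WAL positions are plain integers, and the master is assumed to have already removed the segment in which the record begins. The `request_page_ptr` flag switches between requesting the record start (the behavior described as the problem) and requesting the page actually being fetched (the behavior the message proposes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position; not the real XLogRecPtr. */
typedef uint64_t WalPtr;

/* Toy description of what each side holds (assumption for illustration). */
typedef struct ToyCluster
{
    WalPtr master_oldest; /* master can stream only from this position onward */
    WalPtr local_end;     /* standby holds local WAL strictly below this position */
} ToyCluster;

/* Reading locally succeeds only if the page lies in WAL the standby already has. */
static bool
toy_read_local(const ToyCluster *c, WalPtr page_ptr)
{
    return page_ptr < c->local_end;
}

/* Streaming succeeds only if the master still has the requested start position. */
static bool
toy_stream_from(const ToyCluster *c, WalPtr start_ptr)
{
    return start_ptr >= c->master_oldest;
}

/*
 * Try to obtain the page at page_ptr, which holds the continuation of a
 * record beginning at rec_start.  Returns the attempt number that
 * succeeded, or -1 if every attempt failed (standing in for "loops forever").
 */
int
toy_fetch_page(const ToyCluster *c, WalPtr rec_start, WalPtr page_ptr,
               bool request_page_ptr, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        /* First source: local WAL. */
        if (toy_read_local(c, page_ptr))
            return attempt;

        /* Second source: streaming, starting at the chosen request position. */
        WalPtr request = request_page_ptr ? page_ptr : rec_start;

        if (toy_stream_from(c, request))
            return attempt;

        /* Both sources failed; go around again. */
    }
    return -1;
}
```

With a record starting at position 1000, its continuation page at 1024, and both `master_oldest` and `local_end` set to 1024, calling `toy_fetch_page` with `request_page_ptr` set to false exhausts every attempt, while setting it to true succeeds on the first attempt. The bounded `max_attempts` exists only so the sketch terminates.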