Re: wal_consistency_checking reports an inconsistency on masterbranch
От | Andres Freund |
---|---|
Тема | Re: wal_consistency_checking reports an inconsistency on masterbranch |
Дата | |
Msg-id | 20180501050846.kyt2jq7pbqa4agbl@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: wal_consistency_checking reports an inconsistency on masterbranch (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: wal_consistency_checking reports an inconsistency on masterbranch
|
Список | pgsql-hackers |
On 2018-04-23 07:58:30 -0700, Andres Freund wrote: > On 2018-04-23 13:22:21 +0300, Heikki Linnakangas wrote: > > On 13/04/18 13:08, Michael Paquier wrote: > > > On Fri, Apr 13, 2018 at 02:15:35PM +0530, amul sul wrote: > > > > I have looked into this and found that the issue is in heap_xlog_delete -- we > > > > have missed to set the correct offset number from the target_tid when > > > > XLH_DELETE_IS_PARTITION_MOVE flag is set. > > > > > > Oh, this looks good to me. So when a row was moved across partitions > > > this could have caused incorrect tuple references on a standby, which > > > could have caused corruptions. > > > > Hmm. So, the problem was that HeapTupleHeaderSetMovedPartitions() only sets > > the block number to InvalidBlockNumber, and leaves the offset number > > unchanged. WAL replay didn't preserve the offset number, so the master and > > the standby had a different offset number in the ctid. > > Right. > > > Why does HeapTupleHeaderSetMovedPartitions() leave the offset number > > unchanged? The old offset number is meaningless without the block number. > > Also, bits and magic values in the tuple header are scarce. We're > > squandering a whole range of values in the ctid, everything with > > ip_blkid==InvalidBlockNumber, to mean "moved to different partition", when a > > single value would suffice. > > Yes, I agree on that. > > > > I kept using InvalidBlockNumber there, so ItemPointerIsValid() still > > considers those item pointers as invalid. But my gut feeling is actually > > that it would be better to use e.g. 0 as the block number, so that these > > item pointers would appear valid. Again, to follow the precedent of > > speculative insertion tokens. But I'm not sure if there was some > > well-thought-out reason to make them appear invalid. A comment on that would > > be nice, at least. > > That seems risky to me. We want something that stops EPQ style chasing > without running into asserts for invalid offsets... Heikki, would you rather apply this yourself or have me do it? Greetings, Andres Freund
В списке pgsql-hackers по дате отправления: