Re: wal_consistency_checking reports an inconsistency on master branch
From | Heikki Linnakangas |
---|---|
Subject | Re: wal_consistency_checking reports an inconsistency on master branch |
Date | |
Msg-id | 1a37a9bb-4a0f-80eb-773f-0ebd3e1a7b79@iki.fi |
In response to | Re: wal_consistency_checking reports an inconsistency on master branch (Michael Paquier <michael@paquier.xyz>) |
Responses | Re: wal_consistency_checking reports an inconsistency on master branch |
List | pgsql-hackers |
On 13/04/18 13:08, Michael Paquier wrote:
> On Fri, Apr 13, 2018 at 02:15:35PM +0530, amul sul wrote:
>> I have looked into this and found that the issue is in heap_xlog_delete -- we
>> have missed to set the correct offset number from the target_tid when
>> XLH_DELETE_IS_PARTITION_MOVE flag is set.
>
> Oh, this looks good to me. So when a row was moved across partitions
> this could have caused incorrect tuple references on a standby, which
> could have caused corruptions.

Hmm. So, the problem was that HeapTupleHeaderSetMovedPartitions() only sets the block number to InvalidBlockNumber, and leaves the offset number unchanged. WAL replay didn't preserve the offset number, so the master and the standby had a different offset number in the ctid.

Why does HeapTupleHeaderSetMovedPartitions() leave the offset number unchanged? The old offset number is meaningless without the block number. Also, bits and magic values in the tuple header are scarce. We're squandering a whole range of values in the ctid, everything with ip_blkid==InvalidBlockNumber, to mean "moved to different partition", when a single value would suffice. Let's tighten that up.

In the attached (untested) patch, I changed the macros so that "moved to different partition" is indicated by the magic TID (InvalidBlockNumber, 0xfffd). Offset number 0xfffe was already used for speculative insertion tokens, so this follows that precedent.

I kept using InvalidBlockNumber there, so ItemPointerIsValid() still considers those item pointers as invalid. But my gut feeling is actually that it would be better to use e.g. 0 as the block number, so that these item pointers would appear valid. Again, to follow the precedent of speculative insertion tokens. But I'm not sure if there was some well-thought-out reason to make them appear invalid. A comment on that would be nice, at least.

(Amit hinted at this in https://www.postgresql.org/message-id/CAA4eK1KtsTqsGDggDCrz2O9Jgo7ma-Co-B8%2Bv3L2zWMA2NHm6A%40mail.gmail.com. He was OK with the current approach, but I feel pretty strongly that we should also set the offset number.)

- Heikki
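To illustrate the point about the offset number, here is a minimal standalone sketch in C. It is not the attached patch and does not use PostgreSQL's headers: DemoTid, the DEMO_* constants and the demo_* functions are hypothetical names made up for this example, and the struct is a simplified stand-in for the real item pointer. It only models the idea that a marker which overwrites the block number alone lets two copies of a ctid disagree, while a single magic (block, offset) pair cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a ctid; NOT PostgreSQL's ItemPointerData. */
typedef struct DemoTid
{
    uint32_t block;   /* block number part of the TID */
    uint16_t offset;  /* offset number part of the TID */
} DemoTid;

/* Values taken from the message; the macro names are hypothetical. */
#define DEMO_INVALID_BLOCK      0xFFFFFFFFu  /* models InvalidBlockNumber */
#define DEMO_SPEC_TOKEN_OFFSET  0xfffeu      /* speculative insertion tokens */
#define DEMO_MOVED_PART_OFFSET  0xfffdu      /* proposed "moved partitions" offset */

/* Scheme described as problematic: only the block number is overwritten,
 * so whatever offset was there before stays in place. */
static inline void
demo_mark_moved_block_only(DemoTid *tid)
{
    assert(tid != NULL);
    tid->block = DEMO_INVALID_BLOCK;
}

/* Proposed scheme: one magic (block, offset) pair, so the result does not
 * depend on the previous contents of the TID. */
static inline void
demo_mark_moved_magic(DemoTid *tid)
{
    assert(tid != NULL);
    tid->block = DEMO_INVALID_BLOCK;
    tid->offset = DEMO_MOVED_PART_OFFSET;
}

/* True only for the single magic value, not for every invalid block. */
static inline bool
demo_is_moved_magic(const DemoTid *tid)
{
    return tid->block == DEMO_INVALID_BLOCK &&
           tid->offset == DEMO_MOVED_PART_OFFSET;
}

/* Field-by-field comparison, as a consistency check might do. */
static inline bool
demo_tid_equal(const DemoTid *a, const DemoTid *b)
{
    return a->block == b->block && a->offset == b->offset;
}
```

Under these assumptions, take two copies of a TID that differ only in their offset, for example (7, 3) on one side and (7, 5) on the other. After demo_mark_moved_block_only() they still compare unequal, which is the kind of mismatch a consistency check would flag. After demo_mark_moved_magic() both are (DEMO_INVALID_BLOCK, 0xfffd) and compare equal.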