Re: hung backends stuck in spinlock heavy endless loop
From | Jeff Janes |
---|---|
Subject | Re: hung backends stuck in spinlock heavy endless loop |
Date | |
Msg-id | CAMkU=1woHWPyJzdmTf34M1zXHa4C9N06YmpuUw=PES3dK3euKQ@mail.gmail.com |
In response to | Re: hung backends stuck in spinlock heavy endless loop (Merlin Moncure <mmoncure@gmail.com>) |
List | pgsql-hackers |
On Thu, Jan 22, 2015 at 1:50 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> So far, the 'nasty' damage seems to generally if not always follow a
> checksum failure and the checksum failures are always numerically
> adjacent. For example:
>
> [cds2 12707 2015-01-22 12:51:11.032 CST 2754]WARNING: page
> verification failed, calculated checksum 9465 but expected 9477 at
> character 20
> [cds2 21202 2015-01-22 13:10:18.172 CST 3196]WARNING: page
> verification failed, calculated checksum 61889 but expected 61903 at
> character 20
> [cds2 29153 2015-01-22 14:49:04.831 CST 4803]WARNING: page
> verification failed, calculated checksum 27311 but expected 27316
>
> I'm not up on the intricacies of our checksum algorithm but this is
> making me suspicious that we are looking at an improperly flipped
> visibility bit via some obscure problem -- almost certainly with
> vacuum playing a role.
That very much sounds like the block is getting duplicated from one place to another.
Even flipping one hint bit (aren't these index pages? Do they have hint bits?) should thoroughly scramble the checksum.
Because the checksum adds in the block number after the scrambling has been done, copying a page to another nearby location will just move the (expected) checksum a little bit.
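To illustrate the point above, here is a rough Python sketch, not the real implementation. It assumes the final steps are "XOR in the block number, then reduce to 16 bits with an offset of one", and it uses zlib.crc32 purely as a stand-in for the page-content hash (PostgreSQL's actual hash is different). The names block_hash, page_checksum and circular_distance are made up for this example.

```python
import zlib

def block_hash(page: bytes) -> int:
    # Stand-in 32-bit content hash. This is an assumption for illustration;
    # it is NOT the hash PostgreSQL actually uses.
    return zlib.crc32(page) & 0xFFFFFFFF

def page_checksum(page: bytes, blkno: int) -> int:
    h = block_hash(page)     # scramble the page contents first
    h ^= blkno               # mix in the block number afterwards
    return (h % 65535) + 1   # reduce to 16 bits, never zero

def circular_distance(a: int, b: int) -> int:
    # Distance between two checksums, allowing for wraparound in the
    # modulo-65535 reduction.
    return min((a - b) % 65535, (b - a) % 65535)

page = bytes(range(256)) * 32   # an arbitrary 8192-byte "page"
flipped = bytearray(page)
flipped[100] ^= 0x01            # flip a single bit in the contents

# Same contents stored under nearby block numbers: checksums stay close.
for blkno in (1000, 1001, 1007):
    print(blkno, page_checksum(page, blkno))

# One flipped bit at the same block number: the content hash changes entirely.
print("one bit flipped:", page_checksum(bytes(flipped), 1000))
```

With identical contents, block numbers that differ only in their low bits give checksums that differ by a small amount, which would fit the reported gaps of 12, 14 and 5. A changed bit in the contents goes through the scrambling step and lands somewhere unrelated.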
Cheers,
Jeff