Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?

From Vladlen Popolitov
Subject Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?
Date
Msg-id 20e5800983ca176d696f2f90ecc0a830@postgrespro.ru
In reply to Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?  (Pawel Kudzia <kudzia@gmail.com>)
Responses Re: Silent data corruption in PostgreSQL 17 - how to detect it proactively?
List pgsql-general
Pawel Kudzia wrote on 2025-09-14 15:47:
> On Sun, Sep 14, 2025 at 12:35 PM Laurenz Albe 
> <laurenz.albe@cybertec.at> wrote:

> 
> gdb stack trace for that process:
> 
> #0  0x000055cb571ef444 in hash_search_with_hash_value ()
> #1  0x000055cb5706217a in BufTableLookup ()

Hi,

  It probably does not hang inside hash_search_with_hash_value();
more likely you simply caught the process in that function at that
moment. The function itself does a finite amount of work and returns;
it is hard to break it with bad data.
  It is called by the code that descends to the leaf pages of the
B+ tree. If that code enters a destroyed block, it follows wrong
block pointers and behaves unexpectedly. For example, it may land on
block zero, which is not a leaf block (it is the meta-page of the
index) and probably contains mostly zeros. The btree code then reads
the next block address - 0 again - and goes back to block 0, looping
forever.
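
If you want to confirm whether a particular index is already
structurally damaged, the amcheck contrib extension can walk the
B-tree and report such inconsistencies. A minimal sketch; the index
name is only a placeholder for whichever index the backend was
looping in:

  # amcheck ships as a contrib extension with PostgreSQL
  psql -c "CREATE EXTENSION IF NOT EXISTS amcheck;"

  # walks the index structure and raises an error if pointers or
  # ordering inside the index are inconsistent
  psql -c "SELECT bt_index_check('my_suspect_index'::regclass);"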

  You have two options:
1) enable checksums (highly recommended), get an error message
immediately when corruption is detected, and restore the database
from backup (and probably consider changing the provider); see the
sketch after this list;
2) continue with checksums disabled, get program crashes, and
eventually restore from backup anyway.
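
A minimal sketch of enabling checksums on an existing cluster,
assuming PostgreSQL 12 or later; the data directory path below is
only a placeholder, and the cluster must be shut down cleanly before
pg_checksums is run:

  # stop the cluster cleanly; adjust the data directory to yours
  pg_ctl -D /var/lib/postgresql/17/main stop

  # rewrite all data blocks with checksums enabled (offline
  # operation, can take a while on a large cluster)
  pg_checksums --enable -D /var/lib/postgresql/17/main

  pg_ctl -D /var/lib/postgresql/17/main start

  # afterwards the server should report checksums as enabled
  psql -c "SHOW data_checksums;"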

Checksum calculation takes roughly 0.5% of query time; it is not a
bottleneck in PostgreSQL.

P.S. Databases have a lot of code that relies on the correctness of
the data on disk. It is impossible to check every byte for
correctness.

-- 
Best regards,

Vladlen Popolitov.


