Re: pgsql: Validate page level checksums in base backups
From | Tomas Vondra |
---|---|
Subject | Re: pgsql: Validate page level checksums in base backups |
Date | |
Msg-id | f23e92ec-118d-e6ea-0c81-876ad62a588a@2ndquadrant.com |
In response to | Re: pgsql: Validate page level checksums in base backups (Magnus Hagander <magnus@hagander.net>) |
Responses |
Re: pgsql: Validate page level checksums in base backups
Re: pgsql: Validate page level checksums in base backups |
List | pgsql-hackers |
Hi,

I think there's a bug in sendFile(). We do check checksums on all pages that pass this LSN check:

    /*
     * Only check pages which have not been modified since the
     * start of the base backup. Otherwise, they might have been
     * written only halfway and the checksum would not be valid.
     * However, replaying WAL would reinstate the correct page in
     * this case.
     */
    if (PageGetLSN(page) < startptr)
    {
        ...
    }

Now, imagine the page is new, i.e. all-zeroes. That means the LSN is 0/0 too, so the page passes the check and we'll try to verify the checksum - but we actually do not set checksums on empty pages.

So I think it should be something like this:

    if ((!PageIsNew(page)) && (PageGetLSN(page) < startptr))
    {
        ...
    }

It might be worth verifying that the page is actually all-zeroes (and not just a page with a corrupted pd_upper value). Not sure it's worth it.

I've found this by fairly trivial stress testing - running pgbench and pg_basebackup in a loop. It was failing pretty reliably (~75% of runs). With the proposed change I see no further failures.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
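For illustration, the proposed condition and the optional all-zeroes check can be sketched as a standalone C fragment. This is only a sketch under stated assumptions: the header struct, the SKETCH_BLCKSZ constant and every sketch_* function are hypothetical stand-ins, not the real PageHeaderData layout or the actual sendFile() code. The only detail taken from PostgreSQL is that PageIsNew() treats a page as new when pd_upper is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block size for this sketch. */
#define SKETCH_BLCKSZ 8192

/* Hypothetical, simplified page header; NOT the real PageHeaderData layout. */
typedef struct SketchPageHeader
{
    uint64_t lsn;       /* stand-in for pd_lsn */
    uint16_t checksum;  /* stand-in for pd_checksum */
    uint16_t flags;
    uint16_t lower;
    uint16_t upper;     /* stand-in for pd_upper; 0 means "new" page */
} SketchPageHeader;

/* Copy the header out of the raw page buffer. */
static SketchPageHeader
sketch_read_header(const unsigned char *page)
{
    SketchPageHeader hdr;

    assert(page != NULL);
    memcpy(&hdr, page, sizeof(hdr));
    return hdr;
}

/* Mirrors the idea behind PageIsNew(): a page is new if upper == 0. */
static bool
sketch_page_is_new(const unsigned char *page)
{
    return sketch_read_header(page).upper == 0;
}

/* Stricter check: every byte of the page is zero. */
static bool
sketch_page_is_all_zeroes(const unsigned char *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < SKETCH_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}

/*
 * Proposed condition: verify the checksum only for pages that are not new
 * and whose LSN is older than the start of the base backup.
 */
static bool
sketch_should_verify_checksum(const unsigned char *page, uint64_t startptr)
{
    if (sketch_page_is_new(page))
        return false;           /* no checksum is set on new pages */
    return sketch_read_header(page).lsn < startptr;
}
```

With the original condition, an all-zero page (LSN 0) would always be checked; in this sketch it is skipped. A page for which sketch_page_is_new() returns true but sketch_page_is_all_zeroes() returns false would be the "corrupted pd_upper" case mentioned above, and whether to report it is left open here, as in the message.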