Re: pgsql: Validate page level checksums in base backups
От | David Steele |
---|---|
Тема | Re: pgsql: Validate page level checksums in base backups |
Дата | |
Msg-id | a701f950-30a9-d9b9-d231-82c917420040@pgmasters.net обсуждение исходный текст |
Ответ на | Re: pgsql: Validate page level checksums in base backups (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: pgsql: Validate page level checksums in base backups
|
Список | pgsql-hackers |
On 4/10/18 5:24 PM, Tomas Vondra wrote: > > I think there's a bug in sendFile(). We do check checksums on all pages > that pass this LSN check: > > /* > * Only check pages which have not been modified since the > * start of the base backup. Otherwise, they might have been > * written only halfway and the checksum would not be valid. > * However, replaying WAL would reinstate the correct page in > * this case. > */ > if (PageGetLSN(page) < startptr) > { > ... > } > > Now, imagine the page is new, i.e. all-zeroes. That means the LSN is 0/0 > too, and we'll try to verify the checksum - but we actually do not set > checksums on empty pages. > > So I think it should be something like this: > > if ((!PageIsNew(page)) && (PageGetLSN(page) < startptr)) > { > ... > } > > It might be worth verifying that the page is actually all-zeroes (and > not just with corrupted pd_upper value. Not sure it's worth it. > > I've found this by fairly trivial stress testing - running pgbench and > pg_basebackup in a loop. It was failing pretty reliably (~75% of runs). > With the proposed change I see no further failures. Good catch, Tomas! I should have seen this since I had the same issue when I implemented this feature in pgBackRest. Anyway, I agree that your fix looks correct. Thanks, -- -David david@pgmasters.net
В списке pgsql-hackers по дате отправления: