Discussion: pg_waldump vs. all-zeros WAL files; server creation of such files
The attached 010_zero.pl, when run as part of the pg_waldump test suite, fails
at today's master (c36b636) and v15 (1bc19df). It passes at v14 (5a32af3).
Command "pg_waldump --start 0/01000000 --end 0/01000100" fails as follows:

  pg_waldump: error: WAL segment size must be a power of two between 1 MB and 1 GB, but the WAL file "000000010000000000000002" header specifies 0 bytes

Where it fails, the server has created an all-zeros WAL file under that name.
Where it succeeds, that file doesn't exist at all. Two decisions to make:

- Should a clean server shutdown ever leave an all-zeros WAL file? I think
  yes, it's okay to let that happen.
- Should "pg_waldump --start $X --end $Y" open files not needed for the
  requested range? I think no.

Bisect of master got:

  30a53b7 Wed Mar 8 16:56:37 2023 +0100 Allow tailoring of ICU locales with custom rules

Doesn't fail at $(git merge-base REL_15_STABLE master). Bisect of v15 got:

  811203d Sat Aug 6 11:50:23 2022 -0400 Fix data-corruption hazard in WAL-logged CREATE DATABASE.

I suspect those are innocent. They changed the exact WAL content, which I
expect somehow caused creation of segment 2.

Oddly, I find only one other report of this:
https://www.postgresql.org/message-id/CAJ6DU3HiJ5FHbqPua19jAD%3DwLgiXBTjuHfbmv1jCOaNOpB3cCQ%40mail.gmail.com

Thanks,
nm
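To illustrate why an all-zeros file triggers that message, here is a minimal sketch of the rule quoted in the error text (a power of two between 1 MB and 1 GB). It is not the pg_waldump source; the function name and constants are hypothetical, and the 4-byte little-endian field is an assumption made only for the demonstration. The point is that any integer field read from a zero-filled file decodes to 0, which cannot satisfy the rule.

```python
# Hypothetical constants mirroring the bounds named in the error message.
MIN_SEG_SIZE = 1024 * 1024           # 1 MB
MAX_SEG_SIZE = 1024 * 1024 * 1024    # 1 GB


def is_valid_wal_seg_size(size: int) -> bool:
    """Return True if size is a power of two within [1 MB, 1 GB]."""
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and MIN_SEG_SIZE <= size <= MAX_SEG_SIZE


# An all-zeros file: whatever offset the size field lives at, the bytes
# read back are zero, so the decoded value is 0 (field width assumed).
zero_file = bytes(8192)
seg_size_from_header = int.from_bytes(zero_file[:4], "little")

print(seg_size_from_header)                           # 0
print(is_valid_wal_seg_size(seg_size_from_header))    # False
print(is_valid_wal_seg_size(16 * 1024 * 1024))        # True
```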
On Sat, Aug 12, 2023 at 08:15:31PM -0700, Noah Misch wrote:
> The attached 010_zero.pl, when run as part of the pg_waldump test suite, fails
> at today's master (c36b636) and v15 (1bc19df). It passes at v14 (5a32af3).
> Command "pg_waldump --start 0/01000000 --end 0/01000100" fails as follows:
>
> pg_waldump: error: WAL segment size must be a power of two between
> 1 MB and 1 GB, but the WAL file "000000010000000000000002" header
> specifies 0 bytes

So this depends on the ordering of the entries retrieved by readdir() as
much as the segments produced by the backend.

> Where it fails, the server has created an all-zeros WAL file under that name.
> Where it succeeds, that file doesn't exist at all. Two decisions to make:
>
> - Should a clean server shutdown ever leave an all-zeros WAL file? I think
>   yes, it's okay to let that happen.

It doesn't hurt to leave that around. On the contrary, it makes any
follow-up startup cheaper the bigger the segment size.

> - Should "pg_waldump --start $X --end $Y" open files not needed for the
>   requested range? I think no.

So this is a case where identify_target_directory() is called with a
fname of NULL. Agreed that search_directory could be smarter with the
files it should scan, especially if we have start and/or end LSNs at
hand to filter out what we'd like to be in the data folder.
--
Michael
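The filtering idea can be sketched as follows. This is only an illustration, not the pg_waldump code: all function names are hypothetical, and it assumes the default 16 MB segment size and timeline 1 are known up front. With those assumptions, the requested range 0/01000000 to 0/01000100 maps entirely to segment 000000010000000000000001, so the all-zeros segment 2 would never need to be opened, regardless of the order in which the directory entries come back.

```python
# Assumption: default 16 MB segments; a real cluster may use another size.
WAL_SEG_SIZE = 16 * 1024 * 1024


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' (both hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_number(lsn: int, seg_size: int = WAL_SEG_SIZE) -> int:
    """Return the number of the segment that contains the given LSN."""
    return lsn // seg_size


def segment_file_name(tli: int, segno: int, seg_size: int = WAL_SEG_SIZE) -> str:
    """Build the 24-hex-digit WAL file name for a timeline and segment."""
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid, segno % segs_per_xlogid)


def files_in_range(names, start, end, tli=1, seg_size=WAL_SEG_SIZE):
    """Keep only the file names whose segment overlaps [start, end].

    The end LSN is treated as inclusive, which is conservative: it may
    keep one extra segment when end falls exactly on a boundary.
    """
    first = segment_number(parse_lsn(start), seg_size)
    last = segment_number(parse_lsn(end), seg_size)
    wanted = {segment_file_name(tli, s, seg_size) for s in range(first, last + 1)}
    return [n for n in names if n in wanted]


# Directory listing in an arbitrary readdir()-like order.
names = ["000000010000000000000002", "000000010000000000000001"]
print(files_in_range(names, "0/01000000", "0/01000100"))
# ['000000010000000000000001']
```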