Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
От | Thomas Munro |
---|---|
Тема | Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" |
Дата | |
Msg-id | CA+hUKGLnhvNW9+V7+o_8Cnu+_fp1kZ7UU9PDnwyqFKQi4agAnA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" (Thomas Munro <thomas.munro@gmail.com>) |
Ответы |
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" |
Список | pgsql-hackers |
While chatting to Robert and Andres about all this, a new idea came up. Or, rather, one of the first ideas that was initially rejected, now resurrected to try out a suggestion of Andres’s on how to de-pessimise it. Unfortunately, it also suffers from Windows-specific problems that I originally mentioned at the top of this thread but had since repressed. Arrrghgh. First, the good news: We could write out a whole new control file, and durable_rename() it into place. We don’t want to do that in general, because we don’t want to slow down UpdateMinRecoveryPoint(). The new concept is to do that only if a backup is in progress. That requires a bit of interlocking with backup start/stop (ie when runningBackups is changing in shmem, we don’t want to overlap with UpdateControlFile()'s decision on how to do it). Here is a patch to try that out. No more weasel wording needed for the docs; basebackup and low-level file system backup should always see an atomic control file (and occasionally also copy a harmless pg_control.tmp file). Then we only need the gross retry-until-stable hack for front-end programs. And the bad news: In my catalogue-of-Windows-weirdness project[1], I learned in v3-0003 that: + fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777); + PG_EXPECT_SYS(fd >= 0, "touch name1.txt"); + PG_REQUIRE_SYS(fd < 0 || close(fd) == 0); + + fd = open(path2, O_RDWR | PG_BINARY, 0777); + PG_EXPECT_SYS(fd >= 0, "open name2.txt"); + make_path(path2, "name2.txt"); +#ifdef WIN32 + + /* + * Windows can't rename over an open non-unlinked file, even with + * have_posix_unlink_semantics. + */ + pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */ + PG_EXPECT_SYS(rename(path, path2) == -1, + "Windows: can't rename name1.txt -> name2.txt while name2.txt is open"); + PG_EXPECT_EQ(errno, EACCES); + PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt"); +#else + PG_EXPECT_SYS(rename(path, path2) == 0, + "POSIX: can rename name1.txt -> name2.txt while name2.txt is open"); +#endif + PG_EXPECT_SYS(close(fd) == 0); Luckily the code in dirmod.c:pgrename() should retry lots of times if a concurrent transient opener/reader comes along, so I think that should be OK in practice (but if backups_r_us.exe holds the file open for 10 seconds while we're trying to rename it, I assume we'll PANIC); call that problem #1. What is slightly more disturbing is the clue in the "Cygwin cleanup" thread[2] that rename() can fail to be 100% atomic, so that a concurrent call to open() can fail with ENOENT (cf. the POSIX requirement "... a link named new shall remain visible to other processes throughout the renaming operation and refer either to the file referred to by new or old ..."). Call that problem #2, a problem that already causes us rare breakage (for example: could not open file "pg_logical/snapshots/0-14FE6B0.snap"). I know that problem #1 can be fixed by applying v3-0004 from [1] but that leads to impossible decisions like revoking support for non-NTFS filesystems as discussed in that thread, and we certainly couldn't back-patch that anyway. I assume problem #2 can too. That makes me want to *also* acquire ControlFileLock, for base backup's read of pg_control. Even though it seems redundant with the rename() trick (the rename() trick should be enough for low-level *and* basebackup on ext4), it would at least avoid the above Windowsian pathologies during base backups. I'm sorry for the patch/idea-churn in this thread. It's like Whac-a-Mole. Blasted non-POSIX-compliant moles. New patches attached. Are they getting better? [1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com [2] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2Be13wK0PBX5Z63CCwWm7MfRQuwBRabM_3aKWSko2AUww%40mail.gmail.com
Вложения
В списке pgsql-hackers по дате отправления: