Re: Permission failures with WAL files in 13~ on Windows
From | Andres Freund |
---|---|
Subject | Re: Permission failures with WAL files in 13~ on Windows |
Date | |
Msg-id | 20210318023004.gz2aejhze2kkkqr2@alap3.anarazel.de |
In response to | Re: Permission failures with WAL files in 13~ on Windows (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Permission failures with WAL files in 13~ on Windows
|
List | pgsql-hackers |
Hi,

On 2021-03-18 09:55:46 +0900, Michael Paquier wrote:
> Let's see how it goes from this point, but, FWIW, I have not been able
> to reproduce again my similar problem with the archive command :/
> --

I suspect it might be easier to reproduce the issue with smaller WAL segments, a short checkpoint_timeout, and multiple jobs generating WAL and then sleeping for random amounts of time. Not sure if that's the sole ingredient, but consider what happens when there are processes that XLogWrite() some WAL and then sleep. Typically such a process' openLogFile will still point to the WAL segment. And it may still do so when the next checkpoint finishes and we recycle the WAL file.

I wonder if we actually fail to unlink() the file in durable_link_or_rename(), and then end up recycling the same old file into multiple "future" positions in the WAL stream.

There are also these interesting notes at
https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createhardlinka

1)
> The security descriptor belongs to the file to which a hard link
> points. The link itself is only a directory entry, and does not have a
> security descriptor. Therefore, when you change the security
> descriptor of a hard link, you change the security descriptor of the
> underlying file, and all hard links that point to the file allow the
> newly specified access. You cannot give a file different security
> descriptors on a per-hard-link basis.

2)
> Flags, attributes, access, and sharing that are specified in
> CreateFile operate on a per-file basis. That is, if you open a file
> that does not allow sharing, another application cannot share the file
> by creating a new hard link to the file.

3)
> The maximum number of hard links that can be created with this
> function is 1023 per file. If more than 1023 links are created for a
> file, an error results.

1) and 2) seem problematic for restore_command use. I wonder if there's a chance that some of the reports ended up hitting 3), and that Windows doesn't handle that well.

If you manage to reproduce, could you check what the link count of all the segments is? Apparently Sysinternals' findlinks can do that. Or perhaps even better, add an error check that the number of links of WAL segments is 1 in a bunch of places (recycling, opening them, closing them, maybe?). Plus error reporting for unlink failures, of course.

Greetings,

Andres Freund
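A minimal sketch of the kind of link-count check suggested above, assuming a POSIX-style stat() whose st_nlink field is populated. On Windows the comparable value is the nNumberOfLinks field filled in by GetFileInformationByHandle(). The helper names wal_segment_link_count() and wal_segment_has_single_link() are hypothetical, not existing PostgreSQL functions, and reporting goes to stderr here rather than through the server's own error reporting.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return the number of hard links of the file at
 * "path", or -1 if stat() fails (errno is left set by stat()).
 */
static long
wal_segment_link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_nlink;
}

/*
 * Hypothetical check: true only if the file exists and has exactly one
 * hard link. Anything else is reported on stderr and yields false.
 */
static bool
wal_segment_has_single_link(const char *path)
{
    long        nlinks = wal_segment_link_count(path);

    if (nlinks < 0)
    {
        fprintf(stderr, "could not stat \"%s\": %s\n", path, strerror(errno));
        return false;
    }
    if (nlinks != 1)
    {
        fprintf(stderr, "WAL segment \"%s\" has %ld links, expected 1\n",
                path, nlinks);
        return false;
    }
    return true;
}
```

Such a check could be called at the points mentioned above (recycling, opening, closing a segment); where exactly it belongs is left open here.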