Re: Race between KeepFileRestoredFromArchive() and restartpoint

From	David Steele
Subject	Re: Race between KeepFileRestoredFromArchive() and restartpoint
Date	2 August 2022, 14:14:22
Msg-id	e1f1fb79-f668-0e7c-9841-763a023a7187@pgmasters.net
In reply to	Re: Race between KeepFileRestoredFromArchive() and restartpoint (Noah Misch <noah@leadboat.com>)
Responses	Re: Race between KeepFileRestoredFromArchive() and restartpoint
List	pgsql-hackers

On 7/31/22 02:17, Noah Misch wrote:
> On Tue, Jul 26, 2022 at 07:21:29AM -0400, David Steele wrote:
>> On 6/19/21 16:39, Noah Misch wrote:
>>> On Tue, Feb 02, 2021 at 07:14:16AM -0800, Noah Misch wrote:
>>>> Recycling and preallocation are wasteful during archive recovery, because
>>>> KeepFileRestoredFromArchive() unlinks every entry in its path.  I propose to
>>>> fix the race by adding an XLogCtl flag indicating which regime currently owns
>>>> the right to add long-term pg_wal directory entries.  In the archive recovery
>>>> regime, the checkpointer will not preallocate and will unlink old segments
>>>> instead of recycling them (like wal_recycle=off).  XLogFileInit() will fail.
>>>
>>> Here's the implementation.  Patches 1-4 suffice to stop the user-visible
>>> ERROR.  Patch 5 avoids a spurious LOG-level message and wasted filesystem
>>> writes, and it provides some future-proofing.
>>>
>>> I was tempted to (but did not) just remove preallocation.  Creating one file
>>> per checkpoint seems tiny relative to the max_wal_size=1GB default, so I
>>> expect it's hard to isolate any benefit.  Under the old checkpoint_segments=3
>>> default, a preallocated segment covered a respectable third of the next
>>> checkpoint.  Before commit 63653f7 (2002), preallocation created more files.
>>
>> This also seems like it would fix the link issues we are seeing in [1].
>>
>> I wonder if that would make it worth a back patch?
> 
> Perhaps.  It's sad to have multiple people deep-diving into something fixed on
> HEAD.  On the other hand, I'm not eager to spend risk-of-backpatch points on
> this.  One alternative would be adding an errhint like "This is known to
> happen occasionally during archive recovery, where it is harmless."  That has
> an unpolished look, but it's low-risk and may avoid deep-dive efforts.

I think in this case a HINT might be sufficient to at least keep people 
from wasting time tracking down a problem that has already been fixed.

However, there is another issue [1] that might argue for a back patch if,
as I believe, this patch would fix it.

Regards,
-David

[1] 
https://www.postgresql.org/message-id/CAHJZqBDxWfcd53jm0bFttuqpK3jV2YKWx%3D4W7KxNB4zzt%2B%2BqFg%40mail.gmail.com
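
For readers following the thread, the regime idea in Noah's quoted proposal can
be sketched roughly as below. This is only a toy model under stated
assumptions: the type and function names (ToyXLogCtl, WalInstallRegime,
may_preallocate, old_segment_action, end_archive_recovery) are hypothetical and
are not taken from the actual patches, and real code would read and write the
flag under a lock in shared memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is not the code from the patch series. */
typedef enum WalInstallRegime
{
    REGIME_ARCHIVE_RECOVERY,    /* restored archive files own new pg_wal entries */
    REGIME_NORMAL               /* checkpointer may preallocate and recycle */
} WalInstallRegime;

typedef enum OldSegmentAction
{
    SEGMENT_UNLINK,
    SEGMENT_RECYCLE
} OldSegmentAction;

/* Stand-in for the shared control struct that would carry the flag. */
typedef struct ToyXLogCtl
{
    WalInstallRegime regime;
} ToyXLogCtl;

/* Preallocation is allowed only outside the archive recovery regime. */
static bool
may_preallocate(const ToyXLogCtl *ctl)
{
    return ctl->regime == REGIME_NORMAL;
}

/*
 * During archive recovery old segments are always unlinked, as if
 * wal_recycle were off; otherwise the wal_recycle setting decides.
 */
static OldSegmentAction
old_segment_action(const ToyXLogCtl *ctl, bool wal_recycle)
{
    if (ctl->regime == REGIME_ARCHIVE_RECOVERY)
        return SEGMENT_UNLINK;
    return wal_recycle ? SEGMENT_RECYCLE : SEGMENT_UNLINK;
}

/* One-way transition made when archive recovery ends. */
static void
end_archive_recovery(ToyXLogCtl *ctl)
{
    assert(ctl->regime == REGIME_ARCHIVE_RECOVERY);
    ctl->regime = REGIME_NORMAL;
}
```

The point of the single flag is that only one party at a time has the right to
add long-term pg_wal directory entries, which is what removes the race with
KeepFileRestoredFromArchive().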
