Re: InstallXLogFileSegment() vs concurrent WAL flush

From: Yugo NAGATA
Subject: Re: InstallXLogFileSegment() vs concurrent WAL flush
Msg-id: 20240202205608.7e8a6a9ed08386d218d73704@sraoss.co.jp
In reply to: InstallXLogFileSegment() vs concurrent WAL flush  (Thomas Munro <thomas.munro@gmail.com>)
Responses: Re: InstallXLogFileSegment() vs concurrent WAL flush  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
On Fri, 2 Feb 2024 11:18:18 +0100
Thomas Munro <thomas.munro@gmail.com> wrote:

> Hi,
> 
> New WAL space is created by renaming a file into place.  Either a
> newly created file with a temporary name or, ideally, a recyclable old
> file with a name derived from an old LSN.  I think there is a data
> loss window between rename() and fsync(parent_directory).  A
> concurrent backend might open(new_name), write(), fdatasync(), and
> then we might lose power before the rename hits the disk.  The data
> itself would survive the crash, but recovery wouldn't be able to find
> and replay it.  That might break the log-before-data rule or forget a
> transaction that has been reported as committed to a client.
> 
> Actual breakage would presumably require really bad luck, and I
> haven't seen this happen or anything, it just occurred to me while
> reading code, and I can't see any existing defences.
> 
> One simple way to address that would be to make XLogFileInitInternal()
> wait for InstallXLogFileSegment() to finish.  It's a little

Or can we make sure the rename is durable by calling fsync before
returning the fd, as in the attached patch?
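
To illustrate the idea outside of PostgreSQL, here is a minimal
standalone sketch. It assumes the fsync in question is on the parent
directory of the segment file, so that the directory entry created by
rename() is on disk before any caller can write through the fd. The
names fsync_parent_dir() and open_segment_durably() are hypothetical
and are not taken from the patch or from the PostgreSQL source.

```c
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync the directory that contains "path", so that
 * a preceding rename() into that directory survives a crash.
 * Returns 0 on success, -1 on failure.
 */
static int
fsync_parent_dir(const char *path)
{
    char        buf[1024];
    char       *dir;
    int         dirfd;
    int         rc;

    /* dirname() may modify its argument, so work on a copy. */
    if (snprintf(buf, sizeof(buf), "%s", path) >= (int) sizeof(buf))
        return -1;
    dir = dirname(buf);

    dirfd = open(dir, O_RDONLY);
    if (dirfd < 0)
        return -1;
    rc = fsync(dirfd);
    close(dirfd);
    return rc;
}

/*
 * Hypothetical opener: hand out an fd only after the file's directory
 * entry has been flushed. A concurrent renamer may not have reached its
 * own directory fsync yet, so the opener does one itself.
 * Returns the fd on success, -1 on failure.
 */
static int
open_segment_durably(const char *path)
{
    int         fd;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    if (fsync_parent_dir(path) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

With something like this, a backend that opens the new name always pays
for one directory fsync, but it never has to wait for the process doing
the rename.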

Regards,
Yugo Nagata

> pessimistic to do that unconditionally, though, as then you have to
> wait even for rename operations for segment files later than the one
> you're opening, so I thought about how to skip waiting in that case --
> see 0002.  I'm not sure if it's worth worrying about or not.


-- 
Yugo NAGATA <nagata@sraoss.co.jp>
