Re: Initdb-time block size specification

From	Andres Freund
Subject	Re: Initdb-time block size specification
Date	30 June 2023, 22:59:09
Msg-id	20230630225909.ecthnlfvlnk3ij2k@awork3.anarazel.de
In reply to	Re: Initdb-time block size specification (Bruce Momjian <bruce@momjian.us>)
Responses	Re: Initdb-time block size specification
List	pgsql-hackers

On 2023-06-30 18:37:39 -0400, Bruce Momjian wrote:
> On Sat, Jul  1, 2023 at 12:21:03AM +0200, Tomas Vondra wrote:
> > On 6/30/23 23:53, Bruce Momjian wrote:
> > > For a 4kB write, to say it is not partially written would be to require
> > > the operating system to guarantee that the 4kB write is not split into
> > > smaller writes which might each be atomic because smaller atomic writes
> > > would not help us.
> > 
> > Right, that's the dance we do to protect against torn pages. But Andres
> > suggested that if you have modern storage and configure it correctly,
> > writing with 4kB pages would be atomic. So we wouldn't need to do this
> > FPI stuff, eliminating pretty significant source of write amplification.
> 
> I agree the hardware is atomic for 4k writes, but do we know the OS
> always issues 4k writes?

When using a sector size of 4K you *can't* make smaller writes via normal
paths. The addressing unit is in sectors. The details obviously differ between
storage protocols, but you pretty much always just specify a start sector and a
number of sectors to be operated on.
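
To illustrate the addressing point, here is a minimal Python sketch of turning
a byte range into a (start sector, number of sectors) request. It is only a
model of the idea, not any real protocol or kernel interface: the names
`SECTOR_SIZE`, `SectorRequest` and `to_sector_request` are made up for this
example, and a 4K logical sector size is assumed.

```python
from typing import NamedTuple

SECTOR_SIZE = 4096  # assumed logical sector size in bytes


class SectorRequest(NamedTuple):
    """Hypothetical request shape: where to start and how many sectors."""
    start_sector: int
    sector_count: int


def to_sector_request(offset: int, length: int,
                      sector_size: int = SECTOR_SIZE) -> SectorRequest:
    """Express a byte range in sector units.

    Raises ValueError when the range is not a whole number of sectors,
    because such a range has no representation in sector units.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    if offset % sector_size or length % sector_size:
        raise ValueError("range is not aligned to whole sectors")
    return SectorRequest(offset // sector_size, length // sector_size)
```

With these assumptions, `to_sector_request(8192, 4096)` describes one sector
starting at sector 2, while a 512-byte length is rejected because there is no
way to say "an eighth of a sector" in the request.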

Obviously the kernel could read 4k, modify 512 bytes in-memory, and then write
4k back, but that shouldn't be a danger here.  There might also be debug
interfaces to allow reading/writing in different increments, but that'd not be
something happening during normal operation.
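
The read-modify-write case can be sketched the same way. This is a toy
in-memory model, assuming a 4K sector size, and does not reflect how any
particular kernel implements it; `ToyDisk` and `update_bytes` are hypothetical
names used only here. The point it shows is that a 512-byte change still
reaches the device as one whole-sector write.

```python
SECTOR_SIZE = 4096  # assumed logical sector size in bytes


class ToyDisk:
    """In-memory stand-in for a device that only accepts whole sectors."""

    def __init__(self, sectors, sector_size=SECTOR_SIZE):
        self.sector_size = sector_size
        self.data = bytearray(sectors * sector_size)
        self.write_log = []  # (start_sector, sector_count) per write issued

    def read_sectors(self, start, count):
        s = self.sector_size
        return bytes(self.data[start * s:(start + count) * s])

    def write_sectors(self, start, payload):
        s = self.sector_size
        if len(payload) == 0 or len(payload) % s:
            raise ValueError("payload must be a whole number of sectors")
        if start < 0 or start * s + len(payload) > len(self.data):
            raise ValueError("write is outside the device")
        self.data[start * s:start * s + len(payload)] = payload
        self.write_log.append((start, len(payload) // s))


def update_bytes(disk, offset, new):
    """Change len(new) bytes at a byte offset via read-modify-write."""
    if not new:
        raise ValueError("nothing to write")
    s = disk.sector_size
    first = offset // s
    last = (offset + len(new) - 1) // s
    # Read every sector the byte range touches.
    buf = bytearray(disk.read_sectors(first, last - first + 1))
    # Modify only the requested bytes in memory.
    rel = offset - first * s
    buf[rel:rel + len(new)] = new
    # Write the full sectors back; the device never sees a partial sector.
    disk.write_sectors(first, bytes(buf))
```

For example, after `disk = ToyDisk(4)` and
`update_bytes(disk, 4096 + 512, b"\x01" * 512)`, the only entry in
`disk.write_log` is a single one-sector write to sector 1.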
