Re: Initdb-time block size specification

From Tomas Vondra
Subject Re: Initdb-time block size specification
Msg-id 7acbab06-407e-5912-bcce-6788265d5ad2@enterprisedb.com
In reply to Re: Initdb-time block size specification  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Initdb-time block size specification  (Bruce Momjian <bruce@momjian.us>)
Re: Initdb-time block size specification  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers

On 6/30/23 23:53, Bruce Momjian wrote:
> On Fri, Jun 30, 2023 at 11:42:30PM +0200, Tomas Vondra wrote:
>>
>>
>> On 6/30/23 23:11, Andres Freund wrote:
>>> Hi,
>>>
>>> ...
>>>
>>> I suspect you're going to see more benefits from going to a *lower* setting
>>> than a higher one. Some practical issues aside, plenty of storage hardware
>>> these days would let you get rid of FPIs if you go to 4k blocks (although it
>>> often requires explicit sysadmin action to reformat the drive into that mode
>>> etc).  But obviously that's problematic from the "postgres limits" POV.
>>>
>>
>> I wonder what the conditions/options for disabling FPI are. I kinda
>> assume it'd apply to new drives with 4k sectors, with properly aligned
>> partitions etc. But I haven't seen any particularly clear confirmation
>> that's correct.
> 
> I don't think we have ever had to study this --- we just request the
> write to the operating system, and we either get a successful reply or
> we go into WAL recovery to reread the pre-image.  We never really care
> if the write is atomic, e.g., an 8kB write can be done in 2 4kB writes or 4
> 2kB writes --- we don't care --- we only care if they are all done or
> not.
> 
> For a 4kB write, to say it is not partially written would be to require
> the operating system to guarantee that the 4kB write is not split into
> smaller writes, which might each be atomic, because smaller atomic writes
> would not help us.
> 

Right, that's the dance we do to protect against torn pages. But Andres
suggested that if you have modern storage and configure it correctly,
writing with 4kB pages would be atomic. So we wouldn't need to do this
FPI stuff, eliminating a pretty significant source of write amplification.
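
To make the torn-page argument a bit more concrete, here is a toy model
in Python (not PostgreSQL code, just a sketch). It assumes the device
persists a page in fixed-size atomic units, in order, and that a crash
can land between any two units. The function names are made up for
illustration, and whether real hardware actually gives you a 4kB atomic
unit is exactly the open question above; the model simply takes it as
an input.

```python
def crash_states(page_size, atomic_unit):
    """All on-disk images a crash can leave behind while overwriting a page.

    Assumption: the device persists `atomic_unit`-sized chunks in order,
    and each chunk is either fully old or fully new.
    """
    old = b"A" * page_size   # page contents before the write
    new = b"B" * page_size   # page contents we are trying to write
    units = page_size // atomic_unit
    states = set()
    for done in range(units + 1):          # crash after `done` units persisted
        cut = done * atomic_unit
        states.add(new[:cut] + old[cut:])
    return states


def torn_states(page_size, atomic_unit):
    """Crash states that are neither the old page nor the new page."""
    old = b"A" * page_size
    new = b"B" * page_size
    return {s for s in crash_states(page_size, atomic_unit) if s not in (old, new)}


if __name__ == "__main__":
    # 8kB page on a device with 4kB atomic writes: one mixed image is possible.
    print(len(torn_states(8192, 4096)))   # 1
    # 8kB page on 512-byte sectors: many more mixed images.
    print(len(torn_states(8192, 512)))    # 15
    # 4kB page on a device with 4kB atomic writes: no mixed image at all.
    print(len(torn_states(4096, 4096)))   # 0
```

In this model the page is only ever "all old" or "all new" when the
page size equals the atomic unit, which is the case where the full-page
image would have nothing left to repair.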


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


