Relation bulk write facility

From	Heikki Linnakangas
Subject	Relation bulk write facility
Date	September 19, 2023, 15:13:47
Msg-id	30e8f366-58b3-b239-c521-422122dd5150@iki.fi
Replies	Re: Relation bulk write facility
List	pgsql-hackers

Several places bypass the buffer manager and use direct smgrextend() 
calls to populate a new relation: Index AM build methods, rewriteheap.c 
and RelationCopyStorage(). There's a fair amount of duplicated code to 
WAL-log the pages, calculate checksums, call smgrextend(), and finally 
call smgrimmedsync() if needed. The duplication is tedious and 
error-prone. For example, if we want to optimize by WAL-logging multiple 
pages in one record, that needs to be implemented in each AM separately. 
Currently only the sorted GiST index build does that, but it would be equally 
beneficial in all of those places.

And I believe we got the smgrimmedsync() logic slightly wrong in a 
number of places [1]. It's also not great for latency: we could let the 
checkpointer do the fsyncing lazily, as Robert mentioned in the same 
thread.

The attached patch centralizes that pattern to a new bulk writing 
facility, and changes all those AMs to use it. The facility buffers 32 
pages, WAL-logs them in one record, and calculates checksums. You could 
imagine a lot of further optimizations, like writing those 32 pages in 
one vectored pwritev() call [2], and not skipping the buffer manager 
when the relation is small. But the scope of this initial version is 
mostly to refactor the existing code.
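
To make the batching idea concrete, here is a minimal standalone sketch of a writer that buffers up to 32 pages and then handles them as one batch. It is not the API from the attached patch: BulkWriter, bulk_writer_add(), bulk_writer_flush(), toy_checksum() and SKETCH_BLCKSZ are all hypothetical names, the WAL record and the smgrextend() call are only simulated with counters, and the checksum is a toy hash rather than PostgreSQL's page checksum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192   /* assumption: PostgreSQL's default block size */
#define MAX_PENDING   32     /* batch size named in the text */

/* One buffered page waiting to be logged and written. */
typedef struct PendingPage
{
    uint32_t blkno;
    uint8_t  data[SKETCH_BLCKSZ];
} PendingPage;

/* Hypothetical writer state; the counters stand in for real side effects. */
typedef struct BulkWriter
{
    PendingPage pending[MAX_PENDING];
    int         npending;
    int         wal_records;    /* simulated WAL records emitted */
    int         pages_written;  /* simulated smgrextend() calls */
    uint32_t    last_checksum;  /* checksum of the most recently written page */
} BulkWriter;

/* Toy FNV-1a hash; NOT the page checksum algorithm PostgreSQL uses. */
static uint32_t
toy_checksum(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

static void
bulk_writer_init(BulkWriter *bw)
{
    memset(bw, 0, sizeof(*bw));
}

/* Emit one simulated WAL record for the whole batch, then checksum and
 * "write" each buffered page. */
static void
bulk_writer_flush(BulkWriter *bw)
{
    if (bw->npending == 0)
        return;

    bw->wal_records++;          /* one record covers all pending pages */
    for (int i = 0; i < bw->npending; i++)
    {
        bw->last_checksum = toy_checksum(bw->pending[i].data, SKETCH_BLCKSZ);
        bw->pages_written++;
    }
    bw->npending = 0;
}

/* Copy a page into the buffer; flush automatically once the batch is full. */
static void
bulk_writer_add(BulkWriter *bw, uint32_t blkno, const uint8_t *page)
{
    assert(bw->npending < MAX_PENDING);

    bw->pending[bw->npending].blkno = blkno;
    memcpy(bw->pending[bw->npending].data, page, SKETCH_BLCKSZ);
    bw->npending++;

    if (bw->npending == MAX_PENDING)
        bulk_writer_flush(bw);
}
```

Feeding 70 pages through bulk_writer_add() and then calling bulk_writer_flush() once yields three simulated records: two full batches of 32 and a final one of 6. The point of centralizing this is that callers only ever add pages; the batching policy lives in one place.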

One new optimization included here is to let the checkpointer do the 
fsyncing if possible. That gives a big speedup when e.g. restoring a 
schema-only dump with lots of relations.

[1] 
https://www.postgresql.org/message-id/58effc10-c160-b4a6-4eb7-384e95e6f9e3%40iki.fi

[2] 
https://www.postgresql.org/message-id/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Attachments

v1-0001-Introduce-a-new-bulk-loading-facility.patch
