Re: The ability of postgres to determine loss of files of the main fork

From Jakub Wartak
Subject Re: The ability of postgres to determine loss of files of the main fork
Date
Msg-id CAKZiRmy0CK3m0-raCdTDELg0JjY7qAqzEN9P5n4N4wGw6ys4tw@mail.gmail.com
In reply to Re: The ability of postgres to determine loss of files of the main fork  (Aleksander Alekseev <aleksander@tigerdata.com>)
Responses Re: The ability of postgres to determine loss of files of the main fork
List pgsql-hackers
On Wed, Oct 1, 2025 at 1:46 PM Aleksander Alekseev
<aleksander@tigerdata.com> wrote:
>
> Hi Jakub,
>
> > IMHO all files should be opened at least on startup to check integrity,
>
> That might be a lot of files to open.

I was afraid of that too, but say a modern high-end DB is 200TB; that's
200*1024 = ~205k 1GB segment files, and I'm getting the following
time(1) timings for 204k files on ext4:

$ time ./createfiles                      # real 0m2.157s, it's open(O_CREAT)+close()
$ time ls -l many_files_dir/ > /dev/null  # real 0m0.734s
$ time ./openfiles                        # real 0m0.297s, for already existing ones (hot)
$ time ./openfiles                        # real 0m1.456s, for already existing ones (cold, echo 3 > /proc/sys/vm/drop_caches)
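(The ./createfiles and ./openfiles test programs weren't posted, so the
following is just a minimal sketch of what the ./openfiles side
presumably looks like: open()+close() every regular file in one
directory, with the directory name taken from the ls invocation above
and everything else assumed. ./createfiles would be the same loop with
open(path, O_CREAT | O_WRONLY, 0600) instead.)

#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    const char *dir = (argc > 1) ? argv[1] : "many_files_dir";
    DIR        *d = opendir(dir);
    struct dirent *de;
    long        nopened = 0;

    if (d == NULL)
    {
        perror("opendir");
        return 1;
    }

    while ((de = readdir(d)) != NULL)
    {
        char    path[4096];
        int     fd;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);

        /* O_RDONLY is enough to prove the file is still reachable */
        fd = open(path, O_RDONLY);
        if (fd < 0)
        {
            perror(path);
            return 1;
        }
        close(fd);
        nopened++;
    }

    closedir(d);
    printf("opened %ld files\n", nopened);
    return 0;
}

Run under time(1) as above, e.g. "time ./openfiles many_files_dir"; the
cold-cache number needs the drop_caches step first.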

Not bad in my book for a one-time activity. It could potentially pose a
problem with high-latency open() calls, e.g. on NFS or other remote
storage, I guess.

> Even if you can open a file it doesn't mean it's not empty

Correct, I haven't investigated that rabbit hole...

> or is not corrupted.

I think checksums guard users well in this case, as they would at least
get notified that something is wonky (much better than a wrong result /
silent data loss).

-J.


