Discussion: fallocate() causes btrfs to never compress postgresql files

fallocate() causes btrfs to never compress postgresql files

From
Dimitrios Apostolou
Date:
Hi,

this is just a heads-up about files being generated by PostgreSQL 17 not
being compressed by Btrfs, even when mounted with the compress-force mount
option. I have this occurring aggressively when restoring a database via
pg_restore. I think this is caused by mdzeroextend() calling FileFallocate(),
which in turn invokes posix_fallocate().
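
To illustrate the two ways of extending a file that are at issue here, the following is a minimal Python sketch, not PostgreSQL's actual code. It assumes a Linux system where `os.posix_fallocate` is available; the helper names, the `demo` function, and the choice of 16 blocks are hypothetical, and 8192 is used only because it is PostgreSQL's default block size.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # PostgreSQL's default block size (assumed for this sketch)


def extend_with_fallocate(fd: int, offset: int, nblocks: int) -> None:
    """Reserve space for nblocks without writing any data."""
    os.posix_fallocate(fd, offset, nblocks * BLOCK_SIZE)


def extend_with_zeros(fd: int, offset: int, nblocks: int) -> None:
    """Extend the file by explicitly writing zero-filled blocks."""
    zero_block = bytes(BLOCK_SIZE)
    for i in range(nblocks):
        written = os.pwrite(fd, zero_block, offset + i * BLOCK_SIZE)
        if written != BLOCK_SIZE:
            raise OSError("short write while zero-extending file")


def demo(directory=None):
    """Extend one temporary file with each method and report the sizes."""
    sizes = {}
    methods = (("fallocate", extend_with_fallocate),
               ("zeros", extend_with_zeros))
    for name, extend in methods:
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            extend(f.fileno(), 0, 16)
            sizes[name] = os.fstat(f.fileno()).st_size
    return sizes


if __name__ == "__main__":
    print(demo())
```

Both methods leave a file of the same logical size (16 blocks, 131072 bytes). The difference described in this thread is in how the filesystem treats the resulting extents: passing a directory on a Btrfs mount as `directory` and inspecting the files with a filesystem tool would be one way to compare, but the sketch itself does not measure compression.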

I also verified that turning off the use of fallocate causes the database
to write compressed files again, like it did in older versions.
Unfortunately the only way I found was to configure with a "hack" so that
autoconf thinks the feature is not available:

   ./configure ac_cv_func_posix_fallocate=no

There have been discussions on the btrfs mailing list about why it does
that; the summary is that it is very difficult to guarantee that
compressed writes will not fail with ENOSPC on a CoW filesystem, thus
files with fallocate()d ranges are treated as being marked NOCOW,
effectively disabling compression.

Should PostgreSQL provide a setting to avoid the use of fallocate()? Or is
it the filesystem at fault for not returning EOPNOTSUPP, in which case
postgres would use its fallback code?

BTW even in the latter case, PostgreSQL would not notice the lack of
fallocate() support, as glibc implements a userspace fallback in
posix_fallocate(). That fallback has its own issues that hopefully will
not affect postgres (see CAVEATS in man 3 posix_fallocate).
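
The fallback behaviour asked about above can be sketched roughly as follows. This is a hedged Python illustration of the general "try to preallocate, otherwise write zeros" pattern, not PostgreSQL's implementation; `extend_file`, its `allocate` parameter, and the 64 KiB chunk size are hypothetical names and values chosen for the example. Because glibc's posix_fallocate() emulates the call itself, the unsupported case is simulated here with an injected stub.

```python
import errno
import os
import tempfile


def extend_file(fd, offset, length, allocate=os.posix_fallocate):
    """Try to preallocate; fall back to writing zeros if unsupported.

    Returns the name of the method that ended up being used.
    """
    try:
        allocate(fd, offset, length)
        return "fallocate"
    except OSError as exc:
        # Only treat "not supported" style errors as a reason to fall back.
        if exc.errno not in (errno.EOPNOTSUPP, errno.EINVAL):
            raise

    # Fallback path: write zeros in chunks until the range is covered.
    chunk = bytes(64 * 1024)
    written = 0
    while written < length:
        written += os.pwrite(fd, chunk[: length - written], offset + written)
    return "zeros"


def unsupported(fd, offset, length):
    """Stub that behaves like a filesystem rejecting preallocation."""
    raise OSError(errno.EOPNOTSUPP, os.strerror(errno.EOPNOTSUPP))


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        print(extend_file(f.fileno(), 0, 100_000, allocate=unsupported))
```

Injecting the allocator makes the fallback path testable on any filesystem. It also shows why the question matters: the fallback only triggers if the error actually reaches the caller.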

Regards,
Dimitris




Re: fallocate() causes btrfs to never compress postgresql files

From
Bráulio Oliveira
Date:
On Sun, Nov 16, 2025 at 11:38 PM Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> Hi,
>
> this is just a heads-up about files being generated by PostgreSQL 17 not
> being compressed by Btrfs, even when mounted with the compress-force mount
> option. I have this occurring aggressively when restoring a database via
> pg_restore. I think this is caused by mdzeroextend() calling FileFallocate(),
> which in turn invokes posix_fallocate().
>
> I also verified that turning off the use of fallocate causes the database
> to write compressed files again, like it did in older versions.
> Unfortunately the only way I found was to configure with a "hack" so that
> autoconf thinks the feature is not available:
>
>    ./configure ac_cv_func_posix_fallocate=no
Hi,

I'm rebuilding the official Ubuntu package with this configure option
because, after migrating PostgreSQL to a Btrfs partition with Snapper and
compression, the database is eating space really fast (around 50-100 GB
per hour).

Sorry for not using the [PING] thread; I couldn't get the email resent
as I just joined the mailing list.

Are there any plans to merge the file_extend_method soon? Btrfs compressed
the original database to 36% of its original size, but now that the DB is
running it is consuming space at a crazy rate. I'm not sure whether running
VACUUM and ANALYSE is making things worse.

Thanks,
Bráulio




Re: fallocate() causes btrfs to never compress postgresql files

From
Thomas Munro
Date:
On Mon, Nov 17, 2025 at 4:23 PM Bráulio Oliveira <brauliobo@gmail.com> wrote:
> Are there any plans to merge the file_extend_method soon? Btrfs compressed
> the original database to 36% of its original size, but now that the DB is
> running it is consuming space at a crazy rate. I'm not sure whether running
> VACUUM and ANALYSE is making things worse.

Yeah, I'm planning to post a version with documentation and responses
to the new feedback on that thread soon.  Sorry it didn't make last
week's release.