Re: [PING] fallocate() causes btrfs to never compress postgresql files
From | Dimitrios Apostolou |
---|---|
Subject | Re: [PING] fallocate() causes btrfs to never compress postgresql files |
Date | |
Msg-id | 4aa3d83d-9630-61ac-85a7-a55490be49a6@gmx.net |
In reply to | Re: [PING] fallocate() causes btrfs to never compress postgresql files (Tomas Vondra <tomas@vondra.me>) |
List | pgsql-hackers |
On Wed, 28 May 2025, Tomas Vondra wrote:
>
> Isn't guaranteeing success of a write a general issue with compressed
> filesystem? Why is posix_fallocate() any special in this regard?
> Shouldn't the filesystem be defensive and assume the data is not
> compressible? Or maybe just return EOPNOTSUPP when in doubt.

It's not simple for CoW filesystems, including Btrfs and ZFS. What I know is that the current design is a compromise, it's not that the developers are happy with it. I can point you to some discussion, with pointers to further discussions if you are interested:

https://marc.info/?l=linux-btrfs&m=174310663519516&w=2

>> BTW even in the last case, PostgreSQL would not notice the lack of
>> fallocate() support as glibc implements a userspace fallback in
>> posix_fallocate(). That fallback has its own issues that hopefully will
>> not affect postgres (see CAVEATS in man 3 posix_fallocate).
>>
>
> Well, if btrfs starts returning EOPNOTSUPP, and glibc switches to the
> userspace fallback, we wouldn't notice. But that's up to the btrfs to
> decide if they want to support fallocate. We still need our fallback
> anyway, because of other OSes.

Btrfs has decided a few years back: they will "support" fallocate, but because real support is very difficult, they disable compression (among others) for files with fallocate'd ranges. They can't change that and return EOPNOTSUPP out of the blue now, but they are open to adding a mount option to optionally do that:

https://marc.info/?l=linux-btrfs&m=174310663519516&w=2

>> Should PostgreSQL provide a setting to avoid the use of fallocate()? Or is
>> it the filesystem at fault for not returning EOPNOTSUPP, in which case
>> postgres would use its fallback code?
>>
>
> I don't have a clear opinion on whether it's a filesystem issue. Maybe
> we should be handling this differently, not sure.

All I'm saying is that this is a regression for PostgreSQL users that keep tablespaces on compressed Btrfs.

What could be done on the PostgreSQL side is to provide a runtime setting for avoiding fallocate(), going through the old code path instead. Ideally this would be a per-tablespace option, but even a global one is better than nothing.

Thanks,
Dimitris
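
For illustration only, here is a minimal sketch of the kind of switch proposed above. It is not PostgreSQL code: the function `extend_file` and the flag `use_fallocate` are hypothetical names standing in for a runtime setting, and the zero-writing loop only approximates the "old code path". It assumes a Unix-like system where Python exposes `os.posix_fallocate` and `os.pwrite`.

```python
import errno
import os

# One 8 kB block of zeros (8192 bytes is PostgreSQL's default block size).
ZERO_BLOCK = bytes(8192)


def extend_file(fd: int, offset: int, length: int, use_fallocate: bool = True) -> str:
    """Extend the file behind fd by `length` bytes starting at `offset`.

    `use_fallocate` is a hypothetical stand-in for a runtime setting.
    Returns the name of the path that was taken.
    """
    if use_fallocate:
        try:
            os.posix_fallocate(fd, offset, length)
            return "fallocate"
        except OSError as e:
            # With glibc this is rarely reached, because posix_fallocate()
            # has its own userspace fallback; other platforms may report
            # EOPNOTSUPP or EINVAL for unsupported filesystems.
            if e.errno not in (errno.EOPNOTSUPP, errno.EINVAL):
                raise

    # Fallback path: extend the file by writing zero-filled blocks, which
    # leaves the filesystem free to compress the data.
    written = 0
    while written < length:
        chunk = min(len(ZERO_BLOCK), length - written)
        written += os.pwrite(fd, ZERO_BLOCK[:chunk], offset + written)
    return "write_zeros"
```

Calling `extend_file(fd, 0, 16384, use_fallocate=False)` always takes the zero-writing path, which is the behaviour a per-tablespace or global setting would select on compressed Btrfs.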