Re: pg_upgrade --copy-file-range
From | Tomas Vondra
---|---
Subject | Re: pg_upgrade --copy-file-range
Date |
Msg-id | 4a039af3-3976-4716-90a9-2ee2839f1df3@enterprisedb.com
In response to | Re: pg_upgrade --copy-file-range (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses | Re: pg_upgrade --copy-file-range
List | pgsql-hackers
On 3/23/24 14:47, Tomas Vondra wrote:
> On 3/23/24 13:38, Robert Haas wrote:
>> On Fri, Mar 22, 2024 at 8:26 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>>> Hmm, this discussion seems to assume that we only use
>>> copy_file_range() to copy/clone whole segment files, right? That's
>>> great and may even get most of the available benefit given typical
>>> databases with many segments of old data that never changes, but... I
>>> think copy_file_range() allows us to go further than the other
>>> whole-file clone techniques: we can stitch together parts of an old
>>> backup segment file and an incremental backup to create a new file.
>>> If you're interested in minimising disk use while also removing
>>> dependencies on the preceding chain of backups, then it might make
>>> sense to do that even if you *also* have to read the data to compute
>>> the checksums, I think? That's why I mentioned it: if
>>> copy_file_range() (ie sub-file-level block sharing) is a solution in
>>> search of a problem, has the world ever seen a better problem than
>>> pg_combinebackup?
>>
>> That makes sense; it's just a different part of the code than I
>> thought we were talking about.
>>
>
> Yeah, that's in write_reconstructed_file() and the patch does not touch
> that at all. I agree it would be nice to use copy_file_range() in this
> part too, and it doesn't seem it'd be that hard to do, I think.
>
> It seems we'd just need a "fork" that either calls pread/pwrite or
> copy_file_range, depending on checksums and what was requested.
>

Here's a patch to use copy_file_range in write_reconstructed_file too, when requested/possible.

One thing I'm not sure about is whether to do pg_fatal() if --copy-file-range is requested but the platform does not support it. That is closer to what pg_upgrade does, but maybe we should just ignore what the user requested and fall back to the regular copy (a bit like when we have to compute a checksum for some files). Or maybe the check should just happen earlier ...
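To make the "fork" idea concrete, here is a minimal standalone sketch (not the patch itself; the function and variable names are illustrative, not taken from pg_combinebackup) of copying one range of a reconstructed file either via Linux's copy_file_range(2) or via a plain pread/pwrite loop when the syscall is unavailable or refuses the copy:

```c
/* Hedged sketch: one block-range copy with a copy_file_range fallback. */
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy 'len' bytes from src_fd at src_off to dst_fd at dst_off.
 * Returns 0 on success, -1 on error.
 */
static int
copy_block(int src_fd, off_t src_off, int dst_fd, off_t dst_off, size_t len)
{
#if defined(__linux__)
    /*
     * First try to let the kernel (and possibly the filesystem, via
     * reflink/block sharing) do the copy; fall back to read/write when
     * the syscall or the filesystem combination doesn't support it.
     */
    while (len > 0)
    {
        ssize_t n = copy_file_range(src_fd, &src_off, dst_fd, &dst_off,
                                    len, 0);

        if (n <= 0)
        {
            if (n == 0 || errno == EXDEV || errno == ENOSYS ||
                errno == EINVAL)
                break;          /* fall through to pread/pwrite */
            return -1;
        }
        len -= (size_t) n;
    }
    if (len == 0)
        return 0;
#endif
    /* Portable fallback: read the data into userspace and write it out. */
    while (len > 0)
    {
        char        buf[8192];
        size_t      chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t     nread = pread(src_fd, buf, chunk, src_off);

        if (nread <= 0)
            return -1;
        if (pwrite(dst_fd, buf, (size_t) nread, dst_off) != nread)
            return -1;
        src_off += nread;
        dst_off += nread;
        len -= (size_t) nread;
    }
    return 0;
}
```

The fallback branch is also where a checksum could be folded in, since the data passes through userspace there anyway; the copy_file_range branch never sees the bytes, which is exactly the tension discussed below.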
I've been thinking about what Thomas wrote - that maybe it'd be good to use copy_file_range() even when calculating the checksum - and I think he may be right. But the current patch does not do that, and while it doesn't seem very difficult to do (at least when reconstructing the file from incremental backups), I don't have a very good intuition for whether it'd be a win in typical cases.

I have a naive question about the checksumming - if we used a merkle-tree-like scheme, i.e. hashing the blocks and then hashing the block hashes, wouldn't that allow calculating the file-level hashes even without having to read the blocks, making copy_file_range more efficient? Sure, it's more complex, but it's a well-known scheme.

(OK, now I realize it'd mean we can't use tools like sha224sum to hash the files and compare to the manifest. I guess that's why we don't do this ...)

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
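[For illustration, a toy version of that merkle-like scheme. FNV-1a stands in for a real digest such as SHA-2, and all names are made up for this sketch; nothing here is taken from PostgreSQL's manifest code. The point is composability: the file-level hash is computed from per-block hashes, so a block cloned unchanged via copy_file_range() can contribute its stored hash without its data ever being read.]

```c
/* Toy "hash of per-block hashes" scheme; FNV-1a is a stand-in digest. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8        /* toy block size, for the example only */

static uint64_t
fnv1a(const unsigned char *data, size_t len, uint64_t h)
{
    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 1099511628211ULL;  /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Per-block hash, as a manifest might store it. */
static uint64_t
block_hash(const unsigned char *block, size_t len)
{
    return fnv1a(block, len, 14695981039346656037ULL);
}

/*
 * File-level hash computed purely from per-block hashes: unchanged
 * blocks reuse their stored hash, so their data need not be read.
 */
static uint64_t
file_hash_from_block_hashes(const uint64_t *hashes, size_t nblocks)
{
    uint64_t    h = 14695981039346656037ULL;

    for (size_t i = 0; i < nblocks; i++)
        h = fnv1a((const unsigned char *) &hashes[i],
                  sizeof(uint64_t), h);
    return h;
}
```

And this is exactly what breaks the sha224sum comparison mentioned above: the resulting value is a hash over hashes, not a digest of the raw file bytes, so no standard whole-file tool can reproduce it.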
Attachments