Re: pg_upgrade --copy-file-range
From | Jakub Wartak |
---|---|
Subject | Re: pg_upgrade --copy-file-range |
Date | |
Msg-id | CAKZiRmyQ_F+OxHUi0+po9wnM=iwB0XUd=-ZT0ry_mOQJRnwmfA@mail.gmail.com |
In response to | Re: pg_upgrade --copy-file-range (Michael Paquier <michael@paquier.xyz>) |
Responses | Re: pg_upgrade --copy-file-range |
List | pgsql-hackers |

Hi Thomas, Michael, Peter and -hackers,

On Sun, Dec 24, 2023 at 3:57 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Dec 23, 2023 at 09:52:59AM +1300, Thomas Munro wrote:
> > As it happens I was just thinking about this particular patch because
> > I suddenly had a strong urge to teach pg_combinebackup to use
> > copy_file_range. I wonder if you had the same idea...
>
> Yeah, +1. That would make copy_file_blocks() more efficient where the
> code is copying 50 blocks in batches because it needs to reassign
> checksums to the blocks copied.

I've tried to achieve what you were discussing. Actually, this was my first thought when using pg_combinebackup with larger (realistic) backup sizes back in December. Attached is a set of very DIRTY (!) patches that provide CoW options (--clone/--copy-file-range) to pg_combinebackup (just like pg_upgrade, to keep it in sync), while also refactoring some related bits of code to avoid duplication.

With XFS (with reflink=1, which is the default) on Linux with kernel 5.10 and ~210GB backups, I'm getting:

    root@jw-test-1:/xfs# du -sm *
    210229  full
    250     incr.1

Today in master, with the old classic read()/while() loop and no CoW/reflink optimization:

    root@jw-test-1:/xfs# rm -rf outtest; sync; sync ; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches ; time /usr/pgsql17/bin/pg_combinebackup --manifest-checksums=NONE -o outtest full incr.1
    3

    real    49m43.963s
    user    0m0.887s
    sys     2m52.697s

VS patch with "--clone":

    root@jw-test-1:/xfs# rm -rf outtest; sync; sync ; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches ; time /usr/pgsql17/bin/pg_combinebackup --manifest-checksums=NONE --clone -o outtest full incr.1
    3

    real    0m39.812s
    user    0m0.325s
    sys     0m2.401s

So it is 49 minutes down to 40 seconds(!) +/-10s (3 tries) if the FS supports CoW/reflinks (XFS, BTRFS, upcoming bcachefs?).
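For readers unfamiliar with the two mechanisms compared above, here is a minimal Linux-only sketch of the general technique. It is not the code from the attached patches: the helper name clone_or_copy_file() is hypothetical, and the ordering (try a reflink via the FICLONE ioctl, then copy_file_range(), then a plain read()/write() loop) is an assumption made for illustration. It assumes glibc 2.27 or later for the copy_file_range() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for copy_file_range() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#include <linux/fs.h>           /* FICLONE */
#endif

/*
 * Hypothetical helper (not from the patch set): copy src to dst, preferring
 * a CoW clone. Returns 0 on success, -1 on failure.
 */
static int
clone_or_copy_file(const char *src, const char *dst)
{
    char        buf[65536];
    struct stat st;
    off_t       remaining;
    int         srcfd;
    int         dstfd;
    int         rc = -1;

    srcfd = open(src, O_RDONLY);
    if (srcfd < 0)
        return -1;
    dstfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (dstfd < 0)
    {
        close(srcfd);
        return -1;
    }

#ifdef FICLONE
    /* 1. Reflink: shares extents, so no data is copied (XFS, Btrfs). */
    if (ioctl(dstfd, FICLONE, srcfd) == 0)
    {
        rc = 0;
        goto done;
    }
#endif

    /* 2. copy_file_range(): the kernel copies, or clones where it can. */
    if (fstat(srcfd, &st) < 0)
        goto done;
    remaining = st.st_size;
    while (remaining > 0)
    {
        ssize_t     n = copy_file_range(srcfd, NULL, dstfd, NULL,
                                        (size_t) remaining, 0);

        if (n <= 0)
            break;              /* unsupported or error: use the loop below */
        remaining -= n;
    }

    /*
     * 3. Classic loop. Both descriptors' offsets were advanced by step 2,
     * so this continues wherever step 2 stopped (or just sees EOF).
     */
    for (;;)
    {
        ssize_t     nread = read(srcfd, buf, sizeof(buf));
        ssize_t     off = 0;

        if (nread < 0)
            goto done;
        if (nread == 0)
            break;
        while (off < nread)
        {
            ssize_t     nw = write(dstfd, buf + off, (size_t) (nread - off));

            if (nw < 0)
                goto done;
            off += nw;
        }
    }
    rc = 0;

done:
    close(srcfd);
    if (close(dstfd) < 0)
        rc = -1;
    return rc;
}
```

Calling clone_or_copy_file("full/base/1/1259", "outtest/base/1/1259") (paths purely illustrative) would take the first path on a reflink-capable filesystem and silently degrade to a byte copy elsewhere. A real tool would more likely make the method an explicit opt-in, as the switches described in this mail do, so that the user knows which one was used.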

It looks to me that this might mean that if one actually wants to use incremental backups (to get minimal RTO), it would be wise to only use CoW filesystems from the start so that RTO is as low as possible.

Random patch notes:
- the main meat is in v3-0002*; I hope I did not screw something up seriously
- in the worst case: it is opt-in through a switch, so the user can always stick to the classic copy
- no docs so far
- pg_copyfile_offload_supported() should actually be fixed if it is a good path forward
- pgindent reindents larger areas of code than I would like; any ideas, or is that OK?
- not tested on Win32/MacOS/FreeBSD
- I've tested pg_upgrade manually and it seems to work and issue the correct syscalls; however, some tests are failing(?). I haven't investigated why yet due to lack of time.

Any help is appreciated.

-J.