On Thu, 21 Dec 2023 at 10:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I have spent some more effort in this area and developed a patch
series that I think addresses all of the performance issues that
we've discussed in this thread, both for pg_upgrade and more
general use of pg_dump/pg_restore.
Thanks for picking this up!
Applying all 4 patches, I also see good performance improvement.
With more Large Objects, although pg_dump improved significantly,
pg_restore is now comfortably an order of magnitude faster.
pg_dump times (seconds):
NumLOs dump-patch004 dump-HEAD improvement (%)
1 0.09 0.09 ~
10 0.10 0.12 ~
100 0.12 0.12 ~
1,000 0.41 0.44 ~
10,000 3 5 76%
100,000 35 47 36%
1,000,000 111 251 126%
pg_restore times (seconds):
NumLOs restore-patch0004 restore-HEAD improvement (%)
1 0.02 0.02 ~
10 0.03 0.03 ~
100 0.13 0.12 ~
1,000 0.98 0.97 ~
10,000 2 9 ~5x
100,000 6 93 13x
1,000,000 53 973 17x
Test details:
- pg_dump -Fd -j32 / pg_restore -j32
- 32vCPU / Ubuntu 20.04 / 260GB Memory / r6id.8xlarge
- Client & Server on same machine
- Empty LOs / Empty ACLs
- HEAD = 7d7ef075d2b3f3bac4db323c2a47fb15a4a9a817
- See attached graphs
IMHO the knob (for configuring batch size) is a non-blocker. The
default (1k) here is already way better than what we have today.
Look forward to feedback on the tests, or I'll continue testing
whether ACLs / non-empty LOs etc. adversely affect these numbers.
-
Robins Tharakan
Amazon Web Services