Re: Add recovery to pg_control and remove backup_label
From | David Steele |
---|---|
Subject | Re: Add recovery to pg_control and remove backup_label |
Date | |
Msg-id | c9a8b7e0-a451-4148-abcd-1ba7c2e661b7@pgmasters.net |
In response to | Re: Add recovery to pg_control and remove backup_label (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On 11/21/23 16:00, Andres Freund wrote:
> Hi,
>
> On 2023-11-21 14:48:59 -0400, David Steele wrote:
>>> I'd not call 7.06->4.77 or 6.76->4.77 "virtually free".
>>
>> OK, but how does that look with compression
>
> With compression it's obviously somewhat different - but that part is done in
> parallel, potentially on a different machine with client side compression,
> whereas I think right now the checksumming is single-threaded, on the server
> side.

Ah, yes, that's certainly a bottleneck.

> With parallel server side compression, it's still 20% slower with the default
> checksumming than none. With client side it's 15%.

Yeah, that still seems a lot. But to a large extent it sounds like a
limitation of the current implementation.

>> -- to a remote location?
>
> I think this one unfortunately makes checksums a bigger issue, not a smaller
> one. The network interaction piece is single-threaded, adding another
> significant use of CPU onto the same thread means that you are hit harder by
> using substantial amount of CPU for checksumming in the same thread.
>
> Once you go beyond the small instances, you have plenty network bandwidth in
> cloud environments. We top out well before the network on bigger instances.
>
>> Uncompressed backup to local storage doesn't seem very realistic. With gzip
>> compression we measure SHA1 checksums at about 5% of total CPU.
>
> IMO using gzip is basically infeasible for non-toy sized databases today. I
> think we're using our users a disservice by defaulting to it in a bunch of
> places. Even if another default exposes them to difficulty due to potentially
> using a different compiled binary with fewer supported compression methods -
> that's gona be very rare in practice.

Yeah, I don't use gzip anymore, but there are still some platforms that
do not provide zstd (at least not easily) and lz4 compresses less. One
thing people do seem to have is a lot of cores.

>> I can't understate how valuable checksums are in finding corruption,
>> especially in long-lived backups.
>
> I agree! But I think we need faster checksum algorithms or a faster
> implementation of the existing ones. And probably default to something faster
> once we have it.

We've been using xxHash to generate checksums for our block-level
incremental and it is seriously fast, written by the same guy who did
zstd and lz4.

Regards,
-David