Re: Add recovery to pg_control and remove backup_label

From: Andres Freund
Subject: Re: Add recovery to pg_control and remove backup_label
Msg-id 20231121200018.ifkhrclmum3gq2pt@awork3.anarazel.de
In response to: Re: Add recovery to pg_control and remove backup_label  (David Steele <david@pgmasters.net>)
Responses: Re: Add recovery to pg_control and remove backup_label  (David Steele <david@pgmasters.net>)
List: pgsql-hackers

Hi,

On 2023-11-21 14:48:59 -0400, David Steele wrote:
> > I'd not call 7.06->4.77 or 6.76->4.77 "virtually free".
> 
> OK, but how does that look with compression

With compression it's obviously somewhat different - but that part is done in
parallel, potentially on a different machine with client side compression,
whereas I think right now the checksumming is single-threaded, on the server
side.

With parallel server side compression, it's still 20% slower with the default
checksumming than with none. With client side compression it's 15%.


> -- to a remote location?

I think this one unfortunately makes checksums a bigger issue, not a smaller
one. The network interaction piece is single-threaded, so adding another
significant use of CPU onto the same thread means that you are hit harder by
spending a substantial amount of CPU on checksumming there.

Once you go beyond the small instances, you have plenty of network bandwidth in
cloud environments. We top out well before the network on bigger instances.


> Uncompressed backup to local storage doesn't seem very realistic. With gzip
> compression we measure SHA1 checksums at about 5% of total CPU.

IMO using gzip is basically infeasible for non-toy sized databases today. I
think we're doing our users a disservice by defaulting to it in a bunch of
places. Even if another default exposes them to difficulty due to potentially
using a differently compiled binary with fewer supported compression methods -
that's going to be very rare in practice.


> I can't understate how valuable checksums are in finding corruption,
> especially in long-lived backups.

I agree!  But I think we need faster checksum algorithms or a faster
implementation of the existing ones. And probably default to something faster
once we have it.
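
As a rough illustration of how single-threaded checksum throughput could be
compared, here is a minimal sketch. It assumes Python's hashlib and zlib as
stand-ins and is not the server's code path; the function names, buffer size
and round count are hypothetical. Note that zlib.crc32 is plain CRC-32, which
is not the same algorithm as CRC-32C, so treat the numbers as indicative only.

```python
import hashlib
import time
import zlib


def throughput_mb_s(update, data, rounds=20):
    """Feed `data` to `update` `rounds` times; return MB/s on one thread."""
    start = time.perf_counter()
    for _ in range(rounds):
        update(data)
    # Guard against a zero interval on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(data) * rounds / elapsed / 1e6


def compare_checksums(chunk_size=1 << 20, rounds=20):
    """Return {algorithm name: MB/s} for a few checksum algorithms."""
    data = bytes(chunk_size)  # zero-filled buffer of chunk_size bytes
    results = {}

    # Cryptographic hashes from the standard library.
    for name in ("sha1", "sha256"):
        hasher = hashlib.new(name)
        results[name] = throughput_mb_s(hasher.update, data, rounds)

    # Plain CRC-32 via zlib, carrying the running value between calls.
    state = {"crc": 0}

    def crc_update(buf):
        state["crc"] = zlib.crc32(buf, state["crc"])

    results["crc32"] = throughput_mb_s(crc_update, data, rounds)
    return results


if __name__ == "__main__":
    for name, rate in compare_checksums().items():
        print(f"{name:>7}: {rate:8.0f} MB/s")
```

Everything here runs on one thread, which is the situation that matters when
the checksum shares a thread with the network I/O. Absolute figures depend
heavily on the CPU and on how the underlying libraries were built.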

Greetings,

Andres Freund


