Re: [PATCH] Verify Checksums during Basebackups

From Magnus Hagander
Subject Re: [PATCH] Verify Checksums during Basebackups
Date
Msg-id CABUevEyTJTvn328B6Jb=LdZFZJE6p0MPT=HivSXs-KbxGpqrGw@mail.gmail.com
In reply to [PATCH] Verify Checksums during Basebackups  (Michael Banck <michael.banck@credativ.de>)
Responses Re: [PATCH] Verify Checksums during Basebackups  (Robert Haas <robertmhaas@gmail.com>)
Re: [PATCH] Verify Checksums during Basebackups  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers


On Wed, Feb 28, 2018 at 7:08 PM, Michael Banck <michael.banck@credativ.de> wrote:
Hi,

some installations have data which is only rarely read, and if they are
so large that dumps are not routinely taken, data corruption would only
be detected with some large delay even with checksums enabled.

I think this is a very common scenario, particularly once you take indexes and things like that into account.


The attached small patch verifies checksums (in case they are enabled)
during a basebackup. The rationale is that we are reading every block in
this case anyway, so this is a good opportunity to check them as well.
Other and complementary ways of checking the checksums are possible of
course, like the offline checking tool that Magnus just submitted.
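
To make the idea concrete, here is a minimal sketch of checking every block of a buffer that is being read for the backup anyway. It is not the patch's code. The checksum function is a simple stand-in (FNV-1a folded to 16 bits) rather than PostgreSQL's own algorithm, and the names toy_checksum, count_bad_blocks and CHECKSUM_OFFSET, as well as the block size and the position of the stored checksum, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ          8192   /* assumed block size */
#define CHECKSUM_OFFSET 8      /* assumed offset of the stored 16-bit checksum */

/* Stand-in checksum (FNV-1a folded to 16 bits); NOT PostgreSQL's algorithm. */
static uint16_t
toy_checksum(const unsigned char *page, uint32_t blkno)
{
    /* Mix in the block number so a page copied to the wrong place mismatches */
    uint32_t h = 2166136261u ^ blkno;

    for (size_t i = 0; i < BLCKSZ; i++)
    {
        /* Skip the two bytes that hold the stored checksum itself */
        if (i == CHECKSUM_OFFSET || i == CHECKSUM_OFFSET + 1)
            continue;
        h = (h ^ page[i]) * 16777619u;
    }
    return (uint16_t) ((h >> 16) ^ (h & 0xffff));
}

/* Count the blocks in buf whose stored checksum does not match. */
static unsigned
count_bad_blocks(const unsigned char *buf, size_t len, uint32_t first_blkno)
{
    unsigned bad = 0;

    /* The loop only makes sense for a whole number of blocks */
    assert(len % BLCKSZ == 0);

    for (size_t off = 0; off < len; off += BLCKSZ)
    {
        uint16_t stored;
        uint32_t blkno = first_blkno + (uint32_t) (off / BLCKSZ);

        memcpy(&stored, buf + off + CHECKSUM_OFFSET, sizeof(stored));
        if (stored != toy_checksum(buf + off, blkno))
            bad++;
    }
    return bad;
}
```

Since the data is already in memory when it is sent, the extra work per block is one pass over its bytes.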

It probably makes sense to use the same approach for determining the
segment numbers as Magnus did in his patch, or refactor that out in a
utility function, but I'm sick right now so wanted to submit this for
v11 first.

I did some light benchmarking and it seems that the performance
degradation is minimal, but this could well be platform or
architecture-dependent. Right now, the checksums are always checked but
maybe this could be made optional, probably by extending the replication
protocol.

I think it should be.

I think it would also be a good idea to make this a three-mode setting: "no check", "check and warning", and "check and error". "Check and error" should be the default, but you could turn the error off for a "save whatever is left" mode. I think it's better if pg_basebackup simply fails on a checksum error, because that makes it glaringly obvious that there is a problem, which is the main point of checksums in the first place. And then there should be an option to turn the check off completely for cases where performance is what matters.
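
A rough sketch of how the three modes could behave per block, just to illustrate the semantics. The enum and function names here are hypothetical and nothing like them exists in the patch; a real implementation would report through the server's error machinery rather than stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names for the three proposed modes */
typedef enum ChecksumVerifyMode
{
    CHECKSUM_VERIFY_OFF,    /* "no check" */
    CHECKSUM_VERIFY_WARN,   /* "check and warning" */
    CHECKSUM_VERIFY_ERROR   /* "check and error", the proposed default */
} ChecksumVerifyMode;

/*
 * Decide what happens after one block has been looked at.
 * Returns true if the backup may continue, false if it has to fail.
 * checksum_ok is ignored when checking is switched off.
 */
static bool
handle_block(ChecksumVerifyMode mode, bool checksum_ok, unsigned blkno)
{
    if (mode == CHECKSUM_VERIFY_OFF || checksum_ok)
        return true;

    /* Only the two checking modes can get here */
    assert(mode == CHECKSUM_VERIFY_WARN || mode == CHECKSUM_VERIFY_ERROR);

    if (mode == CHECKSUM_VERIFY_WARN)
    {
        /* Salvage mode: report the problem but keep going */
        fprintf(stderr, "WARNING: checksum mismatch in block %u\n", blkno);
        return true;
    }

    /* Default: make the failure impossible to miss */
    fprintf(stderr, "ERROR: checksum mismatch in block %u\n", blkno);
    return false;
}
```

With this shape, the caller stops the backup as soon as handle_block returns false, and only the warning mode keeps sending data after a mismatch.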

Another quick note -- we need to assert that the size of the buffer is actually divisible by BLCKSZ. I don't think it's a common scenario, but it could break badly if somebody changes BLCKSZ. Either that, or perhaps just define TARSENDSIZE as a multiple of BLCKSZ.
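
Something along these lines would catch it at compile time. This is only a sketch: the two values are assumptions for illustration, the send-buffer constant is spelled TAR_SEND_SIZE here on the assumption that this is the macro meant, and complete_blocks is a made-up helper.

```c
#include <stddef.h>

/* Assumed values for illustration; BLCKSZ is really chosen at build time */
#define BLCKSZ        8192
#define TAR_SEND_SIZE 32768

/* The build fails if the send buffer cannot hold a whole number of blocks */
_Static_assert(TAR_SEND_SIZE % BLCKSZ == 0,
               "TAR_SEND_SIZE must be a multiple of BLCKSZ");

/*
 * Number of complete blocks in a read of len bytes. Any remainder is a
 * partial block, which must not be run through checksum verification.
 */
static size_t
complete_blocks(size_t len, size_t *leftover)
{
    *leftover = len % BLCKSZ;
    return len / BLCKSZ;
}
```

The second option from above would instead define the buffer size in terms of the block size, for example as a fixed number of blocks, so that the invariant holds by construction.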


