Re: Proposal: Incremental Backup

From Stephen Frost
Subject Re: Proposal: Incremental Backup
Date
Msg-id 20140812232651.GG16422@tamriel.snowman.net
In reply to Re: Proposal: Incremental Backup  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: Proposal: Incremental Backup  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
Claudio,

* Claudio Freire (klaussfreire@gmail.com) wrote:
> I'm not talking about malicious attacks, with big enough data sets,
> checksum collisions are much more likely to happen than with smaller
> ones, and incremental backups are supposed to work for the big sets.

This is an issue when you're talking about de-duplication, not when
you're talking about testing if two files are the same or not for
incremental backup purposes.  The size of the overall data set in this
case is not relevant as you're only ever looking at the same (at most
1G) specific file in the PostgreSQL data directory.  Even if you were
able to actually produce a file whose checksum collides with that of an
existing PG file, the chance that you'd be able to construct one which
*also* has a page layout valid enough that it wouldn't be obviously,
massively corrupted is very quickly approaching zero.
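
To illustrate the per-file comparison described above, here is a minimal
sketch in Python.  It is not taken from any existing backup tool: the
helper names (file_digest, changed_files), the manifest format (a dict of
relative path to digest) and the choice of SHA-256 are all assumptions
made for the example.  The point is only that each file is compared
against its own previous digest, never against the rest of the data set.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(data_dir, previous):
    """Compare every file under data_dir with its own earlier digest.

    `previous` is a hypothetical manifest from the last backup:
    {relative_path: digest}.  Returns (changed_paths, new_manifest).
    """
    root = Path(data_dir)
    changed = []
    current = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        digest = file_digest(path)
        current[rel] = digest
        # A file is only ever compared with the same file's old digest.
        if previous.get(rel) != digest:
            changed.append(rel)
    return changed, current
```

A first run with an empty manifest reports every file as changed; later
runs pass the manifest returned by the previous run.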

> You could use strong cryptographic checksums, but such strong
> checksums still aren't perfect, and even if you accept the slim chance
> of collision, they are quite expensive to compute, so it's bound to be
> a bottleneck with good I/O subsystems. Checking the LSN is much
> cheaper.

For my 2c on this: I'm actually behind the idea of using the LSN (though
I have not followed this thread in any detail), but there are plenty of
existing incremental backup solutions (PG specific and not) which work
just fine by doing checksums.  If you truly feel that this is a real
concern, I'd suggest you review the rsync binary diff protocol, which is
used extensively around the world, and show reports of it failing in the
field.
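
For comparison, the LSN check mentioned above could look roughly like
the following sketch.  It assumes the default 8192-byte block size and
that the page LSN is stored in the first 8 bytes of each page as two
32-bit integers in the server's native byte order; the function names
and the `since_lsn` parameter are hypothetical.  A real tool would also
have to deal with new (all-zero) pages and with files being written
concurrently, which this sketch ignores.

```python
import struct

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def page_lsn(page):
    """Decode the page LSN from the first 8 bytes of a page.

    Assumes two native-endian uint32 values (high half, low half).
    """
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def blocks_changed_since(path, since_lsn, block_size=BLOCK_SIZE):
    """Return block numbers in `path` whose page LSN is above since_lsn."""
    changed = []
    with open(path, "rb") as f:
        blkno = 0
        while True:
            page = f.read(block_size)
            if len(page) < block_size:
                break  # end of file (or a partial trailing block)
            if page_lsn(page) > since_lsn:
                changed.append(blkno)
            blkno += 1
    return changed
```

Only the 8-byte header of each block is interpreted here, so no hash
over the block contents is computed.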
Thanks,
    Stephen
