Обсуждение: [MASSMAIL]incremental backup breakage in BlockRefTableEntryGetBlocks

Поиск
Список
Период
Сортировка

[MASSMAIL]incremental backup breakage in BlockRefTableEntryGetBlocks

От
Robert Haas
Дата:
Hi,

Yesterday, Tomas Vondra reported to me off-list that he was seeing
what appeared to be data corruption after taking and restoring an
incremental backup. Overnight, Jakub Wartak further experimented with
Tomas's test case, did some initial analysis, and made it very easy to
reproduce. I spent this morning tracking down the problem, for which I
attach a patch.

It doesn't take a whole lot to hit this bug. I think all you need is a
relation that is more than 1GB in size, plus a modification to the
second half of some 1GB file between the full backup and the
incremental backup, plus some way of realizing after you do a restore
that you've gotten the wrong answer. It's obviously quite embarrassing
that this wasn't caught sooner, and to be honest I'm not sure why we
didn't. I think the test cases pass because we avoid committing test
cases that create large relations, but the failure to catch it during
manual testing must be because I never worked hard enough to verify
that the results were fully correct. Ouch.

The actual bug is here:

if (chunkno == stop_chunkno - 1)
    stop_offset = stop_blkno % BLOCKS_PER_CHUNK;

Each chunk covers 64k block numbers. The caller specifies a range of
block numbers of interest via start_blkno (inclusive) and stop_blkno
(non-inclusive). We need to translate those start and stop values for
the overall function call into start and stop values for each
particular chunk. When we're on any chunk but the last one of
interest, the stop offset is BLOCKS_PER_CHUNK i.e. we care about
blocks all the way through the end of the chunk. The existing code
handles that fine. If stop_blkno is somewhere in the middle of the
last chunk, the existing code also handles that fine. But the code is
wrong when stop_blkno is a multiple of BLOCKS_PER_CHUNK, because it
then calculates a stop_offset of 0 (which is never right, because that
would mean that we thought the chunk was relevant when it isn't)
rather than BLOCKS_PER_CHUNK. So this forgets about 64k * BLCKSZ =
512MB of potentially modified blocks whenever the size of a single
relation file is exactly 512MB or exactly 1GB, which our users would
probably find more than slightly distressing.

I'm not posting Jakub's reproducer here because it was sent to me
privately and, although I presume he would have no issue with me
posting it, I can't actually confirm that right at the moment.
Hopefully he'll reply with it, and double-check that this does indeed
fix the issue. My thanks to both Tomas and Jakub.

Regards,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Вложения

Re: incremental backup breakage in BlockRefTableEntryGetBlocks

От
Tomas Vondra
Дата:
On 4/4/24 19:38, Robert Haas wrote:
> Hi,
> 
> Yesterday, Tomas Vondra reported to me off-list that he was seeing
> what appeared to be data corruption after taking and restoring an
> incremental backup. Overnight, Jakub Wartak further experimented with
> Tomas's test case, did some initial analysis, and made it very easy to
> reproduce. I spent this morning tracking down the problem, for which I
> attach a patch.
> 

Thanks, I can confirm this fixes the issue I've observed/reported. On
master 10 out of 10 runs failed, with the patch no failures.

The test is very simple:

1) init pgbench
2) full backup
3) run short pgbench
4) incremental backup
5) compare pg_dumpall on the instance vs. restored backup


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: incremental backup breakage in BlockRefTableEntryGetBlocks

От
Jakub Wartak
Дата:
On Thu, Apr 4, 2024 at 9:11 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 4/4/24 19:38, Robert Haas wrote:
> > Hi,
> >
> > Yesterday, Tomas Vondra reported to me off-list that he was seeing
> > what appeared to be data corruption after taking and restoring an
> > incremental backup. Overnight, Jakub Wartak further experimented with
> > Tomas's test case, did some initial analysis, and made it very easy to
> > reproduce. I spent this morning tracking down the problem, for which I
> > attach a patch.
> >
>
> Thanks, I can confirm this fixes the issue I've observed/reported. On
> master 10 out of 10 runs failed, with the patch no failures.

Same here, patch fixes it on recent master. I've also run pgbench for
~30mins and compared master and incremental and got 0 differences,
should be good.

> The test is very simple:
>
> 1) init pgbench

Tomas had magic fingers here - he used pgbench -i -s 100 which causes
bigger relations (it wouldn't trigger for smaller -s values as Robert
explained - now it makes full sense; in earlier tests I was using much
smaller -s , then transitioned to other workloads (mostly append
only), and final 100GB+/24h+ tests used mostly INSERTs rather than
UPDATEs AFAIR). The other interesting thing is that one of the animals
runs with configure --with-relsegsize=<somesmallvalue> (so new
relations are full much earlier) and it was not catched there either -
Wouldn't it be good idea to to test in src/test/recover/ like that?

And of course i'm attaching reproducer with some braindump notes in
case in future one hits similiar issue and wonders where to even start
looking (it's very primitive though but might help).

-J.

Вложения

Re: incremental backup breakage in BlockRefTableEntryGetBlocks

От
Robert Haas
Дата:
On Fri, Apr 5, 2024 at 2:59 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> And of course i'm attaching reproducer with some braindump notes in
> case in future one hits similiar issue and wonders where to even start
> looking (it's very primitive though but might help).

Thanks. I've committed the patch now.

--
Robert Haas
EDB: http://www.enterprisedb.com