Обсуждение: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

Поиск
Список
Период
Сортировка

BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18084
Logged by:          Git Queries
Email address:      gitqueries0@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system:   Microsoft Windows Server 2019 Standard
Description:

Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
perform a sanity check on the migrated tables using source and destination
row counts. Occasionally, we encounter count mismatches where the source
database has fewer rows, resulting in a higher count, or more rows,
resulting in a lower count.

Note: There are no migration issues, as all data rows have been successfully
migrated without data loss.

ProdDB=> select count(*) from PdtDetailsTable;
  count
---------
 1297324
(1 row)


ProdDB=> reindex table PdtDetailsTable;
REINDEX
ProdDB=> select count(*) from PdtDetailsTable;
  count
---------
 1297322
(1 row)


ProdDB=>
Upon reindexing, the count(*) query returns the correct results in the
source DB.

In the pg_log, we couldn't find any traces related to this behavior. Is
there a reason for such behavior, and how can this be addressed to prevent
future issues?


Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
David Rowley
Дата:
On Tue, 5 Sept 2023 at 06:19, PG Bug reporting form
<noreply@postgresql.org> wrote:
> Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
> perform a sanity check on the migrated tables using source and destination
> row counts. Occasionally, we encounter count mismatches where the source
> database has fewer rows, resulting in a higher count, or more rows,
> resulting in a lower count.

Can we assume there are no concurrent changes being made?

> Upon reindexing, the count(*) query returns the correct results in the
> source DB.

Is it possible that the previous index was created with the
CONCURRENTLY option? There have been a few bugs in that area.  I've
not looked exhaustively, but see [1] and [2]. Search for "reindex".

> In the pg_log, we couldn't find any traces related to this behavior. Is
> there a reason for such behavior, and how can this be addressed to prevent
> future issues?

There have been a number of bugs fixed since 10.0 that could have led
to this.  The best thing to do to help prevent this is always upgrade
when a minor version is released and follow any relevant instructions
given in the release notes. If you skip minor versions, then you'll
need to look at all the release notes from the minor version you're
going from, up to the version you're upgrading to.

It looks like this practice wasn't being followed as you're on 10.20,
which was released 1.5 years ago.  Minor versions are released to
address/fix bugs that are found. If you don't upgrade minor versions
then you don't receive bug fixes.

David

[1] https://www.postgresql.org/docs/release/10.16/
[2] https://www.postgresql.org/docs/release/10.19/



Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
Bruce Momjian
Дата:
On Mon, Sep  4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      18084
> Logged by:          Git Queries
> Email address:      gitqueries0@gmail.com
> PostgreSQL version: Unsupported/Unknown
> Operating system:   Microsoft Windows Server 2019 Standard
> Description:        
> 
> Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit
> 
> Platform - Windows
> 
> Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
> perform a sanity check on the migrated tables using source and destination
> row counts. Occasionally, we encounter count mismatches where the source
> database has fewer rows, resulting in a higher count, or more rows,
> resulting in a lower count.
> 
> Note: There are no migration issues, as all data rows have been successfully
> migrated without data loss.

And how did you perform this migration?  pg_dump, pg_upgrade, logical
replication?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
Git Queries
Дата:
> Can we assume there are no concurrent changes being made?

Yes there are no concurrent changes being made during the migration.

> Is it possible that the previous index was created with the
> CONCURRENTLY option? There have been a few bugs in that area.  I've
> not looked exhaustively, but see [1] and [2]. Search for "reindex".

No, indexes are not created with the CONCURRENTLY option.

> It looks like this practice wasn't being followed as you're on 10.20,
> which was released 1.5 years ago.  Minor versions are released to
> address/fix bugs that are found. If you don't upgrade minor versions
> then you don't receive bug fixes.

Sure, we will take this into account.

On Tue, Sep 5, 2023 at 2:43 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 5 Sept 2023 at 06:19, PG Bug reporting form
<noreply@postgresql.org> wrote:
> Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
> perform a sanity check on the migrated tables using source and destination
> row counts. Occasionally, we encounter count mismatches where the source
> database has fewer rows, resulting in a higher count, or more rows,
> resulting in a lower count.

Can we assume there are no concurrent changes being made?

> Upon reindexing, the count(*) query returns the correct results in the
> source DB.

Is it possible that the previous index was created with the
CONCURRENTLY option? There have been a few bugs in that area.  I've
not looked exhaustively, but see [1] and [2]. Search for "reindex".

> In the pg_log, we couldn't find any traces related to this behavior. Is
> there a reason for such behavior, and how can this be addressed to prevent
> future issues?

There have been a number of bugs fixed since 10.0 that could have led
to this.  The best thing to do to help prevent this is always upgrade
when a minor version is released and follow any relevant instructions
given in the release notes. If you skip minor versions, then you'll
need to look at all the release notes from the minor version you're
going from, up to the version you're upgrading to.

It looks like this practice wasn't being followed as you're on 10.20,
which was released 1.5 years ago.  Minor versions are released to
address/fix bugs that are found. If you don't upgrade minor versions
then you don't receive bug fixes.

David

[1] https://www.postgresql.org/docs/release/10.16/
[2] https://www.postgresql.org/docs/release/10.19/

Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
Git Queries
Дата:
> And how did you perform this migration?  pg_dump, pg_upgrade, logical
> replication?

Migration was performed using pg_dump.

On Tue, Sep 5, 2023 at 8:48 PM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Sep  4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      18084
> Logged by:          Git Queries
> Email address:      gitqueries0@gmail.com
> PostgreSQL version: Unsupported/Unknown
> Operating system:   Microsoft Windows Server 2019 Standard
> Description:       
>
> Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit
>
> Platform - Windows
>
> Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
> perform a sanity check on the migrated tables using source and destination
> row counts. Occasionally, we encounter count mismatches where the source
> database has fewer rows, resulting in a higher count, or more rows,
> resulting in a lower count.
>
> Note: There are no migration issues, as all data rows have been successfully
> migrated without data loss.

And how did you perform this migration?  pg_dump, pg_upgrade, logical
replication?

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
Bruce Momjian
Дата:
On Wed, Sep  6, 2023 at 11:11:05PM +0530, Git Queries wrote:
> > And how did you perform this migration?  pg_dump, pg_upgrade, logical
> > replication?
> 
> Migration was performed using pg_dump.

Uh, that is kind of a surprise since pg_dump/restore does a logical dump
--- none of the binary files are transfered, so it should always be
accurate.

---------------------------------------------------------------------------


> 
> On Tue, Sep 5, 2023 at 8:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Mon, Sep  4, 2023 at 05:55:51PM +0000, PG Bug reporting form wrote:
>     > The following bug has been logged on the website:
>     >
>     > Bug reference:      18084
>     > Logged by:          Git Queries
>     > Email address:      gitqueries0@gmail.com
>     > PostgreSQL version: Unsupported/Unknown
>     > Operating system:   Microsoft Windows Server 2019 Standard
>     > Description:       
>     >
>     > Version - PostgreSQL 10.20, compiled by Visual C++ build 1800, 64-bit
>     >
>     > Platform - Windows
>     >
>     > Regarding PostgreSQL, during the migration from PG 10.20 to 14.7, we will
>     > perform a sanity check on the migrated tables using source and
>     destination
>     > row counts. Occasionally, we encounter count mismatches where the source
>     > database has fewer rows, resulting in a higher count, or more rows,
>     > resulting in a lower count.
>     >
>     > Note: There are no migration issues, as all data rows have been
>     successfully
>     > migrated without data loss.
> 
>     And how did you perform this migration?  pg_dump, pg_upgrade, logical
>     replication?
> 
>     --
>       Bruce Momjian  <bruce@momjian.us>        https://momjian.us
>       EDB                                      https://enterprisedb.com
> 
>       Only you can decide what is important to you.
> 

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
David Rowley
Дата:
On Thu, 7 Sept 2023 at 11:17, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Sep  6, 2023 at 11:11:05PM +0530, Git Queries wrote:
> > Migration was performed using pg_dump.
>
> Uh, that is kind of a surprise since pg_dump/restore does a logical dump
> --- none of the binary files are transfered, so it should always be
> accurate.

I can't be certain, but I think the likely reason here is that the
newly restored instance is fine and all indexes match the heap, but
it's the source instance that has indexes with missing entries. Since
the pg_dump queries are most likely to result in seqscans, then all
rows will be returned and the new instance has no missing rows.

David



Re: BUG #18084: Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions

От
Bruce Momjian
Дата:
On Thu, Sep  7, 2023 at 12:26:53PM +1200, David Rowley wrote:
> On Thu, 7 Sept 2023 at 11:17, Bruce Momjian <bruce@momjian.us> wrote:
> >
> > On Wed, Sep  6, 2023 at 11:11:05PM +0530, Git Queries wrote:
> > > Migration was performed using pg_dump.
> >
> > Uh, that is kind of a surprise since pg_dump/restore does a logical dump
> > --- none of the binary files are transfered, so it should always be
> > accurate.
> 
> I can't be certain, but I think the likely reason here is that the
> newly restored instance is fine and all indexes match the heap, but
> it's the source instance that has indexes with missing entries. Since
> the pg_dump queries are most likely to result in seqscans, then all
> rows will be returned and the new instance has no missing rows.

Yes, very good point.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.