Discussion: pg_restore scan


pg_restore scan

From:
R Wahyudi
Date:

I'm trying to troubleshoot a slowness issue with pg_restore and stumbled across a recent post about pg_restore scanning the whole file:

> "scanning happens in a very inefficient way, with many seek calls and small block reads. Try strace to see them. This initial phase can take hours in a huge dump file, before even starting any actual restoration."

I'm currently having the same issue.

In the early stage of the restore I see a lot of disk-write activity, but as time goes by it tapers off.
I can see the COPY processes in postgres, but they use no CPU; the processes that do use CPU are the pg_restore workers.

I can recreate this issue when restoring a specific table to stdout, e.g.:

pg_restore -vvvv -t <some_table_at_the> DB.pgdump -f -

If the table is at the bottom of the TOC it takes hours before I get a result, but I get an almost immediate result when the table is at the top.
Parallel restore suffers from the same issue: each process has to perform a scan for each table.

What is the best way to speed up the restore?
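One way to confirm the scan phase the quoted post describes, sketched with placeholders (the command is only printed here, not executed, since it needs a real dump file; <table> and DB.pgdump stand in for your own names):

```shell
# Sketch: print the strace invocation that would count the lseek/read calls
# pg_restore makes while hunting for one table.  Placeholder names only.
dump="DB.pgdump"
cmd="strace -c -e trace=lseek,read pg_restore -t <table> -f - ${dump}"
echo "$cmd"
```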


More info about my environment : 
pg_restore (PostgreSQL) 17.6

Archive : 
; Archive created at 2025-09-16 16:08:28 AEST
;     dbname: DB
;     TOC Entries: 8221
;     Compression: none
;     Dump Version: 1.14-0
;     Format: CUSTOM
;     Integer: 4 bytes
;     Offset: 8 bytes
;     Dumped from database version: 14.15
;     Dumped by pg_dump version: 14.19 (Ubuntu 14.19-1.pgdg22.04+1)






Re: pg_restore scan

From:
Adrian Klaver
Date:
On 9/16/25 15:25, R Wahyudi wrote:
> 
> I'm trying to troubleshoot the slowness issue with pg_restore and 
> stumbled across a recent post about pg_restore scanning the whole file :
> 
>  > "scanning happens in a very inefficient way, with many seek calls and 
> small block reads. Try strace to see them. This initial phase can take 
> hours in a huge dump file, before even starting any actual restoration."
> see: https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net

This was for pg_dump output that was streamed to a Borg archive and as 
result had no object offsets in the TOC.

How are you doing your pg_dump?



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: pg_restore scan

From:
R Wahyudi
Date:
pg_dump was done using the following command:
pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>


Re: pg_restore scan

From:
Ron Johnson
Date:
So, you're piping or redirecting to a file?  If so, then that's the problem.

pg_dump writing directly to a file puts file offsets in the TOC.

This is how I do custom dumps:
cd $BackupDir
pg_dump -Fc --compress=zstd:long -v -d "${db}" -f "${db}.dump" 2> "${db}.log"
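A rough illustration of why seekability matters here: pg_dump writes the TOC near the start of the file and, when the output is seekable, goes back afterwards to patch in the data offsets. The "format" below is invented; only the write-then-patch-in-place pattern is the point, and it is exactly what a pipe makes impossible:

```shell
# Simulate: write a TOC placeholder plus data, then rewrite the placeholder
# in place -- possible on a regular file, impossible on a pipe.
f=$(mktemp)
printf 'OFFSETS-TBD:payload' > "$f"                         # placeholder TOC + data
printf 'OFFSETS-SET' | dd of="$f" conv=notrunc 2>/dev/null  # patch bytes 0-10 in place
patched=$(cat "$f")
echo "$patched"    # OFFSETS-SET:payload
rm -f "$f"
```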



--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: pg_restore scan

From:
Adrian Klaver
Date:
On 9/16/25 17:54, R Wahyudi wrote:
> pg_dump was done using the following command :
> pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>

What do you do with the output?







Re: pg_restore scan

From:
R Wahyudi
Date:
Sorry for not including the full command - yes, it's piping to a compression command:
 | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>


I think we found the issue! I'll do further testing and see how it goes !






Re: pg_restore scan

From:
Ron Johnson
Date:

PG 17 has integrated zstd compression, while --format=directory lets you do multi-threaded dumps.  That's much faster than a single-threaded pg_dump into a multi-threaded compression program.

(If for _Reasons_ you require a single-file backup, then tar the directory of compressed files using the --remove-files option.)
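A toy run of that tar step, with stand-in file names (GNU tar's --remove-files deletes each file once it has been archived, so the dump directory and the tarball never coexist in full; a real -Fd dump directory holds toc.dat plus one data file per table):

```shell
# Stand-in for a directory-format dump being rolled into a single archive.
work=$(mktemp -d)
mkdir "$work/dumpdir"
printf 'toc'  > "$work/dumpdir/toc.dat"    # stand-in for the directory-format TOC
printf 'rows' > "$work/dumpdir/3001.dat"   # stand-in for one table's data file
tar -C "$work" --remove-files -cf "$work/dump.tar" dumpdir
tar -tf "$work/dump.tar"                   # lists the archived entries
```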


Re: pg_restore scan

From:
R Wahyudi
Date:
Hi All,

Thanks for the quick and accurate responses!  I've never been so happy to see I/O wait on my system!

I might be blind, but I can't find any information about 'offset' in the pg_dump documentation.
Where can I find more info about this?

Regards,
Rianto


Re: pg_restore scan

From:
Ron Johnson
Date:

It's towards the end of this long mailing list thread from a couple of weeks ago.



Re: pg_restore scan

From:
Adrian Klaver
Date:
On 9/18/25 05:58, R Wahyudi wrote:
> Hi All,
> 
> Thanks for the quick and accurate response!  I never been so happy 
> seeing IOwait on my system!

Because?

What did you find?

> 
> I might be blind as  I can't find information about 'offset' in pg_dump 
> documentation.
> Where can I find more info about this?

It is not in the user documentation.

From the thread Ron referred to, there is an explanation here:

https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us

I believe the actual code, for the -Fc format, is in pg_backup_custom.c 
here:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723

Per comment at line 755:

"
  If possible, re-write the TOC in order to update the data offset 
information.  This is not essential, as pg_restore can cope in most
cases without it; but it can make pg_restore significantly faster
in some situations (especially parallel restore).  We can skip this
step if we're not dumping any data; there are no offsets to update
in that case.
"




Re: pg_restore scan

From:
R Wahyudi
Date:
I've been given a database dump file daily and I've been asked to restore it.
I tried everything I could to speed up the process, including using -j 40.

I discovered that at the later stages of the restore, the following behaviour repeated a few times:
40 pg_restore processes at 100% CPU,
40 postgres processes doing COPY but using 0% CPU,
..... and zero disk-write activity.

I don't see this behaviour when restoring a database that was dumped with -Fd.
Also, with an un-piped backup file I can restore a specific table without having to wait for hours.



Re: pg_restore scan

From:
Adrian Klaver
Date:

On 9/18/25 2:36 PM, R Wahyudi wrote:
> I've been given a database dump file daily and I've been asked to 
> restore it.
> I tried everything I could to speed up the process, including using -j 40.
> 
> I discovered that at the later stage of the restore process,  the 
> following behaviour repeated a few times :
> 40 x pg_restore process doing 100% CPU
> 40 x  postgres process doing COPY but using 0% CPU
> ..... and zero disk write activity
> 
> I don't see this behaviour when restoring the database that was dumped 
> with -Fd.
> Also with an un-piped backup file, I can restore a specific table 
> without having to wait for hours.

From the docs:

https://www.postgresql.org/docs/current/app-pgrestore.html

"
-j number-of-jobs

Only the custom and directory archive formats are supported with this 
option. The input must be a regular file or directory (not, for example, 
a pipe or standard input). Also, multiple jobs cannot be used together 
with the option --single-transaction.
"
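Once the dump lands as a regular file, a parallel restore along these lines should work; the command is assembled and printed rather than executed, and the database name and path are placeholders:

```shell
# Sketch: assemble the parallel-restore command for a seekable custom dump.
db=mydb; dump="/backups/${db}.pgdump"          # hypothetical names
restore_cmd="pg_restore -j 8 -d ${db} ${dump}" # -j needs a regular file, not a pipe
echo "$restore_cmd"
```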


> 
> 
> --
> 
> 
> 
> 
> 
> On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 9/18/25 05:58, R Wahyudi wrote:
>      > Hi All,
>      >
>      > Thanks for the quick and accurate response!  I never been so happy
>      > seeing IOwait on my system!
> 
>     Because?
> 
>     What did you find?
> 
>      >
>      > I might be blind as  I can't find information about 'offset' in
>     pg_dump
>      > documentation.
>      > Where can I find more info about this?
> 
>     It is not in the user documentation.
> 
>       From the thread Ron referred to, there is an explanation here:
> 
>     https://www.postgresql.org/message-
>     id/366773.1756749256%40sss.pgh.pa.us <https://www.postgresql.org/
>     message-id/366773.1756749256%40sss.pgh.pa.us>
> 
>     I believe the actual code, for the -Fc format, is in pg_backup_custom.c
>     here:
> 
>     https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/
>     pg_backup_custom.c#L723 <https://github.com/postgres/postgres/blob/
>     master/src/bin/pg_dump/pg_backup_custom.c#L723>
> 
>     Per comment at line 755:
> 
>     "
>        If possible, re-write the TOC in order to update the data offset
>     information.  This is not essential, as pg_restore can cope in most
>     cases without it; but it can make pg_restore significantly faster
>     in some situations (especially parallel restore).  We can skip this
>     step if we're not dumping any data; there are no offsets to update
>     in that case.
>     "
> 
>      >
>      > Regards,
>      > Rianto
>      >
>      > On Wed, 17 Sept 2025 at 13:48, Ron Johnson
>     <ronljohnsonjr@gmail.com <mailto:ronljohnsonjr@gmail.com>
>      > <mailto:ronljohnsonjr@gmail.com
>     <mailto:ronljohnsonjr@gmail.com>>> wrote:
>      >
>      >
>      >     PG 17 has integrated zstd compression, while --
>     format=directory lets
>      >     you do multi-threaded dumps.  That's much faster than a single-
>      >     threaded pg_dump into a multi-threaded compression program.
>      >
>      >     (If for _Reasons_ you require a single-file backup, then tar the
>      >     directory of compressed files using the --remove-files option.)
>      >
>      >     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi
>     <rwahyudi@gmail.com <mailto:rwahyudi@gmail.com>
>      >     <mailto:rwahyudi@gmail.com <mailto:rwahyudi@gmail.com>>> wrote:
>      >
>      >         Sorry for not including the full command - yes , its
>     piping to a
>      >         compression command :
>      >           | lbzip2 -n <threadsforbzipgoeshere>--best >
>     <filenamegoeshere>
>      >
>      >
>      >         I think we found the issue! I'll do further testing and
>     see how
>      >         it goes !
>      >
>      >
>      >
>      >
>      >
>      >         On Wed, 17 Sept 2025 at 11:02, Ron Johnson
>      >         <ronljohnsonjr@gmail.com <mailto:ronljohnsonjr@gmail.com>
>     <mailto:ronljohnsonjr@gmail.com <mailto:ronljohnsonjr@gmail.com>>>
>     wrote:
>      >
>      >             So, piping or redirecting to a file?  If so, then
>     that's the
>      >             problem.
>      >
>      >             pg_dump directly to a file puts file offsets in the TOC.
>      >
>      >             This how I do custom dumps:
>      >             cd $BackupDir
>      >             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump
>      >               2> ${db}.log
>      >
>      >             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi
>      >             <rwahyudi@gmail.com <mailto:rwahyudi@gmail.com>
>     <mailto:rwahyudi@gmail.com <mailto:rwahyudi@gmail.com>>> wrote:
>      >
>      >                 pg_dump was done using the following command :
>      >                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>      >
>      >                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver
>      >                 <adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>
>      >                 <mailto:adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>> wrote:
>      >
>      >                     On 9/16/25 15:25, R Wahyudi wrote:
>      >                      >
>      >                      > I'm trying to troubleshoot the slowness issue
>      >                     with pg_restore and
>      >                      > stumbled across a recent post about pg_restore
>      >                     scanning the whole file :
>      >                      >
>      >                      >  > "scanning happens in a very inefficient
>     way,
>      >                     with many seek calls and
>      >                      > small block reads. Try strace to see them.
>     This
>      >                     initial phase can take
>      >                      > hours in a huge dump file, before even
>     starting
>      >                     any actual restoration."
>      >                      > see : https://www.postgresql.org/message-
>     id/ <https://www.postgresql.org/message-id/>
>      >                     E48B611D-7D61-4575-A820- <https://
>      > www.postgresql.org/message-id/E48B611D-7D61-4575-A820- <http://
>     www.postgresql.org/message-id/E48B611D-7D61-4575-A820->>
>      >                      > B2C3EC2E0551%40gmx.net <http://40gmx.net>
>     <http://40gmx.net <http://40gmx.net>>
>      >                     <https://www.postgresql.org/message-id/
>     <https://www.postgresql.org/message-id/> <https://
>      > www.postgresql.org/message-id/ <http://www.postgresql.org/
>     message-id/>>
>      >                      > E48B611D-7D61-4575-A820-
>     B2C3EC2E0551%40gmx.net <http://40gmx.net>
>      >                     <http://40gmx.net <http://40gmx.net>>>
>      >
>      >                     This was for pg_dump output that was streamed
>     to a
>      >                     Borg archive and as
>      >                     result had no object offsets in the TOC.
>      >
>      >                     How are you doing your pg_dump?
>      >
>      >
>      >
>      >                     --
>      >                     Adrian Klaver
>      >                     adrian.klaver@aklaver.com
>      >
>      >
>      >
>      >             --
>      >             Death to <Redacted>, and butter sauce.
>      >             Don't boil me, I'm still alive.
>      >             <Redacted> lobster!
>      >
>      >
>      >
>      >     --
>      >     Death to <Redacted>, and butter sauce.
>      >     Don't boil me, I'm still alive.
>      >     <Redacted> lobster!
>      >
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_restore scan

От
R Wahyudi
Дата:
>> The input must be a regular file or directory (not, for example, a pipe or standard input). 

Thanks again for the pointer! 

I successfully ran a parallel restore with no warnings.
I hadn't paid attention to how the dump was taken until I stumbled upon your post.
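For anyone who lands here with the same symptom, the difference comes down to whether pg_dump can seek on its output. A sketch of the two dump styles (database and file names are placeholders; these need a live server to run):

```shell
# Piped dump: pg_dump cannot seek back on a pipe, so the archive's
# TOC is written without data offsets and pg_restore must scan the
# whole file to locate each table's data.
pg_dump -Fc -Z 0 -d mydb | lbzip2 --best > mydb.pgdump.bz2

# Direct-to-file dump: when the dump finishes, pg_dump seeks back
# and rewrites the TOC with offsets, so pg_restore can jump
# straight to any table.
pg_dump -Fc --compress=zstd:long -d mydb -f mydb.dump
```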


Regards,
Rianto




On Fri, 19 Sept 2025 at 07:45, Adrian Klaver <adrian.klaver@aklaver.com> wrote:


On 9/18/25 2:36 PM, R Wahyudi wrote:
> I've been given a database dump file daily and I've been asked to
> restore it.
> I tried everything I could to speed up the process, including using -j 40.
>
> I discovered that at the later stage of the restore process,  the
> following behaviour repeated a few times :
> 40 x pg_restore process doing 100% CPU
> 40 x  postgres process doing COPY but using 0% CPU
> ..... and zero disk write activity
>
> I don't see this behaviour when restoring the database that was dumped
> with -Fd.
> Also with an un-piped backup file, I can restore a specific table
> without having to wait for hours.

From the docs:

https://www.postgresql.org/docs/current/app-pgrestore.html

"
-j number-of-jobs

Only the custom and directory archive formats are supported with this
option. The input must be a regular file or directory (not, for example,
a pipe or standard input). Also, multiple jobs cannot be used together
with the option --single-transaction.
"
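A minimal invocation that satisfies those constraints (names are placeholders; both dump and restore operate on a regular file or directory, never a pipe, and these commands need a live server):

```shell
# Directory-format dump, itself parallelised across 8 jobs.
pg_dump -Fd -j 8 -d mydb -f mydb_dir

# Parallel restore from the directory: no pipe, no stdin, and
# no --single-transaction, as the docs require.
pg_restore -j 8 -d mydb_restored mydb_dir
```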


>
>
> --
>
>
>
>
>
> On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
>     On 9/18/25 05:58, R Wahyudi wrote:
>      > Hi All,
>      >
>      > Thanks for the quick and accurate response!  I never been so happy
>      > seeing IOwait on my system!
>
>     Because?
>
>     What did you find?
>
>      >
>      > I might be blind as  I can't find information about 'offset' in
>     pg_dump
>      > documentation.
>      > Where can I find more info about this?
>
>     It is not in the user documentation.
>
>     From the thread Ron referred to, there is an explanation here:
>
>     https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us
>
>     I believe the actual code, for the -Fc format, is in pg_backup_custom.c
>     here:
>
>     https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723
>
>     Per comment at line 755:
>
>     "
>        If possible, re-write the TOC in order to update the data offset
>     information.  This is not essential, as pg_restore can cope in most
>     cases without it; but it can make pg_restore significantly faster
>     in some situations (especially parallel restore).  We can skip this
>     step if we're not dumping any data; there are no offsets to update
>     in that case.
>     "
>
>      >
>      > Regards,
>      > Rianto
>      >
>      > On Wed, 17 Sept 2025 at 13:48, Ron Johnson
>     <ronljohnsonjr@gmail.com> wrote:
>      >
>      >
>      >     PG 17 has integrated zstd compression, while --
>     format=directory lets
>      >     you do multi-threaded dumps.  That's much faster than a single-
>      >     threaded pg_dump into a multi-threaded compression program.
>      >
>      >     (If for _Reasons_ you require a single-file backup, then tar the
>      >     directory of compressed files using the --remove-files option.)
>      >
>      >     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi
>     <rwahyudi@gmail.com> wrote:
>      >
>      >         Sorry for not including the full command - yes , its
>     piping to a
>      >         compression command :
>      >           | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
>      >
>      >
>      >         I think we found the issue! I'll do further testing and
>     see how
>      >         it goes !
>      >
>      >
>      >
>      >
>      >
>      >         On Wed, 17 Sept 2025 at 11:02, Ron Johnson
>      >         <ronljohnsonjr@gmail.com> wrote:
>      >
>      >             So, piping or redirecting to a file?  If so, then
>     that's the
>      >             problem.
>      >
>      >             pg_dump directly to a file puts file offsets in the TOC.
>      >
>      >             This how I do custom dumps:
>      >             cd $BackupDir
>      >             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump
>      >               2> ${db}.log
>      >
>      >             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi
>      >             <rwahyudi@gmail.com> wrote:
>      >
>      >                 pg_dump was done using the following command :
>      >                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>      >
>      >                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver
>      >                 <adrian.klaver@aklaver.com> wrote:
>      >
>      >                     On 9/16/25 15:25, R Wahyudi wrote:
>      >                      >
>      >                      > I'm trying to troubleshoot the slowness issue
>      >                     with pg_restore and
>      >                      > stumbled across a recent post about pg_restore
>      >                     scanning the whole file :
>      >                      >
>      >                      >  > "scanning happens in a very inefficient
>     way,
>      >                     with many seek calls and
>      >                      > small block reads. Try strace to see them.
>     This
>      >                     initial phase can take
>      >                      > hours in a huge dump file, before even
>     starting
>      >                     any actual restoration."
>      >                      > see : https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
>      >
>      >                     This was for pg_dump output that was streamed
>     to a
>      >                     Borg archive and as
>      >                     result had no object offsets in the TOC.
>      >
>      >                     How are you doing your pg_dump?
>      >
>      >
>      >
>      >                     --
>      >                     Adrian Klaver
>      >                     adrian.klaver@aklaver.com
>      >
>      >
>      >
>      >             --
>      >             Death to <Redacted>, and butter sauce.
>      >             Don't boil me, I'm still alive.
>      >             <Redacted> lobster!
>      >
>      >
>      >
>      >     --
>      >     Death to <Redacted>, and butter sauce.
>      >     Don't boil me, I'm still alive.
>      >     <Redacted> lobster!
>      >
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: pg_restore scan

От
Ron Johnson
Дата:
On Thu, Sep 18, 2025 at 5:37 PM R Wahyudi <rwahyudi@gmail.com> wrote:
I've been given a database dump file daily and I've been asked to restore it. 
I tried everything I could to speed up the process, including using -j 40. 

I discovered that at the later stage of the restore process,  the following behaviour repeated a few times : 
40 x pg_restore process doing 100% CPU

Threads are not magic.  IO and memory limitations still exist.
 
40 x  postgres process doing COPY but using 0% CPU 
..... and zero disk write activity

I don't see this behaviour when restoring the database that was dumped with -Fd.
Also with an un-piped backup file, I can restore a specific table without having to wait for hours. 

We explained this three days ago.  Heck, it's in this very email.   Click on "the three dots", scroll down a bit.
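The scan-versus-seek cost is easy to reproduce with plain file tools; this is a generic illustration of the access pattern, not pg_restore itself:

```shell
# Stand-in for a 20 MB archive file.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=20 status=none

# With a TOC offset: seek past 19 MB and read only the last 1 MB.
with_offset=$(dd if="$f" bs=1M skip=19 count=1 status=none | wc -c)

# Without offsets: everything before the target must be read too.
scanned=$(dd if="$f" bs=1M count=20 status=none | wc -c)

echo "seek read: $with_offset bytes; scan read: $scanned bytes"
rm -f "$f"
```

Multiply that by one scan per table per worker and the hours-long stall with zero write activity follows.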
 
On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 9/18/25 05:58, R Wahyudi wrote:
> Hi All,
>
> Thanks for the quick and accurate response!  I never been so happy
> seeing IOwait on my system!

Because?

What did you find?

>
> I might be blind as  I can't find information about 'offset' in pg_dump
> documentation.
> Where can I find more info about this?

It is not in the user documentation.

From the thread Ron referred to, there is an explanation here:

https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us

I believe the actual code, for the -Fc format, is in pg_backup_custom.c
here:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723

Per comment at line 755:

"
  If possible, re-write the TOC in order to update the data offset
information.  This is not essential, as pg_restore can cope in most
cases without it; but it can make pg_restore significantly faster
in some situations (especially parallel restore).  We can skip this
step if we're not dumping any data; there are no offsets to update
in that case.
"

>
> Regards,
> Rianto
>
> On Wed, 17 Sept 2025 at 13:48, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
>
>     PG 17 has integrated zstd compression, while --format=directory lets
>     you do multi-threaded dumps.  That's much faster than a single-
>     threaded pg_dump into a multi-threaded compression program.
>
>     (If for _Reasons_ you require a single-file backup, then tar the
>     directory of compressed files using the --remove-files option.)
>
>     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <rwahyudi@gmail.com> wrote:
>
>         Sorry for not including the full command - yes , its piping to a
>         compression command :
>           | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
>
>
>         I think we found the issue! I'll do further testing and see how
>         it goes !
>
>
>
>
>
>         On Wed, 17 Sept 2025 at 11:02, Ron Johnson
>         <ronljohnsonjr@gmail.com> wrote:
>
>             So, piping or redirecting to a file?  If so, then that's the
>             problem.
>
>             pg_dump directly to a file puts file offsets in the TOC.
>
>             This how I do custom dumps:
>             cd $BackupDir
>             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump
>               2> ${db}.log
>
>             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi
>             <rwahyudi@gmail.com> wrote:
>
>                 pg_dump was done using the following command :
>                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>
>                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver
>                 <adrian.klaver@aklaver.com> wrote:
>
>                     On 9/16/25 15:25, R Wahyudi wrote:
>                      >
>                      > I'm trying to troubleshoot the slowness issue
>                     with pg_restore and
>                      > stumbled across a recent post about pg_restore
>                     scanning the whole file :
>                      >
>                      >  > "scanning happens in a very inefficient way,
>                     with many seek calls and
>                      > small block reads. Try strace to see them. This
>                     initial phase can take
>                      > hours in a huge dump file, before even starting
>                     any actual restoration."
>                      > see : https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
>
>                     This was for pg_dump output that was streamed to a
>                     Borg archive and as
>                     result had no object offsets in the TOC.
>
>                     How are you doing your pg_dump?
>
>
>
>                     --
>                     Adrian Klaver
>                     adrian.klaver@aklaver.com
>
>
>
>             --
>             Death to <Redacted>, and butter sauce.
>             Don't boil me, I'm still alive.
>             <Redacted> lobster!
>
>
>
>     --
>     Death to <Redacted>, and butter sauce.
>     Don't boil me, I'm still alive.
>     <Redacted> lobster!
>


--
Adrian Klaver
adrian.klaver@aklaver.com


--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!