Discussion: pg_restore scan
I'm trying to troubleshoot a slowness issue with pg_restore and stumbled across a recent post about pg_restore scanning the whole file:
> "scanning happens in a very inefficient way, with many seek calls and small block reads. Try strace to see them. This initial phase can take hours in a huge dump file, before even starting any actual restoration."
see: https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
I'm currently having the same issue.
At the early stage of the restore I can see lots of disk write activity, but as time goes by the writes taper off.
I can see the COPY processes in postgres, but they use no CPU; the processes that do use CPU are the pg_restore ones.
I can recreate this issue when restoring a specific table to stdout, i.e.:
pg_restore -vvvv -t <some_table_at_the> DB.pgdump -f -
If the table is at the bottom of the TOC, it takes hours before I get a result, but I get an almost immediate result when the table is at the top.
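The scanning should also be visible with strace, as the quoted post suggests; something like this (the lseek/read filter is just a guess at the relevant syscalls):

strace -c -e trace=lseek,read pg_restore -t <some_table_at_the> DB.pgdump -f - > /dev/null

With no offsets in the TOC, I'd expect the summary to be dominated by lseek calls and small reads.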
Parallel restore suffers from the same issue, where each process has to perform a scan for each table.
What is the best way to speed up the restore?
More info about my environment:
pg_restore (PostgreSQL) 17.6
Archive:
; Archive created at 2025-09-16 16:08:28 AEST
; dbname: DB
; TOC Entries: 8221
; Compression: none
; Dump Version: 1.14-0
; Format: CUSTOM
; Integer: 4 bytes
; Offset: 8 bytes
; Dumped from database version: 14.15
; Dumped by pg_dump version: 14.19 (Ubuntu 14.19-1.pgdg22.04+1)
On 9/16/25 15:25, R Wahyudi wrote:
> [...]

This was for pg_dump output that was streamed to a Borg archive and as a result had no object offsets in the TOC.

How are you doing your pg_dump?
pg_dump was done using the following command:
pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
On Wed, 17 Sept 2025 at 08:36, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> [...]
So, piping or redirecting to a file? If so, then that's the problem.
pg_dump directly to a file puts file offsets in the TOC.
This is how I do custom dumps:
cd $BackupDir
pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump 2> ${db}.log
On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi <rwahyudi@gmail.com> wrote:
> [...]
On 9/16/25 17:54, R Wahyudi wrote:
> pg_dump was done using the following command:
> pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>

What do you do with the output?
Sorry for not including the full command - yes, it's piping to a compression command:
| lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
I think we found the issue! I'll do further testing and see how it goes!
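If that's confirmed, the fix on our side should just be to let pg_dump write the file itself (so the TOC offsets get rewritten) and compress afterwards; a sketch:

pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database> -f DB.pgdump
lbzip2 -n <threadsforbzipgoeshere> --best DB.pgdump

(lbzip2 replaces DB.pgdump with DB.pgdump.bz2, so it would need decompressing back to a plain file before pg_restore can use it.)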
On Wed, 17 Sept 2025 at 11:02, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> [...]
PG 17 has integrated zstd compression, while --format=directory lets you do multi-threaded dumps. That's much faster than a single-threaded pg_dump into a multi-threaded compression program.
(If for _Reasons_ you require a single-file backup, then tar the directory of compressed files using the --remove-files option.)
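A sketch of that approach (the job counts are arbitrary, and --compress=zstd assumes a new enough pg_dump):

pg_dump -Fd --compress=zstd -j 8 -d <database> -f <dumpdir>
# optional single-file artifact (GNU tar):
tar -cf <dumpdir>.tar --remove-files <dumpdir>
# restore side: untar if needed, then restore in parallel
pg_restore -j 8 -d <database> <dumpdir>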
On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <rwahyudi@gmail.com> wrote:
> [...]
Hi All,
Thanks for the quick and accurate response! I've never been so happy to see iowait on my system!
I might be blind, but I can't find any information about 'offset' in the pg_dump documentation.
Where can I find more info about this?
Regards,
Rianto
On Wed, 17 Sept 2025 at 13:48, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> [...]
It's towards the end of this long mailing list thread from a couple of weeks ago.
On Thu, Sep 18, 2025 at 8:58 AM R Wahyudi <rwahyudi@gmail.com> wrote:
> [...]
On 9/18/25 05:58, R Wahyudi wrote:
> Thanks for the quick and accurate response! I've never been so happy
> to see iowait on my system!

Because? What did you find?

> I might be blind, but I can't find any information about 'offset' in the
> pg_dump documentation.
> Where can I find more info about this?

It is not in the user documentation.

From the thread Ron referred to, there is an explanation here:
https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us

I believe the actual code, for the -Fc format, is in pg_backup_custom.c here:
https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723

Per the comment at line 755:
"
If possible, re-write the TOC in order to update the data offset
information. This is not essential, as pg_restore can cope in most
cases without it; but it can make pg_restore significantly faster
in some situations (especially parallel restore). We can skip this
step if we're not dumping any data; there are no offsets to update
in that case.
"
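If you want to see the effect directly, repeat the single-table experiment from the start of this thread against a dump written straight to a file, and time it; a sketch (the table name is a placeholder):

pg_restore -l DB.pgdump | tail    # pick a table near the end of the TOC
time pg_restore -t <some_table> -f /dev/null DB.pgdump

With offsets in the TOC that should come back almost immediately; without them pg_restore has to scan most of the file first.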
I'm given a database dump file daily and have been asked to restore it.
I tried everything I could to speed up the process, including using -j 40.
I discovered that at the later stages of the restore process, the following behaviour repeated a few times:
40 x pg_restore processes at 100% CPU
40 x postgres processes doing COPY but using 0% CPU
..... and zero disk write activity
I don't see this behaviour when restoring the database that was dumped with -Fd.
Also with an un-piped backup file, I can restore a specific table without having to wait for hours.
On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> [...]
On 9/18/25 2:36 PM, R Wahyudi wrote:
> I discovered that at the later stages of the restore process, the following
> behaviour repeated a few times:
> 40 x pg_restore processes at 100% CPU
> 40 x postgres processes doing COPY but using 0% CPU
> ..... and zero disk write activity

From the docs:
https://www.postgresql.org/docs/current/app-pgrestore.html

"
-j number-of-jobs

Only the custom and directory archive formats are supported with this
option. The input must be a regular file or directory (not, for example,
a pipe or standard input). Also, multiple jobs cannot be used together
with the option --single-transaction.
"
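To make that concrete: decompressing gets you a regular file that -j will accept, but the restore is only fast if the dump itself was written with offsets (pg_dump ... -f file). A sketch, with assumed filenames:

# cannot work: -j requires a regular file, not a pipe
lbzip2 -dc DB.pgdump.bz2 | pg_restore -j 40 -d <database>
# works, though each worker still rescans the file when the TOC has no offsets
lbzip2 -d -k DB.pgdump.bz2 && pg_restore -j 40 -d <database> DB.pgdump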
>> The input must be a regular file or directory (not, for example, a pipe or standard input).
Thanks again for the pointer!
I successfully ran a parallel restore with no warnings.
I hadn't really paid attention to how the dump was taken until I stumbled upon your post.
Regards,
Rianto
On Fri, 19 Sept 2025 at 07:45, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> [...]
On Thu, Sep 18, 2025 at 5:37 PM R Wahyudi <rwahyudi@gmail.com> wrote:
> I'm given a database dump file daily and have been asked to restore it.
> I tried everything I could to speed up the process, including using -j 40.
> I discovered that at the later stages of the restore process, the following
> behaviour repeated a few times:
> 40 x pg_restore processes at 100% CPU
Threads are not magic. IO and memory limitations still exist.
> 40 x postgres processes doing COPY but using 0% CPU
> ..... and zero disk write activity
> I don't see this behaviour when restoring the database that was dumped with -Fd.
> Also with an un-piped backup file, I can restore a specific table without
> having to wait for hours.
We explained this three days ago. Heck, it's in this very email. Click on "the three dots", scroll down a bit.
> [...]