Discussion: Adding pg_dump flag for parallel export to pipes


Adding pg_dump flag for parallel export to pipes

From:
Nitin Motiani
Date:
Hi Hackers,

We are proposing the ability to specify a pipe command to pg_dump via a
new flag, and are attaching the patch set.

Why: Currently it is quite simple to pipe the output of pg_dump in
text format to another command at the command line and do any
manipulation necessary. For example:

       pg_dump <flags> <dbname> | lz4 | pv -L 10k | ssh remote.host
"cat - > remote.dump.lz4"

Here we first compress the stream using lz4 and then send it over ssh
to a remote host to be saved as a file while rate-limiting the network
usage to 10KB/s.
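The same producer-filter-consumer shape can be tried locally without a remote host; this minimal stand-in uses gzip in place of lz4 and drops the pv and ssh stages:

```bash
# Compress a stream, decompress it, and count the lines that survive
# the round trip -- the same pipeline shape as the pg_dump example above.
lines=$(printf 'row1\nrow2\n' | gzip -c | gunzip -c | wc -l | tr -d ' ')
echo "round-tripped lines: $lines"
```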

Something like this is not possible for format=directory (-Fd), since
all you can provide is the directory name for storing the individual
files. Note that this is impossible regardless of whether the parallel
dump option ('--jobs' flag) is used.

While the directory format supports compression via a flag, the rest
of the operations in the above example are not possible. A pipe
command also provides more flexibility in which compression algorithm
one wants to use.

This patch set gives pg_dump the ability to pipe the data in
directory mode via a new flag, '--pipe-command' (in both parallel
and non-parallel modes).

We also add a similar option to pg_restore.

The major use cases of these changes are:
  1. Stream pg_dump output to a cloud storage
  2. SSH the data to a remote host (with or without throttling)
  3. Custom compression options


Usage Examples: Here is an example of what the pipe-command looks like:

     pg_dump -Fd mydb --pipe-command="cat > dumpdir/%f" (dumpdir
should exist beforehand.)

This is equivalent to

     pg_dump -Fd mydb --file=dumpdir

(Please note that the flags '--file' and '--pipe-command' can't be used
together.)

For the more complex scenario mentioned above, the command will be
(with a parallelism of 5):

      pg_dump -Fd mydb -j 5 --pipe-command="lz4 | pv -L 10k | ssh
remote.host 'cat > dumpdir/%f'"

Please note the use of %f in the above examples. As a user would
almost always want to write the post-processing output to a file (or
perhaps a cloud location), we provide a format specifier %f in the
command. The implementation of pipe-command replaces these format
specifiers with the corresponding file names. These file names are the
same as they would be in the current usage of directory format with
'--file' flag (<dump_id>.dat, toc.dat, blob_NNN.toc,
blob_<blob_id>.dat).
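As a rough illustration (the real substitution happens in C via replace_percent_placeholders; this shell sketch is only an analogy, and the file name is a made-up example):

```bash
# Expand the %f placeholder in a pipe-command template for one of the
# per-table data files, the way pg_dump does internally.
template='cat > dumpdir/%f'
file='3104.dat'   # hypothetical <dump_id>.dat name
expanded=$(printf '%s\n' "$template" | sed "s|%f|$file|g")
echo "$expanded"   # -> cat > dumpdir/3104.dat
```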

The usage of this flag with pg_restore is similar. Here is an example
of restoring from a gzip-compressed dump directory:

        pg_restore -C -Fd -d postgres --pipe-command="cat dumpdir/%f.gz | gunzip"

The new flag in pg_restore also works with the '-l' and '-L' options:

        pg_restore -C -Fd -d postgres --pipe-command="cat dumpdir/%f" -L db.list


Implementation Details: Here are the major changes:
   1. We reuse the variables that store the file name to also store
the pipe command, and add a new bool fSpecIsPipe in _archiveHandle
(with similar bools in pg_dump.c and pg_restore.c) to indicate that
it is a pipe command.
   2. When that bool is set to true, we use popen and pclose instead
of fopen and fclose.
   3. To enable the format specifier %f in the pipe-command, we make
changes to the file name creation logic in a few places. Currently the
file name (corresponding to a table or large object) is appended to
the directory name provided by the '--file' flag. In the case of
'--pipe-command', we use 'replace_percent_placeholders' to replace %f
with the corresponding file name. This change is made for both table
files and LO TOC files.

With these core changes, the rest of the code continues working as-is.

We are attaching 4 patches for this change:

  001-pg_dump_pipe has the pg_dump pipe support code.
  002-pg_restore_pipe has the pg_restore pipe support.
  003-pg_dump_basic_tests has a few basic validation tests for
correct flag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some environment-setup
issues that cause certain pipe commands to leave the shell process
defunct. The same commands work fine in manual testing. We are still
looking into this.
  004-pg_dump_documentation has the proposed documentation changes.

We are working on the above test issues and cleanup of the patches.

Open Questions: There are a couple of open questions in the implementation:

     1. Currently the LO TOC file (blob_NNN.toc) is opened in append
mode. This is not possible with popen for the pipe command. From
reading the code, it seems to us that this file doesn't need to be
opened in append mode: since '_StartLOs' is called once per archive
entry in WriteDataChunksForTocEntry, followed by the dumper function
and then '_EndLOs', it should be okay to change this to 'w' mode. But
this code has been there since the start, so we haven't made that
change yet. In the patch, we have changed it to 'w' mode for the
pipe-command case only, and added ideas for potential solutions in
the comments.
     2. We are also not sure yet how to handle the environment
issues when adding new tests to 002_pg_dump.pl.

Please let us know what you think.

Thanks & Regards,
Nitin Motiani
Google

Attachments

Re: Adding pg_dump flag for parallel export to pipes

From:
Hannu Krosing
Date:

Just to bring this out separately: does anybody have any idea why pipe commands fail inside tests?

Re: 003-pg_dump_basic_tests has a few basic validation tests for
correct flag combinations. We need to write more automated tests in
002_pg_dump.pl but have been running into some issues with environment
setup due to which certain pipe commands result in the shell process
becoming defunct. These same commands are working fine in manual
testing. We are still looking into this.

----
Hannu



Re: Adding pg_dump flag for parallel export to pipes

From:
Hannu Krosing
Date:
If there are no objections, we will add this to the commitfest.




Re: Adding pg_dump flag for parallel export to pipes

From:
Thomas Munro
Date:
On Tue, Apr 8, 2025 at 7:48 AM Hannu Krosing <hannuk@google.com> wrote:
> Just to bring this out separately: does anybody have any idea why pipe commands fail inside tests?
>
> Re: 003-pg_dump_basic_tests has a few basic validation tests for
> correct flag combinations. We need to write more automated tests in
> 002_pg_dump.pl but have been running into some issues with environment
> setup due to which certain pipe commands result in the shell process
> becoming defunct. These same commands are working fine in manual
> testing. We are still looking into this.

No comment on the wider project except that it looks generally useful,
and I can see that it's not possible to use the conventional POSIX
filename "-" to represent stdout, because you need to write to
multiple files so you need to come up with *something* along the lines
you're proposing here.  But I was interested in seeing if I could help
with that technical problem you mentioned above, and I don't see that
happening with the current patches.  Do I understand correctly that
the problem you encountered is in some other tests that you haven't
attached yet?  Could you post what you have so that others can see the
problem and perhaps have a chance of helping?  I also recommend using
git format-patch when you post patches so that you have a place to
write a commit message including a note about which bits are WIP and
known not to work correctly yet.



Re: Adding pg_dump flag for parallel export to pipes

From:
Nitin Motiani
Date:
Thanks for the feedback, Thomas.

> No comment on the wider project except that it looks generally useful,
> and I can see that it's not possible to use the conventional POSIX
> filename "-" to represent stdout, because you need to write to
> multiple files so you need to come up with *something* along the lines
> you're proposing here.  But I was interested in seeing if I could help
> with that technical problem you mentioned above, and I don't see that
> happening with the current patches.  Do I understand correctly that
> the problem you encountered is in some other tests that you haven't
> attached yet?  Could you post what you have so that others can see the
> problem and perhaps have a chance of helping?

Yes, we didn't add the failed tests to the patch. We'll add those and
send new patches.

> I also recommend using
> git format-patch when you post patches so that you have a place to
> write a commit message including a note about which bits are WIP and
> known not to work correctly yet.

Will follow these recommendations when sending the next set of patches.

Regards,
Nitin Motiani
Google



Re: Adding pg_dump flag for parallel export to pipes

From:
Nitin Motiani
Date:
Hi,

Apologies for the delay on this thread.

On Mon, Apr 28, 2025 at 1:52 PM Nitin Motiani <nitinmotiani@google.com> wrote:
>
> Thanks for the feedback, Thomas.
>
> > Do I understand correctly that
> > the problem you encountered is in some other tests that you haven't
> > attached yet?  Could you post what you have so that others can see the
> > problem and perhaps have a chance of helping?
>
> Yes, we didn't add the failed tests to the patch. We'll add those and
> send new patches.
>

I'm attaching the patch files generated using git format-patch.

0001 has the pg_dump pipe support code.
0002 has the pg_restore pipe support.
0003 has a few basic validation tests for correct flag combinations.
0004 has the proposed documentation changes.

The above 4 are the same as before.

The 0005 patch is the new WIP patch file. This includes the tests
which we have been trying to add but which are failing (although the
same commands run fine manually).

The tests in this patch are added to src/bin/pg_dump/t/002_pg_dump.pl.
The original attempt was to have a test case with dump and restore
commands using the new flag and run it in multiple scenarios. But
since that was failing, for ease of debugging I added a few
standalone tests which just run pg_dump with the pipe-command flag.
In these tests, if the pipe-command is a simple command like 'cat' or
'gzip', the test passes. But if the pipe-command itself uses a pipe
(either to a file or another command), the test fails.

In the following test

 ['pg_dump', '-Fd', '-B', 'postgres', "--pipe-command=\"cat > $tempdir/%f\"",],]

I get the below error.

# 'sh: line 1: cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/tmp_check/tmp_test_XpFO/toc.dat:
No such file or directory

I can see that the temp directory tmp_test_XpFO exists. Even when I
changed the test to use an absolute path to an existing directory, I
got the same error. When I do manual testing with the same
pipe-command, it works fine. That is why we think there is some issue
with our environment setup for the TAP test, where the command is not
being parsed correctly.
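One hypothesis worth checking (an assumption on our side, not a confirmed diagnosis): if the escaped quotes in the Perl array survive into the argument, the spawned shell receives a string that still carries literal double quotes and treats the entire quoted string as a single command name, which yields exactly this kind of 'No such file or directory' error naming the whole 'cat > path' string:

```bash
# Literal double quotes inside the command string make sh treat
# `cat > /tmp/pipe_demo_out.txt` as ONE command name instead of a
# redirection, mimicking the TAP-test failure above.
sh -c '"cat > /tmp/pipe_demo_out.txt"' 2>/dev/null
status=$?
echo "exit status: $status"   # nonzero: command not found
```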

I also ran the following loop (started just before starting the test
run) to print the output of ps commands around 'cat >' to see what
happens.

 for i in $(seq 1 10000); do ps --forest -ef | grep "cat >" -A 5 >>
~/ps_output.txt; done

The printed results showed that the child process with the pipe
command became defunct.

 nitinmo+ 3180211 3180160  5 17:05 pts/1    00:00:00  |   |
   \_ /usr/local/google/home/nitinmotiani/postgresql/tmp_install/usr/local/pgsql/bin/pg_dump
-Fd -B p     ostgres --pipe-command="cat >
/usr/local/google/home/nitinmotiani/postgresql/src/bin/pg_dump/definite_dumpdir/%f"
 nitinmo+ 3180215 3180211  0 17:05 pts/1    00:00:00  |   |
       \_ [sh] <defunct>

We are not sure how to handle this issue. Please let us know your thoughts.

Thanks & Regards,
Nitin Motiani
Google

Attachments

Re: Adding pg_dump flag for parallel export to pipes

From:
Hannu Krosing
Date:
I have added this to the commitfest.

We would be grateful for any reviews and feedback on this.

When adding it to the commitfest I tried to put Nitin as "first
author", since he has done the bulk of the work (I did just a quick
pg_dump-only PoC), but it looks like the commitfest app just orders
all provided authors alphabetically.



Re: Adding pg_dump flag for parallel export to pipes

From:
Andrew Jackson
Date:
Hi,

Very interesting patch. One question: is it possible with this patch to pipe pg_dump directory output directly into
pg_restore? Looking at the code I don't believe that is the case, but figured I would ask.

Thanks,
Andrew Jackson

Re: Adding pg_dump flag for parallel export to pipes

From:
Andrew Jackson
Date:
Hi,

Went ahead and experimented with your patch a bit. To answer my previous question, this patch can be used to pipe
pg_dump directly into pg_restore. This should absolutely be added as another use case to your list above, as it is a
well-known limitation that pg_dump/psql can do a buffered copy but only with a single process, while
pg_dump/pg_restore is capable of a multiprocess copy but the dump must be saved to disk in its entirety before the
restore can begin. This is extremely frustrating when dealing with large databases, both because you don't want
multiple copies saved on disk and because it's not as fast as it could be. With this patch you can get the best of
both worlds.

Example dump
```bash
pg_dump --jobs=4 -Fd "${connection_str}" --pipe-command="mkfifo dumpdir/%f; cat >> dumpdir/%f"
```

Example restore run in different process
```bash
pg_restore --jobs=4 -Fd --dbname="${another_connection_str}" ./dumpdir
```
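The fifo hand-off underlying this trick can be sketched without pg_dump at all: the writer blocks on the named pipe until a reader opens it, so the data streams between the two processes without the full dump ever landing on disk.

```bash
# Minimal producer/consumer over a named pipe -- the mechanism the
# mkfifo pipe-command above relies on.
dir=$(mktemp -d)
mkfifo "$dir/chunk"
# Producer writes into the fifo in the background (blocks until read).
printf 'streamed without touching disk\n' > "$dir/chunk" &
# Consumer reads the stream back out.
received=$(cat "$dir/chunk")
wait
rm -r "$dir"
echo "$received"
```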
Thanks,
Andrew Jackson

Re: Adding pg_dump flag for parallel export to pipes

From:
Dilip Kumar
Date:

The latest patch set does not apply on HEAD; can you rebase it? Also,
there are many TODOs in the patch. If those TODOs are just
nice-to-haves planned for future development, it would be better to
remove them. On the other hand, if some of them must be done before
the patch can be committed, are you planning to work on them soon? I
am planning to review this patch, so could you send a rebased version
implementing the TODOs that are required for the first version?

--
Regards,
Dilip Kumar
Google



Re: Adding pg_dump flag for parallel export to pipes

From:
Nitin Motiani
Date:
On Tue, Sep 9, 2025 at 12:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
>
> The latest patch set does not apply on HEAD; can you rebase it? Also,
> there are many TODOs in the patch. If those TODOs are just
> nice-to-haves planned for future development, it would be better to
> remove them. On the other hand, if some of them must be done before
> the patch can be committed, are you planning to work on them soon? I
> am planning to review this patch, so could you send a rebased version
> implementing the TODOs that are required for the first version?
>

Thanks for the feedback, Dilip. We will rebase the patch soon and send
it. Regarding the TODOs, some of them are plans for future development
(i.e. refactoring). There are also TODOs in patch file 0001 which are
actually removed in 0002; we can clean those up or combine the two
files. Other than that, some are about the open questions. We will
remove those from the code and discuss the issues on the thread.

Thanks,
Nitin Motiani
Google