Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT
Дата
Msg-id 18397.1540297291@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-bugs
I wrote:
> =?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
>> SELECT * FROM test_file_fdw_program_limit LIMIT 0;
>> /*
>> [38000] ERROR: program "echo "test"" failed Detail: child process exited
>> with exit code 1
>> */

> Yeah, I can reproduce this on macOS as well as Linux.  Capturing stderr
> shows something pretty unsurprising:
> sh: line 1: echo: write error: Broken pipe
> So the called program is behaving in a somewhat reasonable way: it's
> detecting EPIPE on its stdout (after we close the pipe), reporting that,
> and doing exit(1).
> Unfortunately, it's not clear what we could do about that, short of
> always reading the whole program output, which is likely to be a
> cure worse than the disease.  If the program were failing thanks to
> SIGPIPE, we could recognize that as a case we can ignore ... but with
> behavior like this, I don't see a reliable way to tell it apart from
> a generic program failure, which surely we'd better report.

After a bit of thought, the problem here is blindingly obvious:
we generally run the backend with SIGPIPE handing set to SIG_IGN,
and evidently popen() allows the called program to inherit that,
at least on these platforms.

So what we need to do is make sure the called program inherits SIG_DFL
handling for SIGPIPE, and then special-case that result as not being
a failure.  The attached POC patch does that and successfully makes
the file_fdw problem go away for me.

It's just a POC because there are some things that need more thought
than I've given them:

1. Is it OK to revert SIGPIPE to default processing for *all* programs
launched through OpenPipeStream?  If not, what infrastructure do we
need to add to control that?  In particular, is it sane to revert
SIGPIPE for a pipe program that we will write to not read from?
(I thought probably yes, because that is the normal Unix behavior,
but it could be debated.)

2. Likewise, is it always OK for ClosePipeToProgram to ignore a
SIGPIPE failure?  (For ordinary COPY, maybe it shouldn't, since
we don't intend to terminate that early.)

3. Maybe this should be implemented at some higher level?

4. Are there any other signals we ought to be reverting to default
behavior before launching a COPY TO/FROM PROGRAM?

            regards, tom lane

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index b58a74f4e3db9d8e8f30d35e08ca84814742faa3..c4c6c875ec8102df2756d8ea1ba79791d63fed25 100644
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*************** ClosePipeToProgram(CopyState cstate)
*** 1727,1737 ****
--- 1727,1746 ----
                  (errcode_for_file_access(),
                   errmsg("could not close pipe to external command: %m")));
      else if (pclose_rc != 0)
+     {
+         /*
+          * If the called program terminated on SIGPIPE, assume it's OK; we
+          * must have chosen to stop reading its output early.
+          */
+         if (WIFSIGNALED(pclose_rc) && WTERMSIG(pclose_rc) == SIGPIPE)
+             return;
+
          ereport(ERROR,
                  (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
                   errmsg("program \"%s\" failed",
                          cstate->filename),
                   errdetail_internal("%s", wait_result_to_str(pclose_rc))));
+     }
  }

  /*
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 8dd51f176749c5d1b2adc9993b5a9a3571ee566c..6cec85e2c1dc7dea23c8b941e5080714ea65e57a 100644
*** a/src/backend/storage/file/fd.c
--- b/src/backend/storage/file/fd.c
*************** OpenTransientFilePerm(const char *fileNa
*** 2430,2440 ****
--- 2430,2445 ----
   * Routines that want to initiate a pipe stream should use OpenPipeStream
   * rather than plain popen().  This lets fd.c deal with freeing FDs if
   * necessary.  When done, call ClosePipeStream rather than pclose.
+  *
+  * This function also ensures that the popen'd program is run with default
+  * SIGPIPE processing, rather than the SIG_IGN setting the backend normally
+  * uses.  This ensures desirable response to, eg, closing a read pipe early.
   */
  FILE *
  OpenPipeStream(const char *command, const char *mode)
  {
      FILE       *file;
+     int            save_errno;

      DO_DB(elog(LOG, "OpenPipeStream: Allocated %d (%s)",
                 numAllocatedDescs, command));
*************** OpenPipeStream(const char *command, cons
*** 2452,2459 ****
  TryAgain:
      fflush(stdout);
      fflush(stderr);
      errno = 0;
!     if ((file = popen(command, mode)) != NULL)
      {
          AllocateDesc *desc = &allocatedDescs[numAllocatedDescs];

--- 2457,2469 ----
  TryAgain:
      fflush(stdout);
      fflush(stderr);
+     pqsignal(SIGPIPE, SIG_DFL);
      errno = 0;
!     file = popen(command, mode);
!     save_errno = errno;
!     pqsignal(SIGPIPE, SIG_IGN);
!     errno = save_errno;
!     if (file != NULL)
      {
          AllocateDesc *desc = &allocatedDescs[numAllocatedDescs];

*************** TryAgain:
*** 2466,2473 ****

      if (errno == EMFILE || errno == ENFILE)
      {
-         int            save_errno = errno;
-
          ereport(LOG,
                  (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
                   errmsg("out of file descriptors: %m; release and retry")));
--- 2476,2481 ----

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: BUG #15448: server process (PID 22656) was terminated byexception 0xC0000005
Следующее
От: Amit Langote
Дата:
Сообщение: Re: BUG #15448: server process (PID 22656) was terminated byexception 0xC0000005