Обсуждение: SIGPIPE in TAP tests

Поиск
Список
Период
Сортировка

SIGPIPE in TAP tests

От
Noah Misch
Дата:
Two buildfarm runs[1][2] from the last 90 days have failed in
src/test/authentication, like this:

  t/001_password.pl .. 
  Failed 3/8 subtests 
  t/002_saslprep.pl .. ok     1815 ms ( 0.00 usr  0.00 sys +  0.89 cusr  0.26 csys =  1.15 CPU)

  Test Summary Report
  -------------------
  t/001_password.pl (Wstat: 13 Tests: 5 Failed: 0)
    Non-zero wait status: 13
    Parse errors: Bad plan.  You planned 8 tests but ran 5.
  Files=2, Tests=17,  3 wallclock secs ( 0.04 usr  0.00 sys +  1.67 cusr  0.50 csys =  2.21 CPU)
  Result: FAIL

Compared to a good run, the other logs just end suddenly after the expected
"FATAL:  password authentication failed".  "Wstat: 13" means the Perl process
died to signal 13 (SIGPIPE).  This test invokes psql in a way that fails
authentication, and it writes "SELECT 1" to psql's stdin.  The SIGPIPE happens
if the psql process exits before that write.  I can reproduce this
consistently by patching a delay into IPC::Run:

--- IPC/Run.pm~    2017-05-12 06:46:43.000000000 -0700
+++ IPC/Run.pm    2017-12-09 00:24:06.876132628 -0800
@@ -1449,6 +1449,8 @@
 sub run {
     local $in_run = 1;     ## Allow run()-only optimizations.
     my IPC::Run $self = start(@_);
+    use Time::HiRes 'usleep';
+    usleep 100_000;

The two src/test/authentication tests then fail, but nothing else fails.
Let's ignore SIGPIPE in all TAP tests, which leaves some evidence in
regress_log_001_password:

  ack Broken pipe: write( 13, 'SELECT 1' ) at /home/nm/sw/cpan/lib/perl5/IPC/Run/IO.pm line 549.

To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
disconnect immediately.  Patch attached.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2017-11-23%2008%3A37%3A17
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2017-10-30%2000%3A14%3A09

Вложения

Re: SIGPIPE in TAP tests

От
Michael Paquier
Дата:
On Sun, Dec 10, 2017 at 6:02 AM, Noah Misch <noah@leadboat.com> wrote:
> Two buildfarm runs[1][2] from the last 90 days have failed in
> src/test/authentication, like this:
>
>   t/001_password.pl ..
>   Failed 3/8 subtests
>   t/002_saslprep.pl .. ok     1815 ms ( 0.00 usr  0.00 sys +  0.89 cusr  0.26 csys =  1.15 CPU)
>
>   Test Summary Report
>   -------------------
>   t/001_password.pl (Wstat: 13 Tests: 5 Failed: 0)
>     Non-zero wait status: 13
>     Parse errors: Bad plan.  You planned 8 tests but ran 5.
>   Files=2, Tests=17,  3 wallclock secs ( 0.04 usr  0.00 sys +  1.67 cusr  0.50 csys =  2.21 CPU)
>   Result: FAIL
>
> Compared to a good run, the other logs just end suddenly after the expected
> "FATAL:  password authentication failed".  "Wstat: 13" means the Perl process
> died to signal 13 (SIGPIPE).  This test invokes psql in a way that fails
> authentication, and it writes "SELECT 1" to psql's stdin.  The SIGPIPE happens
> if the psql process exits before that write.

Nice investigation. An interesting coincidence is that I have looked
yesterday at an off-list reported some folks have sent me which is
basically what you have here.

+   # Return EPIPE instead of killing the process with SIGPIPE.  An affected
+   # test may still fail, but it's more likely to report useful facts.
+   $SIG{PIPE} = 'IGNORE';
With this portion changed I can indeed see something which is more helpful:
psql: FATAL:  password authentication failed for user "saslpreptest1_role"
ack Broken pipe: write( 11, 'SELECT 1' ) at
/Library/Perl/5.18/IPC/Run/IO.pm line 558.

If SIGPIPE is ignored then test output just stops after generating the
FATAL message. Oops.

> The two src/test/authentication tests then fail, but nothing else fails.
> Let's ignore SIGPIPE in all TAP tests, which leaves some evidence in
> regress_log_001_password:
>
>   ack Broken pipe: write( 13, 'SELECT 1' ) at /home/nm/sw/cpan/lib/perl5/IPC/Run/IO.pm line 549.
>
> To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
> disconnect immediately.  Patch attached.

Perhaps you could use an empty string instead? I feel a bit uneasy
about passing an undefined object to IPC::Run::run.
-- 
Michael


Re: SIGPIPE in TAP tests

От
Noah Misch
Дата:
On Mon, Dec 11, 2017 at 04:19:52PM +0900, Michael Paquier wrote:
> On Sun, Dec 10, 2017 at 6:02 AM, Noah Misch <noah@leadboat.com> wrote:

> If SIGPIPE is ignored then test output just stops after generating the
> FATAL message. Oops.

You mean "If SIGPIPE is not ignored ...", right?

> > To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
> > disconnect immediately.  Patch attached.
> 
> Perhaps you could use an empty string instead? I feel a bit uneasy
> about passing an undefined object to IPC::Run::run.

IPC::Run documents the equivalence of undef and '' in this context; search for
"close a child processes stdin" in
http://search.cpan.org/~rbs/IPC-Run-0.78/lib/IPC/Run.pm.  Thus, I expect both
spellings to work reliably, and I find "undef" slightly more evocative.


Re: SIGPIPE in TAP tests

От
Michael Paquier
Дата:
On Mon, Dec 11, 2017 at 4:32 PM, Noah Misch <noah@leadboat.com> wrote:
> On Mon, Dec 11, 2017 at 04:19:52PM +0900, Michael Paquier wrote:
>> On Sun, Dec 10, 2017 at 6:02 AM, Noah Misch <noah@leadboat.com> wrote:
>
>> If SIGPIPE is ignored then test output just stops after generating the
>> FATAL message. Oops.
>
> You mean "If SIGPIPE is not ignored ...", right?

Yes, sorry.

>> > To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
>> > disconnect immediately.  Patch attached.
>>
>> Perhaps you could use an empty string instead? I feel a bit uneasy
>> about passing an undefined object to IPC::Run::run.
>
> IPC::Run documents the equivalence of undef and '' in this context; search for
> "close a child processes stdin" in
> http://search.cpan.org/~rbs/IPC-Run-0.78/lib/IPC/Run.pm.  Thus, I expect both
> spellings to work reliably, and I find "undef" slightly more evocative.

Thanks, I missed this bit. No objections to use undef then.
-- 
Michael