Re: Suggestion to add --continue-client-on-abort option to pgbench

From Fujii Masao
Subject Re: Suggestion to add --continue-client-on-abort option to pgbench
Date
Msg-id CAHGQGwE_bvuqNjMEGARGq3m4oRmN5xm2zEJKwztAym3gbkp7Jw@mail.gmail.com
In response to Re: Suggestion to add --continue-client-on-abort option to pgbench  (Yugo Nagata <nagata@sraoss.co.jp>)
Responses Re: Suggestion to add --continue-client-on-abort option to pgbench
List pgsql-hackers
On Sat, Sep 20, 2025 at 12:21 AM Yugo Nagata <nagata@sraoss.co.jp> wrote:
> > While testing, I found that running pgbench with --continue-on-error and
> > pipeline mode triggers the following assertion failure. Could this be
> > a bug in the patch?
> >
> > ---------------------------------------------------
> > $ cat pipeline.pgbench
> > \startpipeline
> > DO $$
> >   BEGIN
> >     PERFORM pg_sleep(3);
> >     PERFORM pg_terminate_backend(pg_backend_pid());
> >   END $$;
> > \endpipeline
> >
> > $ pgbench -n --debug --verbose-errors -f pipeline.pgbench -c 2 -t 4 -M
> > extended --continue-on-error
> > ...
> > Assertion failed:
> > (sql_script[st->use_file].commands[st->command]->type == 1), function
> > commandError, file pgbench.c, line 3081.
> > Abort trap: 6
> > ---------------------------------------------------
> >
> > When I ran the same command without --continue-on-error,
> > the assertion failure did not occur.
>
> I think this bug was introduced by commit 4a39f87acd6e, which enabled pgbench
> to retry and added the --verbose-errors option, rather than by this patch itself.
>
> The assertion failure occurs in commandError(), which is called to report an error when
> it can be retried (i.e., serializable failure or deadlock), or when --continue-on-error
> is used after this patch.
>
>  Assert(sql_script[st->use_file].commands[st->command]->type == SQL_COMMAND);
>
> This assumes the error is always detected during SQL command execution, but
> that’s not correct, since in pipeline mode, the error can be detected when
> a \endpipeline meta-command is executed.
>
>  $ cat deadlock.sql
>  \startpipeline
>  begin;
>  lock b;
>  lock a;
>  end;
>  \endpipeline
>
>  $ cat deadlock2.sql
>  \startpipeline
>  begin;
>  lock a;
>  lock b;
>  end;
>  \endpipeline
>
>  $ pgbench --verbose-errors -f deadlock.sql  -f deadlock2.sql -c 2 -T 3 -M extended
>  pgbench (19devel)
>  starting vacuum...end.
>  pgbench: pgbench.c:3062: commandError: Assertion `sql_script[st->use_file].commands[st->command]->type == 1' failed.
>
> Although one option would be to remove this assertion, if we prefer to keep it,
> the attached patch fixes the issue.

Thanks for the analysis and the patch!

I think we should fix the issue rather than just removing the assertion.
I'd like to apply your patch with the following source comment:

---------------------------
Errors should only be detected during an SQL command or the \endpipeline
meta-command. Any other case triggers an assertion failure.
--------------------------
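
To make the relaxed condition concrete, here is a minimal, self-contained sketch of the check that the proposed comment describes. It is not the attached patch: the types, the helper error_allowed_at(), and report_command_error() are simplified stand-ins whose names only echo pgbench.c (the value 1 for SQL_COMMAND is taken from the assertion message above; the rest is assumed).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in types for illustration only; simplified from what pgbench.c uses. */
typedef enum { SQL_COMMAND = 1, META_COMMAND } CommandType;
typedef enum { META_NONE, META_SET, META_STARTPIPELINE, META_ENDPIPELINE } MetaKind;

typedef struct
{
    CommandType type;   /* SQL command or meta-command */
    MetaKind    meta;   /* which meta-command, if type == META_COMMAND */
} Command;

/*
 * Hypothetical helper: returns true if an error may legitimately be
 * detected while this command is executing, i.e. during an SQL command
 * or during the \endpipeline meta-command.
 */
static bool
error_allowed_at(const Command *cmd)
{
    return cmd->type == SQL_COMMAND ||
           (cmd->type == META_COMMAND && cmd->meta == META_ENDPIPELINE);
}

/* Hypothetical reporting hook: any other command trips the assertion. */
static void
report_command_error(const Command *cmd)
{
    assert(error_allowed_at(cmd));
    /* ... error reporting would go here ... */
}
```

With this shape, an error surfaced at \endpipeline in pipeline mode passes the check, while an error attributed to any other meta-command still trips the assertion.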


With your patch and the continue-on-error patches, running the same pgbench
command I used to reproduce the assertion failure upthread causes pgbench
to hang. From my analysis, it enters an infinite loop in discardUntilSync().
That loop waits for PGRES_PIPELINE_SYNC, but since the connection has already
been closed, it never arrives, leaving pgbench stuck.

Could this also happen without the continue-on-error patch, or is it a new bug
introduced by it? Either way, it seems pgbench needs to exit the loop when
the result status is PGRES_FATAL_ERROR.
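
As a rough illustration of that exit condition, the sketch below models the loop with a mock result stream instead of libpq, so it can run without a server. Everything here is an assumption made for the example: ResKind stands in for the libpq result status, ResultStream and next_result() stand in for fetching results from the connection, and discard_until_sync() is only loosely shaped like discardUntilSync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in result kinds; real code would inspect the libpq result status. */
typedef enum
{
    RES_COMMAND_OK,
    RES_FATAL_ERROR,    /* models PGRES_FATAL_ERROR */
    RES_PIPELINE_SYNC   /* models PGRES_PIPELINE_SYNC */
} ResKind;

/*
 * Hypothetical result source: replays a fixed array, then keeps returning
 * its last element. A closed connection is modeled as a stream that ends
 * in RES_FATAL_ERROR and never delivers a sync.
 */
typedef struct
{
    const ResKind *items;
    size_t         n;
    size_t         pos;
} ResultStream;

static ResKind
next_result(ResultStream *s)
{
    ResKind r;

    assert(s->n > 0);
    r = s->items[s->pos];
    if (s->pos + 1 < s->n)
        s->pos++;
    return r;
}

/*
 * Discard results until the pipeline sync arrives. Returns true if the
 * sync was seen, false if a fatal error ended the wait. Without the
 * RES_FATAL_ERROR check, a stream that never yields a sync would keep
 * this loop spinning forever.
 */
static bool
discard_until_sync(ResultStream *s)
{
    for (;;)
    {
        ResKind r = next_result(s);

        if (r == RES_PIPELINE_SYNC)
            return true;
        if (r == RES_FATAL_ERROR)
            return false;
        /* any other result is simply discarded */
    }
}
```

Returning a boolean lets the caller decide whether the client can continue or must be aborted; whether that matches how the real function should report failure is an open design question.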

Regards,

--
Fujii Masao


