Thread: [COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

[COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

From
Andres Freund
Date:
Unify SIGHUP handling between normal and walsender backends.

Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload.  Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
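
For readers unfamiliar with the pattern, here is a minimal sketch of the single-flag approach described above. It is not the code from this commit: the names reload_pending, reload_count, install_sighup_handler and process_pending_reload are hypothetical, and the counter merely stands in for re-reading the configuration file. The point is that the handler sets exactly one flag, and every loop in the process checks that same flag.

```c
#include <signal.h>

/* Hypothetical name: the one flag shared by every code path in the process. */
volatile sig_atomic_t reload_pending = 0;

/* Counts processed reloads so the effect is observable in this sketch. */
int reload_count = 0;

/* Signal handler: only sets the flag; the real work happens in the loop. */
static void
handle_sighup(int signo)
{
    (void) signo;
    reload_pending = 1;
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int
install_sighup_handler(void)
{
    return (signal(SIGHUP, handle_sighup) == SIG_ERR) ? -1 : 0;
}

/*
 * Meant to be called from every loop that can run for a long time.
 * Returns 1 if a pending reload was processed, 0 otherwise.
 */
int
process_pending_reload(void)
{
    if (!reload_pending)
        return 0;

    reload_pending = 0;
    reload_count++;     /* stand-in for re-reading the configuration */
    return 1;
}
```

With two separate flags, a handler that sets one of them is invisible to any loop that only checks the other, which is the failure mode the commit message describes.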

This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.

A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.

Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/b8bd32a51f2dd451644175af7ae32f9bec3153f1

Modified Files
--------------
src/backend/replication/walsender.c | 29 +++++++----------------------
src/backend/tcop/postgres.c         | 30 ++++++++++++++----------------
src/backend/utils/init/globals.c    |  1 +
src/include/miscadmin.h             |  5 +++++
4 files changed, 27 insertions(+), 38 deletions(-)


Re: [COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

From
Andres Freund
Date:
On 2017-06-06 02:25:18 +0000, Andres Freund wrote:
> Unify SIGHUP handling between normal and walsender backends.
>
> Because walsender and normal backends share the same main loop it's
> problematic to have two different flag variables, set in signal
> handlers, indicating a pending configuration reload.  Only certain
> walsender commands reach code paths checking for the
> variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
> ... LOGICAL, notably not base backups).
>
> This is a bug present since the introduction of walsender, but has
> gotten worse in releases since then which allow walsender to do more.
>
> A later patch, not slated for v10, will similarly unify SIGHUP
> handling in other types of processes as well.
>
> Author: Petr Jelinek, Andres Freund
> Reviewed-By: Michael Paquier
> Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
> Backpatch: 9.2-, bug is present since 9.0
>
> Branch
> ------
> REL9_6_STABLE
>
> Details
> -------
> https://git.postgresql.org/pg/commitdiff/b8bd32a51f2dd451644175af7ae32f9bec3153f1
>
> Modified Files
> --------------
> src/backend/replication/walsender.c | 29 +++++++----------------------
> src/backend/tcop/postgres.c         | 30 ++++++++++++++----------------
> src/backend/utils/init/globals.c    |  1 +
> src/include/miscadmin.h             |  5 +++++
> 4 files changed, 27 insertions(+), 38 deletions(-)

This commit, or one of its siblings, seemingly caused 'handfish' to fail
with a weird error message:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I. -I. -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2   -c -o walsender.o walsender.c
Assembler messages:
Fatal error: can't create walsender.o: No such file or directory
<builtin>: recipe for target 'walsender.o' failed

I'm clueless what that could be caused by, given that the rest of the
9.6 animals do not seem to be scared.

Any ideas?   So far I just plan to wait till the machine runs again on
its own.

- Andres


Re: [COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-06-06 02:25:18 +0000, Andres Freund wrote:
>> Unify SIGHUP handling between normal and walsender backends.

> This commit, or one of its siblings, seemingly caused 'handfish' to fail
> with a weird error message:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

handfish has failed with weird irreproducible problems before, eg in

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45

the first sign of trouble is

! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"

I'm inclined to think it's got slightly flaky hardware.

            regards, tom lane


Re: [COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

From
Andres Freund
Date:
On 2017-06-06 19:41:10 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-06-06 02:25:18 +0000, Andres Freund wrote:
> >> Unify SIGHUP handling between normal and walsender backends.
>
> > This commit, or one of its siblings, seemingly caused 'handfish' to fail
> > with a weird error message:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01
>
> handfish has failed with weird irreproducible problems before, eg in
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45
>
> the first sign of trouble is
>
> ! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"
>
> I'm inclined to think it's got slightly flaky hardware.

Thanks, I'd looked at a few other recent failures, and they'd looked
like proper failures.

Filipe, do you know if that machine has any troubles?

Regards,

Andres


Re: [COMMITTERS] pgsql: Unify SIGHUP handling between normal and walsender backends.

From
Filipe Rosset
Date:
2017-06-06 20:49 GMT-03:00 Andres Freund <andres@anarazel.de>:
> On 2017-06-06 19:41:10 -0400, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > On 2017-06-06 02:25:18 +0000, Andres Freund wrote:
> > >> Unify SIGHUP handling between normal and walsender backends.
> >
> > > This commit, or one of its siblings, seemingly caused 'handfish' to fail
> > > with a weird error message:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01
> >
> > handfish has failed with weird irreproducible problems before, eg in
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45
> >
> > the first sign of trouble is
> >
> > ! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"
> >
> > I'm inclined to think it's got slightly flaky hardware.
>
> Thanks, I'd looked at a few other recent failures, and they'd looked
> like proper failures.
>
> Filipe, do you know if that machine has any troubles?
>
> Regards,
>
> Andres


Hi guys, I'm not aware of any hardware issues in 'handfish'.

For now, I've changed my crontab to run the build every hour instead of every 20 minutes; let's see how it behaves in the next builds.

Cheers,
Filipe