Обсуждение: Re: [HACKERS] Recent SIGSEGV failures in buildfarm HEAD
Stefan Kaltenbrunner wrote:
> Tom Lane wrote:
>
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>> I'm actually wondering if unlimiting core might not be a useful switch
>>> to provide on pg_ctl, as long as the platform has setrlimit().
>>>
>> Not a bad thought; that's actually one of the reasons that I still
>> usually use a handmade script rather than pg_ctl for launching
>> postmasters ...
>>
>
> this sounds like a good idea for me too - it seems like a cleaner and
> more useful thing on a general base then just doing it in the buildfarm
> code ...
>
>
>
Draft patch attached. However, there will be some more work to do. For
one thing, pg_regress does not use pg_ctl to start its temp install
postmaster, so either we'll need to train it the same way or get it to
use pg_ctl. And then we'd need to change the regression makefile to use
the option, based on an environment variable a bit like MAX_CONNEXCTIONS
maybe.
cheers
andrew
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.74
diff -c -r1.74 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c 12 Oct 2006 05:14:49 -0000 1.74
--- src/bin/pg_ctl/pg_ctl.c 29 Dec 2006 21:08:39 -0000
***************
*** 26,31 ****
--- 26,36 ----
#include <sys/stat.h>
#include <unistd.h>
+ #ifdef HAVE_SYS_RESOURCE_H
+ #include <sys/time.h>
+ #include <sys/resource.h>
+ #endif
+
#include "libpq/pqsignal.h"
#include "getopt_long.h"
***************
*** 90,95 ****
--- 95,103 ----
static char *register_username = NULL;
static char *register_password = NULL;
static char *argv0 = NULL;
+ #if HAVE_GETRLIMIT
+ static bool allow_core_files = false;
+ #endif
static void
write_stderr(const char *fmt,...)
***************
*** 132,137 ****
--- 140,149 ----
static char pid_file[MAXPGPATH];
static char conf_file[MAXPGPATH];
+ #if HAVE_GETRLIMIT
+ static void unlimit_core_size(void);
+ #endif
+
#if defined(WIN32) || defined(__CYGWIN__)
static void
***************
*** 478,483 ****
--- 490,516 ----
}
+ #if HAVE_GETRLIMIT
+ static void
+ unlimit_core_size(void)
+ {
+ struct rlimit lim;
+ getrlimit(RLIMIT_CORE,&lim);
+ if (lim.rlim_max == 0)
+ {
+ write_stderr(_("%s: cannot set core size,: disallowed by hard limit.\n"),
+ progname);
+ return;
+ }
+ else if (lim.rlim_max == RLIM_INFINITY || lim.rlim_cur < lim.rlim_max)
+ {
+ lim.rlim_cur = lim.rlim_max;
+ setrlimit(RLIMIT_CORE,&lim);
+ }
+ }
+ #endif
+
+
static void
do_start(void)
***************
*** 581,586 ****
--- 614,624 ----
postgres_path = postmaster_path;
}
+ #if HAVE_GETRLIMIT
+ if (allow_core_files)
+ unlimit_core_size();
+ #endif
+
exitcode = start_postmaster();
if (exitcode != 0)
{
***************
*** 1401,1406 ****
--- 1439,1447 ----
printf(_(" -o OPTIONS command line options to pass to postgres\n"
" (PostgreSQL server executable)\n"));
printf(_(" -p PATH-TO-POSTGRES normally not necessary\n"));
+ #if HAVE_GETRLIMIT
+ printf(_(" -c, --corefiles allow postgres to produce core files\n"));
+ #endif
printf(_("\nOptions for stop or restart:\n"));
printf(_(" -m SHUTDOWN-MODE may be \"smart\", \"fast\", or \"immediate\"\n"));
***************
*** 1497,1502 ****
--- 1538,1546 ----
{"mode", required_argument, NULL, 'm'},
{"pgdata", required_argument, NULL, 'D'},
{"silent", no_argument, NULL, 's'},
+ #if HAVE_GETRLIMIT
+ {"corefiles", no_argument, NULL, 'c'},
+ #endif
{NULL, 0, NULL, 0}
};
***************
*** 1561,1567 ****
/* process command-line options */
while (optind < argc)
{
! while ((c = getopt_long(argc, argv, "D:l:m:N:o:p:P:sU:wW", long_options, &option_index)) != -1)
{
switch (c)
{
--- 1605,1611 ----
/* process command-line options */
while (optind < argc)
{
! while ((c = getopt_long(argc, argv, "cD:l:m:N:o:p:P:sU:wW", long_options, &option_index)) != -1)
{
switch (c)
{
***************
*** 1632,1637 ****
--- 1676,1686 ----
do_wait = false;
wait_set = true;
break;
+ #if HAVE_GETRLIMIT
+ case 'c':
+ allow_core_files = true;
+ break;
+ #endif
default:
/* getopt_long already issued a suitable error message */
do_advice();
Andrew Dunstan <andrew@dunslane.net> writes:
> ... And then we'd need to change the regression makefile to use
> the option, based on an environment variable a bit like MAX_CONNEXCTIONS
> maybe.
Why wouldn't we just use it always? If a regression test dumps core,
that's going to deserve investigation.
regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> ... And then we'd need to change the regression makefile to use >> the option, based on an environment variable a bit like MAX_CONNEXCTIONS >> maybe. > > Why wouldn't we just use it always? If a regression test dumps core, > that's going to deserve investigation. enabling it always for the regression tests probably makes sense - but there is also the possibility that such a core can get very large and potentially run the partitition the regression test runs on out of space. Stefan
Stefan Kaltenbrunner wrote: > Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> ... And then we'd need to change the regression makefile to use >>> the option, based on an environment variable a bit like >>> MAX_CONNEXCTIONS >>> maybe. >> >> Why wouldn't we just use it always? If a regression test dumps core, >> that's going to deserve investigation. > > enabling it always for the regression tests probably makes sense - but > there is also the possibility that such a core can get very large and > potentially run the partitition the regression test runs on out of space. > > I think Tom is right. You can always set the hard limit before calling "make check" or running the buildfarm script. I'll prepare a patch to use similar code unconditionally in pg_regress. cheers andrew
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> ... And then we'd need to change the regression makefile to use
>> the option, based on an environment variable a bit like MAX_CONNEXCTIONS
>> maybe.
>>
>
> Why wouldn't we just use it always? If a regression test dumps core,
> that's going to deserve investigation.
>
>
>
Revised patch attached, doing just this. I will apply it soon unless
there are objections.
cheers
andrew
Index: doc/src/sgml/ref/pg_ctl-ref.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_ctl-ref.sgml,v
retrieving revision 1.35
diff -c -r1.35 pg_ctl-ref.sgml
*** doc/src/sgml/ref/pg_ctl-ref.sgml 2 Dec 2006 00:34:52 -0000 1.35
--- doc/src/sgml/ref/pg_ctl-ref.sgml 2 Jan 2007 20:25:01 -0000
***************
*** 29,34 ****
--- 29,35 ----
<arg>-l <replaceable>filename</replaceable></arg>
<arg>-o <replaceable>options</replaceable></arg>
<arg>-p <replaceable>path</replaceable></arg>
+ <arg>-c</arg>
<sbr>
<command>pg_ctl</command>
<arg choice="plain">stop</arg>
***************
*** 48,53 ****
--- 49,55 ----
<arg>-w</arg>
<arg>-s</arg>
<arg>-D <replaceable>datadir</replaceable></arg>
+ <arg>-c</arg>
<arg>-m
<group choice="plain">
<arg>s[mart]</arg>
***************
*** 246,251 ****
--- 248,266 ----
</varlistentry>
<varlistentry>
+ <term><option>-c</option></term>
+ <listitem>
+ <para>
+ Attempt to allow server crashes to produce core files, on platforms
+ where this available, by lifting any soft resource limit placed on
+ them.
+ This is useful in debugging or diagnosing problems by allowing a
+ stack trace to be obtained from a failed server process.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-w</option></term>
<listitem>
<para>
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.74
diff -c -r1.74 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c 12 Oct 2006 05:14:49 -0000 1.74
--- src/bin/pg_ctl/pg_ctl.c 2 Jan 2007 20:25:02 -0000
***************
*** 26,31 ****
--- 26,36 ----
#include <sys/stat.h>
#include <unistd.h>
+ #ifdef HAVE_SYS_RESOURCE_H
+ #include <sys/time.h>
+ #include <sys/resource.h>
+ #endif
+
#include "libpq/pqsignal.h"
#include "getopt_long.h"
***************
*** 90,95 ****
--- 95,103 ----
static char *register_username = NULL;
static char *register_password = NULL;
static char *argv0 = NULL;
+ #if HAVE_GETRLIMIT
+ static bool allow_core_files = false;
+ #endif
static void
write_stderr(const char *fmt,...)
***************
*** 132,137 ****
--- 140,149 ----
static char pid_file[MAXPGPATH];
static char conf_file[MAXPGPATH];
+ #if HAVE_GETRLIMIT
+ static void unlimit_core_size(void);
+ #endif
+
#if defined(WIN32) || defined(__CYGWIN__)
static void
***************
*** 478,483 ****
--- 490,516 ----
}
+ #if HAVE_GETRLIMIT
+ static void
+ unlimit_core_size(void)
+ {
+ struct rlimit lim;
+ getrlimit(RLIMIT_CORE,&lim);
+ if (lim.rlim_max == 0)
+ {
+ write_stderr(_("%s: cannot set core size,: disallowed by hard limit.\n"),
+ progname);
+ return;
+ }
+ else if (lim.rlim_max == RLIM_INFINITY || lim.rlim_cur < lim.rlim_max)
+ {
+ lim.rlim_cur = lim.rlim_max;
+ setrlimit(RLIMIT_CORE,&lim);
+ }
+ }
+ #endif
+
+
static void
do_start(void)
***************
*** 581,586 ****
--- 614,624 ----
postgres_path = postmaster_path;
}
+ #if HAVE_GETRLIMIT
+ if (allow_core_files)
+ unlimit_core_size();
+ #endif
+
exitcode = start_postmaster();
if (exitcode != 0)
{
***************
*** 1401,1406 ****
--- 1439,1447 ----
printf(_(" -o OPTIONS command line options to pass to postgres\n"
" (PostgreSQL server executable)\n"));
printf(_(" -p PATH-TO-POSTGRES normally not necessary\n"));
+ #if HAVE_GETRLIMIT
+ printf(_(" -c, --corefiles allow postgres to produce core files\n"));
+ #endif
printf(_("\nOptions for stop or restart:\n"));
printf(_(" -m SHUTDOWN-MODE may be \"smart\", \"fast\", or \"immediate\"\n"));
***************
*** 1497,1502 ****
--- 1538,1546 ----
{"mode", required_argument, NULL, 'm'},
{"pgdata", required_argument, NULL, 'D'},
{"silent", no_argument, NULL, 's'},
+ #if HAVE_GETRLIMIT
+ {"corefiles", no_argument, NULL, 'c'},
+ #endif
{NULL, 0, NULL, 0}
};
***************
*** 1561,1567 ****
/* process command-line options */
while (optind < argc)
{
! while ((c = getopt_long(argc, argv, "D:l:m:N:o:p:P:sU:wW", long_options, &option_index)) != -1)
{
switch (c)
{
--- 1605,1611 ----
/* process command-line options */
while (optind < argc)
{
! while ((c = getopt_long(argc, argv, "cD:l:m:N:o:p:P:sU:wW", long_options, &option_index)) != -1)
{
switch (c)
{
***************
*** 1632,1637 ****
--- 1676,1686 ----
do_wait = false;
wait_set = true;
break;
+ #if HAVE_GETRLIMIT
+ case 'c':
+ allow_core_files = true;
+ break;
+ #endif
default:
/* getopt_long already issued a suitable error message */
do_advice();
Index: src/test/regress/pg_regress.c
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/pg_regress.c,v
retrieving revision 1.23
diff -c -r1.23 pg_regress.c
*** src/test/regress/pg_regress.c 4 Oct 2006 00:30:14 -0000 1.23
--- src/test/regress/pg_regress.c 2 Jan 2007 20:25:03 -0000
***************
*** 24,29 ****
--- 24,34 ----
#include <signal.h>
#include <unistd.h>
+ #ifdef HAVE_SYS_RESOURCE_H
+ #include <sys/time.h>
+ #include <sys/resource.h>
+ #endif
+
#include "getopt_long.h"
#include "pg_config_paths.h"
***************
*** 122,127 ****
--- 127,156 ----
the supplied arguments. */
__attribute__((format(printf, 2, 3)));
+ /*
+ * allow core files if possible.
+ */
+ #if HAVE_GETRLIMIT
+ static void
+ unlimit_core_size(void)
+ {
+ struct rlimit lim;
+ getrlimit(RLIMIT_CORE,&lim);
+ if (lim.rlim_max == 0)
+ {
+ fprintf(stderr,
+ _("%s: cannot set core size,: disallowed by hard limit.\n"),
+ progname);
+ return;
+ }
+ else if (lim.rlim_max == RLIM_INFINITY || lim.rlim_cur < lim.rlim_max)
+ {
+ lim.rlim_cur = lim.rlim_max;
+ setrlimit(RLIMIT_CORE,&lim);
+ }
+ }
+ #endif
+
/*
* Add an item at the end of a stringlist.
***************
*** 1459,1464 ****
--- 1488,1497 ----
initialize_environment();
+ #if HAVE_GETRLIMIT
+ unlimit_core_size();
+ #endif
+
if (temp_install)
{
/*
Andrew Dunstan <andrew@dunslane.net> writes:
> Revised patch attached, doing just this. I will apply it soon unless
> there are objections.
Probably a good idea to check defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE),
rather than naively assuming every getrlimit implementation supports
that particular setting. Also, should the -c option exist but just not
do anything if the platform doesn't support it? As is, you're making it
impossible to just specify -c without worrying if it does anything.
The documentation fails to list the long form of the switch
(--corefiles, which should probably really be --core-files for consistency).
There's a typo in this message, too:
+ _("%s: cannot set core size,: disallowed by hard limit.\n"),
regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> Revised patch attached, doing just this. I will apply it soon unless
>> there are objections.
>>
>
> Probably a good idea to check defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE),
> rather than naively assuming every getrlimit implementation supports
> that particular setting. Also, should the -c option exist but just not
> do anything if the platform doesn't support it? As is, you're making it
> impossible to just specify -c without worrying if it does anything.
>
> The documentation fails to list the long form of the switch
> (--corefiles, which should probably really be --core-files for consistency).
> There's a typo in this message, too:
>
> + _("%s: cannot set core size,: disallowed by hard limit.\n"),
>
>
>
OK, I'll fix all this. Thanks.
cheers
andrew