Обсуждение: pgsql: Consistently test for in-use shared memory.

Поиск
Список
Период
Сортировка

pgsql: Consistently test for in-use shared memory.

От
Noah Misch
Дата:
Consistently test for in-use shared memory.

postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer stop if shmat() of an old segment fails with EACCES.  A
postmaster will no longer recycle segments pertaining to other data
directories.  That's good for production, but it's bad for integration
tests that crash a postmaster and immediately delete its data directory.
Such a test now leaks a segment indefinitely.  No "make check-world"
test does that.  win32_shmem.c already avoided all these problems.  In
9.6 and later, enhance PostgresNode to facilitate testing.  Back-patch
to 9.4 (all supported versions).

Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c098509927f9a49ebceb301a2cb6a477ecd4ac3c

Modified Files
--------------
src/Makefile.global.in              |   4 +-
src/backend/port/sysv_shmem.c       | 269 +++++++++++++++++++++---------------
src/backend/port/win32_shmem.c      |   7 +-
src/backend/postmaster/postmaster.c |  12 +-
src/backend/storage/ipc/ipci.c      |  14 +-
src/backend/utils/init/postinit.c   |   6 +-
src/include/storage/ipc.h           |   2 +-
src/include/storage/pg_shmem.h      |   6 +-
src/test/perl/PostgresNode.pm       | 184 +++++++++++++++++++-----
src/test/recovery/t/017_shm.pl      | 200 +++++++++++++++++++++++++++
src/tools/msvc/vcregress.pl         |   1 +
11 files changed, 525 insertions(+), 180 deletions(-)


Re: pgsql: Consistently test for in-use shared memory.

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> Consistently test for in-use shared memory.

Hmm ... conchuela has just found another problem with this test case:

ok 2 - detected live backend via shared memory
# Running: postgres --single -D
/home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_017_shm_gnat_data/pgdatatemplate1 
ack Broken pipe: write( 16, 'SELECT 1 + 1' ) at /usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.

I interpret this as "the single-user backend exited before we could stuff
'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
to get a write failure there".

            regards, tom lane



Re: pgsql: Consistently test for in-use shared memory.

От
Noah Misch
Дата:
On Mon, Apr 15, 2019 at 08:08:48PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Consistently test for in-use shared memory.
> 
> Hmm ... conchuela has just found another problem with this test case:
> 
> ok 2 - detected live backend via shared memory
> # Running: postgres --single -D
/home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_017_shm_gnat_data/pgdatatemplate1
 
> ack Broken pipe: write( 16, 'SELECT 1 + 1' ) at /usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
> 
> I interpret this as "the single-user backend exited before we could stuff
> 'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
> to get a write failure there".

Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
from that file.



Re: pgsql: Consistently test for in-use shared memory.

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> On Mon, Apr 15, 2019 at 08:08:48PM -0400, Tom Lane wrote:
>> I interpret this as "the single-user backend exited before we could stuff
>> 'SELECT 1 + 1' down the pipe to it, and the Perl script is not expecting
>> to get a write failure there".

> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
> from that file.

Yeah, that was the first idea that occurred to me as well.
Kinda grotty, but ...

Another idea is to shove the problem off on a sub-shell, that is run

    echo SELECT 1 + 1 | postgres --single ...

            regards, tom lane



Re: pgsql: Consistently test for in-use shared memory.

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
>> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
>> from that file.

Actually ... we don't need to run a query at all do we?

Although this would only help if Perl will optimize away an attempt
to write zero bytes to the pipe, which seems likely but not certain.

            regards, tom lane



Re: pgsql: Consistently test for in-use shared memory.

От
Noah Misch
Дата:
On Mon, Apr 15, 2019 at 09:01:45PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> >> Perhaps I should write the 'SELECT 1 + 1' to a regular file and redirect input
> >> from that file.
> 
> Actually ... we don't need to run a query at all do we?

No.  We expect never to run one; it was there for the unexpected case of
"postgres --single" startup succeeding.  I pushed a change to close the stdin
of "postgres --single" instead of writing to it.  I probably worried that
"postgres --single" would not tolerate that, but it seems to.



Re: pgsql: Consistently test for in-use shared memory.

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> No.  We expect never to run one; it was there for the unexpected case of
> "postgres --single" startup succeeding.  I pushed a change to close the stdin
> of "postgres --single" instead of writing to it.

Ah, good, that's safer than what I was imagining.

> I probably worried that
> "postgres --single" would not tolerate that, but it seems to.

Sure.  EOF-on-stdin is the standard way to end a standalone-backend
session.  This fix means it'd have zero queries to execute rather
than one, but I'd be entirely willing to call it a backend bug if that
didn't work.

The more important point of course is that we must correctly detect
test failure if the backend doesn't complain.  But since the test
explicitly looks for a particular error message text, that seems
fine.

            regards, tom lane