Обсуждение: pgsql: pg_ctl: Detect current standby state from pg_control

Поиск
Список
Период
Сортировка

pgsql: pg_ctl: Detect current standby state from pg_control

От
Peter Eisentraut
Дата:
pg_ctl: Detect current standby state from pg_control

pg_ctl used to determine whether a server was in standby mode by looking
for a recovery.conf file.  With this change, it instead looks into
pg_control, which is potentially more accurate.  There are also
occasional discussions about removing recovery.conf, so this removes one
dependency.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/c1dc51d4844e2a37412b034c07c1c5a439ba0b9d

Modified Files
--------------
src/backend/utils/misc/pg_controldata.c | 12 +++++++++
src/bin/pg_controldata/pg_controldata.c |  4 +++
src/bin/pg_ctl/pg_ctl.c                 | 47 ++++++++++++++++++++++++++-------
src/common/controldata_utils.c          | 15 +++++------
src/include/common/controldata_utils.h  |  2 +-
5 files changed, 62 insertions(+), 18 deletions(-)


Re: pgsql: pg_ctl: Detect current standby state from pg_control

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> pg_ctl: Detect current standby state from pg_control

Coverity thinks that this patch introduced a bunch of
null-pointer-dereference hazards, and AFAICS it is right.
The change in get_controlfile()'s API is completely broken
and needs to be undone.

            regards, tom lane


Re: pgsql: pg_ctl: Detect current standby state from pg_control

От
Michael Paquier
Дата:
On Mon, Sep 26, 2016 at 12:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> pg_ctl: Detect current standby state from pg_control
>
> Coverity thinks that this patch introduced a bunch of
> null-pointer-dereference hazards, and AFAICS it is right.
> The change in get_controlfile()'s API is completely broken
> and needs to be undone.

So you'd rather have a sleep(1) and complicate this code to replace it
with a WaitLatch() when the code is used by the backend? I don't agree
with that as the new API is cleaner as presented, though I agree that
it is a change that caller does not get any result in case of a CRC
mismatch, but who's going to use results that cannot be trusted
anyway? It seems to me that the correct fix here is that
pg_controldata.c should just exit like in the attached patch. Somewhat
I missed that during my lookup of c1dc51d4.

If we would still want callers of get_controlfile() to be allowed to
read untrustworthy results, we could just add a boolean status flag..
--
Michael

Вложения

Re: pgsql: pg_ctl: Detect current standby state from pg_control

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Mon, Sep 26, 2016 at 12:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Coverity thinks that this patch introduced a bunch of
>> null-pointer-dereference hazards, and AFAICS it is right.
>> The change in get_controlfile()'s API is completely broken
>> and needs to be undone.

> So you'd rather have a sleep(1) and complicate this code to replace it
> with a WaitLatch() when the code is used by the backend? I don't agree
> with that as the new API is cleaner as presented, though I agree that
> it is a change that caller does not get any result in case of a CRC
> mismatch, but who's going to use results that cannot be trusted
> anyway?

Well, they certainly won't be using any untrustworthy reports if
pg_controldata dumps core instead of printing the data, which is what
will happen in HEAD.  Although I don't understand your reference to
needing a sleep.  I think that it's 100% pointless for get_control_dbstate
to be worried about transient CRC failures.  If writes to pg_control
aren't atomic then we have problems enormously larger than whether
"pg_ctl promote" throws an error or not.

> It seems to me that the correct fix here is that
> pg_controldata.c should just exit like in the attached patch.

No.  Removing pg_controldata's designed-in ability to print information
from corrupted pg_control files was not part of the advertised impact of
this patch, and I will absolutely not hold still for it happening as an
accidental consequence of ill-advised refactoring.

It may well be that you need two different APIs for get_controlfile(),
maybe with a flag, or two different wrappers around some common code.
Or just go back to using #ifdef FRONTEND, and forget the silliness of
expecting pg_ctl not to throw an error for bad pg_control content.

BTW, now that get_controlfile has become a global symbol rather than
something private to pg_controldata.c, I suggest that it needs a less
generic name.  "control file" could apply to a lot of things.
Possibly "get_pg_control_contents" would be better.

            regards, tom lane


Re: pgsql: pg_ctl: Detect current standby state from pg_control

От
Peter Eisentraut
Дата:
On 9/26/16 8:53 AM, Tom Lane wrote:
> I think that it's 100% pointless for get_control_dbstate
> to be worried about transient CRC failures.  If writes to pg_control
> aren't atomic then we have problems enormously larger than whether
> "pg_ctl promote" throws an error or not.

The new code was designed to handle that, but if we think it's not an
issue, then we can simplify it.  I'm on it.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services