Discussion: "pg_ctl promote" exit status

"pg_ctl promote" exit status

From
Dhruv Ahuja
Date:
Hello

The "pg_ctl promote" command returns an exit code of 1 when the server
is not in standby mode, and the same exit code of 1 when the server
isn't started at all. At present the only difference is the message
printed in each case, which FYI are...

pg_ctl: cannot promote server; server is not in standby mode

...and...

pg_ctl: PID file "/var/lib/pgsql/9.1/data/postmaster.pid" does not exist
Is server running?

...respectively.

I am developing a clustering solution around luci and rgmanager (in
Red Hat EL 6) and, for the time being, am basing its logic on the
string output. Whatever the merits of my approach to this problem,
should each distinct failure reason perhaps have its own unique exit
code?
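
For illustration, the string-matching workaround described above could look roughly like the sketch below. It is only a sketch under stated assumptions: the function names and the outcome labels are hypothetical, the matched fragments are copied from the two messages quoted above, and it assumes pg_ctl prints untranslated English messages.

```python
import subprocess

# Fragments copied from the two pg_ctl messages quoted in this thread.
# Assumption: messages are in English (no translated locale in effect).
NOT_STANDBY = "server is not in standby mode"
PID_FILE = "PID file"
DOES_NOT_EXIST = "does not exist"


def classify_promote_failure(returncode, output_text):
    """Map a "pg_ctl promote" result to a label (labels are hypothetical)."""
    if returncode == 0:
        return "promoted"
    if NOT_STANDBY in output_text:
        return "not_standby"
    if PID_FILE in output_text and DOES_NOT_EXIST in output_text:
        return "not_running"
    return "unknown_failure"


def promote(data_dir):
    """Run "pg_ctl promote" on data_dir and classify the outcome."""
    result = subprocess.run(
        ["pg_ctl", "promote", "-D", data_dir],
        capture_output=True,
        text=True,
    )
    # Both streams are checked because this sketch does not assume which
    # stream pg_ctl writes the message to.
    return classify_promote_failure(
        result.returncode, result.stdout + result.stderr
    )
```

Because both failures exit with 1, the classification rests entirely on message text, which is exactly the fragility that distinct exit codes would remove.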


Thanks



Re: "pg_ctl promote" exit status

From
Robert Haas
Date:
On Tue, Oct 23, 2012 at 6:39 AM, Dhruv Ahuja <dhruvahuja@gmail.com> wrote:
> The "pg_ctl promote" command returns an exit code of 1 when the server
> is not in standby mode, and the same exit code of 1 when the server
> isn't started at all. The only difference at the time being is the
> string output at the time, which FYI are...
>
> pg_ctl: cannot promote server; server is not in standby mode
>
> ...and...
>
> pg_ctl: PID file "/var/lib/pgsql/9.1/data/postmaster.pid" does not exist
> Is server running?
>
> ...respectively.
>
> I am in the process of developing a clustering solution around luci
> and rgmanager (in Red Hat EL 6) and for the time being, am basing it
> off the string output. Maybe each different exit reason should have a
> unique exit code, whatever my logic and approach to solving this
> problem be?

That doesn't seem like a bad idea.  Got a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: "pg_ctl promote" exit status

From
"Aaron W. Swenson"
Date:
On Tue, Oct 23, 2012 at 12:29:11PM -0400, Robert Haas wrote:
> On Tue, Oct 23, 2012 at 6:39 AM, Dhruv Ahuja <dhruvahuja@gmail.com> wrote:
> > The "pg_ctl promote" command returns an exit code of 1 when the server
> > is not in standby mode, and the same exit code of 1 when the server
> > isn't started at all. The only difference at the time being is the
> > string output at the time, which FYI are...
> >
> > pg_ctl: cannot promote server; server is not in standby mode
> >
> > ...and...
> >
> > pg_ctl: PID file "/var/lib/pgsql/9.1/data/postmaster.pid" does not exist
> > Is server running?
> >
> > ...respectively.
> >
> > I am in the process of developing a clustering solution around luci
> > and rgmanager (in Red Hat EL 6) and for the time being, am basing it
> > off the string output. Maybe each different exit reason should have a
> > unique exit code, whatever my logic and approach to solving this
> > problem be?
>
> That doesn't seem like a bad idea.  Got a patch?
>

The Linux Standard Base Core Specification 3.1 says this should return
'3'. [1]

[1] http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C 0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Attachments

Re: "pg_ctl promote" exit status

From
Dhruv Ahuja
Date:
May I propose the attached patch?

Points to note and possibly discuss:
(a) Only exit codes in do_* functions have been changed.
(b) The link to, and the version of, LSB specifications has been updated.
(c) A significant change is the exit code of do_stop() when stopping an already-stopped server. It previously returned 1; the patch proposes 0. If this is accepted, I would strongly suggest mentioning it in the Release Notes.
(d) The exit code that raised this issue was the one returned when promoting an already-promoted server. If promotion fails because the server is running but not as a standby, should that be treated as a case of starting an already-started service, or as an application-specific failure? I lean equally toward the former, but have proposed otherwise in the patch.



On 23 October 2012 17:29, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Oct 23, 2012 at 6:39 AM, Dhruv Ahuja <dhruvahuja@gmail.com> wrote:
> The "pg_ctl promote" command returns an exit code of 1 when the server
> is not in standby mode, and the same exit code of 1 when the server
> isn't started at all. The only difference at the time being is the
> string output at the time, which FYI are...
>
> pg_ctl: cannot promote server; server is not in standby mode
>
> ...and...
>
> pg_ctl: PID file "/var/lib/pgsql/9.1/data/postmaster.pid" does not exist
> Is server running?
>
> ...respectively.
>
> I am in the process of developing a clustering solution around luci
> and rgmanager (in Red Hat EL 6) and for the time being, am basing it
> off the string output. Maybe each different exit reason should have a
> unique exit code, whatever my logic and approach to solving this
> problem be?

That doesn't seem like a bad idea.  Got a patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: "pg_ctl promote" exit status

From
Dhruv Ahuja
Date:
Don't think the attachment made it in the last mail. Attaching now.


On 25 January 2013 18:33, Dhruv Ahuja <dhruvahuja@gmail.com> wrote:
May I propose the attached patch.

Points to note and possibly discuss:
(a) Only exit codes in do_* functions have been changed.
(b) The link to, and the version of, LSB specifications has been updated.
(c) A significant change is the exit code of do_stop() on stopping a stopped server. Previous return is 1. Proposed return is 0. If this is accepted, I would highly suggest a mention in the Release Notes.
(d) The exit code that raised this issue was the return of promoting a promoted server. If promotion fails because the server is running but not as standby, should that be considered a case of starting a started service, or an application specific failure? I am equally weighted to opt for the former, but have proposed differently in the patch.



On 23 October 2012 17:29, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Oct 23, 2012 at 6:39 AM, Dhruv Ahuja <dhruvahuja@gmail.com> wrote:
> The "pg_ctl promote" command returns an exit code of 1 when the server
> is not in standby mode, and the same exit code of 1 when the server
> isn't started at all. The only difference at the time being is the
> string output at the time, which FYI are...
>
> pg_ctl: cannot promote server; server is not in standby mode
>
> ...and...
>
> pg_ctl: PID file "/var/lib/pgsql/9.1/data/postmaster.pid" does not exist
> Is server running?
>
> ...respectively.
>
> I am in the process of developing a clustering solution around luci
> and rgmanager (in Red Hat EL 6) and for the time being, am basing it
> off the string output. Maybe each different exit reason should have a
> unique exit code, whatever my logic and approach to solving this
> problem be?

That doesn't seem like a bad idea.  Got a patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Attachments

Re: "pg_ctl promote" exit status

From
Peter Eisentraut
Date:
On 1/12/13 3:30 PM, Aaron W. Swenson wrote:
> The Linux Standard Base Core Specification 3.1 says this should return
> '3'. [1]
> 
> [1] http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

The LSB spec doesn't say anything about a "promote" action.

And for the stop and reload actions that you tried to change, 3 is
"unimplemented".

There is an ongoing discussion about the exit status of the stop action
under <https://commitfest.postgresql.org/action/patch_view?id=1045>, so
let's keep this item about the "promote" action.



Re: "pg_ctl promote" exit status

From
"Aaron W. Swenson"
Date:
On Fri, Jan 25, 2013 at 01:54:06PM -0500, Peter Eisentraut wrote:
> On 1/12/13 3:30 PM, Aaron W. Swenson wrote:
> > The Linux Standard Base Core Specification 3.1 says this should return
> > '3'. [1]
> >
> > [1] http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
>
> The LSB spec doesn't say anything about a "promote" action.
>
> And for the stop and reload actions that you tried to change, 3 is
> "unimplemented".
>
> There is an ongoing discussion about the exit status of the stop action
> under <https://commitfest.postgresql.org/action/patch_view?id=1045>, so
> let's keep this item about the "promote" action.

You are right. Had I read a little further down, I would have seen that
the exit status should actually be 7.

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C 0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: "pg_ctl promote" exit status

From
Heikki Linnakangas
Date:
On 26.01.2013 23:44, Aaron W. Swenson wrote:
> On Fri, Jan 25, 2013 at 01:54:06PM -0500, Peter Eisentraut wrote:
>> On 1/12/13 3:30 PM, Aaron W. Swenson wrote:
>>> The Linux Standard Base Core Specification 3.1 says this should return
>>> '3'. [1]
>>>
>>> [1] http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
>>
>> The LSB spec doesn't say anything about a "promote" action.
>>
>> And for the stop and reload actions that you tried to change, 3 is
>> "unimplemented".
>>
>> There is an ongoing discussion about the exit status of the stop action
>> under<https://commitfest.postgresql.org/action/patch_view?id=1045>, so
>> let's keep this item about the "promote" action.
>
> You are right. Had I read a little further down, it seems that the
> exit status should actually be 7.

Not sure if that LSB section is relevant anyway. It specifies the exit 
codes for init scripts, but pg_ctl is not an init script.

- Heikki



Re: "pg_ctl promote" exit status

From
Kevin Grittner
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> Not sure if that LSB section is relevant anyway. It specifies the
> exit codes for init scripts, but pg_ctl is not an init script.

Except that when I went to the trouble of wrapping pg_ctl with an
init script which was thoroughly LSB compliant (according to my
reading) and offered it to the community, everyone said that rather
than have such a complicated script it would be better to change
pg_ctl to include that logic and exit with an LSB compliant exit
code.

-Kevin



Re: "pg_ctl promote" exit status

From
Peter Eisentraut
Date:
On 1/26/13 4:44 PM, Aaron W. Swenson wrote:
> You are right. Had I read a little further down, it seems that the
> exit status should actually be 7.

7 is OK for "not running", but what should we use when the server is not
in standby mode?  Using the idempotent argument that we are discussing
for the stop action, promoting a server that is not a standby should be
a noop and exit successfully.  Not sure if that is what we want, though.
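
To make the two options concrete, here is a small sketch of the decision being weighed. It does not describe what pg_ctl actually does: the enum, the function name, and the `idempotent` flag are hypothetical, and the value 7 is simply the "not running" code mentioned above.

```python
from enum import Enum


class ServerState(Enum):
    """Hypothetical states a promote request could encounter."""
    NOT_RUNNING = "not running"
    RUNNING_NOT_STANDBY = "running, not in standby mode"
    STANDBY = "running in standby mode"


EXIT_OK = 0
EXIT_GENERIC_ERROR = 1  # what both failure cases return today, per this thread
EXIT_NOT_RUNNING = 7    # "not running", as discussed above


def promote_exit_code(state, idempotent):
    """Return the exit code a promote request would yield under each option."""
    if state is ServerState.STANDBY:
        return EXIT_OK
    if state is ServerState.NOT_RUNNING:
        return EXIT_NOT_RUNNING
    # Server is running but is not a standby: under the idempotent reading
    # this is a no-op that succeeds, otherwise it stays a generic failure.
    return EXIT_OK if idempotent else EXIT_GENERIC_ERROR
```

Under the idempotent option a caller can no longer tell "was promoted just now" from "was never a standby", which may be part of why it is unclear whether that is the desired behavior.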




Re: "pg_ctl promote" exit status

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> Not sure if that LSB section is relevant anyway. It specifies the
>> exit codes for init scripts, but pg_ctl is not an init script.

> Except that when I went to the trouble of wrapping pg_ctl with an
> init script which was thoroughly LSB compliant (according to my
> reading) and offered it to the community, everyone said that rather
> than have such a complicated script it would be better to change
> pg_ctl to include that logic and exit with an LSB compliant exit
> code.

Right.  The start and stop actions are commonly used in initscripts
so it'd be handy if the exit codes for those didn't need to be
remapped.

On the other hand, it's not at all clear to me that anyone would try
to put the promote action into an initscript, or that LSB would have
anything to say about the exit codes for such a nonstandard action
anyway.

			regards, tom lane



Re: "pg_ctl promote" exit status

From
Bruce Momjian
Date:
On Mon, Jan 28, 2013 at 09:46:32AM -0500, Peter Eisentraut wrote:
> On 1/26/13 4:44 PM, Aaron W. Swenson wrote:
> > You are right. Had I read a little further down, it seems that the
> > exit status should actually be 7.
>
> 7 is OK for "not running", but what should we use when the server is not
> in standby mode?  Using the idempotent argument that we are discussing
> for the stop action, promoting a server that is not a standby should be
> a noop and exit successfully.  Not sure if that is what we want, though.

I looked at all the LSB return codes listed here and mapped them to
pg_ctl error situations:

    https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

Patch attached.  I did not touch the start/stop return codes.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachments

Re: "pg_ctl promote" exit status

From
Peter Eisentraut
Date:
On 6/28/13 10:50 PM, Bruce Momjian wrote:
> On Mon, Jan 28, 2013 at 09:46:32AM -0500, Peter Eisentraut wrote:
>> On 1/26/13 4:44 PM, Aaron W. Swenson wrote:
>>> You are right. Had I read a little further down, it seems that the
>>> exit status should actually be 7.
>>
>> 7 is OK for "not running", but what should we use when the server is not
>> in standby mode?  Using the idempotent argument that we are discussing
>> for the stop action, promoting a server that is not a standby should be
>> a noop and exit successfully.  Not sure if that is what we want, though.
> 
> I looked at all the LSB return codes listed here and mapped them to
> pg_ctl error situations:
> 
>     https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
> 
> Patch attached.  I did not touch the start/stop return codes.

Approximately none of these changes seem correct to me.  For example,
why is failing to open the PID file 6, or failing to start the server 7?





Re: "pg_ctl promote" exit status

From
Bruce Momjian
Date:
On Mon, Jul  1, 2013 at 10:11:23AM -0400, Peter Eisentraut wrote:
> On 6/28/13 10:50 PM, Bruce Momjian wrote:
> > On Mon, Jan 28, 2013 at 09:46:32AM -0500, Peter Eisentraut wrote:
> >> On 1/26/13 4:44 PM, Aaron W. Swenson wrote:
> >>> You are right. Had I read a little further down, it seems that the
> >>> exit status should actually be 7.
> >>
> >> 7 is OK for "not running", but what should we use when the server is not
> >> in standby mode?  Using the idempotent argument that we are discussing
> >> for the stop action, promoting a server that is not a standby should be
> >> a noop and exit successfully.  Not sure if that is what we want, though.
> > 
> > I looked at all the LSB return codes listed here and mapped them to
> > pg_ctl error situations:
> > 
> >     https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
> > 
> > Patch attached.  I did not touch the start/stop return codes.
> 
> Approximately none of these changes seem correct to me.  For example,
> why is failing to open the PID file 6, or failing to start the server 7?

Well, according to that URL, we have:

    6    program is not configured
    7    program is not running

I just updated the pg_ctl.c comments to at least point to a valid URL
for this.  I think we can just call this item closed because I am still
unclear if these return codes should be returned by pg_ctl or the
start/stop script.

Anyway, while I do think pg_ctl could pass a little more information
back about failure via its return code, I am unclear if LSB is the right
approach.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: "pg_ctl promote" exit status

From
Peter Eisentraut
Date:
On 7/1/13 12:47 PM, Bruce Momjian wrote:
>> Approximately none of these changes seem correct to me.  For example,
>> why is failing to open the PID file 6, or failing to start the server 7?
> 
> Well, according to that URL, we have:
> 
>     6    program is not configured
>     7    program is not running

There is also

4    user had insufficient privilege

> I just updated the pg_ctl.c comments to at least point to a valid URL
> for this.  I think we can just call this item closed because I am still
> unclear if these return codes should be returned by pg_ctl or the
> start/stop script.
> 
> Anyway, while I do think pg_ctl could pass a little more information
> back about failure via its return code, I am unclear if LSB is the right
> approach.

Yeah, a lot of these things are unclear and not used in practice, so
it's probably better to stick to exit code 1, unless there is a clear
use case.  The "status" case is different, because there the exit code
can be passed out by the init script directly.
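
For reference, the LSB init-script action codes that came up in this thread can be collected in a small lookup, sketched below. The table and function names are hypothetical. Codes 3, 4, 6 and 7 are the ones quoted in the thread; entries 1, 2 and 5 are filled in from the same LSB table and are marked as such in the comments. These codes apply to init-script actions other than "status", which has its own table in the spec.

```python
# LSB exit codes for init-script actions other than "status".
LSB_ACTION_EXIT_CODES = {
    1: "generic or unspecified error",     # from the LSB table, not quoted here
    2: "invalid or excess argument(s)",    # from the LSB table, not quoted here
    3: "unimplemented feature",            # quoted in this thread
    4: "user had insufficient privilege",  # quoted in this thread
    5: "program is not installed",         # from the LSB table, not quoted here
    6: "program is not configured",        # quoted in this thread
    7: "program is not running",           # quoted in this thread
}


def describe_action_exit_code(code):
    """Return the LSB meaning of an init-script action exit code."""
    if code == 0:
        return "success"
    return LSB_ACTION_EXIT_CODES.get(code, "reserved or unassigned")
```

An init script wrapping pg_ctl could use such a table to translate its own findings into LSB codes while pg_ctl itself keeps returning 1 for failures, which matches the conclusion above.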