Thread: postgresql-[any version] from FreeBSD ports - startup problems after crash

postgresql-[any version] from FreeBSD ports - startup problems after crash

From
Ruslan A Dautkhanov
Date:
Hello !

Server rebooted occasionally after power failure.
And I have stale postmaster.pid file, so postmaster didn't start with error
    bill postgres[600]: [1-1] FATAL:  file "postmaster.pid" already exists

I think the startup script and/or pg_ctl should check whether that
process really exists and is a postmaster, so that the DBMS server
starts after any hard reboot.

I changed the startup script block

    postgresql_command()
    {
        su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
    }

to

postgresql_cmd()
{
        su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
}
postgresql_command()
{
        if [ ".$1" = ".start" ]; then
                pidfile="${postgresql_data}/postmaster.pid"
                if [ -e ${pidfile} ]; then
                        #check if postmaster process really exists
                        pid_fromfile=`head -1 ${pidfile}`
                        real_pid=`ps ax | grep -v grep | grep postmaster | grep ${postgresql_data} | awk '{print $1}'`
                        if [ "x${pid_fromfile}" = "x${real_pid}" ]; then
                                echo "Postmaster for datadir ${postgresql_data} is already running with pid $real_pid"
                        else
                                #we have stale pidfile, remove it
                                unlink $pidfile
                                #and run postmaster safely
                                postgresql_cmd
                        fi
                else
                        #.pid file does not exist, clean startup
                        postgresql_cmd
                fi
        else
                postgresql_cmd
        fi
}

I hope that covers all cases with a stale .pid file...

--
Ruslan A Dautkhanov

Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From
Tom Lane
Date:
Ruslan A Dautkhanov <rusland@scn.ru> writes:
> Server rebooted occasionally after power failure.
> And I have stale postmaster.pid file, so postmaster didn't start with error
>     bill postgres[600]: [1-1] FATAL:  file "postmaster.pid" already exists

You probably need a newer postgres version (you didn't say what you are
using) and/or a more carefully written start script.

Your proposed change in the start script is useless --- do you think the
postmaster doesn't check that already?  Furthermore, it's actually
dangerous for reasons we need not get into here; suffice to say that
automated removal of that lock file is NOT a good idea.

The problem comes up when the startup timing is just different enough
that the PID belonging to the postmaster in the previous boot cycle now
belongs to the shell that's launching it.  The postmaster sees a live
process of the correct userid (ie, postgres) and has to assume that
that's a pre-existing postmaster.

We've fixed this in recent releases by having the postmaster also check
for a match to its parent process ID (getppid).  The care in the start
script comes because this only works for one level up.  Therefore, you
can't "su -c pg_ctl start ..." because that would create three levels of
postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
count is off by 2 instead of 1 then we still lose.  You have to invoke
the postmaster directly, "su -c postmaster ...".  (Hm, actually it might
work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)

            regards, tom lane
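
A minimal sketch of the kind of check described above, written in shell purely for illustration. It is not PostgreSQL's code: the postmaster does this internally in C, and the function name `pid_conflicts` and its three arguments are hypothetical. It assumes the PID is on the first line of the lock file, and it treats the file as stale when that PID is dead or when it matches the caller's own PID or parent PID (the "one level up" allowance).

```shell
#!/bin/sh
# Hypothetical helper, NOT PostgreSQL's actual implementation.
# Usage: pid_conflicts LOCK_FILE MY_PID PARENT_PID
# Returns 0 if the PID recorded in LOCK_FILE looks like another live
# process we are allowed to signal; returns non-zero if the lock file
# can be considered stale.
pid_conflicts() {
    lock_file=$1
    my_pid=$2
    parent_pid=$3

    [ -f "$lock_file" ] || return 1
    old_pid=$(head -n 1 "$lock_file")

    # Anything that is not a plain positive integer is treated as stale.
    case $old_pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # After a reboot the old PID may have been recycled for this very
    # process or for the process that launched it; neither is a conflict.
    [ "$old_pid" -eq "$my_pid" ] && return 1
    [ "$old_pid" -eq "$parent_pid" ] && return 1

    # kill -0 delivers no signal; it only reports whether the process
    # exists and may be signalled by the current user.
    kill -0 "$old_pid" 2>/dev/null
}
```

A caller would invoke it as `pid_conflicts "$PGDATA/postmaster.pid" "$$" "$PPID"`. The sketch only looks one level up, so an extra intermediate process owned by the same user can still collide with the recycled PID, which is the failure mode discussed in this message.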

Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 09:23:33AM -0400, Tom Lane wrote:
> We've fixed this in recent releases by having the postmaster also check
> for a match to its parent process ID (getppid).  The care in the start
> script comes because this only works for one level up.  Therefore, you
> can't "su -c pg_ctl start ..." because that would create three levels of
> postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
> count is off by 2 instead of 1 then we still lose.  You have to invoke
> the postmaster directly, "su -c postmaster ...".  (Hm, actually it might
> work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)

Except that the shell that's running su would be root, not pgsql, at
least in the case of FreeBSD. The guts of the current port's rc.d file
are:

su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Except that the shell that's running su would be root, not pgsql, at
> least in the case of FreeBSD. The guts of the current port's rc.d file
> are:

> su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"

Yeah, but what's the ${command} ?

If it's pg_ctl then all he's missing is the recent change to check
getppid.  If it's execing postmaster directly then maybe we need
another theory.

            regards, tom lane
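
The effect of `exec` in the `su -c "exec ..."` line can be illustrated without PostgreSQL at all. In the sketch below, plain `sh` stands in for both the shell started by `su` and for the launched command; the helper name `pids_match` is made up for this example. With `exec`, the launched command takes over the shell's PID, so no extra process level is added; without it, the command runs as a child with a different PID.

```shell
#!/bin/sh
# Illustration only: plain sh stands in for the su-started shell and
# for the command it launches.

# Each run prints two lines: the launching shell's PID, then the PID
# seen by the launched command.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
# The trailing ":" keeps the outer shell alive, so the inner command
# has to be forked as a separate child process.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

# Hypothetical helper: succeeds if both lines of its argument are equal.
pids_match() {
    first=$(printf '%s\n' "$1" | sed -n 1p)
    second=$(printf '%s\n' "$1" | sed -n 2p)
    [ "$first" = "$second" ]
}

if pids_match "$with_exec"; then
    echo "exec: command replaced the shell (same PID)"
fi
if ! pids_match "$without_exec"; then
    echo "no exec: command ran as a child (different PID)"
fi
```

This shows only the process-level mechanics; whether the resulting chain is short enough for the getppid check depends on what `${command}` itself starts.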

Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From
"Larry Rosenman"
Date:
Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
>> Except that the shell that's running su would be root, not pgsql, at
>> least in the case of FreeBSD. The guts of the current port's rc.d
>> file are:
>
>> su -l ${postgresql_user} -c "exec ${command} ${command_args}
>> ${rc_arg}"
>
> Yeah, but what's the ${command} ?
>
> If it's pg_ctl then all he's missing is the recent change to check
> getppid.  If it's execing postmaster directly then maybe we need
> another theory.

It's pg_ctl....

command=${prefix}/bin/pg_ctl


--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 512-248-2683                 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893

Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 02:20:51PM -0500, Larry Rosenman wrote:
> > Yeah, but what's the ${command} ?
> >
> > If it's pg_ctl then all he's missing is the recent change to check
> > getppid.  If it's execing postmaster directly then maybe we need
> > another theory.
>
> It's pg_ctl....
>
> command=${prefix}/bin/pg_ctl

http://lnk.nu/freebsd.org/9fu.tmpl is the file in ports CVS.
http://jim.nasby.net/010.pgsql.sh.txt is the file as it exists on one of
my systems.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: postgresql-[any version] from FreeBSD ports - startup

From
Ruslan A Dautkhanov
Date: