Thread: postgresql-[any version] from FreeBSD ports - startup problems after crash
postgresql-[any version] from FreeBSD ports - startup problems after crash
From
Ruslan A Dautkhanov
Date:
Hello!
The server rebooted unexpectedly after a power failure.
That left a stale postmaster.pid file, so the postmaster failed to start, with the error
bill postgres[600]: [1-1] FATAL: file "postmaster.pid" already exists
I think the startup script and/or pg_ctl should check whether that
process really exists
and is a postmaster, so that the DBMS server starts after any hard reboot.
I changed this block of the startup script
postgresql_command()
{
    su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
}
to
postgresql_cmd()
{
    su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
}
postgresql_command()
{
    if [ ".$1" = ".start" ]; then
        pidfile="${postgresql_data}/postmaster.pid"
        if [ -e "${pidfile}" ]; then
            # check whether the postmaster process really exists
            pid_fromfile=`head -n 1 "${pidfile}"`
            real_pid=`ps ax | grep -v grep | grep postmaster |
                grep "${postgresql_data}" | awk '{print $1}'`
            if [ "x${pid_fromfile}" = "x${real_pid}" ]; then
                echo "Postmaster for datadir ${postgresql_data} already running with pid $real_pid"
            else
                # we have a stale pidfile, remove it
                unlink "$pidfile"
                # and run the postmaster safely
                postgresql_cmd
            fi
        else
            # .pid file does not exist, clean startup
            postgresql_cmd
        fi
    else
        postgresql_cmd
    fi
}
I hope that covers all cases with a stale .pid file...
--
Ruslan A Dautkhanov
Ruslan A Dautkhanov <rusland@scn.ru> writes:
> Server rebooted occasionally after power failure.
> And I have stale postmaster.pid file, so postmaster didn't start with error
> bill postgres[600]: [1-1] FATAL: file "postmaster.pid" already exists
You probably need a newer postgres version (you didn't say what you are
using) and/or a more carefully written start script.
Your proposed change in the start script is useless --- do you think the
postmaster doesn't check that already? Furthermore, it's actually
dangerous for reasons we need not get into here; suffice to say that
automated removal of that lock file is NOT a good idea.
The problem comes up when the startup timing is just different enough
that the PID belonging to the postmaster in the previous boot cycle now
belongs to the shell that's launching it. The postmaster sees a live
process of the correct userid (ie, postgres) and has to assume that
that's a pre-existing postmaster.
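To illustrate why a recycled PID defeats this kind of check, here is a minimal sketch in sh of a liveness test on the PID recorded in a lock file. The names pid_is_alive and lockfile_status are hypothetical and are not part of PostgreSQL or the FreeBSD port; the sketch assumes only that the first line of the lock file holds the PID, as the script earlier in the thread does.

```sh
#!/bin/sh
# pid_is_alive PID: succeed if a process with that PID exists and the
# caller is allowed to signal it. Signal 0 only performs the existence
# and permission check; nothing is delivered to the target.
# Note: run as root, this succeeds for any live process at all.
pid_is_alive() {
    kill -0 "$1" 2>/dev/null
}

# lockfile_status FILE (hypothetical helper): print "absent", "in-use"
# or "stale" depending on the PID stored on the first line of FILE.
lockfile_status() {
    pidfile=$1
    [ -f "$pidfile" ] || { echo "absent"; return; }
    pid=$(head -n 1 "$pidfile")
    if pid_is_alive "$pid"; then
        # Cannot tell a real postmaster from an unrelated process that
        # happened to receive the same PID after a reboot.
        echo "in-use"
    else
        echo "stale"
    fi
}
```

The "in-use" branch is the weak point: after a reboot the recorded PID may belong to any signalable process, including the shell that is launching the server, and the test cannot tell the difference.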
We've fixed this in recent releases by having the postmaster also check
for a match to its parent process ID (getppid). The care in the start
script comes because this only works for one level up. Therefore, you
can't "su -c pg_ctl start ..." because that would create three levels of
postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
count is off by 2 instead of 1 then we still lose. You have to invoke
the postmaster directly, "su -c postmaster ...". (Hm, actually it might
work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)
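The parent-PID check can be sketched in sh as follows. This illustrates the idea only and is not the postmaster's actual code: pid_conflicts is a hypothetical name, and the shell variable $PPID stands in for getppid.

```sh
#!/bin/sh
# pid_conflicts PID (hypothetical helper): succeed only if PID belongs to
# a live process that is neither this process nor its direct parent.
# A PID equal to our own or our parent's is treated as left over from a
# previous boot cycle rather than as a competing server.
pid_conflicts() {
    other=$1
    [ "$other" = "$$" ] && return 1      # our own PID: stale entry
    [ "$other" = "$PPID" ] && return 1   # parent's PID: stale entry
    kill -0 "$other" 2>/dev/null         # anything else: is it alive?
}
```

Only one level of ancestry is excluded, so a stale PID that now belongs to a grandparent process would still be reported as a conflict, which is the limitation described above.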
regards, tom lane
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 09:23:33AM -0400, Tom Lane wrote:
> We've fixed this in recent releases by having the postmaster also check
> for a match to its parent process ID (getppid). The care in the start
> script comes because this only works for one level up. Therefore, you
> can't "su -c pg_ctl start ..." because that would create three levels of
> postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
> count is off by 2 instead of 1 then we still lose. You have to invoke
> the postmaster directly, "su -c postmaster ...". (Hm, actually it might
> work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)
Except that the shell that's running su would be root, not pgsql, at
least in the case of FreeBSD. The guts of the current port's rc.d file
are:
su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
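The effect of the exec in that line can be checked with a small experiment. The sketch below uses plain sh in place of su and pg_ctl, and the function names are hypothetical: it shows only that exec replaces the shell in place, so the command inherits the shell's PID instead of adding another process level.

```sh
#!/bin/sh
# Print the PID of a shell, then the PID of the command it starts.

# With exec: the second shell replaces the first, so both PIDs match.
pids_with_exec() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Without exec: the second shell is a child with its own PID. The
# trailing ":" stops shells that would otherwise exec the last command
# of a -c string implicitly.
pids_without_exec() {
    sh -c 'echo $$; sh -c "echo \$\$"; :'
}
```

Calling pids_with_exec prints the same number twice, while pids_without_exec prints two different numbers.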
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Except that the shell that's running su would be root, not pgsql, at
> least in the case of FreeBSD. The guts of the current port's rc.d file
> are:
> su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
Yeah, but what's the ${command} ?
If it's pg_ctl then all he's missing is the recent change to check
getppid. If it's execing postmaster directly then maybe we need
another theory.
regards, tom lane
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Larry Rosenman"
Date:
Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
>> Except that the shell that's running su would be root, not pgsql, at
>> least in the case of FreeBSD. The guts of the current port's rc.d
>> file are:
>
>> su -l ${postgresql_user} -c "exec ${command} ${command_args}
>> ${rc_arg}"
>
> Yeah, but what's the ${command} ?
>
> If it's pg_ctl then all he's missing is the recent change to check
> getppid. If it's execing postmaster directly then maybe we need
> another theory.
It's pg_ctl....
command=${prefix}/bin/pg_ctl
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 02:20:51PM -0500, Larry Rosenman wrote:
> > Yeah, but what's the ${command} ?
> >
> > If it's pg_ctl then all he's missing is the recent change to check
> > getppid. If it's execing postmaster directly then maybe we need
> > another theory.
>
> It's pg_ctl....
>
> command=${prefix}/bin/pg_ctl
http://lnk.nu/freebsd.org/9fu.tmpl is the file in ports CVS.
http://jim.nasby.net/010.pgsql.sh.txt is the file as it exists on one of
my systems.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461