Thread: postgresql-[any version] from FreeBSD ports - startup problems after crash
postgresql-[any version] from FreeBSD ports - startup problems after crash
From
Ruslan A Dautkhanov
Date:
Hello!

Server rebooted occasionally after power failure. And I have stale postmaster.pid file, so postmaster didn't start with error

    bill postgres[600]: [1-1] FATAL: file "postmaster.pid" already exists

I think the startup script and/or pg_ctl should check whether that process really exists and is a postmaster, so that the DBMS server starts after any hard reboot.

I changed the startup script block

    postgresql_command()
    {
        su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
    }

to

    postgresql_cmd()
    {
        su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"
    }

    postgresql_command()
    {
        if [ ".$1" = ".start" ]; then
            pidfile="${postgresql_data}/postmaster.pid"
            if [ -e ${pidfile} ]; then
                # check if the postmaster process really exists
                pid_fromfile=`head -1 ${pidfile}`
                real_pid=`ps ax | grep -v grep | grep postmaster | grep ${postgresql_data} | awk '{print $1}'`
                if [ "x${pid_fromfile}" = "x${real_pid}" ]; then
                    echo "Postmaster for datadir ${postgresql_data} is already running with pid $real_pid"
                else
                    # we have a stale pidfile, remove it
                    unlink $pidfile
                    # and run the postmaster safely
                    postgresql_cmd
                fi
            else
                # .pid file does not exist, clean startup
                postgresql_cmd
            fi
        else
            postgresql_cmd
        fi
    }

I hope that covers all cases with a stale .pid file...

--
Ruslan A Dautkhanov
Ruslan A Dautkhanov <rusland@scn.ru> writes:
> Server rebooted occasionally after power failure.
> And I have stale postmaster.pid file, so postmaster didn't start with error
> bill postgres[600]: [1-1] FATAL: file "postmaster.pid" already exists

You probably need a newer postgres version (you didn't say what you are using) and/or a more carefully written start script.

Your proposed change in the start script is useless --- do you think the postmaster doesn't check that already? Furthermore, it's actually dangerous for reasons we need not get into here; suffice to say that automated removal of that lock file is NOT a good idea.

The problem comes up when the startup timing is just different enough that the PID belonging to the postmaster in the previous boot cycle now belongs to the shell that's launching it. The postmaster sees a live process of the correct userid (ie, postgres) and has to assume that that's a pre-existing postmaster. We've fixed this in recent releases by having the postmaster also check for a match to its parent process ID (getppid).

The care in the start script comes because this only works for one level up. Therefore, you can't "su -c pg_ctl start ..." because that would create three levels of postgres-owned processes (shell, pg_ctl, postmaster) and if the PID count is off by 2 instead of 1 then we still lose. You have to invoke the postmaster directly, "su -c postmaster ...". (Hm, actually it might work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)

regards, tom lane
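The lock-file check described above can be approximated in shell, purely as an illustration. This is a minimal sketch and not the postmaster's actual code: the function name `lockfile_owner_alive` and its optional arguments are hypothetical, and it assumes the first line of the file holds the PID. It treats a PID equal to the caller's own PID or its parent's PID as recycled, which mirrors the getppid idea for one level up only. It deliberately does not remove the lock file.

```shell
#!/bin/sh
# Hypothetical helper, not part of PostgreSQL or the FreeBSD port.
# usage: lockfile_owner_alive PIDFILE [SELF_PID] [PARENT_PID]
# Returns 0 if the PID in PIDFILE looks like a live, unrelated process,
# and 1 otherwise. SELF_PID and PARENT_PID default to $$ and $PPID.
lockfile_owner_alive() {
    pidfile=$1
    self_pid=${2:-$$}
    parent_pid=${3:-$PPID}

    # No lock file: nobody can own it.
    [ -f "$pidfile" ] || return 1

    # Assumption: the first line of the file is the PID.
    old_pid=$(head -n 1 "$pidfile")

    # Empty or non-numeric content cannot name a live process.
    case $old_pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # A PID equal to ours or our parent's must have been recycled
    # since the file was written; it is not a pre-existing server.
    if [ "$old_pid" -eq "$self_pid" ] || [ "$old_pid" -eq "$parent_pid" ]; then
        return 1
    fi

    # kill -0 sends no signal; it only tests whether we could signal the PID.
    kill -0 "$old_pid" 2>/dev/null
}
```

A caller might use it as `lockfile_owner_alive "$datadir/postmaster.pid" && echo "possibly still running"`, where `$datadir` is a placeholder. Note that a grandparent process holding the old PID would still be reported as alive, which is the "off by 2" case mentioned above.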
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 09:23:33AM -0400, Tom Lane wrote:
> We've fixed this in recent releases by having the postmaster also check
> for a match to its parent process ID (getppid). The care in the start
> script comes because this only works for one level up. Therefore, you
> can't "su -c pg_ctl start ..." because that would create three levels of
> postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
> count is off by 2 instead of 1 then we still lose. You have to invoke
> the postmaster directly, "su -c postmaster ...". (Hm, actually it might
> work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)

Except that the shell that's running su would be root, not pgsql, at least in the case of FreeBSD. The guts of the current port's rc.d file are:

    su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"

--
Jim C. Nasby, Sr. Engineering Consultant    jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf    cell: 512-569-9461

"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Except that the shell that's running su would be root, not pgsql, at
> least in the case of FreeBSD. The guts of the current port's rc.d file
> are:
> su -l ${postgresql_user} -c "exec ${command} ${command_args} ${rc_arg}"

Yeah, but what's the ${command} ?

If it's pg_ctl then all he's missing is the recent change to check getppid. If it's execing postmaster directly then maybe we need another theory.

regards, tom lane
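Why the `exec` in the rc.d line matters for counting process levels can be shown with plain `sh`, without su or pg_ctl. This is only a sketch of the general shell behavior: the two function names are hypothetical, and each prints the PID of an outer shell followed by the PID of the command it launches. With `exec`, the launched command replaces the shell and keeps its PID, so there is one process fewer between the launcher and the final program.

```shell
#!/bin/sh
# Hypothetical demo functions; neither comes from the FreeBSD port.

# Without exec: the outer shell stays alive and runs the inner shell
# as a child, so the two printed PIDs differ. The trailing ":" keeps
# shells that auto-exec their last command from hiding the effect.
pids_without_exec() {
    sh -c 'echo $$; sh -c "echo \$\$"; :'
}

# With exec: the inner shell replaces the outer one in the same
# process, so the same PID is printed twice.
pids_with_exec() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}
```

Running `pids_with_exec` should print one PID twice, while `pids_without_exec` prints two different PIDs. Whether su itself adds another process level depends on the su implementation, which this sketch does not cover.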
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Larry Rosenman"
Date:
Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
>> Except that the shell that's running su would be root, not pgsql, at
>> least in the case of FreeBSD. The guts of the current port's rc.d
>> file are:
>
>> su -l ${postgresql_user} -c "exec ${command} ${command_args}
>> ${rc_arg}"
>
> Yeah, but what's the ${command} ?
>
> If it's pg_ctl then all he's missing is the recent change to check
> getppid. If it's execing postmaster directly then maybe we need
> another theory.

It's pg_ctl....

    command=${prefix}/bin/pg_ctl

--
Larry Rosenman    http://www.lerctr.org/~ler
Phone: +1 512-248-2683    E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
From
"Jim C. Nasby"
Date:
On Mon, May 15, 2006 at 02:20:51PM -0500, Larry Rosenman wrote:
> > Yeah, but what's the ${command} ?
> >
> > If it's pg_ctl then all he's missing is the recent change to check
> > getppid. If it's execing postmaster directly then maybe we need
> > another theory.
>
> It's pg_ctl....
>
> command=${prefix}/bin/pg_ctl

http://lnk.nu/freebsd.org/9fu.tmpl is the file in ports CVS.

http://jim.nasby.net/010.pgsql.sh.txt is the file as it exists on one of my systems.

--
Jim C. Nasby, Sr. Engineering Consultant    jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf    cell: 512-569-9461