Discussion: pgsql8b5 not launching on OSX system start; otherwise OK
hi all,
i've a new install of pgsql8b5 running on OSX 10.3.6.
i can readily start it from the command line with:
sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h
10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf
</dev/null >>/var/devlogs/postgres.log &"
after which it behaves as i'd expect =)
however, if i place an identical startup string in my OSX's StartupItem for
pgsql & reboot, pgsql does not start on boot. immediately after, i can launch
... but not on system start.
i've turned debugging (debug5, i think i got 'em all ...) on, and my
"/var/devlogs/postgres.log" after startup only shows:
LOG: logger shutting down
DEBUG: proc_exit(0)
DEBUG: shmem_exit(0)
DEBUG: exit(0)
system & kernel logs show nothing of obvious consequence ...
any suggestions as to how to track down the no-start-on-startup problem?
thx!
richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h
> 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf
> </dev/null >>/var/devlogs/postgres.log &"
Hmm, isn't this letting postmaster stderr disappear into the bit bucket?
Try adding "2>&1" after the ">>/var/devlogs/postgres.log" so you can see
if anything interesting shows up.
regards, tom lane
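Editor's note: a minimal sketch of the redirection Tom suggests, using a stand-in command rather than the real postmaster. The names fake_daemon and run_logged are hypothetical helpers made up for illustration. The point is the order of the two redirections: stdout is pointed at the file first, then stderr is duplicated onto it.

```shell
#!/bin/sh
# Stand-in for a daemon: one line to stdout, one line to stderr.
# (fake_daemon is a hypothetical helper, not part of PostgreSQL.)
fake_daemon() {
    echo "normal output"
    echo "error output" >&2
}

# Run a command with both streams appended to a log file.
# Order matters: ">>file" must come before "2>&1"; reversed, stderr
# would be duplicated from the old stdout and never reach the file.
run_logged() {
    logfile=$1
    shift
    "$@" >>"$logfile" 2>&1
}

# Demo: both lines should end up in the temporary log.
demo_log=$(mktemp)
run_logged "$demo_log" fake_daemon
cat "$demo_log"
rm -f "$demo_log"
```

Without the 2>&1, only "normal output" would land in the file and the error line would go wherever the caller's stderr points.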
hi tom,
-- On Thursday, December 2, 2004 12:33:48 PM PST -0500 Tom Lane
<tgl@sss.pgh.pa.us> wrote:
> OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
>> sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h
>> 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf
>> </dev/null >>/var/devlogs/postgres.log &"
>
> Hmm, isn't this letting postmaster stderr disappear into the bit bucket?
entirely possible, and probably probable.
(it actually was 'in there' at one point, per the distro's included startup
script ... damn that copy-n-paste!)
> Try adding "2>&1" after the ">>/var/devlogs/postgres.log" so you can see
> if anything interesting shows up.
ok, did that, and 'simplified' my cmd as much as possible ...
here's the exact c/p from my current script:
sudo -u testuser sh -c "/usr/local/pgsql/bin/postmaster -i -h 10.0.0.6 -D
/var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf &"
>>/var/devlogs/postgres.log 2>&1
which i've tried to make 'as similar as possible' to the distro's example
script:
sudo -u $PGUSER sh -c "${DAEMON} -D '${PGDATA}' &" >>$PGLOG 2>&1
given my additions of:
-n do not reinitialize shared memory after abnormal exit
-i enable TCP/IP connections
-h HOSTNAME host name or IP address to listen on
, and the spec'd config file,
mine, all in all, _looks_ ok to me.
with the aforementioned startup string, here's the tail from my
'/var/devlogs/postgres.log' immediately after a reboot, b4 starting postmaster
from the cmd line:
LOCATION: PostmasterMain, postmaster.c:644
DEBUG: 00000: -----------------------------------------
LOCATION: PostmasterMain, postmaster.c:646
DEBUG: 00000: invoking IpcMemoryCreate(size=2547712)
LOCATION: CreateSharedMemoryAndSemaphores, ipci.c:87
DEBUG: 00000: max_safe_fds = 917, usable_fds = 951, already_open = 73
LOCATION: set_max_safe_fds, fd.c:360
LOG: 00000: logger shutting down
LOCATION: SysLoggerMain, syslogger.c:361
DEBUG: 00000: proc_exit(0)
LOCATION: proc_exit, ipc.c:95
DEBUG: 00000: shmem_exit(0)
LOCATION: shmem_exit, ipc.c:126
DEBUG: 00000: exit(0)
LOCATION: proc_exit, ipc.c:113
whereas the output starting *successfully* by executing the startup script from
the cmd line is just:
LOCATION: PostmasterMain, postmaster.c:644
DEBUG: 00000: -----------------------------------------
LOCATION: PostmasterMain, postmaster.c:646
DEBUG: 00000: invoking IpcMemoryCreate(size=2547712)
LOCATION: CreateSharedMemoryAndSemaphores, ipci.c:87
DEBUG: 00000: max_safe_fds = 917, usable_fds = 951, already_open = 73
LOCATION: set_max_safe_fds, fd.c:360
note, of course, _no_ 'proc exit'.
thoughts?
richard
On Thu, Dec 02, 2004 at 12:43:57PM -0800, OpenMacNews wrote:
> given my additions of:
>
> -n do not reinitialize shared memory after abnormal exit
> -i enable TCP/IP connections
> -h HOSTNAME host name or IP address to listen on
Why don't you use postgresql.conf for this, rather than modifying the
start script?
--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No necesitamos banderas
No reconocemos fronteras" (Jorge González)
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> LOG: 00000: logger shutting down
> LOCATION: SysLoggerMain, syslogger.c:361
I should have twigged to that before --- if you're running the syslogger,
then nothing except very early startup messages is going to go to
stderr. Look in wherever you told it to put the log output.
regards, tom lane
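Editor's note: when the syslogger is collecting output, the place to look is the configured log directory rather than the file the startup script redirects to. The sketch below finds the newest log file there. The function names are hypothetical, and the "postgresql-*.log" glob is an assumption based on the log_filename pattern richard posts in his reply.

```shell
#!/bin/sh
# Print the most recently modified server log in a directory, if any.
# The "postgresql-*.log" glob is an assumption mirroring the
# log_filename pattern quoted in this thread.
newest_pg_log() {
    logdir=$1
    # ls -t sorts newest first; the error for "no match" is discarded.
    ls -t "$logdir"/postgresql-*.log 2>/dev/null | head -n 1
}

# Show the last lines of the newest log, or complain if there is none.
show_newest_pg_log() {
    latest=$(newest_pg_log "$1")
    if [ -n "$latest" ]; then
        tail -n 20 "$latest"
    else
        echo "no postgresql-*.log files in $1" >&2
        return 1
    fi
}
```

For the setup in this thread the call would be something like: show_newest_pg_log /var/devlogs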
hi tom,
>> LOG: 00000: logger shutting down
>> LOCATION: SysLoggerMain, syslogger.c:361
>
> I should have twigged to that before --- if you're running the syslogger,
> then nothing except very early startup messages is going to go to
> stderr. Look in wherever you told it to put the log output.
i thought i was, in that the startup script was 'dumping' to
/var/devlogs/postgres.log.
also, given my logging section from my conf file:
######################
## ERROR REPORTING AND LOGGING
#
log_destination = 'stderr'
# relevant when logging to stderr:
redirect_stderr = true
log_directory = '/var/devlogs'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
# relevant when logging to syslog:
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
client_min_messages = debug5
log_min_messages =debug5
log_error_verbosity = verbose
log_min_error_statement = debug5
there's been no trace of any output to any 'postgresql-%Y-%m-%d_%H%M%S.log'
files.
while stumbling around, though, i noticed that after an un-successful startup
(i.e., no pgsql launched), there, nonetheless, WAS a pgsql pid file in my
process dir. odd ... so i deleted it, rebooted, and - voila! pgsql is up &
running ... and there are now dated log files, as well.
despite being able to start/stop pgsql from cmd line at will, *something* in my
system is not removing the pid file.
although i've seen nothing pid-related in my logs, preceding my startup file
launch cmd with a pid check/delete:
if [ -f /var/run/postgresql.pid ]; then
rm -rf /var/run/postgresql.pid
fi
(launch cmd)
seems to have done the trick. i can now reboot w/ pgsql launch on start
without fail.
so,
(a) i'll now hunt-n-destroy why i'm having a lingering pid file lying around,
and why a restart-launch chokes on an existing pid, but not a cmd-line launch?
(b) i might suggest that such a check be placed in the example startup script
for safety's sake ... although you'd have to check for the defined pid
path+file, of course.
thx! for your guidance =)
cheers,
richard
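Editor's note: an unconditional rm of the PID file works as a workaround, but it would also delete the file while a server is still running. A slightly more careful variant, sketched below, removes the file only when the PID it records is not alive. The function name clear_stale_pidfile is hypothetical, and the sketch assumes it runs as root (as a StartupItem does), since kill -0 cannot probe another user's process otherwise.

```shell
#!/bin/sh
# Remove a PID file only if no live process matches the PID it records.
# Returns 0 if the file is absent or was removed, 1 if the PID is alive.
clear_stale_pidfile() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0

    pid=$(head -n 1 "$pidfile" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*)
            # Empty or non-numeric content (e.g. a file made by touch).
            rm -f "$pidfile"
            return 0
            ;;
    esac

    # kill -0 sends no signal; it only tests whether the PID exists.
    # For a process owned by another user it fails unless run as root.
    if kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is running; leaving $pidfile alone" >&2
        return 1
    fi
    rm -f "$pidfile"
}
```

In a startup script this would sit just before the launch command, for example: clear_stale_pidfile /var/run/postgresql.pid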
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> although i've seen nothing pid-related in my logs, preceding my startup file
> launch cmd with a pid check/delete:
> if [ -f /var/run/postgresql.pid ]; then
> rm -rf /var/run/postgresql.pid
> fi
> (launch cmd)
> seems to have done the trick. i can now reboot w/ pgsql launch on start
> without fail.
In that case it's a problem in your launch script. The postmaster
doesn't even know that such a file exists; it keeps its lock file
in the data directory.
regards, tom lane
hi tom,
> In that case it's a problem in your launch script. The postmaster
> doesn't even know that such a file exists; it keeps its lock file
> in the data directory.
well, hmmmm.
the launch script is currently simplified (for testing) to just the
pid-checking-if-stmt + the single line launch cmd. there's honestly not much
left to have a problem with ...
note that my cmd line refers to the conf file, which has the external PID id'd
in it:
external_pid_file = '/var/run/postgresql.pid'
i've set it up to be (eventually) watched by a watchdog app ...
so, wouldn't (a) the postmaster know abt the PID file, and (b) check for its
existence?
or am i misunderstanding the purpose/use of the external pid?
cheers,
richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> note that my cmd line refers to the conf file, which has the external
> PID id'd in it:
> external_pid_file = '/var/run/postgresql.pid'
Oh, now you tell us ;-)
Still, I'm not sure what could be the problem. The only code that
reacts to that setting is in postmaster.c:
/*
* Write the external PID file if requested
*/
if (external_pid_file)
{
FILE *fpidfile = fopen(external_pid_file, "w");
if (fpidfile)
{
fprintf(fpidfile, "%d\n", MyProcPid);
fclose(fpidfile);
/* Should we remove the pid file on postmaster exit? */
}
else
write_stderr("%s: could not write external PID file \"%s\": %s\n",
progname, external_pid_file, strerror(errno));
}
I suppose that the fopen might have failed (maybe the original pid file
wasn't writable by the postmaster??), but why wouldn't it have printed
an error message and kept going?
regards, tom lane
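Editor's note: the fopen(..., "w") above can only succeed if the user running the postmaster may write the existing file, or may create it in its directory. A rough way to check that from the shell is sketched below. The function name pidfile_writable is hypothetical, and the test reflects the permissions of whoever runs it, so it would need to be run as the postmaster's user to be meaningful.

```shell
#!/bin/sh
# Report whether the current user could (re)write an external PID file.
# An existing file needs write permission on the file itself; a missing
# file needs write and search permission on its directory.
pidfile_writable() {
    pidfile=$1
    dir=$(dirname "$pidfile")
    if [ -e "$pidfile" ]; then
        [ -w "$pidfile" ]
    else
        [ -w "$dir" ] && [ -x "$dir" ]
    fi
}
```

Saved into a script with a final line such as: pidfile_writable /var/run/postgresql.pid && echo writable, it could be run via sudo -u testuser to see the answer from the postmaster's point of view.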
hi,
>> note that my cmd line refers to the conf file, which has the external
>> PID id'd in it:
>
>> external_pid_file = '/var/run/postgresql.pid'
> Oh, now you tell us ;-)
heh. sorry -- just thought it was SOP.
in case you haven't noticed, i'm at that 'wunnerful' ramp-up stage that i dunno
what i dunno ... or ... er ... or know what i should know ... or somesuch ...
=8-D
> write_stderr("%s: could not write external PID file \"%s\": %s\n",
> progname, external_pid_file, strerror(errno));
> }
simple enuf ...
> I suppose that the fopen might have failed (maybe the original pid file
> wasn't writable by the postmaster??),
just checked -- looks ok. PID is properly 'owned & operated' by the postmaster
superuser defined in the launch command
> but why wouldn't it have printed an error message and kept going?
that's the rub. i'd expect to see it in the logs, as well.
i just did a simple experiment.
disable PIDfile check/delete in startup script
stop postgres
delete PIDfile (if still there)
reboot
---> postgres launches OK
verify PIDfile exists ... it does
---> can start/stop pgsql at will @ cmd line
stop postgres
touch PIDfile (if _not_ there)
reboot
--> NO launch, nothing in the logs
verify PIDfile exists ... it does
---> can start/stop pgsql at will @ cmd line
reboot
--> still NO launch, nothing in the logs
verify PIDfile still exists ... it does
---> can start/stop pgsql at will @ cmd line
stop postgres
delete PIDfile
reboot
--> back to normal
all reproducible.
imho, it's acting like the cmd line launch is working with a different PID file
... somethin's wonky.
so,
(1) i have a workaround for the moment via the script check (couldn't hurt,
really, to add the check to the startup script ...)
(2) since i've been appropriately mangling my system while getting this all
running, i think it may be time for a wipe-n-reinstall ... who knows what i've
done to myself?
as you've mentioned, i wonder if i've an odd permission on a process or log dir
somewhere ...
cheers,
richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> stop postgres
> touch PIDfile (if _not_ there)
> reboot
> --> NO launch, nothing in the logs
> verify PIDfile exists ... it does
But who is it owned by, and with what permissions? If you do the
"touch" as some other user than the postmaster runs as, it's very
plausible the postmaster can't write the file. (That doesn't yet
explain why it goes south afterward, but first we need to understand
the conditions that make it fail.)
regards, tom lane
hi,
> But who is it owned by, and with what permissions?
same owner as postmaster, 0644 or 0600
> If you do the "touch" as some other user than the postmaster runs as, it's very
> plausible the postmaster can't write the file. (That doesn't yet
> explain why it goes south afterward, but first we need to understand
> the conditions that make it fail.)
yup. agreed.
postmaster launched as 'testuser', pidfile touched as:
sudo -u testuser touch /var/run/postgresql.pid
resulting in:
-rw-r--r-- 1 testuser testuser 4 Dec 2 14:07 postgresql.pid
fwiw, i've got a clean build under way on another box: pgsql, prereqs and dir
hierarchy will all be 'fresh'. we'll see if it's me (betcha! there's been a LOT
going on on _this_ box ... more than pgsql) or the code ...
richard
(From someone else who doesn't know what doesn't know, ... :-/)
> sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster [...]
...
> >> note that my cmd line refers to the conf file, which has the external
> >> PID id'd in it:
> >
> >> external_pid_file = '/var/run/postgresql.pid'
> ...
> just checked -- looks ok. PID is properly 'owned & operated' by the postmaster
> superuser defined in the launch command
Who owns /var/run? What group? Does testuser have permission to delete
files there? (May need to add testuser to the wheel or admin group?)
Another thought, try su -c instead of sudo?
(See warning on first line. It's been a while since I've mucked that deep
in the Mac OS X configurations, and my box is still on 10.2, so I'm
probably just blowing smoke.)
--
Joel Rees <rees@ddcom.co.jp>
digitcom, inc. 株式会社デジコム
Kobe, Japan +81-78-672-8800
** <http://www.ddcom.co.jp> **
hi joel,
>> just checked -- looks ok. PID is properly 'owned & operated' by the
>> postmaster superuser defined in the launch command
>
> Who owns /var/run? What group? Does testuser have permission to delete
> files there? (May need to add testuser to the wheel or admin group?)
good points =) already done, tho ...
% ls -ald /var/run
drwxrwxr-x 29 root daemon 986 Dec 2 20:53 /var/run
% niutil -read / /groups/daemon
name: daemon
gid: 1
passwd: *
users: root testuser
> Another thought, try su -c instead of sudo?
afaik, shouldn't make a diff, as testuser is in /etc/sudoers ...
thx!
> Kobe, Japan <--- *there's* the beef ... :p
cheers,
richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> i've a new install of pgsql8b5 running on OSX 10.3.6.
> ...
> however, if i place an identical startup string in my OSX's StartupItem for
> pgsql & reboot, pgsql does not start on boot.
I was trying to reproduce this on my own machine, but couldn't get out
of the starting gate. I put an executable shell script into
"/System Folder/Startup Items", but I couldn't see any evidence that the
system paid any attention to it at all. Exactly what are you doing to
tell OSX to run a bit of shell script at boot time?
regards, tom lane
>> i've a new install of pgsql8b5 running on OSX 10.3.6.
>> ...
>> however, if i place an identical startup string in my OSX's
>> StartupItem for pgsql & reboot, pgsql does not start on boot.
>
> I was trying to reproduce this on my own machine, but couldn't get out
> of the starting gate. I put an executable shell script into
> "/System Folder/Startup Items", but I couldn't see any evidence that
> the system paid any attention to it at all. Exactly what are you doing to
> tell OSX to run a bit of shell script at boot time?
You basically put formatted files in a directory under /Library/StartupItems.
One file should contain scripts for starting, stopping and restarting the
service; the other should contain some generic stuff in a "Plist" file / XML
file.
I use the startup package from Liyanage (site down alas).
sudo systemstarter -help
will give you some info on how to test this without rebooting.
HTH,
Philippe Schmid
hi tom,
>> however, if i place an identical startup string in my OSX's StartupItem for
>> pgsql & reboot, pgsql does not start on boot.
>
> I was trying to reproduce this on my own machine, but couldn't get out
> of the starting gate. I put an executable shell script into
> "/System Folder/Startup Items", but I couldn't see any evidence that the
> system paid any attention to it at all. Exactly what are you doing to
> tell OSX to run a bit of shell script at boot time?
wrong location/folder name ... 'System Folder' is an OS9 construct
(you haven't installed OS9 and OSX on the same partition, now, have you? tsk,
tsk ... ;-) )
OSX userland startup scripts need to go into
/Library/StartupItems/SCRIPTNAME
(System startup scripts go in '/System/Library/StartupItems/SCRIPTNAME' but we
users are supposed to stay out o' there. BUT, you should always check to make
sure your userland script isn't conflicting with an Apple-installed flavor in
/System/...)
in the SCRIPTNAME dir you need two files:
(1) 'SCRIPTNAME', containing your script
(2) 'StartupParameters.plist', a text or XML-formatted parameter file
perms/ownership should be:
chown -R root:wheel /Library/StartupItems/SCRIPTNAME
chmod 755 /Library/StartupItems/SCRIPTNAME
chmod 755 /Library/StartupItems/SCRIPTNAME/SCRIPTNAME
chmod 644 /Library/StartupItems/SCRIPTNAME/StartupParameters.plist
fyi: here's an O'Reilly blurb with way more info than you want to know ...
<http://www.macdevcenter.com/pub/a/mac/2003/10/21/startup.html>
HTH!
richard
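Editor's note: the layout richard describes can be sketched as a small helper that builds the directory, the script and the plist with the modes he lists. Everything here is illustrative: make_startup_item is a hypothetical name, the script and plist bodies are placeholders, and ROOT would be /Library/StartupItems on a real system (a scratch directory is safer for trying it out).

```shell
#!/bin/sh
# Create a StartupItem skeleton:
#   ROOT/NAME/NAME                     (the script, mode 755)
#   ROOT/NAME/StartupParameters.plist  (the parameter file, mode 644)
# Script and plist bodies are placeholders to be filled in by hand.
make_startup_item() {
    root=$1
    name=$2
    item="$root/$name"

    mkdir -p "$item" || return 1

    cat >"$item/$name" <<'EOF'
#!/bin/sh
. /etc/rc.common
StartService ()   { :; }   # placeholder
StopService ()    { :; }   # placeholder
RestartService () { :; }   # placeholder
RunService "$1"
EOF

    printf '{\n  Description = "%s";\n  Provides = ("%s");\n}\n' \
        "$name" "$name" >"$item/StartupParameters.plist"

    chmod 755 "$item" "$item/$name"
    chmod 644 "$item/StartupParameters.plist"
    # Ownership needs root and is left to the caller:
    #   chown -R root:wheel "$item"
}
```

For example, make_startup_item /tmp/StartupItems PostgreSQL builds the skeleton under /tmp for inspection before anything is copied into /Library/StartupItems.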
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> fyi: here's an O'Reilly blurb with way more info than you want to know ...
> <http://www.macdevcenter.com/pub/a/mac/2003/10/21/startup.html>
After eyeballing that, I think I have no hope of reproducing your test
conditions unless you show me the exact script and property list files
you used. In particular, I was wondering if the problem could be
related to launching the postmaster in advance of some system service it
needs; without seeing the Requires/Uses specs you gave, there's no way
to know what might have happened.
BTW, that page also references this Apple document saying that
StartupItems are being obsoleted:
http://developer.apple.com/documentation/macosx/Conceptual/BPSystemStartup/Concepts/BootProcess.html#//apple_ref/doc/uid/20002130/CJBBICAB
However, they should still work as of 10.3.*, so that's just an
interesting tidbit for the future.
regards, tom lane
hi tom,
> After eyeballing that, I think I have no hope of reproducing your test
> conditions unless you show me the exact script and property list files
> you used.
certainly easy enuf. though *i'm* not certain *i* have hope of reproducing
much after today's shenanigans ... jeeesh!
fyi, the latest versions (as you might suspect, they've been rather dynamic of
late ...) are:
% vi /Library/StartupItems/PostgreSQL/PostgreSQL
------------------------------------------
#!/bin/sh
. /etc/rc.common
StartService () {
if [ "${POSTGRESQL:=-NO-}" = "-YES-" ]; then
ConsoleMessage "Starting PgSQL"
if [ -f /var/run/postgresql.pid ]; then
ConsoleMessage "clearing PgSQL PIDfile"
rm -f /var/run/postgresql.pid
fi
sudo -u testuser sh -c "/usr/local/pgsql/bin/postmaster -n -i -h 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf &" >>/var/devlogs/postgres.log 2>&1
fi
}
StopService () {
ConsoleMessage "Stopping PgSQL"
sudo -u testuser $POSTGRE_DAEMON stop -D /var/data/pgsql -s -m fast
}
RestartService ()
{
if [ "${POSTGRESQL:=-NO-}" = "-YES-" ]; then
ConsoleMessage "Restarting PgSQL"
sudo -u testuser /usr/local/pgsql/bin/pg_ctl restart -D /var/data/pgsql -s -m fast
else
StopService
fi
}
RunService "$1"
------------------------------------------
and,
% vi /Library/StartupItems/PostgreSQL/StartupParameters.plist
------------------------------------------
{
Description = "PgSQL DatabaseServer";
Provides = ("PgSQL", "DatabaseServer");
Requires = ("Disks", "Resolver");
Uses = ("NFS", "NetworkTime");
OrderPreference = "Late";
Messages =
{
start = "Starting PgSQL";
stop = "Stopping PgSQL";
};
}
------------------------------------------
where, of course
% vi /etc/hostconfig
------------------------------------------
+++ POSTGRESQL=-YES-
------------------------------------------
> In particular, I was wondering if the problem could be
> related to launching the postmaster in advance of some system service it
> needs; without seeing the Requires/Uses specs you gave, there's no way
> to know what might have happened.
i've had that issue in the past ... primarily related to partitions with DIRs
symlinked elsewhere not spinning up fast enuf.
long story short, i took care of it (chats on the Apple kernel board) and
hasn't been an issue since ...
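Editor's note: one generic way to guard against a volume that is not mounted yet is to poll for a file that must exist before launching. The sketch below is an assumption about how such a guard could look, not what richard actually did; wait_for_path is a hypothetical name.

```shell
#!/bin/sh
# Wait until a path exists, polling once per second, up to a timeout.
# Returns 0 as soon as the path appears, 1 if the timeout expires.
wait_for_path() {
    target=$1
    timeout=${2:-30}
    waited=0
    while [ ! -e "$target" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In a startup script it might be used as: wait_for_path /var/data/pgsql/PG_VERSION 60 || exit 1, since PG_VERSION is present in any initialized data directory.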
> BTW, that page also references this Apple document saying that
> StartupItems are being obsoleted:
> http://developer.apple.com/documentation/macosx/Conceptual/BPSystemStartup/Concepts/BootProcess.html#//apple_ref/doc/uid/20002130/CJBBICAB
> However, they should still work as of 10.3.*, so that's just an
> interesting tidbit for the future.
yeah, yeah ;-) everything under xinetd, eventually ...
one thing at a time -- i'm running outa beer here!
cheers,
richard