Backend core dump, Please help, Urgent!
From | Matthew Hagerty |
---|---|
Subject | Backend core dump, Please help, Urgent! |
Date | |
Msg-id | 4.1.19991214124701.0416e100@mail.venux.net |
List | pgsql-interfaces |
Greetings,

If anyone could help me figure out what is going on with my PostgreSQL backend, I would greatly appreciate it! I'll try to be brief and to the point.

I work for a small company, and we created an online app for another small company that has about 300 members who access the site. I think the record for simultaneous logins is about 15, so the load is not really that great. About 3000 to 5000 records are added per month.

The app is written in PHP3-3.0.12 compiled as an Apache-1.3.6 module. The OS is FreeBSD-3.1-Release with GCC-2.7.2.1 and a PostgreSQL-6.5.1 backend. I start the postgres process at startup like this:

    su postgres -c "/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data -i > /usr/local/pgsql/postgres.log 2>&1 &"

The server is an Intel R440LX motherboard with two P2/333 CPUs, 128Meg of ECC DIMM, and three 4.5G WD SCSI drives.

The primary database and main app code were designed and written in-house; however, we do use a PHP3 program called Phorum to implement a message forum for the users. The main app database and the phorum database are two separate databases.

The app went online on August 30, 1999 and ran without incident until yesterday. At about 10am on Dec 13th, 1999, one of the programmers noticed that none of the forum messages would come up. I went to the console of the server and saw this message about 10 or 15 times:

    Dec 13 10:35:56 redbox /kernel: pid 13856 (postgres), uid 1002: exited on signal 11 (core dumped)

A ps -xa revealed about 15 or so postgres processes! I did not think postgres made any child processes?! So I stopped the web server and killed the main postgres process, which seemed to kill all the other postgres processes. I then tried to restart postgres and got an error message that was something like:

    IpcSemaphore??? - Key=54321234 Max

I could kick myself for not recording the exact message. Something to do with shared memory, I think.
Nevertheless, postgres was not going to start back up and I did not know what the error was telling me, so I had to reboot (uptime said 143 days). When the system came back up, postgres started and I tried to check whether a post to the phorum database might have caused the core dump. I executed 2 queries and then tried to query the main app database from another terminal. The main app queries were not executing, so I did a ps -xa to see what processes were running, and there were exactly 2 core-dumped (sig 11) postgres processes! So I did another query on the phorum database and got a 3rd core-dumped process!

At this point I killed all the postgres processes, restarted postgres, and tried to do a dump of the main app database. pg_dump gave an error similar to this (I kick myself again):

    Tuple 0:0 invalid, can't dump.

Since pg_dump was not going to give me a backup up to that point, I stopped postgres and issued:

    # rm -r data
    # initdb
    # createdb ipa
    # createdb phorum

Then I used the previous day's backup for the main app and just created the table structure for the phorum, since we do not back up that data. I restarted postgres and the web server and all seemed fine... until today. At 9:36am on the 14th it happened again. Again I was unable to recover the data and had to rebuild the data directory. I did not delete the data directory this time; I just moved it to another directory so I would have it. I also have the core dumps.

The only file I had to delete was the pg_log in the data directory. What is this file? It had grown to 700Meg in under 24 hours! Also, the core dump for the main app grew from 2.7Meg to over 80Meg while I was trying to dump the data.

My biggest hang-up is: why all of a sudden? We literally did not change anything! The system had been working fine since August. And now, after creating new databases, it does it again in less than 24 hours!

Also, is there some reason why the log file created by postgres does not timestamp its entries?
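On the timestamp question, one possible stopgap is to pipe the postmaster output through a small filter that stamps each line before it reaches the log file. The following is only a minimal sketch in Python, not anything that ships with PostgreSQL: the names `stamp_line` and `stamp_stream` are hypothetical, and the demo input at the bottom is made-up sample text rather than real server output.

```python
import io
import sys
from datetime import datetime


def stamp_line(line, now=None):
    """Return `line` (without its trailing newline) prefixed by a timestamp."""
    if now is None:
        now = datetime.now()
    text = line.rstrip("\n")
    return f"{now:%Y-%m-%d %H:%M:%S} {text}"


def stamp_stream(src, dst, clock=datetime.now):
    """Copy every line from `src` to `dst`, stamping each one as it arrives."""
    for line in src:
        dst.write(stamp_line(line, clock()) + "\n")
        dst.flush()  # flush per line so the log stays current after a crash


# Demo on an in-memory stream, so the sketch runs without a live postmaster.
demo_out = io.StringIO()
stamp_stream(io.StringIO("sample log line\n"), demo_out)
sys.stdout.write(demo_out.getvalue())
```

To use it as a real filter, the demo lines would presumably be replaced by `stamp_stream(sys.stdin, sys.stdout)`, with the combined stdout and stderr of the postmaster piped into the script and the script's own output redirected to the log file. Each timestamp records when the filter read the line, which can lag slightly behind when the backend wrote it.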
I will provide any table structures, core files, server logs, etc. if needed; anything that might give me an idea as to what is going on.

Thank you,
Matthew

Matthew Hagerty
Venux Technology Group
matthew@venux.net
616.458.9800