Re: Backends dying due to memory exhaustion--I'm stonkered
From | Doug McNaught |
---|---|
Subject | Re: Backends dying due to memory exhaustion--I'm stonkered |
Date | |
Msg-id | m3bsststrf.fsf@belphigor.mcnaught.org |
In response to | Backends dying due to memory exhaustion--I'm stonkered (Doug McNaught <doug@wireboard.com>) |
Responses | Re: Backends dying due to memory exhaustion--I'm stonkered |
List | pgsql-general |
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Doug McNaught <doug@wireboard.com> writes:
> > One funny thing is that the nightly VACUUM doesn't always fail--the
> > system will run smoothly for one to three days on average before a
> > crash.
>
> That does seem to contradict the corrupt-data theory. Do you run a
> VACUUM ANALYZE or just a plain VACUUM? If there were a persisting
> corrupted tuple, I'd expect VACUUM ANALYZE to crash always, VACUUM
> never (VACUUM doesn't inquire into the actual contents of tuples).

I'm running VACUUM, then VACUUM ANALYZE (the docs seem to suggest that
you need both). Basically my script is:

$ vacuumdb -a
$ vacuumdb -z -a

The example I sent was a crash during VACUUM.

> > That's a thought, and I will try it. I'm currently (as of yesterday's
> > crash) running with -d 2 and output sent to a logfile. Is this
> > debuglevel high enough to tell me which table contains the bad tuple,
> > if that's indeed the problem?
>
> That would tell you what query is running. It's not enough to tell you
> where VACUUM is unless you do VACUUM VERBOSE.

Which will no doubt generate reams and reams of data...

> > If I can't nail it down that way, how hard would it be to write a C
> > program to scan all the tuples in a database looking for bogus size
> > fields?
>
> Fairly hard. I'd suggest instead that you just do
> psql -c "copy FOO to stdout" dbname >/dev/null
> and try that on each table in turn to see if you get any crashes...

OK, I'll keep that in reserve.

Another thing that springs to mind--once the crash happens, the
database doesn't respond (or gives fatal errors) to new connections
and to queries on existing connections. Killing the postmaster does
nothing--I have to send SIGTERM to all backends and the postmaster in
order to get it to exit. I don't know if this helps...

-Doug
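
For readers who want to try the "copy FOO to stdout" check on each table in turn, the loop might be sketched roughly as below. This is only an illustration, not something posted in the thread: the function names (list_tables_sql, copy_sql, scan_db) are made up here, and it assumes a psql that accepts -A and -t and a server whose pg_tables view has a schemaname column, which may not hold for the installation being discussed. It stops at the first table whose COPY fails, since the message above reports that the database stops answering once a backend has crashed.

```shell
#!/bin/sh
# Sketch only: run the suggested "copy FOO to stdout" check on every table.
# Assumptions (not from the thread): psql understands -A/-t, and pg_tables
# has a schemaname column. All function names here are hypothetical.

# Print the SQL that lists user tables, one quoted "schema.table" per line.
list_tables_sql() {
    printf '%s' "SELECT quote_ident(schemaname) || '.' || quote_ident(tablename) FROM pg_tables WHERE schemaname NOT IN ('pg_catalog', 'information_schema') ORDER BY 1"
}

# Print the COPY statement for one table name.
copy_sql() {
    printf 'copy %s to stdout' "$1"
}

# Read every table of database $1 in full, discarding the rows.
# Prints "ok <table>" per readable table; on the first failure prints
# "FAILED <table>" and returns 1. Returns 2 if the table list cannot be read.
scan_db() {
    db=$1
    tables=$(psql -At -c "$(list_tables_sql)" "$db") || return 2
    # A here-document keeps the loop in the current shell, so "return" works.
    while IFS= read -r tbl; do
        [ -n "$tbl" ] || continue
        if psql -c "$(copy_sql "$tbl")" "$db" </dev/null >/dev/null; then
            echo "ok $tbl"
        else
            echo "FAILED $tbl"
            return 1
        fi
    done <<EOF
$tables
EOF
}
```

After sourcing the file, something like `scan_db dbname` would run the check; the last line printed names the table being read when the COPY failed, which is only a hint about where a bad tuple might be, not proof.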