Re: Elusive segfault with 9.3.5 & query cancel
From | Jim Nasby |
---|---|
Subject | Re: Elusive segfault with 9.3.5 & query cancel |
Date | |
Msg-id | 54823492.3000009@BlueTreble.com |
In response to | Re: Elusive segfault with 9.3.5 & query cancel (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Elusive segfault with 9.3.5 & query cancel
Re: Elusive segfault with 9.3.5 & query cancel |
List | pgsql-hackers |
On 12/5/14, 4:11 PM, Peter Geoghegan wrote:
> On Fri, Dec 5, 2014 at 1:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> We made some changes which decreased query cancel (optimizing queries,
>>> turning on hot_standby_feedback) and we haven't seen a segfault since
>>> then. As far as the user is concerned, this solves the problem, so I'm
>>> never going to get a trace or a core dump file.
>>
>> Forgot a major piece of evidence as to why I think this is related to
>> query cancel: in each case, the segfault was preceded by a
>> multi-backend query cancel 3ms to 30ms beforehand. It is possible that
>> the backend running the query which segfaulted might have been the only
>> backend *not* cancelled due to query conflict concurrently.
>> Contradicting this, there are other multi-backend query cancels in the
>> logs which do NOT produce a segfault.
>
> I wonder if it would be useful to add additional instrumentation so
> that even without a core dump, there was some cursory information
> about the nature of a segfault.
>
> Yes, doing something with a SIGSEGV handler is very scary, and there
> are major portability concerns (e.g.
> https://bugs.ruby-lang.org/issues/9654), but I believe it can be made
> robust on Linux. For what it's worth, this open source project offers
> that kind of functionality in the form of a library:
> https://github.com/vmarkovtsev/DeathHandler

Perhaps we should also officially recommend that production servers be set up to create core files. AFAIK the only downside is the time it would take to write a core that's huge because of shared buffers, but perhaps there's some way to avoid writing those? (That means the core won't help if the bug is due to something in a buffer, but that seems unlikely enough that the tradeoff is worth it...)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
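On the question of keeping shared buffers out of a core file: on Linux, one possible route is the per-process /proc/PID/coredump_filter bitmask described in core(5). The following is only a minimal sketch under stated assumptions, not something PostgreSQL itself does. It assumes Linux, and it assumes that the shared buffers live in a mapping the kernel classifies as shared (anonymous shared, file-backed shared, or shared huge pages); whether that holds depends on how the segment was allocated, so it would need checking on a real system. The macro and function names are made up for illustration.

```c
#include <stdio.h>

/* Bit positions as documented in core(5) for /proc/PID/coredump_filter.
 * The macro names are hypothetical; only the bit numbers come from the man page. */
#define CD_ANON_SHARED (1UL << 1)   /* anonymous shared mappings */
#define CD_FILE_SHARED (1UL << 3)   /* file-backed shared mappings */
#define CD_HUGE_SHARED (1UL << 6)   /* shared huge pages */

/* Pure helper: clear every "shared" bit and leave the other bits untouched. */
unsigned long mask_without_shared(unsigned long current)
{
    return current & ~(CD_ANON_SHARED | CD_FILE_SHARED | CD_HUGE_SHARED);
}

/* Rewrite this process's own filter so shared mappings are skipped in a core.
 * Returns 0 on success, -1 on any failure (e.g. not Linux, /proc unavailable). */
int exclude_shared_from_core(void)
{
    const char *path = "/proc/self/coredump_filter";
    unsigned long mask;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    /* The kernel reports the mask as bare hexadecimal, e.g. "00000033". */
    if (fscanf(f, "%lx", &mask) != 1) {
        fclose(f);
        return -1;
    }
    fclose(f);

    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "0x%lx\n", mask_without_shared(mask)) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Per core(5), the filter is inherited across fork() and preserved across execve(), so in principle it would be enough to set it once in the parent before any children are started (or from the shell that launches the server, by writing the mask to /proc/self/coredump_filter there). With the documented default of 0x33, the helper yields 0x31. The tradeoff is the one noted above: a core written this way says nothing about the contents of any buffer.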