Re: buffer assertion tripping under repeat pgbench load
From | Greg Smith |
---|---|
Subject | Re: buffer assertion tripping under repeat pgbench load |
Date | |
Msg-id | 50F2474F.5040204@2ndQuadrant.com |
In response to | Re: buffer assertion tripping under repeat pgbench load (Greg Stark <stark@mit.edu>) |
Responses |
Re: buffer assertion tripping under repeat pgbench load
Re: buffer assertion tripping under repeat pgbench load |
List | pgsql-hackers |
On 12/26/12 7:23 PM, Greg Stark wrote:
> It's also possible it's a bad cpu, not bad memory. If it affects
> decrement or increment in particular it's possible that the pattern of
> usage on LocalRefCount is particularly prone to triggering it.

This looks to be the winning answer. It turns out that under extended multi-hour loads at high concurrency, something related to CPU overheating was occasionally flipping a bit. One round of compressed air for all the fans/vents, a little tweaking of the fan controls, and now the system goes >24 hours with no problems.

Sorry about all the noise over this. I do think the improved warning messages that came out of the diagnosis ideas are useful. The reworked code must slow down the checking by a few cycles, but if you care about performance these assertions are tacked onto the biggest pig around. I added the patch to the January CF as "Improve buffer refcount leak warning messages".

The sample I showed with the patch submission was a simulated one. Here's the output from the last crash before resolving the issue, where the assertion really triggered:

WARNING: buffer refcount leak: [170583] (rel=base/16384/16578, blockNum=302295, flags=0x106, refcount=0 1073741824)
WARNING: buffers with non-zero refcount is 1
TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 1712)

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
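
A quick way to see why the warning in the message above fits the bit-flip explanation: the second refcount value, 1073741824, is exactly 2^30, so it differs from zero in a single bit. The sketch below checks that arithmetic. It assumes the correct count should have been 0 and that the counter is 32 bits wide; neither is stated in the message. The helper names are hypothetical and are not part of PostgreSQL.

```python
def flipped_bits(expected: int, observed: int, width: int = 32) -> list[int]:
    """Return the bit positions at which two width-bit values differ."""
    mask = (1 << width) - 1
    diff = (expected ^ observed) & mask
    return [i for i in range(width) if (diff >> i) & 1]


def is_single_bit_flip(expected: int, observed: int, width: int = 32) -> bool:
    """True when the two values differ in exactly one bit position."""
    return len(flipped_bits(expected, observed, width)) == 1


# Second refcount value from the warning; the expected value of 0 is an assumption.
observed = 1073741824

print(hex(observed))                    # 0x40000000
print(flipped_bits(0, observed))        # [30]
print(is_single_bit_flip(0, observed))  # True
```

A value that differs from the expected one in exactly one bit is what a hardware fault of this kind would typically leave behind, whereas a software leak of buffer pins would more plausibly show a small count such as 1 or 2.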