Re: error: could not find pg_class tuple for index 2662
From | daveg |
---|---|
Subject | Re: error: could not find pg_class tuple for index 2662 |
Date | |
Msg-id | 20110801030630.GG15578@sonic.net |
In reply to | Re: error: could not find pg_class tuple for index 2662 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: error: could not find pg_class tuple for index 2662 |
List | pgsql-hackers |
On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
> daveg <daveg@sonic.net> writes:
> > Here is the update: the problem happens with vacuum full alone, no reindex
> > is needed to trigger it. I updated the script to avoid reindexing after
> > vacuum. Over the past two days there are still many occurrences of this
> > error coincident with the vacuum.
>
> Well, that jibes with the assumption that the one case we saw in
> the buildfarm was the same thing, because the regression tests were
> certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
> But it doesn't get us much closer to understanding what's happening.
> In particular, it seems to knock out most ideas associated with race
> conditions, because the VAC FULL should hold exclusive lock on pg_class
> until it's completely done (including index rebuilds).
>
> I think we need to start adding some instrumentation so we can get a
> better handle on what's going on in your database. If I were to send
> you a source-code patch for the server that adds some more logging
> printout when this happens, would you be willing/able to run a patched
> build on your machine?

Yes, we can run an instrumented server so long as the instrumentation does
not interfere with normal operation. However, scheduling downtime to switch
binaries is difficult, and generally needs to happen on a weekend, but
sometimes can be expedited. I'll look into that.

> (BTW, just to be perfectly clear ... the "could not find pg_class tuple"
> errors always mention index 2662, right, never any other number?)

Yes, only index 2662, never any other.

I'm attaching a somewhat redacted log for two different databases on the same
instance around the time of vacuum full of pg_class in each database.
My observations so far are:

 - the error occurs at commit of vacuum full of pg_class
 - in these cases the error hits autovacuum after it waited for a lock on
   pg_class
 - in these two cases there was a new process startup while the vacuum was
   running. Don't know if this is relevant.
 - while these hit autovacuum, the error does hit other processes (just not in
   these sessions). Unknown if autovacuum is a required component.

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.