Thread: crash / data recovery issues

crash / data recovery issues

From
Robert Treat
Date:
I'm trying to do some data recovery on an 8.1.9 system. The brief history is
that the system crashed and attempted xlog replay, but that failed. I did a
pg_resetxlog to get something that would start up, and it looks as if the
indexes on pg_class have become corrupt (i.e. reindex claims duplicate rows,
which do not show up when doing count() manipulations on the data). As it
turns out, I can't drop these indexes either (the system refuses with a
message that the indexes are needed by the system). This has left the system
in an unworkable state.

I've tried to do a pg_dump, but I get a "schema with OID 96568 does not exist"
error. The database has a number (~100) of temp schemas in it, so I suspected
that the problem was some object referencing a temp schema with broken
dependencies. I looked through pg_depend for any referencing objects and
found none. I also checked the *namespace fields of pg_type, pg_proc,
pg_class, pg_constraint, pg_operator, pg_opclass and pg_conversion, and found
no matches there either. Any suggestions on what else might cause this, or
how to get past it?
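
For anyone repeating this catalog check, the following is a minimal sketch that builds a single query covering the catalogs listed above. It is not taken from the thread: the column names (relnamespace, typnamespace, and so on) are assumed from the 8.1-era system catalogs, and the function and alias names are made up for illustration. Rather than searching for OID 96568 alone, it reports any row whose namespace OID has no matching row in pg_namespace.

```python
# Catalogs named in the message, with their namespace and name columns.
# ASSUMPTION: column names as found in 8.1-era system catalogs.
NAMESPACE_COLUMNS = [
    ("pg_type", "typnamespace", "typname"),
    ("pg_proc", "pronamespace", "proname"),
    ("pg_class", "relnamespace", "relname"),
    ("pg_constraint", "connamespace", "conname"),
    ("pg_operator", "oprnamespace", "oprname"),
    ("pg_opclass", "opcnamespace", "opcname"),
    ("pg_conversion", "connamespace", "conname"),
]


def dangling_namespace_query(columns=NAMESPACE_COLUMNS):
    """Return one UNION ALL query listing rows whose namespace OID
    has no corresponding row in pg_namespace (hypothetical helper)."""
    parts = []
    for catalog, nsp_col, name_col in columns:
        parts.append(
            f"SELECT '{catalog}'::text AS source_catalog, "
            f"c.oid AS object_oid, "
            f"c.{name_col}::text AS object_name, "
            f"c.{nsp_col} AS namespace_oid "
            f"FROM pg_catalog.{catalog} c "
            f"WHERE NOT EXISTS (SELECT 1 FROM pg_catalog.pg_namespace n "
            f"WHERE n.oid = c.{nsp_col})"
        )
    return "\nUNION ALL\n".join(parts) + ";"


if __name__ == "__main__":
    # Print the SQL so it can be pasted into psql or a standalone backend.
    print(dangling_namespace_query())
```

The script only prints SQL and never connects to the database. Since the pg_class indexes are suspected to be corrupt here, results from index-based lookups may not be trustworthy, so an empty result does not prove that no dangling reference exists.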

I also did some digging to find the original error on xlog replay, and it
was "failed to re-find parent key in "763769" for split pages 21032/21033".
I'm wondering if this is actually something you can push past with
pg_resetxlog, or if I need to do a pg_resetxlog and pass in values prior to
that error point (I guess essentially letting pg_resetxlog do a lookup)...
thoughts?

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


Re: crash / data recovery issues

From
Alvaro Herrera
Date:
Robert Treat wrote:
> I'm trying to do some data recovery on an 8.1.9 system. The brief history is
> that the system crashed and attempted xlog replay, but that failed. I did a
> pg_resetxlog to get something that would start up, and it looks as if the
> indexes on pg_class have become corrupt (i.e. reindex claims duplicate rows,
> which do not show up when doing count() manipulations on the data). As it
> turns out, I can't drop these indexes either (the system refuses with a
> message that the indexes are needed by the system). This has left the system
> in an unworkable state.

You can work your way out of it by starting a standalone server with system
indexes disabled (postgres -O -P, I think) and doing a REINDEX on it (the
form of it that reindexes all system indexes -- I think it's REINDEX
DATABASE).
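
As a rough illustration of that procedure, the sketch below assembles the standalone-backend command line and the statement to feed it. It is an assumption-laden outline, not a tested recovery recipe: the data directory is the placeholder path used later in this thread, "mydb" is a hypothetical database name, and REINDEX SYSTEM is used because it is the form mentioned later in the thread. On 8.1, running the postgres executable directly starts a standalone backend; later releases require an explicit --single switch.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def standalone_reindex_plan(data_dir, dbname):
    """Return (argv, sql) for a standalone backend started with system
    indexes ignored (-P) and system table changes allowed (-O).
    Hypothetical helper: it builds the command but does not run it."""
    argv = ["postgres", "-O", "-P", "-D", data_dir, dbname]
    sql = f"REINDEX SYSTEM {quote_ident(dbname)};"
    return argv, sql


if __name__ == "__main__":
    # Placeholder values: adjust the path and database name before use.
    argv, sql = standalone_reindex_plan("/path/to/data", "mydb")
    print(" ".join(argv))
    print(sql)
```

The helper deliberately stops at printing. The regular server must be shut down before a standalone backend is started against the same data directory, and that step is best done by hand.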

> I also did some digging to find the original error on xlog replay, and it
> was "failed to re-find parent key in "763769" for split pages 21032/21033".
> I'm wondering if this is actually something you can push past with
> pg_resetxlog, or if I need to do a pg_resetxlog and pass in values prior to
> that error point (I guess essentially letting pg_resetxlog do a lookup)...
> thoughts?

You should be able to get out of that by reindexing that index.
(Actually, after you do a pg_resetxlog I think the best thing is to pg_dump
the whole thing and reload it. That gives you at least the assurance
that your FKs are not b0rked.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: crash / data recovery issues

From
Tom Lane
Date:
Robert Treat <xzilla@users.sourceforge.net> writes:
> I'm trying to do some data recovery on an 8.1.9 system.
> ...
> I also did some digging to find the original error on xlog replay, and it
> was "failed to re-find parent key in "763769" for split pages 21032/21033".

Hmm, the only known cause of that was fixed in 8.1.6.  Don't suppose you made
a copy of everything before destroying the evidence with pg_resetxlog?
If you did, any chance I could get access to it?
        regards, tom lane


Re: crash / data recovery issues

From
Robert Treat
Date:
On Wednesday 06 February 2008 13:56, Alvaro Herrera wrote:
> Robert Treat wrote:
> > it looks as if the indexes on pg_class have become corrupt (i.e. reindex
> > claims duplicate rows, which do not show up when doing count()
> > manipulations on the data). As it turns out, I can't drop these indexes
> > either (the system refuses with a message that the indexes are needed by
> > the system). This has left the system in an unworkable state.
>
> You can work your way out of it by starting a standalone server with system
> indexes disabled (postgres -O -P, I think) and doing a REINDEX on it (the
> form of it that reindexes all system indexes -- I think it's REINDEX
> DATABASE).
>

Sorry, I should have mentioned that I tried the above under
postgres -d 1 -P -O -D /path/to/data, but the reindex complains (whether I
reindex the pg_class indexes directly or do a reindex system).

Personally, I was surprised to find out it wouldn't let me drop the indexes
in this mode, but that's a different story. Oh, it's probably worth noting
that I am able to reindex other system tables this way, just not pg_class.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL