Re: Serious Crash last Friday
From | Martijn van Oosterhout |
---|---|
Subject | Re: Serious Crash last Friday |
Date | |
Msg-id | 20020617174311.A31157@svana.org |
In reply to | Serious Crash last Friday ("Henrik Steffen" <steffen@city-map.de>) |
List | pgsql-general |
On Mon, Jun 17, 2002 at 08:43:37AM +0200, Henrik Steffen wrote:
>
> Hello all,
>
> on Friday we experienced a very very worrying crash of our postgresql
> server.

Sounds like the CTIDs are out of whack or something. If you're really
desperate you can try the program here; it may be able to dump something.

http://svana.org/kleptog/pgsql/pgfsck.html

> Well, the crash was indicated as follows: One of my employees complained
> that she couldn't work anymore (via webinterface). The error-message was
> due to an error in the employee-table. This particular table has a unique
> row for employee-numbers. Suddenly there were 11 entries for the same
> employee. Even my name was included twice, and another employee still
> working on friday afternoon was also included 3 times. Note: This was a
> table with a UNIQUE KEY - this shouldn't be possible IMHO.

What DB version is this? Could it be XID wraparound?

> Taking a closer look, I found additional tables, with non-unique values in
> UNIQUE columns.
>
> When trying to delete unique values by using the OIDs, I found out, that
> even the OIDs were the same!!!! Taking a yet closer look, I found out by
> querying pg_tables that there were duplicates of some tables. Then there
> was the message: "Backend message type 0x44 arrived while idle"

Try the CTIDs, they will be unique.

> I was running VACUUM and VACUUM FULL a hundred times - but it failed to
> repair these errors. It didn't even succeed in running VACUUM on all
> tables: VACUUM complained something about "UNIQUE" (I didn't write down
> the exact error message though).

Please post the message exactly as printed out.

> Then I tried to DUMP as much as I could, then I stopped the database, moved
> the db-folder to a different location, did a new initdb and restored the
> whole system. Unfortunately there was one table I couldn't dump at all and
> I had to use the 15 hours old backup copy.
>
> But, please correct me if I am wrong, this should never actually happen,
> shouldn't it?

Never, that's why it would be helpful to know what went wrong.

> Anyone had any of these problems before? I will see if this happens again -
> and if it does I will have to think about using a different backend-server.
> I'll don't have to explain to you, that a database server that corrupts
> data, is completely useless.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.
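
For readers unfamiliar with the "XID wraparound" asked about above: transaction IDs are 32-bit counters, so after enough transactions the counter starts again from the bottom. The following is only a minimal Python sketch of that arithmetic, assuming a plain 32-bit counter. The function names are made up for illustration, real PostgreSQL reserves a few special XID values that this sketch ignores, and the actual behaviour depends on the server version, which the thread does not state.

```python
XID_MODULUS = 2 ** 32  # transaction IDs are 32-bit counters


def next_xid(xid: int) -> int:
    """Advance the counter by one, wrapping to 0 past the 32-bit maximum."""
    return (xid + 1) % XID_MODULUS


def naive_precedes(a: int, b: int) -> bool:
    """Plain integer comparison: gives the wrong answer once the counter wraps."""
    return a < b


def circular_precedes(a: int, b: int) -> bool:
    """Modulo-2**32 comparison: treat (a - b) as a signed 32-bit difference."""
    diff = (a - b) % XID_MODULUS
    if diff >= 2 ** 31:
        diff -= XID_MODULUS
    return diff < 0


if __name__ == "__main__":
    old = XID_MODULUS - 10   # a transaction started shortly before the wrap
    new = old
    for _ in range(20):      # twenty more transactions push the counter past the wrap
        new = next_xid(new)

    print("old xid:", old, "new xid:", new)
    print("naive says old is older:   ", naive_precedes(old, new))
    print("circular says old is older:", circular_precedes(old, new))
```

With a plain comparison the older transaction looks newer than the later one after the wrap, which is the kind of confusion about which row versions are current that the question is probing for.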