Обсуждение: More than one pg_database entry for database
Hi all, I'm trying to do a pg_dump of my database but I'm getting this error: # pg_dump misp > misp.sql pg_dump: query returned more than one (2) pg_database entry for database "misp" I can connect to the database OK, I just can't dump it. So I connected as postgres to look at the pg_database table: misp=# SELECT oid,* from pg_database; oid | datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid | datvacuumxid | datfrozenxid | datpath | datconfig | datacl ----------+-----------+--------+----------+---------------+------------- -+---------------+--------------+--------------+---------+-----------+-- --------------- 9019080 | webstats | 101 | 0 | f | t | 16974 | 427 | 427 | | | 21676718 | webstats | 1 | 0 | f | t | 16974 | 427 | 427 | | | 1 | template1 | 1 | 0 | t | t | 16974 | 427 | 427 | | | {=,postgres=CT} 16975 | template0 | 1 | 0 | t | f | 16974 | 427 | 427 | | | {=,postgres=CT} 15062595 | misp_jon2 | 101 | 0 | f | t | 16974 | 427 | 427 | | | 25823532 | misptest | 1 | 0 | f | t | 16974 | 427 | 427 | | | 3988824 | devices | 1 | 0 | f | t | 16974 | 427 | 427 | | | 30149940 | status | 1 | 0 | f | t | 16974 | 427 | 427 | | | 30153251 | mispcopy | 1 | 0 | f | t | 16974 | 427 | 427 | | | 31238435 | misp | 1 | 0 | f | t | 16974 | 440751129 | 3661976602 | | | 6790290 | misp | 1 | 0 | f | t | 16974 | 2300049162 | 1226307340 | | | 7511185 | misp_jon | 101 | 0 | f | t | 16974 | 427 | 427 | | | (12 rows) This shows 2 entries for "misp" with different OIDs. Only the first one (oid 31238435) exists in the data/base directory. So I tried to delete the other row but it doesn't work: misp=# DELETE from pg_database where oid = 6790290; DELETE 0 Several sources I found via Googling suggested this should work. But it doesn't, so I'm stuck. Things I've tried are: 1. Same thing in standalone mode - no difference 2. VACUUM on misp and pg_database 3. UPDATE on pg_database to change the name of the bogus row - this doesn't match any rows either just like the DELETE. Any help would be greatly appreciated. Thanks, James
"James Wilford" <jwilford@mistral.net> writes: > This shows 2 entries for "misp" with different OIDs. Only the first one > (oid 31238435) exists in the data/base directory. So I tried to delete > the other row but it doesn't work: > misp=# DELETE from pg_database where oid = 6790290; > DELETE 0 That's pretty strange. I wonder if the index on pg_database.oid is corrupt. It certainly seems like the one on datname must be corrupt, else it should have disallowed two such entries. You could try deleting the bogus row by selecting it by ctid instead of oid. Also see if you can REINDEX pg_database (this will only work in standalone mode I think). What PG version is this exactly? regards, tom lane
"James Wilford" <jwilford@mistral.net> writes: > Its PG version 7.3. Please tell us it's a relatively recent 7.3.x, at least ... > I've just tried all your suggestions and > unfortunately I still can't delete the bogus row. I think I managed to > reindex the system tables: Curious. I'd have thought the reindex would fail because of the duplicate datname entries. > And trying ctid didn't work, assuming this syntax is correct: > backend> delete from pg_database where ctid = '(0,37)'; Even more curious. I'm starting to think it's some sort of XID wraparound problem, except if you can see it in SELECT then why doesn't the DELETE see it? Could you show all the system columns of pg_database, viz select xmin, cmin, xmax, cmax, oid, ctid, datname from pg_database; regards, tom lane
-----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: 05 July 2007 15:53 To: James Wilford Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] More than one pg_database entry for database "James Wilford" <jwilford@mistral.net> writes: > This shows 2 entries for "misp" with different OIDs. Only the first > one (oid 31238435) exists in the data/base directory. So I tried to > delete the other row but it doesn't work: > misp=# DELETE from pg_database where oid = 6790290; DELETE 0 That's pretty strange. I wonder if the index on pg_database.oid is corrupt. It certainly seems like the one on datname must be corrupt, else it should have disallowed two such entries. You could try deleting the bogus row by selecting it by ctid instead of oid. Also see if you can REINDEX pg_database (this will only work in standalone mode I think). What PG version is this exactly? regards, tom lane Hi Tom, Its PG version 7.3. I've just tried all your suggestions and unfortunately I still can't delete the bogus row. I think I managed to reindex the system tables: backend> reindex database misp force NOTICE: relation 16416 was reindexed NOTICE: relation 1261 was reindexed NOTICE: relation 1255 was reindexed NOTICE: relation 16410 was reindexed NOTICE: relation 1247 was reindexed NOTICE: relation 1249 was reindexed NOTICE: relation 1259 was reindexed NOTICE: relation 16388 was reindexed NOTICE: relation 16390 was reindexed NOTICE: relation 16392 was reindexed NOTICE: relation 16394 was reindexed NOTICE: relation 16396 was reindexed NOTICE: relation 16398 was reindexed NOTICE: relation 16400 was reindexed NOTICE: relation 16402 was reindexed NOTICE: relation 16404 was reindexed NOTICE: relation 16406 was reindexed NOTICE: relation 16412 was reindexed NOTICE: relation 16418 was reindexed NOTICE: relation 16594 was reindexed NOTICE: relation 1260 was reindexed NOTICE: relation 16596 was reindexed NOTICE: relation 16598 was reindexed NOTICE: relation 16384 was reindexed NOTICE: relation 16386 was reindexed NOTICE: relation 1262 was reindexed NOTICE: relation 16408 was reindexed However I still can't delete the row, even from standalone mode: backend> delete from pg_database where oid = 6790290 blank 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) ---- And trying ctid didn't work, assuming this syntax is correct: backend> delete from pg_database where ctid = '(0,37)'; blank 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) ---- I might try creating a new database cluster on a test box and copying the data directory over, would this work? Could I then create a new database and just rename the data directory to the new OID? Thanks, James
-----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: 05 July 2007 17:27 To: James Wilford Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] More than one pg_database entry for database "James Wilford" <jwilford@mistral.net> writes: > Its PG version 7.3. Please tell us it's a relatively recent 7.3.x, at least ... > I've just tried all your suggestions and unfortunately I still can't > delete the bogus row. I think I managed to reindex the system tables: Curious. I'd have thought the reindex would fail because of the duplicate datname entries. > And trying ctid didn't work, assuming this syntax is correct: > backend> delete from pg_database where ctid = '(0,37)'; Even more curious. I'm starting to think it's some sort of XID wraparound problem, except if you can see it in SELECT then why doesn't the DELETE see it? Could you show all the system columns of pg_database, viz select xmin, cmin, xmax, cmax, oid, ctid, datname from pg_database; regards, tom lane It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the package. Here's the system colums: misp=# select xmin, cmin, xmax, cmax, oid, ctid, datname from pg_database; xmin | cmin | xmax | cmax | oid | ctid | datname ----------+------------+------------+------+----------+--------+-------- --- 2 | 2304646034 | 2304646034 | 0 | 9019080 | (0,3) | webstats 2 | 0 | 0 | 0 | 21676718 | (0,4) | webstats 2 | 0 | 0 | 0 | 1 | (0,6) | template1 2 | 0 | 0 | 0 | 16975 | (0,7) | template0 2 | 2425306827 | 2425306827 | 0 | 15062595 | (0,16) | misp_jon2 2 | 0 | 0 | 0 | 25823532 | (0,19) | misptest 2 | 0 | 0 | 0 | 3988824 | (0,23) | devices 45097349 | 0 | 0 | 0 | 30149940 | (0,26) | status 59253569 | 0 | 0 | 0 | 30153251 | (0,27) | mispcopy 59254726 | 0 | 0 | 0 | 31238435 | (0,28) | misp 2 | 2430588419 | 2430588419 | 0 | 6790290 | (0,37) | misp 2 | 2425306819 | 2425306819 | 0 | 7511185 | (0,38) | misp_jon (12 rows) The "webstats" database also appears twice, although we're more concerned about "misp" at the moment. James
"James Wilford" <jwilford@mistral.net> writes: > It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the > package. That was obsoleted by Red Hat well over two years ago; you are overdue for an update. > xmin | cmin | xmax | cmax | oid | ctid | datname > ----------+------------+------------+------+----------+--------+-------- > 59254726 | 0 | 0 | 0 | 31238435 | (0,28) | misp > 2 | 2430588419 | 2430588419 | 0 | 6790290 | (0,37) | misp It definitely looks to me like you have a problem with XID wraparound :-( The (0,37) row was apparently deleted long ago, but never vacuumed away, and now its xmax has wrapped around past the 2-billion-transaction event horizon, causing it to appear visible to MVCC-using queries. So there's something broken about your routine vacuuming procedures; you'd better take a look at that. As far as getting out of the situation, the only really good answer is a dump and reload. I can't think of any simple way of getting rid of the bogus row, but what you should be able to do to let pg_dump work is to rename misp to something else. You can rename it back after getting through the dump/reload, of course. regards, tom lane
Tom Lane wrote: > "James Wilford" <jwilford@mistral.net> writes: > > It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the > > package. > > That was obsoleted by Red Hat well over two years ago; you are overdue > for an update. > > > xmin | cmin | xmax | cmax | oid | ctid | datname > > ----------+------------+------------+------+----------+--------+-------- > > 59254726 | 0 | 0 | 0 | 31238435 | (0,28) | misp > > 2 | 2430588419 | 2430588419 | 0 | 6790290 | (0,37) | misp > > It definitely looks to me like you have a problem with XID wraparound :-( > The (0,37) row was apparently deleted long ago, but never vacuumed away, > and now its xmax has wrapped around past the 2-billion-transaction event > horizon, causing it to appear visible to MVCC-using queries. So there's > something broken about your routine vacuuming procedures; you'd better > take a look at that. > > As far as getting out of the situation, the only really good answer is a > dump and reload. I can't think of any simple way of getting rid of the > bogus row, but what you should be able to do to let pg_dump work is to > rename misp to something else. You can rename it back after getting > through the dump/reload, of course. Or roll the XID counter back, vacuum the table, and restore the XID to the original value. This is done with pg_resetxlog, though I am not sure if we shipped it in 7.3. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "The West won the world not by the superiority of its ideas or values or religion but rather by its superiority in applying organized violence. Westerners often forget this fact, non-Westerners never do." (Samuel P. Huntington)
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> As far as getting out of the situation, the only really good answer is a >> dump and reload. I can't think of any simple way of getting rid of the >> bogus row, but what you should be able to do to let pg_dump work is to >> rename misp to something else. You can rename it back after getting >> through the dump/reload, of course. > Or roll the XID counter back, vacuum the table, and restore the XID to > the original value. This is done with pg_resetxlog, though I am not > sure if we shipped it in 7.3. That seems fairly hazardous, in particular there might be undesirable side-effects on other system catalogs while you are running with the set-back XID counter. Also, I see no very good reason to assume that this is the only wraparound problem present in the DB. A dump and reload would probably be useful to help check for other inconsistencies. regards, tom lane
-----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Tom Lane Sent: 05 July 2007 19:17 To: Alvaro Herrera Cc: James Wilford; pgsql-admin@postgresql.org Subject: Re: [ADMIN] More than one pg_database entry for database Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> As far as getting out of the situation, the only really good answer >> is a dump and reload. I can't think of any simple way of getting rid >> of the bogus row, but what you should be able to do to let pg_dump >> work is to rename misp to something else. You can rename it back >> after getting through the dump/reload, of course. > Or roll the XID counter back, vacuum the table, and restore the XID to > the original value. This is done with pg_resetxlog, though I am not > sure if we shipped it in 7.3. That seems fairly hazardous, in particular there might be undesirable side-effects on other system catalogs while you are running with the set-back XID counter. Also, I see no very good reason to assume that this is the only wraparound problem present in the DB. A dump and reload would probably be useful to help check for other inconsistencies. regards, tom lane I'd like to try the dump and reload - the only problem is that I can't use pg_dump because of the error. And I can't rename the entry in pg_database because my update statement won't match any rows. I'm going to set up a test box with the same OS/PG version and try copying the raw data across from the data directory. It looks like the only option at the moment. And then I'll definitely upgrade! Thanks, James
"James Wilford" <jwilford@mistral.net> writes: > I'd like to try the dump and reload - the only problem is that I can't > use pg_dump because of the error. And I can't rename the entry in > pg_database because my update statement won't match any rows. You can't change the dead entry, but it should work to rename the live one. BTW, you have one of the most annoying quoting styles I have ever seen: it's impossible to tell your words from those of the message you are replying to, and not trimming the overhead from that message wastes still more of your readers' time. Please change your habits, or you are likely to find that people stop reading your mail. regards, tom lane