Обсуждение: duplicate primary index in bayes db from SpamAssassin
Dear list, I don't know whether pgsql-admin or the SA list is the correct one to ask, so I start here. I have a bayes Postgres database for SpamAssassin, which I use shared from 2 servers to have the same content. I think this could be a locking issue or whatever, I don't care. BUT: I have lots of "duplicate index" elements now, preventing a restore or cleanup. Here's the dump and diff if anyone cares: http://zmi.at/x/bayes_dup_idx.dump.bz2 (upload currently running) http://zmi.at/x/bayes_dup_idx.diff.bz2 Could the problem be caused by any settings to the DB, or because of the stored procedures? Normally there should be no way this can happen, so what can cause this? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
Michael Monnerie <michael.monnerie@it-management.at> writes: > care. BUT: I have lots of "duplicate index" elements now, preventing a > restore or cleanup. What PG version is this? regards, tom lane
Michael Monnerie wrote: > Dear list, I don't know whether pgsql-admin or the SA list is the > correct one to ask, so I start here. I have a bayes Postgres database > for SpamAssassin, which I use shared from 2 servers to have the same > content. I think this could be a locking issue or whatever, I don't > care. What do you mean "shared from 2 servers"? Are these two servers using a shared network mount with the database files? Or do you mean that these two servers connect as clients to a single Postgres server? What versions of things are all these running? Please be as specific as humanly possible. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Montag, 14. April 2008 Tom Lane wrote: > What PG version is this? Sorry: # select version(); version ------------------------------------------------------------------------------------------------------------ PostgreSQL 8.1.11 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux) It's the RPM installed from openSUSE 10.2 mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Montag, 14. April 2008 Alvaro Herrera wrote: > What do you mean "shared from 2 servers"? Are these two servers > using a shared network mount with the database files? Or do you mean > that these two servers connect as clients to a single Postgres > server? Two servers have SA running, and connect via different users (who have the same access rights) to the same Postgres server running the bayes db. > What versions of things are all these running? Please be as specific > as humanly possible. # select version(); version ------------------------------------------------------------------------------------------------------------ PostgreSQL 8.1.11 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux) It's the RPM installed from openSUSE 10.2 The SA versions are 3.2.0 from one server and 3.2.4 from the other. mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Montag, 14. April 2008 Tom Lane wrote: > What PG version is this? I've looked a little more into history, and PostgreSQL was 8.1.5 originally from CD, then 8.1.9 and now 8.1.11. I'm currently investigating a way to delete duplicates from the table. Is there any chance to do that? A SELECT * from bayes_token; shows no dups, as there's a primary index... at least it works here. This is the bayes_token table and its index: CREATE TABLE bayes_token ( id integer DEFAULT 0 NOT NULL, token bytea DEFAULT ''::bytea NOT NULL, spam_count integer DEFAULT 0 NOT NULL, ham_count integer DEFAULT 0 NOT NULL, atime integer DEFAULT 0 NOT NULL ); ALTER TABLE ONLY bayes_token ADD CONSTRAINT bayes_token_pkey PRIMARY KEY (id, token); Could I create a new db without indices, restore the data there, and then run a "delete from.. where duplicates"? Sometimes there's 3x the same content in the primary index, but I'd need to delete all except the one with the highest spam_count. I just found I even have a duplicate in bayes_vars: COPY bayes_vars (id, username, spam_count, ham_count, token_count, last_expire, last_atime_delta, last_expire_reduce, oldest_token_age, newest_token_age) FROM stdin; 1 vscan 194393 517531 2602114 1206030039 2764800 15304 1203265204 1206057801 1 vscan 194398 517535 2602799 1206030039 2764800 15304 1203265204 1206064729 4 vscan 18305 25403 2042983 1208143427 1382400 13268 1206501543 1208216540 \. Definition: CREATE TABLE bayes_vars ( id serial NOT NULL, username character varying(200) DEFAULT ''::character varying NOT NULL, spam_count integer DEFAULT 0 NOT NULL, ham_count integer DEFAULT 0 NOT NULL, token_count integer DEFAULT 0 NOT NULL, last_expire integer DEFAULT 0 NOT NULL, last_atime_delta integer DEFAULT 0 NOT NULL, last_expire_reduce integer DEFAULT 0 NOT NULL, oldest_token_age integer DEFAULT 2147483647 NOT NULL, newest_token_age integer DEFAULT 0 NOT NULL ); ALTER TABLE ONLY bayes_vars ADD CONSTRAINT bayes_vars_pkey PRIMARY KEY (id); But here, I could see the dups with SELECT, and even delete one record: # select * from bayes_vars; id | username | spam_count | ham_count | token_count | last_expire | last_atime_delta | last_expire_reduce | oldest_token_age | newest_token_age ----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------ 1 | vscan | 194393 | 517531 | 2602114 | 1206030039 | 2764800 | 15304 | 1203265204 | 1206057801 1 | vscan | 194398 | 517535 | 2602799 | 1206030039 | 2764800 | 15304 | 1203265204 | 1206064729 4 | vscan | 18375 | 25828 | 2050196 | 1208229525 | 1382400 | 13268 | 1206501543 | 1208240610 (3 Zeilen) bayes_pg_v1=# delete from bayes_vars where spam_count =194393; DELETE 1 bayes_pg_v1=# select * from bayes_vars; id | username | spam_count | ham_count | token_count | last_expire | last_atime_delta | last_expire_reduce | oldest_token_age | newest_token_age ----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------ 1 | vscan | 194398 | 517535 | 2602799 | 1206030039 | 2764800 | 15304 | 1203265204 | 1206064729 4 | vscan | 18375 | 25829 | 2050215 | 1208229525 | 1382400 | 13268 | 1206501543 | 1208240637 (2 Zeilen) mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Montag, 14. April 2008 Michael Monnerie wrote: > > What do you mean "shared from 2 servers"? Are these two servers > > using a shared network mount with the database files? Or do you > > mean that these two servers connect as clients to a single Postgres > > server? > > Two servers have SA running, and connect via different users (who > have the same access rights) to the same Postgres server running the > bayes db. SA=SpamAssassin, I should possibly explain. There are stored procedures in the DB which get called, could the problem be connected to this? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Montag, 14. April 2008 Michael Monnerie wrote: > problem with duplicate primary index entries... I just got a warning in my Log: Apr 16 01:52:19 db1.zmi.at postgres[21103]: [3-1] DB=bayes_pg_v1_dupidx U=postgres H=[local] 48053eaf.526f VACUUM:WARNUNG: Index »bayes_token_pkey« enthält 2539650 Zeilenversionen, aber Tabelle enthält 4752337 Zeilenversionen (Index contains 2539650 rows, but table has 4752337 rows) BTW: I know that the DB language cannot be changed without recreating, but can I change the language of the log messages? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Mittwoch, 16. April 2008 Michael Monnerie wrote: > problem with duplicate primary index entries... I created the bayes db fresh, named it bayes_pg_v2 and re-relearned all ham/spam from 3 concurrent connections into the db. No problem so far. But I renamed the old, broken db and tried a and got this: INFO: »bayes_token«: 20567514 entfernbare, 4752337 nicht entfernbare Zeilenversionen in 186274 Seiten gefunden DETAIL: 0 tote Zeilenversionen können noch nicht entfernt werden. Nicht entfernbare Zeilenversionen sind zwischen 56 und 56 Bytes lang. Es gibt 0 unbenutzte Itemzeiger. Gesamter freier Platz (einschließlich entfernbare Zeilenversionen) ist 1154820852 Bytes. 33480 Seiten sind leer oder werden leer werden, einschließlich 0 am Ende der Tabelle. 186149 Seiten mit 1154819352 freien Bytes sind mögliche Ziele zum Verschieben. It looks like autovacuum - which is on - could not be run, and the db growed a lot, and now a VACUUM cannot finish because of the duplicate index: INFO: Index »bayes_token_pkey« enthält 2539650 Zeilenversionen in 76273 Seiten DETAIL: 0 Indexzeilenversionen wurde entfernt. 11727 Indexseiten wurden gelöscht, 11727 sind gegenwärtig wiederverwendbar. CPU 10.44s/2.49u sec elapsed 172.59 sec. WARNUNG: Index »bayes_token_pkey« enthält 2539650 Zeilenversionen, aber Tabelle enthält 4752337 Zeilenversionen TIP: Bauen Sie den Index mit REINDEX neu. But REINDEX doesn't work of course, because of the duplicates... I'd like to drop that broken db, or should I keep it for analyzation? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
Michael Monnerie <michael.monnerie@it-management.at> writes: > I'd like to drop that broken db, or should I keep it for analyzation? Looking back over the thread, it seems that the problems you saw could be explained by the VACUUM-related bugs we fixed in 8.1.9 and 8.1.10; although obviously this theory requires assuming that the dups were there before you updated to 8.1.11. You didn't say much about time frame so I don't know if that's plausible. This latest bit with many fewer index entries than table entries is just weird, though. The 8.1.10 VACUUM bug could have led to having *more* index entries than table entries, but not the other way around. Before you zap the database, would you be willing to make the actual database files (not a dump, but the stored files) available? I'd be interested to take a closer look at that index. regards, tom lane
On Donnerstag, 17. April 2008 Tom Lane wrote: > Before you zap the database, would you be willing to make the actual > database files (not a dump, but the stored files) available? I'd > be interested to take a closer look at that index. Yes, could you please give me the quick hint how to find out which SELECT I must do to find the correct directory to a database? I saw it once, but... many other things got stored in my brain in between, so I VACUUMed that information out ;-) mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
On Donnerstag, 17. April 2008 Tom Lane wrote: > This latest bit with many fewer index entries than table entries is > just weird, though. The 8.1.10 VACUUM bug could have led to having > *more* index entries than table entries, but not the other way > around. Somehow I can think of it like that: SA didn't find user "vscan" anymore, so created a new entry for it. And then it created all the new learned ham/spam tokens again despite they were there already. Must have to do with not finding it from the index or so. It would be nice if pg_dump would FAIL on such a broken database. Like this, I would have become informed about the problem. The bayes db isn't quite interactive, and SA seems to ignore errors. The reason I found it was that our SPAM filter started to let spam through which did not happen for years (yes, we have *very hard* anti spam settings, our customers love it that way - no spam). So I got aware that bayes got stupid and then I found the mess... mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
Michael Monnerie <michael.monnerie@it-management.at> writes: > On Donnerstag, 17. April 2008 Tom Lane wrote: >> Before you zap the database, would you be willing to make the actual >> database files (not a dump, but the stored files) available? �I'd >> be interested to take a closer look at that index. > Yes, could you please give me the quick hint how to find out which > SELECT I must do to find the correct directory to a database? I saw it > once, but... many other things got stored in my brain in between, so I > VACUUMed that information out ;-) SELECT oid, datname FROM pg_database; The stuff I need to look at is under $PGDATA/base/OID/. Please also send the results of SELECT relfilenode, relname FROM pg_class; in the busted database --- that'll save trying to claw it out of pg_class by hand. regards, tom lane
I wrote: > This latest bit with many fewer index entries than table entries is just > weird, though. The 8.1.10 VACUUM bug could have led to having *more* > index entries than table entries, but not the other way around. > Before you zap the database, would you be willing to make the actual > database files (not a dump, but the stored files) available? I'd > be interested to take a closer look at that index. I finally got a chance to do some analysis, and what I find is multiple instances of index corruption. There are 18 index pages that are all zeroes: Bogus pd_special pointer on page 65663 Bogus pd_special pointer on page 65847 Bogus pd_special pointer on page 65932 Bogus pd_special pointer on page 65948 Bogus pd_special pointer on page 65956 Bogus pd_special pointer on page 65967 Bogus pd_special pointer on page 65974 Bogus pd_special pointer on page 66066 Bogus pd_special pointer on page 66456 Bogus pd_special pointer on page 66775 Bogus pd_special pointer on page 67979 Bogus pd_special pointer on page 68126 Bogus pd_special pointer on page 68501 Bogus pd_special pointer on page 68565 Bogus pd_special pointer on page 68854 Bogus pd_special pointer on page 70321 Bogus pd_special pointer on page 70326 Bogus pd_special pointer on page 75066 It is possible to have a zero page in an index, but only if a backend extended the index by one page and then crashed before it could write a WAL record about the added page. Unless you've been having a whole lot of PANIC crashes you didn't mention, there's something else going on here. I also found a bunch of places where the sibling links between pages are wrong/inconsistent: block 1174 has left link 27874, but block 27874 has right link 66918 block 1309 has right link 59817, but block 59817 has left link 46980 block 2476 has right link 27885, but block 27885 has left link 71959 block 4489 has right link 59819, but block 59819 has left link 61853 block 7591 has right link 25594, but block 25594 has left link 27816 block 20986 has right link 1174, but block 1174 has left link 27874 block 24616 has left link 1309, but block 1309 has right link 59817 block 25416 has left link 5155, but block 5155 has right link 59809 block 27484 has left link 4489, but block 4489 has right link 59819 block 27816 has left link 7591, but block 7591 has right link 25594 block 43322 has left link 2476, but block 2476 has right link 27885 block 59809 has right link 25416, but block 25416 has left link 5155 block 59810 has left link 15449, but block 15449 has right link 39432 block 59810 has right link 23744, but block 23744 has left link 40043 block 59811 has left link 10693, but block 10693 has right link 24259 block 59811 has right link 24259, but block 24259 has left link 10693 block 59812 has left link 17191, but block 17191 has right link 68457 block 59812 has right link 24704, but block 24704 has left link 68457 block 59813 has left link 20391, but block 20391 has right link 21998 block 59813 has right link 27337, but block 27337 has left link 52006 block 59814 has left link 6918, but block 6918 has right link 17551 block 59814 has right link 17551, but block 17551 has left link 6918 block 59815 has left link 16727, but block 16727 has right link 32141 block 59815 has right link 22466, but block 22466 has left link 32141 THe fact that blocks between 59809 and 59819 are involved in the majority of these cases is interesting. I'm not sure what to make of it, but it suggests that some single event messed up a number of block splits or deletions, rather than that there's an infrequent bug in those operations. There are also odd inconsistencies between the tree levels: there are 64142 blocks in tree level 0 but only 64136 downlinks in level 1, and there are four places where a downlink's next key fails to match the high key of the child page. So this index has definitely suffered multiple events where a split or deletion didn't work right. On the whole I'm thinking that you may have an intermittent hardware problem, especially since we've not seen comparable symptoms reported by anyone else. regards, tom lane
On Mittwoch, 30. April 2008 Tom Lane wrote: > I finally got a chance to do some analysis, and what I find is > multiple instances of index corruption. There are 18 index pages > that are all zeroes: > [snip] > On the whole I'm thinking that you may have an intermittent hardware > problem, especially since we've not seen comparable symptoms reported > by anyone else. We've had a bad crash once where the server couldn't write it's buffers. The RAID controller also caches writes, and the server is in VMware, which I guess adds to the problem as the whole client machine also has caches and so on. I now remember we had a (very NOT important) database that we simply deleted and recreated as it's only for a web-app without data in it, but didn't see any other problems at that time. So probably this is this kind of "should never ever happen" except your machine goes bye-bye badly. What makes me a bit worried is that pg_dump did not report an error at all, but made it's backup as if everything is OK. Could that be changed maybe so that in case of a problem it at least cries out loudly "broken db found" or so? Or even better it should not create a backup file but say "to save this broken db please recall pg_dump with --ignore-broken-db". We would have found it earlier if the backup wouldn't have worked. It's always good to have that kind of extra warnings, better than oversee problems. Thanks for your analyzation. mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4
Вложения
Michael Monnerie <michael.monnerie@it-management.at> writes: > So probably this is this kind of "should never ever happen" except your > machine goes bye-bye badly. What makes me a bit worried is that pg_dump > did not report an error at all, but made it's backup as if everything > is OK. Could that be changed maybe so that in case of a problem it at > least cries out loudly "broken db found" or so? pg_dump intentionally does *not* look at the contents of indexes (at least not any on user tables). I think this is a good thing because it makes it more likely that you can dump a damaged database. pg_dump is not supposed to be some kind of fsck-like tool. regards, tom lane
On Mittwoch, 7. Mai 2008 Tom Lane wrote: > pg_dump intentionally does *not* look at the contents of indexes (at > least not any on user tables). I think this is a good thing because > it makes it more likely that you can dump a damaged database. > pg_dump is not supposed to be some kind of fsck-like tool. OK, I see that normally you should get that info out of the postgresql log. The problem with this application is, that often it happens to try an insert on already existing primary keys, which makes you overlook the problem is real. Maybe something can be improved here, but I don't know what would be the best way. Maybe a big log line like YOUR DATABASE xxx SEEMS TO BE DAMAGED - PLEASE CHECK such a line would jump into the eyes even on a big log, and it makes you nervous enough to really check things. mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: www.keyserver.net Key-ID: 1C1209B4