Re: 7.0.2 issues / Geocrawler
From | Tim Perdue |
---|---|
Subject | Re: 7.0.2 issues / Geocrawler |
Date | |
Msg-id | 396C5637.B11DE9AE@valinux.com |
In reply to | 7.0.2 issues / Geocrawler (Tim Perdue <tperdue@valinux.com>) |
List | pgsql-hackers |
This is a *big* help. Yes, the table is approx 10-12 GB in size, and running
your length() and update queries is going to take a lifetime, since it will
require a calculation on 4 million rows.

This doesn't address the serious performance problem I'm finding in 7.0.2
for a multi-key select/order by/limit/offset query, which I sent in a
separate email.

Tim

Mike Mascari wrote:
>
> Tim Perdue wrote:
> >
> > ...
> > After that, I get this error from psql:
> >
> > ERROR:  btree: index item size 2820 exceeds maximum 2717
> >
> > Any way to tell where that item is at?
>
> I've been wondering at the state of the problems you've been
> having with PostgreSQL and wondering why I haven't experienced
> the same. I think this may very well be it. Earlier versions of
> PostgreSQL allowed for the creation of indexes on fields whose
> length would not permit at least 2 entries per index page. 95% of
> the time, things would work fine, but 5% of the time you would get
> corrupted data.
>
> Before creating the index:
>
> SELECT * FROM tbl_mail_archive WHERE Length(fld_mail_subject) > 2700;
>
> will get you the list of records which cannot be indexed. You're
> attempting to create a multi-key index, so I would truncate (or
> delete) any record whose fld_mail_subject is > 2700:
>
> UPDATE tbl_mail_archive SET fld_mail_subject =
> SubStr(fld_mail_subject, 1, 2700);
>
> At this point, your index creation should be relatively quick
> (and successful), depending upon how many rows you have. I have a
> few tables with ~2 million rows that take about 5-10 minutes
> (with fsync off, naturally) to index. I would also recommend
> letting PostgreSQL determine the correct "ops":
>
> CREATE INDEX idx_mail_archive_list_subject
> ON tbl_mail_archive (fld_mail_list, fld_mail_subject);
>
> Without following the lists every day, most people wouldn't know
> about this issue. I'm surprised it took so long for PostgreSQL
> 7.0.2 to bail on the index creation, though. Is this a
> particularly large table?
>
> At any rate, this is an example of a
> bug which *would* allow for the kinds of corruption you've seen
> in the past, and it has been addressed in 7.0.2, as Tom Lane
> crushed such bugs by the hundreds. If you can run:
>
> psql db_geocrawler < 6_4dump.txt
>
> and it never bails, then you know all your data is "clean". Until
> that point, any index you have on a "text" datatype is subject to
> similar problems.
>
> Hope that helps,
>
> Mike Mascari

--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723
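[Editor's note: the 2717-byte ceiling in the quoted error message is not arbitrary. A rough back-of-the-envelope sketch, assuming the default 8 KB block size and btree needing room for at least three index items per page (the exact header sizes below are assumptions, not stated in the message):]

```python
# Sketch of where "maximum 2717" comes from, under the assumption
# that a btree page must hold at least 3 index items.
BLCKSZ = 8192        # default PostgreSQL block (page) size
PAGE_HEADER = 24     # assumed sizeof(PageHeaderData)
BTREE_OPAQUE = 16    # assumed sizeof(BTPageOpaqueData)

# Usable space on a btree page, divided among 3 items:
max_item_size = (BLCKSZ - PAGE_HEADER - BTREE_OPAQUE) // 3
print(max_item_size)  # 2717, matching the error message
```

Under these assumptions the arithmetic reproduces the 2717-byte figure exactly, which is why a 2820-byte subject line (as in the error above) cannot be indexed.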