Re: Various intermittent bugs/instability - how to debug?
From | Mark Cave-Ayland |
---|---|
Subject | Re: Various intermittent bugs/instability - how to debug? |
Date | |
Msg-id | 48C7889E.9010905@siriusit.co.uk |
In reply to | Various intermittent bugs/instability - how to debug? (Frederik Ramm <frederik.ramm@geofabrik.de>) |
List | pgsql-general |
Frederik Ramm wrote:
> Dear PostgreSQL community,
>
> I hope you can help me with a problem I'm having - I'm stuck and
> don't know how to debug this further.
>
> I have a rather large nightly process that imports a lot of data from
> the OpenStreetMap project into a PostGIS database, then proceeds doing
> all sorts of things - creating spatial indexes, computing bounding
> boxes, doing simplification of geometries, that kind of stuff. The whole
> job usually takes about five hours.
>
> I'm running this on a Quad-Core Linux (Ubuntu, PostgreSQL 8.3) machine
> with 8 GB RAM.
>
> Every other night, the process aborts with some strange error message,
> and never at the same position:
>
> ERROR: invalid page header in block 166406 of relation "node_tags"
>
> ERROR: could not open segment 2 of relation 1663/24253056/24253895
> (target block 1421295656): No such file or directory
>
> ERROR: Unknown geometry type: 10
>
> When I continue the process after the failure, it will usually work.
>
> I know you all think "hardware problem" now. Of course this was my first
> guess as well. I ran a memory test for a night, no results; I downgraded
> to "failsafe defaults" for all BIOS timings, again no change. Ran
> "cpuburn" and all sorts of other things to grill the hardware - nothing.
>
> Then I bought an entirely new machine; similar setup, but using a
> Gigabyte instead of Asus mainboard, different chipset, slightly faster
> Quad-Core processor, and again 8 GB RAM and Ubuntu "Hardy" with
> PostgreSQL 8.3 and matching PostGIS.
>
> Believe it or not, this machine shows the *same* problems. It is not
> 100% reproducible, sometimes the job works fully, but every other day it
> just breaks down with one of the funny messages like above. No memtest
> errors here either.
>
> Both machines are "consumer" quality, i.e. normal Intel processors and
> not the "server" (Xeon) stock.
>
> I am at a loss - how can I proceed? This looks like a hardware problem
> alright, but such similar problems on two such different machines? Is there
> something wrong with Intel's Quad-Core CPUs?
>
> What could I do to have a better chance of reproducing the error and
> ultimately identifying the component responsible? Is there some kind of
> "PostgreSQL load test", something like "cpuburn" for PostgreSQL?
>
> Have there been other reports of intermittent problems like mine, and
> does anybody have any blind guesses...?
>
> Thanks
> Frederik

Hi Frederik,

We did find a memory clobber in the PostGIS ANALYZE routine a while
back, but the fix hasn't yet made it into a release. If you are building
from source, could you please try applying the patch here:
http://code.google.com/p/postgis/issues/detail?id=43 and report back
whether it helps or not?

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063