Discussion: Vacuumdb error

Vacuumdb error

From
"Bhella Paramjeet-PFCW67"
Date:
Hi,

We have our production postgres 8.0.10 database running on a Linux x86_64
machine. Recently we have started getting an error from one of our
databases while running vacuumdb. We do not get this error during
backups, only while vacuuming the database. Can anyone please help
us figure out why we are getting this error and how we can get rid of
it? Any help will be appreciated.

Here is the error:
tst_021 ERROR:  invalid page header in block 8 of relation
"securityevent_pkey"


Thanks
Paramjeet Bhella


Re: Vacuumdb error

From
Tom Lane
Date:
"Bhella Paramjeet-PFCW67" <PBhella@Motorola.com> writes:
> We have our production postgres 8.0.10 database running on a Linux x86_64
> machine. Recently we have started getting an error from one of our
> databases while running vacuumdb. We do not get this error during
> backups, only while vacuuming the database. Can anyone please help
> us figure out why we are getting this error and how we can get rid of
> it? Any help will be appreciated.

> Here is the error:
> tst_021 ERROR:  invalid page header in block 8 of relation
> "securityevent_pkey"

Well, since it's just an index, you should be able to fix it by
reindexing.  But I'd worry a bit about what caused the corruption.
8.0.10 is not exactly current --- you should update to 8.0.latest.
And running some memory and disk diagnostics might not be wasted
effort.

            regards, tom lane

Re: Vacuumdb error - corruption

From
"Bhella Paramjeet-PFCW67"
Date:
Hi Tom,

We recreated the index and that fixed it last week. We also had a similar
failure a few days back on a table, but fortunately we had the luxury of
recreating it as the data was not so important, so we survived.
Yesterday we upgraded our postgres databases to 8.0.15 as suggested, and
again we got an error while vacuum analyze was run on the database;
this time it is on a different database and a different table. I am not
able to access the table in question. Here is the error I get while
accessing the table. How do we recover this table from the error?

ectest=# select count(*) from securityevent;
ERROR:  could not access status of transaction 33554431
DETAIL:  could not open file "/pgdata/ec/data/pg_clog/001F": No such
file or directory

Error in the database vacuum log.
INFO:  vacuuming "public.securityevent"
WARNING:  relation "securityevent" TID 21/3: OID is invalid
vacuumdb: vacuuming of database "ectest" failed: ERROR:  could not
access status of transaction 33554431
DETAIL:  could not open file "/pgdata/ec/data/pg_clog/001F": No such
file or directory

Here are the machine specifics on which this database is running:
Platform: Linux x86_64
OS:  Red Hat Enterprise Linux ES release 4 Kernel version: 2.6.9-34

According to our sysadmin there are no bad disks on the EMC storage. So
what do you suggest we should do to narrow down the problem? Any help
will be highly appreciated. I have looked into the postgres archives and
people have reported this problem, but there were no responses as to how
they resolved this issue. Do you still think it is due to block
corruption on disk? Please advise.

Thanks
Paramjeet Kaur


Re: Vacuumdb error - corruption

From
Tom Lane
Date:
"Bhella Paramjeet-PFCW67" <PBhella@Motorola.com> writes:
> Error in the database vacuum log.
> INFO:  vacuuming "public.securityevent"
> WARNING:  relation "securityevent" TID 21/3: OID is invalid

That smells like a data corruption problem ...

> vacuumdb: vacuuming of database "ectest" failed: ERROR:  could not
> access status of transaction 33554431

and so does that, particularly since the value equates to hex 01FFFFFF.
It's a lot easier to believe a hardware-ish fault stuffing such a value
than a software bug.
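
As a rough illustration of that arithmetic (a sketch added for clarity, not part of the original message): assuming the default 8 kB block size and the clog layout constants as found in the PostgreSQL source (2 status bits per transaction, so 4 transactions per byte, and 32 pages per segment file), the transaction ID in the error converts to hex 01FFFFFF and maps to the pg_clog segment named in the DETAIL line. The function name clog_segment_name is hypothetical.

```python
# Assumed constants (default PostgreSQL build); adjust if BLCKSZ differs.
BLCKSZ = 8192                 # bytes per page
CLOG_XACTS_PER_BYTE = 4       # 2 status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32   # pages per pg_clog segment file


def clog_segment_name(xid: int, blcksz: int = BLCKSZ) -> str:
    """Return the pg_clog segment file name that would hold xid's status."""
    xacts_per_page = blcksz * CLOG_XACTS_PER_BYTE
    xacts_per_segment = xacts_per_page * SLRU_PAGES_PER_SEGMENT
    # Segment files are named with the segment number as 4 hex digits.
    return format(xid // xacts_per_segment, "04X")


xid = 33554431
print(format(xid, "08X"))        # 01FFFFFF
print(clog_segment_name(xid))    # 001F
```

Under those assumptions the segment comes out as 001F, the same file the server reported as missing, which is consistent with the transaction ID in the tuple header being garbage rather than a real transaction.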

> Here are the machine specifics on which this database is running:
> Platform: Linux x86_64
> OS:  Red Hat Enterprise Linux ES release 4 Kernel version: 2.6.9-34

If I'm reading the Red Hat CVS correctly, that kernel is two years old
next week.  Perhaps a newer kernel would help your problems.  A quick
troll through the changelog reveals a number of x86_64-specific fixes
that sound like they could have resulted in userspace data corruption.

            regards, tom lane

Re: Vacuumdb error - corruption

From
"Bhella Paramjeet-PFCW67"
Date:
Hi Eric,

No, the database is not sitting on NFS storage. We are using EMC storage, and
the file system is fibre-attached to the storage.
In the last few days we have had lots of errors on the database. Last night,
while taking a pg_dump, we got an invalid memory alloc request error, "ecdemo
2008-04-25 00:40:31 CDTFATAL:  invalid memory alloc request size
936749196708242529", and also "invalid page header" errors.

Thanks
Paramjeet Kaur


Re: Vacuumdb error - corruption

From
"Eric Comeau"
Date:

> -----Original Message-----
> From: Bhella Paramjeet-PFCW67 [mailto:PBhella@Motorola.com]
> Sent: Thursday, April 24, 2008 1:13 PM
> To: Tom Lane
> Cc: pgsql-admin@postgresql.org; Subbiah Stalin-XCGF84
> Subject: Re: Vacuumdb error - corruption
>
>
> According to our sysadmin there are no bad disks on the EMC storage. So

EMC storage? Is the database sitting on NFS storage?


> what do you suggest we should do to narrow down the problem? Any help
> will be highly appreciated. I have looked into the postgres archives and
> people have reported this problem, but there were no responses as to how
> they resolved this issue. Do you still think it is due to block
> corruption on disk? Please advise.

Re: Vacuumdb error - corruption

From
Andrew Sullivan
Date:
On Fri, Apr 25, 2008 at 06:03:12PM -0400, Bhella Paramjeet-PFCW67 wrote:

> No, the database is not sitting on NFS storage. We are using EMC storage, and
> the file system is fibre-attached to the storage.

What's the filesystem?  Are you sure you don't have any bad memory in
the box?  I'm suspicious of the hardware first.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/