Re: help troubleshooting invalid page header error

Поиск
Список
Период
Сортировка
От Sameer Kumar
Тема Re: help troubleshooting invalid page header error
Дата
Msg-id CADp-Sm4umU9-bvfF6=m8_rsnq+-q8ASXXPCAxca3ZwBs=VKd8Q@mail.gmail.com
обсуждение исходный текст
Ответ на help troubleshooting invalid page header error  (Cory Zue <czue@dimagi.com>)
Ответы Re: help troubleshooting invalid page header error  (Cory Zue <czue@dimagi.com>)
Список pgsql-general

On 23 Dec 2014 12:05, "Cory Zue" <czue@dimagi.com> wrote:
>
> Hi all,
>
> Our postgres instance on one of our production machines has recently been returning errors of the form "DatabaseError: invalid page header in block 1 of relation base/16384/76623" from normal queries. I've been reading that these are often linked to hardware errors, but I would like to better understand what else it could be or how to determine that for sure. I've filled out the standard issue reporting template below. Any feedback or troubleshooting instructions would be much appreciated.
>
> ---
> A description of what you are trying to achieve and what results you expect.:
>
> Intermittent queries are failing with the error "DatabaseError: invalid page header in block 1 of relation base/16384/76623"
>
> PostgreSQL version number you are running: 
>
> PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
>
> How you installed PostgreSQL: 
>
> from standard package installer
>
> Changes made to the settings in the postgresql.conf file:  
>
>
>              name             |       current_setting       |        source        
> ------------------------------+-----------------------------+----------------------
>  checkpoint_completion_target | 0.9                         | configuration file
>  checkpoint_segments          | 32                          | configuration file
>  checkpoint_timeout           | 15min                       | configuration file
>  DateStyle                    | ISO, MDY                    | configuration file
>  default_text_search_config   | pg_catalog.english          | configuration file
>  effective_cache_size         | 1GB                         | configuration file
>  lc_messages                  | en_US.UTF-8                 | configuration file
>  lc_monetary                  | en_US.UTF-8                 | configuration file
>  lc_numeric                   | en_US.UTF-8                 | configuration file
>  lc_time                      | en_US.UTF-8                 | configuration file
>  log_checkpoints              | on                          | configuration file
>  log_connections              | off                         | configuration file
>  log_destination              | csvlog                      | configuration file
>  log_directory                | /opt/data/pgsql/data/pg_log | configuration file
>  log_disconnections           | off                         | configuration file
>  log_duration                 | on                          | configuration file
>  log_filename                 | postgres-%Y-%m-%d_%H%M%S    | configuration file
>  log_lock_waits               | on                          | configuration file
>  log_min_duration_statement   | 250ms                       | configuration file
>  log_rotation_age             | 1d                          | configuration file
>  log_rotation_size            | 1GB                         | configuration file
>  log_temp_files               | 0                           | configuration file
>  log_timezone                 | Asia/Kolkata                | command line
>  log_truncate_on_rotation     | on                          | configuration file
>  logging_collector            | on                          | configuration file
>  maintenance_work_mem         | 768MB                       | configuration file
>  max_connections              | 500                         | configuration file
>  max_stack_depth              | 2MB                         | environment variable
>  port                         | 5432                        | command line
>  shared_buffers               | 4GB                         | configuration file
>  ssl                          | on                          | configuration file
>  TimeZone                     | Asia/Kolkata                | command line
>  timezone_abbreviations       | Default                     | command line
>  wal_buffers                  | 16MB                        | configuration file
>  work_mem                     | 48MB                        | configuration file
>
> It's also probably worth noting that postgres is installed on an encrypted volume which is mounted using ecryptfs.
>
> Operating system and version:
>
> RedHatEnterpriseServer, version 6.6
>
> What program you're using to connect to PostgreSQL:
>
> Python (django)
>  
> Is there anything relevant or unusual in the PostgreSQL server logs?:
>
> I see lots of instances of this error (and similar). I'm not sure what else I should be looking for.
>
> What you were doing when the error happened / how to cause the error:
>
> I haven't explicitly tried to reproduce it, but it seems to consistently happen with certain queries. However, the system was rebooted shortly before the errors started occuring. The system was rebooted because another database (elasticsearch) was having problems on the same machine and the reboot was to attempt to resolve things.
>
> The EXACT TEXT of the error message you're getting, if there is one:
>
> DatabaseError: invalid page header in block 1 of relation base/16384/76623
>
> (block and relation numbers change)
>
> Unfortunately, I'm not completely familiar with the CPU and disk/RAID configurations used on the server. However it is storing to a (software) encrypted volume as mentioned above.
>
> Have you ever set fsync=off in the postgresql config file?
> No
> Have you had any unexpected power loss lately? Replaced a failed RAID disk? Had an operating system crash?
> Not recently, though the system did reboot normally as described above.
> Have you run a file system check? (chkdsk / fsck)
> No.
> Are there any error messages in the system logs? (unix/linux: dmesg, /var/log/syslog ;
> I haven't seen anything obvious but I wasn't sure what to look for.
>

I guess you missed to provide the details and kernel version (rhel version and kernel level).
This will give you kernel patch level-

uname -a

I had once faced this issue and I was on a buggy patch of Linux kernel. I just had to update to latest patch. That worked for me.

В списке pgsql-general по дате отправления:

Предыдущее
От: Borodin Vladimir
Дата:
Сообщение: Checksums and full_page_writes
Следующее
От: Anton Melser
Дата:
Сообщение: Drastic select count performance hit when jsonb GIN indices are present