Обсуждение: Backend dies for DBI

Поиск
Список
Период
Сортировка

Backend dies for DBI

От
Bjorn Solberg
Дата:
I have looked at the FAQs and archives, but haven't seen anything
resembling this so please bear with me:

Setup:

Linux x86, 2.2.16
Apache 1.3.19
mod_perl 1.25
HTML-Mason-0.89
PostgreSQL 7.0.3
DBI-1.14
DBD-Pg-0.95

Mason (which is a template package on top of mod_perl, see
http://www.masonhq.com/) opens a connection to the database, using DBI.
This handle is then used for subsequent DB access.  Mason runs within the
web server.

This worked flawlessly until I had a power outage a month or so ago.
Since then, the backend postmaster process seems to die now and then, and
I have been unable to reproduce it consistently.  That is, the number of
times I access the (dynamic HTML) page (and thus the DB) before the DB
crashes, varies.  The log file from postmaster simply says:

--->
StartTransactionCommand
[...]
ProcessQuery
CommitTransactionCommand
proc_exit(0)
shmem_exit(0)
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 1297 exited with status 0
proc_exit(0)
shmem_exit(0)
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 1298 exited with status 0
---<

The postmaster is started like this:

/usr/local/pgsql/bin/postmaster -d 2 -D /usr/local/pgsql/data >> /usr/local/pgsql/pgsql.log 2>&1

DBI (I think) sometimes output a message like "Backend message type 0x50
arrived while idle".  This can happen several times, and the crash does
not occur directly following any of the messages.

The DBI error message simply says

--->
DBD::Pg::st execute failed: pqReadData() -- backend closed the channel unexpectedly.       This probably means the
backendterminated abnormally       before or while processing the request.
 
---<

The same is true for Apache 1.3.12, mod_perl 1.24.  Any pointers and
hints so I can figure out how to make the backend not die, are greatly
appreciated.

Thanks,

Bjorn.


Re: Backend dies for DBI

От
Tom Lane
Дата:
Bjorn Solberg <bjorn@geocities.com> writes:
> The log file from postmaster simply says:

> --->
> StartTransactionCommand
> [...]
> ProcessQuery
> CommitTransactionCommand
> proc_exit(0)
> shmem_exit(0)
> exit(0)
> /usr/local/pgsql/bin/postmaster: reaping dead processes...
> /usr/local/pgsql/bin/postmaster: CleanupProc: pid 1297 exited with status 0
> proc_exit(0)
> shmem_exit(0)
> exit(0)
> /usr/local/pgsql/bin/postmaster: reaping dead processes...
> /usr/local/pgsql/bin/postmaster: CleanupProc: pid 1298 exited with status 0
> ---<

This looks absolutely normal --- there's no evidence here for anything
except normal backend exit.  At least the backend thinks it's normal.
You should be looking for client or protocol-level problems.

> DBI (I think) sometimes output a message like "Backend message type 0x50
> arrived while idle".  This can happen several times, and the crash does
> not occur directly following any of the messages.

This may indicate running out of memory for a query result on the client
side; libpq is not very graceful about dealing with that :-(.  You
should check on the sizes of query results you are fetching and the
amount of memory available to your DBI applications.  Fetching a few
rows at a time via a CURSOR is a good workaround if that seems to be
the problem.
        regards, tom lane


Re: Backend dies for DBI

От
Bjorn Halvor Solberg
Дата:
Tom Lane writes:

> Bjorn Solberg <bjorn@geocities.com> writes:
>> The log file from postmaster simply says:

---> 
>> StartTransactionCommand
>> [...]
>> ProcessQuery
>> CommitTransactionCommand
>> proc_exit(0)
>> shmem_exit(0)
>> exit(0)
>> /usr/local/pgsql/bin/postmaster: reaping dead processes...
>> /usr/local/pgsql/bin/postmaster: CleanupProc: pid 1297 exited with status 0
>> proc_exit(0)
>> shmem_exit(0)
>> exit(0)
>> /usr/local/pgsql/bin/postmaster: reaping dead processes...
>> /usr/local/pgsql/bin/postmaster: CleanupProc: pid 1298 exited with status 0
>> ---<

> This looks absolutely normal --- there's no evidence here for anything
> except normal backend exit.  At least the backend thinks it's normal.
> You should be looking for client or protocol-level problems.

OK.  I thought it was abnormal because it only happens when DBI loses
connection.

>> DBI (I think) sometimes output a message like "Backend message type 0x50
>> arrived while idle".  This can happen several times, and the crash does
>> not occur directly following any of the messages.

> This may indicate running out of memory for a query result on the client
> side; libpq is not very graceful about dealing with that :-(.  You
> should check on the sizes of query results you are fetching and the
> amount of memory available to your DBI applications.  Fetching a few
> rows at a time via a CURSOR is a good workaround if that seems to be
> the problem.

Thank you for your help, I'll look into this.  I do check the result of
each query, this is a very small database so I can't quite see how that
would affect the system like this.

What confuses me is that it ran fine for a couple of months.  Then, after
the power outage, it crashes every few hours, resulting in having to
restart the web server to regain the DB handle.  Currently the DB handle
is allocated once at (web-)server startup.  Maybe I'll change it to do a
connect and disconnect for each web page instead.  I figured doing it
only once and reusing the DB handle would save some time, but maybe it is
an unsafe thing to do?

Bjorn.


Re: Backend dies for DBI

От
Tom Lane
Дата:
> What confuses me is that it ran fine for a couple of months.  Then, after
> the power outage, it crashes every few hours, resulting in having to
> restart the web server to regain the DB handle.

Hmm.  Possibly some corrupted data in the database now?
        regards, tom lane


RE: Backend dies for DBI

От
"Oscar Serrano"
Дата:

> -----Mensaje original-----
> De: pgsql-interfaces-owner@postgresql.org
> [mailto:pgsql-interfaces-owner@postgresql.org]En nombre de Tom Lane
> Enviado el: jueves, 08 de marzo de 2001 2:51
> Para: Bjorn Halvor Solberg
> CC: pgsql-interfaces@postgresql.org
> Asunto: Re: [INTERFACES] Backend dies for DBI
>
>
> > What confuses me is that it ran fine for a couple of months.
> Then, after
> > the power outage, it crashes every few hours, resulting in having to
> > restart the web server to regain the DB handle.
>
> Hmm.  Possibly some corrupted data in the database now?
>
>             regards, tom lane
>

Recently I had a problem similar to this.
I don't know why, once or twice a day, some backend died abnormaly so
killing all other backends. The problem was some index got corrupted. Using
the command REINDEX TABLE table_name FORCE, it fixed the problem with the
index. But the problem started again.
Finally I've found that the problem was in NFS. My databases where in
another computer (different from the postgres binaries) mounted by NFS. Now
I've put postmaster and databases in the same hard disk and now it works.

BTW, I have a conexion to the postmaster on every cgi (mod_perl) because I
thought that it was not possible to open a conexión from within the
httpd.conf and use it for every cgi.

> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>



Re: Backend dies for DBI

От
Bjorn Halvor Solberg
Дата:
Tom Lane writes:

>> What confuses me is that it ran fine for a couple of months.  Then, after
>> the power outage, it crashes every few hours, resulting in having to
>> restart the web server to regain the DB handle.

> Hmm.  Possibly some corrupted data in the database now?

That thought did come to mind, but manual queries go through just fine.
I did vacuum the database, but that didn't help.

Maybe using just one DBI database handle is the wrong way to use it?
Still, what puzzles me is that it worked fine for months before the power
outage.  I've changed it to open and close the database handle (connect/
disconnect) for every web page served now, instead of having just one per
server process for the lifetime of the process, and haven't seen the
problem since.  I suppose there are performance issues with that, but the
database and nature of the access is such that that isn't important now. :)

The database is on the local machine, the same host as the web server is
running on, so it is not due to an NFS problem.

I suppose there are more people out there using PostgreSQL through DBI in
Apache, either through mod_perl, Mason or other means - what do you do?
Connect and disconnect per serverd web page, or keep one connection open
for each web server process?

Thanks for hints and help!

Bjorn.


Re: Backend dies for DBI

От
Tom Lane
Дата:
Bjorn Halvor Solberg <bhso@pacbell.net> writes:
> What confuses me is that it ran fine for a couple of months.  Then, after
> the power outage, it crashes every few hours, resulting in having to
> restart the web server to regain the DB handle.

>> Hmm.  Possibly some corrupted data in the database now?

> That thought did come to mind, but manual queries go through just fine.
> I did vacuum the database, but that didn't help.

This is reaching a little bit, but something to think about: maybe you
are not running exactly the same software or configuration as you were
before the power outage.  A well known gotcha on machines that stay up
for a long time between boots is that you may have changed something
without realizing that the effects would disappear at the next boot
(or alternatively, only come into effect at the next boot).  For
example, starting a daemon without remembering to arrange for the boot
scripts to start it, or changing a daemon's config file and neglecting
to test the results, or installing a new version of some shlib that's
used by a daemon.  When a reboot does finally happen, you've forgotten
that there was anything to worry about, and so the sudden change in
behavior is mysterious.

I realize this doesn't give anything concrete to go on, but maybe it
will give you an idea of where to look.
        regards, tom lane