Re: PQexec() hangs on OOM
From | Michael Paquier |
---|---|
Subject | Re: PQexec() hangs on OOM |
Date | |
Msg-id | CAB7nPqT6gKj6iS9VTPth_h6Sz5Jo-177s6QJN_jrW66wyCjJ=w@mail.gmail.com |
In response to | Re: PQexec() hangs on OOM (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: PQexec() hangs on OOM |
List | pgsql-bugs |
On Fri, Sep 18, 2015 at 11:32 PM, Amit Kapila wrote:
> IIRC, this is required to sanely report "out of memory" error in case
> of replication protocol (master-standby setup). This loop and in-particular
> this check is quite similar to PQexecFinish() functions check and loop
> where we return last result. I think it is better to keep both the places
> in-sync
> and also I think this is required to report the error appropriately. I have
> tried manual debugging for the out of memory error for this case and
> it works well with this check and without the check it doesn't report
> the error in an appropriate way(don't remember exactly what was
> the difference). If required, I can again try to reproduce the scenario
> and share the exact report.

I just had a look at that again... Put, for example, a call to pg_usleep in libpqrcv_identify_system after executing IDENTIFY_SYSTEM and before calling PQresultStatus, then set a breakpoint in the WAL receiver process when starting up a standby. This gives plenty of room to emulate the OOM failure in getCopyStart.

When the check on PGRES_FATAL_ERROR is not added and the OOM is emulated immediately, libpqrcv_PQexec loops twice and thinks it can start up streaming replication, but fails afterwards. Here is the failure seen from the standby:

    LOG: started streaming WAL from primary at 0/3000000 on timeline 1
    FATAL: could not send data to WAL stream: server closed the connection unexpectedly

And from the master:

    LOG: unexpected EOF on standby connection

The WAL receiver process ultimately restarts afterwards.

When the check on PGRES_FATAL_ERROR is added, streaming replication fails to start, as it should, and the error is fetched correctly by the WAL receiver:

    FATAL: could not start WAL streaming: out of memory

In short: Amit seems right to have added this check.

Note for later: looking at patches during conferences is really a bad habit.
--
Michael
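
For readers following along, the difference between the two runs can be sketched with a small stand-alone simulation. This is not the walreceiver code and it does not use libpq: `SimStatus`, `SimConn`, `sim_get_result` and `sim_exec` are hypothetical names invented for illustration, and the loop is only a simplified shape of a "keep the last result" loop. In particular, scripting a copy-both status as the second result is an assumption, made only to mimic the "loops twice and thinks it can start" behaviour described above; the message does not say what libpq actually returned on the second iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for result statuses; NOT libpq's ExecStatusType. */
typedef enum
{
    SIM_COMMAND_OK,
    SIM_COPY_BOTH,
    SIM_FATAL_ERROR
} SimStatus;

/* A scripted list of results, standing in for successive "get result" calls. */
typedef struct
{
    const SimStatus *results;
    size_t      count;
    size_t      next;
} SimConn;

/* Returns 1 and stores the next status, or 0 once the script is exhausted. */
static int
sim_get_result(SimConn *conn, SimStatus *out)
{
    if (conn->next >= conn->count)
        return 0;
    *out = conn->results[conn->next++];
    return 1;
}

/*
 * Simplified "return the last result" loop. It always stops on a copy-both
 * status; it stops on a fatal error only when stop_on_fatal is nonzero.
 * The number of loop iterations is reported through *iterations.
 */
static SimStatus
sim_exec(SimConn *conn, int stop_on_fatal, int *iterations)
{
    SimStatus   last = SIM_COMMAND_OK;
    SimStatus   cur;

    assert(conn->count > 0);    /* the sketch expects at least one result */
    *iterations = 0;

    while (sim_get_result(conn, &cur))
    {
        (*iterations)++;
        last = cur;

        if (last == SIM_COPY_BOTH)
            break;              /* caller would start streaming */
        if (stop_on_fatal && last == SIM_FATAL_ERROR)
            break;              /* caller can report the error */
    }
    return last;
}
```

With a script of a fatal error followed by a copy-both status, `sim_exec` without the extra check runs two iterations and hands back the copy-both status, so the error is lost; with the check it stops after one iteration and returns the fatal error, which the caller can then report.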