Re: OOM in libpq and infinite loop with getCopyStart()

Поиск
Список
Период
Сортировка
От Aleksander Alekseev
Тема Re: OOM in libpq and infinite loop with getCopyStart()
Дата
Msg-id 20160309175401.590c2f30@fujitsu
обсуждение исходный текст
Ответ на Re: OOM in libpq and infinite loop with getCopyStart()  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: OOM in libpq and infinite loop with getCopyStart()  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hello, Michael

Thanks a lot for steps to reproduce you provided.

I tested your path on Ubuntu Linux 14.04 LTS (GCC 4.8.4) and FreeBSD
10.2 RELEASE (Clang 3.4.1). In both cases patch applies cleanly, there
are no warnings during compilation and all regression tests pass. A few
files are not properly pgindent'ed though:

```
diff --git a/src/interfaces/libpq/fe-exec.c
b/src/interfaces/libpq/fe-exec.c index c99f193..2769719 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2031,7 +2031,7 @@ PQexecFinish(PGconn *conn)           conn->status == CONNECTION_BAD)           break;       else
if((conn->asyncStatus == PGASYNC_COPY_IN ||
 
-                 conn->asyncStatus == PGASYNC_COPY_OUT  ||
+                 conn->asyncStatus == PGASYNC_COPY_OUT ||                 conn->asyncStatus == PGASYNC_COPY_BOTH) &&
            result->resultStatus == PGRES_FATAL_ERROR)           break;
 
diff --git a/src/interfaces/libpq/fe-protocol3.c
b/src/interfaces/libpq/fe-protocol3.c index 21a1d9b..280ca16 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -49,9 +49,9 @@ static int    getParamDescriptions(PGconn *conn, int
msgLength); static int getAnotherTuple(PGconn *conn, int msgLength);static int getParameterStatus(PGconn *conn);static
intgetNotify(PGconn *conn);
 
-static int getCopyStart(PGconn *conn,
-                        ExecStatusType copytype,
-                        int msgLength);
+static int getCopyStart(PGconn *conn,
+            ExecStatusType copytype,
+            int msgLength);static int getReadyForQuery(PGconn *conn);static void reportErrorPosition(PQExpBuffer msg,
constchar *query,                   int loc, int encoding);
 
```

Indeed your patch solves issues you described. Still here is something
that concerns me:

```
$ gdb --args pg_receivexlog --verbose -D ./temp_data/

(gdb) b getCopyStart
Function "getCopyStart" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (getCopyStart) pending.
(gdb) r
Starting program: /usr/local/pgsql/bin/pg_receivexlog --verbose
-D ./temp_data/ [Thread debugging using libthread_db enabled]
Using host libthread_db library
"/lib/x86_64-linux-gnu/libthread_db.so.1". pg_receivexlog: starting log
streaming at 0/1000000 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610220, copytype=PGRES_COPY_BOTH,
msgLength=3) at fe-protocol3.c:1398 1398        const char
*errmsg = NULL; (gdb) n
1400        result = PQmakeEmptyPGresult(conn, copytype);
(gdb) 
1401        if (!result)
(gdb) p result = 0
$1 = (PGresult *) 0x0
(gdb) c
Continuing.
pg_receivexlog: could not send replication command "START_REPLICATION":
out of memory pg_receivexlog: disconnected; waiting 5 seconds to try
again pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
msgLength=3) at fe-protocol3.c:1398 1398        const char
*errmsg = NULL; (gdb) n
1400        result = PQmakeEmptyPGresult(conn, copytype);
(gdb) n
1401        if (!result)
(gdb) p result = 0
$2 = (PGresult *) 0x0
(gdb) c
Continuing.
pg_receivexlog: could not send replication command "START_REPLICATION":
out of memory pg_receivexlog: disconnected; waiting 5 seconds to try
again pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
msgLength=3) at fe-protocol3.c:1398 1398        const char
*errmsg = NULL;
```

Granted this behaviour is a bit better then the current one. But
basically it's the same infinite loop only with pauses and warnings. I
wonder if this is a behaviour we really want. For instance wouldn't it
be better just to terminate an application in out-of-memory case? "Let
it crash" as Erlang programmers say.

Best regards,
Aleksander



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mithun Cy
Дата:
Сообщение: Explain [Analyze] produces parallel scan for select Into table statements.
Следующее
От: Matthias Kurz
Дата:
Сообщение: Alter or rename enum value