Обсуждение: Re: Memory leak in plpython3u (with testcase and patch)

Поиск
Список
Период
Сортировка

Re: Memory leak in plpython3u (with testcase and patch)

От
Tom Lane
Дата:
Mat Arye <mat@timescale.com> writes:
> I found a memory leak in plpython3u. It happens when converting the
> argument for an spi plan execution (via plpy.execute() or similar). I've
> attached a sql file to create two plpython3u functions to reproduce the
> issue (memory_leak_test.sql).

I see the leak all right, and your patch does seem to fix it, so
I believe your diagnosis that PLy_output_convert() is what's leaking.

> I believe the issue is in the call to `PLy_output_convert` inside
> `PLy_spi_execute_plan`. I've attached a patch that reduces the memory usage
> to ~70MB from ~720 MB when using test_mem. I based what I did on what
> `PLy_input_convert` does. But I am not an expert in this part of the code
> so I am not sure the patch is correct. In particular, I don't fully grasp
> the comment in `PLy_input_convert` about why the scratch is reset before
> not after the conversion cycle and what that has to do with python
> refcounts. A review by an expert would be appreciated.

The corresponding concern here would be that some pass-by-reference
conversion result Datum could come back verbatim as an output of
SPI_execute_plan, and we mustn't reset the scratch context again
before we're done with using the Datum.  Sadly, I don't think it
will work if that happens, because what we do as soon as
SPI_execute_plan finishes is to call PLy_spi_execute_fetch_result,
and that calls PLy_input_from_tuple, which is one of the things
that will zap the scratch context.  So this doesn't feel very safe
... and indeed it falls over in plpython's existing regression tests.

However, I don't really see why we need to use that scratch context.
PLy_spi_execute_plan runs a subtransaction, so we could perfectly well
use the subtransaction's CurTransactionContext.  (IOW, this wouldn't
even be an issue if PLy_spi_subtransaction_begin didn't forcibly
switch back to oldcontext.)

A couple of other points:

* PLy_spi_execute_plan already has an outer-level variable named
oldcontext, so this coding will draw complaints about a shadowed
local variable in compilers that are picky about that.  But
I don't think you need the inner oldcontext variable anyway, since you
know perfectly well that the previous context is the outer oldcontext.
Just switch back to that after using CurTransactionContext.

* Looking around at other callers of PLy_output_convert, it seems
likely that PLy_cursor_plan() has this same problem.  (I think we're
okay though with the calls from PLy_exec_function and
PLy_modify_tuple, since those could only happen right before return
from the plpython function; there's no path to iterate them enough
times to create a meaningful leak.)

            regards, tom lane



Re: Memory leak in plpython3u (with testcase and patch)

От
Tom Lane
Дата:
I wrote:
> However, I don't really see why we need to use that scratch context.
> PLy_spi_execute_plan runs a subtransaction, so we could perfectly well
> use the subtransaction's CurTransactionContext.

Nah, I'm wrong about that: I forgot CurTransactionContext goes away on
subtransaction abort, but not subtransaction commit.  So we really
need to make our own context with suitable lifespan.

I also noticed that the code is moving heaven and earth to store
the argument Datums ... but for some reason not the associated
isnull flags ... in the PLyPlanObject.  This is just nuts, because
the values don't need to survive past the functions that are computing
them.  In the cursor case, the values will be copied into the cursor
portal, so they still don't need to survive.  We can drop all of
that.

So I arrive at the attached.  (This is just for HEAD.  In the back
branches, we'd better leave the PLyPlanObject.values field in place
for ABI safety, but we don't have to use it.)

            regards, tom lane

diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c
index 6108384c9a..bb3fa8a390 100644
--- a/src/pl/plpython/plpy_cursorobject.c
+++ b/src/pl/plpython/plpy_cursorobject.c
@@ -140,7 +140,6 @@ PLy_cursor_plan(PyObject *ob, PyObject *args)
 {
     PLyCursorObject *cursor;
     volatile int nargs;
-    int            i;
     PLyPlanObject *plan;
     PLyExecutionContext *exec_ctx = PLy_current_execution_context();
     volatile MemoryContext oldcontext;
@@ -199,13 +198,30 @@ PLy_cursor_plan(PyObject *ob, PyObject *args)
     PG_TRY();
     {
         Portal        portal;
+        MemoryContext tmpcontext;
+        Datum       *volatile values;
         char       *volatile nulls;
         volatile int j;

+        /*
+         * Converted arguments and associated cruft will be in this context,
+         * which is local to our subtransaction.
+         */
+        tmpcontext = AllocSetContextCreate(CurTransactionContext,
+                                           "PL/Python temporary context",
+                                           ALLOCSET_SMALL_SIZES);
+        MemoryContextSwitchTo(tmpcontext);
+
         if (nargs > 0)
-            nulls = palloc(nargs * sizeof(char));
+        {
+            values = (Datum *) palloc(nargs * sizeof(Datum));
+            nulls = (char *) palloc(nargs * sizeof(char));
+        }
         else
+        {
+            values = NULL;
             nulls = NULL;
+        }

         for (j = 0; j < nargs; j++)
         {
@@ -217,7 +233,7 @@ PLy_cursor_plan(PyObject *ob, PyObject *args)
             {
                 bool        isnull;

-                plan->values[j] = PLy_output_convert(arg, elem, &isnull);
+                values[j] = PLy_output_convert(arg, elem, &isnull);
                 nulls[j] = isnull ? 'n' : ' ';
             }
             PG_FINALLY(2);
@@ -227,7 +243,9 @@ PLy_cursor_plan(PyObject *ob, PyObject *args)
             PG_END_TRY(2);
         }

-        portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls,
+        MemoryContextSwitchTo(oldcontext);
+
+        portal = SPI_cursor_open(NULL, plan->plan, values, nulls,
                                  exec_ctx->curr_proc->fn_readonly);
         if (portal == NULL)
             elog(ERROR, "SPI_cursor_open() failed: %s",
@@ -237,40 +255,18 @@ PLy_cursor_plan(PyObject *ob, PyObject *args)

         PinPortal(portal);

+        MemoryContextDelete(tmpcontext);
         PLy_spi_subtransaction_commit(oldcontext, oldowner);
     }
     PG_CATCH();
     {
-        int            k;
-
-        /* cleanup plan->values array */
-        for (k = 0; k < nargs; k++)
-        {
-            if (!plan->args[k].typbyval &&
-                (plan->values[k] != PointerGetDatum(NULL)))
-            {
-                pfree(DatumGetPointer(plan->values[k]));
-                plan->values[k] = PointerGetDatum(NULL);
-            }
-        }
-
         Py_DECREF(cursor);
-
+        /* Subtransaction abort will remove the tmpcontext */
         PLy_spi_subtransaction_abort(oldcontext, oldowner);
         return NULL;
     }
     PG_END_TRY();

-    for (i = 0; i < nargs; i++)
-    {
-        if (!plan->args[i].typbyval &&
-            (plan->values[i] != PointerGetDatum(NULL)))
-        {
-            pfree(DatumGetPointer(plan->values[i]));
-            plan->values[i] = PointerGetDatum(NULL);
-        }
-    }
-
     Assert(cursor->portalname != NULL);
     return (PyObject *) cursor;
 }
diff --git a/src/pl/plpython/plpy_planobject.c b/src/pl/plpython/plpy_planobject.c
index bbef889329..9427674d2f 100644
--- a/src/pl/plpython/plpy_planobject.c
+++ b/src/pl/plpython/plpy_planobject.c
@@ -54,7 +54,6 @@ PLy_plan_new(void)
     ob->plan = NULL;
     ob->nargs = 0;
     ob->types = NULL;
-    ob->values = NULL;
     ob->args = NULL;
     ob->mcxt = NULL;

diff --git a/src/pl/plpython/plpy_planobject.h b/src/pl/plpython/plpy_planobject.h
index 729effb163..a6b34fae19 100644
--- a/src/pl/plpython/plpy_planobject.h
+++ b/src/pl/plpython/plpy_planobject.h
@@ -15,7 +15,6 @@ typedef struct PLyPlanObject
     SPIPlanPtr    plan;
     int            nargs;
     Oid           *types;
-    Datum       *values;
     PLyObToDatum *args;
     MemoryContext mcxt;
 } PLyPlanObject;
diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index bcbd07b70a..77fbfd6c86 100644
--- a/src/pl/plpython/plpy_spi.c
+++ b/src/pl/plpython/plpy_spi.c
@@ -66,7 +66,6 @@ PLy_spi_prepare(PyObject *self, PyObject *args)

     plan->nargs = nargs;
     plan->types = nargs ? palloc0(sizeof(Oid) * nargs) : NULL;
-    plan->values = nargs ? palloc0(sizeof(Datum) * nargs) : NULL;
     plan->args = nargs ? palloc0(sizeof(PLyObToDatum) * nargs) : NULL;

     MemoryContextSwitchTo(oldcontext);
@@ -172,8 +171,7 @@ PyObject *
 PLy_spi_execute_plan(PyObject *ob, PyObject *list, long limit)
 {
     volatile int nargs;
-    int            i,
-                rv;
+    int            rv;
     PLyPlanObject *plan;
     volatile MemoryContext oldcontext;
     volatile ResourceOwner oldowner;
@@ -219,13 +217,30 @@ PLy_spi_execute_plan(PyObject *ob, PyObject *list, long limit)
     PG_TRY();
     {
         PLyExecutionContext *exec_ctx = PLy_current_execution_context();
+        MemoryContext tmpcontext;
+        Datum       *volatile values;
         char       *volatile nulls;
         volatile int j;

+        /*
+         * Converted arguments and associated cruft will be in this context,
+         * which is local to our subtransaction.
+         */
+        tmpcontext = AllocSetContextCreate(CurTransactionContext,
+                                           "PL/Python temporary context",
+                                           ALLOCSET_SMALL_SIZES);
+        MemoryContextSwitchTo(tmpcontext);
+
         if (nargs > 0)
-            nulls = palloc(nargs * sizeof(char));
+        {
+            values = (Datum *) palloc(nargs * sizeof(Datum));
+            nulls = (char *) palloc(nargs * sizeof(char));
+        }
         else
+        {
+            values = NULL;
             nulls = NULL;
+        }

         for (j = 0; j < nargs; j++)
         {
@@ -237,7 +252,7 @@ PLy_spi_execute_plan(PyObject *ob, PyObject *list, long limit)
             {
                 bool        isnull;

-                plan->values[j] = PLy_output_convert(arg, elem, &isnull);
+                values[j] = PLy_output_convert(arg, elem, &isnull);
                 nulls[j] = isnull ? 'n' : ' ';
             }
             PG_FINALLY(2);
@@ -247,47 +262,23 @@ PLy_spi_execute_plan(PyObject *ob, PyObject *list, long limit)
             PG_END_TRY(2);
         }

-        rv = SPI_execute_plan(plan->plan, plan->values, nulls,
+        MemoryContextSwitchTo(oldcontext);
+
+        rv = SPI_execute_plan(plan->plan, values, nulls,
                               exec_ctx->curr_proc->fn_readonly, limit);
         ret = PLy_spi_execute_fetch_result(SPI_tuptable, SPI_processed, rv);

-        if (nargs > 0)
-            pfree(nulls);
-
+        MemoryContextDelete(tmpcontext);
         PLy_spi_subtransaction_commit(oldcontext, oldowner);
     }
     PG_CATCH();
     {
-        int            k;
-
-        /*
-         * cleanup plan->values array
-         */
-        for (k = 0; k < nargs; k++)
-        {
-            if (!plan->args[k].typbyval &&
-                (plan->values[k] != PointerGetDatum(NULL)))
-            {
-                pfree(DatumGetPointer(plan->values[k]));
-                plan->values[k] = PointerGetDatum(NULL);
-            }
-        }
-
+        /* Subtransaction abort will remove the tmpcontext */
         PLy_spi_subtransaction_abort(oldcontext, oldowner);
         return NULL;
     }
     PG_END_TRY();

-    for (i = 0; i < nargs; i++)
-    {
-        if (!plan->args[i].typbyval &&
-            (plan->values[i] != PointerGetDatum(NULL)))
-        {
-            pfree(DatumGetPointer(plan->values[i]));
-            plan->values[i] = PointerGetDatum(NULL);
-        }
-    }
-
     if (rv < 0)
     {
         PLy_exception_set(PLy_exc_spi_error,