Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()
Дата
Msg-id 1306569.1687978174@sss.pgh.pa.us
обсуждение исходный текст
Ответ на BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()  (PG Bug reporting form <noreply@postgresql.org>)
Ответы Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()  (Andres Freund <andres@anarazel.de>)
Список pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> The following script:
> ...
> invokes errors or a server crash:

Thanks for the report!  The good news is that the case shown here
is easily fixed, as attached.  get_cached_rowtype() explicitly
specifies that its result doesn't hold good across database
access if you don't increment the tupdesc's refcount.  Most of
the callers get this right, but ExecEvalFieldStoreDeForm() is
doing things in the wrong order by fetching the tupdesc before
detoasting the input datum.  We could fix that by temporarily
incrementing the refcount, but I don't see any reason we can't
just reorder the steps to make it safe.

The bad news is that while investigating this I came across
another hazard that seems much messier to fix.  To wit, if
you have a tuple with "missing" pass-by-ref columns, then
some of the columns output by heap_deform_tuple() will
contain pointers into the tupdesc (cf. getmissingattr()).
That's not sufficient lifespan in the scenario we're dealing
with here: if the tupdesc gets trashed anytime before the
eventual ExecEvalFieldStoreForm(), we'll have garbage in the
result tuple.

Worse, I fear there may be many other places with latent bugs of the
same ilk.  Before the attmissingval code was added, one could assume
that the result of heap_deform_tuple would hold good as long as you
had a pin on the source tuple.  But now it depends on metadata as
well, and I don't think we have a coherent story about that.

Any thoughts what to do?  I considered making getmissingattr
apply datumCopy, but that would probably lead to unpleasant
memory leaks.

            regards, tom lane

diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 7a4d7a4eee..851946a927 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -2015,7 +2015,8 @@ CheckOpSlotCompatibility(ExprEvalStep *op, TupleTableSlot *slot)
  * changed: if not NULL, *changed is set to true on any update
  *
  * The returned TupleDesc is not guaranteed pinned; caller must pin it
- * to use it across any operation that might incur cache invalidation.
+ * to use it across any operation that might incur cache invalidation,
+ * including for example detoasting of input tuples.
  * (The TupleDesc is always refcounted, so just use IncrTupleDescRefCount.)
  *
  * NOTE: because composite types can change contents, we must be prepared
@@ -3174,17 +3175,6 @@ ExecEvalFieldSelect(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
 void
 ExecEvalFieldStoreDeForm(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
 {
-    TupleDesc    tupDesc;
-
-    /* Lookup tupdesc if first time through or if type changes */
-    tupDesc = get_cached_rowtype(op->d.fieldstore.fstore->resulttype, -1,
-                                 op->d.fieldstore.rowcache, NULL);
-
-    /* Check that current tupdesc doesn't have more fields than we allocated */
-    if (unlikely(tupDesc->natts > op->d.fieldstore.ncolumns))
-        elog(ERROR, "too many columns in composite type %u",
-             op->d.fieldstore.fstore->resulttype);
-
     if (*op->resnull)
     {
         /* Convert null input tuple into an all-nulls row */
@@ -3200,6 +3190,7 @@ ExecEvalFieldStoreDeForm(ExprState *state, ExprEvalStep *op, ExprContext *econte
         Datum        tupDatum = *op->resvalue;
         HeapTupleHeader tuphdr;
         HeapTupleData tmptup;
+        TupleDesc    tupDesc;

         tuphdr = DatumGetHeapTupleHeader(tupDatum);
         tmptup.t_len = HeapTupleHeaderGetDatumLength(tuphdr);
@@ -3207,6 +3198,20 @@ ExecEvalFieldStoreDeForm(ExprState *state, ExprEvalStep *op, ExprContext *econte
         tmptup.t_tableOid = InvalidOid;
         tmptup.t_data = tuphdr;

+        /*
+         * Lookup tupdesc if first time through or if type changes.  Because
+         * we don't pin the tupdesc, we must not do this lookup until after
+         * doing DatumGetHeapTupleHeader: that could do database access while
+         * detoasting the datum.
+         */
+        tupDesc = get_cached_rowtype(op->d.fieldstore.fstore->resulttype, -1,
+                                     op->d.fieldstore.rowcache, NULL);
+
+        /* Check that current tupdesc doesn't have more fields than allocated */
+        if (unlikely(tupDesc->natts > op->d.fieldstore.ncolumns))
+            elog(ERROR, "too many columns in composite type %u",
+                 op->d.fieldstore.fstore->resulttype);
+
         heap_deform_tuple(&tmptup, tupDesc,
                           op->d.fieldstore.values,
                           op->d.fieldstore.nulls);

В списке pgsql-bugs по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: BUG #18002: Duplicate entries of row possible even after having primary key
Следующее
От: Andres Freund
Дата:
Сообщение: Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()