Обсуждение: Re: BUG #18059: Unexpected error 25001 in stored procedure

Поиск
Список
Период
Сортировка

Re: BUG #18059: Unexpected error 25001 in stored procedure

От
Tom Lane
Дата:
[ redirected to -hackers ]

PG Bug reporting form <noreply@postgresql.org> writes:
> Steps to reproduce:
> 1. Create stored procedure

> create or replace procedure test_proc()
> language plpgsql as $procedure$
> begin
>    commit;
>    set transaction isolation level repeatable read;
>    -- here follows some useful code which is omitted for brevity
> end
> $procedure$;

> 2. Open new connection

> 3. Execute the following 3 queries one by one:
> a) call test_proc();
> b) create temporary table "#tmp"(c int) on commit drop;
> c) call test_proc();
> On step c) you'll get an error
> [25001]: ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any
> query

Thanks for the report!

I looked into this.  The issue is that the plancache decides it needs
to revalidate the plan for the SET command, and that causes a
transaction start (or at least acquisition of a snapshot), which then
causes check_transaction_isolation to complain.  The weird sequence
that you have to go through to trigger the failure is conditioned by
the need to get the plancache entry into the needs-revalidation state
at the right time.  This wasn't really a problem when the plancache
code was written, but now that we have procedures it's not good.

We could imagine trying to terminate the new transaction once we've
finished revalidating the plan, but that direction seems silly to me.
A SET command has no plan to rebuild, while for commands that do need
that, terminating and restarting the transaction adds useless overhead.
So the right fix seems to be to just do nothing.  plancache.c already
knows revalidation should do nothing for TransactionStmts, but that
amount of knowledge is insufficient, as shown by this report.

One reasonable precedent is found in PlannedStmtRequiresSnapshot:
we could change plancache.c to exclude exactly the same utility
commands that does, viz

    if (IsA(utilityStmt, TransactionStmt) ||
        IsA(utilityStmt, LockStmt) ||
        IsA(utilityStmt, VariableSetStmt) ||
        IsA(utilityStmt, VariableShowStmt) ||
        IsA(utilityStmt, ConstraintsSetStmt) ||
    /* efficiency hacks from here down */
        IsA(utilityStmt, FetchStmt) ||
        IsA(utilityStmt, ListenStmt) ||
        IsA(utilityStmt, NotifyStmt) ||
        IsA(utilityStmt, UnlistenStmt) ||
        IsA(utilityStmt, CheckPointStmt))
        return false;

However, this feels unsatisfying.  "Does it require a snapshot?" is not
the same question as "does it have a plan that could need rebuilding?".
The vast majority of utility statements do not have any such plan:
they are left untouched by parse analysis, rewriting, and planning.

What I'm inclined to propose, therefore, is that we make revalidation
be a no-op for every statement type for which transformStmt() reaches
its default: case.  (When it does so, the resulting CMD_UTILITY Query
will not get any processing from the rewriter or planner either.)
That gives us this list of statements requiring revalidation:

        case T_InsertStmt:
        case T_DeleteStmt:
        case T_UpdateStmt:
        case T_MergeStmt:
        case T_SelectStmt:
        case T_ReturnStmt:
        case T_PLAssignStmt:
        case T_DeclareCursorStmt:
        case T_ExplainStmt:
        case T_CreateTableAsStmt:
        case T_CallStmt:

For maintainability's sake I'd suggest writing a new function along
the line of RawStmtRequiresParseAnalysis() and putting it beside
transformStmt(), rather than allowing plancache.c to know directly
which statement types require analysis.

Thoughts?

            regards, tom lane



Re: BUG #18059: Unexpected error 25001 in stored procedure

От
Robert Haas
Дата:
On Sat, Aug 19, 2023 at 1:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I'm inclined to propose, therefore, is that we make revalidation
> be a no-op for every statement type for which transformStmt() reaches
> its default: case.  (When it does so, the resulting CMD_UTILITY Query
> will not get any processing from the rewriter or planner either.)
> That gives us this list of statements requiring revalidation:
>
>         case T_InsertStmt:
>         case T_DeleteStmt:
>         case T_UpdateStmt:
>         case T_MergeStmt:
>         case T_SelectStmt:
>         case T_ReturnStmt:
>         case T_PLAssignStmt:
>         case T_DeclareCursorStmt:
>         case T_ExplainStmt:
>         case T_CreateTableAsStmt:
>         case T_CallStmt:

That sounds like the right thing. It is perhaps unfortunate that we
don't have a proper parse analysis/execution distinction for other
types of statements, but if that ever changes then this can be
revisited.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: BUG #18059: Unexpected error 25001 in stored procedure

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Aug 19, 2023 at 1:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'm inclined to propose, therefore, is that we make revalidation
>> be a no-op for every statement type for which transformStmt() reaches
>> its default: case.  (When it does so, the resulting CMD_UTILITY Query
>> will not get any processing from the rewriter or planner either.)

> That sounds like the right thing. It is perhaps unfortunate that we
> don't have a proper parse analysis/execution distinction for other
> types of statements, but if that ever changes then this can be
> revisited.

I started to code this, and immediately noticed that transformStmt()
already has a companion function analyze_requires_snapshot() that
returns "true" in the cases of interest ... except that it does
not return true for T_CallStmt.  Perhaps that was intentional to
begin with, but it is very hard to believe that it isn't a bug now,
since transformCallStmt can invoke nearly arbitrary processing via
transformExpr().  What semantic anomalies, if any, do we risk if CALL
processing forces a transaction start?  (I rather imagine it does
already, somewhere later on...)

Anyway, I'm now of two minds whether to use analyze_requires_snapshot()
as-is for plancache.c's invalidation test, or duplicate it under a
different name, or have two names but one is just an alias for the
other.  It still seems like "analyze requires snapshot" isn't
necessarily the exact inverse condition of "analyze is a no-op", but
it is today (assuming we agree that CALL needs a snapshot), and maybe
maintaining two duplicate functions is silly.  Thoughts?

            regards, tom lane



Re: BUG #18059: Unexpected error 25001 in stored procedure

От
Tom Lane
Дата:
I wrote:
> I started to code this, and immediately noticed that transformStmt()
> already has a companion function analyze_requires_snapshot() that
> returns "true" in the cases of interest ... except that it does
> not return true for T_CallStmt.  Perhaps that was intentional to
> begin with, but it is very hard to believe that it isn't a bug now,
> since transformCallStmt can invoke nearly arbitrary processing via
> transformExpr().  What semantic anomalies, if any, do we risk if CALL
> processing forces a transaction start?  (I rather imagine it does
> already, somewhere later on...)

I poked around some more, and determined that there should not be any
new semantic anomalies if analyze_requires_snapshot starts returning
true for CallStmts, because ExecuteCallStmt already acquires and
releases a snapshot before invoking the procedure (at least in the
non-atomic case, which is the one of interest here).  I spent some
time trying to devise a test case showing it's broken, and did not
succeed: the fact that we disallow sub-SELECTs in CALL arguments makes
it a lot harder than I'd expected to reach anyplace that would require
having a transaction snapshot set.  Nonetheless, I have very little
faith that such a scenario doesn't exist today, and even less that
we won't add one in future.  The only real reason I can see for not
setting a snapshot here is as a micro-optimization.  While that's
not without value, it seems hard to argue that CALL deserves an
optimization that SELECT doesn't get.

I also realized that ReturnStmts are likewise missing from
analyze_requires_snapshot().  This is probably unreachable, because
ReturnStmt can only appear in a SQL-language function and I can't
think of a scenario where we'd be parsing one outside a transaction.
Nonetheless it seems hard to argue that this is an optimization
we need to keep.

Hence I propose the attached patch, which invents
stmt_requires_parse_analysis() and makes analyze_requires_snapshot()
into an alias for it, so that all these statement types are treated
alike.  I made the adjacent comments a lot more opinionated, too,
in hopes that future additions won't overlook these concerns.

The actual bug fix is in plancache.c.  I decided to invert the tests
in plancache.c, because the macro really needed renaming anyway and
it seemed to read better this way.  I also noticed that
ResetPlanCache() already tries to optimize away invalidation of
utility statements, but that logic seems no longer necessary ---
what's more, it's outright wrong for CALL, because that does need
invalidation and won't get it.  (I have not tried to build a test
case proving that that's broken, but surely it is.)

Barring objections, this needs to be back-patched as far as v11.

            regards, tom lane

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 4006632092..bc8176f545 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -335,6 +335,11 @@ transformStmt(ParseState *pstate, Node *parseTree)
     }
 #endif                            /* RAW_EXPRESSION_COVERAGE_TEST */

+    /*
+     * Caution: when changing the set of statement types that have non-default
+     * processing here, see also stmt_requires_parse_analysis() and
+     * analyze_requires_snapshot().
+     */
     switch (nodeTag(parseTree))
     {
             /*
@@ -421,14 +426,22 @@ transformStmt(ParseState *pstate, Node *parseTree)
 }

 /*
- * analyze_requires_snapshot
- *        Returns true if a snapshot must be set before doing parse analysis
- *        on the given raw parse tree.
+ * stmt_requires_parse_analysis
+ *        Returns true if parse analysis will do anything non-trivial
+ *        with the given raw parse tree.
+ *
+ * Generally, this should return true for any statement type for which
+ * transformStmt() does more than wrap a CMD_UTILITY Query around it.
+ * When it returns false, the caller may assume that there is no situation
+ * in which parse analysis of the raw statement could need to be re-done.
  *
- * Classification here should match transformStmt().
+ * Currently, since the rewriter and planner do nothing for CMD_UTILITY
+ * Queries, a false result means that the entire parse analysis/rewrite/plan
+ * pipeline will never need to be re-done.  If that ever changes, callers
+ * will likely need adjustment.
  */
 bool
-analyze_requires_snapshot(RawStmt *parseTree)
+stmt_requires_parse_analysis(RawStmt *parseTree)
 {
     bool        result;

@@ -442,6 +455,7 @@ analyze_requires_snapshot(RawStmt *parseTree)
         case T_UpdateStmt:
         case T_MergeStmt:
         case T_SelectStmt:
+        case T_ReturnStmt:
         case T_PLAssignStmt:
             result = true;
             break;
@@ -452,12 +466,12 @@ analyze_requires_snapshot(RawStmt *parseTree)
         case T_DeclareCursorStmt:
         case T_ExplainStmt:
         case T_CreateTableAsStmt:
-            /* yes, because we must analyze the contained statement */
+        case T_CallStmt:
             result = true;
             break;

         default:
-            /* other utility statements don't have any real parse analysis */
+            /* all other statements just get wrapped in a CMD_UTILITY Query */
             result = false;
             break;
     }
@@ -465,6 +479,30 @@ analyze_requires_snapshot(RawStmt *parseTree)
     return result;
 }

+/*
+ * analyze_requires_snapshot
+ *        Returns true if a snapshot must be set before doing parse analysis
+ *        on the given raw parse tree.
+ */
+bool
+analyze_requires_snapshot(RawStmt *parseTree)
+{
+    /*
+     * Currently, this should return true in exactly the same cases that
+     * stmt_requires_parse_analysis() does, so we just invoke that function
+     * rather than duplicating it.  We keep the two entry points separate for
+     * clarity of callers, since from the callers' standpoint these are
+     * different conditions.
+     *
+     * While there may someday be a statement type for which transformStmt()
+     * does something nontrivial and yet no snapshot is needed for that
+     * processing, it seems likely that making such a choice would be fragile.
+     * If you want to install an exception, document the reasoning for it in a
+     * comment.
+     */
+    return stmt_requires_parse_analysis(parseTree);
+}
+
 /*
  * transformDeleteStmt -
  *      transforms a Delete Statement
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index d67cd9a405..7d4168f82f 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -77,13 +77,15 @@

 /*
  * We must skip "overhead" operations that involve database access when the
- * cached plan's subject statement is a transaction control command.
- * For the convenience of postgres.c, treat empty statements as control
- * commands too.
+ * cached plan's subject statement is a transaction control command or one
+ * that requires a snapshot not to be set yet (such as SET or LOCK).  More
+ * generally, statements that do not require parse analysis/rewrite/plan
+ * activity never need to be revalidated, so we can treat them all like that.
+ * For the convenience of postgres.c, treat empty statements that way too.
  */
-#define IsTransactionStmtPlan(plansource)  \
-    ((plansource)->raw_parse_tree == NULL || \
-     IsA((plansource)->raw_parse_tree->stmt, TransactionStmt))
+#define StmtPlanRequiresRevalidation(plansource)  \
+    ((plansource)->raw_parse_tree != NULL && \
+     stmt_requires_parse_analysis((plansource)->raw_parse_tree))

 /*
  * This is the head of the backend's list of "saved" CachedPlanSources (i.e.,
@@ -383,13 +385,13 @@ CompleteCachedPlan(CachedPlanSource *plansource,
     plansource->query_context = querytree_context;
     plansource->query_list = querytree_list;

-    if (!plansource->is_oneshot && !IsTransactionStmtPlan(plansource))
+    if (!plansource->is_oneshot && StmtPlanRequiresRevalidation(plansource))
     {
         /*
          * Use the planner machinery to extract dependencies.  Data is saved
          * in query_context.  (We assume that not a lot of extra cruft is
          * created by this call.)  We can skip this for one-shot plans, and
-         * transaction control commands have no such dependencies anyway.
+         * plans not needing revalidation have no such dependencies anyway.
          */
         extract_query_dependencies((Node *) querytree_list,
                                    &plansource->relationOids,
@@ -568,11 +570,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
     /*
      * For one-shot plans, we do not support revalidation checking; it's
      * assumed the query is parsed, planned, and executed in one transaction,
-     * so that no lock re-acquisition is necessary.  Also, there is never any
-     * need to revalidate plans for transaction control commands (and we
-     * mustn't risk any catalog accesses when handling those).
+     * so that no lock re-acquisition is necessary.  Also, if the statement
+     * type can't require revalidation, we needn't do anything (and we mustn't
+     * risk catalog accesses when handling, eg, transaction control commands).
      */
-    if (plansource->is_oneshot || IsTransactionStmtPlan(plansource))
+    if (plansource->is_oneshot || !StmtPlanRequiresRevalidation(plansource))
     {
         Assert(plansource->is_valid);
         return NIL;
@@ -1029,8 +1031,8 @@ choose_custom_plan(CachedPlanSource *plansource, ParamListInfo boundParams)
     /* Otherwise, never any point in a custom plan if there's no parameters */
     if (boundParams == NULL)
         return false;
-    /* ... nor for transaction control statements */
-    if (IsTransactionStmtPlan(plansource))
+    /* ... nor when planning would be a no-op */
+    if (!StmtPlanRequiresRevalidation(plansource))
         return false;

     /* Let settings force the decision */
@@ -1972,8 +1974,8 @@ PlanCacheRelCallback(Datum arg, Oid relid)
         if (!plansource->is_valid)
             continue;

-        /* Never invalidate transaction control commands */
-        if (IsTransactionStmtPlan(plansource))
+        /* Never invalidate if parse/plan would be a no-op anyway */
+        if (!StmtPlanRequiresRevalidation(plansource))
             continue;

         /*
@@ -2057,8 +2059,8 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
         if (!plansource->is_valid)
             continue;

-        /* Never invalidate transaction control commands */
-        if (IsTransactionStmtPlan(plansource))
+        /* Never invalidate if parse/plan would be a no-op anyway */
+        if (!StmtPlanRequiresRevalidation(plansource))
             continue;

         /*
@@ -2167,7 +2169,6 @@ ResetPlanCache(void)
     {
         CachedPlanSource *plansource = dlist_container(CachedPlanSource,
                                                        node, iter.cur);
-        ListCell   *lc;

         Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);

@@ -2179,32 +2180,16 @@ ResetPlanCache(void)
          * We *must not* mark transaction control statements as invalid,
          * particularly not ROLLBACK, because they may need to be executed in
          * aborted transactions when we can't revalidate them (cf bug #5269).
+         * In general there's no point in invalidating statements for which a
+         * new parse analysis/rewrite/plan cycle would certainly give the same
+         * results.
          */
-        if (IsTransactionStmtPlan(plansource))
+        if (!StmtPlanRequiresRevalidation(plansource))
             continue;

-        /*
-         * In general there is no point in invalidating utility statements
-         * since they have no plans anyway.  So invalidate it only if it
-         * contains at least one non-utility statement, or contains a utility
-         * statement that contains a pre-analyzed query (which could have
-         * dependencies.)
-         */
-        foreach(lc, plansource->query_list)
-        {
-            Query       *query = lfirst_node(Query, lc);
-
-            if (query->commandType != CMD_UTILITY ||
-                UtilityContainsQuery(query->utilityStmt))
-            {
-                /* non-utility statement, so invalidate */
-                plansource->is_valid = false;
-                if (plansource->gplan)
-                    plansource->gplan->is_valid = false;
-                /* no need to look further */
-                break;
-            }
-        }
+        plansource->is_valid = false;
+        if (plansource->gplan)
+            plansource->gplan->is_valid = false;
     }

     /* Likewise invalidate cached expressions */
diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h
index 1cef1833a6..c96483ae78 100644
--- a/src/include/parser/analyze.h
+++ b/src/include/parser/analyze.h
@@ -47,6 +47,7 @@ extern List *transformUpdateTargetList(ParseState *pstate,
 extern Query *transformTopLevelStmt(ParseState *pstate, RawStmt *parseTree);
 extern Query *transformStmt(ParseState *pstate, Node *parseTree);

+extern bool stmt_requires_parse_analysis(RawStmt *parseTree);
 extern bool analyze_requires_snapshot(RawStmt *parseTree);

 extern const char *LCS_asString(LockClauseStrength strength);
diff --git a/src/pl/plpgsql/src/expected/plpgsql_call.out b/src/pl/plpgsql/src/expected/plpgsql_call.out
index 1ec6182a8d..7ab23c6a21 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_call.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_call.out
@@ -35,6 +35,23 @@ SELECT * FROM test1;
  55
 (1 row)

+-- Check that plan revalidation doesn't prevent setting transaction properties
+-- (bug #18059).  This test must include the first temp-object creation in
+-- this script, or it won't exercise the bug scenario.  Hence we put it early.
+CREATE PROCEDURE test_proc3a()
+LANGUAGE plpgsql
+AS $$
+BEGIN
+   COMMIT;
+   SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
+   RAISE NOTICE 'done';
+END;
+$$;
+CALL test_proc3a();
+NOTICE:  done
+CREATE TEMP TABLE tt1(f1 int);
+CALL test_proc3a();
+NOTICE:  done
 -- nested CALL
 TRUNCATE TABLE test1;
 CREATE PROCEDURE test_proc4(y int)
diff --git a/src/pl/plpgsql/src/sql/plpgsql_call.sql b/src/pl/plpgsql/src/sql/plpgsql_call.sql
index 5028398348..14bbffa0b2 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_call.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_call.sql
@@ -38,6 +38,24 @@ CALL test_proc3(55);
 SELECT * FROM test1;


+-- Check that plan revalidation doesn't prevent setting transaction properties
+-- (bug #18059).  This test must include the first temp-object creation in
+-- this script, or it won't exercise the bug scenario.  Hence we put it early.
+CREATE PROCEDURE test_proc3a()
+LANGUAGE plpgsql
+AS $$
+BEGIN
+   COMMIT;
+   SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
+   RAISE NOTICE 'done';
+END;
+$$;
+
+CALL test_proc3a();
+CREATE TEMP TABLE tt1(f1 int);
+CALL test_proc3a();
+
+
 -- nested CALL
 TRUNCATE TABLE test1;