Re: identifying the backend that owns a temporary schema

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: identifying the backend that owns a temporary schema
Дата
Msg-id 4157446.1664405780@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: identifying the backend that owns a temporary schema  (Nathan Bossart <nathandbossart@gmail.com>)
Ответы Re: identifying the backend that owns a temporary schema  (Nathan Bossart <nathandbossart@gmail.com>)
Список pgsql-hackers
Nathan Bossart <nathandbossart@gmail.com> writes:
> Thanks for the suggestion.  I used it in v4 of the patch.

I reviewed this and made some changes, some cosmetic some less so.

Notably, I was bemused that of the four calls of
pgstat_fetch_stat_local_beentry, three tested for a NULL result even
though they cannot get one, while the fourth (pg_stat_get_backend_idset)
*is* at hazard of a NULL result but lacked a check.  I changed
pg_stat_get_backend_idset so that it too cannot get a NULL, and deleted
the dead code from the other callers.

A point that still bothers me a bit about pg_stat_get_backend_idset is
that it could miss or duplicate some backend IDs if the user calls
pg_stat_clear_snapshot() partway through the SRF's run, and we reload
a different set of backend entries than we had before.  I added a comment
about that, with an argument why it's not worth working harder, but
is the argument convincing?  If not, what should we do?

Also, I realized that the functions we're changing here are mostly
not exercised in the current regression tests :-(.  So I added a
small test case.

I think this is probably committable if you agree with my changes.

            regards, tom lane

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..342b20ebeb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5485,20 +5485,23 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
    the <structname>pg_stat_activity</structname> view, returns a set of records
    containing all the available information about each backend process.
    Sometimes it may be more convenient to obtain just a subset of this
-   information.  In such cases, an older set of per-backend statistics
+   information.  In such cases, another set of per-backend statistics
    access functions can be used; these are shown in <xref
    linkend="monitoring-stats-backend-funcs-table"/>.
-   These access functions use a backend ID number, which ranges from one
-   to the number of currently active backends.
+   These access functions use the session's backend ID number, which is a
+   small positive integer that is distinct from the backend ID of any
+   concurrent session, although a session's ID can be recycled as soon as
+   it exits.  The backend ID is used, among other things, to identify the
+   session's temporary schema if it has one.
    The function <function>pg_stat_get_backend_idset</function> provides a
-   convenient way to generate one row for each active backend for
+   convenient way to list all the active backends' ID numbers for
    invoking these functions.  For example, to show the <acronym>PID</acronym>s and
    current queries of all backends:

 <programlisting>
-SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
-       pg_stat_get_backend_activity(s.backendid) AS query
-    FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
+SELECT pg_stat_get_backend_pid(backendid) AS pid,
+       pg_stat_get_backend_activity(backendid) AS query
+FROM pg_stat_get_backend_idset() AS backendid;
 </programlisting>
   </para>

@@ -5526,8 +5529,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
         <returnvalue>setof integer</returnvalue>
        </para>
        <para>
-        Returns the set of currently active backend ID numbers (from 1 to the
-        number of active backends).
+        Returns the set of currently active backend ID numbers.
        </para></entry>
       </row>

diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6d7a..22c79e156b 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -846,6 +846,12 @@ pgstat_read_current_status(void)
         /* Only valid entries get included into the local array */
         if (localentry->backendStatus.st_procpid > 0)
         {
+            /*
+             * The array index is exactly the BackendId of the source backend.
+             * Note that this means the localBackendStatusTable is in order by
+             * backend_id.  pgstat_fetch_stat_beentry() depends on that.
+             */
+            localentry->backend_id = i;
             BackendIdGetTransactionIds(i,
                                        &localentry->backend_xid,
                                        &localentry->backend_xmin);
@@ -1045,26 +1051,57 @@ pgstat_get_my_query_id(void)
     return MyBEEntry->st_query_id;
 }

+/* ----------
+ * cmp_lbestatus
+ *
+ *    Comparison function for bsearch() on an array of LocalPgBackendStatus.
+ *    The backend_id field is used to compare the arguments.
+ * ----------
+ */
+static int
+cmp_lbestatus(const void *a, const void *b)
+{
+    const LocalPgBackendStatus *lbestatus1 = (const LocalPgBackendStatus *) a;
+    const LocalPgBackendStatus *lbestatus2 = (const LocalPgBackendStatus *) b;
+
+    return lbestatus1->backend_id - lbestatus2->backend_id;
+}

 /* ----------
  * pgstat_fetch_stat_beentry() -
  *
  *    Support function for the SQL-callable pgstat* functions. Returns
- *    our local copy of the current-activity entry for one backend.
+ *    our local copy of the current-activity entry for one backend,
+ *    or NULL if the given beid doesn't identify any known session.
+ *
+ *    The beid argument is the BackendId of the desired session
+ *    (note that this is unlike pgstat_fetch_stat_local_beentry()).
  *
  *    NB: caller is responsible for a check if the user is permitted to see
  *    this info (especially the querystring).
  * ----------
  */
 PgBackendStatus *
-pgstat_fetch_stat_beentry(int beid)
+pgstat_fetch_stat_beentry(BackendId beid)
 {
+    LocalPgBackendStatus key;
+    LocalPgBackendStatus *ret;
+
     pgstat_read_current_status();

-    if (beid < 1 || beid > localNumBackends)
-        return NULL;
+    /*
+     * Since the localBackendStatusTable is in order by backend_id, we can use
+     * bsearch() to search it efficiently.
+     */
+    key.backend_id = beid;
+    ret = (LocalPgBackendStatus *) bsearch(&key, localBackendStatusTable,
+                                           localNumBackends,
+                                           sizeof(LocalPgBackendStatus),
+                                           cmp_lbestatus);
+    if (ret)
+        return &ret->backendStatus;

-    return &localBackendStatusTable[beid - 1].backendStatus;
+    return NULL;
 }


@@ -1074,6 +1111,10 @@ pgstat_fetch_stat_beentry(int beid)
  *    Like pgstat_fetch_stat_beentry() but with locally computed additions (like
  *    xid and xmin values of the backend)
  *
+ *    The beid argument is a 1-based index in the localBackendStatusTable
+ *    (note that this is unlike pgstat_fetch_stat_beentry()).
+ *    Returns NULL if the argument is out of range (no current caller does that).
+ *
  *    NB: caller is responsible for a check if the user is permitted to see
  *    this info (especially the querystring).
  * ----------
@@ -1094,7 +1135,8 @@ pgstat_fetch_stat_local_beentry(int beid)
  * pgstat_fetch_stat_numbackends() -
  *
  *    Support function for the SQL-callable pgstat* functions. Returns
- *    the maximum current backend id.
+ *    the number of sessions known in the localBackendStatusTable, i.e.
+ *    the maximum 1-based index to pass to pgstat_fetch_stat_local_beentry().
  * ----------
  */
 int
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index be15b4b2e5..d5ca9208d7 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -415,7 +415,6 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 {
     FuncCallContext *funcctx;
     int           *fctx;
-    int32        result;

     /* stuff done only on the first call of the function */
     if (SRF_IS_FIRSTCALL())
@@ -424,11 +423,10 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
         funcctx = SRF_FIRSTCALL_INIT();

         fctx = MemoryContextAlloc(funcctx->multi_call_memory_ctx,
-                                  2 * sizeof(int));
+                                  sizeof(int));
         funcctx->user_fctx = fctx;

         fctx[0] = 0;
-        fctx[1] = pgstat_fetch_stat_numbackends();
     }

     /* stuff done on every call of the function */
@@ -436,12 +434,22 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
     fctx = funcctx->user_fctx;

     fctx[0] += 1;
-    result = fctx[0];

-    if (result <= fctx[1])
+    /*
+     * We recheck pgstat_fetch_stat_numbackends() each time through, just in
+     * case the local status data has been refreshed since we started.  It's
+     * plenty cheap enough if not.  If a refresh does happen, we'll likely
+     * miss or duplicate some backend IDs, but we're content not to crash.
+     * (Refreshing midway through such a query would be problematic usage
+     * anyway, since the other pgstatfuncs functions that the backend IDs will
+     * get passed to would then deliver inconsistent results.)
+     */
+    if (fctx[0] <= pgstat_fetch_stat_numbackends())
     {
         /* do when there is more left to send */
-        SRF_RETURN_NEXT(funcctx, Int32GetDatum(result));
+        LocalPgBackendStatus *local_beentry = pgstat_fetch_stat_local_beentry(fctx[0]);
+
+        SRF_RETURN_NEXT(funcctx, Int32GetDatum(local_beentry->backend_id));
     }
     else
     {
@@ -493,17 +501,13 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
         int            i;

         local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
-
-        if (!local_beentry)
-            continue;
-
         beentry = &local_beentry->backendStatus;

         /*
          * Report values for only those backends which are running the given
          * command.
          */
-        if (!beentry || beentry->st_progress_command != cmdtype)
+        if (beentry->st_progress_command != cmdtype)
             continue;

         /* Value available to all callers */
@@ -558,24 +562,6 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)

         /* Get the next one in the list */
         local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
-        if (!local_beentry)
-        {
-            int            i;
-
-            /* Ignore missing entries if looking for specific PID */
-            if (pid != -1)
-                continue;
-
-            for (i = 0; i < lengthof(nulls); i++)
-                nulls[i] = true;
-
-            nulls[5] = false;
-            values[5] = CStringGetTextDatum("<backend information not available>");
-
-            tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
-            continue;
-        }
-
         beentry = &local_beentry->backendStatus;

         /* If looking for specific PID, ignore all the others */
@@ -1180,9 +1166,9 @@ pg_stat_get_db_numbackends(PG_FUNCTION_ARGS)
     result = 0;
     for (beid = 1; beid <= tot_backends; beid++)
     {
-        PgBackendStatus *beentry = pgstat_fetch_stat_beentry(beid);
+        LocalPgBackendStatus *local_beentry = pgstat_fetch_stat_local_beentry(beid);

-        if (beentry && beentry->st_databaseid == dbid)
+        if (local_beentry->backendStatus.st_databaseid == dbid)
             result++;
     }

diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 7403bca25e..b582b46e9f 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -13,6 +13,7 @@
 #include "datatype/timestamp.h"
 #include "libpq/pqcomm.h"
 #include "miscadmin.h"            /* for BackendType */
+#include "storage/backendid.h"
 #include "utils/backend_progress.h"


@@ -247,6 +248,13 @@ typedef struct LocalPgBackendStatus
      */
     PgBackendStatus backendStatus;

+    /*
+     * The backend ID.  For auxiliary processes, this will be set to a value
+     * greater than MaxBackends (since auxiliary processes do not have proper
+     * backend IDs).
+     */
+    BackendId    backend_id;
+
     /*
      * The xid of the current transaction if available, InvalidTransactionId
      * if not.
@@ -313,7 +321,7 @@ extern uint64 pgstat_get_my_query_id(void);
  * ----------
  */
 extern int    pgstat_fetch_stat_numbackends(void);
-extern PgBackendStatus *pgstat_fetch_stat_beentry(int beid);
+extern PgBackendStatus *pgstat_fetch_stat_beentry(BackendId beid);
 extern LocalPgBackendStatus *pgstat_fetch_stat_local_beentry(int beid);
 extern char *pgstat_clip_activity(const char *raw_activity);

diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 6a10dc462b..7a1ffd404d 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -576,15 +576,21 @@ SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELEC

 -- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal
 SELECT checkpoints_req AS rqst_ckpts_before FROM pg_stat_bgwriter \gset
--- Test pg_stat_wal
+-- Test pg_stat_wal (and make a temp table so our temp schema exists)
 SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset
-CREATE TABLE test_stats_temp AS SELECT 17;
+CREATE TEMP TABLE test_stats_temp AS SELECT 17;
 DROP TABLE test_stats_temp;
 -- Checkpoint twice: The checkpointer reports stats after reporting completion
 -- of the checkpoint. But after a second checkpoint we'll see at least the
 -- results of the first.
 CHECKPOINT;
 CHECKPOINT;
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush
+--------------------------
+
+(1 row)
+
 SELECT checkpoints_req > :rqst_ckpts_before FROM pg_stat_bgwriter;
  ?column?
 ----------
@@ -597,6 +603,17 @@ SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal;
  t
 (1 row)

+-- Test pg_stat_get_backend_idset() and some allied functions.
+-- In particular, verify that their notion of backend ID matches
+-- our temp schema index.
+SELECT (current_schemas(true))[1] = ('pg_temp_' || beid::text) AS match
+FROM pg_stat_get_backend_idset() beid
+WHERE pg_stat_get_backend_pid(beid) = pg_backend_pid();
+ match
+-------
+ t
+(1 row)
+
 -----
 -- Test that resetting stats works for reset timestamp
 -----
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index a6b0e9e042..197db3eba3 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -303,10 +303,10 @@ SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELEC
 -- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal
 SELECT checkpoints_req AS rqst_ckpts_before FROM pg_stat_bgwriter \gset

--- Test pg_stat_wal
+-- Test pg_stat_wal (and make a temp table so our temp schema exists)
 SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset

-CREATE TABLE test_stats_temp AS SELECT 17;
+CREATE TEMP TABLE test_stats_temp AS SELECT 17;
 DROP TABLE test_stats_temp;

 -- Checkpoint twice: The checkpointer reports stats after reporting completion
@@ -315,9 +315,16 @@ DROP TABLE test_stats_temp;
 CHECKPOINT;
 CHECKPOINT;

+SELECT pg_stat_force_next_flush();
 SELECT checkpoints_req > :rqst_ckpts_before FROM pg_stat_bgwriter;
 SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal;

+-- Test pg_stat_get_backend_idset() and some allied functions.
+-- In particular, verify that their notion of backend ID matches
+-- our temp schema index.
+SELECT (current_schemas(true))[1] = ('pg_temp_' || beid::text) AS match
+FROM pg_stat_get_backend_idset() beid
+WHERE pg_stat_get_backend_pid(beid) = pg_backend_pid();

 -----
 -- Test that resetting stats works for reset timestamp

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: making relfilenodes 56 bits
Следующее
От: David Rowley
Дата:
Сообщение: Re: A potential memory leak on Merge Join when Sort node is not below Materialize node