Re: Statistics Import and Export

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Statistics Import and Export
Дата
Msg-id CALj2ACW9TYf8Lyuy6Y4HnPB63b4G4FMOOW0BtvWjgQiYE-db4g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
Ответы Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
Список pgsql-hackers
On Fri, Mar 8, 2024 at 12:06 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>
> Anyway, here's v7. Eagerly awaiting feedback.

Thanks for working on this. It looks useful to have the ability to
restore the stats after upgrade and restore. But, the exported stats
are valid only until the next ANALYZE is run on the table. IIUC,
postgres collects stats during VACUUM, autovacuum and ANALYZE, right?
 Perhaps there are other ways to collect stats. I'm thinking what
problems does the user face if they are just asked to run ANALYZE on
the tables (I'm assuming ANALYZE doesn't block concurrent access to
the tables) instead of automatically exporting stats.

Here are some comments on the v7 patch. I've not dived into the whole
thread, so some comments may be of type repeated or need
clarification. Please bear with me.

1. The following two are unnecessary changes in pg_proc.dat, please remove them.

--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8818,7 +8818,6 @@
 { oid => '3813', descr => 'generate XML text node',
   proname => 'xmltext', proisstrict => 't', prorettype => 'xml',
   proargtypes => 'text', prosrc => 'xmltext' },
-
 { oid => '2923', descr => 'map table contents to XML',
   proname => 'table_to_xml', procost => '100', provolatile => 's',
   proparallel => 'r', prorettype => 'xml',
@@ -12163,8 +12162,24 @@

 # GiST stratnum implementations
 { oid => '8047', descr => 'GiST support',
-  proname => 'gist_stratnum_identity', prorettype => 'int2',
+  proname => 'gist_stratnum_identity',prorettype => 'int2',
   proargtypes => 'int2',
   prosrc => 'gist_stratnum_identity' },

2.
+        they are replaced by the next auto-analyze. This function is used by
+        <command>pg_upgrade</command> and <command>pg_restore</command> to
+        convey the statistics from the old system version into the new one.
+       </para>

Is there any demonstration of pg_set_relation_stats and
pg_set_attribute_stats being used either in pg_upgrade or in
pg_restore? Perhaps, having them as 0002, 0003 and so on patches might
show real need for functions like this. It also clarifies how these
functions pull stats from tables on the old cluster to the tables on
the new cluster.

3. pg_set_relation_stats and pg_set_attribute_stats seem to be writing
to pg_class and might affect the plans as stats can get tampered. Can
we REVOKE the execute permissions from the public out of the box in
src/backend/catalog/system_functions.sql? This way one can decide who
to give permissions to.

4.
+SELECT pg_set_relation_stats('stats_export_import.test'::regclass,
3.6::float4, 15000);
+ pg_set_relation_stats
+-----------------------
+ t
+(1 row)
+
+SELECT reltuples, relpages FROM pg_class WHERE oid =
'stats_export_import.test'::regclass;
+ reltuples | relpages
+-----------+----------
+       3.6 |    15000

Isn't this test case showing a misuse of these functions? Table
actually has  no rows, but we are lying to the postgres optimizer on
stats. I think altering stats of a table mustn't be that easy for the
end user. As mentioned in comment #3, permissions need to be
tightened. In addition, we can also mark the functions pg_upgrade only
with CHECK_IS_BINARY_UPGRADE, but that might not work for pg_restore
(or I don't know if we have a way to know within the server that the
server is running for pg_restore).

5. In continuation to the comment #2, is pg_dump supposed to generate
pg_set_relation_stats and pg_set_attribute_stats statements for each
table? When pg_dump does that , pg_restore can automatically load the
stats.

6.
+/*-------------------------------------------------------------------------
* * statistics.c *
+ * IDENTIFICATION
+ *       src/backend/statistics/statistics.c
+ *
+ *-------------------------------------------------------------------------

A description of what the new file statistics.c does is missing.

7. pgindent isn't happy with new file statistics.c, please check.

8.
+/*
+ * Import statistics for a given relation attribute
+ *
+ * pg_set_attribute_stats(relation Oid, attname name, stainherit bool,
+ *                        stanullfrac float4, stawidth int, stadistinct float4,

Having function definition in the function comment isn't necessary -
it's hard to keep it consistent with pg_proc.dat in future. If
required, one can either look at pg_proc.dat or docs.

9. Isn't it good to add a test case where the plan of a query on table
after exporting the stats would remain same as that of the original
table from which the stats are exported? IMO, this is a more realistic
than just comparing pg_statistic of the tables because this is what an
end-user wants eventually.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: pg_dump include/exclude fails to error
Следующее
От: "Andrey M. Borodin"
Дата:
Сообщение: Re: UUID v7