Re: Statistics Import and Export

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Statistics Import and Export
Дата
Msg-id ZYsYMqukg97HdB7u@momjian.us
обсуждение исходный текст
Ответ на Re: Statistics Import and Export  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: Statistics Import and Export  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Tue, Dec 26, 2023 at 02:18:56AM +0100, Tomas Vondra wrote:
> interfaces
> ----------
> 
> When I thought about the ability to dump/load statistics in the past, I
> usually envisioned some sort of DDL that would do the export and import.
> So for example we'd have EXPORT STATISTICS / IMPORT STATISTICS commands,
> or something like that, and that'd do all the work. This would mean
> stats are "first-class citizens" and it'd be fairly straightforward to
> add this into pg_dump, for example. Or at least I think so ...
> 
> Alternatively we could have the usual "functional" interface, with a
> functions to export/import statistics, replacing the DDL commands.
> 
> Unfortunately, none of this works for the pg_upgrade use case, because
> existing cluster versions would not support this new interface, of
> course. That's a significant flaw, as it'd make this useful only for
> upgrades of future versions.
> 
> So I think for the pg_upgrade use case, we don't have much choice other
> than using "custom" export through a view, which is what the patch does.
> 
> However, for the other use case (tweaking optimizer stats) this is not
> really an issue - that always happens on the same instance, so no issue
> with not having the "export" function and so on. I'd bet there are more
> convenient ways to do this than using the export view. I'm sure it could
> share a lot of the infrastructure, ofc.
> 
> I suggest we focus on the pg_upgrade use case for now. In particular, I
> think we really need to find a good way to integrate this into
> pg_upgrade. I'm not against having custom CLI commands, but it's still a
> manual thing - I wonder if we could extend pg_dump to dump stats, or
> make it built-in into pg_upgrade in some way (possibly disabled by
> default, or something like that).

I have some thoughts on this too.  I understand the desire to add
something that can be used for upgrades _to_ PG 17, but I am concerned
that this will give us a cumbersome API that will hamper future
development.  I think we should develop the API we want, regardless of
how useful it is for upgrades _to_ PG 17, and then figure out what
short-term hacks we can add to get it working for upgrades _to_ PG 17; 
these hacks can eventually be removed.  Even if they can't be removed,
they are export-only and we can continue developing the import SQL
command cleanly, and I think import is going to need the most long-term
maintenance.

I think we need a robust API to handle two cases:

*  changes in how we store statistics
*  changes in how how data type values are represented in the statistics

We have had such changes in the past, and I think these two issues are
what have prevented import/export of statistics up to this point.
Developing an API that doesn't cleanly handle these will cause long-term
pain.

In summary, I think we need an SQL-level command for this.  I think we
need to embed the Postgres export version number into the statistics
export file (maybe in the COPY header), and then load the file via COPY
internally (not JSON) into a temporary table that we know matches the
exported Postgres version.  We then need to use SQL to make any
adjustments to it before loading it into pg_statistic.  Doing that
internally in JSON just isn't efficient.  If people want JSON for such
cases, I suggest we add a JSON format to COPY.

I think we can then look at pg_upgrade to see if we can simulate the
export action which can use the statistics import SQL command.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ranier Vilela
Дата:
Сообщение: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Moving forward with TDE