Re: Statistics Import and Export

From Tomas Vondra
Subject Re: Statistics Import and Export
Date
Msg-id 76596388-6fe6-0baf-351d-734458a46d76@enterprisedb.com
In response to Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
Responses Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
List pgsql-hackers
On 11/2/23 06:01, Corey Huinker wrote:
> 
> 
>     Maybe I just don't understand, but I'm pretty sure ANALYZE does not
>     derive index stats from column stats. It actually builds them from the
>     row sample.
> 
> 
> That is correct, my error.
>  
> 
> 
>     > * now support extended statistics except for MCV, which is currently
>     > serialized as a difficult-to-decompose bytea field.
> 
>     Doesn't pg_mcv_list_items() already do all the heavy work?
> 
> 
> Thanks! I'll look into that.
> 
> The comment below in mcv.c made me think there was no easy way to get
> output.
> 
> /*
>  * pg_mcv_list_out      - output routine for type pg_mcv_list.
>  *
>  * MCV lists are serialized into a bytea value, so we simply call byteaout()
>  * to serialize the value into text. But it'd be nice to serialize that into
>  * a meaningful representation (e.g. for inspection by people).
>  *
>  * XXX This should probably return something meaningful, similar to what
>  * pg_dependencies_out does. Not sure how to deal with the deduplicated
>  * values, though - do we want to expand that or not?
>  */
> 

Yeah, that was the simplest possible output function; it didn't seem
worth implementing something more advanced. pg_mcv_list_items() is
more convenient for most needs, but its output is quite far from the
on-disk representation.

That's actually a good question - how close should the exported data
be to the on-disk format? I'd say we should keep it abstract, not tied
to the details of the on-disk format (which might easily change between
versions).

I'm a bit confused about the JSON schema used in the pg_statistic_export
view, though. It simply serializes stakinds, stavalues, stanumbers into
arrays ... which works, but why not use JSON nesting? I mean, there
could be a nested document for the histogram, MCV, ... each with just
the fields that apply to it.

  {
    ...
    histogram : { stavalues: [...] },
    mcv : { stavalues: [...], stanumbers: [...] },
    ...
  }

and so on. Also, what does TRIVIAL stand for?
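
To make the nesting idea concrete, here is a rough Python sketch that
regroups the flat arrays into one document per statistic kind. The input
layout (parallel "stakinds", "stavalues", "stanumbers" lists, one entry
per slot) and the function name nest_stats are assumptions made for
illustration, not the actual layout of the pg_statistic_export view. The
kind codes 1, 2, 3 are the MCV, histogram and correlation values from
pg_statistic.h; other kinds fall back to a generic key.

```python
# Statistic kind codes from PostgreSQL's pg_statistic.h (subset only).
KIND_NAMES = {1: "mcv", 2: "histogram", 3: "correlation"}


def nest_stats(flat):
    """Regroup parallel per-slot arrays into one nested entry per kind.

    `flat` is a hypothetical export row with three parallel lists:
    "stakinds", "stavalues" and "stanumbers" (None where a slot has no
    such array). This layout is an assumption for illustration.
    """
    nested = {}
    slots = zip(flat["stakinds"], flat["stavalues"], flat["stanumbers"])
    for kind, values, numbers in slots:
        if kind == 0:  # 0 marks an unused slot
            continue
        name = KIND_NAMES.get(kind, f"kind_{kind}")
        entry = {}
        # Keep only the fields this kind actually populates.
        if values is not None:
            entry["stavalues"] = values
        if numbers is not None:
            entry["stanumbers"] = numbers
        nested[name] = entry
    return nested
```

With this shape a histogram entry carries only stavalues, while an MCV
entry carries both stavalues and stanumbers, so a reader does not have to
match array positions against stakinds.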

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


