Re: Autovacuum on partitioned table (autoanalyze)

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Autovacuum on partitioned table (autoanalyze)
Дата
Msg-id 20210817104950.bkvv2cvttylpn4j3@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Autovacuum on partitioned table (autoanalyze)  (Álvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
Hi,

On 2021-08-16 17:42:48 -0400, Álvaro Herrera wrote:
> On 2021-Aug-16, Álvaro Herrera wrote:
>
> > Here's the reversal patch for the 14 branch.  (It applies cleanly to
> > master, but the unused member of PgStat_StatTabEntry needs to be
> > removed and catversion bumped).
>
> I have pushed this to both branches.  (I did not remove the item from
> the release notes in the 14 branch.)
>
> It upsets me to have reverted it, but after spending so much time trying
> to correct the problems, I believe it just wasn't salvageable within the
> beta-period code freeze constraints.

:(


> I described the issues I ran into
> in earlier messages; I think a good starting point to re-develop this is
> to revert the reversal commit, then apply my patch at
> https://postgr.es/m/0794d7ca-5183-486b-9c5e-6d434867cecd@www.fastmail.com
> then do something about the remaining problems that were complained
> about.  (Maybe: add an "ancestor OID" member to PgStat_StatTabEntry so
> that the collector knows to propagate counts from children to ancestors
> when the upd/ins/del counts are received.

My suspicion is that it'd be a lot easier to implement this efficiently if
there were no propagation done outside of actually analyzing tables. I.e. have
do_autovacuum() build a hashtable of (parent_table_id, count) and use that to
make the analyze decisions. And then only propagate up the costs to parents of
tables when a child is analyzed (and thus looses its changes_since_analyze)
value. Then we can use hashtable_value + changes_since_analyze for
partitioning decisions of partitioned tables.

I've prototyped this, and it does seem to make do_autovacuum() cheaper. I've
attached that prototype, but note it's in a rough state.

However, unless we change the way inheritance parents are stored, it still
requires repetitive get_partition_ancestors() (or get_partition_parent())
calls in do_autovacuum(), which I think is problematic due to the index scans
you pointed out as well.  The obvious way to address that would be to store
parent oids in pg_class - I suspect duplicating parents in pg_class is the
best way out, but pretty it is not.


> However, consider developing it as follow-up to Horiguchi-san's shmem
> pgstat rather than current pgstat implementation.)

+1


It might be worth to first tackle reusing samples from a relation's children
when building inheritance stats. Either by storing the samples somewhere (not
cheap) and reusing them, or by at least updating a partition's stats when
analyzing the parent.

Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Added schema level support for publication.
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Autovacuum on partitioned table (autoanalyze)