Re: Autovacuum on partitioned table (autoanalyze)
От | Andres Freund |
---|---|
Тема | Re: Autovacuum on partitioned table (autoanalyze) |
Дата | |
Msg-id | 20210817104950.bkvv2cvttylpn4j3@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: Autovacuum on partitioned table (autoanalyze) (Álvaro Herrera <alvherre@alvh.no-ip.org>) |
Список | pgsql-hackers |
Hi, On 2021-08-16 17:42:48 -0400, Álvaro Herrera wrote: > On 2021-Aug-16, Álvaro Herrera wrote: > > > Here's the reversal patch for the 14 branch. (It applies cleanly to > > master, but the unused member of PgStat_StatTabEntry needs to be > > removed and catversion bumped). > > I have pushed this to both branches. (I did not remove the item from > the release notes in the 14 branch.) > > It upsets me to have reverted it, but after spending so much time trying > to correct the problems, I believe it just wasn't salvageable within the > beta-period code freeze constraints. :( > I described the issues I ran into > in earlier messages; I think a good starting point to re-develop this is > to revert the reversal commit, then apply my patch at > https://postgr.es/m/0794d7ca-5183-486b-9c5e-6d434867cecd@www.fastmail.com > then do something about the remaining problems that were complained > about. (Maybe: add an "ancestor OID" member to PgStat_StatTabEntry so > that the collector knows to propagate counts from children to ancestors > when the upd/ins/del counts are received. My suspicion is that it'd be a lot easier to implement this efficiently if there were no propagation done outside of actually analyzing tables. I.e. have do_autovacuum() build a hashtable of (parent_table_id, count) and use that to make the analyze decisions. And then only propagate up the costs to parents of tables when a child is analyzed (and thus looses its changes_since_analyze) value. Then we can use hashtable_value + changes_since_analyze for partitioning decisions of partitioned tables. I've prototyped this, and it does seem to make do_autovacuum() cheaper. I've attached that prototype, but note it's in a rough state. However, unless we change the way inheritance parents are stored, it still requires repetitive get_partition_ancestors() (or get_partition_parent()) calls in do_autovacuum(), which I think is problematic due to the index scans you pointed out as well. The obvious way to address that would be to store parent oids in pg_class - I suspect duplicating parents in pg_class is the best way out, but pretty it is not. > However, consider developing it as follow-up to Horiguchi-san's shmem > pgstat rather than current pgstat implementation.) +1 It might be worth to first tackle reusing samples from a relation's children when building inheritance stats. Either by storing the samples somewhere (not cheap) and reusing them, or by at least updating a partition's stats when analyzing the parent. Greetings, Andres Freund
Вложения
В списке pgsql-hackers по дате отправления: