Обсуждение: Re: POC: track vacuum/analyze cumulative time per relation

Поиск
Список
Период
Сортировка

Re: POC: track vacuum/analyze cumulative time per relation

От
wenhui qiu
Дата:
Hi Sami 
 Thank you for your path,it seems some path monitor vacuum status,Can we synthesize their good ideas together。

On Fri, 3 Jan 2025 at 02:24, Sami Imseih <samimseih@gmail.com> wrote:
Hi,

After a recent question regarding tracking vacuum start_time in
pg_stat_all_tables [1], it dawned on me that this view is missing
an important cumulative metric, which is how much time is spent
performing vacuum per table.

Currently, the only way a user can get this
information is if they enable autovacuum logging or track timing
for manual vacuums. Even then, if a user wants to trend
the time spent vacuuming over time, they must store the
timing data somewhere and perform the calculations.

Also, unless autovacuum logging is enabled for all a/v
operations, they could have gaps in their analysis.

Having the total (auto)vacuum elapsed time
along side the existing (auto)vaccum_count
allows a user to track the average time an
operating overtime and to find vacuum tuning
opportunities.

The same can also be said for (auto)analyze.

attached a patch ( without doc changes)
that adds 4 new columns:

total_autovacuum_time
total_vacuum_time
total_autoanalyze_time
total_analyze_time

Below is an example of output and how it
can be used to derive the average vacuum
operation time.

postgres=# select
relname,
autovacuum_count,
total_autovacuum_time,
total_autovacuum_time/NULLIF(autovacuum_count,0) average_autovac_time,
vacuum_count,
total_vacuum_time,
total_vacuum_time/NULLIF(vacuum_count,0) average_vac_time
from pg_catalog.pg_stat_all_tables
where relname = 'pgbench_history';
-[ RECORD 1 ]---------+-----------------
relname               | pgbench_history
autovacuum_count      | 3
total_autovacuum_time | 1689
average_autovac_time  | 563
vacuum_count          | 1
total_vacuum_time     | 1
average_vac_time      | 1

It should be noted that the timing is only tracked
when the vacuum or analyze operation completes,
as is the case with the other metrics.

Also, there is another discussion in-flight [2] regarding
tracking vacuum run history in a view, but this serves a
different purpose as this will provide all the metrics
that are other wise exposed in vacuum logging
via sql. This history will also be required to drop
entries using some criteria to keep the cache from
growing infinitely.

Feedback for the attached patch is appreciated!

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1] https://www.postgresql.org/message-id/flat/CAGjGUAKQ4UBNdkjunH2qLsdUVG-3F9gCuG0Kb0hToo%2BuMmSteQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/b68ab452-c41f-4d04-893f-eaab84f1855b%40vondra.me

Re: POC: track vacuum/analyze cumulative time per relation

От
Alena Rybakina
Дата:
Hi!

On 04.01.2025 06:41, wenhui qiu wrote:
> Hi Sami
>  Thank you for your path,it seems some path monitor vacuum status,Can 
> we synthesize their good ideas together。
>
I've been working on a patch that collects vacuum statistics since May 
[0]. It includes heap and index relation vacuum statistics, and of 
course, database vacuum statistics. We welcome reviews and any feedback!

It collects a bit more information than total_time, such as delay_time, 
buffer statistics, wal statistics, number of processed, skipped pages 
and tuples during vacuum operations, so
this information gives a complete picture of vacuum efficiency, because 
analyzing only total_time statistics does not give a complete picture of 
what vacuum did: was it cleaning almost huge index, cleaning tables or 
just sleeping.

[0] https://commitfest.postgresql.org/51/5012/

-- 
Regards,
Alena Rybakina
Postgres Professional




Re: POC: track vacuum/analyze cumulative time per relation

От
Sami Imseih
Дата:
> this information gives a complete picture of vacuum efficiency, because
> analyzing only total_time statistics does not give a complete picture of
> what vacuum did: was it cleaning almost huge index, cleaning tables or
> just sleeping.

The purpose of total_time is to be able to calculate the average
time of a vacuum per relation from pg_stat_all_tables (by
total_time/vacuum_count) This is useful on its own to be able to
trend slow downs of vacuum ( or improvements after vacuum tuning )
for specific relations.

The work going into [1] is more detailed and will be useful
for deep dive investigations.In short, I think both proposals
have merit.

[1] https://commitfest.postgresql.org/51/5012/

Regards,

Sami



Re: POC: track vacuum/analyze cumulative time per relation

От
Sami Imseih
Дата:
> > this information gives a complete picture of vacuum efficiency, because
> > analyzing only total_time statistics does not give a complete picture of
> > what vacuum did: was it cleaning almost huge index, cleaning tables or
> > just sleeping.
>
> The purpose of total_time is to be able to calculate the average
> time of a vacuum per relation from pg_stat_all_tables (by
> total_time/vacuum_count) This is useful on its own to be able to
> trend slow downs of vacuum ( or improvements after vacuum tuning )
> for specific relations.
>
> The work going into [1] is more detailed and will be useful
> for deep dive investigations.In short, I think both proposals
> have merit.

Missed this point in my last reply. This patch also tracks
(auto)analyze per relation cumulative timing which can
be useful along with (auto)analyze_count to derive
the average.

Regards,

Sami