Обсуждение: Re: POC: track vacuum/analyze cumulative time per relation

Поиск

Список

Период

Сортировка

Re: POC: track vacuum/analyze cumulative time per relation

От

Bertrand Drouvot

Дата:

10 января 2025 г., 18:09:50

Hi,

On Thu, Jan 02, 2025 at 12:24:06PM -0600, Sami Imseih wrote:
> Having the total (auto)vacuum elapsed time
> along side the existing (auto)vaccum_count
> allows a user to track the average time an
> operating overtime and to find vacuum tuning
> opportunities.
> 
> The same can also be said for (auto)analyze.

I think that makes sense to expose those metrics through SQL as you're proposing
here. The more we expose through SQL the better I think.

> attached a patch ( without doc changes)
> that adds 4 new columns:

Thanks!

A few random comments:

=== 1

+       endtime = GetCurrentTimestamp();
        pgstat_report_vacuum(RelationGetRelid(rel),
                                                 rel->rd_rel->relisshared,
                                                 Max(vacrel->new_live_tuples, 0),
                                                 vacrel->recently_dead_tuples +
-                                                vacrel->missed_dead_tuples);
+                                                vacrel->missed_dead_tuples,
+                                                starttime,
+                                                endtime);
        pgstat_progress_end_command();

        if (instrument)
        {
-               TimestampTz endtime = GetCurrentTimestamp();

What about keeping the endtime assignment after the pgstat_progress_end_command()
call? I think that it makes more sense that way.

=== 2

        pgstat_report_vacuum(RelationGetRelid(rel),
                                                 rel->rd_rel->relisshared,
                                                 Max(vacrel->new_live_tuples, 0),
                                                 vacrel->recently_dead_tuples +
-                                                vacrel->missed_dead_tuples);
+                                                vacrel->missed_dead_tuples,
+                                                starttime,
+                                                endtime);

What about doing the elapsedtime computation prior the this call and passed
it as a parameter (then remove the starttime one and keep the endtime as it
is needed)?

=== 3

        pgstat_report_analyze(onerel, totalrows, totaldeadrows,
-                             (va_cols == NIL));
+                             (va_cols == NIL), starttime, endtime);

Same as 2 for pgstat_report_analyze() except that the endtime could be removed
too.

=== 4

+/* pg_stat_get_vacuum_time */
+PG_STAT_GET_RELENTRY_INT64(total_vacuum_time)
+
+/* pg_stat_get_autovacuum_time */
+PG_STAT_GET_RELENTRY_INT64(total_autovacuum_time)
+
+/* pg_stat_get_analyze_time */
+PG_STAT_GET_RELENTRY_INT64(total_analyze_time)
+
+/* pg_stat_get_autoanalyze_time */
+PG_STAT_GET_RELENTRY_INT64(total_autoanalyze_time)

I wonder if it wouldn't be better to use FLOAT8 here (to match things like
pg_stat_get_checkpointer_write_time(), pg_stat_get_checkpointer_sync_time() among
others).

=== 5

+
+       PgStat_Counter total_vacuum_time;       /* user initiated vacuum */
+       PgStat_Counter total_autovacuum_time;   /* autovacuum initiated */
+       PgStat_Counter total_analyze_time;      /* user initiated vacuum */
+       PgStat_Counter total_autoanalyze_time;  /* autovacuum initiated */

Those comments look weird to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: POC: track vacuum/analyze cumulative time per relation

От

Bertrand Drouvot

Дата:

13 января 2025 г., 18:55:50

Hi,

On Fri, Jan 10, 2025 at 04:26:05PM -0600, Sami Imseih wrote:
> {
> /* time is already in msec, just convert to double for presentation */
> PG_RETURN_FLOAT8((double)
> pgstat_fetch_stat_checkpointer()->write_time);
> }
> 
> > +
> > +       PgStat_Counter total_vacuum_time;       /* user initiated vacuum */
> > +       PgStat_Counter total_autovacuum_time;   /* autovacuum initiated */
> > +       PgStat_Counter total_analyze_time;      /* user initiated vacuum */
> > +       PgStat_Counter total_autoanalyze_time;  /* autovacuum initiated */
> >
> > Those comments look weird to me.
> >
> 
> Ok, Will remove these.
> 
> I also updated the comments for the instrument code path to reflect the
> fact starttime is now set for all cases.Also, added documentation.
> 
> See the attached v2.

Thanks for the patch update!

A few random comments:

=== 1

+/* pg_stat_get_vacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_vacuum_time)
+
+/* pg_stat_get_autovacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autovacuum_time)
+
+/* pg_stat_get_analyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_analyze_time)
+
+/* pg_stat_get_autoanalyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autoanalyze_time)
+

The comments do not reflect the function names ("total" is missing to give
pg_stat_get_total_vacuum_time() and such).

=== 2

+#define PG_STAT_GET_RELENTRY_FLOAT8(stat)
+Datum                                        
+CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)
+{                                                       
+       Oid                     relid = PG_GETARG_OID(0); 
+       int64           result;
+       PgStat_StatTabEntry *tabentry;
+                                               
+       if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+               result = 0;
+       else              
+               result = (float8) (tabentry->stat);
+                                      
        +       PG_RETURN_FLOAT8(result);
+}
+
 /* pg_stat_get_analyze_count */
 PG_STAT_GET_RELENTRY_INT64(analyze_count

I think it's better to define the macro just before its first usage (meaning
just after pg_stat_get_vacuum_count()): that would be consistent with the places
the other macros are defined.

=== 3

+       int64           result; 

s/int64/double/?

=== 4

+       Total time this table has spent in manual vacuum
+      </para></entry>

Mention the unit?

=== 5

+       /*
+        * When verbose or autovacuum logging is used, initialize a resource usage
+        * snapshot and optionally track I/O timing.
+        */
        if (instrument)
        {

Out of curiosity, why this extra comment? To be somehow consistent with
do_analyze_rel()?

=== 6

@@ -343,6 +349,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
        pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
                                                                  RelationGetRelid(rel));

+       starttime = GetCurrentTimestamp();

I wonder if it wouldn't make more sense to put the GetCurrentTimestamp() call
before the pgstat_progress_start_command() call. That would be aligned with the
"endtime" being after the pgstat_progress_end_command() and where it was before
the patch.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: POC: track vacuum/analyze cumulative time per relation

От

Bertrand Drouvot

Дата:

15 января 2025 г., 16:11:26

Hi,

On Tue, Jan 14, 2025 at 05:01:52PM -0600, Sami Imseih wrote:
> Please see the attached v3.

Thanks!

A few comments:

=== 1

+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_vacuum_time</structfield> <type>bigint</type>
+      </para>

Those new fields should be documented as "double precision".

=== 2

+#define PG_STAT_GET_RELENTRY_FLOAT8(stat)                                              \
+Datum
\
+CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)                                 \
+{
       \
 
+       Oid                     relid = PG_GETARG_OID(0);                                               \
+       float8          result;                                                                                 \
+       PgStat_StatTabEntry *tabentry;                                                          \
+
       \
 
+       if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)     \
+               result = 0;
\
+       else
\
+               result = (float8) (tabentry->stat);                                             \
+
       \
 
+       PG_RETURN_FLOAT8(result);                                                                       \
+}

I did propose "double" up-thread to be consistent with the code around. That
would mean to also cast to "double". That's just for consistency purpose. What do
you think?

Appart from the above that LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: POC: track vacuum/analyze cumulative time per relation

От

Bertrand Drouvot

Дата:

17 января 2025 г., 16:42:52

Hi,

On Wed, Jan 15, 2025 at 11:09:00AM -0600, Sami Imseih wrote:
> > Appart from the above that LGTM.
> 
> thanks! I took care of your comments.

Thanks!

> I was not sure
> about the need to cast "double" but as you mention this
> is consistent with other parts of pgstatfuncs.c

Yup.

> v4 attached.

One comment:

+       PG_RETURN_FLOAT8(result);                                        \

The "\" indentation looks wrong for this line (was ok in v3).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: POC: track vacuum/analyze cumulative time per relation

От

Sami Imseih

Дата:

24 января 2025 г., 23:19:59

> I was referring to the order of the fields in the structure itself,
> but that's no big deal one way or the other.

I understand your point now. I will group them with the
related counters in the next rev and will use

> This should be one comment for the whole block, or this should use the
> singular for each comment.

I will use a singular "/* time in milliseconds */" comment for each
new field.

This existing write_time field is plural, but I will leave
that one alone for now.

PgStat_Counter write_time; /* times in milliseconds */


> On HEAD in the ANALYZE path, the endtime is calculated after
> index_vacuum_cleanup().  Your patch calculates it before
> index_vacuum_cleanup(), which would result in an incorrect calculation
> if the index cleanup takes a long time with less logs generated, no?
> Sorry for not noticing that earlier on the thread..  Perhaps it would
> be just better to pass the start time to pgstat_report_vacuum() and
> pgstat_report_analyze() then let their internals do the elapsed time
> calculations.  Consistency of the arguments for both functions is
> something worth having, IMO, even if it means a bit more
> GetCurrentTimestamp() in this case.

So currently, the report of the last_(autoanalyze|analyze)_time
is set before the index_vacuum_cleanup, but for logging purposes
the elapsed time is calculated afterwards. Most users will not notice
this, but I think that is wrong as well.

I think we should calculate endtime and elapsedtime and call
pgstat_report_analyze after the index_vacuum_cleanup; and
before vac_close_indexes. This is more accurate and will
avoid incurring the extra GetCurrentTimestamp() call.

What do you think?

Regards,

Sami

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Re: POC: track vacuum/analyze cumulative time per relation

Re: POC: track vacuum/analyze cumulative time per relation

Re: POC: track vacuum/analyze cumulative time per relation

Re: POC: track vacuum/analyze cumulative time per relation

Re: POC: track vacuum/analyze cumulative time per relation

Re: POC: track vacuum/analyze cumulative time per relation