Обсуждение: a verbose option for autovacuum

Поиск
Список
Период
Сортировка

a verbose option for autovacuum

От
Tommy Li
Дата:
Hi all

I was surprised to see that there's no way to get `VACUUM VERBOSE`-like output from autovacuum. Is there any interest in enabling this?

Additionally, is there any interest in exposing more vacuum options to be run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable. I skipped FULL in that list because I'm assuming no one would ever want autovac to run VACUUM FULL.


Tommy

Re: a verbose option for autovacuum

От
Tom Lane
Дата:
Tommy Li <tommy@coffeemeetsbagel.com> writes:
> I was surprised to see that there's no way to get `VACUUM VERBOSE`-like
> output from autovacuum. Is there any interest in enabling this?

Seems like that would very soon feel like log spam.  What would be the
use case for having this on?  If you want one-off results you could
run VACUUM manually.

> Additionally, is there any interest in exposing more vacuum options to be
> run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the
> VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable.

To the extent that any of these make sense in autovacuum, I'd say they
ought to be managed automatically.  I don't see a strong argument for
users configuring this.  (See also nearby thread about allowing index
AMs to control some of this.)

            regards, tom lane



Re: a verbose option for autovacuum

От
Tommy Li
Дата:
Hey Tom

> Seems like that would very soon feel like log spam.  What would be the
> use case for having this on?  If you want one-off results you could
> run VACUUM manually.

In my case I have a fairly large, fairly frequently updated table with a large number of indexes where autovacuum's runtime can fluctuate between 12 and 24 hours. If I want to investigate why autovacuum today is running many hours longer than it did last week, the only information I have to go off is from pg_stat_progress_vacuum, which reports only progress based on the number of blocks completed across _all_ indexes.

VACUUM VERBOSE's output is nice because it reports runtime per index, which would allow me to see if a specific index has bloated more than usual.

I also have autovacuum throttled much more aggressively than manual vacuums, so information from a one-off manual VACUUM isn't comparable.

As for log spam, I'm not sure it's a problem as long as the verbose option is disabled by default.


Tommy

On Fri, Jan 22, 2021 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Tommy Li <tommy@coffeemeetsbagel.com> writes:
> I was surprised to see that there's no way to get `VACUUM VERBOSE`-like
> output from autovacuum. Is there any interest in enabling this?

Seems like that would very soon feel like log spam.  What would be the
use case for having this on?  If you want one-off results you could
run VACUUM manually.

> Additionally, is there any interest in exposing more vacuum options to be
> run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the
> VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable.

To the extent that any of these make sense in autovacuum, I'd say they
ought to be managed automatically.  I don't see a strong argument for
users configuring this.  (See also nearby thread about allowing index
AMs to control some of this.)

                        regards, tom lane

Re: a verbose option for autovacuum

От
Stephen Frost
Дата:
Greetings,

On Fri, Jan 22, 2021 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tommy Li <tommy@coffeemeetsbagel.com> writes:
> > Additionally, is there any interest in exposing more vacuum options to be
> > run by autovac? Right now it runs FREEZE and ANALYZE, which leaves the
> > VERBOSE, SKIP_LOCKED, INDEX_CLEANUP, and TRUNCATE unconfigurable.
>
> To the extent that any of these make sense in autovacuum, I'd say they
> ought to be managed automatically.  I don't see a strong argument for
> users configuring this.  (See also nearby thread about allowing index
> AMs to control some of this.)

I agree that it'd be nice to figure out some way to have these managed
automatically, but it's probably useful to point out to Tommy that you
can set vacuum options on a table level which autovacuum should respect,
such as vacuum_index_cleanup and vacuum_truncate.  For skip locked,
autovacuum already will automatically release it's attempt to acquire a
lock if someone backs up behind it for too long.

Until we get something automatic though, I could see being able to set
TRUNCATE, in particular, to be off globally as useful when running a
system with replicas that might end up having queries which block WAL
replay.  If no one is stepping up to build some way to handle that
automatically then I'd be in support of making it something that an
administrator can configure (to avoid having to always remember to do it
for each table created...).

* Tommy Li (tommy@coffeemeetsbagel.com) wrote:
> > Seems like that would very soon feel like log spam.  What would be the
> > use case for having this on?  If you want one-off results you could
> > run VACUUM manually.
>
> In my case I have a fairly large, fairly frequently updated table with a
> large number of indexes where autovacuum's runtime can fluctuate between 12
> and 24 hours. If I want to investigate why autovacuum today is running many
> hours longer than it did last week, the only information I have to go off
> is from pg_stat_progress_vacuum, which reports only progress based on the
> number of blocks completed across _all_ indexes.
>
> VACUUM VERBOSE's output is nice because it reports runtime per index, which
> would allow me to see if a specific index has bloated more than usual.
>
> I also have autovacuum throttled much more aggressively than manual
> vacuums, so information from a one-off manual VACUUM isn't comparable.

I tend to agree that this is pretty useful information to have included
when trying to figure out what autovacuum's doing.

> As for log spam, I'm not sure it's a problem as long as the verbose option
> is disabled by default.

While this would be in-line with our existing dismal logging defaults,
I'd be happier with a whole bunch more logging enabled by default,
including this, so that we don't have to tell everyone who deploys PG to
go enable this very sensible logging.  Arguments of 'log spam' really
fall down when you're on the receiving end of practically empty PG logs
and trying to figure out what's going on.  Further, log analysis tools
exist to aggregate this data and bring it up to a higher level for
administrators.

Still, I'd much rather have the option, even if disabled by default,
than not have it at all.

Thanks,

Stephen


Вложения

Re: a verbose option for autovacuum

От
Tommy Li
Дата:
Hi Stephen

> ... can set vacuum options on a table level which autovacuum should respect,
> such as vacuum_index_cleanup and vacuum_truncate.  For skip locked,
> autovacuum already will automatically release it's attempt to acquire a
> lock if someone backs up behind it for too long.

This is good information, I wasn't aware that autovacuum respected those settings.
In that case I'd like to focus specifically on the verbose aspect. 

My first thought was a new boolean configuration called "autovacuum_verbose".
I'd want it to behave similarly to autovacuum_vacuum_cost_limit in that it can be
set globally or on a per-table basis.

Thoughts?


Tommy

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Tue, Jan 26, 2021 at 2:46 AM Tommy Li <tommy@coffeemeetsbagel.com> wrote:
>
> Hi Stephen
>
> > ... can set vacuum options on a table level which autovacuum should respect,
> > such as vacuum_index_cleanup and vacuum_truncate.  For skip locked,
> > autovacuum already will automatically release it's attempt to acquire a
> > lock if someone backs up behind it for too long.
>
> This is good information, I wasn't aware that autovacuum respected those settings.
> In that case I'd like to focus specifically on the verbose aspect.
>
> My first thought was a new boolean configuration called "autovacuum_verbose".
> I'd want it to behave similarly to autovacuum_vacuum_cost_limit in that it can be
> set globally or on a per-table basis.

I agree to have autovacuum log more information, especially index
vacuums. Currently, the log related to index vacuum is only the number
of index scans. I think it would be helpful if the log has more
details about each index vacuum.

But I'm not sure that neither always logging that nor having set the
parameter per-table basis is a good idea. In the former case, it could
be log spam for example in the case of anti-wraparound vacuums that
vacuums on all tables (and their indexes) in the database. If we set
it per-table basis, it’s useful when the user already knows which
tables are likely to take a long time for autovacuum but won’t work
when the users want to check the autovacuum details for tables that
autovacuum could take a long time for.

Given that we already have log_autovacuum_min_duration, I think this
verbose logging should work together with that. I’d prefer to enable
the verbose logging by default for the same reason Stephen mentioned.
Or maybe we can have a parameter to control verbosity, say
log_autovaucum_verbosity.

Regarding when to log, we can have autovacuum emit index vacuum log
after each lazy_vacuum/cleanup_index() end like VACUUM VERBOSE does,
but I’m not sure how it could work together with
log_autovacuum_min_duration. So one idea could be to have autovacuum
emit a log for each index vacuum statistics along with the current
autovacuum logs at the end of lazy vacuum if the execution time
exceeds log_autovacuum_min_duration. A downside would be one
autovacuum log could be very long if the table has many indexes, and
we still don’t know how much time taken for each index vacuum. But you
can see if a specific index has bloated more than usual. And for the
latter part, we would be able to add the statistics of execution time
for each vacuum phase to the log as a further improvement.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: a verbose option for autovacuum

От
Tommy Li
Дата:
Hi Masahiko

> If we set
> it per-table basis, it’s useful when the user already knows which
> tables are likely to take a long time for autovacuum

I would assume that's the default case, most apps I've seen are designed around a small
number of large tables that take up most of the maintenance effort

> Regarding when to log, we can have autovacuum emit index vacuum log
> after each lazy_vacuum/cleanup_index() end like VACUUM VERBOSE does,
> but I’m not sure how it could work together with
> log_autovacuum_min_duration.

I do like having this rolled into the existing configuration. This might be an absurd idea, but
what if the autovacuum process accumulates the per-index vacuum information until that
threshold is reached, and then spits out the logs all at once? And after the min duration is 
passed, it just logs the rest of the index vacuum information as they occur. That way the 
information is more likely to be available to investigate an abnormally long running vacuum 
while it's still happening.


Tommy

Re: a verbose option for autovacuum

От
"Euler Taveira"
Дата:
On Fri, Jan 29, 2021, at 4:35 AM, Masahiko Sawada wrote:
I agree to have autovacuum log more information, especially index
vacuums. Currently, the log related to index vacuum is only the number
of index scans. I think it would be helpful if the log has more
details about each index vacuum.
+1 for this feature. Sometimes this analysis is useful to confirm your theory;
without data, it is just a wild guess.

But I'm not sure that neither always logging that nor having set the
parameter per-table basis is a good idea. In the former case, it could
be log spam for example in the case of anti-wraparound vacuums that
vacuums on all tables (and their indexes) in the database. If we set
it per-table basis, it’s useful when the user already knows which
tables are likely to take a long time for autovacuum but won’t work
when the users want to check the autovacuum details for tables that
autovacuum could take a long time for.
I prefer a per-table parameter since it allows us a fine-grained tuning. It
covers the cases you provided above. You can disable it at all and only enable
it in critical tables or enable it and disable it for known-to-be-spam tables.

Given that we already have log_autovacuum_min_duration, I think this
verbose logging should work together with that. I’d prefer to enable
the verbose logging by default for the same reason Stephen mentioned.
Or maybe we can have a parameter to control verbosity, say
log_autovaucum_verbosity.
IMO this new parameter is just an option to inject VERBOSE into VACUUM command.
Since there is already a parameter to avoid spam autovacuum messages, this
feature shouldn't hijack log_autovacuum_min_duration behavior. If the
autovacuum command execution time runs less than l_a_m_d, the output should be
discarded.

I don't have a strong opinion about this parameter name but I think your
suggestion (log_autovaccum_verbosity) is easier to guess what this parameter is
for.


--
Euler Taveira

Re: a verbose option for autovacuum

От
Michael Paquier
Дата:
On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote:
> Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is
> just a small hunk. I reviewed this patch and it looks good to me. There is just
> a small issue (double space after 'if') that I fixed in the attached version.

No major objections to what you are proposing here.

>      /* and log the action if appropriate */
> -    if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
> +    if (IsAutoVacuumWorkerProcess())
>      {
> -        TimestampTz endtime = GetCurrentTimestamp();
> +        TimestampTz endtime = 0;
> +        int i;
>
> -        if (params->log_min_duration == 0 ||
> -            TimestampDifferenceExceeds(starttime, endtime,
> -                                       params->log_min_duration))
> +        if (params->log_min_duration >= 0)
> +            endtime = GetCurrentTimestamp();
> +
> +        if (endtime > 0 &&
> +            (params->log_min_duration == 0 ||
> +             TimestampDifferenceExceeds(starttime, endtime,

Why is there any need to actually change this part?  If I am following
the patch correctly, the reason why you are doing things this way is
to free the set of N statistics all the time for autovacuum.  However,
we could make that much simpler, and your patch is already half-way
through that by adding the stats of the N indexes to LVRelStats.  Here
is the idea:
- Allocation of the N items for IndexBulkDeleteResult at the beginning
of heap_vacuum_rel().  It seems to me that we are going to need the N
index names within LVRelStats, to be able to still call
vac_close_indexes() *before* logging the stats.
- No need to pass down indstats for the two callers of
lazy_vacuum_all_indexes().
- Clean up of vacrelstats once heap_vacuum_rel() is done with it.
--
Michael

Вложения

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Thu, Mar 18, 2021 at 3:41 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote:
> > Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is
> > just a small hunk. I reviewed this patch and it looks good to me. There is just
> > a small issue (double space after 'if') that I fixed in the attached version.
>
> No major objections to what you are proposing here.
>
> >       /* and log the action if appropriate */
> > -     if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
> > +     if (IsAutoVacuumWorkerProcess())
> >       {
> > -             TimestampTz endtime = GetCurrentTimestamp();
> > +             TimestampTz endtime = 0;
> > +             int i;
> >
> > -             if (params->log_min_duration == 0 ||
> > -                     TimestampDifferenceExceeds(starttime, endtime,
> > -                                                                        params->log_min_duration))
> > +             if (params->log_min_duration >= 0)
> > +                     endtime = GetCurrentTimestamp();
> > +
> > +             if (endtime > 0 &&
> > +                     (params->log_min_duration == 0 ||
> > +                      TimestampDifferenceExceeds(starttime, endtime,
>
> Why is there any need to actually change this part?  If I am following
> the patch correctly, the reason why you are doing things this way is
> to free the set of N statistics all the time for autovacuum.  However,
> we could make that much simpler, and your patch is already half-way
> through that by adding the stats of the N indexes to LVRelStats.  Here
> is the idea:
> - Allocation of the N items for IndexBulkDeleteResult at the beginning
> of heap_vacuum_rel().  It seems to me that we are going to need the N
> index names within LVRelStats, to be able to still call
> vac_close_indexes() *before* logging the stats.
> - No need to pass down indstats for the two callers of
> lazy_vacuum_all_indexes().
> - Clean up of vacrelstats once heap_vacuum_rel() is done with it.

Okay, I've updated the patch accordingly. If we add
IndexBulkDeleteResult to LVRelStats I think we can remove
IndexBulkDeleteResult function argument also from some other functions
such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader().
Please review the attached patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Thu, Mar 18, 2021 at 9:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Mar 18, 2021 at 3:41 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Wed, Mar 17, 2021 at 08:50:26AM -0300, Euler Taveira wrote:
> > > Since commit 5f8727f5a6, this patch doesn't apply anymore. Fortunately, it is
> > > just a small hunk. I reviewed this patch and it looks good to me. There is just
> > > a small issue (double space after 'if') that I fixed in the attached version.
> >
> > No major objections to what you are proposing here.
> >
> > >       /* and log the action if appropriate */
> > > -     if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
> > > +     if (IsAutoVacuumWorkerProcess())
> > >       {
> > > -             TimestampTz endtime = GetCurrentTimestamp();
> > > +             TimestampTz endtime = 0;
> > > +             int i;
> > >
> > > -             if (params->log_min_duration == 0 ||
> > > -                     TimestampDifferenceExceeds(starttime, endtime,
> > > -                                                                        params->log_min_duration))
> > > +             if (params->log_min_duration >= 0)
> > > +                     endtime = GetCurrentTimestamp();
> > > +
> > > +             if (endtime > 0 &&
> > > +                     (params->log_min_duration == 0 ||
> > > +                      TimestampDifferenceExceeds(starttime, endtime,
> >
> > Why is there any need to actually change this part?  If I am following
> > the patch correctly, the reason why you are doing things this way is
> > to free the set of N statistics all the time for autovacuum.  However,
> > we could make that much simpler, and your patch is already half-way
> > through that by adding the stats of the N indexes to LVRelStats.  Here
> > is the idea:
> > - Allocation of the N items for IndexBulkDeleteResult at the beginning
> > of heap_vacuum_rel().  It seems to me that we are going to need the N
> > index names within LVRelStats, to be able to still call
> > vac_close_indexes() *before* logging the stats.
> > - No need to pass down indstats for the two callers of
> > lazy_vacuum_all_indexes().
> > - Clean up of vacrelstats once heap_vacuum_rel() is done with it.
>
> Okay, I've updated the patch accordingly. If we add
> IndexBulkDeleteResult to LVRelStats I think we can remove
> IndexBulkDeleteResult function argument also from some other functions
> such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader().
> Please review the attached patch.

Sorry, I attached the wrong version patch. So attached the right one.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: a verbose option for autovacuum

От
Peter Geoghegan
Дата:
On Thu, Mar 18, 2021 at 5:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Okay, I've updated the patch accordingly. If we add
> IndexBulkDeleteResult to LVRelStats I think we can remove
> IndexBulkDeleteResult function argument also from some other functions
> such as lazy_parallel_vacuum_indexes() and vacuum_indexes_leader().
> Please review the attached patch.

That seems much clearer.

Were you going to take care of this, Michael?

-- 
Peter Geoghegan



Re: a verbose option for autovacuum

От
Michael Paquier
Дата:
On Thu, Mar 18, 2021 at 09:38:05AM -0700, Peter Geoghegan wrote:
> Were you going to take care of this, Michael?

Yes, I was waiting for Sawada-san to send an updated version, which he
just did.
--
Michael

Вложения

Re: a verbose option for autovacuum

От
Peter Geoghegan
Дата:
On Thu, Mar 18, 2021 at 2:08 PM Michael Paquier <michael@paquier.xyz> wrote:
> Yes, I was waiting for Sawada-san to send an updated version, which he
> just did.

Great. This really seems worth having. I was hoping that somebody else
could pick this one up.

-- 
Peter Geoghegan



Re: a verbose option for autovacuum

От
Michael Paquier
Дата:
On Thu, Mar 18, 2021 at 11:30:46PM +0900, Masahiko Sawada wrote:
> Sorry, I attached the wrong version patch. So attached the right one.

Thanks.  I have been hacking aon that, and I think that we could do
more in terms of integration of the index stats into LVRelStats to
help with debugging issues, mainly, but also to open the door at
allowing autovacuum to use the parallel path in the future.  Hence,
for consistency, I think that we should change the following things in
LVRelStats:
- Add the number of indexes.  It looks rather unusual to not track
down the number of indexes directly in the structure anyway, as the
stats array gets added there.
- Add all the index names, for parallel and non-parallel mode.
- Replace the index name in the error callback by an index number,
pointing back to its location in indstats and indnames.

As lazy_vacuum_index() requires the index number to be set internally
to it, this means that we need to pass it down across
vacuum_indexes_leader(), lazy_parallel_vacuum_indexes(), but that
seems like an acceptable compromise to me for now.  I think that it
would be good to tighten a bit more the relationship between the index
stats in the DSM for the parallel case and the ones in local memory,
but what we have here looks enough to me so we could figure out that
as a future step.

Sawada-san, what do you think?  Attached is the patch I have finished
with.
--
Michael

Вложения

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Fri, Mar 19, 2021 at 3:14 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Mar 18, 2021 at 11:30:46PM +0900, Masahiko Sawada wrote:
> > Sorry, I attached the wrong version patch. So attached the right one.
>
> Thanks.  I have been hacking aon that, and I think that we could do
> more in terms of integration of the index stats into LVRelStats to
> help with debugging issues, mainly, but also to open the door at
> allowing autovacuum to use the parallel path in the future.

Thank you for updating the patch!

> Hence,
> for consistency, I think that we should change the following things in
> LVRelStats:
> - Add the number of indexes.  It looks rather unusual to not track
> down the number of indexes directly in the structure anyway, as the
> stats array gets added there.
> - Add all the index names, for parallel and non-parallel mode.

Agreed with those two changes.

> - Replace the index name in the error callback by an index number,
> pointing back to its location in indstats and indnames.

I like this idea but I'm not sure the approach that the patch took
improved the code. Please read the below my concern.

>
> As lazy_vacuum_index() requires the index number to be set internally
> to it, this means that we need to pass it down across
> vacuum_indexes_leader(), lazy_parallel_vacuum_indexes(), but that
> seems like an acceptable compromise to me for now.  I think that it
> would be good to tighten a bit more the relationship between the index
> stats in the DSM for the parallel case and the ones in local memory,
> but what we have here looks enough to me so we could figure out that
> as a future step.
>
> Sawada-san, what do you think?  Attached is the patch I have finished
> with.

With this idea, vacuum_one_index() will become:

static void
lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,
                  LVDeadTuples *dead_tuples, double reltuples,
                  LVRelStats *vacrelstats, int indnum)

and the caller calls this function as follow:

        for (idx = 0; idx < nindexes; idx++)
            lazy_vacuum_index(Irel[idx], &(vacrelstats->indstats[idx]),
                              vacrelstats->dead_tuples,
                              vacrelstats->old_live_tuples, vacrelstats,
                              idx);

It's not bad but it seems redundant a bit to me. We pass the idx in
spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I
think your first idea that is done in v4 patch (saving index names at
the beginning of heap_vacuum_rel() for autovacuum logging purpose
only) and the idea of deferring to close indexes until the end of
heap_vacuum_rel() so that we can refer index name at autovacuum
logging are more simple.

BTW the patch led to a crash in my environment. The problem is here:

 static void
-vacuum_one_index(Relation indrel, IndexBulkDeleteResult **stats,
+vacuum_one_index(Relation indrel,
                 LVShared *lvshared, LVSharedIndStats *shared_indstats,
-                LVDeadTuples *dead_tuples, LVRelStats *vacrelstats)
+                LVDeadTuples *dead_tuples, LVRelStats *vacrelstats,
+                int indnum)
 {
    IndexBulkDeleteResult *bulkdelete_res = NULL;
+   IndexBulkDeleteResult *stats;

We need to initialize *stats with NULL here.

And while looking at the change of vacuum_one_index() I found another problem:

@@ -2349,17 +2400,20 @@ vacuum_one_index(Relation indrel,
IndexBulkDeleteResult **stats,
                 * Update the pointer to the corresponding
bulk-deletion result if
                 * someone has already updated it.
                 */
-               if (shared_indstats->updated && *stats == NULL)
-                       *stats = bulkdelete_res;
+               if (shared_indstats->updated)
+                       stats = bulkdelete_res;
        }
+       else
+               stats = vacrelstats->indstats[indnum];

        /* Do vacuum or cleanup of the index */
        if (lvshared->for_cleanup)
-               lazy_cleanup_index(indrel, stats, lvshared->reltuples,
-
lvshared->estimated_count, vacrelstats);
+               lazy_cleanup_index(indrel, &stats, lvshared->reltuples,
+
lvshared->estimated_count, vacrelstats,
+                                                  indnum);
        else
-               lazy_vacuum_index(indrel, stats, dead_tuples,
-                                                 lvshared->reltuples,
vacelstats);
+               lazy_vacuum_index(indrel, &stats, dead_tuples,
+                                                 lvshared->reltuples,
vacrelstats, indnum);

        /*
         * Copy the index bulk-deletion result returned from ambulkdelete and

If shared_indstats is NULL (e.g., we do " stats =
vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not
updated since we pass &stats. I think we should pass
&(vacrelstats->indstats[indnum]) instead in this case.

Previously, we update the element of the pointer array of index
statistics to the pointer pointing to either the local memory or DSM.
But with the above change, we do that only when the index statistics
are in the local memory. In other words, vacrelstats->indstats[i] is
never updated if the corresponding index supports parallel indexes. I
think this is not relevant with the change that we'd like to do here
(i.e., passing indnum down).


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



Re: a verbose option for autovacuum

От
Michael Paquier
Дата:
On Sat, Mar 20, 2021 at 01:06:51PM +0900, Masahiko Sawada wrote:
> It's not bad but it seems redundant a bit to me. We pass the idx in
> spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I
> think your first idea that is done in v4 patch (saving index names at
> the beginning of heap_vacuum_rel() for autovacuum logging purpose
> only) and the idea of deferring to close indexes until the end of
> heap_vacuum_rel() so that we can refer index name at autovacuum
> logging are more simple.

Okay.

> We need to initialize *stats with NULL here.

Right.  I am wondering why I did not get any complain here.

> If shared_indstats is NULL (e.g., we do " stats =
> vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not
> updated since we pass &stats. I think we should pass
> &(vacrelstats->indstats[indnum]) instead in this case.

If we get rid completely of this idea around indnum, that I don't
disagree with so let's keep just indname, you mean to keep the second
argument IndexBulkDeleteResult of vacuum_one_index() and pass down
&(vacrelstats->indstats[indnum]) as argument.  No objections from me
to just do that.

> Previously, we update the element of the pointer array of index
> statistics to the pointer pointing to either the local memory or DSM.
> But with the above change, we do that only when the index statistics
> are in the local memory. In other words, vacrelstats->indstats[i] is
> never updated if the corresponding index supports parallel indexes. I
> think this is not relevant with the change that we'd like to do here
> (i.e., passing indnum down).

Yeah, that looks like just some over-engineering design on my side.
Would you like to update the patch with what you think is most
adapted?
--
Michael

Вложения

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Sat, Mar 20, 2021 at 3:40 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Mar 20, 2021 at 01:06:51PM +0900, Masahiko Sawada wrote:
> > It's not bad but it seems redundant a bit to me. We pass the idx in
> > spite of passing also Irel[idx] and &(vacrelstats->indstats[idx]). I
> > think your first idea that is done in v4 patch (saving index names at
> > the beginning of heap_vacuum_rel() for autovacuum logging purpose
> > only) and the idea of deferring to close indexes until the end of
> > heap_vacuum_rel() so that we can refer index name at autovacuum
> > logging are more simple.
>
> Okay.
>
> > We need to initialize *stats with NULL here.
>
> Right.  I am wondering why I did not get any complain here.
>
> > If shared_indstats is NULL (e.g., we do " stats =
> > vacrelstats->indstats[indnum];"), vacrelstats->indstats[indnum] is not
> > updated since we pass &stats. I think we should pass
> > &(vacrelstats->indstats[indnum]) instead in this case.
>
> If we get rid completely of this idea around indnum, that I don't
> disagree with so let's keep just indname, you mean to keep the second
> argument IndexBulkDeleteResult of vacuum_one_index() and pass down
> &(vacrelstats->indstats[indnum]) as argument.  No objections from me
> to just do that.
>
> > Previously, we update the element of the pointer array of index
> > statistics to the pointer pointing to either the local memory or DSM.
> > But with the above change, we do that only when the index statistics
> > are in the local memory. In other words, vacrelstats->indstats[i] is
> > never updated if the corresponding index supports parallel indexes. I
> > think this is not relevant with the change that we'd like to do here
> > (i.e., passing indnum down).
>
> Yeah, that looks like just some over-engineering design on my side.
> Would you like to update the patch with what you think is most
> adapted?

I've updated the patch. I saved the index names at the beginning of
heap_vacuum_rel() for autovacuum logging, and add indstats and
nindexes to LVRelStats. Some functions still have nindexes as a
function argument but it seems to make sense since it corresponds the
list of index relations (*Irel). Please review the patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: a verbose option for autovacuum

От
Michael Paquier
Дата:
On Mon, Mar 22, 2021 at 12:17:37PM +0900, Masahiko Sawada wrote:
> I've updated the patch. I saved the index names at the beginning of
> heap_vacuum_rel() for autovacuum logging, and add indstats and
> nindexes to LVRelStats. Some functions still have nindexes as a
> function argument but it seems to make sense since it corresponds the
> list of index relations (*Irel). Please review the patch.

Going back to that, the structure of the static APIs in this file make
the whole logic a bit hard to follow, but the whole set of changes you
have done here makes sense.  It took me a moment to recall and
understand why it is safe to free *stats at the end of
vacuum_one_index() and if the index stats array actually pointed to
the DSM segment correctly within the shared stats.

I think that there is more consolidation possible within LVRelStats,
but let's leave that for another day, if there is any need for such a
move.

To keep it short.  Sold.
--
Michael

Вложения

Re: a verbose option for autovacuum

От
Masahiko Sawada
Дата:
On Tue, Mar 23, 2021 at 1:31 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Mar 22, 2021 at 12:17:37PM +0900, Masahiko Sawada wrote:
> > I've updated the patch. I saved the index names at the beginning of
> > heap_vacuum_rel() for autovacuum logging, and add indstats and
> > nindexes to LVRelStats. Some functions still have nindexes as a
> > function argument but it seems to make sense since it corresponds the
> > list of index relations (*Irel). Please review the patch.
>
> Going back to that, the structure of the static APIs in this file make
> the whole logic a bit hard to follow, but the whole set of changes you
> have done here makes sense.  It took me a moment to recall and
> understand why it is safe to free *stats at the end of
> vacuum_one_index() and if the index stats array actually pointed to
> the DSM segment correctly within the shared stats.
>
> I think that there is more consolidation possible within LVRelStats,
> but let's leave that for another day, if there is any need for such a
> move.

While studying your patch (v5-index_stat_log.patch) I found we can
polish the parallel vacuum code in some places. I'll try it another
day.

>
> To keep it short.  Sold.

Thank you!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/