Re: Add index scan progress to pg_stat_progress_vacuum

From: Masahiko Sawada
Subject: Re: Add index scan progress to pg_stat_progress_vacuum
Date:
Msg-id: CAD21AoBduTv=AQS_V0or50Fdbz7NjS2o4EWnMCaXTJ9yYJr7ew@mail.gmail.com
In reply to: Re: Add index scan progress to pg_stat_progress_vacuum  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Add index scan progress to pg_stat_progress_vacuum  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On Thu, Apr 7, 2022 at 10:20 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Apr 6, 2022 at 5:22 PM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> > >    At the beginning of a parallel operation, we allocate a chunk of
> > >    dynamic shared memory which persists even after some or all workers
> > >    have exited. It's only torn down at the end of the parallel operation.
> > >    That seems like the appropriate place to be storing any kind of data
> > >    that needs to be propagated between parallel workers. The current
> > >    patch uses the main shared memory segment, which seems unacceptable to
> > >    me.
> >
> > Correct, DSM does track shared data. However only participating
> > processes in the parallel vacuum can attach and lookup this data.
> >
> > The purpose of the main shared memory is to allow a process that
> > is querying the progress views to retrieve the information.
>
> Sure, but I think that you should likely be doing what Andres
> recommended before:
>
> # Why isn't the obvious thing to do here to provide a way to associate workers
> # with their leaders in shared memory, but to use the existing progress fields
> # to report progress? Then, when querying progress, the leader and workers
> # progress fields can be combined to show the overall progress?
>
> That is, I am imagining that you would want to use DSM to propagate
> data from workers back to the leader and then have the leader report
> the data using the existing progress-reporting facilities. Now, if we
> really need a whole row from each worker that doesn't work, but why do
> we need that?

+1

I proposed the same idea earlier [1]. The leader can tell how many
indexes have been processed so far by checking PVIndStats.status, which
is allocated in DSM for each index. We can have the leader check it and
update the progress information before and after vacuuming each index.
If we want to update the progress information in a more timely manner,
we could pass a callback function to ambulkdelete and amvacuumcleanup
so that the leader can do so periodically, e.g., every 1000 blocks,
while vacuuming an index.
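
To make the idea concrete, here is a minimal standalone sketch in plain C. It is not PostgreSQL code: IdxStatus, IdxStats, LeaderProgress, ProgressCallback and all function names are hypothetical stand-ins. IdxStats stands in for the per-index status kept in DSM, and scan_index_blocks stands in for an index AM routine that would accept a callback. It only illustrates the leader counting completed indexes and being called back every N blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-index status stored in DSM. */
typedef enum
{
    IDX_STATUS_PENDING,
    IDX_STATUS_COMPLETED
} IdxStatus;

typedef struct
{
    IdxStatus status;
} IdxStats;

/* Leader side: count the indexes whose status says "completed". */
static int
count_completed_indexes(const IdxStats *stats, int nindexes)
{
    int done = 0;

    for (int i = 0; i < nindexes; i++)
    {
        if (stats[i].status == IDX_STATUS_COMPLETED)
            done++;
    }
    return done;
}

/* Hypothetical callback type an index AM routine could be handed. */
typedef void (*ProgressCallback) (void *arg);

/* State the leader passes through the callback (assumed layout). */
typedef struct
{
    const IdxStats *stats;      /* per-index status array */
    int         nindexes;       /* number of entries in stats */
    int         reported;       /* last "indexes completed" value published */
    int         ncalls;         /* how many times the callback has run */
} LeaderProgress;

/* Callback body: recompute the count and remember it as the reported value. */
static void
leader_report_progress(void *arg)
{
    LeaderProgress *p = arg;

    p->reported = count_completed_indexes(p->stats, p->nindexes);
    p->ncalls++;
}

/* Simulated index scan: invoke the callback once every `interval` blocks. */
static void
scan_index_blocks(long nblocks, long interval, ProgressCallback cb, void *arg)
{
    assert(interval > 0);

    for (long blkno = 0; blkno < nblocks; blkno++)
    {
        /* per-block index work would happen here */
        if (cb != NULL && (blkno + 1) % interval == 0)
            cb(arg);
    }
}
```

In this sketch, the leader would fill a LeaderProgress with the status array and pass leader_report_progress plus that struct to scan_index_blocks with an interval of 1000. In a real patch the callback would presumably publish the count through the existing progress-reporting facility rather than store it in a local struct.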

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBW6SMJ96CNoMeu%2Bf_BR4jmatPcfVA016FdD2hkLDsaTA%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


