Discussion: POC: Parallel processing of indexes in autovacuum
Hi!
The VACUUM command can be executed with the PARALLEL option. As the documentation states, it will perform the index vacuum and index cleanup phases of VACUUM in parallel, using the specified number of background workers. But this useful feature is not used by autovacuum. After a quick look at the source code, it became clear to me that when the parallel option was added, the corresponding support for autovacuum wasn't implemented, although there are no clear obstacles to doing so.

Actually, one of our customers ran into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are the ultimate evil and the problem can be solved by running VACUUM from a cron task, but, I think, we can do better.

Anyhow, what about adding a parallel option for autovacuum? Here is a POC patch for the proposed functionality. For the sake of simplicity, several GUCs have been added. It would be good to think through the parallel launch condition without them.
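For reference, this is the manual form that the POC mirrors (the table name is only an illustration):

-- Index vacuum and index cleanup phases may use up to 4 background workers,
-- further limited by max_parallel_maintenance_workers and by the number of
-- indexes that support parallel vacuuming.
VACUUM (PARALLEL 4, VERBOSE) big_table;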
As always, any thoughts and opinions are very welcome!
--
Best regards,
Maxim Orlov.
Attachments
Hi Maxim Orlov,

Thank you for working on this. I like your idea, but I have a suggestion: autovacuum_max_workers does not require a restart to change, so I think these GUCs could behave like autovacuum_max_workers instead of being marked like this:
+#max_parallel_index_autovac_workers = 0 # this feature disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)
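One way to check whether an existing setting can be changed without a restart is to look at its context in pg_settings ('postmaster' means it only takes effect at server start):

SELECT name, context
FROM pg_settings
WHERE name IN ('autovacuum_max_workers', 'max_worker_processes');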
Thanks
On Wed, Apr 16, 2025 at 7:05 PM Maxim Orlov <orlovmg@gmail.com> wrote:
> Hi!
>
> The VACUUM command can be executed with the parallel option. As documentation states, it will perform index vacuum and index cleanup phases of VACUUM in parallel using integer background workers. But such an interesting feature is not used for an autovacuum. After a quick look at the source codes, it became clear to me that when the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no clear obstacles to this.
>
> Actually, one of our customers step into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are an ultimate evil and the problem can be solved by calling running vacuum and a cron task, but, I think, we can do better.
>
> Anyhow, what about adding parallel option for an autovacuum? Here is a POC patch for proposed functionality. For the sake of simplicity's, several GUC's have been added. It would be good to think through the parallel launch condition without them.
>
> As always, any thoughts and opinions are very welcome!
>
> --
> Best regards,
> Maxim Orlov.
Hi,

On Wed, Apr 16, 2025 at 4:05 AM Maxim Orlov <orlovmg@gmail.com> wrote:
>
> Hi!
>
> The VACUUM command can be executed with the parallel option. As documentation states, it will perform index vacuum and index cleanup phases of VACUUM in parallel using integer background workers. But such an interesting feature is not used for an autovacuum. After a quick look at the source codes, it became clear to me that when the parallel option was added, the corresponding option for autovacuum wasn't implemented, although there are no clear obstacles to this.
>
> Actually, one of our customers step into a problem with autovacuum on a table with many indexes and relatively long transactions. Of course, long transactions are an ultimate evil and the problem can be solved by calling running vacuum and a cron task, but, I think, we can do better.
>
> Anyhow, what about adding parallel option for an autovacuum? Here is a POC patch for proposed functionality. For the sake of simplicity's, several GUC's have been added. It would be good to think through the parallel launch condition without them.
>
> As always, any thoughts and opinions are very welcome!

As I understand it, we initially disabled parallel vacuum for autovacuum because their objectives are somewhat contradictory. Parallel vacuum aims to accelerate the process by utilizing additional resources, while autovacuum is designed to perform cleaning operations with minimal impact on foreground transaction processing (e.g., through vacuum delay).

Nevertheless, I see your point about the potential benefits of using parallel vacuum within autovacuum in specific scenarios. The crucial consideration is determining appropriate criteria for triggering parallel vacuum in autovacuum. Given that we currently support only parallel index processing, suitable candidates might be autovacuum operations on large tables that have a substantial number of sufficiently large indexes and a high volume of garbage tuples. Once we have parallel heap vacuum, as discussed in thread [1], it would also likely be beneficial to incorporate it into autovacuum during aggressive vacuum or failsafe mode.

Although the actual number of parallel workers ultimately depends on the number of eligible indexes, it might be beneficial to introduce a storage parameter, say parallel_vacuum_workers, that allows control over the number of parallel vacuum workers on a per-table basis.

Regarding implementation: I notice the WIP patch implements its own parallel vacuum mechanism for autovacuum. Have you considered simply setting at_params.nworkers to a value greater than zero?

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
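The per-table knob suggested above could look something like this (parallel_vacuum_workers is only a proposed name here, not an existing storage parameter, and the table name is an example):

-- Hypothetical storage parameter controlling parallel vacuum workers per table:
ALTER TABLE big_table SET (parallel_vacuum_workers = 4);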
Thanks for raising this idea!

I am generally -1 on the idea of autovacuum performing parallel index vacuum, because I always felt that the parallel option should be employed in a targeted manner for a specific table. If you have a bunch of large tables, some more important than others, a/v may end up using parallel resources on the least important tables and you will have to adjust a/v settings per table, etc., to get the right table to be parallel index vacuumed by a/v.

Also, with the TIDStore improvements for index cleanup, and the practical elimination of multi-pass index vacuums, I see this being even less convincing as something to add to a/v.

Now, if I am going to allocate extra workers to run vacuum in parallel, why not just provide more autovacuum workers instead so I can get more tables vacuumed within a span of time?

> Once we have parallel heap vacuum, as discussed in thread[1], it would
> also likely be beneficial to incorporate it into autovacuum during
> aggressive vacuum or failsafe mode.

IIRC, index cleanup is disabled by failsafe.

--
Sami Imseih
Amazon Web Services (AWS)
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> As I understand it, we initially disabled parallel vacuum for autovacuum because their objectives are somewhat contradictory. Parallel vacuum aims to accelerate the process by utilizing additional resources, while autovacuum is designed to perform cleaning operations with minimal impact on foreground transaction processing (e.g., through vacuum delay).
>

Yep, we also decided that we must not create more a/v workers for index processing. In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).

> Nevertheless, I see your point about the potential benefits of using parallel vacuum within autovacuum in specific scenarios. The crucial consideration is determining appropriate criteria for triggering parallel vacuum in autovacuum. Given that we currently support only parallel index processing, suitable candidates might be autovacuum operations on large tables that have a substantial number of sufficiently large indexes and a high volume of garbage tuples.
>
> Although the actual number of parallel workers ultimately depends on the number of eligible indexes, it might be beneficial to introduce a storage parameter, say parallel_vacuum_workers, that allows control over the number of parallel vacuum workers on a per-table basis.
>

For now, we have three GUC variables for this purpose: max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows, autovac_idx_parallel_min_indexes. That is, everything is as you said. But we are still conducting research on this issue. I would like to get rid of some of these parameters.

> Regarding implementation: I notice the WIP patch implements its own parallel vacuum mechanism for autovacuum. Have you considered simply setting at_params.nworkers to a value greater than zero?
>

About `at_params.nworkers = N` - that's exactly what we're doing (you can see it in the `vacuum_rel` function). But we cannot fully reuse code of VACUUM PARALLEL, because it creates its own processes via dynamic bgworkers machinery. As I said above - we don't want to consume additional resources. Also we don't want to complicate communication between processes (the idea is that a/v workers can only send signals to the a/v launcher). As a result, we created our own implementation of parallel index processing control - see changes in vacuumparallel.c and autovacuum.c.

--
Best regards,
Daniil Davydov
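To make the knobs above concrete, here is how the three POC GUCs might be configured (the names come from the patch and are not in core, the values are only illustrative, and in the current POC they take effect only after a restart):

ALTER SYSTEM SET max_parallel_index_autovac_workers = 4;   -- per-table cap on parallel index workers
ALTER SYSTEM SET autovac_idx_parallel_min_rows = 1000000;  -- minimum dead rows before going parallel
ALTER SYSTEM SET autovac_idx_parallel_min_indexes = 8;     -- minimum number of indexes before going parallel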
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
>
> I am generally -1 on the idea of autovacuum performing parallel index vacuum, because I always felt that the parallel option should be employed in a targeted manner for a specific table. if you have a bunch of large tables, some more important than others, a/v may end up using parallel resources on the least important tables and you will have to adjust a/v settings per table, etc to get the right table to be parallel index vacuumed by a/v.

Hm, this is a good point. I think I should clarify one moment - in practice, there is a common situation when users have one huge table among all databases (with 80+ indexes created on it). But, of course, in general there may be few such tables.
But we can still adjust the autovac_idx_parallel_min_rows parameter. If a table has a lot of dead tuples => it is actively used => table is important (?).
Also, if the user can really determine the "importance" of each of the tables - we can provide an appropriate table option. Tables with this option set will be processed in parallel in priority order. What do you think about such an idea?

> Also, with the TIDStore improvements for index cleanup, and the practical elimination of multi-pass index vacuums, I see this being even less convincing as something to add to a/v.

If I understood correctly, then we are talking about the fact that TIDStore can store so many tuples that in fact a second pass is never needed.
But the number of passes does not affect the presented optimization in any way. We must think about a large number of indexes that must be processed. Even within a single pass we can have a 40% increase in speed.

> Now, If I am going to allocate extra workers to run vacuum in parallel, why not just provide more autovacuum workers instead so I can get more tables vacuumed within a span of time?

For now, only one process can clean up indexes, so I don't see how increasing the number of a/v workers will help in the situation that I mentioned above.
Also, we don't consume additional resources during autovacuum in this patch - total number of a/v workers always <= autovacuum_max_workers.

BTW, see v2 patch, attached to this letter (bug fixes) :-)

--
Best regards,
Daniil Davydov
Attachments
> On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
> >
> > I am generally -1 on the idea of autovacuum performing parallel index vacuum, because I always felt that the parallel option should be employed in a targeted manner for a specific table. if you have a bunch of large tables, some more important than others, a/v may end up using parallel resources on the least important tables and you will have to adjust a/v settings per table, etc to get the right table to be parallel index vacuumed by a/v.
>
> Hm, this is a good point. I think I should clarify one moment - in practice, there is a common situation when users have one huge table among all databases (with 80+ indexes created on it). But, of course, in general there may be few such tables.
> But we can still adjust the autovac_idx_parallel_min_rows parameter. If a table has a lot of dead tuples => it is actively used => table is important (?).
> Also, if the user can really determine the "importance" of each of the tables - we can provide an appropriate table option. Tables with this option set will be processed in parallel in priority order. What do you think about such an idea?

I think in most cases, the user will want to determine the priority of a table getting parallel vacuum cycles rather than having the autovacuum determine the priority. I also see users wanting to stagger vacuums of large tables with many indexes through some time period, and give the tables the full amount of parallel workers they can afford at these specific periods of time. A/V currently does not really allow for this type of scheduling, and if we give some kind of GUC to prioritize tables, I think users will constantly have to be modifying this priority. I am basing my comments on the scenarios I have seen on the field, and others may have a different opinion.

> > Also, with the TIDStore improvements for index cleanup, and the practical elimination of multi-pass index vacuums, I see this being even less convincing as something to add to a/v.
>
> If I understood correctly, then we are talking about the fact that TIDStore can store so many tuples that in fact a second pass is never needed.
> But the number of passes does not affect the presented optimization in any way. We must think about a large number of indexes that must be processed. Even within a single pass we can have a 40% increase in speed.

I am not discounting that a single table vacuum with many indexes will maybe perform better with parallel index scan, I am merely saying that the TIDStore optimization now makes index vacuums better and perhaps there is less of an incentive to use parallel.

> > > Now, If I am going to allocate extra workers to run vacuum in parallel, why not just provide more autovacuum workers instead so I can get more tables vacuumed within a span of time?
> >
> > For now, only one process can clean up indexes, so I don't see how increasing the number of a/v workers will help in the situation that I mentioned above.
> > Also, we don't consume additional resources during autovacuum in this patch - total number of a/v workers always <= autovacuum_max_workers.

Increasing a/v workers will not help speed up a specific table, what I am suggesting is that instead of speeding up one table, let's just allow other tables to not be starved of a/v cycles due to lack of a/v workers.

--
Sami
On Fri, May 2, 2025 at 9:58 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> > Once we have parallel heap vacuum, as discussed in thread[1], it would also likely be beneficial to incorporate it into autovacuum during aggressive vacuum or failsafe mode.
>
> IIRC, index cleanup is disabled by failsafe.

Yes. My idea is to use parallel *heap* vacuum in autovacuum during failsafe mode. I think it would make sense as users want to complete freezing tables as soon as possible in this situation.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, May 2, 2025 at 11:13 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > As I understand it, we initially disabled parallel vacuum for autovacuum because their objectives are somewhat contradictory. Parallel vacuum aims to accelerate the process by utilizing additional resources, while autovacuum is designed to perform cleaning operations with minimal impact on foreground transaction processing (e.g., through vacuum delay).
> >
> Yep, we also decided that we must not create more a/v workers for index processing.
> In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).

I have concerns about this design. When autovacuuming on a single table consumes all available autovacuum_max_workers slots with parallel vacuum workers, the system becomes incapable of processing other tables. This means that when determining the appropriate autovacuum_max_workers value, users must consider not only the number of tables to be processed concurrently but also the potential number of parallel workers that might be launched. I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.

> > Regarding implementation: I notice the WIP patch implements its own parallel vacuum mechanism for autovacuum. Have you considered simply setting at_params.nworkers to a value greater than zero?
> >
> About `at_params.nworkers = N` - that's exactly what we're doing (you can see it in the `vacuum_rel` function). But we cannot fully reuse code of VACUUM PARALLEL, because it creates its own processes via dynamic bgworkers machinery.
> As I said above - we don't want to consume additional resources. Also we don't want to complicate communication between processes (the idea is that a/v workers can only send signals to the a/v launcher).

Could you elaborate on the reasons why you don't want to use background workers and avoid complicated communication between processes? I'm not sure whether these concerns provide sufficient justification for implementing its own parallel index processing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
> I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.

+1, and would it make sense for parallel workers to come from max_parallel_maintenance_workers? This is capped by max_parallel_workers and max_worker_processes, so increasing the defaults for all 3 will be needed as well.

--
Sami
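For context, the existing knobs referred to above relate roughly like this (the values are only illustrative):

-- Maintenance parallel workers are drawn from the shared worker pools:
--   max_parallel_maintenance_workers <= max_parallel_workers <= max_worker_processes
ALTER SYSTEM SET max_worker_processes = 16;              -- change requires restart
ALTER SYSTEM SET max_parallel_workers = 12;
ALTER SYSTEM SET max_parallel_maintenance_workers = 4;
SELECT pg_reload_conf();                                 -- the last two do not need a restart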
On Sat, May 3, 2025 at 3:17 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> I think in most cases, the user will want to determine the priority of a table getting parallel vacuum cycles rather than having the autovacuum determine the priority. I also see users wanting to stagger vacuums of large tables with many indexes through some time period, and give the tables the full amount of parallel workers they can afford at these specific periods of time. A/V currently does not really allow for this type of scheduling, and if we give some kind of GUC to prioritize tables, I think users will constantly have to be modifying this priority.

If the user wants to determine priority himself, we anyway need to introduce some parameter (GUC or table option) that will give us a hint how we should schedule a/v work. Do you think that we should think about a more comprehensive behavior for such a parameter (so that the user doesn't have to change it often)? I will be glad to know your thoughts.

> > If I understood correctly, then we are talking about the fact that TIDStore can store so many tuples that in fact a second pass is never needed.
> > But the number of passes does not affect the presented optimization in any way. We must think about a large number of indexes that must be processed. Even within a single pass we can have a 40% increase in speed.
>
> I am not discounting that a single table vacuum with many indexes will maybe perform better with parallel index scan, I am merely saying that the TIDStore optimization now makes index vacuums better and perhaps there is less of an incentive to use parallel.

I still insist that this does not affect the parallel index vacuum, because we don't get an advantage in repeated passes. We get the same speed increase whether we have this optimization or not. Although it's even possible that the opposite is true - the situation will be better with the new TIDStore, but I can't say for sure.

> > > Now, If I am going to allocate extra workers to run vacuum in parallel, why not just provide more autovacuum workers instead so I can get more tables vacuumed within a span of time?
> >
> > For now, only one process can clean up indexes, so I don't see how increasing the number of a/v workers will help in the situation that I mentioned above.
> > Also, we don't consume additional resources during autovacuum in this patch - total number of a/v workers always <= autovacuum_max_workers.
>
> Increasing a/v workers will not help speed up a specific table, what I am suggesting is that instead of speeding up one table, let's just allow other tables to not be starved of a/v cycles due to lack of a/v workers.

OK, I got it. But what if vacuuming of a single table will take (for example) 60% of all time? This is still a possible situation, and the fast vacuum of all other tables will not help us.

--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).
>
> I have concerns about this design. When autovacuuming on a single table consumes all available autovacuum_max_workers slots with parallel vacuum workers, the system becomes incapable of processing other tables. This means that when determining the appropriate autovacuum_max_workers value, users must consider not only the number of tables to be processed concurrently but also the potential number of parallel workers that might be launched. I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.

For now we have max_parallel_index_autovac_workers - this GUC limits the number of parallel a/v workers that can process a single table. I agree that the scenario you provided is problematic.
The proposal to limit the total number of supportive a/v workers seems attractive to me (I'll implement it as an experiment).

It seems to me that this question is becoming a key one. First we need to determine the role of the user in the whole scheduling mechanism. Should we allow users to determine priority? Will this priority affect only within a single vacuuming cycle, or it will be more 'global'? I guess I don't have enough expertise to determine this alone. I will be glad to receive any suggestions.

> > About `at_params.nworkers = N` - that's exactly what we're doing (you can see it in the `vacuum_rel` function). But we cannot fully reuse code of VACUUM PARALLEL, because it creates its own processes via dynamic bgworkers machinery.
> > As I said above - we don't want to consume additional resources. Also we don't want to complicate communication between processes (the idea is that a/v workers can only send signals to the a/v launcher).
>
> Could you elaborate on the reasons why you don't want to use background workers and avoid complicated communication between processes? I'm not sure whether these concerns provide sufficient justification for implementing its own parallel index processing.

Here are my thoughts on this. A/v worker has a very simple role - it is born after the launcher's request and must do exactly one 'task' - vacuum table or participate in parallel index vacuum. We also have a dedicated 'launcher' role, meaning the whole design implies that only the launcher is able to launch processes.

If we allow a/v worker to use bgworkers, then :
1) A/v worker will go far beyond his responsibility.
2) Its functionality will overlap with the functionality of the launcher.
3) Resource consumption can jump dramatically, which is unexpected for the user. Autovacuum will also be dependent on other resources (bgworkers pool). The current design does not imply this.

I wanted to create a patch that would fit into the existing mechanism without drastic innovations. But if you think that the above is not so important, then we can reuse VACUUM PARALLEL code and it would simplify the final implementation)

--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 5:59 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> > I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.
>
> +1, and would it make sense for parallel workers to come from max_parallel_maintenance_workers? This is capped by max_parallel_workers and max_worker_processes, so increasing the defaults for all 3 will be needed as well.

I may be wrong, but the `max_parallel_maintenance_workers` parameter is only used for commands that are explicitly run by the user. We already have `autovacuum_max_workers` and I think that code will be more consistent, if we adapt this particular parameter (perhaps with the addition of a new one, as I wrote in the previous letter).

--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).
> >
> > I have concerns about this design. When autovacuuming on a single table consumes all available autovacuum_max_workers slots with parallel vacuum workers, the system becomes incapable of processing other tables. This means that when determining the appropriate autovacuum_max_workers value, users must consider not only the number of tables to be processed concurrently but also the potential number of parallel workers that might be launched. I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.
> >
> For now we have max_parallel_index_autovac_workers - this GUC limits the number of parallel a/v workers that can process a single table. I agree that the scenario you provided is problematic.
> The proposal to limit the total number of supportive a/v workers seems attractive to me (I'll implement it as an experiment).
>
> It seems to me that this question is becoming a key one. First we need to determine the role of the user in the whole scheduling mechanism. Should we allow users to determine priority? Will this priority affect only within a single vacuuming cycle, or it will be more 'global'? I guess I don't have enough expertise to determine this alone. I will be glad to receive any suggestions.

What I roughly imagined is that we don't need to change the entire autovacuum scheduling, but would like autovacuum workers to decides whether or not to use parallel vacuum during its vacuum operation based on GUC parameters (having a global effect) or storage parameters (having an effect on the particular table). The criteria of triggering parallel vacuum in autovacuum might need to be somewhat pessimistic so that we don't unnecessarily use parallel vacuum on many tables.

> > Could you elaborate on the reasons why you don't want to use background workers and avoid complicated communication between processes? I'm not sure whether these concerns provide sufficient justification for implementing its own parallel index processing.
> >
> Here are my thoughts on this. A/v worker has a very simple role - it is born after the launcher's request and must do exactly one 'task' - vacuum table or participate in parallel index vacuum.
> We also have a dedicated 'launcher' role, meaning the whole design implies that only the launcher is able to launch processes.
>
> If we allow a/v worker to use bgworkers, then :
> 1) A/v worker will go far beyond his responsibility.
> 2) Its functionality will overlap with the functionality of the launcher.

While I agree that the launcher process is responsible for launching autovacuum worker processes but I'm not sure it should be for launching everything related autovacuums. It's quite possible that we have parallel heap vacuum and processing the particular index with parallel workers in the future. The code could get more complex if we have the autovacuum launcher process launch such parallel workers too. I believe it's more straightforward to divide the responsibility like in a way that the autovacuum launcher is responsible for launching autovacuum workers and autovacuum workers are responsible for vacuuming tables no matter how to do that.

> 3) Resource consumption can jump dramatically, which is unexpected for the user.

What extra resources could be used if we use background workers instead of autovacuum workers?

> Autovacuum will also be dependent on other resources (bgworkers pool). The current design does not imply this.

I see your point but I think it doesn't necessarily need to reflect it at the infrastructure layer. For example, we can internally allocate extra background worker slots for parallel vacuum workers based on max_parallel_index_autovac_workers in addition to max_worker_processes. Anyway we might need something to check or validate max_worker_processes value to make sure that every autovacuum worker can use the specified number of parallel workers for parallel vacuum.

> I wanted to create a patch that would fit into the existing mechanism without drastic innovations. But if you think that the above is not so important, then we can reuse VACUUM PARALLEL code and it would simplify the final implementation)

I'd suggest using the existing infrastructure if we can achieve the goal with it. If we find out there are some technical difficulties to implement it without new infrastructure, we can revisit this approach.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > In current implementation, the leader process sends a signal to the
> > > > a/v launcher, and the launcher tries to launch all requested workers.
> > > > But the number of workers never exceeds `autovacuum_max_workers`.
> > > > Thus, we will never have more a/v workers than in the standard case
> > > > (without this feature).
> > >
> > > I have concerns about this design. When autovacuuming on a single
> > > table consumes all available autovacuum_max_workers slots with
> > > parallel vacuum workers, the system becomes incapable of processing
> > > other tables. This means that when determining the appropriate
> > > autovacuum_max_workers value, users must consider not only the number
> > > of tables to be processed concurrently but also the potential number
> > > of parallel workers that might be launched. I think it would more make
> > > sense to maintain the existing autovacuum_max_workers parameter while
> > > introducing a new parameter that would either control the maximum
> > > number of parallel vacuum workers per autovacuum worker or set a
> > > system-wide cap on the total number of parallel vacuum workers.
> > >
> >
> > For now we have max_parallel_index_autovac_workers - this GUC limits
> > the number of parallel a/v workers that can process a single table. I
> > agree that the scenario you provided is problematic.
> > The proposal to limit the total number of supportive a/v workers seems
> > attractive to me (I'll implement it as an experiment).
> >
> > It seems to me that this question is becoming a key one. First we need
> > to determine the role of the user in the whole scheduling mechanism.
> > Should we allow users to determine priority? Will this priority affect
> > only within a single vacuuming cycle, or it will be more 'global'?
> > I guess I don't have enough expertise to determine this alone. I will
> > be glad to receive any suggestions.
>
> What I roughly imagined is that we don't need to change the entire
> autovacuum scheduling, but would like autovacuum workers to decides
> whether or not to use parallel vacuum during its vacuum operation
> based on GUC parameters (having a global effect) or storage parameters
> (having an effect on the particular table). The criteria of triggering
> parallel vacuum in autovacuum might need to be somewhat pessimistic so
> that we don't unnecessarily use parallel vacuum on many tables.
Perhaps we should only provide a reloption, so that only tables specified by the user via the reloption can be autovacuumed in parallel?

This gives a targeted approach. Of course, if multiple of these allowed tables are to be autovacuumed at the same time, some may not get all the workers, but that's no different from manually vacuuming these tables in parallel at the same time.

What do you think?

--
Sami
On Tue, May 6, 2025 at 6:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> What I roughly imagined is that we don't need to change the entire autovacuum scheduling, but would like autovacuum workers to decides whether or not to use parallel vacuum during its vacuum operation based on GUC parameters (having a global effect) or storage parameters (having an effect on the particular table). The criteria of triggering parallel vacuum in autovacuum might need to be somewhat pessimistic so that we don't unnecessarily use parallel vacuum on many tables.
>

+1, I think about it in the same way. I will expand on this topic in more detail in response to Sami's letter [1], so as not to repeat myself.

> > Here are my thoughts on this. A/v worker has a very simple role - it is born after the launcher's request and must do exactly one 'task' - vacuum table or participate in parallel index vacuum.
> > We also have a dedicated 'launcher' role, meaning the whole design implies that only the launcher is able to launch processes.
> >
> > If we allow a/v worker to use bgworkers, then :
> > 1) A/v worker will go far beyond his responsibility.
> > 2) Its functionality will overlap with the functionality of the launcher.
>
> While I agree that the launcher process is responsible for launching autovacuum worker processes but I'm not sure it should be for launching everything related autovacuums. It's quite possible that we have parallel heap vacuum and processing the particular index with parallel workers in the future. The code could get more complex if we have the autovacuum launcher process launch such parallel workers too. I believe it's more straightforward to divide the responsibility like in a way that the autovacuum launcher is responsible for launching autovacuum workers and autovacuum workers are responsible for vacuuming tables no matter how to do that.

It sounds very tempting. At the very beginning I did exactly that (to make sure that nothing would break in a parallel autovacuum). Only later it was decided to abandon the use of bgworkers. For now both approaches look fair to me. What do you think - will others agree that we can give more responsibility to a/v workers?

> > 3) Resource consumption can jump dramatically, which is unexpected for the user.
>
> What extra resources could be used if we use background workers instead of autovacuum workers?

I meant that more processes are starting to participate in the autovacuum than indicated in autovacuum_max_workers. And if an a/v worker will use additional bgworkers => other operations cannot get these resources.

> > Autovacuum will also be dependent on other resources (bgworkers pool). The current design does not imply this.
>
> I see your point but I think it doesn't necessarily need to reflect it at the infrastructure layer. For example, we can internally allocate extra background worker slots for parallel vacuum workers based on max_parallel_index_autovac_workers in addition to max_worker_processes. Anyway we might need something to check or validate max_worker_processes value to make sure that every autovacuum worker can use the specified number of parallel workers for parallel vacuum.

I don't think that we can provide all supportive workers for each parallel index vacuuming request. But I got your point - always keep several bgworkers that only a/v workers can use if needed, and the size of this additional pool (depending on max_worker_processes) must be user-configurable.

> > I wanted to create a patch that would fit into the existing mechanism without drastic innovations. But if you think that the above is not so important, then we can reuse VACUUM PARALLEL code and it would simplify the final implementation)
>
> I'd suggest using the existing infrastructure if we can achieve the goal with it. If we find out there are some technical difficulties to implement it without new infrastructure, we can revisit this approach.

OK, in the near future I'll implement it and send a new patch to this thread. I'll be glad if you take a look at it)

[1] https://www.postgresql.org/message-id/CAA5RZ0vfBg%3Dc_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag%40mail.gmail.com

--
Best regards,
Daniil Davydov
On Mon, May 5, 2025 at 5:21 PM Sami Imseih <samimseih@gmail.com> wrote:
>
>> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
>> >
>> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> > >
>> > > > In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).
>> > >
>> > > I have concerns about this design. When autovacuuming on a single table consumes all available autovacuum_max_workers slots with parallel vacuum workers, the system becomes incapable of processing other tables. This means that when determining the appropriate autovacuum_max_workers value, users must consider not only the number of tables to be processed concurrently but also the potential number of parallel workers that might be launched. I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.
>> > >
>> >
>> > For now we have max_parallel_index_autovac_workers - this GUC limits the number of parallel a/v workers that can process a single table. I agree that the scenario you provided is problematic.
>> > The proposal to limit the total number of supportive a/v workers seems attractive to me (I'll implement it as an experiment).
>> >
>> > It seems to me that this question is becoming a key one. First we need to determine the role of the user in the whole scheduling mechanism. Should we allow users to determine priority? Will this priority affect only within a single vacuuming cycle, or it will be more 'global'? I guess I don't have enough expertise to determine this alone. I will be glad to receive any suggestions.
>>
>> What I roughly imagined is that we don't need to change the entire autovacuum scheduling, but would like autovacuum workers to decides whether or not to use parallel vacuum during its vacuum operation based on GUC parameters (having a global effect) or storage parameters (having an effect on the particular table). The criteria of triggering parallel vacuum in autovacuum might need to be somewhat pessimistic so that we don't unnecessarily use parallel vacuum on many tables.
>
> Perhaps we should only provide a reloption, therefore only tables specified by the user via the reloption can be autovacuumed in parallel?
>
> This gives a targeted approach. Of course if multiple of these allowed tables are to be autovacuumed at the same time, some may not get all the workers, But that's not different from if you are to manually vacuum in parallel the tables at the same time.
>
> What do you think ?

+1. I think that's a good starting point. We can later introduce a new GUC parameter that globally controls the maximum number of parallel vacuum workers used in autovacuum, if necessary.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, May 6, 2025 at 7:21 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> Perhaps we should only provide a reloption, therefore only tables specified by the user via the reloption can be autovacuumed in parallel?

After your comments (earlier in this thread) I decided to do just that. For now we have a reloption, so the user can decide which tables are "important" for parallel index vacuuming. We also set lower bounds (hardcoded) on the number of indexes and the number of dead tuples. For example, there is no need to use a parallel vacuum if the table has only one index. The situation is more complicated with the number of dead tuples - we need tests that would show the optimal minimum value. This issue is still being worked out.

> This gives a targeted approach. Of course if multiple of these allowed tables are to be autovacuumed at the same time, some may not get all the workers, But that's not different from if you are to manually vacuum in parallel the tables at the same time.

I fully agree. Recently the v2 patch has been supplemented with a new feature [1] - multiple tables in a cluster can be processed in parallel during autovacuum. And of course, not every a/v worker can get enough supportive processes, but this is considered normal behavior. The maximum number of supportive workers is limited by a GUC variable.

[1] I guess that I'll send it within the v3 patch, which will also contain the logic that was discussed in the letter above - using bgworkers instead of additional a/v workers. BTW, what do you think about this idea?

--
Best regards,
Daniil Davydov
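As an illustration, opting a table in to the behavior described above could look like this (parallel_idx_autovac_enabled is the reloption name used by the POC patch, and the table name is only an example):

-- Allow autovacuum to process this table's indexes in parallel (POC reloption):
ALTER TABLE big_table SET (parallel_idx_autovac_enabled = true);
-- Revert to the default, serial behavior:
ALTER TABLE big_table RESET (parallel_idx_autovac_enabled);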
> On Mon, May 5, 2025 at 5:21 PM Sami Imseih <samimseih@gmail.com> wrote:
> >
> >> On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >> >
> >> > On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >> > >
> >> > > > In current implementation, the leader process sends a signal to the a/v launcher, and the launcher tries to launch all requested workers. But the number of workers never exceeds `autovacuum_max_workers`. Thus, we will never have more a/v workers than in the standard case (without this feature).
> >> > >
> >> > > I have concerns about this design. When autovacuuming on a single table consumes all available autovacuum_max_workers slots with parallel vacuum workers, the system becomes incapable of processing other tables. This means that when determining the appropriate autovacuum_max_workers value, users must consider not only the number of tables to be processed concurrently but also the potential number of parallel workers that might be launched. I think it would more make sense to maintain the existing autovacuum_max_workers parameter while introducing a new parameter that would either control the maximum number of parallel vacuum workers per autovacuum worker or set a system-wide cap on the total number of parallel vacuum workers.
> >> > >
> >> >
> >> > For now we have max_parallel_index_autovac_workers - this GUC limits the number of parallel a/v workers that can process a single table. I agree that the scenario you provided is problematic.
> >> > The proposal to limit the total number of supportive a/v workers seems attractive to me (I'll implement it as an experiment).
> >> >
> >> > It seems to me that this question is becoming a key one. First we need to determine the role of the user in the whole scheduling mechanism. Should we allow users to determine priority? Will this priority affect only within a single vacuuming cycle, or it will be more 'global'? I guess I don't have enough expertise to determine this alone. I will be glad to receive any suggestions.
> >>
> >> What I roughly imagined is that we don't need to change the entire autovacuum scheduling, but would like autovacuum workers to decides whether or not to use parallel vacuum during its vacuum operation based on GUC parameters (having a global effect) or storage parameters (having an effect on the particular table). The criteria of triggering parallel vacuum in autovacuum might need to be somewhat pessimistic so that we don't unnecessarily use parallel vacuum on many tables.
> >
> > Perhaps we should only provide a reloption, therefore only tables specified by the user via the reloption can be autovacuumed in parallel?
> >
> > This gives a targeted approach. Of course if multiple of these allowed tables are to be autovacuumed at the same time, some may not get all the workers, But that's not different from if you are to manually vacuum in parallel the tables at the same time.
> >
> > What do you think ?
>
> +1. I think that's a good starting point. We can later introduce a new GUC parameter that globally controls the maximum number of parallel vacuum workers used in autovacuum, if necessary.

And I think this reloption should also apply to parallel heap vacuum in non-failsafe scenarios. In the failsafe case however, all tables will be eligible for parallel vacuum. Anyhow, that discussion could be taken in that thread, but wanted to point that out.

--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On 09/05/25 15:33, Daniil Davydov wrote:
> Hi,
> As I promised - meet parallel index autovacuum with bgworkers (Parallel-index-autovacuum-with-bgworkers.patch). This is pretty simple implementation :
> 1) Added new table option `parallel_idx_autovac_enabled` that must be set to `true` if user wants autovacuum to process table in parallel.
> 2) Added new GUC variable `autovacuum_reserved_workers_num`. This is number of parallel workers from bgworkers pool that can be used only by autovacuum workers. The `autovacuum_reserved_workers_num` parameter actually reserves a requested part of the processes, the total number of which is equal to `max_worker_processes`.
> 3) When an autovacuum worker decides to process some table in parallel, it just sets `VacuumParams->nworkers` to appropriate value (> 0) and then the code is executed as if it were a regular VACUUM PARALLEL.
> 4) I kept test/modules/autovacuum as sandbox where you can play with parallel index autovacuum a bit.
>
> What do you think about this implementation?
>

I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is failing:

❯❯❯ ninja -C build install
ninja: Entering directory `build'
[1/126] Compiling C object src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible pointer to integer conversion initializing 'int' with an expression of type 'void *' [-Wint-conversion]
 3613 |   NULL,
      |   ^~~~

It seems that the "autovacuum_reserved_workers_num" declaration on guc_tables.c has an extra gettext_noop() call?

One other point is that as you've added TAP tests for the autovacuum I think you also need to create a meson.build file as you already create the Makefile.

You also need to update the src/test/modules/meson.build and src/test/modules/Makefile to include the new test/modules/autovacuum path.

--
Matheus Alcantara
Hi,

On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara <matheusssilv97@gmail.com> wrote:
> I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is failing:
> ❯❯❯ ninja -C build install
> ninja: Entering directory `build'
> [1/126] Compiling C object src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> ../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible pointer to integer conversion initializing 'int' with an expression of type 'void *' [-Wint-conversion]
> 3613 |   NULL,
>      |   ^~~~
>

Thank you for reviewing this patch!

> It seems that the "autovacuum_reserved_workers_num" declaration on guc_tables.c has an extra gettext_noop() call?

Good catch, I fixed this warning in the v2 version.

> One other point is that as you've added TAP tests for the autovacuum I think you also need to create a meson.build file as you already create the Makefile.
>
> You also need to update the src/test/modules/meson.build and src/test/modules/Makefile to include the new test/modules/autovacuum path.
>

OK, I should clarify this moment : modules/autovacuum is not a normal test but a sandbox - just an example of how we can trigger parallel index autovacuum. Also it may be used for debugging purposes. In fact, 001_autovac_parallel.pl is not verifying anything.
I'll do as you asked (add all meson and Make stuff), but please don't focus on it. The creation of the real test is still in progress. (I'll try to complete it as soon as possible).

In this letter I will divide the patch into 2 parts : implementation and sandbox. What do you think about implementation?

--
Best regards,
Daniil Davydov
Attachments
On Thu, May 15, 2025 at 10:10 PM Daniil Davydov <3danissimo@gmail.com> wrote:
>
> Hi,
>
> On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara <matheusssilv97@gmail.com> wrote:
> > I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is failing:
> > ❯❯❯ ninja -C build install
> > ninja: Entering directory `build'
> > [1/126] Compiling C object src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> > FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
> > ../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible pointer to integer conversion initializing 'int' with an expression of type 'void *' [-Wint-conversion]
> > 3613 |   NULL,
> >      |   ^~~~
>
> Thank you for reviewing this patch!
>
> > It seems that the "autovacuum_reserved_workers_num" declaration on guc_tables.c has an extra gettext_noop() call?
>
> Good catch, I fixed this warning in the v2 version.
>
> > One other point is that as you've added TAP tests for the autovacuum I think you also need to create a meson.build file as you already create the Makefile.
> >
> > You also need to update the src/test/modules/meson.build and src/test/modules/Makefile to include the new test/modules/autovacuum path.
>
> OK, I should clarify this moment : modules/autovacuum is not a normal test but a sandbox - just an example of how we can trigger parallel index autovacuum. Also it may be used for debugging purposes. In fact, 001_autovac_parallel.pl is not verifying anything.
> I'll do as you asked (add all meson and Make stuff), but please don't focus on it. The creation of the real test is still in progress. (I'll try to complete it as soon as possible).
>
> In this letter I will divide the patch into 2 parts : implementation and sandbox. What do you think about implementation?

Thank you for updating the patches. I have some comments on the v2-0001 patch:

+ {
+     {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+         gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
+         gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+                      "*Only* autovacuum workers can use these additional processes. "
+                      "Also, these processes are taken into account in \"max_parallel_workers\"."),
+     },
+     &av_reserved_workers_num,
+     0, 0, MAX_BACKENDS,
+     check_autovacuum_reserved_workers_num, NULL, NULL
+ },

I find that the name "autovacuum_reserved_workers_num" is generic. It would be better to have a more specific name for parallel vacuum such as autovacuum_max_parallel_workers. This parameter is related to neither autovacuum_worker_slots nor autovacuum_max_workers, which seems fine to me. Also, max_parallel_maintenance_workers doesn't affect this parameter.

Which number does this parameter mean to specify: the maximum number of parallel vacuum workers that can be used during autovacuum, or the maximum number of parallel vacuum workers that each autovacuum can use?

---

The patch includes the changes to bgworker.c so that we can reserve some slots for autovacuums. I guess that this change is not necessarily necessary because if the user sets the related GUC parameters correctly the autovacuum workers can use parallel vacuum as expected. Even if we need this change, I would suggest implementing it as a separate patch.

---

+ {
+     {
+         "parallel_idx_autovac_enabled",
+         "Allows autovacuum to process indexes of this table in parallel mode",
+         RELOPT_KIND_HEAP,
+         ShareUpdateExclusiveLock
+     },
+     false
+ },

The proposed reloption name doesn't align with our naming conventions. Looking at our existing reloptions, we typically write out full words rather than using abbreviations like 'autovac' or 'idx'.

I guess we can implement this parameter as an integer parameter so that the user can specify the number of parallel vacuum workers for the table. For example, we can have a reloption autovacuum_parallel_workers. Setting 0 (by default) means to disable parallel vacuum during autovacuum, and setting the special value -1 means to let PostgreSQL calculate the parallel degree for the table (same as the default VACUUM command behavior).

I've also considered some alternative names. If we were to use parallel_maintenance_workers, it sounds like it controls the parallel degree for all operations using max_parallel_maintenance_workers, including CREATE INDEX. Similarly, vacuum_parallel_workers could be interpreted as affecting both autovacuum and manual VACUUM commands, suggesting that when users run "VACUUM (PARALLEL) t", the system would use their specified value for the parallel degree. I prefer autovacuum_parallel_workers or vacuum_parallel_workers.

---

+ /*
+  * If we are running autovacuum - decide whether we need to process indexes
+  * of table with given oid in parallel.
+  */
+ if (AmAutoVacuumWorkerProcess() &&
+     params->index_cleanup != VACOPTVALUE_DISABLED &&
+     RelationAllowsParallelIdxAutovac(rel))

I think that this should be done in autovacuum code.

---

+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes should process each parallel worker during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30

Are these fixed values really useful in common cases? I think we already have an optimization where we skip vacuuming indexes if the table has fewer dead tuples (see BYPASS_THRESHOLD_PAGES). Given that we rely on users' heuristics about which table needs to use parallel vacuum during autovacuum, I think we don't need to apply these conditions.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
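A sketch of the integer reloption semantics suggested above (autovacuum_parallel_workers is a proposed name, not an existing parameter, and the table name is only an example):

ALTER TABLE big_table SET (autovacuum_parallel_workers = 4);   -- use up to 4 parallel vacuum workers
ALTER TABLE big_table SET (autovacuum_parallel_workers = 0);   -- proposed default: no parallel vacuum in autovacuum
ALTER TABLE big_table SET (autovacuum_parallel_workers = -1);  -- let the server choose, as manual VACUUM (PARALLEL) does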
I started looking at the patch but I have some high level thoughts I would like to share before looking further. > > I find that the name "autovacuum_reserved_workers_num" is generic. It > > would be better to have a more specific name for parallel vacuum such > > as autovacuum_max_parallel_workers. This parameter is related to > > neither autovacuum_worker_slots nor autovacuum_max_workers, which > > seems fine to me. Also, max_parallel_maintenance_workers doesn't > > affect this parameter. > > ....... > > I've also considered some alternative names. If we were to use > > parallel_maintenance_workers, it sounds like it controls the parallel > > degree for all operations using max_parallel_maintenance_workers, > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be > > interpreted as affecting both autovacuum and manual VACUUM commands, > > suggesting that when users run "VACUUM (PARALLEL) t", the system would > > use their specified value for the parallel degree. I prefer > > autovacuum_parallel_workers or vacuum_parallel_workers. > > > > This was my headache when I created names for variables. Autovacuum > initially implies parallelism, because we have several parallel a/v > workers. So I think that parameter like > `autovacuum_max_parallel_workers` will confuse somebody. > If we want to have a more specific name, I would prefer > `max_parallel_index_autovacuum_workers`. I don't think we should have a separate pool of parallel workers for those that are used to support parallel autovacuum. At the end of the day, these are parallel workers and they should be capped by max_parallel_workers. I think it will be confusing if we claim these are parallel workers, but they are coming from a different pool. I envision we have another GUC such as "max_parallel_autovacuum_workers" (which I think is a better name) that matches the behavior of "max_parallel_maintenance_worker". Meaning that the autovacuum workers still maintain their existing behavior ( launching a worker per table ), and if they do need to vacuum in parallel, they can draw from a pool of parallel workers. With the above said, I therefore think the reloption should actually be a number of parallel workers rather than a boolean. Let's take an example of a user that has 3 tables they wish to (auto)vacuum can process in parallel, and if available they wish each of these tables could be autovacuumed with 4 parallel workers. However, as to not overload the system, they cap the 'max_parallel_maintenance_worker' to something like 8. If it so happens that all 3 tables are auto-vacuumed at the same time, there may not be enough parallel workers, so one table will be a loser and be vacuumed in serial. That is acceptable, and a/v logging ( and perhaps other stat views ) should display this behavior: workers planned vs workers launched. thoughts? -- Sami Imseih Amazon Web Services (AWS)
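To make the three-table scenario above concrete, here is a small standalone C model (not PostgreSQL code) of a shared pool of 8 parallel workers where each of three concurrently autovacuumed tables requests 4 workers; the first two reservations succeed in full and the third gets nothing and falls back to serial index vacuuming:

	#include <stdio.h>

	static int free_workers = 8;	/* plays the role of the global cap */

	static int
	reserve_workers(int nrequested)
	{
		int granted = (nrequested < free_workers) ? nrequested : free_workers;

		free_workers -= granted;
		return granted;
	}

	static void
	release_workers(int nreleased)
	{
		free_workers += nreleased;
	}

	int
	main(void)
	{
		const char *tables[] = {"t1", "t2", "t3"};

		for (int i = 0; i < 3; i++)
		{
			int launched = reserve_workers(4);

			printf("%s: workers planned: 4, launched: %d%s\n",
				   tables[i], launched,
				   launched == 0 ? " (vacuumed serially)" : "");
		}

		/* after the vacuums finish, the workers go back to the pool */
		release_workers(8);
		return 0;
	}

Running it prints 4, 4, and 0 launched workers, which is exactly the "workers planned vs workers launched" information the logging should expose.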
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > Hi, > > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I have some comments on v2-0001 patch > > Thank you for reviewing this patch! > > > + { > > + {"autovacuum_reserved_workers_num", PGC_USERSET, > > RESOURCES_WORKER_PROCESSES, > > + gettext_noop("Number of worker processes, reserved for > > participation in parallel index processing during autovacuum."), > > + gettext_noop("This parameter is depending on > > \"max_worker_processes\" (not on \"autovacuum_max_workers\"). " > > + "*Only* autovacuum workers can use these > > additional processes. " > > + "Also, these processes are taken into account > > in \"max_parallel_workers\"."), > > + }, > > + &av_reserved_workers_num, > > + 0, 0, MAX_BACKENDS, > > + check_autovacuum_reserved_workers_num, NULL, NULL > > + }, > > > > I find that the name "autovacuum_reserved_workers_num" is generic. It > > would be better to have a more specific name for parallel vacuum such > > as autovacuum_max_parallel_workers. This parameter is related to > > neither autovacuum_worker_slots nor autovacuum_max_workers, which > > seems fine to me. Also, max_parallel_maintenance_workers doesn't > > affect this parameter. > > ....... > > I've also considered some alternative names. If we were to use > > parallel_maintenance_workers, it sounds like it controls the parallel > > degree for all operations using max_parallel_maintenance_workers, > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be > > interpreted as affecting both autovacuum and manual VACUUM commands, > > suggesting that when users run "VACUUM (PARALLEL) t", the system would > > use their specified value for the parallel degree. I prefer > > autovacuum_parallel_workers or vacuum_parallel_workers. > > > > This was my headache when I created names for variables. Autovacuum > initially implies parallelism, because we have several parallel a/v > workers. I'm not sure if it's parallelism. We can have multiple autovacuum workers simultaneously working on different tables, which seems not parallelism to me. > So I think that parameter like > `autovacuum_max_parallel_workers` will confuse somebody. > If we want to have a more specific name, I would prefer > `max_parallel_index_autovacuum_workers`. It's better not to use 'index' as we're trying to extend parallel vacuum to heap scanning/vacuuming as well[1]. > > > + /* > > + * If we are running autovacuum - decide whether we need to process indexes > > + * of table with given oid in parallel. > > + */ > > + if (AmAutoVacuumWorkerProcess() && > > + params->index_cleanup != VACOPTVALUE_DISABLED && > > + RelationAllowsParallelIdxAutovac(rel)) > > > > I think that this should be done in autovacuum code. > > We need params->index cleanup variable to decide whether we need to > use parallel index a/v. In autovacuum.c we have this code : > *** > /* > * index_cleanup and truncate are unspecified at first in autovacuum. > * They will be filled in with usable values using their reloptions > * (or reloption defaults) later. > */ > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED; > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED; > *** > This variable is filled in inside the `vacuum_rel` function, so I > think we should keep the above logic in vacuum.c. I guess that we can specify the parallel degree even if index_cleanup is still UNSPECIFIED. 
vacuum_rel() would then decide whether to use index vacuuming and vacuumlazy.c would decide whether to use parallel vacuum based on the specified parallel degree and index_cleanup value. > > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024 > > > > These fixed values really useful in common cases? I think we already > > have an optimization where we skip vacuum indexes if the table has > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES). > > When we allocate dead items (and optionally init parallel autocuum) we > don't have sane value for `vacrel->lpdead_item_pages` (which should be > compared with BYPASS_THRESHOLD_PAGES). > The only criterion that we can focus on is the number of dead tuples > indicated in the PgStat_StatTabEntry. My point is that this criterion might not be useful. We have the bypass optimization for index vacuuming and having many dead tuples doesn't necessarily mean index vacuuming taking a long time. For example, even if the table has a few dead tuples, index vacuuming could take a very long time and parallel index vacuuming would help the situation, if the table is very large and has many indexes. > > ---- > > > I guess we can implement this parameter as an integer parameter so > > that the user can specify the number of parallel vacuum workers for > > the table. For example, we can have a reloption > > autovacuum_parallel_workers. Setting 0 (by default) means to disable > > parallel vacuum during autovacuum, and setting special value -1 means > > to let PostgreSQL calculate the parallel degree for the table (same as > > the default VACUUM command behavior). > > ........... > > The patch includes the changes to bgworker.c so that we can reserve > > some slots for autovacuums. I guess that this change is not > > necessarily necessary because if the user sets the related GUC > > parameters correctly the autovacuum workers can use parallel vacuum as > > expected. Even if we need this change, I would suggest implementing > > it as a separate patch. > > .......... > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024 > > +#define NUM_INDEXES_PER_PARALLEL_WORKER 30 > > > > These fixed values really useful in common cases? Given that we rely on > > users' heuristics which table needs to use parallel vacuum during > > autovacuum, I think we don't need to apply these conditions. > > .......... > > I grouped these comments together, because they all relate to a single > question : how much freedom will we give to the user? > Your opinion (as far as I understand) is that we allow users to > specify any number of parallel workers for tables, and it is the > user's responsibility to configure appropriate GUC variables, so that > autovacuum can always process indexes in parallel. > And we don't need to think about thresholds. Even if the table has a > small number of indexes and dead rows - if the user specified table > option, we must do a parallel index a/v with requested number of > parallel workers. > Please correct me if I messed something up. > > I think that this logic is well suited for the `VACUUM (PARALLEL)` sql > command, which is manually called by the user. The current idea that users can use parallel vacuum on particular tables based on their heuristic makes sense to me as the first implementation. > But autovacuum (as I think) should work as stable as possible and > `unnoticed` by other processes. Thus, we must : > 1) Compute resources (such as the number of parallel workers for a > single table's indexes vacuuming) as efficiently as possible. 
> 2) Provide a guarantee that as many tables as possible (among > requested) will be processed in parallel. I think these ideas could be implemented on top of the current idea. > (1) can be achieved by calculating the parameters on the fly. > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more > accurate value in the near future. I think it requires more things than the number of indexes on the table to achieve (1). Suppose that there is a very large table that gets updates heavily and has a few indexes. If users want to avoid the table from being bloated, it would be a reasonable idea to use parallel vacuum during autovacuum and it would not be a good idea to disallow using parallel vacuum solely because it doesn't have more than 30 indexes. On the other hand, if the table had got many updates but not so now, users might want to use resources for autovacuums on other tables. We might need to consider autovacuum frequencies per table, the statistics of the previous autovacuum, or system loads etc. So I think that in order to achieve (1) we might need more statistics and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine. > (2) can be achieved by workers reserving - we know that N workers > (from bgworkers pool) are *always* at our disposal. And when we use > such workers we are not dependent on other operations in the cluster > and we don't interfere with other operations by taking resources away > from them. Reserving some bgworkers for autovacuum could make sense. But I think it's better to implement it in a general way as it could be useful in other use cases too. That is, it might be a good to implement infrastructure so that any PostgreSQL code (possibly including extensions) can request allocating a pool of bgworkers for specific usage and use bgworkers from them. Regards, [1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
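A minimal sketch of the point above that the parallel degree can be chosen in autovacuum.c while index_cleanup stays unspecified. It assumes the existing VacuumParams.nworkers convention (-1 disables parallel vacuum, 0 lets the system choose) and a hypothetical per-table value av_parallel_workers taken from the reloption under discussion:

	/*
	 * index_cleanup and truncate are still resolved later from reloptions;
	 * only the requested parallel degree is filled in here.
	 */
	tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
	tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;

	/* av_parallel_workers: hypothetical per-table reloption value */
	tab->at_params.nworkers = (av_parallel_workers > 0) ?
		av_parallel_workers : -1;	/* -1 keeps parallel vacuum disabled */

vacuum_rel() and vacuumlazy.c would then decide whether index vacuuming runs at all and whether the requested degree is actually used.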
On Thu, May 22, 2025 at 10:48 AM Sami Imseih <samimseih@gmail.com> wrote: > > I started looking at the patch but I have some high level thoughts I would > like to share before looking further. > > > > I find that the name "autovacuum_reserved_workers_num" is generic. It > > > would be better to have a more specific name for parallel vacuum such > > > as autovacuum_max_parallel_workers. This parameter is related to > > > neither autovacuum_worker_slots nor autovacuum_max_workers, which > > > seems fine to me. Also, max_parallel_maintenance_workers doesn't > > > affect this parameter. > > > ....... > > > I've also considered some alternative names. If we were to use > > > parallel_maintenance_workers, it sounds like it controls the parallel > > > degree for all operations using max_parallel_maintenance_workers, > > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be > > > interpreted as affecting both autovacuum and manual VACUUM commands, > > > suggesting that when users run "VACUUM (PARALLEL) t", the system would > > > use their specified value for the parallel degree. I prefer > > > autovacuum_parallel_workers or vacuum_parallel_workers. > > > > > > > This was my headache when I created names for variables. Autovacuum > > initially implies parallelism, because we have several parallel a/v > > workers. So I think that parameter like > > `autovacuum_max_parallel_workers` will confuse somebody. > > If we want to have a more specific name, I would prefer > > `max_parallel_index_autovacuum_workers`. > > I don't think we should have a separate pool of parallel workers for those > that are used to support parallel autovacuum. At the end of the day, these > are parallel workers and they should be capped by max_parallel_workers. I think > it will be confusing if we claim these are parallel workers, but they > are coming from > a different pool. I agree that parallel vacuum workers used during autovacuum should be capped by the max_parallel_workers. > > I envision we have another GUC such as "max_parallel_autovacuum_workers" > (which I think is a better name) that matches the behavior of > "max_parallel_maintenance_worker". Meaning that the autovacuum workers > still maintain their existing behavior ( launching a worker per table > ), and if they do need > to vacuum in parallel, they can draw from a pool of parallel workers. > > With the above said, I therefore think the reloption should actually be a number > of parallel workers rather than a boolean. Let's take an example of a > user that has 3 tables > they wish to (auto)vacuum can process in parallel, and if available > they wish each of these tables > could be autovacuumed with 4 parallel workers. However, as to not > overload the system, they > cap the 'max_parallel_maintenance_worker' to something like 8. If it > so happens that all > 3 tables are auto-vacuumed at the same time, there may not be enough > parallel workers, > so one table will be a loser and be vacuumed in serial. +1 for the reloption having a number of parallel workers, leaving aside the name competition. > That is > acceptable, and a/v logging > ( and perhaps other stat views ) should display this behavior: workers > planned vs workers launched. Agreed. The workers planned vs. launched is reported only with VERBOSE option so we need to change it so that autovacuum can log it at least. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi, On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > > > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > I find that the name "autovacuum_reserved_workers_num" is generic. It > > > would be better to have a more specific name for parallel vacuum such > > > as autovacuum_max_parallel_workers. This parameter is related to > > > neither autovacuum_worker_slots nor autovacuum_max_workers, which > > > seems fine to me. Also, max_parallel_maintenance_workers doesn't > > > affect this parameter. > > > > This was my headache when I created names for variables. Autovacuum > > initially implies parallelism, because we have several parallel a/v > > workers. > > I'm not sure if it's parallelism. We can have multiple autovacuum > workers simultaneously working on different tables, which seems not > parallelism to me. Hm, I didn't thought about the 'parallelism' definition in this way. But I see your point - the next v4 patch will contain the naming that you suggest. > > > So I think that parameter like > > `autovacuum_max_parallel_workers` will confuse somebody. > > If we want to have a more specific name, I would prefer > > `max_parallel_index_autovacuum_workers`. > > It's better not to use 'index' as we're trying to extend parallel > vacuum to heap scanning/vacuuming as well[1]. OK, I'll fix it. > > > + /* > > > + * If we are running autovacuum - decide whether we need to process indexes > > > + * of table with given oid in parallel. > > > + */ > > > + if (AmAutoVacuumWorkerProcess() && > > > + params->index_cleanup != VACOPTVALUE_DISABLED && > > > + RelationAllowsParallelIdxAutovac(rel)) > > > > > > I think that this should be done in autovacuum code. > > > > We need params->index cleanup variable to decide whether we need to > > use parallel index a/v. In autovacuum.c we have this code : > > *** > > /* > > * index_cleanup and truncate are unspecified at first in autovacuum. > > * They will be filled in with usable values using their reloptions > > * (or reloption defaults) later. > > */ > > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED; > > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED; > > *** > > This variable is filled in inside the `vacuum_rel` function, so I > > think we should keep the above logic in vacuum.c. > > I guess that we can specify the parallel degree even if index_cleanup > is still UNSPECIFIED. vacuum_rel() would then decide whether to use > index vacuuming and vacuumlazy.c would decide whether to use parallel > vacuum based on the specified parallel degree and index_cleanup value. > > > > > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024 > > > > > > These fixed values really useful in common cases? I think we already > > > have an optimization where we skip vacuum indexes if the table has > > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES). > > > > When we allocate dead items (and optionally init parallel autocuum) we > > don't have sane value for `vacrel->lpdead_item_pages` (which should be > > compared with BYPASS_THRESHOLD_PAGES). > > The only criterion that we can focus on is the number of dead tuples > > indicated in the PgStat_StatTabEntry. > > My point is that this criterion might not be useful. We have the > bypass optimization for index vacuuming and having many dead tuples > doesn't necessarily mean index vacuuming taking a long time. 
For > example, even if the table has a few dead tuples, index vacuuming > could take a very long time and parallel index vacuuming would help > the situation, if the table is very large and has many indexes. That sounds reasonable. I'll fix it. > > But autovacuum (as I think) should work as stable as possible and > > `unnoticed` by other processes. Thus, we must : > > 1) Compute resources (such as the number of parallel workers for a > > single table's indexes vacuuming) as efficiently as possible. > > 2) Provide a guarantee that as many tables as possible (among > > requested) will be processed in parallel. > > > > (1) can be achieved by calculating the parameters on the fly. > > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more > > accurate value in the near future. > > I think it requires more things than the number of indexes on the > table to achieve (1). Suppose that there is a very large table that > gets updates heavily and has a few indexes. If users want to avoid the > table from being bloated, it would be a reasonable idea to use > parallel vacuum during autovacuum and it would not be a good idea to > disallow using parallel vacuum solely because it doesn't have more > than 30 indexes. On the other hand, if the table had got many updates > but not so now, users might want to use resources for autovacuums on > other tables. We might need to consider autovacuum frequencies per > table, the statistics of the previous autovacuum, or system loads etc. > So I think that in order to achieve (1) we might need more statistics > and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine. > It's hard for me to imagine exactly how extended statistics will help us track such situations. It seems that for any of our heuristics, it will be possible to come up with a counter example. Maybe we can give advices (via logs) to the user? But for such an idea, tests should be conducted so that we can understand when resource consumption becomes ineffective. I guess that we need to agree on an implementation before conducting such tests. > > (2) can be achieved by workers reserving - we know that N workers > > (from bgworkers pool) are *always* at our disposal. And when we use > > such workers we are not dependent on other operations in the cluster > > and we don't interfere with other operations by taking resources away > > from them. > > Reserving some bgworkers for autovacuum could make sense. But I think > it's better to implement it in a general way as it could be useful in > other use cases too. That is, it might be a good to implement > infrastructure so that any PostgreSQL code (possibly including > extensions) can request allocating a pool of bgworkers for specific > usage and use bgworkers from them. Reserving infrastructure is an ambitious idea. I am not sure that we should implement it within this thread and feature. Maybe we should create a separate thread for it and as a justification, refer to parallel autovacuum? ----- Thanks everybody for feedback! I attach a v4 patch to this letter. Main features : 1) 'parallel_autovacuum_workers' reloption - integer value, that sets the maximum number of parallel a/v workers that can be taken from bgworkers pool in order to process this table. 2) 'max_parallel_autovacuum_workers' - GUC variable, that sets the maximum total number of parallel a/v workers, that can be taken from bgworkers pool. 3) Parallel autovacuum does not try to use thresholds like NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD. 
4) Parallel autovacuum can now report statistics like "planned vs. launched". 5) For now I got rid of the 'reserving' idea, so autovacuum leaders compete with everyone else for parallel workers from the bgworkers pool. What do you think about this implementation? -- Best regards, Daniil Davydov
On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > Hi, > > On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > > > > > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > I find that the name "autovacuum_reserved_workers_num" is generic. It > > > > would be better to have a more specific name for parallel vacuum such > > > > as autovacuum_max_parallel_workers. This parameter is related to > > > > neither autovacuum_worker_slots nor autovacuum_max_workers, which > > > > seems fine to me. Also, max_parallel_maintenance_workers doesn't > > > > affect this parameter. > > > > > > This was my headache when I created names for variables. Autovacuum > > > initially implies parallelism, because we have several parallel a/v > > > workers. > > > > I'm not sure if it's parallelism. We can have multiple autovacuum > > workers simultaneously working on different tables, which seems not > > parallelism to me. > > Hm, I didn't thought about the 'parallelism' definition in this way. > But I see your point - the next v4 patch will contain the naming that > you suggest. > > > > > > So I think that parameter like > > > `autovacuum_max_parallel_workers` will confuse somebody. > > > If we want to have a more specific name, I would prefer > > > `max_parallel_index_autovacuum_workers`. > > > > It's better not to use 'index' as we're trying to extend parallel > > vacuum to heap scanning/vacuuming as well[1]. > > OK, I'll fix it. > > > > > + /* > > > > + * If we are running autovacuum - decide whether we need to process indexes > > > > + * of table with given oid in parallel. > > > > + */ > > > > + if (AmAutoVacuumWorkerProcess() && > > > > + params->index_cleanup != VACOPTVALUE_DISABLED && > > > > + RelationAllowsParallelIdxAutovac(rel)) > > > > > > > > I think that this should be done in autovacuum code. > > > > > > We need params->index cleanup variable to decide whether we need to > > > use parallel index a/v. In autovacuum.c we have this code : > > > *** > > > /* > > > * index_cleanup and truncate are unspecified at first in autovacuum. > > > * They will be filled in with usable values using their reloptions > > > * (or reloption defaults) later. > > > */ > > > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED; > > > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED; > > > *** > > > This variable is filled in inside the `vacuum_rel` function, so I > > > think we should keep the above logic in vacuum.c. > > > > I guess that we can specify the parallel degree even if index_cleanup > > is still UNSPECIFIED. vacuum_rel() would then decide whether to use > > index vacuuming and vacuumlazy.c would decide whether to use parallel > > vacuum based on the specified parallel degree and index_cleanup value. > > > > > > > > > +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024 > > > > > > > > These fixed values really useful in common cases? I think we already > > > > have an optimization where we skip vacuum indexes if the table has > > > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES). > > > > > > When we allocate dead items (and optionally init parallel autocuum) we > > > don't have sane value for `vacrel->lpdead_item_pages` (which should be > > > compared with BYPASS_THRESHOLD_PAGES). > > > The only criterion that we can focus on is the number of dead tuples > > > indicated in the PgStat_StatTabEntry. 
> > > > My point is that this criterion might not be useful. We have the > > bypass optimization for index vacuuming and having many dead tuples > > doesn't necessarily mean index vacuuming taking a long time. For > > example, even if the table has a few dead tuples, index vacuuming > > could take a very long time and parallel index vacuuming would help > > the situation, if the table is very large and has many indexes. > > That sounds reasonable. I'll fix it. > > > > But autovacuum (as I think) should work as stable as possible and > > > `unnoticed` by other processes. Thus, we must : > > > 1) Compute resources (such as the number of parallel workers for a > > > single table's indexes vacuuming) as efficiently as possible. > > > 2) Provide a guarantee that as many tables as possible (among > > > requested) will be processed in parallel. > > > > > > (1) can be achieved by calculating the parameters on the fly. > > > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more > > > accurate value in the near future. > > > > I think it requires more things than the number of indexes on the > > table to achieve (1). Suppose that there is a very large table that > > gets updates heavily and has a few indexes. If users want to avoid the > > table from being bloated, it would be a reasonable idea to use > > parallel vacuum during autovacuum and it would not be a good idea to > > disallow using parallel vacuum solely because it doesn't have more > > than 30 indexes. On the other hand, if the table had got many updates > > but not so now, users might want to use resources for autovacuums on > > other tables. We might need to consider autovacuum frequencies per > > table, the statistics of the previous autovacuum, or system loads etc. > > So I think that in order to achieve (1) we might need more statistics > > and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine. > > > > It's hard for me to imagine exactly how extended statistics will help > us track such situations. > It seems that for any of our heuristics, it will be possible to come > up with a counter example. > Maybe we can give advices (via logs) to the user? But for such an > idea, tests should be conducted so that we can understand when > resource consumption becomes ineffective. > I guess that we need to agree on an implementation before conducting such tests. > > > > (2) can be achieved by workers reserving - we know that N workers > > > (from bgworkers pool) are *always* at our disposal. And when we use > > > such workers we are not dependent on other operations in the cluster > > > and we don't interfere with other operations by taking resources away > > > from them. > > > > Reserving some bgworkers for autovacuum could make sense. But I think > > it's better to implement it in a general way as it could be useful in > > other use cases too. That is, it might be a good to implement > > infrastructure so that any PostgreSQL code (possibly including > > extensions) can request allocating a pool of bgworkers for specific > > usage and use bgworkers from them. > > Reserving infrastructure is an ambitious idea. I am not sure that we > should implement it within this thread and feature. > Maybe we should create a separate thread for it and as a > justification, refer to parallel autovacuum? > > ----- > Thanks everybody for feedback! I attach a v4 patch to this letter. 
> Main features : > 1) 'parallel_autovacuum_workers' reloption - integer value, that sets > the maximum number of parallel a/v workers that can be taken from > bgworkers pool in order to process this table. > 2) 'max_parallel_autovacuum_workers' - GUC variable, that sets the > maximum total number of parallel a/v workers, that can be taken from > bgworkers pool. > 3) Parallel autovacuum does not try to use thresholds like > NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD. > 4) Parallel autovacuum now can report statistics like "planned vs. launched". > 5) For now I got rid of the 'reserving' idea, so now autovacuum > leaders are competing with everyone for parallel workers from the > bgworkers pool. > > What do you think about this implementation? > I think it basically makes sense to me. A few comments: --- The patch implements max_parallel_autovacuum_workers as a PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we don't necessarily need to make it a PGC_POSTMATER since it actually doesn't affect how much shared memory we need to allocate. --- I think it's better to have the prefix "autovacuum" for the new GUC parameter for better consistency with other autovacuum-related GUC parameters. --- #include "storage/spin.h" @@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt) { WaitForParallelWorkersToFinish(pcxt); WaitForParallelWorkersToExit(pcxt); + + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched); + pcxt->nworkers_launched = 0; if (pcxt->known_attached_workers) { @@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt) */ HOLD_INTERRUPTS(); WaitForParallelWorkersToExit(pcxt); + + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched); + RESUME_INTERRUPTS(); I think that it's better to release workers in vacuumparallel.c rather than parallel.c. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
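The memory-budget suggestion above could look roughly like this, following the existing pattern in dead_items_alloc() (a sketch, not the patch itself):

	int vac_work_mem = AmAutoVacuumWorkerProcess() &&
		autovacuum_work_mem != -1 ?
		autovacuum_work_mem : maintenance_work_mem;

	/* Same formula for both paths; only the memory budget differs. */
	shared->maintenance_work_mem_worker =
		(nindexes_mwm > 0) ?
		vac_work_mem / Min(parallel_workers, nindexes_mwm) :
		vac_work_mem;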
On Wed Jun 18, 2025 at 5:03 AM -03, Daniil Davydov wrote: > > Thanks for the review! Please, see v5 patch : > 1) GUC variable and field in autovacuum shmem are renamed > 2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to > vacuumparallel.c > 3) max_parallel_autovacuum_workers is now PGC_SIGHUP parameter > 4) Fix little bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735) > Thanks for the new version! The "autovacuum_max_parallel_workers" declared on guc_tables.c mention that is capped by "max_worker_process": + { + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."), + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), + }, + &autovacuum_max_parallel_workers, + 0, 0, MAX_BACKENDS, + check_autovacuum_max_parallel_workers, NULL, NULL + }, But the postgresql.conf.sample say that it is limited by max_parallel_workers: +#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers IIUC the code, it cap by "max_worker_process", but Masahiko has mention on [1] that it should be capped by max_parallel_workers. --- We actually capping the autovacuum_max_parallel_workers by max_worker_process-1, so we can't have 10 max_worker_process and 10 autovacuum_max_parallel_workers. Is that correct? +bool +check_autovacuum_max_parallel_workers(int *newval, void **extra, + GucSource source) +{ + if (*newval >= max_worker_processes) + return false; + return true; +} --- Locking unnecessary the AutovacuumLock if none if the if's is true can cause some performance issue here? I don't think that this would be a serious problem because this code will only be called if the configuration file is changed during the autovacuum execution right? But I could be wrong, so just sharing my thoughts on this (still learning about [auto]vacuum code). + +/* + * Make sure that number of available parallel workers corresponds to the + * autovacuum_max_parallel_workers parameter (after it was changed). + */ +static void +check_parallel_av_gucs(int prev_max_parallel_workers) +{ + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); + + if (AutoVacuumShmem->av_available_parallel_workers > + autovacuum_max_parallel_workers) + { + Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers); + Typo on "exeed" + /* + * Number of available workers must not exeed limit. + * + * Note, that if some parallel autovacuum workers are running at this + * moment, available workers number will not exeed limit after releasing + * them (see ParallelAutoVacuumReleaseWorkers). + */ --- I'm not seeing an usage of this macro? +/* + * RelationGetParallelAutovacuumWorkers + * Returns the relation's parallel_autovacuum_workers reloption setting. + * Note multiple eval of argument! + */ +#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \ + ((relation)->rd_options ? \ + ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \ + (defaultpw)) + --- Also pgindent is needed on some files. --- I've made some tests and I can confirm that is working correctly for what I can see. I think that would be to start include the documentation changes, what do you think? [1] https://www.postgresql.org/message-id/CAD21AoAxTkpkLtJDgrH9dXg_h%2ByzOZpOZj3B-4FjW1Mr4qEdbQ%40mail.gmail.com -- Matheus Alcantara
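If the hard cap against max_worker_processes is kept, the check hook could also explain why a value is rejected instead of leaving users to search the logs; a sketch based on the hook quoted above:

	bool
	check_autovacuum_max_parallel_workers(int *newval, void **extra,
										  GucSource source)
	{
		if (*newval >= max_worker_processes)
		{
			GUC_check_errdetail("\"autovacuum_max_parallel_workers\" must be less than \"max_worker_processes\".");
			return false;
		}
		return true;
	}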
Hi, On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara <matheusssilv97@gmail.com> wrote: > > The "autovacuum_max_parallel_workers" declared on guc_tables.c mention > that is capped by "max_worker_process": > + { > + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, > + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."), > + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), > + }, > + &autovacuum_max_parallel_workers, > + 0, 0, MAX_BACKENDS, > + check_autovacuum_max_parallel_workers, NULL, NULL > + }, > > IIUC the code, it cap by "max_worker_process", but Masahiko has mention > on [1] that it should be capped by max_parallel_workers. > Thanks for looking into it! To be honest, I don't think that this parameter should be explicitly capped at all. Other parallel operations (for example parallel index build or VACUUM PARALLEL) just request as many workers as they want without looking at 'max_parallel_workers'. And they will not complain, if not all requested workers were launched. Thus, even if 'autovacuum_max_parallel_workers' is higher than 'max_parallel_workers' the worst that can happen is that not all requested workers will be running (which is a common situation). Users can handle it by looking for logs like "planned vs. launched" and increasing 'max_parallel_workers' if needed. On the other hand, obviously it doesn't make sense to request more workers than 'max_worker_processes' (moreover, this parameter cannot be changed as easily as 'max_parallel_workers'). I will keep the 'max_worker_processes' limit, so autovacuum will not waste time initializing a parallel context if there is no chance that the request will succeed. But it's worth remembering that actually the 'autovacuum_max_parallel_workers' parameter will always be implicitly capped by 'max_parallel_workers'. What do you think about it? > But the postgresql.conf.sample say that it is limited by > max_parallel_workers: > +#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers Good catch, I'll fix it. > --- > > We actually capping the autovacuum_max_parallel_workers by > max_worker_process-1, so we can't have 10 max_worker_process and 10 > autovacuum_max_parallel_workers. Is that correct? Yep. The explanation can be found just above in this letter. > --- > > Locking unnecessary the AutovacuumLock if none if the if's is true can > cause some performance issue here? I don't think that this would be a > serious problem because this code will only be called if the > configuration file is changed during the autovacuum execution right? But > I could be wrong, so just sharing my thoughts on this (still learning > about [auto]vacuum code). > > + > +/* > + * Make sure that number of available parallel workers corresponds to the > + * autovacuum_max_parallel_workers parameter (after it was changed). > + */ > +static void > +check_parallel_av_gucs(int prev_max_parallel_workers) > +{ > + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); > + > + if (AutoVacuumShmem->av_available_parallel_workers > > + autovacuum_max_parallel_workers) > + { > + Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers); > + > This function may be called by a/v launcher when we already have some a/v workers running. A/v workers can change the AutoVacuumShmem->av_available_parallel_workers value, so I think we should acquire appropriate lock before reading it. 
> Typo on "exeed" > > + /* > + * Number of available workers must not exeed limit. > + * > + * Note, that if some parallel autovacuum workers are running at this > + * moment, available workers number will not exeed limit after releasing > + * them (see ParallelAutoVacuumReleaseWorkers). > + */ Oops. I'll fix it. > --- > > I'm not seeing an usage of this macro? > +/* > + * RelationGetParallelAutovacuumWorkers > + * Returns the relation's parallel_autovacuum_workers reloption setting. > + * Note multiple eval of argument! > + */ > +#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \ > + ((relation)->rd_options ? \ > + ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \ > + (defaultpw)) > + > Yes, this is the relic of a past implementation. I'll delete this macro. > > I've made some tests and I can confirm that is working correctly for > what I can see. I think that would be to start include the documentation > changes, what do you think? > It sounds tempting :) But perhaps first we should agree on the limitation of the 'autovacuum_max_parallel_workers' parameter. Please, see v6 patches : 1) Fixed typos in autovacuum.c and postgresql.conf.sample 2) Removed unused macro 'RelationGetParallelAutovacuumWorkers' -- Best regards, Daniil Davydov
On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote: >> The "autovacuum_max_parallel_workers" declared on guc_tables.c mention >> that is capped by "max_worker_process": >> + { >> + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, >> + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkerspool."), >> + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), >> + }, >> + &autovacuum_max_parallel_workers, >> + 0, 0, MAX_BACKENDS, >> + check_autovacuum_max_parallel_workers, NULL, NULL >> + }, >> >> IIUC the code, it cap by "max_worker_process", but Masahiko has mention >> on [1] that it should be capped by max_parallel_workers. > To be honest, I don't think that this parameter should be explicitly > capped at all. > Other parallel operations (for example parallel index build or VACUUM > PARALLEL) just request as many workers as they want without looking at > 'max_parallel_workers'. > And they will not complain, if not all requested workers were launched. > > Thus, even if 'autovacuum_max_parallel_workers' is higher than > 'max_parallel_workers' the worst that can happen is that not all > requested workers will be running (which is a common situation). > Users can handle it by looking for logs like "planned vs. launched" > and increasing 'max_parallel_workers' if needed. > > On the other hand, obviously it doesn't make sense to request more > workers than 'max_worker_processes' (moreover, this parameter cannot > be changed as easily as 'max_parallel_workers'). > > I will keep the 'max_worker_processes' limit, so autovacuum will not > waste time initializing a parallel context if there is no chance that > the request will succeed. > But it's worth remembering that actually the > 'autovacuum_max_parallel_workers' parameter will always be implicitly > capped by 'max_parallel_workers'. > > What do you think about it? > It make sense to me. The main benefit that I see on capping autovacuum_max_parallel_workers parameter is that users will see "invalid value for parameter "autovacuum_max_parallel_workers"" error on logs instead of need to search for "planned vs. launched", which can be trick if log_min_messages is not set to at least the info level (the default warning level will not show this log message). If we decide to not cap this on code I think that at least would be good to mention this on documentation. >> >> I've made some tests and I can confirm that is working correctly for >> what I can see. I think that would be to start include the documentation >> changes, what do you think? >> > > It sounds tempting :) > But perhaps first we should agree on the limitation of the > 'autovacuum_max_parallel_workers' parameter. > Agree > Please, see v6 patches : > 1) Fixed typos in autovacuum.c and postgresql.conf.sample > 2) Removed unused macro 'RelationGetParallelAutovacuumWorkers' > Thanks! -- Matheus Alcantara
Hi, On Tue, Jul 8, 2025 at 10:20 PM Matheus Alcantara <matheusssilv97@gmail.com> wrote: > > On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote: > > I will keep the 'max_worker_processes' limit, so autovacuum will not > > waste time initializing a parallel context if there is no chance that > > the request will succeed. > > But it's worth remembering that actually the > > 'autovacuum_max_parallel_workers' parameter will always be implicitly > > capped by 'max_parallel_workers'. > > > > What do you think about it? > > > > It make sense to me. The main benefit that I see on capping > autovacuum_max_parallel_workers parameter is that users will see > "invalid value for parameter "autovacuum_max_parallel_workers"" error on > logs instead of need to search for "planned vs. launched", which can be > trick if log_min_messages is not set to at least the info level (the > default warning level will not show this log message). > I think I can refer to (for example) 'max_parallel_workers_per_gather' parameter, which allows setting values higher than 'max_parallel_workers' without throwing an error or warning. 'autovacuum_max_parallel_workers' will behave the same way. > If we decide to not cap this on code I think that at least would be good to mention this > on documentation. Sure, it is worth noticing in documentation. -- Best regards, Daniil Davydov
On Sun, Jul 6, 2025 at 5:00 PM Daniil Davydov <3danissimo@gmail.com> wrote: > > Hi, > > On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara > <matheusssilv97@gmail.com> wrote: > > > > The "autovacuum_max_parallel_workers" declared on guc_tables.c mention > > that is capped by "max_worker_process": > > + { > > + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, > > + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkerspool."), > > + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), > > + }, > > + &autovacuum_max_parallel_workers, > > + 0, 0, MAX_BACKENDS, > > + check_autovacuum_max_parallel_workers, NULL, NULL > > + }, > > > > IIUC the code, it cap by "max_worker_process", but Masahiko has mention > > on [1] that it should be capped by max_parallel_workers. > > > > Thanks for looking into it! > > To be honest, I don't think that this parameter should be explicitly > capped at all. > Other parallel operations (for example parallel index build or VACUUM > PARALLEL) just request as many workers as they want without looking at > 'max_parallel_workers'. > And they will not complain, if not all requested workers were launched. > > Thus, even if 'autovacuum_max_parallel_workers' is higher than > 'max_parallel_workers' the worst that can happen is that not all > requested workers will be running (which is a common situation). > Users can handle it by looking for logs like "planned vs. launched" > and increasing 'max_parallel_workers' if needed. > > On the other hand, obviously it doesn't make sense to request more > workers than 'max_worker_processes' (moreover, this parameter cannot > be changed as easily as 'max_parallel_workers'). > > I will keep the 'max_worker_processes' limit, so autovacuum will not > waste time initializing a parallel context if there is no chance that > the request will succeed. > But it's worth remembering that actually the > 'autovacuum_max_parallel_workers' parameter will always be implicitly > capped by 'max_parallel_workers'. > > What do you think about it? > > > But the postgresql.conf.sample say that it is limited by > > max_parallel_workers: > > +#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers > > Good catch, I'll fix it. > > > --- > > > > We actually capping the autovacuum_max_parallel_workers by > > max_worker_process-1, so we can't have 10 max_worker_process and 10 > > autovacuum_max_parallel_workers. Is that correct? > > Yep. The explanation can be found just above in this letter. > > > --- > > > > Locking unnecessary the AutovacuumLock if none if the if's is true can > > cause some performance issue here? I don't think that this would be a > > serious problem because this code will only be called if the > > configuration file is changed during the autovacuum execution right? But > > I could be wrong, so just sharing my thoughts on this (still learning > > about [auto]vacuum code). > > > > + > > +/* > > + * Make sure that number of available parallel workers corresponds to the > > + * autovacuum_max_parallel_workers parameter (after it was changed). 
> > + */ > > +static void > > +check_parallel_av_gucs(int prev_max_parallel_workers) > > +{ > > + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); > > + > > + if (AutoVacuumShmem->av_available_parallel_workers > > > + autovacuum_max_parallel_workers) > > + { > > + Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers); > > + > > > > This function may be called by a/v launcher when we already have some > a/v workers running. > A/v workers can change the > AutoVacuumShmem->av_available_parallel_workers value, so I think we > should acquire appropriate lock before reading it. > > > Typo on "exeed" > > > > + /* > > + * Number of available workers must not exeed limit. > > + * > > + * Note, that if some parallel autovacuum workers are running at this > > + * moment, available workers number will not exeed limit after releasing > > + * them (see ParallelAutoVacuumReleaseWorkers). > > + */ > > Oops. I'll fix it. > > > --- > > > > I'm not seeing an usage of this macro? > > +/* > > + * RelationGetParallelAutovacuumWorkers > > + * Returns the relation's parallel_autovacuum_workers reloption setting. > > + * Note multiple eval of argument! > > + */ > > +#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \ > > + ((relation)->rd_options ? \ > > + ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \ > > + (defaultpw)) > > + > > > > Yes, this is the relic of a past implementation. I'll delete this macro. > > > > > I've made some tests and I can confirm that is working correctly for > > what I can see. I think that would be to start include the documentation > > changes, what do you think? > > > > It sounds tempting :) > But perhaps first we should agree on the limitation of the > 'autovacuum_max_parallel_workers' parameter. > > Please, see v6 patches : > 1) Fixed typos in autovacuum.c and postgresql.conf.sample > 2) Removed unused macro 'RelationGetParallelAutovacuumWorkers' > Thank you for updating the patch! Here are some review comments: --- - shared->maintenance_work_mem_worker = - (nindexes_mwm > 0) ? - maintenance_work_mem / Min(parallel_workers, nindexes_mwm) : - maintenance_work_mem; + + if (AmAutoVacuumWorkerProcess()) + shared->maintenance_work_mem_worker = + (nindexes_mwm > 0) ? + autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) : + autovacuum_work_mem; + else + shared->maintenance_work_mem_worker = + (nindexes_mwm > 0) ? + maintenance_work_mem / Min(parallel_workers, nindexes_mwm) : + maintenance_work_mem; Since we have a similar code in dead_items_alloc() I think it's better to follow it: int vac_work_mem = AmAutoVacuumWorkerProcess() && autovacuum_work_mem != -1 ? autovacuum_work_mem : maintenance_work_mem; That is, we calculate vac_work_mem first and then calculate shared->maintenance_work_mem_worker. I think it's more straightforward as the formula of maintenance_work_mem_worker is the same whereas the amount of memory used for vacuum and autovacuum varies. --- + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */ DestroyParallelContext(pvs->pcxt); + + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + ParallelAutoVacuumReleaseWorkers(nlaunched_workers); + Why don't we release workers before destroying the parallel context? --- @@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested, * We don't allow performing parallel operation in standalone backend or * when parallelism is disabled. 
*/ - if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0) + if (!IsUnderPostmaster || + (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) || + (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess())) return 0; /* @@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested, parallel_workers = (nrequested > 0) ? Min(nrequested, nindexes_parallel) : nindexes_parallel; - /* Cap by max_parallel_maintenance_workers */ - parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers); + /* Cap by GUC variable */ + parallel_workers = AmAutoVacuumWorkerProcess() ? + Min(parallel_workers, autovacuum_max_parallel_workers) : + Min(parallel_workers, max_parallel_maintenance_workers); return parallel_workers; How about calculating the maximum number of workers once and using it in the above both places? --- + /* Check how many workers can provide autovacuum. */ + if (AmAutoVacuumWorkerProcess() && nworkers > 0) + nworkers = ParallelAutoVacuumReserveWorkers(nworkers); + I think it's better to move this code to right after setting "nworkers = Min(nworkers, pvs->pcxt->nworkers);" as it's a more related code. The comment needs to be updated as it doesn't match what the function actually does (i.e. reserving the workers). --- /* Reinitialize parallel context to relaunch parallel workers */ if (num_index_scans > 0) + { ReinitializeParallelDSM(pvs->pcxt); + /* + * Release all launched (i.e. reserved) parallel autovacuum + * workers. + */ + if (AmAutoVacuumWorkerProcess()) + ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched); + } Why do we need to release all workers here? If there is a reason, we should mention it as a comment. --- @@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan if (vacuum) ereport(pvs->shared->elevel, - (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)", - "launched %d parallel vacuum workers for index vacuuming (planned: %d)", + (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)", + "launched %d parallel %svacuum workers for index vacuuming (planned: %d)", pvs->pcxt->nworkers_launched), - pvs->pcxt->nworkers_launched, nworkers))); + pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers))); The "%svacuum" part doesn't work in terms of translation. We need to construct the whole sentence instead. But do we need this log message change in the first place? IIUC autovacuums write logs only when the execution time exceed the log_autovacuum_min_duration (or its reloption). The patch unconditionally sets LOG level for autovacuums but I'm not sure it's consistent with other autovacuum logging behavior: + int elevel = AmAutoVacuumWorkerProcess() || + vacrel->verbose ? + INFO : DEBUG2; --- - * Support routines for parallel vacuum execution. + * Support routines for parallel [auto]vacuum execution. The patch includes the change of "vacuum" -> "[auto]vacuum" in many places. While I think we need to mention that vacuumparallel.c supports autovacuums I'm not sure we really need all of them. If we accept this style, we would require for all subsequent changes to follow it, which could increase maintenance costs. 
--- @@ -299,6 +301,7 @@ typedef struct WorkerInfo av_startingWorker; AutoVacuumWorkItem av_workItems[NUM_WORKITEMS]; pg_atomic_uint32 av_nworkersForBalance; + uint32 av_available_parallel_workers; Other field names seem to have consistent naming rules; 'av_' prefix followed by name in camel case. So how about renaming it to av_freeParallelWorkers or something along those lines? --- +int +ParallelAutoVacuumReserveWorkers(int nworkers) +{ Other exposed functions have "AutoVacuum" prefix, so how about renaming it to AutoVacuumReserveParallelWorkers() or something along those lines? --- + if (AutoVacuumShmem->av_available_parallel_workers < nworkers) + { + /* Provide as many workers as we can. */ + can_launch = AutoVacuumShmem->av_available_parallel_workers; + AutoVacuumShmem->av_available_parallel_workers = 0; + } + else + { + /* OK, we can provide all requested workers. */ + can_launch = nworkers; + AutoVacuumShmem->av_available_parallel_workers -= nworkers; + } Can we simplify this logic as follows? can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers); AutoVacuumShmem->av_available_parallel_workers -= can_launch; Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
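Putting the renaming and the Min() simplification together, the reserve/release pair might end up looking roughly like this; the function and field names follow the suggestions above and are not committed API, and the Min() on release is only one way to keep the counter consistent if autovacuum_max_parallel_workers was lowered while workers were out:

	int
	AutoVacuumReserveParallelWorkers(int nworkers)
	{
		int can_launch;

		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
		can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
		AutoVacuumShmem->av_freeParallelWorkers -= can_launch;
		LWLockRelease(AutovacuumLock);

		return can_launch;
	}

	void
	AutoVacuumReleaseParallelWorkers(int nworkers)
	{
		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
		AutoVacuumShmem->av_freeParallelWorkers =
			Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
				autovacuum_max_parallel_workers);
		LWLockRelease(AutovacuumLock);
	}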
Hi, On Mon, Jul 14, 2025 at 2:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > --- > - shared->maintenance_work_mem_worker = > - (nindexes_mwm > 0) ? > - maintenance_work_mem / Min(parallel_workers, nindexes_mwm) : > - maintenance_work_mem; > + > + if (AmAutoVacuumWorkerProcess()) > + shared->maintenance_work_mem_worker = > + (nindexes_mwm > 0) ? > + autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) : > + autovacuum_work_mem; > + else > + shared->maintenance_work_mem_worker = > + (nindexes_mwm > 0) ? > + maintenance_work_mem / Min(parallel_workers, nindexes_mwm) : > + maintenance_work_mem; > > Since we have a similar code in dead_items_alloc() I think it's better > to follow it: > > int vac_work_mem = AmAutoVacuumWorkerProcess() && > autovacuum_work_mem != -1 ? > autovacuum_work_mem : maintenance_work_mem; > > That is, we calculate vac_work_mem first and then calculate > shared->maintenance_work_mem_worker. I think it's more straightforward > as the formula of maintenance_work_mem_worker is the same whereas the > amount of memory used for vacuum and autovacuum varies. > I was confused by the fact that initially maintenance_work_mem was used for calculations, not vac_work_mem. I agree that we should better use already calculated vac_work_mem value. > --- > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */ > DestroyParallelContext(pvs->pcxt); > + > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > + if (AmAutoVacuumWorkerProcess()) > + ParallelAutoVacuumReleaseWorkers(nlaunched_workers); > + > > Why don't we release workers before destroying the parallel context? > Destroying parallel context includes waiting for all workers to exit (after which, other operations can use them). If we first call ParallelAutoVacuumReleaseWorkers, some operation can reasonably request all released workers. But this request can fail, because there is no guarantee that workers managed to finish. Actually, there's nothing wrong with that, but I think releasing workers only after finishing work is a more logical approach. > --- > @@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, > int nindexes, int nrequested, > * We don't allow performing parallel operation in standalone backend or > * when parallelism is disabled. > */ > - if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0) > + if (!IsUnderPostmaster || > + (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) || > + (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess())) > return 0; > > /* > @@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation > *indrels, int nindexes, int nrequested, > parallel_workers = (nrequested > 0) ? > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > - /* Cap by max_parallel_maintenance_workers */ > - parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers); > + /* Cap by GUC variable */ > + parallel_workers = AmAutoVacuumWorkerProcess() ? > + Min(parallel_workers, autovacuum_max_parallel_workers) : > + Min(parallel_workers, max_parallel_maintenance_workers); > > return parallel_workers; > > How about calculating the maximum number of workers once and using it > in the above both places? > Agree. Good idea. > --- > + /* Check how many workers can provide autovacuum. 
*/ > + if (AmAutoVacuumWorkerProcess() && nworkers > 0) > + nworkers = ParallelAutoVacuumReserveWorkers(nworkers); > + > > I think it's better to move this code to right after setting "nworkers > = Min(nworkers, pvs->pcxt->nworkers);" as it's a more related code. > > The comment needs to be updated as it doesn't match what the function > actually does (i.e. reserving the workers). > You are right, I'll fix it. > --- > /* Reinitialize parallel context to relaunch parallel workers */ > if (num_index_scans > 0) > + { > ReinitializeParallelDSM(pvs->pcxt); > > + /* > + * Release all launched (i.e. reserved) parallel autovacuum > + * workers. > + */ > + if (AmAutoVacuumWorkerProcess()) > + ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched); > + } > > Why do we need to release all workers here? If there is a reason, we > should mention it as a comment. > Hm, I guess it was left over from previous patch versions. Actually we don't need to release workers here, as we will try to launch them immediately. It is a bug, thank you for noticing it. > --- > @@ -706,16 +751,16 @@ > parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int > num_index_scan > > if (vacuum) > ereport(pvs->shared->elevel, > - (errmsg(ngettext("launched %d parallel vacuum > worker for index vacuuming (planned: %d)", > - "launched %d parallel vacuum > workers for index vacuuming (planned: %d)", > + (errmsg(ngettext("launched %d parallel %svacuum > worker for index vacuuming (planned: %d)", > + "launched %d parallel %svacuum > workers for index vacuuming (planned: %d)", > pvs->pcxt->nworkers_launched), > - pvs->pcxt->nworkers_launched, nworkers))); > + pvs->pcxt->nworkers_launched, > AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers))); > > The "%svacuum" part doesn't work in terms of translation. We need to > construct the whole sentence instead. > But do we need this log message > change in the first place? IIUC autovacuums write logs only when the > execution time exceed the log_autovacuum_min_duration (or its > reloption). The patch unconditionally sets LOG level for autovacuums > but I'm not sure it's consistent with other autovacuum logging > behavior: > > + int elevel = AmAutoVacuumWorkerProcess() || > + vacrel->verbose ? > + INFO : DEBUG2; > > This log level is used only "for messages about parallel workers launched". I think that such logs relate more to the parallel workers module than autovacuum itself. Moreover, if we emit log "planned vs. launched" each time, it will simplify the task of selecting the optimal value of 'autovacuum_max_parallel_workers' parameter. What do you think? About "%svacuum" - I guess we need to clarify what exactly the workers were launched for. I'll add errhint to this log, but I don't know whether such approach is acceptable. > - * Support routines for parallel vacuum execution. > + * Support routines for parallel [auto]vacuum execution. > > The patch includes the change of "vacuum" -> "[auto]vacuum" in many > places. While I think we need to mention that vacuumparallel.c > supports autovacuums I'm not sure we really need all of them. If we > accept this style, we would require for all subsequent changes to > follow it, which could increase maintenance costs. > Agree. I'll leave a comment which says that vacuumparallel also supports parallel autovacuum. All other changes like "[auto]vacuum" will be deleted. 
> --- > @@ -299,6 +301,7 @@ typedef struct > WorkerInfo av_startingWorker; > AutoVacuumWorkItem av_workItems[NUM_WORKITEMS]; > pg_atomic_uint32 av_nworkersForBalance; > + uint32 av_available_parallel_workers; > > Other field names seem to have consistent naming rules; 'av_' prefix > followed by name in camel case. So how about renaming it to > av_freeParallelWorkers or something along those lines? > > --- > +int > +ParallelAutoVacuumReserveWorkers(int nworkers) > +{ > > Other exposed functions have "AutoVacuum" prefix, so how about > renaming it to AutoVacuumReserveParallelWorkers() or something along > those lines? > Agreeing with both comments, I'll rename the structure field and functions. > --- > + if (AutoVacuumShmem->av_available_parallel_workers < nworkers) > + { > + /* Provide as many workers as we can. */ > + can_launch = AutoVacuumShmem->av_available_parallel_workers; > + AutoVacuumShmem->av_available_parallel_workers = 0; > + } > + else > + { > + /* OK, we can provide all requested workers. */ > + can_launch = nworkers; > + AutoVacuumShmem->av_available_parallel_workers -= nworkers; > + } > > Can we simplify this logic as follows? > > can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers); > AutoVacuumShmem->av_available_parallel_workers -= can_launch; > Sure, I'll simplify it. --- Thank you very much for your comments! Please, see v7 patch : 1) Rename few functions and variables + get rid of comments like "[auto]vacuum" in vacuumparallel.c 2) Simplified logic in 'parallel_vacuum_init' and 'AutoVacuumReserveParallelWorkers' functions 3) Refactor and bug fix in 'parallel_vacuum_process_all_indexes' function 4) Change "planned vs. launched" logging, so it can be translated 5) Rebased on newest commit in master branch -- Best regards, Daniil Davydov
Attachments
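As a concrete illustration of the dead_items_alloc() pattern mentioned above, parallel_vacuum_init() could compute the memory budget first and keep a single formula; this is only a sketch of the suggested shape:

    int     vac_work_mem = AmAutoVacuumWorkerProcess() &&
        autovacuum_work_mem != -1 ?
        autovacuum_work_mem : maintenance_work_mem;

    /*
     * The formula is the same for vacuum and autovacuum; only the memory
     * budget it is based on differs.
     */
    shared->maintenance_work_mem_worker =
        (nindexes_mwm > 0) ?
        vac_work_mem / Min(parallel_workers, nindexes_mwm) :
        vac_work_mem;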
On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > > > --- > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */ > > DestroyParallelContext(pvs->pcxt); > > + > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > > + if (AmAutoVacuumWorkerProcess()) > > + ParallelAutoVacuumReleaseWorkers(nlaunched_workers); > > + > > > > Why don't we release workers before destroying the parallel context? > > > > Destroying parallel context includes waiting for all workers to exit (after > which, other operations can use them). > If we first call ParallelAutoVacuumReleaseWorkers, some operation can > reasonably request all released workers. But this request can fail, > because there is no guarantee that workers managed to finish. > > Actually, there's nothing wrong with that, but I think releasing workers > only after finishing work is a more logical approach. > > > --- > > @@ -706,16 +751,16 @@ > > parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int > > num_index_scan > > > > if (vacuum) > > ereport(pvs->shared->elevel, > > - (errmsg(ngettext("launched %d parallel vacuum > > worker for index vacuuming (planned: %d)", > > - "launched %d parallel vacuum > > workers for index vacuuming (planned: %d)", > > + (errmsg(ngettext("launched %d parallel %svacuum > > worker for index vacuuming (planned: %d)", > > + "launched %d parallel %svacuum > > workers for index vacuuming (planned: %d)", > > pvs->pcxt->nworkers_launched), > > - pvs->pcxt->nworkers_launched, nworkers))); > > + pvs->pcxt->nworkers_launched, > > AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers))); > > > > The "%svacuum" part doesn't work in terms of translation. We need to > > construct the whole sentence instead. > > But do we need this log message > > change in the first place? IIUC autovacuums write logs only when the > > execution time exceed the log_autovacuum_min_duration (or its > > reloption). The patch unconditionally sets LOG level for autovacuums > > but I'm not sure it's consistent with other autovacuum logging > > behavior: > > > > + int elevel = AmAutoVacuumWorkerProcess() || > > + vacrel->verbose ? > > + INFO : DEBUG2; > > > > > > This log level is used only "for messages about parallel workers launched". > I think that such logs relate more to the parallel workers module than > autovacuum itself. Moreover, if we emit log "planned vs. launched" each > time, it will simplify the task of selecting the optimal value of > 'autovacuum_max_parallel_workers' parameter. What do you think? INFO level is normally not sent to the server log. And regarding autovacuums, they don't write any log mentioning it started. If we want to write planned vs. launched I think it's better to gather these statistics during execution and write it together with other existing logs. > > About "%svacuum" - I guess we need to clarify what exactly the workers > were launched for. I'll add errhint to this log, but I don't know whether such > approach is acceptable. I'm not sure errhint is an appropriate place. If we write such information together with other existing autovacuum logs as I suggested above, I think we don't need to add such information to this log message. I've reviewed v7 patch and here are some comments: + { + { + "parallel_autovacuum_workers", + "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. 
" + "If value is 0 then parallel degree will computed based on number of indexes.", + RELOPT_KIND_HEAP, + ShareUpdateExclusiveLock + }, + -1, -1, 1024 + }, Many autovacuum related reloptions have the prefix "autovacuum". So how about renaming it to autovacuum_parallel_worker (change check_parallel_av_gucs() name too accordingly)? --- +bool +check_autovacuum_max_parallel_workers(int *newval, void **extra, + GucSource source) +{ + if (*newval >= max_worker_processes) + return false; + return true; +} I think we don't need to strictly check the autovacuum_max_parallel_workers value. Instead, we can accept any integer value but internally cap by max_worker_processes. --- +/* + * Make sure that number of available parallel workers corresponds to the + * autovacuum_max_parallel_workers parameter (after it was changed). + */ +static void +check_parallel_av_gucs(int prev_max_parallel_workers) +{ I think this function doesn't just check the value but does adjust the number of available workers, so how about adjust_free_parallel_workers() or something along these lines? --- + /* + * Number of available workers must not exceed limit. + * + * Note, that if some parallel autovacuum workers are running at this + * moment, available workers number will not exceed limit after + * releasing them (see ParallelAutoVacuumReleaseWorkers). + */ + AutoVacuumShmem->av_freeParallelWorkers = + autovacuum_max_parallel_workers; I think the comment refers to the following code in AutoVacuumReleaseParallelWorkers(): + /* + * If autovacuum_max_parallel_workers variable was reduced during parallel + * autovacuum execution, we must cap available workers number by its new + * value. + */ + if (AutoVacuumShmem->av_freeParallelWorkers > + autovacuum_max_parallel_workers) + { + AutoVacuumShmem->av_freeParallelWorkers = + autovacuum_max_parallel_workers; + } After the autovacuum launchers decreases av_freeParallelWorkers, it's not guaranteed that the autovacuum worker already reflects the new value from the config file when executing the AutoVacuumReleaseParallelWorkers(), which leds to skips the above codes. For example, suppose that autovacuum_max_parallel_workers is 10 and 3 parallel workers are running by one autovacuum worker (i.e., av_freeParallelWorkers = 7 now), if the user changes autovacuum_max_parallel_workers to 5, the autovacuum launchers adjust av_freeParallelWorkers to 5. However, if the worker doesn't reload the config file and executes AutoVacuumReleaseParallelWorkers(), it increases av_freeParallelWorkers to 8 and skips the adjusting logic. I've not tested this scenarios so I might be missing something though. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi, On Fri, Jul 18, 2025 at 2:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > > > This log level is used only "for messages about parallel workers launched". > > I think that such logs relate more to the parallel workers module than > > autovacuum itself. Moreover, if we emit log "planned vs. launched" each > > time, it will simplify the task of selecting the optimal value of > > 'autovacuum_max_parallel_workers' parameter. What do you think? > > INFO level is normally not sent to the server log. And regarding > autovacuums, they don't write any log mentioning it started. If we > want to write planned vs. launched I think it's better to gather these > statistics during execution and write it together with other existing > logs. > > > > > About "%svacuum" - I guess we need to clarify what exactly the workers > > were launched for. I'll add errhint to this log, but I don't know whether such > > approach is acceptable. > > I'm not sure errhint is an appropriate place. If we write such > information together with other existing autovacuum logs as I > suggested above, I think we don't need to add such information to this > log message. > I thought about it for some time and came up with this idea : 1) When gathering such statistics, we need to take into account that users might not want autovacuum to log something. Thus, we should collect statistics in "higher" level that knows about log_min_duration. 2) By analogy with the rest of the statistics, we can only accumulate a total number of planned and launched parallel workers. Alternatively, we could build an array (one element for each index scan) of "planned vs. launched". But it will make the code "dirty", and I don't sure that this will be useful. This may be a discussion point, so I will separate it to another .patch file. > I've reviewed v7 patch and here are some comments: > > + { > + { > + "parallel_autovacuum_workers", > + "Maximum number of parallel autovacuum workers that can be > taken from bgworkers pool for processing this table. " > + "If value is 0 then parallel degree will computed based on > number of indexes.", > + RELOPT_KIND_HEAP, > + ShareUpdateExclusiveLock > + }, > + -1, -1, 1024 > + }, > > Many autovacuum related reloptions have the prefix "autovacuum". So > how about renaming it to autovacuum_parallel_worker (change > check_parallel_av_gucs() name too accordingly)? > I have no objections. > --- > +bool > +check_autovacuum_max_parallel_workers(int *newval, void **extra, > + GucSource source) > +{ > + if (*newval >= max_worker_processes) > + return false; > + return true; > +} > > I think we don't need to strictly check the > autovacuum_max_parallel_workers value. Instead, we can accept any > integer value but internally cap by max_worker_processes. > I don't think that such a limitation is excessive, but I don't see similar behavior in other "max_parallel_..." GUCs, so I think we can get rid of it. I'll replace the "check hook" with an "assign hook", where autovacuum_max_parallel_workers will be limited. > --- > +/* > + * Make sure that number of available parallel workers corresponds to the > + * autovacuum_max_parallel_workers parameter (after it was changed). > + */ > +static void > +check_parallel_av_gucs(int prev_max_parallel_workers) > +{ > > I think this function doesn't just check the value but does adjust the > number of available workers, so how about > adjust_free_parallel_workers() or something along these lines? 
I agree, it's better this way. > > --- > + /* > + * Number of available workers must not exceed limit. > + * > + * Note, that if some parallel autovacuum workers are running at this > + * moment, available workers number will not exceed limit after > + * releasing them (see ParallelAutoVacuumReleaseWorkers). > + */ > + AutoVacuumShmem->av_freeParallelWorkers = > + autovacuum_max_parallel_workers; > > I think the comment refers to the following code in > AutoVacuumReleaseParallelWorkers(): > > + /* > + * If autovacuum_max_parallel_workers variable was reduced during parallel > + * autovacuum execution, we must cap available workers number by its new > + * value. > + */ > + if (AutoVacuumShmem->av_freeParallelWorkers > > + autovacuum_max_parallel_workers) > + { > + AutoVacuumShmem->av_freeParallelWorkers = > + autovacuum_max_parallel_workers; > + } > > After the autovacuum launchers decreases av_freeParallelWorkers, it's > not guaranteed that the autovacuum worker already reflects the new > value from the config file when executing the > AutoVacuumReleaseParallelWorkers(), which leds to skips the above > codes. For example, suppose that autovacuum_max_parallel_workers is 10 > and 3 parallel workers are running by one autovacuum worker (i.e., > av_freeParallelWorkers = 7 now), if the user changes > autovacuum_max_parallel_workers to 5, the autovacuum launchers adjust > av_freeParallelWorkers to 5. However, if the worker doesn't reload the > config file and executes AutoVacuumReleaseParallelWorkers(), it > increases av_freeParallelWorkers to 8 and skips the adjusting logic. > I've not tested this scenarios so I might be missing something though. > Yes, this is a possible scenario. I'll rework av_freeParallelWorkers calculation. Main change is that a/v worker now checks whether config reload is pending. Thus, it will have the relevant value of the autovacuum_max_parallel_workers parameter. Thank you very much for your comments! Please, see v8 patches: 1) Rename table option. 2) Replace check_hook with assign_hook for autovacuum_max_parallel_workers. 3) Simplify and correct logic for handling autovacuum_max_parallel_workers parameter change. 4) Rework logic with "planned vs. launched" statistics for autovacuum (see second patch file). 5) Get rid of "sandbox" - I don't see the point in continuing to drag him along. -- Best regards, Daniil Davydov
Attachments
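For reference, the renamed reloption entry might end up looking roughly like this (the limits are taken from the quoted v7 code; the shortened description is illustrative):

    {
        {
            "autovacuum_parallel_workers",
            "Maximum number of parallel autovacuum workers that can be used for processing this table",
            RELOPT_KIND_HEAP,
            ShareUpdateExclusiveLock
        },
        -1, -1, 1024
    },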
Thanks for the patches! I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and have a few comments from my initial pass. 1/ Please run pgindent. 2/ Documentation is missing. There may be more, but here are the places I found that likely need updates for the new behavior, reloptions, GUC, etc. Including docs in the patch early would help clarify expected behavior. https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM https://www.postgresql.org/docs/current/runtime-config-autovacuum.html https://www.postgresql.org/docs/current/sql-createtable.html https://www.postgresql.org/docs/current/sql-altertable.html https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS One thing I am unclear on is the interaction between max_worker_processes, max_parallel_workers, and max_parallel_maintenance_workers. For example, does the following change mean that manual VACUUM PARALLEL is no longer capped by max_parallel_maintenance_workers? @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested, parallel_workers = (nrequested > 0) ? Min(nrequested, nindexes_parallel) : nindexes_parallel; - /* Cap by max_parallel_maintenance_workers */ - parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers); + /* Cap by GUC variable */ + parallel_workers = Min(parallel_workers, max_parallel_workers); 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? + "autovacuum_parallel_workers", + "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. " 4/ The comment "When parallel autovacuum worker die" suggests an abnormal exit. "Terminates" seems clearer, since this applies to both normal and abnormal exits. instead of: + * When parallel autovacuum worker die, how about this: * When parallel autovacuum worker terminates, 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before DestroyParallelContext? + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */ DestroyParallelContext(pvs->pcxt); + + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + AutoVacuumReleaseParallelWorkers(nlaunched_workers); 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside AutoVacuumReleaseParallelWorkers()? if (!AmAutoVacuumWorkerProcess()) return; 7/ It looks like the psql tab completion for autovacuum_parallel_workers is missing: test=# alter table t set (autovacuum_ autovacuum_analyze_scale_factor autovacuum_analyze_threshold autovacuum_enabled autovacuum_freeze_max_age autovacuum_freeze_min_age autovacuum_freeze_table_age autovacuum_multixact_freeze_max_age autovacuum_multixact_freeze_min_age autovacuum_multixact_freeze_table_age autovacuum_vacuum_cost_delay autovacuum_vacuum_cost_limit autovacuum_vacuum_insert_scale_factor autovacuum_vacuum_insert_threshold autovacuum_vacuum_max_threshold autovacuum_vacuum_scale_factor autovacuum_vacuum_threshold -- Sami Imseih Amazon Web Services (AWS)
Hi, On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote: > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and > have a few comments from my initial pass. > > 1/ Please run pgindent. OK, I'll do it. > 2/ Documentation is missing. There may be more, but here are the places I > found that likely need updates for the new behavior, reloptions, GUC, etc. > Including docs in the patch early would help clarify expected behavior. > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html > https://www.postgresql.org/docs/current/sql-createtable.html > https://www.postgresql.org/docs/current/sql-altertable.html > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS > Thanks for gathering it all together. I'll update the documentation so it will reflect changes in autovacuum daemon, reloptions and GUC parameters. So far, I don't see what we can add to vacuum-basics and alter-table paragraphs. I'll create separate .patch file for changes in documentation. > One thing I am unclear on is the interaction between max_worker_processes, > max_parallel_workers, and max_parallel_maintenance_workers. For example, does > the following change mean that manual VACUUM PARALLEL is no longer capped by > max_parallel_maintenance_workers? > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, > int nindexes, int nrequested, > parallel_workers = (nrequested > 0) ? > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > - /* Cap by max_parallel_maintenance_workers */ > - parallel_workers = Min(parallel_workers, > max_parallel_maintenance_workers); > + /* Cap by GUC variable */ > + parallel_workers = Min(parallel_workers, max_parallel_workers); > Oh, it is my poor choice of a name for a local variable (I'll rename it). This variable can get different values depending on performed operation : autovacuum_max_parallel_workers for parallel autovacuum and max_parallel_maintenance_workers for maintenance VACUUM. > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? > > + "autovacuum_parallel_workers", > + "Maximum number of parallel autovacuum workers > that can be taken from bgworkers pool for processing this table. " > I don't think that we should refer to max_parallel_workers here. Actually, this reloption doesn't depend on max_parallel_workers at all. I wrote about bgworkers pool (both here and in description of autovacuum_max_parallel_workers parameter) in order to clarify that parallel autovacuum will use dynamic workers instead of launching more a/v workers. BTW, I don't really like that the comment on this option turns out to be very large. I'll leave only short description in reloptions.c and move clarification about zero value in rel.h Mentions of bgworkers pool will remain only in description of autovacuum_max_parallel_workers. > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal > exit. "Terminates" seems clearer, since this applies to both normal and > abnormal exits. > > instead of: > + * When parallel autovacuum worker die, > > how about this: > * When parallel autovacuum worker terminates, > Sounds reasonable, I'll fix it. 
> > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before > DestroyParallelContext? > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember > this value */ > DestroyParallelContext(pvs->pcxt); > + > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > + if (AmAutoVacuumWorkerProcess()) > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > I wrote about it above [1], but I think I can duplicate my thoughts here : """ Destroying parallel context includes waiting for all workers to exit (after which, other operations can use them). If we first call ParallelAutoVacuumReleaseWorkers, some operation can reasonably request all released workers. But this request can fail, because there is no guarantee that workers managed to finish. Actually, there's nothing wrong with that, but I think releasing workers only after finishing work is a more logical approach. """ > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside > AutoVacuumReleaseParallelWorkers()? > > if (!AmAutoVacuumWorkerProcess()) > return; > It seems to me that the opposite is true. If there is no alternative to calling AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts will disappear after viewing the AmAutoVacuumWorkerProcess code, but IMO code in vacuumparallel.c will become less intuitive. > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is > missing: > > test=# alter table t set (autovacuum_ > autovacuum_analyze_scale_factor > autovacuum_analyze_threshold > autovacuum_enabled > autovacuum_freeze_max_age > autovacuum_freeze_min_age > autovacuum_freeze_table_age > autovacuum_multixact_freeze_max_age > autovacuum_multixact_freeze_min_age > autovacuum_multixact_freeze_table_age > autovacuum_vacuum_cost_delay > autovacuum_vacuum_cost_limit > autovacuum_vacuum_insert_scale_factor > autovacuum_vacuum_insert_threshold > autovacuum_vacuum_max_threshold > autovacuum_vacuum_scale_factor > autovacuum_vacuum_threshold > Good catch, I'll fix it. Thank you for the review! Please, see v9 patches : 1) Run pgindent + rebase patches on newest commit in master. 2) Introduce changes for documentation. 3) Rename local variable in parallel_vacuum_compute_workers. 4) Shorten the description of autovacuum_parallel_workers in reloptions.c (move clarifications for it into rel.h). 5) Reword "When parallel autovacuum worker die" comment. 6) Add tab completion for autovacuum_parallel_workers table option. [1] https://www.postgresql.org/message-id/CAJDiXgi7KB7wSQ%3DUx%3DngdaCvJnJ5x-ehvTyiuZez%2B5uKHtV6iQ%40mail.gmail.com -- Best regards, Daniil Davydov
Attachments
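A short sketch of the renamed local variable in parallel_vacuum_compute_workers(), making explicit that manual VACUUM keeps its max_parallel_maintenance_workers cap while autovacuum uses the new GUC (the variable name here is an assumption):

    int     max_workers_cap = AmAutoVacuumWorkerProcess() ?
        autovacuum_max_parallel_workers :
        max_parallel_maintenance_workers;

    /* Cap by whichever limit applies to this kind of vacuum */
    parallel_workers = Min(parallel_workers, max_workers_cap);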
On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <3danissimo@gmail.com> wrote: > > Hi, > > On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote: > > > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and > > have a few comments from my initial pass. > > > > 1/ Please run pgindent. > > OK, I'll do it. > > > 2/ Documentation is missing. There may be more, but here are the places I > > found that likely need updates for the new behavior, reloptions, GUC, etc. > > Including docs in the patch early would help clarify expected behavior. > > > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS > > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM > > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html > > https://www.postgresql.org/docs/current/sql-createtable.html > > https://www.postgresql.org/docs/current/sql-altertable.html > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS > > > > Thanks for gathering it all together. I'll update the documentation so > it will reflect changes in autovacuum daemon, reloptions and GUC > parameters. So far, I don't see what we can add to vacuum-basics > and alter-table paragraphs. > > I'll create separate .patch file for changes in documentation. > > > One thing I am unclear on is the interaction between max_worker_processes, > > max_parallel_workers, and max_parallel_maintenance_workers. For example, does > > the following change mean that manual VACUUM PARALLEL is no longer capped by > > max_parallel_maintenance_workers? > > > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, > > int nindexes, int nrequested, > > parallel_workers = (nrequested > 0) ? > > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > > > - /* Cap by max_parallel_maintenance_workers */ > > - parallel_workers = Min(parallel_workers, > > max_parallel_maintenance_workers); > > + /* Cap by GUC variable */ > > + parallel_workers = Min(parallel_workers, max_parallel_workers); > > > > Oh, it is my poor choice of a name for a local variable (I'll rename it). > This variable can get different values depending on performed operation : > autovacuum_max_parallel_workers for parallel autovacuum and > max_parallel_maintenance_workers for maintenance VACUUM. > > > > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? > > > > + "autovacuum_parallel_workers", > > + "Maximum number of parallel autovacuum workers > > that can be taken from bgworkers pool for processing this table. " > > > > I don't think that we should refer to max_parallel_workers here. > Actually, this reloption doesn't depend on max_parallel_workers at all. > I wrote about bgworkers pool (both here and in description of > autovacuum_max_parallel_workers parameter) in order to clarify that > parallel autovacuum will use dynamic workers instead of launching > more a/v workers. > > BTW, I don't really like that the comment on this option turns out to be > very large. I'll leave only short description in reloptions.c and move > clarification about zero value in rel.h > Mentions of bgworkers pool will remain only in > description of autovacuum_max_parallel_workers. > > > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal > > exit. 
"Terminates" seems clearer, since this applies to both normal and > > abnormal exits. > > > > instead of: > > + * When parallel autovacuum worker die, > > > > how about this: > > * When parallel autovacuum worker terminates, > > > > Sounds reasonable, I'll fix it. > > > > > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before > > DestroyParallelContext? > > > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember > > this value */ > > DestroyParallelContext(pvs->pcxt); > > + > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > > + if (AmAutoVacuumWorkerProcess()) > > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > > > > I wrote about it above [1], but I think I can duplicate my thoughts here : > """ > Destroying parallel context includes waiting for all workers to exit (after > which, other operations can use them). > If we first call ParallelAutoVacuumReleaseWorkers, some operation can > reasonably request all released workers. But this request can fail, > because there is no guarantee that workers managed to finish. > > Actually, there's nothing wrong with that, but I think releasing workers > only after finishing work is a more logical approach. > """ > > > > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside > > AutoVacuumReleaseParallelWorkers()? > > > > if (!AmAutoVacuumWorkerProcess()) > > return; > > > > It seems to me that the opposite is true. If there is no alternative to calling > AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts > will disappear after viewing the AmAutoVacuumWorkerProcess code, > but IMO code in vacuumparallel.c will become less intuitive. > > > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is > > missing: > > > > test=# alter table t set (autovacuum_ > > autovacuum_analyze_scale_factor > > autovacuum_analyze_threshold > > autovacuum_enabled > > autovacuum_freeze_max_age > > autovacuum_freeze_min_age > > autovacuum_freeze_table_age > > autovacuum_multixact_freeze_max_age > > autovacuum_multixact_freeze_min_age > > autovacuum_multixact_freeze_table_age > > autovacuum_vacuum_cost_delay > > autovacuum_vacuum_cost_limit > > autovacuum_vacuum_insert_scale_factor > > autovacuum_vacuum_insert_threshold > > autovacuum_vacuum_max_threshold > > autovacuum_vacuum_scale_factor > > autovacuum_vacuum_threshold > > > > Good catch, I'll fix it. > > Thank you for the review! Please, see v9 patches : > 1) Run pgindent + rebase patches on newest commit in master. > 2) Introduce changes for documentation. > 3) Rename local variable in parallel_vacuum_compute_workers. > 4) Shorten the description of autovacuum_parallel_workers in > reloptions.c (move clarifications for it into rel.h). > 5) Reword "When parallel autovacuum worker die" comment. > 6) Add tab completion for autovacuum_parallel_workers table option. Thank you for updating the patch. Here are some review comments. + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + AutoVacuumReleaseParallelWorkers(nlaunched_workers); + We release the reserved worker in parallel_vacuum_end(). However, parallel_vacuum_end() is called only once at the end of vacuum. I think we need to release the reserved worker after index vacuuming or cleanup, otherwise we would end up holding the reserved workers until the end of vacuum even if we invoke index vacuuming multiple times. 
--- +void +assign_autovacuum_max_parallel_workers(int newval, void *extra) +{ + autovacuum_max_parallel_workers = Min(newval, max_worker_processes); +} I don't think we need the assign hook for this GUC parameter. We can internally cap the maximum value by max_worker_processes like other GUC parameters such as max_parallel_maintenance_workers and max_parallel_workers. ---+ /* Refresh autovacuum_max_parallel_workers paremeter */ + CHECK_FOR_INTERRUPTS(); + if (ConfigReloadPending) + { + ConfigReloadPending = false; + ProcessConfigFile(PGC_SIGHUP); + } + + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); + + /* + * If autovacuum_max_parallel_workers parameter was reduced during + * parallel autovacuum execution, we must cap available workers number by + * its new value. + */ + AutoVacuumShmem->av_freeParallelWorkers = + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers, + autovacuum_max_parallel_workers); + + LWLockRelease(AutovacuumLock); I think another race condition could occur; suppose autovacuum_max_parallel_workers is set to '5' and one autovacuum worker reserved 5 workers, meaning that AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes autovacuum_max_parallel_workers to 3 and reloads the conf file right after the autovacuum worker checks the interruption. The launcher processes calls adjust_free_parallel_workers() but av_freeParallelWorkers remains 0, and the autovacuum worker increments it by 5 as its autovacuum_max_parallel_workers value is still 5. I think that we can have the autovacuum_max_parallel_workers value on shmem, and only the launcher process can modify its value if the GUC is changed. Autovacuum workers simply increase or decrease the av_freeParallelWorkers within the range of 0 and the autovacuum_max_parallel_workers value on shmem. When changing autovacuum_max_parallel_workers and av_freeParallelWorkers values on shmem, the launcher process calculates the number of workers reserved at that time and calculate the new av_freeParallelWorkers value by subtracting the new autovacuum_max_parallel_workers by the number of reserved workers. --- +AutoVacuumReserveParallelWorkers(int nworkers) +{ + int can_launch; How about renaming it to 'nreserved' or something? can_launch looks like it's a boolean variable to indicate whether the process can launch workers. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
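A rough sketch of the proposed launcher-side logic, assuming a new shared field (named av_maxParallelWorkers here purely for illustration) that mirrors the GUC and is updated only by the launcher:

    int     nreserved;

    /* In the launcher, after the config file has been reloaded */
    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

    /* Workers currently reserved under the old limit */
    nreserved = AutoVacuumShmem->av_maxParallelWorkers -
        AutoVacuumShmem->av_freeParallelWorkers;

    /* Publish the new limit and recompute the free counter from it */
    AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
    AutoVacuumShmem->av_freeParallelWorkers =
        Max(AutoVacuumShmem->av_maxParallelWorkers - nreserved, 0);

    LWLockRelease(AutovacuumLock);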
On Thu, Aug 7, 2025 at 4:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <3danissimo@gmail.com> wrote: > > > > Hi, > > > > On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote: > > > > > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and > > > have a few comments from my initial pass. > > > > > > 1/ Please run pgindent. > > > > OK, I'll do it. > > > > > 2/ Documentation is missing. There may be more, but here are the places I > > > found that likely need updates for the new behavior, reloptions, GUC, etc. > > > Including docs in the patch early would help clarify expected behavior. > > > > > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS > > > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM > > > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html > > > https://www.postgresql.org/docs/current/sql-createtable.html > > > https://www.postgresql.org/docs/current/sql-altertable.html > > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES > > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS > > > > > > > Thanks for gathering it all together. I'll update the documentation so > > it will reflect changes in autovacuum daemon, reloptions and GUC > > parameters. So far, I don't see what we can add to vacuum-basics > > and alter-table paragraphs. > > > > I'll create separate .patch file for changes in documentation. > > > > > One thing I am unclear on is the interaction between max_worker_processes, > > > max_parallel_workers, and max_parallel_maintenance_workers. For example, does > > > the following change mean that manual VACUUM PARALLEL is no longer capped by > > > max_parallel_maintenance_workers? > > > > > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, > > > int nindexes, int nrequested, > > > parallel_workers = (nrequested > 0) ? > > > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > > > > > - /* Cap by max_parallel_maintenance_workers */ > > > - parallel_workers = Min(parallel_workers, > > > max_parallel_maintenance_workers); > > > + /* Cap by GUC variable */ > > > + parallel_workers = Min(parallel_workers, max_parallel_workers); > > > > > > > Oh, it is my poor choice of a name for a local variable (I'll rename it). > > This variable can get different values depending on performed operation : > > autovacuum_max_parallel_workers for parallel autovacuum and > > max_parallel_maintenance_workers for maintenance VACUUM. > > > > > > > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? > > > > > > + "autovacuum_parallel_workers", > > > + "Maximum number of parallel autovacuum workers > > > that can be taken from bgworkers pool for processing this table. " > > > > > > > I don't think that we should refer to max_parallel_workers here. > > Actually, this reloption doesn't depend on max_parallel_workers at all. > > I wrote about bgworkers pool (both here and in description of > > autovacuum_max_parallel_workers parameter) in order to clarify that > > parallel autovacuum will use dynamic workers instead of launching > > more a/v workers. > > > > BTW, I don't really like that the comment on this option turns out to be > > very large. 
I'll leave only short description in reloptions.c and move > > clarification about zero value in rel.h > > Mentions of bgworkers pool will remain only in > > description of autovacuum_max_parallel_workers. > > > > > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal > > > exit. "Terminates" seems clearer, since this applies to both normal and > > > abnormal exits. > > > > > > instead of: > > > + * When parallel autovacuum worker die, > > > > > > how about this: > > > * When parallel autovacuum worker terminates, > > > > > > > Sounds reasonable, I'll fix it. > > > > > > > > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before > > > DestroyParallelContext? > > > > > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember > > > this value */ > > > DestroyParallelContext(pvs->pcxt); > > > + > > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > > > + if (AmAutoVacuumWorkerProcess()) > > > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > > > > > > > I wrote about it above [1], but I think I can duplicate my thoughts here : > > """ > > Destroying parallel context includes waiting for all workers to exit (after > > which, other operations can use them). > > If we first call ParallelAutoVacuumReleaseWorkers, some operation can > > reasonably request all released workers. But this request can fail, > > because there is no guarantee that workers managed to finish. > > > > Actually, there's nothing wrong with that, but I think releasing workers > > only after finishing work is a more logical approach. > > """ > > > > > > > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside > > > AutoVacuumReleaseParallelWorkers()? > > > > > > if (!AmAutoVacuumWorkerProcess()) > > > return; > > > > > > > It seems to me that the opposite is true. If there is no alternative to calling > > AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts > > will disappear after viewing the AmAutoVacuumWorkerProcess code, > > but IMO code in vacuumparallel.c will become less intuitive. > > > > > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is > > > missing: > > > > > > test=# alter table t set (autovacuum_ > > > autovacuum_analyze_scale_factor > > > autovacuum_analyze_threshold > > > autovacuum_enabled > > > autovacuum_freeze_max_age > > > autovacuum_freeze_min_age > > > autovacuum_freeze_table_age > > > autovacuum_multixact_freeze_max_age > > > autovacuum_multixact_freeze_min_age > > > autovacuum_multixact_freeze_table_age > > > autovacuum_vacuum_cost_delay > > > autovacuum_vacuum_cost_limit > > > autovacuum_vacuum_insert_scale_factor > > > autovacuum_vacuum_insert_threshold > > > autovacuum_vacuum_max_threshold > > > autovacuum_vacuum_scale_factor > > > autovacuum_vacuum_threshold > > > > > > > Good catch, I'll fix it. > > > > Thank you for the review! Please, see v9 patches : > > 1) Run pgindent + rebase patches on newest commit in master. > > 2) Introduce changes for documentation. > > 3) Rename local variable in parallel_vacuum_compute_workers. > > 4) Shorten the description of autovacuum_parallel_workers in > > reloptions.c (move clarifications for it into rel.h). > > 5) Reword "When parallel autovacuum worker die" comment. > > 6) Add tab completion for autovacuum_parallel_workers table option. > > Thank you for updating the patch. Here are some review comments. > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. 
*/ > + if (AmAutoVacuumWorkerProcess()) > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > + > > We release the reserved worker in parallel_vacuum_end(). However, > parallel_vacuum_end() is called only once at the end of vacuum. I > think we need to release the reserved worker after index vacuuming or > cleanup, otherwise we would end up holding the reserved workers until > the end of vacuum even if we invoke index vacuuming multiple times. > > --- > +void > +assign_autovacuum_max_parallel_workers(int newval, void *extra) > +{ > + autovacuum_max_parallel_workers = Min(newval, max_worker_processes); > +} > > I don't think we need the assign hook for this GUC parameter. We can > internally cap the maximum value by max_worker_processes like other > GUC parameters such as max_parallel_maintenance_workers and > max_parallel_workers. > > ---+ /* Refresh autovacuum_max_parallel_workers paremeter */ > + CHECK_FOR_INTERRUPTS(); > + if (ConfigReloadPending) > + { > + ConfigReloadPending = false; > + ProcessConfigFile(PGC_SIGHUP); > + } > + > + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); > + > + /* > + * If autovacuum_max_parallel_workers parameter was reduced during > + * parallel autovacuum execution, we must cap available > workers number by > + * its new value. > + */ > + AutoVacuumShmem->av_freeParallelWorkers = > + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers, > + autovacuum_max_parallel_workers); > + > + LWLockRelease(AutovacuumLock); > > I think another race condition could occur; suppose > autovacuum_max_parallel_workers is set to '5' and one autovacuum > worker reserved 5 workers, meaning that > AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes > autovacuum_max_parallel_workers to 3 and reloads the conf file right > after the autovacuum worker checks the interruption. The launcher > processes calls adjust_free_parallel_workers() but > av_freeParallelWorkers remains 0, and the autovacuum worker increments > it by 5 as its autovacuum_max_parallel_workers value is still 5. > > I think that we can have the autovacuum_max_parallel_workers value on > shmem, and only the launcher process can modify its value if the GUC > is changed. Autovacuum workers simply increase or decrease the > av_freeParallelWorkers within the range of 0 and the > autovacuum_max_parallel_workers value on shmem. When changing > autovacuum_max_parallel_workers and av_freeParallelWorkers values on > shmem, the launcher process calculates the number of workers reserved > at that time and calculate the new av_freeParallelWorkers value by > subtracting the new autovacuum_max_parallel_workers by the number of > reserved workers. > > --- > +AutoVacuumReserveParallelWorkers(int nworkers) > +{ > + int can_launch; > > How about renaming it to 'nreserved' or something? can_launch looks > like it's a boolean variable to indicate whether the process can > launch workers. While testing the patch, I found there are other two problems: 1. when an autovacuum worker who reserved workers fails with an error, the reserved workers are not released. I think we need to ensure that all reserved workers are surely released at the end of vacuum even with an error. 2. when an autovacuum worker (not parallel vacuum worker) who uses parallel vacuum gets SIGHUP, it errors out with the error message "parameter "max_stack_depth" cannot be set during a parallel operation". 
Autovacuum checks the configuration file reload in vacuum_delay_point(), and while reloading the configuration file, it attempts to set max_stack_depth in InitializeGUCOptionsFromEnvironment() (which is called by ProcessConfigFileInternal()). However, it cannot change max_stack_depth since the worker is in parallel mode but max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't happen in regular backends who are using parallel queries because they check the configuration file reload at the end of each SQL command. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi, Thank you very much for your comments! In this letter I'll answer both of your recent letters. On Fri, Aug 8, 2025 at 6:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Thank you for updating the patch. Here are some review comments. > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > + if (AmAutoVacuumWorkerProcess()) > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > + > > We release the reserved worker in parallel_vacuum_end(). However, > parallel_vacuum_end() is called only once at the end of vacuum. I > think we need to release the reserved worker after index vacuuming or > cleanup, otherwise we would end up holding the reserved workers until > the end of vacuum even if we invoke index vacuuming multiple times. > Yep, you are right. It was easy to miss because typically the autovacuum takes only one cycle to process a table. Since both index vacuum and index cleanup uses the parallel_vacuum_process_all_indexes function, I think that both releasing and reserving should be placed there. > --- > +void > +assign_autovacuum_max_parallel_workers(int newval, void *extra) > +{ > + autovacuum_max_parallel_workers = Min(newval, max_worker_processes); > +} > > I don't think we need the assign hook for this GUC parameter. We can > internally cap the maximum value by max_worker_processes like other > GUC parameters such as max_parallel_maintenance_workers and > max_parallel_workers. Ok, I get it - we don't want to give a configuration error for no serious reason. Actually, we already internally capping autovacuum_max_parallel_workers by max_worker_processes (inside parallel_vacuum_compute_workers function). This is the same behavior as max_parallel_maintenance_workers got. I'll get rid of the assign hook and add one more capping inside autovacuum shmem initialization : Since max_worker_processes is PGC_POSTMASTER parameter, av_freeParallelWorkers must not exceed its value. > > ---+ /* Refresh autovacuum_max_parallel_workers paremeter */ > + CHECK_FOR_INTERRUPTS(); > + if (ConfigReloadPending) > + { > + ConfigReloadPending = false; > + ProcessConfigFile(PGC_SIGHUP); > + } > + > + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); > + > + /* > + * If autovacuum_max_parallel_workers parameter was reduced during > + * parallel autovacuum execution, we must cap available > workers number by > + * its new value. > + */ > + AutoVacuumShmem->av_freeParallelWorkers = > + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers, > + autovacuum_max_parallel_workers); > + > + LWLockRelease(AutovacuumLock); > > I think another race condition could occur; suppose > autovacuum_max_parallel_workers is set to '5' and one autovacuum > worker reserved 5 workers, meaning that > AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes > autovacuum_max_parallel_workers to 3 and reloads the conf file right > after the autovacuum worker checks the interruption. The launcher > processes calls adjust_free_parallel_workers() but > av_freeParallelWorkers remains 0, and the autovacuum worker increments > it by 5 as its autovacuum_max_parallel_workers value is still 5. > I think this problem can be solved if we put AutovacuumLock acquiring before processing the config file, but I understand that this is a bad way. > I think that we can have the autovacuum_max_parallel_workers value on > shmem, and only the launcher process can modify its value if the GUC > is changed. 
Autovacuum workers simply increase or decrease the > av_freeParallelWorkers within the range of 0 and the > autovacuum_max_parallel_workers value on shmem. When changing > autovacuum_max_parallel_workers and av_freeParallelWorkers values on > shmem, the launcher process calculates the number of workers reserved > at that time and calculate the new av_freeParallelWorkers value by > subtracting the new autovacuum_max_parallel_workers by the number of > reserved workers. > Good idea, I agree. Replacing the GUC parameter with the variable in shmem leaves the current logic of free workers management unchanged. Essentially, this is the same solution as I described above, but we are holding lock not during config reloading, but during a simple value check. It makes much more sense. > --- > +AutoVacuumReserveParallelWorkers(int nworkers) > +{ > + int can_launch; > > How about renaming it to 'nreserved' or something? can_launch looks > like it's a boolean variable to indicate whether the process can > launch workers. > There are no objections. On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > While testing the patch, I found there are other two problems: > > 1. when an autovacuum worker who reserved workers fails with an error, > the reserved workers are not released. I think we need to ensure that > all reserved workers are surely released at the end of vacuum even > with an error. > Agree. I'll add a try/catch block to the parallel_vacuum_process_all_indexes (the only place where we are reserving workers). > 2. when an autovacuum worker (not parallel vacuum worker) who uses > parallel vacuum gets SIGHUP, it errors out with the error message > "parameter "max_stack_depth" cannot be set during a parallel > operation". Autovacuum checks the configuration file reload in > vacuum_delay_point(), and while reloading the configuration file, it > attempts to set max_stack_depth in > InitializeGUCOptionsFromEnvironment() (which is called by > ProcessConfigFileInternal()). However, it cannot change > max_stack_depth since the worker is in parallel mode but > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't > happen in regular backends who are using parallel queries because they > check the configuration file reload at the end of each SQL command. > Hm, this is a really serious problem. I see only two ways to solve it (both are not really good) : 1) Do not allow processing of the config file during parallel autovacuum execution. 2) Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup phase. I'm a bit wary about it, because the design says that we should be in parallel mode during the whole parallel operation. But actually, if we can make sure that all launched workers are exited, I don't see reasons, why can't we just exit parallel mode at the end of parallel_vacuum_process_all_indexes. What do you think about it? By now, I haven't made any changes related to this problem. Again, thank you for the review. Please, see v10 patches (only 0001 has been changed) : 1) Reserve and release workers only inside parallel_vacuum_process_all_indexes. 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can release workers even after an error. This required adding a static variable to account for the total number of reserved workers (av_nworkers_reserved). 3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside autovacuum code. Assign hook has been removed. 
4) Use shmem value for determining the maximum number of parallel autovacuum workers (eliminate race condition between launcher and leader process). -- Best regards, Daniil Davydov
Attachments
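To illustrate the error-handling change, here is a simplified shape of the reserve/release path in parallel_vacuum_process_all_indexes(), assuming PG_FINALLY is used so that reserved workers are always returned; the surrounding launch and index-processing details are omitted:

    if (AmAutoVacuumWorkerProcess() && nworkers > 0)
        nworkers = AutoVacuumReserveParallelWorkers(nworkers);

    PG_TRY();
    {
        /* launch parallel workers and vacuum/cleanup the indexes here */
    }
    PG_FINALLY();
    {
        /* give the reserved slots back even if we errored out */
        if (AmAutoVacuumWorkerProcess() && nworkers > 0)
            AutoVacuumReleaseParallelWorkers(nworkers);
    }
    PG_END_TRY();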
On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <3danissimo@gmail.com> wrote: > > > On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > 2. when an autovacuum worker (not parallel vacuum worker) who uses > > parallel vacuum gets SIGHUP, it errors out with the error message > > "parameter "max_stack_depth" cannot be set during a parallel > > operation". Autovacuum checks the configuration file reload in > > vacuum_delay_point(), and while reloading the configuration file, it > > attempts to set max_stack_depth in > > InitializeGUCOptionsFromEnvironment() (which is called by > > ProcessConfigFileInternal()). However, it cannot change > > max_stack_depth since the worker is in parallel mode but > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't > > happen in regular backends who are using parallel queries because they > > check the configuration file reload at the end of each SQL command. > > > > Hm, this is a really serious problem. I see only two ways to solve it (both are > not really good) : > 1) > Do not allow processing of the config file during parallel autovacuum > execution. > > 2) > Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup > phase. I'm a bit wary about it, because the design says that we should > be in parallel > mode during the whole parallel operation. But actually, if we can make > sure that all > launched workers are exited, I don't see reasons, why can't we just > exit parallel mode > at the end of parallel_vacuum_process_all_indexes. > > What do you think about it? Hmm, given that we're trying to support parallel heap vacuum on another thread[1] and we will probably support it in autovacuums, it seems to me that these approaches won't work. Another idea would be to allow autovacuum workers to process the config file even in parallel mode. GUC changes in the leader worker would not affect parallel vacuum workers, but it is fine to me. In the context of autovacuum, only specific GUC parameters related to cost-based delays need to be affected also to parallel vacuum workers. Probably we need some changes to compute_parallel_delay() so that parallel workers can compute the sleep time based on the new vacuum_cost_limit and vacuum_cost_delay after the leader process (i.e., autovacuum worker) reloads the config file. > > Again, thank you for the review. Please, see v10 patches (only 0001 > has been changed) : > 1) Reserve and release workers only inside parallel_vacuum_process_all_indexes. > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can > release workers even after an error. This required adding a static > variable to account > for the total number of reserved workers (av_nworkers_reserved). > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside > autovacuum code. Assign hook has been removed. > 4) Use shmem value for determining the maximum number of parallel autovacuum > workers (eliminate race condition between launcher and leader process). Thank you for updating the patch! I'll review the new version patches. Regards, [1] https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Hi!
On Tue, Aug 19, 2025 at 12:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> >
> >
> > On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > > 2. when an autovacuum worker (not parallel vacuum worker) who uses
> > > parallel vacuum gets SIGHUP, it errors out with the error message
> > > "parameter "max_stack_depth" cannot be set during a parallel
> > > operation". Autovacuum checks the configuration file reload in
> > > vacuum_delay_point(), and while reloading the configuration file, it
> > > attempts to set max_stack_depth in
> > > InitializeGUCOptionsFromEnvironment() (which is called by
> > > ProcessConfigFileInternal()). However, it cannot change
> > > max_stack_depth since the worker is in parallel mode but
> > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> > > happen in regular backends who are using parallel queries because they
> > > check the configuration file reload at the end of each SQL command.
> > >
> >
> > Hm, this is a really serious problem. I see only two ways to solve it (both are
> > not really good) :
> > 1)
> > Do not allow processing of the config file during parallel autovacuum
> > execution.
> >
> > 2)
> > Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup
> > phase. I'm a bit wary about it, because the design says that we should
> > be in parallel
> > mode during the whole parallel operation. But actually, if we can make
> > sure that all
> > launched workers are exited, I don't see reasons, why can't we just
> > exit parallel mode
> > at the end of parallel_vacuum_process_all_indexes.
> >
> > What do you think about it?
>
> Hmm, given that we're trying to support parallel heap vacuum on
> another thread[1] and we will probably support it in autovacuums, it
> seems to me that these approaches won't work.
>
> Another idea would be to allow autovacuum workers to process the
> config file even in parallel mode. GUC changes in the leader worker
> would not affect parallel vacuum workers, but it is fine to me. In the
> context of autovacuum, only specific GUC parameters related to
> cost-based delays need to be affected also to parallel vacuum workers.
> Probably we need some changes to compute_parallel_delay() so that
> parallel workers can compute the sleep time based on the new
> vacuum_cost_limit and vacuum_cost_delay after the leader process
> (i.e., autovacuum worker) reloads the config file.
>
> >
> > Again, thank you for the review. Please, see v10 patches (only 0001
> > has been changed) :
> > 1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
> > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can
> > release workers even after an error. This required adding a static
> > variable to account
> > for the total number of reserved workers (av_nworkers_reserved).
> > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
> > autovacuum code. Assign hook has been removed.
> > 4) Use shmem value for determining the maximum number of parallel autovacuum
> > workers (eliminate race condition between launcher and leader process).
>
> Thank you for updating the patch! I'll review the new version patches.
I've rebased this patchset to the current master. That required me to move the new GUC definition to guc_parameters.dat. Also, I adjusted typedefs.list and ran pgindent. Some notes about the patch follow.
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?
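For reference, with that change the entry might read as follows (a sketch, assuming MAX_PARALLEL_WORKER_LIMIT is reachable from reloptions.c):
+	{
+		{
+			"autovacuum_parallel_workers",
+			"Maximum number of parallel autovacuum workers that can be used for processing this table.",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		-1, -1, MAX_PARALLEL_WORKER_LIMIT
+	},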
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
Not sure about the usage of the word "future" here; it's not clear what it refers to. Could we use "below" or "within this file" instead?
I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I heard, the overhead of setting up jumps is platform-dependent and not harmless on some platforms. Therefore, can we skip the TRY/CATCH block for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain(); that would also save us from repeatedly setting up a jump in autovacuum workers.
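For illustration, the alternative could look roughly like the following inside the autovacuum worker's existing top-level error handler (a sketch only; AutoVacuumReleaseParallelWorkers() and av_nworkers_reserved are the assumed names from earlier in the thread, and the surrounding code is elided):

	/* In AutoVacWorkerMain(), inside the existing sigsetjmp() error block: */
	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
	{
		/* ... existing error cleanup ... */

		/* Return any parallel workers this autovacuum worker still holds. */
		if (av_nworkers_reserved > 0)
		{
			AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
			av_nworkers_reserved = 0;
		}

		/* ... existing recovery / exit path ... */
	}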
In general, I think this patchset badly lacks testing. It needs TAP tests checking from the logs that autovacuum has been done in parallel. Also, it would be good to set up some injection points and check that reserved autovacuum parallel workers are released correctly in the case of errors.
------
Regards,
Alexander Korotkov
Supabase
Attachments
On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Hi!
>
> On Tue, Aug 19, 2025 at 12:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> > >
> > > On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > 2. when an autovacuum worker (not parallel vacuum worker) who uses
> > > > parallel vacuum gets SIGHUP, it errors out with the error message
> > > > "parameter "max_stack_depth" cannot be set during a parallel
> > > > operation". Autovacuum checks the configuration file reload in
> > > > vacuum_delay_point(), and while reloading the configuration file, it
> > > > attempts to set max_stack_depth in
> > > > InitializeGUCOptionsFromEnvironment() (which is called by
> > > > ProcessConfigFileInternal()). However, it cannot change
> > > > max_stack_depth since the worker is in parallel mode but
> > > > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> > > > happen in regular backends who are using parallel queries because
> > > > they check the configuration file reload at the end of each SQL
> > > > command.
> > >
> > > Hm, this is a really serious problem. I see only two ways to solve it
> > > (both are not really good):
> > > 1) Do not allow processing of the config file during parallel
> > > autovacuum execution.
> > > 2) Teach the autovacuum to enter parallel mode only during the index
> > > vacuum/cleanup phase. I'm a bit wary about it, because the design says
> > > that we should be in parallel mode during the whole parallel operation.
> > > But actually, if we can make sure that all launched workers are exited,
> > > I don't see reasons, why can't we just exit parallel mode at the end of
> > > parallel_vacuum_process_all_indexes.
> > >
> > > What do you think about it?
> >
> > Hmm, given that we're trying to support parallel heap vacuum on another
> > thread[1] and we will probably support it in autovacuums, it seems to me
> > that these approaches won't work.
> >
> > Another idea would be to allow autovacuum workers to process the config
> > file even in parallel mode. GUC changes in the leader worker would not
> > affect parallel vacuum workers, but it is fine to me. In the context of
> > autovacuum, only specific GUC parameters related to cost-based delays
> > need to be affected also to parallel vacuum workers. Probably we need
> > some changes to compute_parallel_delay() so that parallel workers can
> > compute the sleep time based on the new vacuum_cost_limit and
> > vacuum_cost_delay after the leader process (i.e., autovacuum worker)
> > reloads the config file.
> >
> > > Again, thank you for the review. Please, see v10 patches (only 0001 has
> > > been changed):
> > > 1) Reserve and release workers only inside
> > > parallel_vacuum_process_all_indexes.
> > > 2) Add try/catch block to the parallel_vacuum_process_all_indexes, so
> > > we can release workers even after an error. This required adding a
> > > static variable to account for the total number of reserved workers
> > > (av_nworkers_reserved).
> > > 3) Cap autovacuum_max_parallel_workers by max_worker_processes only
> > > inside autovacuum code. Assign hook has been removed.
> > > 4) Use shmem value for determining the maximum number of parallel
> > > autovacuum workers (eliminate race condition between launcher and
> > > leader process).
> >
> > Thank you for updating the patch! I'll review the new version patches.
>
> I've rebased this patchset to the current master. That required me to move
> the new GUC definition to guc_parameters.dat. Also, I adjusted typedefs.list
> and ran pgindent. Some notes about the patch.

Thank you for updating the patch!

> I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I
> heard, the overhead of setting up jumps is platform-dependent and not
> harmless on some platforms. Therefore, can we skip the TRY/CATCH block for
> non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain();
> that would also save us from repeatedly setting up a jump in autovacuum
> workers.

I wonder if using the TRY/CATCH block is not enough to ensure that autovacuum workers release the reserved parallel workers in FATAL cases.

> In general, I think this patchset badly lacks testing. It needs TAP tests
> checking from the logs that autovacuum has been done in parallel. Also, it
> would be good to set up some injection points and check that reserved
> autovacuum parallel workers are released correctly in the case of errors.

+1

IIUC the patch still has one problem in terms of reloading the configuration parameters during parallel mode, as I mentioned before[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBRRXbNJEvCjS-0XZgCEeRBzQPKmrSDjJ3wZ8TN28vaCQ%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
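In case it helps the discussion: FATAL errors bypass PG_CATCH, so one option might be a shared-memory exit callback registered by the autovacuum worker. The following is only a sketch; before_shmem_exit() is the existing ipc.c mechanism, while the callback, helper, and counter names are assumptions carried over from earlier in the thread.

	#include "postgres.h"
	#include "storage/ipc.h"

	/* Sketch: give back still-reserved parallel workers even on FATAL exit. */
	static void
	autovac_release_reserved_workers(int code, Datum arg)
	{
		if (av_nworkers_reserved > 0)
		{
			AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
			av_nworkers_reserved = 0;
		}
	}

	/* Registered once in AutoVacWorkerMain(), before any reservation is made: */
	before_shmem_exit(autovac_release_reserved_workers, (Datum) 0);

Such a callback runs during proc_exit(), so it also covers FATAL exits that never reach PG_CATCH().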