Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
От | Alvaro Herrera |
---|---|
Тема | Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] |
Дата | |
Msg-id | 20150122174601.GB1663@alvh.no-ip.org обсуждение исходный текст |
Ответ на | Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: TODO : Allow parallel cores to be used by vacuumdb [
WIP ]
|
Список | pgsql-hackers |
Here's v23. I reworked a number of things. First, I changed trivial stuff like grouping all the vacuuming options in a struct, to avoid passing an excessive number of arguments to functions. full, freeze, analyze_only, and_analyze and verbose are all in a single struct now. Also, the stage_commands and stage_messages was duplicated by your patch; I moved them to a file-level static struct. I made prepare_command reset the string buffer and receive an optional table name, so that it can append it to the generated command, and append the semicolon as well. Forcing the callers to reset the string before calling, and having them add the table name and semicolon afterwards was awkward and unnecessarily verbose. You had a new in_abort() function in common.c which seems an unnecessary layer; in its place I just exported the inAbort boolean flag it was returning, and renamed to CancelRequested. I was then troubled by the fact that vacuum_one_database() was being called in a loop by main() when multiple tables are vacuumed, but vacuum_parallel() was doing the loop internally. I found this discrepancy confusing, so I renamed that new function to vacuum_one_database_parallel and modified the original vacuum_one_database to do the loop internally as well. Now they are, in essence, a mirror of each other, one doing the parallel stuff and one doing it serially. This seems to make more sense to me -- but see below. I also modified some underlying stuff like GetIdleSlot returning a ParallelSlot pointer instead of an array index. Since its caller always has to dereference the array with the given index, it makes more sense to return the right element pointer instead, so I made it do that. Also, that way, instead of returning NO_SLOT in case of error it can just return NULL; no need for extra cognitive burden. I also changed select_loop. In your patch it had two implementations, one WIN32 and another one for the rest. It looks nicer to me to have only one with small exceptions in the places that need it. (I haven't tested the WIN32 path.) Also, instead of returning ERROR_IN_ABORT I made it set a boolean flag in case of error, which seems cleaner. I changed GetQueryResult as I described in a previous message. There are two things that continue to bother me and I would like you, dear patch author, to change them before committing this patch: 1. I don't like having vacuum_one_database() and a separate vacuum_one_database_parallel(). I think we should merge them into one function, which does either thing according to parameters. There's plenty in there that's duplicated. 2. in particular, the above means that run_parallel_vacuum can no longer exist as it is. Right now vacuum_one_database_parallel relies on run_parallel_vacuum to do the actual job parallellization. I would like to have that looping in the improved vacuum_one_database() function instead. Looking forward to v24, -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: