Re: increased max_parallel_workers_per_gather results in fewerworkers?
От | Philip Semanchuk |
---|---|
Тема | Re: increased max_parallel_workers_per_gather results in fewerworkers? |
Дата | |
Msg-id | B2259D14-3A5B-4FCC-B46C-6A9EDBCACCF0@americanefficient.com обсуждение исходный текст |
Ответ на | Re: increased max_parallel_workers_per_gather results in fewer workers? (Sebastian Dressler <sebastian@swarm64.com>) |
Список | pgsql-performance |
> On Jun 4, 2020, at 3:03 PM, Sebastian Dressler <sebastian@swarm64.com> wrote: > > Hi Philip, > >> On 4. Jun 2020, at 20:37, Philip Semanchuk <philip@americanefficient.com> wrote: >> >> [...] >>> >>>> This brings up a couple of questions — >>>> 1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table — >>>> >>>> max_workers = log3(table size / min_parallel_table_scan_size) >>>> >>>> Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the tablethat will be read before allocating workers? >>> >>> "table size" is the number of PSQL pages, i.e. relation-size / 8 kB. This comes from statistics. >> >> OK, so it sounds like the planner does *not* use the values in pg_stats when planning workers, true? > > Full disclosure: I am not too deep into these internals, likely others on the list know much more about it. But with respectto the relation size, I think this is tracked elsewhere, it might be affected by other parameters though like vacuumingand probably, the estimated amount of how much of the table is scanned also plays a role. I’m not too familiar with the internals either, but if I interpret this line of code correctly, it’s seems that pg_statsis not involved, and the worker allocation is based strictly on pages in the relation -- https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/allpaths.c#L800 That means I still don’t have a reason for why this query gets a different number of workers depending on the WHERE clause,but I can experiment with that more on my own. >> I’m still trying to understand one thing I’ve observed. I can run the query that produced the plan in the gist I linkedto above with max_parallel_workers_per_gather=6 and the year param = 2018, and I get 6 workers. When I set the yearparam=2022 I get only one worker. Same tables, same query, different parameter. That suggests to me that the planneris using pg_stats when allocating workers, but I can imagine there might be other things going on that I don’t understand.(I haven’t ruled out that this might be an AWS-specific quirk, either.) > > I think it would be helpful, if you could post again both plans. The ideal would be to use https://explain.dalibo.com/and share the links. You will have to generate them with JSON format, but still can anonymizethem. I really appreciate all the help you and others have already given. I think I’m good for now. Thank you so much, Philip
В списке pgsql-performance по дате отправления: