Re: Parallel Inserts in CREATE TABLE AS
От | Luc Vlaming |
---|---|
Тема | Re: Parallel Inserts in CREATE TABLE AS |
Дата | |
Msg-id | 9a31a93b-ea03-9cf6-e075-61766bfe2720@swarm64.com обсуждение исходный текст |
Ответ на | Re: Parallel Inserts in CREATE TABLE AS (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Список | pgsql-hackers |
On 05-01-2021 13:57, Bharath Rupireddy wrote: > On Tue, Jan 5, 2021 at 1:00 PM Luc Vlaming <luc@swarm64.com> wrote: >> Reviewing further v20-0001: >> >> I would still opt for moving the code for the parallel worker into a >> separate function, and then setting rStartup of the dest receiver to >> that function in ExecParallelGetInsReceiver, as its completely >> independent code. Just a matter of style I guess. > > If we were to have a intorel_startup_worker and assign it to > self->pub.rStartup, 1) we can do it in the CreateIntoRelDestReceiver, > we have to pass a parameter to CreateIntoRelDestReceiver as an > indication of parallel worker, which requires code changes in places > wherever CreateIntoRelDestReceiver is used. 2) we can also assign > intorel_startup_worker after CreateIntoRelDestReceiver in > ExecParallelGetInsReceiver, but that doesn't look good to me. 3) we > can duplicate CreateIntoRelDestReceiver and have a > CreateIntoRelParallelDestReceiver with the only change being that > self->pub.rStartup = intorel_startup_worker; > > IMHO, the way it is currently, looks good. Anyways, I'm open to > changing that if we agree on any of the above 3 ways. The current way is good enough, it was a suggestion as personally I find it hard to read to have two completely separate code paths in the same function. If any I would opt for something like 3) where there's a CreateIntoRelParallelDestReceiver which calls CreateIntoRelDestReceiver and then overrides rStartup to intorel_startup_worker. Then no callsites have to change except the ones that are for parallel workers. > > If we were to do any of the above, then we might have to do the same > thing for other commands Refresh Materialized View or Copy To where we > can parallelize. > > Thoughts? > >> Maybe I'm not completely following why but afaics we want parallel >> inserts in various scenarios, not just CTAS? I'm asking because code like >> + if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS) >> + pg_atomic_add_fetch_u64(&fpes->processed, >> queryDesc->estate->es_processed); >> seems very specific to CTAS. For now that seems fine but I suppose that >> would be generalized soon after? Basically I would have expected the if >> to compare against PARALLEL_INSERT_CMD_UNDEF. > > After this patch is reviewed and goes for commit, then the next thing > I plan to do is to allow parallel inserts in Refresh Materialized View > and it can be used for that. I think the processed variable can also > be used for parallel inserts in INSERT INTO SELECT [1] as well. > Currently, I'm keeping it for CTAS, maybe later (after this is > committed) it can be generalized. > > Thoughts? Sounds good > > [1] - https://www.postgresql.org/message-id/CAA4eK1LMmz58ej5BgVLJ8VsUGd%3D%2BKcaA8X%3DkStORhxpfpODOxg%40mail.gmail.com > >> Apart from these small things v20-0001 looks (very) good to me. >> v20-0003 and v20-0004: >> looks good to me. > > Thanks. > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com > Kind regards, Luc
В списке pgsql-hackers по дате отправления: