Re: Parallel copy
От | vignesh C |
---|---|
Тема | Re: Parallel copy |
Дата | |
Msg-id | CALDaNm2rRNnLoC0g9z9aDzpz1FoAanv5u00ri1d+E8Y3of5rxg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Parallel copy (Ashutosh Sharma <ashu.coek88@gmail.com>) |
Ответы |
Re: Parallel copy
Re: Parallel copy Re: Parallel copy |
Список | pgsql-hackers |
Thanks Ashutosh for your comments. On Wed, Sep 16, 2020 at 6:36 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Vignesh, > > I've spent some time today looking at your new set of patches and I've > some thoughts and queries which I would like to put here: > > Why are these not part of the shared cstate structure? > > SerializeString(pcxt, PARALLEL_COPY_KEY_NULL_PRINT, cstate->null_print); > SerializeString(pcxt, PARALLEL_COPY_KEY_DELIM, cstate->delim); > SerializeString(pcxt, PARALLEL_COPY_KEY_QUOTE, cstate->quote); > SerializeString(pcxt, PARALLEL_COPY_KEY_ESCAPE, cstate->escape); > I have used shared_cstate mainly to share the integer & bool data types from the leader to worker process. The above data types are of char* data type, I will not be able to use it like how I could do it for integer type. So I preferred to send these as separate keys to the worker. Thoughts? > I think in the refactoring patch we could replace all the cstate > variables that would be shared between the leader and workers with a > common structure which would be used even for a serial copy. Thoughts? > Currently we are using shared_cstate only to share integer & bool data types from leader to worker. Once worker retrieves the shared data for integer & bool data types, worker will copy it to cstate. I preferred this way because only for integer & bool we retrieve to shared_cstate & copy it to cstate and for rest of the members any way we are directly copying back to cstate. Thoughts? > Have you tested your patch when encoding conversion is needed? If so, > could you please point out the email that has the test results. > We have not yet done encoding testing, we will do and post the results separately in the coming days. > Apart from above, I've noticed some cosmetic errors which I am sharing here: > > +#define IsParallelCopy() (cstate->is_parallel) > +#define IsLeader() (cstate->pcdata->is_leader) > > This doesn't look to be properly aligned. > Fixed. > + shared_info_ptr = (ParallelCopyShmInfo *) > shm_toc_allocate(pcxt->toc, sizeof(ParallelCopyShmInfo)); > + PopulateParallelCopyShmInfo(shared_info_ptr, full_transaction_id); > > .. > > + /* Store shared build state, for which we reserved space. */ > + shared_cstate = (SerializedParallelCopyState > *)shm_toc_allocate(pcxt->toc, est_cstateshared); > > In the first case, while typecasting you've added a space between the > typename and the function but that is missing in the second case. I > think it would be good if you could make it consistent. > Fixed > Same comment applies here as well: > > + pg_atomic_uint32 line_state; /* line state */ > + uint64 cur_lineno; /* line number for error messages */ > +}ParallelCopyLineBoundary; > > ... > > + CommandId mycid; /* command id */ > + ParallelCopyLineBoundaries line_boundaries; /* line array */ > +} ParallelCopyShmInfo; > > There is no space between the closing brace and the structure name in > the first case but it is in the second one. So, again this doesn't > look consistent. > Fixed > I could also find this type of inconsistency in comments. See below: > > +/* It can hold upto 10000 record information for worker to process. RINGSIZE > + * should be a multiple of WORKER_CHUNK_COUNT, as wrap around cases > is currently > + * not handled while selecting the WORKER_CHUNK_COUNT by the worker. */ > +#define RINGSIZE (10 * 1000) > > ... > > +/* > + * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data > + * block to process to avoid lock contention. Read RINGSIZE comments before > + * changing this value. > + */ > +#define WORKER_CHUNK_COUNT 50 > > You may see these kinds of errors at other places as well if you scan > through your patch. Fixed. Please find the attached v5 patch which has the fixes for the same. Thoughts? Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
Вложения
- v5-0001-Copy-code-readjustment-to-support-parallel-copy.patch
- v5-0002-Framework-for-leader-worker-in-parallel-copy.patch
- v5-0003-Allow-copy-from-command-to-process-data-from-file.patch
- v5-0004-Documentation-for-parallel-copy.patch
- v5-0005-Tests-for-parallel-copy.patch
- v5-0006-Parallel-Copy-For-Binary-Format-Files.patch
В списке pgsql-hackers по дате отправления: