RE: Perform streaming logical transactions by background workers and parallel apply
От | houzj.fnst@fujitsu.com |
---|---|
Тема | RE: Perform streaming logical transactions by background workers and parallel apply |
Дата | |
Msg-id | OS0PR01MB5716F3493A051C5BB23CAD3C943F9@OS0PR01MB5716.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | RE: Perform streaming logical transactions by background workers and parallel apply ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>) |
Ответы |
RE: Perform streaming logical transactions by background workers and parallel apply
|
Список | pgsql-hackers |
On Monday, November 7, 2022 9:19 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Friday, November 4, 2022 7:45 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > On Fri, Nov 4, 2022 at 1:36 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > > > On Thu, Nov 3, 2022 at 6:36 PM houzj.fnst@fujitsu.com > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > Thanks for the analysis and summary ! > > > > > > > > I tried to implement the above idea and here is the patch set. > > > > > > > > > > Few comments on v42-0001 > > > =========================== > > > > > Thanks for the comments. > > > Few more comments on v42-0001 > > =============================== > > 1. In parallel_apply_send_data(), it seems winfo->serialize_changes > > and switching_to_serialize are set to indicate that we have changed > > parallel to serialize mode. Isn't using just the > > switching_to_serialize sufficient? Also, it would be better to name > > switching_to_serialize as parallel_to_serialize or something like > > that. > > I slightly change the logic to let serialize the message directly when timeout > instead of invoking apply_dispatch again so that we don't need the > switching_to_serialize. > > > > > 2. In parallel_apply_send_data(), the patch has already initialized > > the fileset, and then again in apply_handle_stream_start(), it will do > > the same if we fail while sending stream_start message to the parallel > > worker. It seems we don't need to initialize fileset again for > > TRANS_LEADER_PARTIAL_SERIALIZE state in apply_handle_stream_start() > > unless I am missing something. > > Fixed. > > > 3. > > apply_handle_stream_start(StringInfo s) { ... > > + if (!first_segment) > > + { > > + /* > > + * Unlock the shared object lock so that parallel apply worker > > + * can continue to receive and apply changes. > > + */ > > + parallel_apply_unlock(winfo->shared->stream_lock_id); > > ... > > } > > > > Can we have an assert before this unlock call that the lock must be > > held? Similarly, if there are other places then we can have assert > > there as well. > > It seems we don't have a standard API can be used without a transaction. > Maybe we can use the list ParallelApplyLockids to check that ? > > > 4. It is not very clear to me how maintaining ParallelApplyLockids > > list is helpful. > > I will think about this and remove this in next version list if possible. > > > > > 5. > > /* > > + * Handle STREAM START message when the transaction was spilled to disk. > > + * > > + * Inintialize fileset if not yet and open the file. > > + */ > > +void > > +serialize_stream_start(TransactionId xid, bool first_segment) { > > + /* > > + * Start a transaction on stream start, > > > > This function's name and comments seem to indicate that it is to > > handle stream_start message. Is that really the case? It is being > > called from parallel_apply_send_data() which made me think it can be > > used from other places as well. > > Adjusted the comment. > > Here is the new version patch set which addressed comments as of last Friday. > I also added some comments for the newly introduced codes in this version. > Sorry, I posted the wrong patch for V43 which lack some changes. Attach the correct patch set here. Best regards, Hou zj
Вложения
В списке pgsql-hackers по дате отправления: