Re: Perform streaming logical transactions by background workers and parallel apply
From | Masahiko Sawada |
---|---|
Subject | Re: Perform streaming logical transactions by background workers and parallel apply |
Date | |
Msg-id | CAD21AoDytm9ziQkGty81ugsHZmzNJ_DzYVNzFPVi-pSnP97k_w@mail.gmail.com |
In reply to | Re: Perform streaming logical transactions by background workers and parallel apply (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Perform streaming logical transactions by background workers and parallel apply |
List | pgsql-hackers |
On Mon, May 8, 2023 at 8:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 5, 2023 at 9:14 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wednesday, May 3, 2023 3:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > Attach another patch to fix the problem that pa_shutdown will access invalid
> > MyLogicalRepWorker. I personally want to avoid introducing new static variable,
> > so I only reorder the callback registration in this version.
> >
> > When testing this, I notice a rare case that the leader is possible to receive
> > the worker termination message after the leader stops the parallel worker. This
> > is unnecessary and have a risk that the leader would try to access the detached
> > memory queue. This is more likely to happen and sometimes cause the failure in
> > regression tests after the registration reorder patch because the dsm is
> > detached earlier after applying the patch.
> >
>
> I think it is only possible for the leader apply can worker to try to
> receive the error message from an error queue after your 0002 patch.
> Because another place already detached from the queue before stopping
> the parallel apply workers. So, I combined both the patches and
> changed a few comments and a commit message. Let me know what you
> think of the attached.

I have one comment on the part that detaches the error queue:

+    /*
+     * Detach from the error_mq_handle for the parallel apply worker before
+     * stopping it. This prevents the leader apply worker from trying to
+     * receive the message from the error queue that might already be detached
+     * by the parallel apply worker.
+     */
+    shm_mq_detach(winfo->error_mq_handle);
+    winfo->error_mq_handle = NULL;

In pa_detach_all_error_mq(), we try to detach the error queues of all
workers in the pool. I think we should check there whether the queue is
already detached (i.e. error_mq_handle is NULL). Otherwise, we will end up
with a SEGV if an error happens after detaching the error queue and before
removing the worker from the pool.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
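
For illustration, the NULL check suggested above could look roughly like the
following. This is a standalone sketch under stated assumptions, not the patch
itself: the shm_mq_handle struct, the mock shm_mq_detach(), the array-based
pool, and the name pa_detach_all_error_mq_sketch() are hypothetical stand-ins
so that the example compiles outside the PostgreSQL tree. Only the shape of
the guard (skip handles that are already NULL, and reset the handle after
detaching) is the point.

```c
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's opaque shm_mq_handle. */
typedef struct shm_mq_handle
{
    int         detached;
} shm_mq_handle;

/* Hypothetical, reduced worker info: only the field discussed here. */
typedef struct ParallelApplyWorkerInfo
{
    shm_mq_handle *error_mq_handle;
} ParallelApplyWorkerInfo;

/* Counts mock detach calls so the behavior can be observed. */
static int  detach_calls = 0;

/*
 * Mock of shm_mq_detach(). Like the case described in the mail, it
 * dereferences its argument, so passing NULL would crash.
 */
static void
shm_mq_detach(shm_mq_handle *mqh)
{
    mqh->detached = 1;
    detach_calls++;
}

/*
 * Detach the error queue of every worker in the pool, skipping workers
 * whose queue has already been detached (handle set to NULL).
 */
static void
pa_detach_all_error_mq_sketch(ParallelApplyWorkerInfo *pool, size_t nworkers)
{
    for (size_t i = 0; i < nworkers; i++)
    {
        ParallelApplyWorkerInfo *winfo = &pool[i];

        if (winfo->error_mq_handle != NULL)
        {
            shm_mq_detach(winfo->error_mq_handle);
            winfo->error_mq_handle = NULL;
        }
    }
}
```

With this guard, a worker whose error queue was already detached before it
was stopped (handle already NULL) is simply skipped, and calling the function
a second time is harmless.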