Re: Excessive number of replication slots for 12->14 logical replication
От | Amit Kapila |
---|---|
Тема | Re: Excessive number of replication slots for 12->14 logical replication |
Дата | |
Msg-id | CAA4eK1JK0zs0H27-i3Pko2na-xt2vrO2r6Y3mn+sfr9CWzobBg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Excessive number of replication slots for 12->14 logical replication (Ajin Cherian <itsajin@gmail.com>) |
Ответы |
Re: Excessive number of replication slots for 12->14 logical replication
|
Список | pgsql-bugs |
On Thu, Aug 25, 2022 at 7:38 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Wed, Aug 24, 2022 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Aug 23, 2022 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Aug 18, 2022 at 11:28 AM Ajin Cherian <itsajin@gmail.com> wrote: > > > > > > > > On Thu, Aug 18, 2022 at 3:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > Looks good. > > > > > > > > > > I have one minor comment: > > > > > > > > > > - * SUBREL_STATE_FINISHEDCOPY. The apply worker can also > > > > > - * concurrently try to drop the origin and by this time > > > > > - * the origin might be already removed. For these reasons, > > > > > - * passing missing_ok = true. > > > > > + * SUBREL_STATE_FINISHEDCOPY. So passing missing_ok = true. > > > > > > > > > > I think we should change "the apply worker" to "the tablesync worker" > > > > > but should not remove this sentence. The fact that another process > > > > > could concurrently try to drop the origin is still true. > > > > > > > > > > The rest looks good to me. > > > > > > > > > > > > > Updated as described. > > > > > > > > > > The patch looks good to me though I would like to test it a bit more > > > before pushing. > > > > > > > While testing/reviewing it further, I noticed that the patch has used > > missing_ok as true when dropping origin via tablesync worker. I don't > > think that is correct because the concurrent operations that remove > > origin like a refresh for the subscription take an access exclusive > > lock on pg_subscription which prevent the previous operation to update > > the rel state to SUBREL_STATE_SYNCDONE to succeed. So, I think we > > should pass missing_ok as false which would be consistent with slot > > handling. I have changed that and comments a few places. What do you > > think of the attached? > > > > Yes, this makes sense. I tested with a debugger to create simultaneous > alter subscription refresh publication AND tablesync worker > deleting the tracking origin. What I see is that the alter subscription command > gets the lock on SubscriptionRelRelationId and then actually goes on to > kill the tablesync worker. Even if the tablesync worker is waiting for > the same lock, > it looks like the SIGTERM is being handled and the worker is killed. > > From my testing, there doesn't seem to be a case where the tablesync worker > would try and drop a previously dropped origin. So, I think we can pass > missing_ok as 'false' while dropping origin. The patch looks good to me. > Thanks for the testing. I'll push this sometime early next week (by Tuesday) unless Sawada-San or someone else has any comments on it. -- With Regards, Amit Kapila.
В списке pgsql-bugs по дате отправления: