Re: [HACKERS] Interval for launching the table sync worker
От | Petr Jelinek |
---|---|
Тема | Re: [HACKERS] Interval for launching the table sync worker |
Дата | |
Msg-id | 8b02fa32-893a-9c68-1530-b46fc4b20578@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Interval for launching the table sync worker (Masahiko Sawada <sawada.mshk@gmail.com>) |
Ответы |
Re: [HACKERS] Interval for launching the table sync worker
|
Список | pgsql-hackers |
On 19/04/17 15:57, Masahiko Sawada wrote: > On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek > <petr.jelinek@2ndquadrant.com> wrote: >> On 19/04/17 14:42, Masahiko Sawada wrote: >>> On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI >>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >>>> At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com> >>>>> On 18/04/17 18:14, Peter Eisentraut wrote: >>>>>> On 4/18/17 11:59, Petr Jelinek wrote: >>>>>>> Hmm if we create hashtable for this, I'd say create hashtable for the >>>>>>> whole table_states then. The reason why it's list now was that it seemed >>>>>>> unnecessary to have hashtable when it will be empty almost always but >>>>>>> there is no need to have both hashtable + list IMHO. >>>> >>>> I understant that but I also don't like the frequent palloc/pfree >>>> in long-lasting context and double loop like Peter. >>>> >>>>>> The difference is that we blow away the list of states when the catalog >>>>>> changes, but we keep the hash table with the start times around. We >>>>>> need two things with different life times. >>>> >>>> On the other hand, hash seems overdone. Addition to that, the >>>> hash-version leaks stale entries while subscriptions are >>>> modified. But vacuuming them costs high. >>>> >>>>> Why can't we just update the hashtable based on the catalog? I mean once >>>>> the record is not needed in the list, the table has been synced so there >>>>> is no need for the timestamp either since we'll not try to start the >>>>> worker again. >>> >>> I guess the table sync worker for the same table could need to be >>> started again. For example, please image a case where the table >>> belonging to the publication is removed from it and the corresponding >>> subscription is refreshed, and then the table is added to it again. We >>> have the record of the table with timestamp in the hash table when the >>> table sync in the first time, but the table sync after refreshed could >>> have to wait for the interval. >>> >> >> But why do we want to wait in such case where user has explicitly >> requested refresh? >> > > Yeah, sorry, I meant that we don't want to wait but cannot launch the > tablesync worker in such case. > > But after more thought, the latest patch Peter proposed has the same > problem. Perhaps we need to scan always whole pg_subscription_rel and > remove the entry if the corresponding table get synced. > Yes that's what I mean by "Why can't we just update the hashtable based on the catalog". And if we do that then I don't understand why do we need both hastable and linked list if we need to update both based on catalog reads anyway. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления: