Re: Single transaction in the tablesync worker?

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема Re: Single transaction in the tablesync worker?
Дата
Msg-id 03e167c7-ccdd-e036-90ab-8231e9f4c944@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On 06/02/2021 06:07, Amit Kapila wrote:
> On Sat, Feb 6, 2021 at 6:22 AM Peter Smith <smithpb2250@gmail.com> wrote:
>> On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
>> <petr.jelinek@enterprisedb.com> wrote:
>>>> +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char syncslotname[NAMEDATALEN])
>>>> +{
>>>> +     if (syncslotname)
>>>> +             sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
>>>> +     else
>>>> +             syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
>>>> +
>>>> +     return syncslotname;
>>>> +}
>>> Given that we are now explicitly dropping slots, what happens here if we
>>> have 2 different downstreams that happen to get same suboid and reloid,
>>> will one of the drop the slot of the other one? Previously with the
>>> cleanup being left to temp slot we'd at maximum got error when creating
>>> it but with the new logic in LogicalRepSyncTableStart it feels like we
>>> could get into situation where 2 downstreams are fighting over slot no?
>>>
> I think so. See, if the alternative suggested below works or if you
> have any other suggestions for the same?
>
>> The PG docs [1] says "there is only one copy of pg_subscription per
>> cluster, not one per database". IIUC that means it is not possible for
>> 2 different subscriptions to have the same suboid.
>>
> I think he is talking about two different clusters having separate
> subscriptions but point to the same publisher. In different clusters,
> we can get the same subid/relid. I think we need a cluster-wide unique
> identifier to distinguish among different subscribers. How about using
> the system_identifier stored in the control file (we can use
> GetSystemIdentifier to retrieve it).  I think one concern could be
> that adding that to slot name could exceed the max length of slot
> (NAMEDATALEN -1) but I don't think that is the case here
> (pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0')). Note last
> is system_identifier in this scheme.


Yep that's what I mean and system_identifier seems like a good choice to me.


> Do you guys think that works or let me know if you have any other
> better idea? Petr, is there a reason why such an identifier is not
> considered originally, is there any risk in it?


Originally it was not considered likely because it's all based on 
pglogical/BDR work where ids are hashes of stuff that's unique across 
group of instances, not counter based like Oids in PostgreSQL and I 
simply didn't realize it could be a problem until reading this patch :)


-- 
Petr Jelinek




В списке pgsql-hackers по дате отправления:

Предыдущее
От: "osumi.takamichi@fujitsu.com"
Дата:
Сообщение: RE: Single transaction in the tablesync worker?
Следующее
От: Petr Jelinek
Дата:
Сообщение: Re: pg_replication_origin_drop API potential race condition