Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From Melih Mutlu
Subject Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date
Msg-id CAGPVpCRdDBRav4AR6NgAg+7HBfokppWwJkmjwzjWeuy1i7HYqA@mail.gmail.com
In reply to Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers

> Why after step 4, do you need to drop the replication slot? Won't just
> clearing the required info from the catalog be sufficient?

The replication slots that we read from the catalog will not be used for anything else once the table that the slot belongs to has been synced.
The slot is removed from the catalog when the sync completes, so it becomes a slot that is not linked to any table or worker. That's why I think it should be dropped rather than left behind.

Note that if a worker dies and its replication slot continues to exist, that slot will only be used to complete the sync of the one table that the dead worker was syncing but couldn't finish.
Once that table is synced and becomes ready, the replication slot has no further use.
 
> Hmm, I think even if there is an iota of a chance which I think is
> there, we can't use worker_pid. Assume, that if the same worker_pid is
> assigned to another worker once the worker using it got an error out,
> the new worker will fail as soon as it will try to create a
> replication slot.

Right. If something like that happens, the worker will fail without doing anything. Then a new one will be launched, and that one will continue the work.
The worst case would be hitting a conflicting PID over and over again while replication slots whose names include one of those PIDs still exist.
It seems unlikely but possible, yes.
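
To make the failure mode concrete, here is a minimal sketch of the PID-reuse collision, not the patch's actual code. It assumes slot names embed the worker PID and that creating a slot fails when the name is already taken. The name format, `SlotExistsError`, and the helper functions are all hypothetical.

```python
class SlotExistsError(Exception):
    """Raised when a slot with the requested name already exists."""


def slot_name_for_worker(subid: int, pid: int) -> str:
    # Illustrative format only; not the naming scheme used by the patch.
    return f"pg_{subid}_sync_{pid}"


def create_slot(existing_slots: set, name: str) -> None:
    # Model of slot creation: duplicate names are rejected.
    if name in existing_slots:
        raise SlotExistsError(name)
    existing_slots.add(name)


def launch_worker(existing_slots: set, subid: int, pid: int) -> bool:
    """Return True if the worker created its slot, False if it failed."""
    try:
        create_slot(existing_slots, slot_name_for_worker(subid, pid))
        return True
    except SlotExistsError:
        return False


slots = set()

# First worker creates its slot, then errors out and leaves the slot behind.
first_ok = launch_worker(slots, subid=16394, pid=4242)

# The OS hands the same PID to a new worker: slot creation collides.
reused_ok = launch_worker(slots, subid=16394, pid=4242)

# A relaunched worker with a different PID succeeds and continues the work.
retry_ok = launch_worker(slots, subid=16394, pid=4243)
```

In this model the colliding worker fails immediately and only a relaunch with a different PID makes progress, which is the behaviour described above.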
 
> I feel it would be better or maybe we need to think of some other
> identifier but one thing we need to think about before using a 64-bit
> unique identifier here is how will we retrieve its last used value
> after restart of server. We may need to store it in a persistent way
> somewhere.

We might consider storing this info in a catalog again. Since the last used value will be different for each subscription, pg_subscription could be a good place to keep it.
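
As a rough illustration of the idea, the sketch below keeps a per-subscription "last used" identifier in persistent storage and re-reads it on every call, so the value survives a restart. A JSON file stands in for a catalog column purely for demonstration. The function name, the file layout, and the OIDs are assumptions, not part of the patch.

```python
import json
import os
import tempfile

MAX_UINT64 = 2**64 - 1


def next_slot_id(state_path: str, subid: int) -> int:
    """Read, increment, and persist the last used id for one subscription."""
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)

    key = str(subid)  # JSON object keys are always strings
    value = state.get(key, 0) + 1
    if value > MAX_UINT64:
        raise OverflowError("64-bit identifier space exhausted")

    # Persist before returning so a crash cannot hand out the same id twice.
    state[key] = value
    with open(state_path, "w") as f:
        json.dump(state, f)
    return value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "subscription_state.json")
    a = next_slot_id(path, 16394)  # first id for subscription 16394
    b = next_slot_id(path, 16394)  # state is re-read from disk, as after a restart
    c = next_slot_id(path, 16500)  # a different subscription has its own counter
```

Each subscription gets an independent counter, which matches the point that the last used value differs per subscription.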
 
> The problems will be similar to the slot name. The origin is used to
> track the progress of replication, so, if we use the wrong origin name
> after the restart, it can send the wrong start_streaming position to
> the publisher.

I understand. But the origin naming logic is unchanged. Its format is pg_<subid>_<relid>.
I did not need to change this, since it seems to me that each origin should belong to only one table. The patch does not reuse origins.
So I don't think this change introduces an issue with origins. What do you think?
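
For reference, a small sketch of the pg_<subid>_<relid> format mentioned above. The function name and the OIDs are made up for illustration. The point is that the name depends only on the subscription and the table, so it can be recomputed identically after a restart.

```python
def tablesync_origin_name(subid: int, relid: int) -> str:
    """Build an origin name in the pg_<subid>_<relid> format."""
    return f"pg_{subid}_{relid}"


# Example OIDs are arbitrary.
name_a = tablesync_origin_name(16394, 16385)
name_b = tablesync_origin_name(16394, 16390)

# Recomputing the name later (e.g. after a restart) gives the same result.
name_a_again = tablesync_origin_name(16394, 16385)
```

Two tables under the same subscription get distinct origin names, and neither name depends on which worker or slot handled the sync.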

Thanks,
Melih
