Re: Slow catchup of 2PC (twophase) transactions on replica in LR
From | Amit Kapila |
---|---|
Subject | Re: Slow catchup of 2PC (twophase) transactions on replica in LR |
Date | |
Msg-id | CAA4eK1Khy_YWFoQ1HOF_tGtiixD8YoTg86coX1-ckxt8vK3U=Q@mail.gmail.com |
In response to | Re: Slow catchup of 2PC (twophase) transactions on replica in LR (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | RE: Slow catchup of 2PC (twophase) transactions on replica in LR |
List | pgsql-hackers |
On Mon, Jul 8, 2024 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I see that in 0003/0004, the patch first aborts pending prepared
> transactions, updates the catalog, and then changes the slot's property via
> walrcv_alter_slot. What if there is an ERROR (say the remote node is
> not reachable or there is an error while updating the catalog) after
> we abort the pending prepared transactions? Won't we end up with lost
> prepared transactions in such a case?
>

Considering the above is a problem, the other possibility I thought of is to change the order, i.e., abort the prepared xacts after the slot update. That is also dangerous because any failure while aborting could make the slot change permanent whereas the subscription option would still have the old value. Now, because the slot's two_phase property is off, at commit the publisher can resend the entire transaction, which can create a problem because the corresponding prepared transaction will already be present.

One more thing to think about in this regard: what if we fail after aborting a few of the prepared transactions but not all of them?

At this stage, I am not able to think of a good solution for these problems. So, if we don't find one, we can document that users should first manually abort the prepared transactions and then switch off the two_phase option using the ALTER SUBSCRIPTION command.

--
With Regards,
Amit Kapila.
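The failure cases discussed above can be illustrated with a small toy model. This is only a sketch in Python, not PostgreSQL code: `ToyState`, `run`, the step names, and the gids are all hypothetical. It assumes that the catalog update is rolled back when an ERROR occurs, while already-aborted prepared transactions and an already-altered remote slot are not undone.

```python
from dataclasses import dataclass


class StepFailed(Exception):
    """Stands in for any ERROR raised partway through the operation."""


@dataclass
class ToyState:
    prepared: set                 # prepared xacts pending on the subscriber
    sub_two_phase: bool = True    # two_phase option in the subscription catalog
    slot_two_phase: bool = True   # two_phase property of the remote slot


# Order used by the patch, and the alternative order considered in the mail.
PATCH_ORDER = ("abort", "catalog", "slot")
ALT_ORDER = ("slot", "catalog", "abort")


def run(order, state, fail_at=None, aborts_before_failure=0):
    """Apply the steps in order; the step named by fail_at raises an error."""
    try:
        for step in order:
            if step == "abort":
                for n, gid in enumerate(sorted(state.prepared)):
                    if fail_at == "abort" and n == aborts_before_failure:
                        raise StepFailed(step)
                    state.prepared.discard(gid)   # not undone on error
            elif step == fail_at:
                raise StepFailed(step)
            elif step == "catalog":
                state.sub_two_phase = False
            elif step == "slot":
                state.slot_two_phase = False      # remote change, not undone
    except StepFailed:
        # Assumption: only the local catalog change is transactional.
        state.sub_two_phase = True
    return state


lost = run(PATCH_ORDER, ToyState({"gid_a", "gid_b"}), fail_at="slot")
partial = run(ALT_ORDER, ToyState({"gid_a", "gid_b"}),
              fail_at="abort", aborts_before_failure=1)
```

In this model, `lost` ends with no prepared transactions while both two_phase settings are still on, which is the lost-transaction case. `partial` ends with the slot's two_phase off, the subscription option still on, and one prepared transaction left behind, which is the mismatch described for the alternative order.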