RE: Slow catchup of 2PC (twophase) transactions on replica in LR

From: Hayato Kuroda (Fujitsu)
Subject: RE: Slow catchup of 2PC (twophase) transactions on replica in LR
Msg-id: OSBPR01MB2552AF02B039F00EF7791401F5DA2@OSBPR01MB2552.jpnprd01.prod.outlook.com
In response to: RE: Slow catchup of 2PC (twophase) transactions on replica in LR  ("Vitaly Davydov" <v.davydov@postgrespro.ru>)
Responses: Re: Slow catchup of 2PC (twophase) transactions on replica in LR
List: pgsql-hackers
Dear Vitaly,

Thanks for the comments! Please see the attached new version of the patch.

> Thank you very much for the patch. In general, it seems to work well for me, but
> there seems to be a memory access problem in libpqrcv_alter_slot ->
> quote_identifier in case of a NULL slot_name. It happens if the two_phase option
> is altered on a subscription without a slot. I think a simple check for NULL may
> fix the problem. I guess the same problem may exist for the failover option.

You are right. The failover option already requires that slot_name is valid.
For two_phase, we need to connect to the publisher only when altering the option
from "true" to "false", so slot_name must be set only in that case. Updated.
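
As an illustration only (this is not the code in the patch), the rule above can
be sketched in plain C. All names here (quoted_length, needs_publisher,
alter_two_phase) are hypothetical; quoted_length is just a stand-in for
quote_identifier(), which reads its argument and so must never receive NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for quote_identifier() (hypothetical): it reads the string,
 * so passing NULL is a programming error.
 */
static size_t
quoted_length(const char *ident)
{
    assert(ident != NULL);
    return strlen(ident) + 2;   /* room for the surrounding double quotes */
}

/* The publisher must be contacted only for a true -> false change. */
static bool
needs_publisher(bool old_two_phase, bool new_two_phase)
{
    return old_two_phase && !new_two_phase;
}

/*
 * Hypothetical driver for altering two_phase.
 * Returns 0 for a local-only change (slot_name may be NULL),
 * -1 if the change needs the publisher but no slot is set,
 * otherwise the length of the quoted slot name that would be sent.
 */
static int
alter_two_phase(bool old_two_phase, bool new_two_phase, const char *slot_name)
{
    if (!needs_publisher(old_two_phase, new_two_phase))
        return 0;

    /* Check before quoting, so NULL never reaches quoted_length(). */
    if (slot_name == NULL)
        return -1;

    return (int) quoted_length(slot_name);
}
```

In this sketch, a subscription without a slot can still change two_phase as long
as the change does not require the publisher; only the true -> false case is
rejected when slot_name is NULL.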

> Another possible problem is related to my use case. I haven't reproduced this
> case, just some thoughts. I guess, when two_phase is ON, the PREPARE statement
> may be truncated from the WAL at checkpoint, but COMMIT PREPARED is still kept
> in the WAL. On catchup, I would ask the master to send transactions from some
> restart LSN. I would like to get all such transactions completely, with their
> bodies, not only COMMIT PREPARED messages.

I don't think this is a real issue. WAL for prepared transactions is retained
until they are committed or aborted.
When two_phase is on and transactions are PREPAREd, they are not cleaned up
from memory (see ReorderBufferProcessTXN()). A RUNNING_XACTS record then
triggers an update of the slot's restart_lsn, but it cannot move forward
because ReorderBufferGetOldestTXN() returns the prepared transaction (see
SnapBuildProcessRunningXacts()). The restart_decoding_lsn of each transaction,
which is a candidate for the slot's restart_lsn, is always behind the start
point of its txn.
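
The reasoning above can be shown with a toy model in C. This is a simplified
sketch, not PostgreSQL code: ToyTxn, oldest_tracked_txn and
candidate_restart_lsn are hypothetical names, and the real logic in
SnapBuildProcessRunningXacts() has more cases than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* toy stand-in for a WAL position */

typedef struct ToyTxn
{
    Lsn  first_lsn;             /* where the transaction starts in WAL */
    Lsn  restart_decoding_lsn;  /* safe restart point, never after first_lsn */
    bool finished;              /* true once committed or aborted */
} ToyTxn;

/*
 * Toy analogue of ReorderBufferGetOldestTXN(): the oldest transaction
 * that is still tracked. A prepared but unfinished txn stays tracked.
 */
static const ToyTxn *
oldest_tracked_txn(const ToyTxn *txns, size_t n)
{
    const ToyTxn *oldest = NULL;

    for (size_t i = 0; i < n; i++)
    {
        if (txns[i].finished)
            continue;
        if (oldest == NULL || txns[i].first_lsn < oldest->first_lsn)
            oldest = &txns[i];
    }
    return oldest;
}

/*
 * Candidate restart_lsn when a running-xacts record at record_lsn is seen.
 * Assumption of this toy model: with nothing tracked, the candidate may
 * advance up to the record itself.
 */
static Lsn
candidate_restart_lsn(const ToyTxn *txns, size_t n, Lsn record_lsn)
{
    const ToyTxn *oldest = oldest_tracked_txn(txns, n);

    if (oldest == NULL)
        return record_lsn;

    /* The candidate is always at or before the start of the oldest txn. */
    assert(oldest->restart_decoding_lsn <= oldest->first_lsn);
    return oldest->restart_decoding_lsn;
}
```

In this model, while a transaction that started at LSN 100 is still only
prepared, the candidate stays at its restart_decoding_lsn, so the WAL holding
the transaction body is still needed by the slot. Only after it is finished can
the candidate move forward.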

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 

