Re: ReplicationSlotRelease may set the statusFlags of other processes in PG14
From | Michael Paquier |
---|---|
Subject | Re: ReplicationSlotRelease may set the statusFlags of other processes in PG14 |
Date | |
Msg-id | ZfkNP1OdgBSPPTsR@paquier.xyz |
In response to | ReplicationSlotRelease may set the statusFlags of other processes in PG14 ("feichanghong" <feichanghong@qq.com>) |
Responses | Re: ReplicationSlotRelease may set the statusFlags of other processes in PG14 |
List | pgsql-bugs |
On Sat, Mar 16, 2024 at 10:29:03PM +0800, feichanghong wrote:
> A process utilizing replication slots (usually walsender) calls callback
> functions in the order of RemoveProcFromArray->ProcKill upon abnormal exit.
> Within RemoveProcFromArray, MyProc is already removed from the ProcArray.
> ProcKill then attempts to set ProcGlobal->statusFlags[MyProc->pgxactoff] again
> via ReplicationSlotRelease. By this time, the flag may already be assigned to
> another process.

Oops.

> To replicate the issue, execute the following steps:
> 1. Apply the attached v1-0000-v14-invalidate-pgxactoff-after-remove-pgproc.patch,
> where pgxactoff is set to an invalid value in ProcArrayRemove, and some
> checks are added.
> 2. Use the SQL below to terminate the walsender process.
> ```
> select pg_terminate_backend(pid) from pg_stat_activity where backend_type = 'walsender';
> ```
> # Fix
>
> To fix the issue, I have provided some patches in the attachment:
> 1. Backpatching 2f6501f into the PG14 version will fix the problem.
> 2. In PG14-head, ProcArrayRemove needs to reset pgxactoff, and some assert
> checks should be done when setting ProcGlobal->statusFlags.

Yeah, that's something that we had better fix in all stable branches.
The asserts would offer some protection moving on, but I would take
the safer move of only adding a protection like what you are
suggesting on HEAD and not in stable branches, just in case we're
missing something around them.
--
Michael
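
For readers following along, here is a minimal standalone sketch of the failure mode described in the report. It is a toy model, not PostgreSQL source: every type, function, and constant prefixed with Toy/toy_/TOY_ is hypothetical, and the flag values are chosen only for illustration. It assumes a dense array in which removing an entry shifts the later entries down, so an exiting process that still holds its old offset ends up writing into a slot that now belongs to someone else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model only: all names and values below are hypothetical. */
#define TOY_MAX_PROCS 8
#define TOY_INVALID_OFFSET (-1)
#define TOY_FLAG_IN_VACUUM 0x02
#define TOY_FLAG_IN_LOGICAL_DECODING 0x10

typedef struct ToyProc
{
    int     pgxactoff;      /* index into the dense shared array */
    uint8_t statusFlags;    /* process-local copy of the flags */
} ToyProc;

typedef struct ToyProcArray
{
    int      numProcs;
    ToyProc *procs[TOY_MAX_PROCS];
    uint8_t  statusFlags[TOY_MAX_PROCS];    /* dense mirror of the flags */
} ToyProcArray;

/* Append a process and publish its flags in the dense array. */
static void toy_add(ToyProcArray *arr, ToyProc *proc)
{
    assert(arr->numProcs < TOY_MAX_PROCS);
    proc->pgxactoff = arr->numProcs;
    arr->procs[arr->numProcs] = proc;
    arr->statusFlags[arr->numProcs] = proc->statusFlags;
    arr->numProcs++;
}

/* Remove a process, shifting later entries down to keep the array dense. */
static void toy_remove(ToyProcArray *arr, ToyProc *proc, bool invalidate)
{
    for (int i = proc->pgxactoff; i < arr->numProcs - 1; i++)
    {
        arr->procs[i] = arr->procs[i + 1];
        arr->statusFlags[i] = arr->statusFlags[i + 1];
        arr->procs[i]->pgxactoff = i;
    }
    arr->numProcs--;
    arr->procs[arr->numProcs] = NULL;
    arr->statusFlags[arr->numProcs] = 0;

    /* The protection discussed in the thread: forget the stale offset. */
    if (invalidate)
        proc->pgxactoff = TOY_INVALID_OFFSET;
}

/* Clear a flag locally, then copy the local flags to the shared slot. */
static bool toy_clear_flag(ToyProcArray *arr, ToyProc *proc, uint8_t flag)
{
    proc->statusFlags &= (uint8_t) ~flag;
    if (proc->pgxactoff == TOY_INVALID_OFFSET)
        return false;       /* no longer in the array, nothing to publish */
    arr->statusFlags[proc->pgxactoff] = proc->statusFlags;
    return true;
}

/* Returns the shared flags of the surviving process after the exit path. */
static uint8_t toy_demo(bool invalidate)
{
    ToyProcArray arr = {0};
    ToyProc walsender = {0, TOY_FLAG_IN_LOGICAL_DECODING};
    ToyProc vacuum = {0, TOY_FLAG_IN_VACUUM};

    toy_add(&arr, &walsender);                  /* offset 0 */
    toy_add(&arr, &vacuum);                     /* offset 1 */
    toy_remove(&arr, &walsender, invalidate);   /* vacuum moves to offset 0 */
    toy_clear_flag(&arr, &walsender, TOY_FLAG_IN_LOGICAL_DECODING);
    return arr.statusFlags[vacuum.pgxactoff];
}
```

In this model, toy_demo(false) returns 0 because the late write through the stale offset wipes the surviving process's shared flag, while toy_demo(true) returns TOY_FLAG_IN_VACUUM because the invalidated offset turns the late write into a no-op. The stricter variant mentioned in the thread would assert on the offset instead of silently skipping the write.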