Re: Weird test mixup

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: Weird test mixup
Дата
Msg-id 20240501231214.40@rfd.leadboat.com
обсуждение исходный текст
Ответ на Re: Weird test mixup  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Weird test mixup
Список pgsql-hackers
While writing an injection point test, I encountered a variant of the race
condition that f4083c4 fixed.  It had three sessions and this sequence of
events:

s1: local-attach to POINT
s2: enter InjectionPointRun(POINT), yield CPU just before injection_callback()
s3: detach POINT, deleting the InjectionPointCondition record
s2: wake up and run POINT as though it had been non-local

On Sat, Mar 16, 2024 at 08:40:21AM +0900, Michael Paquier wrote:
> On Fri, Mar 15, 2024 at 11:23:31AM +0200, Heikki Linnakangas wrote:
> > Wrt. the spinlock and shared memory handling, I think this would be simpler
> > if you could pass some payload in the InjectionPointAttach() call, which
> > would be passed back to the callback function:
> > 
> > In this case, the payload would be the "slot index" in shared memory.
> > 
> > Or perhaps always allocate, say, 1024 bytes of working area for every
> > attached injection point that the test module can use any way it wants. Like
> > for storing extra conditions, or for the wakeup counter stuff in
> > injection_wait(). A fixed size working area is a little crude, but would be
> > very handy in practice.

That would be one good way to solve it.  (Storing a slot index has the same
race condition, but it fixes the race to store a struct containing the PID.)

The best alternative I see is to keep an InjectionPointCondition forever after
creating it.  Give it a "bool valid" field that we set on detach.  I don't see
a major reason to prefer one of these over the other.  One puts a negligible
amount of memory pressure on the main segment, but it simplifies the module
code.  I lean toward the "1024 bytes of working area" idea.  Other ideas or
opinions?


Separately, injection_points_cleanup() breaks the rules by calling
InjectionPointDetach() while holding a spinlock.  The latter has an
elog(ERROR), and reaching that elog(ERROR) leaves a stuck spinlock.  I haven't
given as much thought to solutions for this one.

Thanks,
nm



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dmitry Koval
Дата:
Сообщение: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Следующее
От: Jacob Champion
Дата:
Сообщение: Re: [PATCH] json_lex_string: don't overread on bad UTF8