Re: Atomic operations within spinlocks
От | Andres Freund |
---|---|
Тема | Re: Atomic operations within spinlocks |
Дата | |
Msg-id | 20200609060847.vd2c67ih4bz6vyww@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: Atomic operations within spinlocks (Andres Freund <andres@anarazel.de>) |
Ответы |
global barrier & atomics in signal handlers (Re: Atomic operationswithin spinlocks)
|
Список | pgsql-hackers |
On 2020-06-05 17:19:26 -0700, Andres Freund wrote: > Hi, > > On 2020-06-04 19:33:02 -0700, Andres Freund wrote: > > But it looks like that code is currently buggy (and looks like it always > > has been), because we don't look at the nested argument when > > initializing the semaphore. So we currently allocate too many > > semaphores, without benefiting from them :(. > > I wrote a patch for this, and when I got around to to testing it, I > found that our tests currently don't pass when using both > --disable-spinlocks and --disable-atomics. Turns out to not be related > to the issue above, but the global barrier support added in 13. > > That *reads* two 64 bit atomics in a signal handler. Which is normally > fine, but not at all cool when atomics (or just 64 bit atomics) are > backed by spinlocks. Because we can "self interrupt" while already > holding the spinlock. > > It looks to me that that's a danger whenever 64bit atomics are backed by > spinlocks, not just when both --disable-spinlocks and --disable-atomics > are used. But I suspect that it's really hard to hit the tiny window of > danger when those options aren't used. While we have buildfarm animals > testing each of those separately, we don't have one that tests both > together... > > I'm not really sure what to do about that issue. The easisest thing > would probably be to change the barrier generation to 32bit (which > doesn't have to use locks for reads in any situation). I tested doing > that, and it fixes the hangs for me. > > > Randomly noticed while looking at the code: > uint64 flagbit = UINT64CONST(1) << (uint64) type; > > that shouldn't be 64bit, right? Attached is a series of patches addressing these issues, of varying quality: 1) This fixes the above mentioned issue in the global barrier code by using 32bit atomics. That might be fine, or it might not. I just included it here because otherwise the tests cannot be run fully. 2) Fixes spinlock emulation when more than INT_MAX spinlocks are initialized in the lifetime of a single backend 3) Add spinlock tests to normal regression tests. - Currently as part of test_atomic_ops. Probably not worth having a separate SQL function? - Currently contains a test for 1) that's run when the spinlock emulation is used. Probably too slow to actually indclude? Takes 15s on my computer... OTOH, it's just with --disable-spinlocks... - Could probably remove the current spinlock tests after this. The only thing they additionally test is a stuck spinlock. Since they're not run currently, they don't seem worth much? 4) Fix the potential for deadlocks when using atomics while holding a spinlock, add tests for that. Any comments? Greetings, Andres Freund
Вложения
В списке pgsql-hackers по дате отправления: