Re: Lockless queue of waiters in LWLock
От | Andres Freund |
---|---|
Тема | Re: Lockless queue of waiters in LWLock |
Дата | |
Msg-id | 20221031231517.wvkcj533c4rnm5zq@awork3.anarazel.de обсуждение исходный текст |
Ответ на | Lockless queue of waiters in LWLock (Pavel Borisov <pashkin.elfe@gmail.com>) |
Ответы |
Re: Lockless queue of waiters in LWLock
|
Список | pgsql-hackers |
Hi, Thanks for working on this - I think it's something we need to improve. On 2022-10-31 14:38:23 +0400, Pavel Borisov wrote: > If the lock state contains references to the queue head and tail, we can > implement a lockless queue of waiters for the LWLock. Adding new items to > the queue head or tail can be done with a single CAS operation (adding to > the tail will also require further setting the reference from the previous > tail). Given that there could be only one lock releaser, which wakes up > waiters in the queue, we can handle all the concurrency issues with > reasonable complexity. Right now lock releases happen *after* the lock is released. I suspect that is at least part of the reason for the regression you're seeing. It also looks like you're going to need a substantially higher number of atomic operations - right now the queue processing needs O(1) atomic ops, but your approach ends up with O(waiters) atomic ops. I suspect it might be worth going halfway, i.e. put the list head/tail in the atomic variable, but process the list with a lock held, after the lock is released. > Removing the queue spinlock bit and the corresponding contention should > give the performance gain on high concurrent LWLock usage scenarios. > Currently, the maximum size of procarray is 2^18 (as stated in > buf_internals.h), so with the use of the 64-bit LWLock state variable, we > can store the procarray indexes for both head and tail of the queue, all > the existing machinery, and even have some spare bits for future flags. Another advantage: It shrinks the LWLock struct, which makes it cheaper to make locks more granular... > I can not understand why the performance of a lockless queue patch has a > minor regression in the region of around 20 connections, even when compared > to the current master branch. I suspect that's the point where the additional atomic operations hurt the most, while not yet providing a substantial gain. Greetings, Andres Freund
В списке pgsql-hackers по дате отправления: