[PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0)
From | Martijn van Oosterhout
Subject | [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0)
Date |
Msg-id | CADWG95uFj8rLM52Er80JnhRsTbb_AqPP1ANHS8XQRGbqLrU+jA@mail.gmail.com
Responses | Re: [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0)
List | pgsql-hackers
Hoi hackers,

We've been having issues with NOTIFYs blocking across multiple databases (see [1] for more details). That was on 9.4; we've since updated the database to 11.3 and still have the same issue. This time, however, we could use perf to do profiling, and got the following profile (useless details elided):

    --32.83%--ProcessClientReadInterrupt
       --32.68%--ProcessNotifyInterrupt
          --32.16%--asyncQueueReadAllNotifications
             --23.37%--asyncQueueAdvanceTail
                --20.49%--LWLockAcquire
                   --18.93%--LWLockQueueSelf
                      --12.99%--LWLockWaitListLock

(from: perf record -F 99 -ag -- sleep 600)

That shows that more than 20% of the time is spent in that single function, waiting for an exclusive lock on AsyncQueueLock. This will block any concurrent session doing a NOTIFY in any database on the system, which would certainly explain the symptoms we're seeing (process xxx still waiting for AccessExclusiveLock on object 0 of class 1262 of database 0).

Analysis of the code leads me to the following hypothesis (and hence to the attached patches):

We have ~150 databases, each of which has 2 backends with an active LISTEN. When a NOTIFY happens anywhere in any database, it (under an exclusive lock) builds a list of 300 backends to send a signal to, and then wakes up all of those backends. Each backend examines the message, and all but one discard it as being for the wrong database. Each backend then calls asyncQueueAdvanceTail (because the current position of each backend was the tail), which takes an exclusive lock and checks all the other backends to see whether the tail can be advanced. All of them conclude 'no', except the very last one, which concludes the tail can be advanced by about 50 bytes or so.

So the inner loop of asyncQueueAdvanceTail will, while holding a global exclusive lock, execute 2*150*4000 (max backends) = 1.2 million times for basically no benefit. During this time, no other transaction anywhere in the system that does a NOTIFY will be able to commit.

The attached patches attempt to reduce the overhead in two ways (rough sketches of both ideas follow at the end of this mail):

Patch 1: Changes asyncQueueAdvanceTail to do nothing unless QUEUE_HEAD is on a different page than QUEUE_TAIL. The idea is that there's no point trying to advance the tail unless we can actually usefully truncate the SLRU. This does, however, mean that asyncQueueReadAllNotifications always has to call asyncQueueAdvanceTail, since it can no longer be guaranteed that any backend is still at the tail, which is one of the assumptions of the current code. Not sure if this is a problem, or whether it can be improved without tracking much more state.

Patch 2: Changes SignalBackends to only signal other backends when (a) they're in the same database as me, or (b) the notify queue has advanced to a new SLRU page. This avoids backends being woken up for messages they are not interested in.

As a consequence of these changes, we can reduce the number of exclusive locks and backend wake-ups in our case by a factor of 300. You do still get a thundering herd at the end of each SLRU page, however.

Note: these patches have not yet been extensively tested, and should only be used as a basis for discussion.

Comments? Suggestions?

[1] https://www.postgresql.org/message-id/CADWG95t0j9zF0uwdcMH81KMnDsiTAVHxmBvgYqrRJcD-iLwQhw@mail.gmail.com

--
Martijn van Oosterhout <kleptog@gmail.com> http://svana.org/kleptog/
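To make the idea behind Patch 1 concrete, here is a minimal sketch of the kind of early-exit guard it describes, at the top of asyncQueueAdvanceTail(). This is an illustration only, not the attached patch; the names (QUEUE_HEAD, QUEUE_TAIL, QUEUE_POS_PAGE, AsyncQueueLock) are the existing ones in src/backend/commands/async.c as of 11.3:

    /* Sketch only: skip the whole tail-advance dance if the head is still
     * on the same SLRU page as the tail.  Truncation happens in whole
     * pages/segments, so nothing useful can be freed in that case. */
    LWLockAcquire(AsyncQueueLock, LW_SHARED);
    if (QUEUE_POS_PAGE(QUEUE_HEAD) == QUEUE_POS_PAGE(QUEUE_TAIL))
    {
        LWLockRelease(AsyncQueueLock);
        return;
    }
    LWLockRelease(AsyncQueueLock);

    /* ... otherwise fall through to the existing code: take AsyncQueueLock
     * exclusively, scan all backend slots for the minimum queue position,
     * advance QUEUE_TAIL, and truncate the SLRU if the tail crossed a
     * segment boundary ... */

The point is that the O(MaxBackends) scan under the exclusive lock is only paid when it can actually lead to a truncation.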
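Similarly, a rough sketch of the Patch 2 filter, inside the loop in SignalBackends() that collects the backends to signal. Again, this is only my illustration of conditions (a) and (b) above, not the attached patch; pid, pids, ids and count are the function's existing local variables:

    /* Inside SignalBackends(), holding AsyncQueueLock, for each backend
     * slot i: */
    pid = QUEUE_BACKEND_PID(i);
    if (pid != InvalidPid && pid != MyProcPid)
    {
        QueuePosition pos = QUEUE_BACKEND_POS(i);

        /* Already caught up, nothing to tell it. */
        if (QUEUE_POS_EQUAL(pos, QUEUE_HEAD))
            continue;

        /* Sketch of the new filter: wake the backend only if (a) it is
         * listening in the same database as the notifying backend, so it
         * can actually consume the new entries, or (b) it is lagging on an
         * older SLRU page, in which case it must advance so the tail can
         * eventually be truncated. */
        if (QUEUE_BACKEND_DBOID(i) == MyDatabaseId ||
            QUEUE_POS_PAGE(pos) != QUEUE_POS_PAGE(QUEUE_HEAD))
        {
            pids[count] = pid;
            ids[count] = i;
            count++;
        }
    }

Everything else in SignalBackends() stays as it is; only the choice of which backends get PROCSIG_NOTIFY_INTERRUPT changes.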
Attachments