Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA
От | Sokolov Yura |
---|---|
Тема | Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA |
Дата | |
Msg-id | 5a35973617684f3b671f391e0a316e54@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA (Jesper Pedersen <jesper.pedersen@redhat.com>) |
Ответы |
Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA
Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA |
Список | pgsql-hackers |
Hi, Jesper Thank you for reviewing. On 2017-09-08 18:33, Jesper Pedersen wrote: > Hi, > > On 07/18/2017 01:20 PM, Sokolov Yura wrote: >> I'm sending rebased version with couple of one-line tweaks. >> (less skip_wait_list on shared lock, and don't update spin-stat on >> aquiring) >> > > I have been running this patch on a 2S/28C/56T/256Gb w/ 2 x RAID10 SSD > setup (1 to 375 clients on logged tables). > > I'm seeing > > -M prepared: Up to 11% improvement > -M prepared -S: No improvement, no regression ("noise") > -M prepared -N: Up to 12% improvement > > for all runs the improvement shows up the closer you get to the number > of CPU threads, or above. Although I'm not seeing the same > improvements as you on very large client counts there are definitely > improvements :) It is expected: - patch "fixes NUMA": for example, it doesn't give improvement on 1 socket at all (I've tested it using numactl to bind to 1 socket) - and certainly it gives less improvement on 2 sockets than on 4 sockets (and 28 cores vs 72 cores also gives difference), - one of hot points were CLogControlLock, and it were fixed with "Group mode for CLOG updates" [1] > > Some comments: > ============== > > lwlock.c: > --------- > > ... > > +static void > +add_lwlock_stats_spin_stat(LWLock *lock, SpinDelayStatus *delayStatus) > > Add a method description. Neighbor functions above also has no description, that is why I didn't add. But ok, I've added now. > > +static inline bool > +LWLockAttemptLockOrQueueSelf(LWLock *lock, LWLockMode mode, > LWLockMode waitmode) > > I'll leave it to the Committer to decide if this method is too big to > be "inline". GCC 4.9 doesn't want to inline it without directive, and function call is then remarkable in profile. Attach contains version with all suggestions applied except remove of "inline". > Open questions: > --------------- > * spins_per_delay as extern > * Calculation of skip_wait_list Currently calculation of skip_wait_list is mostly empirical (ie best i measured). I strongly think that instead of spins_per_delay something dependent on concrete lock should be used. I tried to store it in a LWLock itself, but it were worse. Recently I understand it should be stored in array indexed by tranche, but I didn't implement it yet, and therefore didn't measure. [1]: https://commitfest.postgresql.org/14/358/ -- With regards, Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления: