Re: Improving spin-lock implementation on ARM.
From | Amit Khandekar |
---|---|
Subject | Re: Improving spin-lock implementation on ARM. |
Date | |
Msg-id | CAJ3gD9eO3P-W+skRQHbyUQDytiz7gkS8b8PknwkDi0uvRw7fGA@mail.gmail.com |
In response to | Re: Improving spin-lock implementation on ARM. (Krunal Bauskar <krunalbauskar@gmail.com>) |
List | pgsql-hackers |
On Thu, 26 Nov 2020 at 10:55, Krunal Bauskar <krunalbauskar@gmail.com> wrote:
> Hardware: ARM Kunpeng 920 BareMetal Server 2.6 GHz. 64 cores (56 cores for server and 8 for client) [2 numa nodes]
> Storage: 3.2 TB NVMe SSD
> OS: CentOS Linux release 7.6
> PGSQL: baseline = Release Tag 13.1
> Invocation suite: https://github.com/mysqlonarm/benchmark-suites/tree/master/pgsql-pbench (Uses pgbench)

Using the same hardware, attached are my improvement figures, which are pretty much in line with your figures, except that I did not run with more than 400 clients. Also, I am getting some improvement even for select-only workloads, in the case of 200-400 clients. For the read-write load, I had seen that the s_lock() contention was caused when XLogFlush() uses the spinlock. But for the read-only case, I have not analyzed where the improvement occurred.

The .png files in the attached tar have the graphs for head versus patch.

The GUCs that I changed:

work_mem = 64MB
shared_buffers = 128GB
maintenance_work_mem = 1GB
min_wal_size = 20GB
max_wal_size = 100GB
checkpoint_timeout = 60min
checkpoint_completion_target = 0.9
full_page_writes = on
synchronous_commit = on
effective_io_concurrency = 200
log_checkpoints = on

For backends, 64 CPUs were allotted (covering 2 NUMA nodes), and for pgbench clients a separate set of 28 CPUs was allotted on a different socket. The server was pre_warmed().
Attachments