Re: Patch: fix lock contention for HASHHDR.mutex
From: Robert Haas
Subject: Re: Patch: fix lock contention for HASHHDR.mutex
Date:
Msg-id: CA+Tgmobtf9nH566_jjs=jrTyMq5HdQdaRF5j7o+AbdOWQHE_Ow@mail.gmail.com
In reply to: Re: Patch: fix lock contention for HASHHDR.mutex (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Patch: fix lock contention for HASHHDR.mutex
           Re: Patch: fix lock contention for HASHHDR.mutex
List: pgsql-hackers
On Thu, Feb 11, 2016 at 9:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> The fact that InitLocks() doesn't do this has been discussed before
> and there's no consensus on changing it. It is, at any rate, a
> separate issue. I'll go through the rest of this patch again now.

I did a little bit of cleanup on this patch and the results are attached. I also did some benchmarking of this version. I tested this on two systems, in each case using five-minute, read-only pgbench runs at scale factor 3000, varying the client count and taking the median of three runs. First, I tested it on hydra, a 2-socket, 16-processor, 64-thread POWER box. This is a community resource hosted at OSUOSL. Second, I tested it on cthulhu, an 8-socket, 64-processor, 128-thread x86_64 box. This is an EnterpriseDB resource. Here are the results:

hydra, master vs. patched
1 client: 8042.872109 vs. 7839.587491 (-2.5%)
64 clients: 213311.852043 vs. 214002.314071 (+0.3%)
96 clients: 219551.356907 vs. 221908.397489 (+1.1%)
128 clients: 210279.022760 vs. 217974.079171 (+3.7%)

cthulhu, master vs. patched
1 client: 3407.705820 vs. 3645.129360 (+7.0%)
64 clients: 88667.681890 vs. 82636.914343 (-6.8%)
96 clients: 79303.750250 vs. 105442.993869 (+33.0%)
128 clients: 74684.510668 vs. 120984.938371 (+62.0%)

Obviously, the high-client-count results on cthulhu are stellar. I'm *assuming* that the regressions are just random variation. I am, however, wondering whether it would be better to set the freelist affinity based on something other than the hash value, like say the PID, so that the same process doesn't keep switching to a different freelist for every buffer eviction.

It also strikes me as quite likely that slock_t mutex[NUM_FREELISTS] is a poor way to lay out this data in memory. For example, on a system where slock_t is just one byte, most likely all of those mutexes are going to be in the same cache line, which means you're going to get a LOT of false sharing. It seems like it would be sensible to define a new struct that contains an slock_t, a long, and a HASHELEMENT *, and then make an array of those structs. That wouldn't completely eliminate false sharing, but it would reduce it quite a bit. My guess is that if you did that, you could reduce the number of freelists to 8 or fewer and get pretty much the same performance benefit that you're getting right now with 32. And that, too, seems likely to be good for single-client performance.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
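The last paragraph of the message only describes the proposed layout in words; the fragment below is a hypothetical sketch of that idea, not the attached patch. All names here (FreeListSlot, HASHHDR_sketch, CACHE_LINE_SIZE, the FREELIST_IDX_* macros) are illustrative, slock_t is stubbed out so the fragment compiles on its own, and padding each slot to a cache line via a union is just one possible way to keep neighbouring freelists from sharing a line. The two macros at the end contrast the hash-based list choice with the PID-based affinity floated earlier in the message.

    /*
     * Sketch only: bundle each freelist's spinlock, entry count, and chain of
     * free elements into one slot, padded so each slot occupies its own cache
     * line.  Names and sizes are assumptions, not the patch as posted.
     */
    typedef volatile int slock_t;           /* stand-in for PostgreSQL's slock_t */
    typedef struct HASHELEMENT HASHELEMENT; /* opaque here; real one is in hsearch.h */

    #define NUM_FREELISTS   8               /* guess: 8 padded lists vs. 32 unpadded */
    #define CACHE_LINE_SIZE 64              /* assumed cache-line size */

    typedef union FreeListSlot
    {
        struct
        {
            slock_t      mutex;             /* protects nentries and freeList */
            long         nentries;          /* entries charged to this list */
            HASHELEMENT *freeList;          /* chain of unused elements */
        }            data;
        char         pad[CACHE_LINE_SIZE];  /* force one slot per cache line */
    } FreeListSlot;

    /* Would replace slock_t mutex[NUM_FREELISTS] and the parallel arrays: */
    typedef struct HASHHDR_sketch
    {
        FreeListSlot freeList[NUM_FREELISTS];
        /* ... remaining HASHHDR fields ... */
    } HASHHDR_sketch;

    /* Freelist affinity: by hash value (as in the patch) vs. by backend PID
     * (the alternative suggested above) -- both hypothetical: */
    #define FREELIST_IDX_BY_HASH(hashvalue) ((hashvalue) % NUM_FREELISTS)
    #define FREELIST_IDX_BY_PID(pid)        ((unsigned int) (pid) % NUM_FREELISTS)

With a layout like this, each backend mostly touches the cache line of its own slot rather than a line shared by many one-byte spinlocks, which is why a smaller number of lists might deliver roughly the benefit currently seen with 32.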
Attachments