Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers |
Date | |
Msg-id | 84c22fbb-b9c4-a02f-384b-b4feb2c67193@2ndquadrant.com |
In response to | Re: Speed up Clog Access by increasing CLOG buffers (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
|
List | pgsql-hackers |
Hi,

> The attached results show that:
>
> (a) master shows the same zig-zag behavior - No idea why this wasn't
> observed on the previous runs.
>
> (b) group_update actually seems to improve the situation, because the
> performance keeps stable up to 72 clients, while on master the
> fluctuation starts way earlier.
>
> I'll redo the tests with a newer kernel - this was on 3.10.x which is
> what Red Hat 7.2 uses, I'll try on 4.8.6. Then I'll try with the patches
> you submitted, if the 4.8.6 kernel does not help.
>
> Overall, I'm convinced this issue is unrelated to the patches.

I've been unable to rerun the tests on this hardware with a newer kernel, so nothing new on the x86 front.

But as discussed with Amit in Tokyo at pgconf.asia, I got access to a Power8e machine (IBM 8247-22L to be precise). It's a much smaller machine compared to the x86 one, though - it only has 24 cores in 2 sockets, 128GB of RAM and less powerful storage, for example.

I've repeated a subset of x86 tests and pushed them to

    https://bitbucket.org/tvondra/power8-results-2

The new results are prefixed with "power-" and I've tried to put them right next to the "same" x86 tests.

In all cases the patches significantly reduce the contention on CLogControlLock, just like on x86. Which is good and expected.

Otherwise the results are rather boring - no major regressions compared to master, and all the patches perform almost exactly the same. Compare for example this:

* http://tvondra.bitbucket.org/#dilip-300-unlogged-sync

* http://tvondra.bitbucket.org/#power-dilip-300-unlogged-sync

So the results seem much smoother compared to x86, and the performance difference is roughly 3x, which matches the 24 vs. 72 cores.

For pgbench, the difference is much more significant, though:

* http://tvondra.bitbucket.org/#pgbench-300-unlogged-sync-skip

* http://tvondra.bitbucket.org/#power-pgbench-300-unlogged-sync-skip

So, we're doing ~40k on Power8, but 220k on x86 (which is ~6x more, so double per-core throughput).

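For anyone who wants to check the per-core arithmetic, here is a minimal sketch using only the rounded figures quoted above (24 vs. 72 cores, ~40k vs. 220k). It assumes those figures are pgbench transactions per second, which the message does not state explicitly, and the variable names are purely illustrative.

```python
# Rounded figures taken from the message. The unit (tps) is an assumption.
machines = {
    "power8": {"cores": 24, "tps": 40_000},
    "x86": {"cores": 72, "tps": 220_000},
}


def per_core_tps(machine):
    """Throughput divided by core count for one machine."""
    return machine["tps"] / machine["cores"]


# Ratio of raw throughput, x86 over Power8.
total_ratio = machines["x86"]["tps"] / machines["power8"]["tps"]

# Ratio of core counts, x86 over Power8.
core_ratio = machines["x86"]["cores"] / machines["power8"]["cores"]

# Ratio of per-core throughput, x86 over Power8.
per_core_ratio = per_core_tps(machines["x86"]) / per_core_tps(machines["power8"])

print(f"total throughput ratio: {total_ratio:.2f}x")
print(f"core count ratio:       {core_ratio:.2f}x")
print(f"per-core ratio:         {per_core_ratio:.2f}x")
```

With these rounded inputs the total ratio comes out at 5.5x and the per-core ratio at about 1.8x, which is where the "~6x" and "double per-core" estimates come from.
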
My first guess was that this is due to the x86 machine having a better I/O subsystem, so I've rerun the tests with the data directory in tmpfs, but that produced almost the same results.

Of course, this observation is unrelated to this patch.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services