Re: What exactly is our CRC algorithm?
From | Abhijit Menon-Sen |
---|---|
Subject | Re: What exactly is our CRC algorithm? |
Date | |
Msg-id | 20141119155811.GA32492@toroid.org |
In response to | Re: What exactly is our CRC algorithm? (Abhijit Menon-Sen <ams@2ndQuadrant.com>) |
Responses | Re: What exactly is our CRC algorithm? |
List | pgsql-hackers |
At 2014-11-11 16:56:00 +0530, ams@2ndQuadrant.com wrote:
>
> I'm working on this (first speeding up the default calculation using
> slice-by-N, then adding support for the SSE4.2 CRC instruction on
> top).

I've done the first part in the attached patch, and I'm working on the
second (especially the bits to issue CPUID at startup and decide which
implementation to use).

As a benchmark, I ran pg_xlogdump --stats against 11GB of WAL data (674
segments) generated by running a total of 2M pgbench transactions on a
db initialised with scale factor 25. The tests were run on my i5-3230
CPU, and the code in each case was compiled with "-O3 -msse4.2" (and
without --enable-debug). The profile was dominated by the CRC
calculation in ValidXLogRecord.

With HEAD's CRC code:

    bin/pg_xlogdump --stats wal/000000010000000000000001  29.81s user 3.56s system 77% cpu 43.274 total
    bin/pg_xlogdump --stats wal/000000010000000000000001  29.59s user 3.85s system 75% cpu 44.227 total

With slice-by-4 (a minor variant of the attached patch; the results are
included only for curiosity's sake, but I can post the code if needed):

    bin/pg_xlogdump --stats wal/000000010000000000000001  13.52s user 3.82s system 48% cpu 35.808 total
    bin/pg_xlogdump --stats wal/000000010000000000000001  13.34s user 3.96s system 48% cpu 35.834 total

With slice-by-8 (i.e. the attached patch):

    bin/pg_xlogdump --stats wal/000000010000000000000001  7.88s user 3.96s system 34% cpu 34.414 total
    bin/pg_xlogdump --stats wal/000000010000000000000001  7.85s user 4.10s system 34% cpu 35.001 total

(Note the progressive reduction in user time from ~29s to ~8s.)

Finally, just for comparison, here's what happens when we use the
hardware instruction via gcc's __builtin_ia32_crc32xx intrinsics (i.e.
the additional patch I'm working on):

    bin/pg_xlogdump --stats wal/000000010000000000000001  3.33s user 4.79s system 23% cpu 34.832 total

There are a number of potential micro-optimisations, I just wanted to
submit the obvious thing first and explore more possibilities later.

-- Abhijit
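
For readers who have not met slice-by-N before, here is a minimal sketch of the slice-by-8 idea next to the plain byte-at-a-time loop. It is not the attached patch. It assumes the reflected CRC-32C (Castagnoli) polynomial, which is the one the SSE4.2 CRC instruction computes, and every name in it (crc_table, crc_init_tables, crc_bytewise, crc_slice8, crc32c) is made up for illustration. Input bytes are assembled by hand so the result does not depend on host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: reflected CRC-32C polynomial; not taken from the patch. */
#define CRC_POLY_REFLECTED 0x82F63B78u

static uint32_t crc_table[8][256];
static int	crc_tables_ready = 0;

/* Build table 0 bit by bit, then derive tables 1..7 from it. */
static void
crc_init_tables(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	crc = i;

		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC_POLY_REFLECTED : 0);
		crc_table[0][i] = crc;
	}
	for (int k = 1; k < 8; k++)
		for (uint32_t i = 0; i < 256; i++)
		{
			uint32_t	prev = crc_table[k - 1][i];

			/* table k is table k-1 advanced by one more zero byte */
			crc_table[k][i] = (prev >> 8) ^ crc_table[0][prev & 0xFF];
		}
	crc_tables_ready = 1;
}

/* Baseline: one table lookup per input byte. */
static uint32_t
crc_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
	return crc;
}

/* Read four bytes as a little-endian word, independent of the host. */
static uint32_t
load_le32(const unsigned char *p)
{
	return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
		((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Slice-by-8: consume eight bytes per iteration with eight lookups. */
static uint32_t
crc_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len >= 8)
	{
		uint32_t	lo = crc ^ load_le32(p);
		uint32_t	hi = load_le32(p + 4);

		crc = crc_table[7][lo & 0xFF] ^
			crc_table[6][(lo >> 8) & 0xFF] ^
			crc_table[5][(lo >> 16) & 0xFF] ^
			crc_table[4][lo >> 24] ^
			crc_table[3][hi & 0xFF] ^
			crc_table[2][(hi >> 8) & 0xFF] ^
			crc_table[1][(hi >> 16) & 0xFF] ^
			crc_table[0][hi >> 24];
		p += 8;
		len -= 8;
	}
	/* fewer than eight bytes left: fall back to the bytewise loop */
	return crc_bytewise(crc, p, len);
}

/* Conventional CRC-32C framing: start from all ones, invert at the end. */
static uint32_t
crc32c(const unsigned char *data, size_t len)
{
	assert(crc_tables_ready);	/* crc_init_tables() must run first */
	return crc_slice8(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

Under these assumptions, calling crc_init_tables() and then crc32c() on the nine ASCII bytes "123456789" should give the standard CRC-32C check value 0xE3069283, and crc_slice8 should agree with crc_bytewise on any input. The gain comes from the eight lookups in one iteration being independent of each other, whereas the bytewise loop has to finish each lookup before it can start the next.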