Re: Optimize Arm64 crc32c implementation in Postgresql
From | Heikki Linnakangas |
---|---|
Subject | Re: Optimize Arm64 crc32c implementation in Postgresql |
Date | |
Msg-id | 6811959f-e7e5-74e2-4645-d5eb0d40d10d@iki.fi |
In response to | Re: Optimize Arm64 crc32c implementation in Postgresql (Andres Freund <andres@anarazel.de>) |
Responses | Re: Optimize Arm64 crc32c implementation in Postgresql |
List | pgsql-hackers |
On 02/03/18 06:42, Andres Freund wrote:
> On 2018-03-02 11:37:52 +1300, Thomas Munro wrote:
>> So... that stuff probably needs either a configure check for the
>> getauxval function and/or those headers, or an OS check?
>
> It'd probably be better to not rely on os specific headers, and instead
> directly access the capabilities. Anyone got an idea on how to do that?

I googled around a bit, but couldn't find any examples. All the examples I could find were very Linux-specific and used getauxval(), except for this in the FreeBSD kernel itself:

https://github.com/freebsd/freebsd/blob/master/sys/libkern/crc32.c#L775

I'm no expert on FreeBSD, but that doesn't seem suitable for use in a user program.

In any case, I reworked this patch to follow the example of the existing code more closely. Notable changes:

* Use compiler intrinsics instead of inline assembly.

* If the target architecture has them, use the CRC instructions without a runtime check. You'll get that if you use "CFLAGS=-march=armv8.1-a", for example, as the CRC Extension was made mandatory in ARM v8.1. This should work even on FreeBSD or other non-Linux systems, where getauxval() is not available.

* I removed the loop to handle two uint64's at a time, using the LDP instruction. I couldn't find a compiler intrinsic for that, and it was actually slower, at least on the system I have access to, than a straightforward loop that processes 8 bytes at a time.

* I tested this on Linux, with gcc and clang, on an ARM64 virtual machine that I had available (not an emulator, but a VM on a shared ARM64 server).

- Heikki
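
For readers following along, the compile-time path described above can be sketched roughly as follows. This is not the code from the patch: the function name sketch_crc32c is hypothetical, the ARM branch assumes a little-endian target and the ACLE intrinsics __crc32cd/__crc32cb from <arm_acle.h>, and the bitwise fallback is only there so the sketch builds on other targets. The getauxval()-based runtime detection is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>           /* ACLE intrinsics: __crc32cd, __crc32cb */
#endif

/*
 * Hypothetical helper, not the function from the patch.
 * Computes the finalized CRC-32C (Castagnoli) of a buffer.
 */
uint32_t
sketch_crc32c(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;      /* standard initial value */

#if defined(__ARM_FEATURE_CRC32)
    /*
     * The compiler was told the target has the CRC Extension (for example
     * via -march=armv8.1-a), so no runtime check is needed.  Process 8
     * bytes at a time; memcpy avoids unaligned-access problems.  Assumes a
     * little-endian target.
     */
    while (len >= 8)
    {
        uint64_t    chunk;

        memcpy(&chunk, p, sizeof(chunk));
        crc = __crc32cd(crc, chunk);
        p += 8;
        len -= 8;
    }
    /* Remaining tail, one byte at a time. */
    while (len > 0)
    {
        crc = __crc32cb(crc, *p++);
        len--;
    }
#else
    /* Portable bit-at-a-time fallback, reflected polynomial 0x82F63B78. */
    while (len > 0)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        len--;
    }
#endif

    return crc ^ 0xFFFFFFFFu;           /* final inversion */
}
```

Both branches should give the same result for the same input, so the standard CRC-32C check string "123456789" is a convenient way to compare a build with the CRC instructions against one without them.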