pgsql: Add inline incremental hash functions for in-memory use

From: John Naylor
Subject: pgsql: Add inline incremental hash functions for in-memory use
Msg-id: E1rQhmw-0026oy-FJ@gemulon.postgresql.org
List: pgsql-committers
Add inline incremental hash functions for in-memory use

It can be useful for a hash function to expose separate initialization,
accumulation, and finalization steps.  In particular, this is useful
for building inline hash functions for simplehash.  Instead of trying
to rework hash_bytes while maintaining its current behavior on all
platforms, we base this work on fasthash (MIT licensed), which is
simple, faster than hash_bytes for inputs over 12 bytes long, and
passes the SMHasher hash function test suite.

The fasthash functions have been reimplemented using our added-on
incremental interface to validate that this method will still give
the same answer, provided we have the input length ahead of time.
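
To illustrate the point about matching results, here is a minimal
sketch of an init/accumulate/finalize split over a fasthash-style
loop.  All names (sketch_state, sketch_init, sketch_accum,
sketch_final64, reference_hash64, incremental_hash64) are hypothetical
and are not the interface in hashfn_unstable.h; the constants are the
ones published with fasthash.  It assumes the caller feeds 8-byte
chunks followed by at most one shorter tail, and that the total length
is known at initialization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_M UINT64_C(0x880355f21e6d1965)	/* fasthash multiplier */

/* Hypothetical state: just the running hash value. */
typedef struct sketch_state
{
	uint64_t	hash;
} sketch_state;

/* fasthash's mixing step. */
static inline uint64_t
sketch_mix(uint64_t h)
{
	h ^= h >> 23;
	h *= UINT64_C(0x2127599bf4325c37);
	h ^= h >> 47;
	return h;
}

/* Initialization: the total input length is folded in up front. */
static inline void
sketch_init(sketch_state *hs, size_t total_len, uint64_t seed)
{
	hs->hash = seed ^ ((uint64_t) total_len * SKETCH_M);
}

/* Accumulation: absorb one chunk of 1..8 bytes. */
static inline void
sketch_accum(sketch_state *hs, const unsigned char *k, size_t len)
{
	uint64_t	v = 0;
	size_t		i;

	assert(len >= 1 && len <= 8);
	if (len == 8)
		memcpy(&v, k, 8);		/* native byte order: endian-dependent */
	else
		for (i = 0; i < len; i++)
			v ^= (uint64_t) k[i] << (8 * i);
	hs->hash ^= sketch_mix(v);
	hs->hash *= SKETCH_M;
}

/* Finalization: one last mix of the running value. */
static inline uint64_t
sketch_final64(sketch_state *hs)
{
	return sketch_mix(hs->hash);
}

/* One-shot fasthash64-style loop, used here as the reference. */
static uint64_t
reference_hash64(const void *buf, size_t len, uint64_t seed)
{
	const unsigned char *p = buf;
	uint64_t	h = seed ^ ((uint64_t) len * SKETCH_M);
	uint64_t	v;
	size_t		i;

	for (; len >= 8; p += 8, len -= 8)
	{
		memcpy(&v, p, 8);
		h ^= sketch_mix(v);
		h *= SKETCH_M;
	}
	if (len > 0)
	{
		v = 0;
		for (i = 0; i < len; i++)
			v ^= (uint64_t) p[i] << (8 * i);
		h ^= sketch_mix(v);
		h *= SKETCH_M;
	}
	return sketch_mix(h);
}

/* The same input fed through init/accum/final in chunks of up to 8 bytes. */
static uint64_t
incremental_hash64(const void *buf, size_t len, uint64_t seed)
{
	const unsigned char *p = buf;
	sketch_state hs;

	sketch_init(&hs, len, seed);
	while (len > 0)
	{
		size_t		chunk = len < 8 ? len : 8;

		sketch_accum(&hs, p, chunk);
		p += chunk;
		len -= chunk;
	}
	return sketch_final64(&hs);
}
```

Because the length is mixed in during initialization, the incremental
path yields the same value as the one-shot loop only when the length
is known before the first chunk is accumulated.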

This functionality lives in a new header hashfn_unstable.h. The name
implies we have the freedom to change things across versions that
would be unacceptable for our other hash functions that are used for
e.g. hash indexes and hash partitioning. As such, these should only
be used for in-memory data structures like hash tables. There is also
no guarantee of being independent of endianness or pointer size.

As a demonstration, use fasthash for pgstat_hash_hash_key.  Previously
this called the 32-bit murmur finalizer on the three elements,
then joined them with hash_combine().  The new function is simpler,
faster, and takes up less binary space.  While the collision and bias
behavior was almost certainly fine with the previous coding, now we
have objective confidence of that.
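
For readers unfamiliar with the previous coding, the following is a
rough sketch of its shape: finalize each of three 32-bit fields, then
fold the results together.  The struct and function names (demo_key,
demo_murmur_final32, demo_hash_combine, demo_hash_key_old) are
hypothetical stand-ins, not the PostgreSQL definitions.  The constants
are assumed to be the usual murmur3 32-bit finalizer and a Boost-style
combiner.

```c
#include <stdint.h>

/* Hypothetical three-field key standing in for the pgstat hash key. */
typedef struct demo_key
{
	uint32_t	kind;
	uint32_t	dboid;
	uint32_t	objoid;
} demo_key;

/* 32-bit murmur3-style finalizer (assumed constants). */
static inline uint32_t
demo_murmur_final32(uint32_t h)
{
	h ^= h >> 16;
	h *= UINT32_C(0x85ebca6b);
	h ^= h >> 13;
	h *= UINT32_C(0xc2b2ae35);
	h ^= h >> 16;
	return h;
}

/* Boost-style combiner (assumed form). */
static inline uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
	a ^= b + UINT32_C(0x9e3779b9) + (a << 6) + (a >> 2);
	return a;
}

/* Per-field finalize, then combine: three mixes and two combines per key. */
static inline uint32_t
demo_hash_key_old(const demo_key *key)
{
	uint32_t	hash;

	hash = demo_murmur_final32(key->kind);
	hash = demo_hash_combine(hash, demo_murmur_final32(key->dboid));
	hash = demo_hash_combine(hash, demo_murmur_final32(key->objoid));
	return hash;
}
```

By contrast, a bytewise hash such as fasthash treats the whole key as
one buffer and makes a single pass over it.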

There are other places that could benefit from this, but that is left
for future work.

Reviewed by Jeff Davis, Heikki Linnakangas, Jian He, Junwang Zhao
Credit to Andres Freund for the idea

Discussion: https://postgr.es/m/20231122223432.lywt4yz2bn7tlp27%40awork3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e97b672c88f6e5938a2b81021bd4b590b013976f

Modified Files
--------------
src/include/common/hashfn_unstable.h | 221 +++++++++++++++++++++++++++++++++++
src/include/utils/pgstat_internal.h  |  12 +-
src/tools/pgindent/typedefs.list     |   1 +
3 files changed, 225 insertions(+), 9 deletions(-)

