Re: General purpose hashing func in pgbench

From Ildar Musin
Subject Re: General purpose hashing func in pgbench
Date
Msg-id 38f1503e-7b63-4612-24a9-041af25bb022@postgrespro.ru
In response to Re: General purpose hashing func in pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: General purpose hashing func in pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hello Fabien,


On 24/12/2017 11:12, Fabien COELHO wrote:
>
> Yep. The ugliness is significantly linked to the choice of name. With
> MM2_MUL and MM2_ROT ISTM that it is more readable:
>
>>     k *= MM2_MUL;
>>     k ^= k >> MM2_ROT;
>>     k *= MM2_MUL;
>>     result ^= k;
>>     result *= MM2_MUL;
Ok, will do.
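
For reference, here is how the quoted lines could slot into a complete 64-bit Murmur2 round for a single 8-byte value. This is only a minimal sketch, not the patch itself: the function name murmur2_int64 and its seed argument are assumptions made for illustration, and the values given to MM2_MUL and MM2_ROT are the multiplier and shift from the public-domain MurmurHash64A reference code.

```c
#include <stdint.h>

/* Constants of MurmurHash64A; the MM2_* names follow the thread. */
#define MM2_MUL 0xc6a4a7935bd1e995ULL
#define MM2_ROT 47

/*
 * Hypothetical helper: hash one 8-byte value with 64-bit Murmur2.
 * The name and the seed argument are illustrative assumptions.
 */
static uint64_t
murmur2_int64(int64_t val, uint64_t seed)
{
    uint64_t result = seed ^ ((uint64_t) sizeof(int64_t) * MM2_MUL);
    uint64_t k = (uint64_t) val;

    /* mix the single 8-byte block into the state */
    k *= MM2_MUL;
    k ^= k >> MM2_ROT;
    k *= MM2_MUL;

    result ^= k;
    result *= MM2_MUL;

    /* final avalanche */
    result ^= result >> MM2_ROT;
    result *= MM2_MUL;
    result ^= result >> MM2_ROT;

    return result;
}
```

Every step is invertible for a fixed seed, so two distinct int64 inputs can never collide in this sketch.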
>
>> [...] So I'd better leave it the way it is. Actually I was thinking
>> to do the same to fnv1a too : )
>
> I think that the implementation style should be homogeneous, so I'd
> suggest at least to stick to one style.
>
> I noticed from the source of all human knowledge (aka Wikipedia:-)
> that there seems to be a murmur3 successor. Have you considered it?
> One good reason to skip it would be that the implementation is long
> and complex. I'm not sure about an 8-byte input simplified version.
Murmur2 naturally supports 8-byte data. Murmur3 has 32- and 128-bit
versions. So to handle int64 I could
1) split the input value into two halves and somehow combine the
results of the 32-bit version, or
2) use the 128-bit version and discard the higher bytes.
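
Option 1 could be sketched as below. The 32-bit Murmur3 round for a single 4-byte block uses the constants of the reference implementation; the helper names and, in particular, the way the two halves are chained together (each hash seeds the next, with a third call so that both output halves depend on both input halves) are assumptions for illustration only, not something proposed in this thread.

```c
#include <stdint.h>

static inline uint32_t
rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* 32-bit Murmur3 specialised to a single 4-byte block. */
static uint32_t
murmur3_32_word(uint32_t k, uint32_t seed)
{
    uint32_t h = seed;

    /* block mixing */
    k *= 0xcc9e2d51U;
    k = rotl32(k, 15);
    k *= 0x1b873593U;

    h ^= k;
    h = rotl32(h, 13);
    h = h * 5 + 0xe6546b64U;

    /* finalization: fold in the length (4 bytes), then avalanche */
    h ^= 4;
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;

    return h;
}

/* Hypothetical combination of two 32-bit hashes into a 64-bit result. */
static uint64_t
murmur3_split_int64(int64_t val, uint32_t seed)
{
    uint32_t lo = (uint32_t) ((uint64_t) val & 0xffffffffU);
    uint32_t hi = (uint32_t) ((uint64_t) val >> 32);
    uint32_t a = murmur3_32_word(lo, seed);  /* depends on lo */
    uint32_t b = murmur3_32_word(hi, a);     /* depends on hi and lo */
    uint32_t c = murmur3_32_word(a, b);      /* depends on hi and lo */

    return ((uint64_t) b << 32) | c;
}
```

This costs three 32-bit rounds per value, which illustrates why option 1 may not be cheaper than the 64-bit Murmur2 round. Option 2 is not shown, since the 128-bit version is considerably longer.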

Btw, postgres core already has a 32-bit murmur3 implementation, but it
only uses the finalization part of the algorithm (see murmurhash32). As
my colleague Yura Sokolov told me in a recent conversation, it alone
provides pretty good randomization. I haven't tried it yet, though.
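
To illustrate how small a finalization-only hash is, here is a sketch of a 64-bit analogue. It assumes one borrows the 64-bit finalizer of the 128-bit Murmur3 version rather than the 32-bit one used by murmurhash32; the function name murmur3_fmix64 is hypothetical and does not exist in core.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the 64-bit finalization (avalanche) step of
 * Murmur3, applied directly to an 8-byte value.
 */
static uint64_t
murmur3_fmix64(uint64_t k)
{
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ULL;
    k ^= k >> 33;

    return k;
}
```

Note that this step maps 0 to 0 and takes no seed, so a seed would have to be mixed into the input beforehand (for example by XOR) if one is wanted.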

>
> Just a question: Have you looked at SipHash24?
>
>     https://en.wikipedia.org/wiki/SipHash
>
> The interesting point is that it can use a key and seems somehow
> cryptographically secure, for a similar cost. However, how to
> decide on/control the key is unclear.
>
Not yet. As far as I can understand from the wiki, its main feature is
to prevent attacks with crafted input data. How would that be useful in
benchmarking, unless it shows superior performance and randomization?

-- 
Ildar Musin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company 


