Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption
От | ktm@rice.edu |
---|---|
Тема | Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption |
Дата | |
Msg-id | 20131007125023.GQ16128@aart.rice.edu обсуждение исходный текст |
Ответ на | Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption (Tomas Vondra <tv@fuzzy.cz>) |
Ответы |
Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly
high memory consumption
|
Список | pgsql-hackers |
On Mon, Oct 07, 2013 at 12:41:58AM +0200, Tomas Vondra wrote: > > 2. Consider using a simpler/faster hash function, like FNV[1] or Jenkins[2]. > > For fun, try not hashing those ints at all and see how that performs (that, > > I think, is what you get from HashSet<int> in Java/C#). > > I've used crc32 mostly because it's easily available in the code (i.e. > I'm lazy), but I've done some quick research / primitive benchmarking > too. For example hashing 2e9 integers takes this much time: > > FNV-1 = 11.9 > FNV-1a = 11.9 > jenkins = 38.8 > crc32 = 32.0 > > So it's not really "slow" and the uniformity seems to be rather good. > > I'll try FNV in the future, however I don't think that's the main issue > right now. > Hi Tomas, If you are going to use a function that is not currently in the code, please consider xxhash: http://code.google.com/p/xxhash/ Here are some benchmarks for some of the faster hash functions: Name Speed Q.Score Author xxHash 5.4 GB/s 10 MumurHash 3a 2.7 GB/s 10 Austin Appleby SpookyHash 2.0 GB/s 10 Bob Jenkins SBox 1.4 GB/s 9 Bret Mulvey Lookup3 1.2 GB/s 9 Bob Jenkins CityHash64 1.05 GB/s 10 Pike & Alakuijala FNV 0.55 GB/s 5 Fowler, Noll, Vo CRC32 0.43 GB/s 9 MD5-32 0.33 GB/s 10 Ronald L. Rivest SHA1-32 0.28 GB/s 10 Regards, Ken
В списке pgsql-hackers по дате отправления: