Re: Combining hash values
From | Robert Haas |
---|---|
Subject | Re: Combining hash values |
Date | |
Msg-id | CA+Tgmobpm8SxuB2Y4G672jx+xwZZmVmvZUPUKZ_3RYW1=5=KAQ@mail.gmail.com |
In response to | Re: Combining hash values (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Combining hash values
|
List | pgsql-hackers |
On Mon, Aug 1, 2016 at 11:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On that subject, while looking at hashfunc.c, I spotted that
>> hashint8() has a very obvious deficiency, which causes disastrous
>> performance with certain inputs:
>
> Well, if you're trying to squeeze 64 bits into a 32-bit result, there
> are always going to be collisions somewhere.
>
>> I'd suggest using hash_uint32() for values that fit in a 32-bit
>> integer and hash_any() otherwise.
>
> Perhaps, but this'd break existing hash indexes.  That might not be
> a fatal objection, but if we're going to put users through that
> I'd like to think a little bigger in terms of the benefits we get.
> I've thought for some time that we needed to move to 64-bit hash function
> results, because the size of problem that's reasonable to use a hash join
> or hash aggregation for keeps increasing.  Maybe we should do that and fix
> hashint8 as a side effect.

Well, considering that Amit is working on making hash indexes WAL-logged
in v10[1], this seems like an awfully good time to get any breakage we
want to do out of the way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://www.postgresql.org/message-id/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
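
For readers without the earlier messages to hand, here is a rough sketch of the kind of collision being discussed. It assumes hashint8() folds the two 32-bit halves of the input together with XOR before hashing, which is my reading of hashfunc.c and is not stated in this message. The names fold_hash64() and mix32() are hypothetical, and mix32() is only a generic stand-in mixer, not PostgreSQL's hash_uint32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit hash; NOT PostgreSQL's hash_uint32(). */
static uint32_t
mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bU;
    x ^= x >> 13;
    x *= 0xc2b2ae35U;
    x ^= x >> 16;
    return x;
}

/*
 * Assumed folding scheme: XOR the high half into the low half (inverting
 * the high half for negative values, so that small negative numbers hash
 * the same as their 32-bit counterparts), then hash the 32-bit result.
 */
static uint32_t
fold_hash64(int64_t val)
{
    uint32_t lo = (uint32_t) val;
    uint32_t hi = (uint32_t) ((uint64_t) val >> 32);

    lo ^= (val >= 0) ? hi : ~hi;
    return mix32(lo);
}

/* Every non-negative value whose two halves are equal folds to zero. */
static void
check_fold_collisions(void)
{
    assert(fold_hash64(INT64_C(0x0000000100000001)) == fold_hash64(0));
    assert(fold_hash64(INT64_C(0x0000000200000002)) == fold_hash64(0));
    /* Values that fit in 32 bits are passed through unchanged. */
    assert(fold_hash64(5) == mix32(5));
}
```

Under that assumption, any input set built from values of the form (k << 32) | k lands in a single bucket regardless of how good the 32-bit hash is, since the information is lost before the hash is applied. Calling check_fold_collisions() from a test program exercises the three cases above.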