Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
From | Tom Lane |
---|---|
Subject | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
Date | |
Msg-id | 12511.1517008946@sss.pgh.pa.us |
In response to | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
List | pgsql-bugs |
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I suspect you're right the hash is biased to lohalf bits, as you wrote
> in the 19/12 message.

I don't see any bias in what it's doing, which is basically xoring the two halves and hashing the result. It's possible though that Todd's data set contains values in which corresponding bits of the high and low halves are correlated somehow, in which case the xor would produce a lot of cancellation and a relatively small number of distinct outputs.

If we weren't bound by backwards compatibility, we could consider changing to logic more like "if the value is within the int4 range, apply int4hash, otherwise hash all 8 bytes normally". But I don't see how we can change that now that hash indexes are first-class citizens.

In any case, we still need a fix for the behavior that the hash table size is blown out by lots of collisions, because that can happen no matter what the hash function is. Andres seems to have dropped the ball on doing something about that.

regards, tom lane
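
The cancellation effect described above can be illustrated with a minimal sketch. It models only the "xor the two halves" step; the final 32-bit hash is left out, since any deterministic hash maps equal folded values to equal outputs. The name `fold_int8` is hypothetical, and complementing the high half for negative values is an assumption about the implementation (made so that values in the int4 range reduce to their low 32 bits), not something stated in the message.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def fold_int8(val: int) -> int:
    """Fold a signed 64-bit integer to 32 bits by xoring its two halves.

    Hypothetical model of the pre-hash step: for negative values the high
    half is complemented first (an assumption), so sign-extension bits
    cancel and an int4-range value folds to its own low 32 bits.
    """
    u = val & MASK64              # two's-complement view of the value
    lohalf = u & MASK32
    hihalf = (u >> 32) & MASK32
    if val >= 0:
        return lohalf ^ hihalf
    return lohalf ^ (~hihalf & MASK32)


# Values whose high half repeats the low half: perfectly correlated bits.
correlated = [(k << 32) | k for k in range(1, 1001)]
folded = {fold_int8(v) for v in correlated}

# 1000 distinct inputs, but the xor cancels every bit of every one of them.
print(len(correlated), "inputs ->", len(folded), "distinct folded value(s)")
```

With this deliberately extreme input, all 1000 distinct values fold to the same 32-bit word, so whatever hash is applied afterwards sees a single input. Real data would presumably show weaker correlation, but the mechanism is the same: the more the halves agree bit for bit, the fewer distinct outputs survive the fold.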