Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
От | Tomas Vondra |
---|---|
Тема | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop |
Дата | |
Msg-id | ebe8b411-006e-7863-3c60-67ddafe1c636@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
|
Список | pgsql-bugs |
On 12/06/2017 08:35 PM, Andres Freund wrote: > On 2017-12-06 20:32:24 +0100, Tomas Vondra wrote: >> On 12/05/2017 11:01 PM, Tomas Vondra wrote: >>> >>> Based on the initial discussion, the problem here is twofold. >>> >>> Firstly, the code assumes that if it gets certain number of bucket >>> collisions (essentially, the initial bucket and certain number of >>> neighboring buckets already being full), making the table larger is >>> guaranteed to reduce the number of collisions. >>> >>> Which is obviously untrue for duplicate hash values, but it may also >>> happen for distinct hash values that form a chain (think a sequence of >>> hash values 1,2,3,4,5,6,7,...,K - that is never gonna get fixed). > >> I've managed to construct (by sheer brute force) an example that breaks >> simplehash in this way. It triggers the growth by having a sequence of >> distinct but consecutive hash values, e.g. 1, 2, 3, 4, ..., 1000, and >> then trying to insert a value with duplicate hash (say, 4). > > I fail to be excited by this. That's possible for any sort of hashtable > in some way. You might hit issues due to resizing or due to lookup > performance, but you'll hit problem. That's a fairly fundamental issue > of pure hashtables. > Eh? I think you misunderstood the issue this triggers. I'm perfectly fine with poor lookup performance - that's expected and reasonable. But it actually triggers the same behavior as the already presented example, i.e. the hash table grows indefinitely (so eventually gets OOM). I really doubt allocating 3GB of memory (which is when my laptop gives up and kills the query) to do distinct on 50MB table is a sane behavior. Even if the table is constructed in so intentionally evil way. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-bugs по дате отправления: