Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
От | Andres Freund |
---|---|
Тема | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
Дата | |
Msg-id | 790BC80E-D8E5-499B-A1E3-8EE0752B53ED@anarazel.de обсуждение исходный текст |
Ответ на | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
|
Список | pgsql-bugs |
On December 6, 2017 11:57:44 AM PST, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > >On 12/06/2017 08:35 PM, Andres Freund wrote: >> On 2017-12-06 20:32:24 +0100, Tomas Vondra wrote: >>> On 12/05/2017 11:01 PM, Tomas Vondra wrote: >>>> >>>> Based on the initial discussion, the problem here is twofold. >>>> >>>> Firstly, the code assumes that if it gets certain number of bucket >>>> collisions (essentially, the initial bucket and certain number of >>>> neighboring buckets already being full), making the table larger is >>>> guaranteed to reduce the number of collisions. >>>> >>>> Which is obviously untrue for duplicate hash values, but it may >also >>>> happen for distinct hash values that form a chain (think a sequence >of >>>> hash values 1,2,3,4,5,6,7,...,K - that is never gonna get fixed). >> >>> I've managed to construct (by sheer brute force) an example that >breaks >>> simplehash in this way. It triggers the growth by having a sequence >of >>> distinct but consecutive hash values, e.g. 1, 2, 3, 4, ..., 1000, >and >>> then trying to insert a value with duplicate hash (say, 4). >> >> I fail to be excited by this. That's possible for any sort of >hashtable >> in some way. You might hit issues due to resizing or due to lookup >> performance, but you'll hit problem. That's a fairly fundamental >issue >> of pure hashtables. >> > >Eh? I think you misunderstood the issue this triggers. I'm perfectly >fine with poor lookup performance - that's expected and reasonable. But >it actually triggers the same behavior as the already presented >example, >i.e. the hash table grows indefinitely (so eventually gets OOM). > >I really doubt allocating 3GB of memory (which is when my laptop gives >up and kills the query) to do distinct on 50MB table is a sane >behavior. >Even if the table is constructed in so intentionally evil way. No, I got that. I just don't think time and memory are that different. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
В списке pgsql-bugs по дате отправления: