Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
От | Andres Freund |
---|---|
Тема | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop |
Дата | |
Msg-id | 20180129010213.rt3nu73ilff756f6@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop |
Список | pgsql-bugs |
On 2018-01-26 18:48:35 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2017-12-10 23:09:42 +0100, Tomas Vondra wrote: > >> FWIW I do agree the data sets shared in this thread are pretty extreme > >> and it doesn't make much sense to slow the regular cases. I'll be > >> perfectly happy if we stop the OOM, making those cases fast is a bonus. > > > Yea, agreed on that. I'm kinda inclined to go for stop-growing in 10, > > and so something better in 11. And then later possibly backpatch if > > we've grown some confidence? > > +1. I'd like to see some response to this included in 10.2, and time > grows short for that. Here are two patches that I think we want for 10.2, and the start of one that I think we want for master. 0002 is needed because otherwise the lack of extra growth leads to noticeably worse performance when filling an underestimated a coordinator hash table from the workers - turns out our hash combine (and most hash combines) let a lot of clustering survive. By adding a final hashing round the bit perturbation is near perfect. The commit messages need to be polished a bit, but other than that I think these are reasonable fixes. Plan to push by Monday evening at the latest. The third patch is a version of the random IV discussed in this thread. I do think we want to add usage of the extended hash functions, as prototyped by Tomas, as that actually helps to fix issues with actual hash conflicts. But we additionally need a fallback path for types without extended hashtables, and the random IV is a good idea nonetheless. There's no ABI difference in my patch, so I think this is actually something we could backpatch. But I don't think it's urgent, so I'm not planning to do that for 10.2. The one thing that could confuse people is that it can lead to output order changes from run to run - I think that's actually good, nobody should rely on hashagg etc output being stable, but it might be a bit much in a stable release? In my tests this solves the worst performance issues in Todd's case, Tomas's, Thomas's and still does ok performancwith with a TPC-H Q18 (which showcases the underestimated worker hashtable into leader hashtable issue). Greetings, Andres Freund
Вложения
В списке pgsql-bugs по дате отправления: