Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
От | Tomas Vondra |
---|---|
Тема | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop |
Дата | |
Msg-id | 3971fa24-285b-1c57-db95-b15c513162c1@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
|
Список | pgsql-bugs |
On 01/29/2018 02:02 AM, Andres Freund wrote: > On 2018-01-26 18:48:35 -0500, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> On 2017-12-10 23:09:42 +0100, Tomas Vondra wrote: >>>> FWIW I do agree the data sets shared in this thread are pretty extreme >>>> and it doesn't make much sense to slow the regular cases. I'll be >>>> perfectly happy if we stop the OOM, making those cases fast is a bonus. >> >>> Yea, agreed on that. I'm kinda inclined to go for stop-growing in 10, >>> and so something better in 11. And then later possibly backpatch if >>> we've grown some confidence? >> >> +1. I'd like to see some response to this included in 10.2, and time >> grows short for that. > > Here are two patches that I think we want for 10.2, and the start of one > that I think we want for master. 0002 is needed because otherwise the > lack of extra growth leads to noticeably worse performance when filling > an underestimated a coordinator hash table from the workers - turns out > our hash combine (and most hash combines) let a lot of clustering > survive. By adding a final hashing round the bit perturbation is near > perfect. The commit messages need to be polished a bit, but other than > that I think these are reasonable fixes. Plan to push by Monday evening > at the latest. > > The third patch is a version of the random IV discussed in this > thread. I do think we want to add usage of the extended hash functions, > as prototyped by Tomas, as that actually helps to fix issues with actual > hash conflicts. But we additionally need a fallback path for types > without extended hashtables, and the random IV is a good idea > nonetheless. There's no ABI difference in my patch, so I think this is > actually something we could backpatch. But I don't think it's urgent, so > I'm not planning to do that for 10.2. The one thing that could confuse > people is that it can lead to output order changes from run to run - I > think that's actually good, nobody should rely on hashagg etc output > being stable, but it might be a bit much in a stable release? > > > In my tests this solves the worst performance issues in Todd's case, > Tomas's, Thomas's and still does ok performancwith with a TPC-H Q18 > (which showcases the underestimated worker hashtable into leader > hashtable issue). > Sounds reasonable. Do you think repeating the performance testing is necessary? Not sure when do you plan to push the changes, I may not have time for the tests until after FOSDEM. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-bugs по дате отправления: