Re: significant slowdown of HashAggregate between 9.6 and 10
From | Andres Freund
---|---
Subject | Re: significant slowdown of HashAggregate between 9.6 and 10
Date |
Msg-id | 20200605163317.fampbw6xnnxuz3ly@alap3.anarazel.de
In response to | Re: significant slowdown of HashAggregate between 9.6 and 10 (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List | pgsql-hackers
Hi,

On 2020-06-05 15:25:26 +0200, Tomas Vondra wrote:
> I think you're right. I think I was worried about having to resize the
> hash table in case of an under-estimate, and it seemed fine to waste a
> tiny bit more memory to prevent that.

It's pretty cheap to resize a hashtable with a handful of entries, so
I'm not worried about that. It's also how it has worked for a *long*
time, so unless we have some good reason to change it, I wouldn't.

> But this example shows we may need to scan the hash table
> sequentially, which means it's not just about memory consumption.

We *always* scan the hashtable sequentially, no? Otherwise there's no
way to get at the aggregated data.

> So in hindsight we either don't need the limit at all, or maybe it
> could be much lower (IIRC it reduces the probability of collisions, but
> maybe dynahash does that anyway internally).

This is code using simplehash, which resizes at a load factor of 0.9.

> I wonder if hashjoin has the same issue, but probably not - I don't
> think we'll ever scan that internal hash table sequentially.

I think we do for some outer joins (c.f. ExecPrepHashTableForUnmatched()),
but it's probably not relevant performance-wise.

Greetings,

Andres Freund
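To make the two points above concrete — that resizing a nearly empty table is cheap, and that simplehash grows based on a fill factor — here is a minimal, self-contained C sketch of load-factor-driven growth in an open-addressing hash table. It is not PostgreSQL code: all names (`hashtab`, `ht_grow`, `ht_insert`) are hypothetical. The real template lives in `src/include/lib/simplehash.h`, where the normal fill factor is 0.9.

```c
/*
 * Hypothetical sketch of load-factor-driven growth, in the spirit of
 * simplehash (which grows at a fill factor of 0.9).  Not PostgreSQL code.
 * Keys are nonzero uint32 values; 0 marks an empty bucket.
 */
#include <stdint.h>
#include <stdlib.h>

typedef struct
{
	uint32_t   *keys;		/* open-addressing bucket array, 0 = empty */
	size_t		nbuckets;	/* always a power of two */
	size_t		members;	/* number of occupied buckets */
} hashtab;

#define FILLFACTOR 0.9

static size_t
bucket_for(const hashtab *tb, uint32_t key)
{
	/* cheap multiplicative hash; relies on nbuckets being a power of two */
	return (key * 2654435761u) & (tb->nbuckets - 1);
}

static void ht_insert(hashtab *tb, uint32_t key);

/* double the bucket array and re-insert every existing member */
static void
ht_grow(hashtab *tb)
{
	hashtab		old = *tb;

	tb->nbuckets = old.nbuckets * 2;
	tb->keys = calloc(tb->nbuckets, sizeof(uint32_t));
	tb->members = 0;
	for (size_t i = 0; i < old.nbuckets; i++)
	{
		if (old.keys[i] != 0)
			ht_insert(tb, old.keys[i]);
	}
	free(old.keys);
}

static void
ht_insert(hashtab *tb, uint32_t key)
{
	/* grow when this insert would push us past the fill factor */
	if ((double) (tb->members + 1) > FILLFACTOR * (double) tb->nbuckets)
		ht_grow(tb);

	size_t		i = bucket_for(tb, key);

	while (tb->keys[i] != 0)	/* linear probing */
		i = (i + 1) & (tb->nbuckets - 1);
	tb->keys[i] = key;
	tb->members++;
}
```

Note that the cost of `ht_grow` is proportional to the number of occupied buckets, not to the new array size, which is why growing a table that holds only a handful of entries is cheap even though the bucket array doubles.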
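The sequential-scan point can be sketched the same way: emitting aggregated groups (or, for a right/full hash join, emitting unmatched rows) means walking the entire bucket array, occupied or not, so an over-sized table costs scan time as well as memory. Again a hypothetical sketch reusing the `hashtab` type above, not the executor's actual code:

```c
/*
 * Hypothetical sketch: a sequential scan over the whole bucket array,
 * as HashAggregate must do to emit its groups.  Every bucket is visited,
 * so sizing the table far beyond its membership slows this loop down.
 */
static void
ht_scan(const hashtab *tb, void (*emit) (uint32_t key))
{
	for (size_t i = 0; i < tb->nbuckets; i++)
	{
		if (tb->keys[i] != 0)	/* 0 marks an empty bucket */
			emit(tb->keys[i]);
	}
}
```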