Re: significant slowdown of HashAggregate between 9.6 and 10
From | Andres Freund
---|---
Subject | Re: significant slowdown of HashAggregate between 9.6 and 10
Date |
Msg-id | 20200605163317.fampbw6xnnxuz3ly@alap3.anarazel.de
In response to | Re: significant slowdown of HashAggregate between 9.6 and 10 (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List | pgsql-hackers
Hi,

On 2020-06-05 15:25:26 +0200, Tomas Vondra wrote:
> I think you're right. I think I was worried about having to resize the
> hash table in case of an under-estimate, and it seemed fine to waste a
> tiny bit more memory to prevent that.

It's pretty cheap to resize a hashtable with a handful of entries, so
I'm not worried about that. It's also how it has worked for a *long*
time, so unless we have some good reason to change it, I wouldn't.

> But this example shows we may need to scan the hash table
> sequentially, which means it's not just about memory consumption.

We *always* scan the hashtable sequentially, no? Otherwise there's no
way to get at the aggregated data.

> So in hindsight we either don't need the limit at all, or maybe it
> could be much lower (IIRC it reduces the probability of collisions, but
> maybe dynahash does that anyway internally).

This is code using simplehash, which resizes at a load factor of 0.9.

> I wonder if hashjoin has the same issue, but probably not - I don't
> think we'll ever scan that internal hash table sequentially.

I think we do for some outer joins (c.f. ExecPrepHashTableForUnmatched()),
but it's probably not relevant performance-wise.

Greetings,

Andres Freund
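To make the two points above concrete — that resizing a nearly empty table is cheap, and that simplehash grows based on a fill factor — here is a minimal, self-contained C sketch of load-factor-driven growth in an open-addressing hash table. It is not PostgreSQL code: all names (`hashtab`, `ht_grow`, `ht_insert`) are hypothetical. The real template lives in `src/include/lib/simplehash.h`, where the normal fill factor is 0.9.

```c
/*
 * Hypothetical sketch of load-factor-driven growth, in the spirit of
 * simplehash (which grows at a fill factor of 0.9).  Not PostgreSQL code.
 * Keys are nonzero uint32 values; 0 marks an empty bucket.
 */
#include <stdint.h>
#include <stdlib.h>

typedef struct
{
	uint32_t   *keys;		/* open-addressing bucket array, 0 = empty */
	size_t		nbuckets;	/* always a power of two */
	size_t		members;	/* number of occupied buckets */
} hashtab;

#define FILLFACTOR 0.9

static size_t
bucket_for(const hashtab *tb, uint32_t key)
{
	/* cheap multiplicative hash; relies on nbuckets being a power of two */
	return (key * 2654435761u) & (tb->nbuckets - 1);
}

static void ht_insert(hashtab *tb, uint32_t key);

/* double the bucket array and re-insert every existing member */
static void
ht_grow(hashtab *tb)
{
	hashtab		old = *tb;

	tb->nbuckets = old.nbuckets * 2;
	tb->keys = calloc(tb->nbuckets, sizeof(uint32_t));
	tb->members = 0;
	for (size_t i = 0; i < old.nbuckets; i++)
	{
		if (old.keys[i] != 0)
			ht_insert(tb, old.keys[i]);
	}
	free(old.keys);
}

static void
ht_insert(hashtab *tb, uint32_t key)
{
	/* grow when this insert would push us past the fill factor */
	if ((double) (tb->members + 1) > FILLFACTOR * (double) tb->nbuckets)
		ht_grow(tb);

	size_t		i = bucket_for(tb, key);

	while (tb->keys[i] != 0)	/* linear probing */
		i = (i + 1) & (tb->nbuckets - 1);
	tb->keys[i] = key;
	tb->members++;
}
```

Note that the cost of `ht_grow` is proportional to the number of occupied buckets, not to the new array size, which is why growing a table that holds only a handful of entries is cheap even though the bucket array doubles.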
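The sequential-scan point can be sketched the same way: emitting aggregated groups (or, for a right/full hash join, emitting unmatched rows) means walking the entire bucket array, occupied or not, so an over-sized table costs scan time as well as memory. Again a hypothetical sketch reusing the `hashtab` type above, not the executor's actual code:

```c
/*
 * Hypothetical sketch: a sequential scan over the whole bucket array,
 * as HashAggregate must do to emit its groups.  Every bucket is visited,
 * so sizing the table far beyond its membership slows this loop down.
 */
static void
ht_scan(const hashtab *tb, void (*emit) (uint32_t key))
{
	for (size_t i = 0; i < tb->nbuckets; i++)
	{
		if (tb->keys[i] != 0)	/* 0 marks an empty bucket */
			emit(tb->keys[i]);
	}
}
```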