Re: Number of buckets in a hash join
From | Tom Lane
---|---
Subject | Re: Number of buckets in a hash join
Date | 
Msg-id | 17849.1359392285@sss.pgh.pa.us
Reply to | Number of buckets in a hash join (Heikki Linnakangas <hlinnakangas@vmware.com>)
List | pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> The first question is, why do we aim at 10 tuples per bucket?

I see nothing particularly wrong with that. The problem here is with
having 1000 tuples per bucket.

> Ideally, the planner would always make a good guess the number of rows,
> but for the situations that it doesn't, it would be good if the hash
> table was enlarged if it becomes too full.

Yeah, possibly. The proposed test case actually doesn't behave very
badly if work_mem is small, because there is logic in there to adjust
the number of batches. You didn't say what work_mem you're testing at,
but it's clearly more than the default 1MB. I think the issue arises
if the initial estimate of hashtable size is a good bit less than
work_mem, so the number of buckets is set to something a good bit less
than what would be optimal if we're using more of work_mem.

This seems a little reminiscent of what we did recently in tuplesort
to make better use of work_mem --- in both cases we have to choose a
pointer-array size that will make best use of work_mem after the
tuples themselves are added.

			regards, tom lane
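[Editor's note: to make the failure mode concrete, here is a minimal standalone C sketch of the sizing behavior under discussion. The bucket count is derived from the planner's row estimate at a fixed target of tuples per bucket, so a bad underestimate leaves the bucket chains far longer than intended. The constant name, the power-of-two rounding, and the helper functions are illustrative assumptions, not the actual ExecChooseHashTableSize code.]

```c
#include <stdio.h>

/* Illustrative target of tuples per bucket, echoing the "10 tuples per
 * bucket" figure from the thread; the name is an assumption, not the
 * real PostgreSQL definition. */
#define NTUP_PER_BUCKET 10

/* Round up to the next power of 2, as hash tables commonly require. */
static unsigned int
next_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Sketch of choosing a bucket count from the planner's row estimate.
 * If est_rows is badly underestimated, nbuckets comes out far too
 * small and the average chain length balloons at execution time. */
static unsigned int
choose_nbuckets(double est_rows)
{
    double target = est_rows / NTUP_PER_BUCKET;

    if (target < 1)
        target = 1;
    return next_pow2((unsigned int) target);
}

int
main(void)
{
    double est_rows = 1000;       /* what the planner guessed */
    double actual_rows = 100000;  /* what actually showed up */
    unsigned int nbuckets = choose_nbuckets(est_rows);

    printf("buckets chosen: %u\n", nbuckets);
    printf("chain length at plan time: %.1f tuples/bucket\n",
           est_rows / nbuckets);
    printf("chain length at run time:  %.1f tuples/bucket\n",
           actual_rows / nbuckets);
    return 0;
}
```

[With the estimate of 1000 rows this picks 128 buckets, about 8 tuples per bucket; if 100000 rows actually arrive, each chain averages roughly 780 tuples, which is the pathological case the message describes.]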