Re: bad estimation together with large work_mem generates terrible slow hash joins
From:           Tomas Vondra
Subject:        Re: bad estimation together with large work_mem generates terrible slow hash joins
Date:
Msg-id:         53EF5D1C.3060505@fuzzy.cz
In response to: Re: bad estimation together with large work_mem generates terrible slow hash joins  (Tomas Vondra <tv@fuzzy.cz>)
Responses:      Re: bad estimation together with large work_mem generates terrible slow hash joins
List:           pgsql-hackers
On 12.8.2014 00:30, Tomas Vondra wrote:
> On 11.8.2014 20:25, Robert Haas wrote:
>> It also strikes me that when there's only 1 batch, the set of bits
>> that map onto the batch number is zero-width, and one zero-width bit
>> range is as good as another. In other words, if you're only planning
>> to do one batch, you can easily grow the number of buckets on the
>> fly. Growing the number of buckets only becomes difficult once you
>> have more than one batch.

...

> I was considering reversing the bits of the hash, because that's
> pretty much the simplest solution. But I think you're right it might
> actually work like this:
>
> * Are more batches needed?
>
>   (yes) => just use nbuckets = my_log2(work_mem / tuple_size)
>
>   (no) => go ahead, until processing all tuples or hitting work_mem
>
>     (work_mem)   => meh, use the same nbuckets as above
>
>     (all tuples) => compute optimal nbuckets / resize
>
> But I need to think about this a bit. So far it seems to me there's no
> way additional batches might benefit from increasing nbuckets further.

I think this is a simple and solid solution, solving the batchno
computation issues quite nicely.

Attached is the v10 patch (bare, and combined with the dense
allocation), which does this:

1) when we know we'll need batching, buckets are sized for full
   work_mem (using the estimated tuple width, etc.)

2) without batching, we estimate the 'right number of buckets' for the
   estimated number of tuples, and keep track of the optimal number as
   tuples are added to the hash table

   - if we discover we need to start batching, we keep the current
     optimal value (which should be the same as the max number of
     buckets) and don't mess with it anymore (making it possible to
     compute batch IDs just like before)

   - also, on the first rebatch (nbatch=1 => nbatch=2) the hash table
     is resized as part of the rebatch

   - if the hash build completes without batching, we do the resize

I believe the patch is pretty much perfect. I plan to do more thorough
testing on a wide range of queries in the next few days.

I also removed the 'enable_hash_resize' GUC, because it would be more
complex to implement this properly after doing the resize as part of
rebatch etc. So either it would make the patch more complex, or it
wouldn't do what the name promises.

regards
Tomas
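For illustration, here is a minimal, self-contained sketch of the
nbuckets sizing decision described in points 1) and 2) above. This is
not the patch itself: the function and variable names (next_pow2,
choose_nbuckets, and the example constants) are hypothetical, and the
real logic lives in src/backend/executor/nodeHash.c, where the bucket
count is derived via my_log2().

    /* Sketch only: illustrates the nbuckets sizing choice, not the patch. */
    #include <stdio.h>

    /* smallest power of two >= n (assumes n >= 1) */
    static long
    next_pow2(long n)
    {
        long p = 1;

        while (p < n)
            p <<= 1;
        return p;
    }

    /*
     * Pick the initial number of buckets.
     *
     * If we already know more than one batch will be needed, size the
     * bucket array for a full work_mem worth of tuples, so nbuckets never
     * has to change later and batch numbers stay stable.  With a single
     * expected batch, size it for the estimated row count (roughly one
     * tuple per bucket) and resize later if the estimate was wrong.
     */
    static long
    choose_nbuckets(long work_mem_bytes, long est_tuples,
                    long est_tuple_width, int nbatch)
    {
        if (nbatch > 1)
            return next_pow2(work_mem_bytes / est_tuple_width);

        return next_pow2(est_tuples);
    }

    int
    main(void)
    {
        long work_mem = 64L * 1024 * 1024;  /* 64 MB, hypothetical setting */
        long tuple_width = 64;              /* estimated bytes per tuple */

        /* batching expected up front: buckets sized for the whole work_mem */
        printf("nbatch > 1: nbuckets = %ld\n",
               choose_nbuckets(work_mem, 100000, tuple_width, 2));

        /* single batch expected: buckets sized for the row estimate */
        printf("nbatch = 1: nbuckets = %ld\n",
               choose_nbuckets(work_mem, 100000, tuple_width, 1));

        return 0;
    }

The point of the nbatch > 1 branch is that the bucket count is fixed
before any tuples are assigned to batches, so the hash bits used for the
batch number never move. Growing nbuckets only happens while there is a
single batch, or once, as part of the first nbatch=1 => nbatch=2 rebatch.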