Re: tweaking NTUP_PER_BUCKET
From:        Tomas Vondra
Subject:     Re: tweaking NTUP_PER_BUCKET
Date:
Msg-id:      53C2DECC.2010408@fuzzy.cz
In reply to: Re: tweaking NTUP_PER_BUCKET (Tomas Vondra <tv@fuzzy.cz>)
Responses:   Re: tweaking NTUP_PER_BUCKET
List:        pgsql-hackers
On 11.7.2014 19:25, Tomas Vondra wrote:
> 2) walking through the tuples sequentially
> ------------------------------------------
>
> The other option is not to walk the tuples through buckets, but by
> walking through the chunks - we know the tuples are stored as
> HashJoinTuple/MinimalTuple, so it shouldn't be that difficult. And by
> walking the tuples in the order they're stored, we may just rearrange
> the tuples in already allocated memory. And maybe it could be even
> faster than the current code, because of the sequential access to the
> memory (as opposed to the random access through buckets) and CPU
> caches. So I like this approach - it's simple, clean and shouldn't
> add any overhead (memory or CPU).

So, attached is a patch that implements this. It's not very complicated
and the resulting code is surprisingly readable (IMHO comparable to the
previous code). Basics of the patch:

(1) adds a HashChunkData structure - a buffer used for dense allocation
    of tuples, 16kB by default

(2) adds 'chunks_batch' to HashJoinTableData, a linked list of
    HashChunkData chunks

(3) the allocation itself is done in chunk_alloc - in short, it either
    allocates the tuple in the current chunk, or adds a new chunk when
    the current one is full (see the first sketch below)

(4) ExecHashTableInsert calls chunk_alloc (instead of the regular
    MemoryContextAlloc)

(5) the main change is in ExecHashIncreaseNumBatches, which can no
    longer pfree individual tuples (that doesn't work with dense
    allocation), so it reads the tuples from the chunks directly and
    simply rebuilds the buckets from scratch (see the second sketch
    below) - thanks to this we only need one extra chunk of memory,
    not a complete copy, and we don't need to duplicate the buckets
    at all

From the tests I've done so far this seems to work perfectly - it still
saves the memory overhead, and I've seen no difference in performance.

The current patch only implements this for tuples in the main hash
table, not for the skew buckets. I plan to do that too, but it will
require separate chunks for each skew bucket (so that one bucket can be
removed without messing with all the others). The chunks for skew
buckets should be smaller (I'm thinking ~4kB), because there will be
more of them, but OTOH those buckets hold the frequent values, so the
chunks should not remain empty. But let's see if the current patch
misses something important first.

regards
Tomas
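
To make points (1)-(3) concrete, here is a minimal self-contained
sketch of the dense allocation scheme. It is not the patch itself:
only the names HashChunkData, chunk_alloc and the 16kB default come
from the description above; the field names, the plain malloc (instead
of an allocation in the hash table's memory context) and the omitted
alignment rounding and error handling are simplifications made for
this example.

    #include <stddef.h>
    #include <stdlib.h>

    #define HASH_CHUNK_SIZE (16 * 1024)     /* 16kB default chunk size */

    typedef struct HashChunkData
    {
        struct HashChunkData *next; /* next chunk in the chunks_batch list */
        size_t maxlen;              /* size of the data[] buffer */
        size_t used;                /* bytes already handed out */
        char   data[];              /* tuples are copied densely in here */
    } HashChunkData;

    /*
     * Return a 'size'-byte slot from the current chunk, or start a new
     * chunk when the current one doesn't have enough free space left.
     * Requests larger than HASH_CHUNK_SIZE get a chunk of their own.
     */
    static void *
    chunk_alloc(HashChunkData **chunks, size_t size)
    {
        HashChunkData *chunk = *chunks;
        size_t chunklen = (size > HASH_CHUNK_SIZE) ? size : HASH_CHUNK_SIZE;

        if (chunk == NULL || chunk->used + size > chunk->maxlen)
        {
            /* current chunk full (or none yet) - push a fresh one */
            chunk = malloc(offsetof(HashChunkData, data) + chunklen);
            chunk->maxlen = chunklen;
            chunk->used = 0;
            chunk->next = *chunks;
            *chunks = chunk;
        }

        chunk->used += size;
        return chunk->data + chunk->used - size;
    }

The point of the list is that releasing all the tuples means freeing a
handful of whole chunks instead of pfree-ing each tuple individually.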
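
And a sketch of the batch-split loop from (5), in the same toy model
(it reuses HashChunkData/chunk_alloc from the first sketch). Again,
this is an illustration, not the patch: the Tuple layout with an
embedded length and hash, the modulo-based bucket/batch computation,
and the save_to_batch_file stub are stand-ins for the actual
HashJoinTuple/MinimalTuple layout, ExecHashGetBucketAndBatch and the
batch temp files; alignment rounding is again omitted.

    #include <string.h>

    typedef struct Tuple
    {
        struct Tuple *next;     /* bucket chain link */
        unsigned int  hash;     /* join hash value */
        size_t        len;      /* total entry size, header included */
        /* tuple payload follows */
    } Tuple;

    typedef struct HashTable
    {
        HashChunkData *chunks;  /* dense storage ('chunks_batch' above) */
        Tuple        **buckets;
        int            nbuckets;
        int            nbatch;
        int            curbatch;
    } HashTable;

    /* stub: the real code writes the tuple to a batch temp file */
    static void save_to_batch_file(HashTable *ht, Tuple *t, int batchno);

    /* Double the number of batches, rebuilding the buckets by walking
     * the chunks sequentially instead of walking the bucket chains. */
    static void
    increase_num_batches(HashTable *ht)
    {
        HashChunkData *old = ht->chunks;

        ht->chunks = NULL;      /* kept tuples go into fresh chunks */
        ht->nbatch *= 2;
        memset(ht->buckets, 0, ht->nbuckets * sizeof(Tuple *));

        while (old != NULL)
        {
            HashChunkData *next = old->next;

            for (size_t pos = 0; pos < old->used; )
            {
                Tuple  *t = (Tuple *) (old->data + pos);
                size_t  len = t->len;
                int     bucketno = t->hash % ht->nbuckets;
                int     batchno = (t->hash / ht->nbuckets) % ht->nbatch;

                if (batchno == ht->curbatch)
                {
                    /* tuple stays: copy into new chunks, relink bucket */
                    Tuple *copy = chunk_alloc(&ht->chunks, len);

                    memcpy(copy, t, len);
                    copy->next = ht->buckets[bucketno];
                    ht->buckets[bucketno] = copy;
                }
                else
                    save_to_batch_file(ht, t, batchno);

                pos += len;
            }

            /* the whole chunk is released at once - no per-tuple pfree */
            free(old);
            old = next;
        }
    }

Because each old chunk is freed as soon as its tuples have been copied
or spilled, the rebuild needs at most one extra chunk of memory at any
moment, which is what makes this cheaper than duplicating the table.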
Attachments