Re: tweaking NTUP_PER_BUCKET
От | Tomas Vondra |
---|---|
Тема | Re: tweaking NTUP_PER_BUCKET |
Дата | |
Msg-id | 53B59ADC.5060709@fuzzy.cz обсуждение исходный текст |
Ответ на | tweaking NTUP_PER_BUCKET (Tomas Vondra <tv@fuzzy.cz>) |
Ответы |
Re: tweaking NTUP_PER_BUCKET
|
Список | pgsql-hackers |
On 3.7.2014 02:13, Tomas Vondra wrote: > Hi, > > while hacking on the 'dynamic nbucket' patch, scheduled for the next CF > (https://commitfest.postgresql.org/action/patch_view?id=1494) I was > repeatedly stumbling over NTUP_PER_BUCKET. I'd like to propose a change > in how we handle it. > > > TL;DR; version > -------------- > > I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1) > once the table is built and there's free space in work_mem. The patch > mentioned above makes implementing this possible / rather simple. Attached is v1 of this experimental patch. It's supposed to be applied on top of v7 of this patch http://www.postgresql.org/message-id/53B59498.3000800@fuzzy.cz as it shared most of the implementation. So apply it like this: patch -p1 < hashjoin-nbuckets-growth-v7.patch patch -p1 < hashjoin-dynamic-ntup-v1.patch It implements the ideas outlined in the previous message, most importantly: (a) decreases NTUP_PER_BUCKET to 4 (b) checks free work_mem when deciding whether to add batch (c) after building the batches, increases the number of buckets as much as possible, i.e. up to the number of batch tuples, but not exceeding work_mem The improvements for the queries I tried so far are quite significant (partially due to the NTUP_PER_BUCKET decrease, partially due to the additional bucket count increase). The rebuild is quite fast - the patch measures and reports this as a WARNING, and the timings I've seen are ~12ms per 7MB (i.e. ~120ms for 70MB and ~1200ms for 700MB). Of course, this only makes sense when compared to how much time it saved, but for the queries I tested so far this was a good investment. However it's likely there are queries where this may not be the case, i.e. where rebuilding the hash table is not worth it. Let me know if you can construct such query (I wasn't). regards Tomas
Вложения
В списке pgsql-hackers по дате отправления: