Re: bad estimation together with large work_mem generates terrible slow hash joins
От | Tomas Vondra |
---|---|
Тема | Re: bad estimation together with large work_mem generates terrible slow hash joins |
Дата | |
Msg-id | 54121ACD.2020105@fuzzy.cz обсуждение исходный текст |
Ответ на | Re: bad estimation together with large work_mem generates terrible slow hash joins ("Tomas Vondra" <tv@fuzzy.cz>) |
Ответы |
Re: bad estimation together with large work_mem generates
terrible slow hash joins
|
Список | pgsql-hackers |
On 11.9.2014 16:33, Tomas Vondra wrote: > On 11 Září 2014, 15:31, Robert Haas wrote: >> On Wed, Sep 10, 2014 at 5:09 PM, Tomas Vondra <tv@fuzzy.cz> wrote: >>> OK. So here's v13 of the patch, reflecting this change. >> >> [...] It does three things: >> >> (1) It changes NTUP_PER_BUCKET to 1. Although this increases memory >> utilization, it also improves hash table performance, and the >> additional memory we use is less than what we saved as a result of >> commit 45f6240a8fa9d35548eb2ef23dba2c11540aa02a. >> >> (2) It changes ExecChooseHashTableSize() to include the effect of the >> memory consumed by the bucket array. If we care about properly >> respecting work_mem, this is clearly right for any NTUP_PER_BUCKET >> setting, but more important for a small setting like 1 than for the >> current value of 10. I'm not sure the logic in this function is as >> clean and simple as it can be just yet. >> (3) It allows the number of batches to increase on the fly while the >> hash join is in process. This case arises when we initially estimate >> that we only need a small hash table, and then it turns out that there >> are more tuples than we expect. Without this code, the hash table's >> load factor gets too high and things start to suck. >> >> I recommend separating this patch in two patches, the first covering >> items (1) and (2) and the second covering item (3), which really seems >> like a separate improvement. > > That probably makes sense. Attached is the patch split as suggested: (a) hashjoin-nbuckets-v14a-size.patch * NTUP_PER_BUCKET=1 * counting buckets towards work_mem * changes in ExecChooseHashTableSize (due to the other changes) (b) hashjoin-nbuckets-v14a-resize.patch * rest of the patch, that is ... * tracking optimal number of buckets * dynamic resize of the hash table * adding info the the EXPLAIN ANALYZE output regards Tomas
Вложения
В списке pgsql-hackers по дате отправления: