Allow simplehash to use already-calculated hash values
From | Jeff Davis |
---|---|
Subject | Allow simplehash to use already-calculated hash values |
Date | |
Msg-id | 48abe675e1330f0c264ab2fe0d4ff23eb244f9ef.camel@j-davis.com |
Responses |
Re: Allow simplehash to use already-calculated hash values
|
List | pgsql-hackers |
The attached small patch adds new entry points to simplehash.h that allow the caller to pass in the already-calculated hash value, so that simplehash doesn't need to recalculate it.

This is helpful for Memory-Bounded Hash Aggregation[1], which uses the hash value for multiple purposes. For instance, if the hash table is full and the group is not already present in the hash table, it needs to spill the tuple to disk. In that case, it would use the hash value for the initial lookup, then to select the right spill partition. Later, when it processes the batch, it will again need the same hash value to perform a lookup. By separating the hash value calculation from where it's used, we can avoid needlessly recalculating it for each of these steps.

There is already an option for simplehash to cache the calculated hash value and return it with the entry, but that doesn't quite fit the need. The hash value is needed in cases where the lookup fails, because that is when the tuple must be spilled; but if the lookup fails, it returns NULL, discarding the calculated hash value.

I am including this patch separately from Hash Aggregation because it is a small and independently-reviewable change.

In theory, this could add overhead for "SH_SCOPE extern" for callers not specifying their own hash value, because it adds an extra external function call. I looked at the generated LLVM and it's a simple tail call, and I looked at the generated assembly and it's just an extra jmp. I tested by doing a hash aggregation of 30M zeroes, which should exercise that path a lot, and I didn't see any difference. Also, once we actually use this for hash aggregation, there will be no "SH_SCOPE extern" callers that don't specify the hash value anyway.

Regards,
Jeff Davis

[1] https://postgr.es/m/507ac540ec7c20136364b5272acbcd4574aa76ef.camel%40j-davis.com
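To make the idea in the message concrete, here is a minimal, self-contained sketch of the pattern it describes. It is not simplehash.h and not the attached patch: the toy table, the `toy_*` names, the hash function, and the choice to pick a spill partition from the top hash bits are all assumptions made for illustration. It only shows a lookup entry point that accepts a caller-supplied hash, a plain lookup that forwards to it, and a caller that reuses one hash value for the lookup, the insert, and the spill-partition choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NSLOTS      8   /* power of two, so a mask selects the bucket */
#define TOY_NPARTITIONS 4   /* hypothetical number of spill partitions */

typedef struct ToyEntry
{
    uint32_t key;
    uint32_t hash;          /* hash stored alongside the entry */
    bool     used;
} ToyEntry;

typedef struct ToyTable
{
    ToyEntry slots[TOY_NSLOTS];
    int      nused;
    int      limit;         /* stand-in for a memory bound; keep < TOY_NSLOTS */
    int      nhashcalls;    /* counts how often toy_hash() ran */
} ToyTable;

/* Hypothetical hash function (an integer mixer); counts its invocations. */
static uint32_t
toy_hash(ToyTable *tab, uint32_t key)
{
    tab->nhashcalls++;
    key ^= key >> 16;
    key *= 0x85ebca6bU;
    key ^= key >> 13;
    key *= 0xc2b2ae35U;
    key ^= key >> 16;
    return key;
}

/* Lookup with a caller-supplied hash; linear probing, no deletions. */
static ToyEntry *
toy_lookup_hash(ToyTable *tab, uint32_t key, uint32_t hash)
{
    uint32_t i = hash & (TOY_NSLOTS - 1);

    for (int probes = 0; probes < TOY_NSLOTS; probes++)
    {
        ToyEntry *e = &tab->slots[i];

        if (!e->used)
            return NULL;    /* empty slot ends the probe sequence */
        if (e->hash == hash && e->key == key)
            return e;
        i = (i + 1) & (TOY_NSLOTS - 1);
    }
    return NULL;
}

/* Plain lookup: computes the hash, then forwards to the _hash variant. */
static ToyEntry *
toy_lookup(ToyTable *tab, uint32_t key)
{
    return toy_lookup_hash(tab, key, toy_hash(tab, key));
}

/* Insert with a caller-supplied hash; the key must not already be present. */
static ToyEntry *
toy_insert_hash(ToyTable *tab, uint32_t key, uint32_t hash)
{
    uint32_t i = hash & (TOY_NSLOTS - 1);

    assert(tab->nused < TOY_NSLOTS);
    while (tab->slots[i].used)
        i = (i + 1) & (TOY_NSLOTS - 1);
    tab->slots[i].key = key;
    tab->slots[i].hash = hash;
    tab->slots[i].used = true;
    tab->nused++;
    return &tab->slots[i];
}

/*
 * Process one key. Returns -1 if the key is (now) in the table, otherwise
 * the spill partition it would go to. The hash is computed exactly once and
 * reused for the lookup, the insert, and the partition choice.
 */
static int
toy_process(ToyTable *tab, uint32_t key)
{
    uint32_t hash = toy_hash(tab, key);

    if (toy_lookup_hash(tab, key, hash) != NULL)
        return -1;
    if (tab->nused < tab->limit)
    {
        toy_insert_hash(tab, key, hash);
        return -1;
    }
    /* Table is at its limit: top 2 bits select one of 4 partitions. */
    return (int) (hash >> 30);
}
```

In this sketch a failed lookup does not lose the hash, because the caller owns it; that is the situation the message describes for spilled tuples. The sketch takes the partition from the high bits and the bucket from the low bits so the two choices are not driven by the same bits, which is a design choice of this example rather than something stated in the message.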