Re: WIP: bloom filter in Hash Joins with batches
From | Tomas Vondra |
---|---|
Subject | Re: WIP: bloom filter in Hash Joins with batches |
Date | |
Msg-id | 5685924B.6070403@2ndquadrant.com |
In reply to | WIP: bloom filter in Hash Joins with batches (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
Hi,

attached is v2 of the patch, with a number of improvements:

0) This relies on the other hashjoin patches (delayed build of buckets and batching), as they allow sizing the bloom filter.

1) enable_hashjoin_bloom GUC

This is mostly meant for debugging and testing, not for committing.

2) Outer joins should be working fine now. That is, the results should be correct, and the join should be faster because outer rows without matches should not be batched at all.

3) The bloom filter is now built for all hash joins, not just when batching is happening. I've been a bit skeptical about the non-batched cases, but it seems that I can get a sizable speedup (~20-30%, depending on the selectivity of the join).

4) The code is refactored quite a bit, adding BloomFilterData instead of just sprinkling the fields on HashJoinState or HashJoinTableData.

5) To size the bloom filter, we now use a HyperLogLog counter, which we now have in core thanks to the sorting improvements done by Peter Geoghegan. This allows making the bloom filter much smaller when possible.

The patch also extends the HyperLogLog API a bit (which I'll submit to the CF independently).

There's a bunch of comments in the code, mostly with ideas about more possible improvements.

The main piece missing in the patch (IMHO) is optimizer code deciding whether to enable bloom filters for the hash join, based on cardinality estimates. Also missing is executor code that disables the bloom filter if it turns out to be inefficient. I don't think that's a major issue at this point, though, and I think it'll be easier to do based on testing the current patch.

regards

-- 
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
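To illustrate point 5, here is a minimal standalone sketch of sizing a bloom filter from a distinct-value estimate (such as one a HyperLogLog counter might produce) and then probing it. It is not the code from the patch: the BloomFilter struct, the bloom_* functions, the fpr_log2 parameter and demo_hash are all hypothetical names made up for this example, and the sizing uses the textbook optimum rather than whatever the patch actually does. Expressing the target false positive rate as 2^-e keeps the arithmetic simple: the optimum becomes m = n * e / ln(2) bits and k = e probes per key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's BloomFilterData; fields are assumed. */
typedef struct BloomFilter
{
    uint64_t    nbits;      /* m: number of bits in the filter */
    int         nhashes;    /* k: bits set/tested per key */
    uint8_t    *bits;       /* bit array, zero-initialized */
} BloomFilter;

/*
 * Size the filter for ndistinct keys and a false positive rate of
 * 2^-fpr_log2.  For p = 2^-e the textbook optimum simplifies to
 * m = n * e / ln(2) bits and k = e probes per key.
 */
BloomFilter *
bloom_create(uint64_t ndistinct, int fpr_log2)
{
    const double ln2 = 0.6931471805599453;
    BloomFilter *bf;

    assert(ndistinct > 0 && fpr_log2 > 0);

    bf = malloc(sizeof(BloomFilter));
    if (bf == NULL)
        return NULL;

    bf->nbits = (uint64_t) ((double) ndistinct * fpr_log2 / ln2) + 1;
    bf->nhashes = fpr_log2;
    bf->bits = calloc((bf->nbits + 7) / 8, 1);
    if (bf->bits == NULL)
    {
        free(bf);
        return NULL;
    }
    return bf;
}

void
bloom_free(BloomFilter *bf)
{
    free(bf->bits);
    free(bf);
}

/* Demo-only 64-bit mixer (splitmix64 finalizer); not PostgreSQL's hash. */
uint64_t
demo_hash(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Derive the i-th bit position from one 64-bit hash (double hashing). */
static uint64_t
bloom_bit(const BloomFilter *bf, uint64_t hash, int i)
{
    uint64_t    h1 = hash & 0xffffffffULL;
    uint64_t    h2 = (hash >> 32) | 1;  /* forced odd, so never zero */

    return (h1 + (uint64_t) i * h2) % bf->nbits;
}

/* Called for every inner (build side) tuple's hash value. */
void
bloom_add(BloomFilter *bf, uint64_t hash)
{
    for (int i = 0; i < bf->nhashes; i++)
    {
        uint64_t    pos = bloom_bit(bf, hash, i);

        bf->bits[pos / 8] |= (uint8_t) (1u << (pos % 8));
    }
}

/* false: key is definitely absent; true: key may be present. */
bool
bloom_maybe_contains(const BloomFilter *bf, uint64_t hash)
{
    for (int i = 0; i < bf->nhashes; i++)
    {
        uint64_t    pos = bloom_bit(bf, hash, i);

        if ((bf->bits[pos / 8] & (1u << (pos % 8))) == 0)
            return false;
    }
    return true;
}
```

In a hash join, the build side would call something like bloom_add for each inner tuple, and the probe side would check bloom_maybe_contains before looking in the hash table or writing the outer tuple to a batch file. A false result means the tuple cannot have a match, which is what lets outer rows without matches skip batching (point 2). A smaller distinct-value estimate shrinks nbits directly, which is the benefit described in point 5.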