Re: WIP: Hash Join-Filter Pruning using Bloom Filters
From | Jonah H. Harris |
---|---|
Subject | Re: WIP: Hash Join-Filter Pruning using Bloom Filters |
Date | |
Msg-id | 36e682920811021450vaf7642bve38d1e5e1025ac60@mail.gmail.com |
In reply to | Re: WIP: Hash Join-Filter Pruning using Bloom Filters ("Hannes Eder" <hannes@hanneseder.net>) |
Responses | Re: WIP: Hash Join-Filter Pruning using Bloom Filters |
List | pgsql-hackers |
On Sun, Nov 2, 2008 at 5:36 PM, Hannes Eder <hannes@hanneseder.net> wrote:
> On Sun, Nov 2, 2008 at 10:49 PM, Jonah H. Harris <jonah.harris@gmail.com> wrote:
>> Similarly, I
>> created a GUC to enable pruning, named bloom_pruning.
>
> I guess calls to bloom_filter_XXX should be surrounded by "if
> (bloom_pruning) ..." or a similar construct, i.e. make use of the GUC
> variable bloom_pruning in the rest of the code.

It's effective as-is for a preliminary patch. The GUC code is the
least of my worries.

> Can you provide some figures on the performance impact of the bloom filter?

It depends on the queries. I've been trying to find a good suite of
hash join tests... but not much luck.

CREATE TABLE t1 (id INTEGER PRIMARY KEY, x INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER);
INSERT INTO t1 (SELECT ge, ge % 100 FROM generate_series(1, 1000000) ge);
INSERT INTO t2 (SELECT * FROM t1);
VACUUM ANALYZE;

SELECT COUNT(*) FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;

SET bloom_pruning TO off;

EXPLAIN SELECT COUNT(*) FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing
SELECT COUNT(*) FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing
EXPLAIN SELECT * FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing
SELECT * FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing

SET bloom_pruning TO on;

\timing
SELECT COUNT(*) FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing
EXPLAIN SELECT * FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing
SELECT * FROM t1, t2
WHERE t1.id = t2.id AND t1.x < 30 AND t2.x > 10;
\timing

-- Without Pruning
Time: 1142.843 ms
Time: 1567.355 ms

-- With Pruning
Time: 891.557 ms
Time: 1269.634 ms

-- 
Jonah H. Harris, Senior DBA
myYearbook.com
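
For readers unfamiliar with the technique under discussion, here is a minimal sketch of Bloom-filter pruning in a hash join. It is not the patch's code: the class BloomFilter, the function hash_join_count, the filter size, the hash scheme, and the choice of t1 as the build side are all assumptions made for illustration. The bloom_pruning argument only mimics the kind of "if (bloom_pruning) ..." guard suggested in the quoted text. The data is a scaled-down version of the t1/t2 test above (10,000 rows instead of 1,000,000).

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes and hashing are assumptions."""

    def __init__(self, num_bits=1 << 20, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from one 128-bit digest.
        digest = hashlib.blake2b(repr(key).encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


def hash_join_count(build_rows, probe_rows, bloom_pruning=True):
    """Join on id; return (match count, number of hash-table probes)."""
    table = {}
    bloom = BloomFilter() if bloom_pruning else None
    for row_id, x in build_rows:
        table.setdefault(row_id, []).append(x)
        if bloom is not None:
            bloom.add(row_id)

    matches = 0
    probes = 0
    for row_id, _x in probe_rows:
        # Pruning step: skip probe rows the filter rules out.
        if bloom is not None and not bloom.might_contain(row_id):
            continue
        probes += 1
        matches += len(table.get(row_id, ()))
    return matches, probes


# Scaled-down version of the test tables, with the WHERE filters applied.
rows = [(i, i % 100) for i in range(1, 10001)]
t1 = [r for r in rows if r[1] < 30]   # assumed build side
t2 = [r for r in rows if r[1] > 10]   # assumed probe side

matches_off, probes_off = hash_join_count(t1, t2, bloom_pruning=False)
matches_on, probes_on = hash_join_count(t1, t2, bloom_pruning=True)
print(matches_off, probes_off, matches_on, probes_on)
```

Both runs must return the same match count, since a Bloom filter has no false negatives; only the number of hash-table probes changes. For reference, the timings reported above work out to roughly 22% less elapsed time for the COUNT(*) query and roughly 19% less for the SELECT * query with pruning enabled.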