Re: Hash Joins vs. Bloom Filters / take 2
От | Peter Geoghegan |
---|---|
Тема | Re: Hash Joins vs. Bloom Filters / take 2 |
Дата | |
Msg-id | CAH2-Wz=FZ8r=qDX3hz5ctaDNts1+bdeVNYhXB+m+dr79XYTajQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Hash Joins vs. Bloom Filters / take 2 (Claudio Freire <klaussfreire@gmail.com>) |
Ответы |
Re: Hash Joins vs. Bloom Filters / take 2
|
Список | pgsql-hackers |
On Tue, Feb 20, 2018 at 3:17 PM, Claudio Freire <klaussfreire@gmail.com> wrote: > I've worked a lot with bloom filters, and for large false positive > rates and large sets (multi-million entries), you get bloom filter > sizes of about 10 bits per distinct item. It's generally true that you need 9.6 bits per element to get a 1% false positive rate. 1% could be considered much too low here. Do we need to eliminate 99% of all hash join probes (that find nothing to join on) to make this Bloom filter optimization worthwhile? Personally, I doubt it. > That's efficient, but it's not magic. It can still happen that the > whole set can't fit in work_mem with an acceptable false positive > rate. A merge join is always going to be the better choice when extremely memory constrained. -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: