Re: a few crazy ideas about hash joins
От | Lawrence, Ramon |
---|---|
Тема | Re: a few crazy ideas about hash joins |
Дата | |
Msg-id | 6EEA43D22289484890D119821101B1DF05190DEB@exchange20.mercury.ad.ubc.ca обсуждение исходный текст |
Ответ на | a few crazy ideas about hash joins (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: a few crazy ideas about hash joins
Re: a few crazy ideas about hash joins |
Список | pgsql-hackers |
> While investigating some performance problems recently I've had cause > to think about the way PostgreSQL uses hash joins. So here are a few > thoughts. Some of these have been brought up before. > > 1. When the hash is not expected to spill to disk, it preserves the > pathkeys of the outer side of the join. If the optimizer were allowed > to assume that, it could produce significantly more efficient query > plans in some cases. This is definitely possible, but you will have to dynamically modify the execution path if the hash join ends up to be more than one batch. > 3. Avoid building the exact same hash table twice in the same query. > This happens more often you'd think. For example, a table may have > two columns creator_id and last_updater_id which both reference person > (id). If you're considering a hash join between paths A and B, you > could conceivably check whether what is essentially a duplicate of B > has already been hashed somewhere within path A. If so, you can reuse > that same hash table at zero startup-cost. > 4. As previously discussed, avoid hashing for distinct and then > hashing the results for a hash join on the same column with the same > operators. > > Thoughts on the value and/or complexity of implementation of any of these? I would be interested in working with you on any of these changes to hash join if you decide to pursue them. I am especially interested in looking at the hash aggregation code and potentially improving its efficiency. We have implemented a multi-way hash join (can join more than 2 tables at a time) which may help with cases #3 and #4. Performance results look very good, and we are planning on building a patch for this over the summer. -- Ramon Lawrence
В списке pgsql-hackers по дате отправления: