Re: HashJoin w/option to unique-ify inner rel
From | Robert Haas |
---|---|
Subject | Re: HashJoin w/option to unique-ify inner rel |
Date | |
Msg-id | 603c8f070904161843g457c54bqe252adbf899e69c0@mail.gmail.com |
In response to | Re: HashJoin w/option to unique-ify inner rel ("Lawrence, Ramon" <ramon.lawrence@ubc.ca>) |
List | pgsql-hackers |
> If HashAggregate is faster, then the question is can you make it better
> by avoiding building the hash structure twice. I haven't considered all
> the possibilities, but the situation you have used as an example, an IN
> query, seems workable. Instead of translating to a hash
> aggregate/hash/hash join query plan, it may be possible to create a
> special hash join node that does uniquefy.

Yeah, that's what I was looking at. The problem is that unique-ify is not free either - we have to invoke the appropriate comparison operators for every tuple in the bucket for which the hash values match exactly. So, for example, if the input has K copies each of N items, I'll need to do (K - 1) * N comparisons, assuming no hash collisions. In return, the number of tuples in each bucket will be reduced by a factor of K, but that doesn't actually save very much, because I can reject all of those with an integer comparison anyway, again assuming no hash collisions, so it's pretty cheap.

If the hash join was on track to go multi-batch, then unique-ifying it on the fly makes a lot of sense... otherwise, I'm not sure it's really going to be a win.

Anyhow, further analysis needed...

...Robert
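
To make the (K - 1) * N figure concrete, here is a minimal sketch of unique-ifying while building the hash table. It is a toy chained hash table in Python, not PostgreSQL's executor code; `build_unique`, `nbuckets`, and the sample input are hypothetical names chosen for illustration. It counts only the full equality-operator calls, treating the stored-hash check as the cheap integer rejection described above.

```python
def build_unique(keys, nbuckets=1024):
    """Insert keys into a chained hash table, dropping duplicates.

    Returns (buckets, comparisons), where comparisons counts calls to the
    full equality operator. Entries whose stored hash value differs are
    rejected by an integer comparison and are not counted.
    """
    buckets = [[] for _ in range(nbuckets)]
    comparisons = 0
    for key in keys:
        h = hash(key)
        bucket = buckets[h % nbuckets]
        duplicate = False
        for stored_hash, stored_key in bucket:
            if stored_hash != h:
                continue              # cheap integer rejection
            comparisons += 1          # full comparison operator invoked
            if stored_key == key:
                duplicate = True
                break
        if not duplicate:
            bucket.append((h, key))
    return buckets, comparisons


# Assumed sample input: K copies each of N distinct small integers.
# Distinct small non-negative ints have distinct hash values in CPython,
# so this matches the "no hash collisions" assumption.
N, K = 100, 5
keys = [i for i in range(N) for _ in range(K)]
buckets, comparisons = build_unique(keys)
stored = sum(len(b) for b in buckets)
print(comparisons, stored)
```

Under these assumptions, the first copy of each item finds no entry with a matching hash value and each of the remaining K - 1 copies costs exactly one full comparison, so the build pays (K - 1) * N comparisons to shrink the table from K * N entries to N.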