Re: Hash vs. HashJoin nodes
From | Neil Conway |
---|---|
Subject | Re: Hash vs. HashJoin nodes |
Date | |
Msg-id | 424B825F.7050904@samurai.com |
In reply to | Re: Hash vs. HashJoin nodes (Tom Lane <tgl@sss.pgh.pa.us>) |
Replies | Re: Hash vs. HashJoin nodes; Re: Hash vs. HashJoin nodes |
List | pgsql-hackers |
Tom Lane wrote:
> One small objection is that we'd lose the ability to separately display
> the time spent building the hash table in EXPLAIN ANALYZE output. It's
> probably not super important, but might be a reason to keep two plan
> nodes in the tree.

Hmm, true. Perhaps then just hacking the hash node so that hash join pulls on it twice (the first time for a single tuple, the second time for the rest) is the way to go. Since the hash node is essentially an implementation detail of hash join, I don't feel _too_ bad about dirtying up its API a bit...

> I recall having looked at related ideas (not this one exactly) and being
> discouraged by the fact that pulling a tuple from *either* input first
> is demonstrably a losing strategy, since either input might have a very
> high startup cost.

That is true, but I think this particular formulation avoids that problem. If we look at the inner input first and find it is non-null, we will *always* have to pull on the outer input at least once. The question is merely whether we go to the trouble of building the hash table before or after we do the initial pull on the outer relation. IOW, I think this tweak would be universally better than the existing code.

> This could all get pretty hairy when you consider that it has to still
> work for left joins too ...

Right; I was planning to bail and only do this for inner joins.

-Neil
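
For readers following the thread, the pull order proposed above can be sketched roughly as follows. This is a toy model over Python iterables, not PostgreSQL executor code: the function name `hash_join_inner`, the key callbacks, and the `_EMPTY` sentinel are all hypothetical, and it covers only the inner-join case mentioned in the message.

```python
from collections import defaultdict
from itertools import chain

_EMPTY = object()  # hypothetical sentinel meaning "input produced no tuple"


def hash_join_inner(outer, inner, outer_key, inner_key):
    """Toy inner hash join that delays building the hash table."""
    # First pull on the inner (hash) side: a single tuple only.
    inner_iter = iter(inner)
    first_inner = next(inner_iter, _EMPTY)
    if first_inner is _EMPTY:
        return  # empty inner: no join rows, outer is never touched

    # Inner is non-empty, so the outer must be pulled at least once anyway.
    outer_iter = iter(outer)
    first_outer = next(outer_iter, _EMPTY)
    if first_outer is _EMPTY:
        return  # empty outer: skip building the hash table entirely

    # Second pull on the inner side: consume the rest and build the table.
    table = defaultdict(list)
    for row in chain([first_inner], inner_iter):
        table[inner_key(row)].append(row)

    # Probe phase, starting with the outer tuple already fetched.
    for o in chain([first_outer], outer_iter):
        for i in table.get(outer_key(o), ()):
            yield (o, i)
```

In this sketch, the only change relative to an eager build is where the table construction sits in relation to the first outer pull. An empty outer input therefore costs one inner tuple instead of a full scan of the inner side.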