Re: Hash vs. HashJoin nodes

From	Neil Conway
Subject	Re: Hash vs. HashJoin nodes
Date	March 31, 2005, 05:54:09
Msg-id	424B825F.7050904@samurai.com
In reply to	Re: Hash vs. HashJoin nodes (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Hash vs. HashJoin nodes Re: Hash vs. HashJoin nodes
List	pgsql-hackers

Tom Lane wrote:
> One small objection is that we'd lose the ability to separately display
> the time spent building the hash table in EXPLAIN ANALYZE output.  It's
> probably not super important, but might be a reason to keep two plan
> nodes in the tree.

Hmm, true. Perhaps then just hacking the hash node so that hash join 
pulls on it twice (the first time for a single tuple, the second time 
for the rest) is the way to go. Since the hash node is essentially an 
implementation detail of hash join, I don't feel _too_ bad about 
dirtying up its API a bit...
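
To make the two-stage pull concrete, here is a rough sketch in Python rather than the executor's C. The `HashNode` class and its `first_tuple` and `build` methods are hypothetical names made up for illustration, not the actual node API, and rows are assumed never to be `None`:

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


class HashNode:
    """Hypothetical hash node that the join can pull on in two stages."""

    def __init__(self, child: Iterable[Any], key: Callable[[Any], Any]):
        self._child: Iterator[Any] = iter(child)
        self._key = key
        self._first: Optional[Any] = None   # tuple saved by stage 1
        self._have_first = False
        self.table: Optional[Dict[Any, List[Any]]] = None

    def first_tuple(self) -> Optional[Any]:
        """Stage 1: pull a single tuple from the child without hashing it.

        Returns None if the child is empty (assumes rows are never None).
        """
        if not self._have_first:
            self._first = next(self._child, None)
            self._have_first = True
        return self._first

    def build(self) -> Dict[Any, List[Any]]:
        """Stage 2: hash the saved tuple plus the rest of the child's output."""
        table: Dict[Any, List[Any]] = defaultdict(list)
        first = self.first_tuple()
        if first is not None:
            table[self._key(first)].append(first)
            for row in self._child:
                table[self._key(row)].append(row)
        self.table = dict(table)
        return self.table
```

Since `build` stays a separate call on the hash node, the time spent in it could presumably still be reported separately from the join itself.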

> I recall having looked at related ideas (not this one exactly) and being
> discouraged by the fact that pulling a tuple from *either* input first
> is demonstrably a losing strategy, since either input might have a very
> high startup cost.

That is true, but I think this particular formulation avoids that 
problem. If we look at the inner input first and find it is non-empty, we 
will *always* have to pull on the outer input at least once. The 
question is merely whether we go to the trouble of building the hash 
table before or after we do the initial pull on the outer relation. IOW, 
I think this tweak would be universally better than the existing code.
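
The ordering described above can be sketched roughly as follows. This is Python for brevity, not executor code; `hash_join_inner`, its parameters, and the `stats` dictionary are illustrative assumptions, and only the inner-join case is covered:

```python
from collections import defaultdict
from itertools import chain
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

_EMPTY = object()  # sentinel meaning "this input produced no tuple"


def hash_join_inner(
    outer: Iterable[Any],
    inner: Iterable[Any],
    outer_key: Callable[[Any], Any],
    inner_key: Callable[[Any], Any],
    stats: Dict[str, bool],
) -> Iterator[Tuple[Any, Any]]:
    """Inner hash join that delays the hash build until the outer input
    is known to be non-empty. `stats` records whether the build happened."""
    stats["built_hash_table"] = False
    inner_it, outer_it = iter(inner), iter(outer)

    # Step 1: pull a single tuple from the inner input.
    first_inner = next(inner_it, _EMPTY)
    if first_inner is _EMPTY:
        return  # empty inner input: no result, outer is never touched

    # Step 2: inner is non-empty, so the outer input has to be pulled
    # at least once in any case. Do that before building the hash table.
    first_outer = next(outer_it, _EMPTY)
    if first_outer is _EMPTY:
        return  # empty outer input: the hash build is skipped entirely

    # Step 3: both inputs are non-empty, so build the table from the
    # saved inner tuple plus the rest of the inner input.
    table: Dict[Any, List[Any]] = defaultdict(list)
    for row in chain([first_inner], inner_it):
        table[inner_key(row)].append(row)
    stats["built_hash_table"] = True

    # Step 4: probe with the saved outer tuple, then the remaining ones.
    for o in chain([first_outer], outer_it):
        for i in table.get(outer_key(o), ()):
            yield (o, i)
```

With an empty outer input and a large inner input, this version reads one inner tuple and one (failed) outer pull, then stops without hashing anything.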

> This could all get pretty hairy when you consider that it has to still
> work for left joins too ...

Right; I was planning to bail and only do this for inner joins.

-Neil
