Re: Difference between ON and WHERE in JOINs

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: Difference between ON and WHERE in JOINs
Дата
Msg-id 026001cd96c3$1d4b0fa0$57e12ee0$@yahoo.com
обсуждение исходный текст
Ответ на Re: Difference between ON and WHERE in JOINs  (Jean-Christophe Boggio <postgresql@thefreecat.org>)
Список pgsql-general
Comments embedded between ================

>
> David,
>
> Thanks for the verbose explanations, really useful.
>
> However, in my case :
> - type is never NULL

============================================
Assuming you mean the "og.type" (i.e., the right-side of the join) even
though the column itself is defined NOT NULL if there are no matching
records supplied from that table the column still appears in the "joined
relation" and in that relation the system uses NULL to represent that no
record was provided from the corresponding table.
============================================

> - there are no lines from uidinoldgroups that match the "FROM ldap" join.
>
> But I think I got it :
>
> drop table if exists tmpA;
> drop table if exists tmpB;
> create temp table tmpA (name varchar);
> insert into tmpA values ('jack');
> insert into tmpA values ('joe');
> create temp table tmpB (name varchar, value int); insert into tmpB
> values('jack',10);
>
> -- case (1)
> select a.name,COUNT(b.*)
> from tmpA a
> LEFT JOIN tmpB b ON a.name=b.name AND b.value>0 group by a.name
>
> This gives :
> Jack 1
> Joe 0
>
======================================================================

The only danger here is that count(b.*) evaluates to zero while all the
values in b.*are actually NULL.

Try playing with "ARRAY_AGG()" instead of "COUNT()" to get a better feel for
what is being returned.  Trying adding some more records to A and B then
run:

SELECT name, array_agg(b.value)
FROM tmpA a
LEFT JOIN tmpB b USING (name)
GROUP BY name

=======================================================================
> But :
>
> -- case (2)
> select a.name,COUNT(b.*)
> from tmpA a
> LEFT JOIN tmpB b ON a.name=b.name
> WHERE b.value>0
> group by a.name
>
> gives :
> Jack 1
>
> No mention of Joe.
====================================================================
Correct, because Joe does not have any records on (a LEFT JOIN B) with
"value > 0" which is mandatory.

Basically once you get to the WHERE clause you really do not have distinct
"a" and "b" tables but rather you operate on "a LEFT JOIN b" where any
columns supplied by "b" may be NULL/missing.

Again, the same does apply to INNER JOIN but because no NULLs can be
introduced by an INNER JOIN thinking of "a" and "b" as still being distinct
relations doesn't impact the outcome.

In other words:

SELECT ...
FROM (tmpA LEFT JOIN tmpB USING name) tmpAB

WHERE (tmpAB.value > 0) -- this is what you are really saying in case # 2;
and within tmpAB "value" can be NULL because of the LEFT JOIN

WHERE tmpB.value > 0 -- this is an error; relation tmpB is not visible at
this point
======================================================================
>
> Though :
>
> -- case (3)
> select a.name,COUNT(b.*)
> from tmpA a
> LEFT JOIN tmpB b ON a.name=b.name
> WHERE (b.value>0 or b.value is null)
> group by a.name
>
> Brings back Joe. The WHERE clause is evaluated AFTER the JOIN.
>
> A subtle concept difference that makes big differences in the results.
>
> Many thanks for the enlightenment.
>
> And also for making me look at CTE constructs which I did not know of.
> They make things much clearer :
>
> with b2 as (
>    select name,value
>    from tmpB
>    where value>0
> )
> SELECT a.name, count(b.*)
> FROM tmpA a
> LEFT JOIN b2 b ON a.name=b.name
> GROUP BY a.name
>
> Have a nice day, you made mine rich !
>
========================================================
Glad I could be of assistance!

David J.




В списке pgsql-general по дате отправления:

Предыдущее
От: Jean-Christophe Boggio
Дата:
Сообщение: Re: Difference between ON and WHERE in JOINs
Следующее
От: Gavin Flower
Дата:
Сообщение: Re: Difference between ON and WHERE in JOINs