Re: Difference between ON and WHERE in JOINs

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: Difference between ON and WHERE in JOINs
Дата
Msg-id 01ea01cd968a$a2a83bc0$e7f8b340$@yahoo.com
обсуждение исходный текст
Ответ на Re: Difference between ON and WHERE in JOINs  (Jean-Christophe Boggio <postgresql@thefreecat.org>)
Ответы Re: Difference between ON and WHERE in JOINs  (Jean-Christophe Boggio <postgresql@thefreecat.org>)
Список pgsql-general
> I have this query working :
> select profil,count(og.name)
> from ldap l
> left join uidinoldgroups ug on l.uid=ug.uid left join oldgroups og on
> og.id=ug.idgroupe and og.rne='0410030k' and og.type='g'
> where l.profilgeneral='P'
> and l.rne='0410030k'
> group by l.profil
>
> But if I put :
>     and og.rne='0410030k' and og.type='g'
> in the where part (what you suggested and what I did naturally), I get 0
> results which is quite confusing (because it's an outer join).
>
> Also, if I replace these with full outer joins, I still get 0 results.
>
> With my data, if I replace og.type='g' with og.type='m' I get MOST OF my
> resultats back (those where the count() returns more than zero). I know
this
> is specific to my data but I really don't get the behaviour.
>
> Thanks for your help,
>
> JC
>

The query you wrote is equivalent to this:

SELECT profil, count(og.name)
FROM ldap l
LEFT JOIN ... ug ON (l.uid = ug.uid)
LEFT JOIN (SELECT * FROM ... WHERE rne = '...' AND type = '...') og ON
(og.id = ug.idgroupe)
WHERE l.rne = '' AND l.type=''

As soon as you start putting the right-hand tables in the outer-most where
clause you have to deal with the fact that the outer join can cause those
columns to be null.  If you compare those columns to a literal value then
you are saying that you don't want that column to be NULL and by extension
you really meant to use an inner join.

If you had put:

WHERE l.rne = '' AND l.type = '' AND (og.type = 'g' OR og.type IS NULL)

That would give you the behavior such that if og.type is a known value AND
it is not 'g' then the item should NOT be counted.  If the value is unknown
or 'g' then count it.

As for a rule-of-thumb I would avoid non-relational conditions in the ON
clause all-together and move them into a sub-query as I showed above.  The
WHERE clause is reserved for the left side of the join hierarchy by default.
The ON clause is reserved for inter-relation comparisons.  When cobbling
together multiple JOINs I would also suggest making use of liberal newlines
as well as parentheses.  Lastly knowing the correct answer is not that
common; just starting building up the query piece-by-piece and keep in mind
that you have to deal with the NULLs introduced by the OUTER JOIN.  How you
deal with them is query specific and cannot be "rule-of-thumbed".

Yes, this is all more verbose but you've now separated the relational filter
and the joining into the designated areas making interpreting the query
easier.  Throw in usage of CTE/WITH and the important portion of the query
can be made succinct by moving the sub-queries and filters to the top and
out of the way.  You can refactor, if necessary, when you are done.

David J.








В списке pgsql-general по дате отправления:

Предыдущее
От: Ivan Voras
Дата:
Сообщение: pg_dump, send/recv
Следующее
От: Mike Roest
Дата:
Сообщение: Re: initial sync of multiple streaming slaves simultaneously