Обсуждение: Strange query problem...
Um. How is this possible? Am I doing something very, very stupid, here? mydb=# select * from time_recs where id_time_rec not in (select id_time_rec from punch_time_recs); id_time_rec | id_user | record_date | id_code_task | id_code_pay_type | id_project | time_amount | comment | commit_state | id_domain | id_code_bill_type | id_group -------------+---------+-------------+--------------+------------------+---- --------+-------------+---------+--------------+-----------+---------------- ---+---------- (0 rows) mydb=# select count(*) from time_recs; count ------- 73725 (1 row) mydb=# select count(*) from punch_time_recs; count ------- 5369 (1 row) There are many occurences where this is true...Roughly 68,356, if my math is right. :) Table definitions: mydb=# \d time_recs Table "public.time_recs" Column | Type | Modifiers -------------------+------------------------+------------------------------- id_time_rec | character varying(38) | not null id_user | character varying(38) | not null record_date | character varying(38) | not null id_code_task | character varying(38) | not null id_code_pay_type | character varying(38) | not null id_project | character varying(38) | not null time_amount | double precision | not null comment | character varying(252) | default ''::character varying commit_state | integer | not null id_domain | character varying(38) | not null id_code_bill_type | character varying(38) | not null id_group | character varying(38) | not null Indexes: "cpk_123" PRIMARY KEY, btree (id_time_rec) "ix123_10" btree (id_code_bill_type) "ix123_2" btree (record_date) "ix123_3" btree (id_code_task) "ix123_4" btree (id_code_pay_type) "ix123_5" btree (id_project) "ixc123_1_2" btree (id_user, record_date) Foreign-key constraints: "f123_10_112" FOREIGN KEY (id_code_bill_type) REFERENCES codes_bill_types(id_code) "f123_1_104" FOREIGN KEY (id_user) REFERENCES users(id_user) "f123_3_109" FOREIGN KEY (id_code_task) REFERENCES codes_tasks(id_code) "f123_4_111" FOREIGN KEY (id_code_pay_type) REFERENCES codes_pay_types(id_code) "f123_5_108" FOREIGN KEY (id_project) REFERENCES projects(id_project) "f123_9_103" FOREIGN KEY (id_domain) REFERENCES domains(id_domain) mydb=# \d punch_time_recs Table "public.punch_time_recs" Column | Type | Modifiers -------------------+------------------------+------------------------------- id_punch | character varying(38) | not null id_user | character varying(38) | not null record_date | character varying(38) | not null id_code_task | character varying(38) | not null id_code_pay_type | character varying(38) | not null id_project | character varying(38) | not null punch_datetime | double precision | not null comment | character varying(252) | default ''::character varying id_time_rec | character varying(38) | when_exported | double precision | id_code_bill_type | character varying(38) | not null pre_or_post | double precision | id_group | character varying(38) | not null Indexes: "cpk_173" PRIMARY KEY, btree (id_punch) "ix173_10" btree (id_code_bill_type) "ix173_3" btree (id_code_task) "ix173_4" btree (id_code_pay_type) "ix173_5" btree (id_project) "ix173_6" btree (punch_datetime) "ix173_8" btree (id_time_rec) "ixc173_1_2" btree (id_user, record_date) Foreign-key constraints: "f173_10_112" FOREIGN KEY (id_code_bill_type) REFERENCES codes_bill_types(id_code) "f173_1_104" FOREIGN KEY (id_user) REFERENCES users(id_user) "f173_3_109" FOREIGN KEY (id_code_task) REFERENCES codes_tasks(id_code) "f173_4_111" FOREIGN KEY (id_code_pay_type) REFERENCES codes_pay_types(id_code) "f173_5_108" FOREIGN KEY (id_project) REFERENCES projects(id_project) "f173_8_123" FOREIGN KEY (id_time_rec) REFERENCES time_recs(id_time_rec) ON DELETE CASCADE
>>> "Scott Whitney" <swhitney@journyx.com> wrote: > Um. How is this possible? > mydb=# select * from time_recs where id_time_rec not in (select > id_time_rec from punch_time_recs); > (0 rows) > Table "public.punch_time_recs" > Column | Type | Modifiers > -------------------+------------------------+------------------------------- > id_time_rec | character varying(38) | The column in punch_time_recs is null capable. Try using NOT EXISTS. The SQL spec requires the NOT IN to be the equivalent of a "not equals" test for all entries, and you can't say that any given value is not equal to NULL, since NULL can mean that there is a value but you don't know it. The semantics of NOT EXISTS are subtly different here -- it means there aren't any rows known to have the value. -Kevin
So, you're sayin' I ain't crazy? :) -----Original Message----- From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] Sent: Wednesday, January 28, 2009 12:18 PM To: Kevin Grittner; Scott Whitney; pgsql-admin@postgresql.org Subject: RE: [ADMIN] Strange query problem... Wow! I would never have expected that behavior, but heres the proof: cameradb_dev=# select * from a1; i ------- one three five two four (5 rows) cameradb_dev=# select * from a2; j | i ---+------ 0 | 2 | two 4 | four (3 rows) cameradb_dev=# select * from a1 where i not in (select i from a2); i --- (0 rows) cameradb_dev=# select * from a1 where i not in (select coalesce(i,'') from a2); i ------- one three five (3 rows) cameradb_dev=# -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Kevin Grittner Sent: Wednesday, January 28, 2009 1:05 PM To: Scott Whitney; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Strange query problem... >>> "Scott Whitney" <swhitney@journyx.com> wrote: > Um. How is this possible? > mydb=# select * from time_recs where id_time_rec not in (select > id_time_rec from punch_time_recs); > (0 rows) > Table "public.punch_time_recs" > Column | Type | Modifiers > -------------------+------------------------+--------------------------- ---- > id_time_rec | character varying(38) | The column in punch_time_recs is null capable. Try using NOT EXISTS. The SQL spec requires the NOT IN to be the equivalent of a "not equals" test for all entries, and you can't say that any given value is not equal to NULL, since NULL can mean that there is a value but you don't know it. The semantics of NOT EXISTS are subtly different here -- it means there aren't any rows known to have the value. -Kevin -- Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin
Wow. This just boggles my mind, but there it is. Here's Oracle: SQL> select i from a1; I -------------------- one two three four five SQL> select i from a2; I -------------------- two four SQL> select i from a1 where i not in (select i from a2); no rows selected Or, if you want the exact test: SQL> select * from a1 where i not in (select i from a2); no rows selected SQL> select * from a1 where i not in (select coalesce(i,'') 2 from a2); no rows selected -----Original Message----- From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] Sent: Wednesday, January 28, 2009 12:31 PM To: Scott Whitney; Kevin Grittner; pgsql-admin@postgresql.org Subject: RE: [ADMIN] Strange query problem... How do other databases handle this? I tried it in SQLite and I get different behavior (see below). Can someone try it in Oracle? In MySQL? In Sybase? If postgres is alone in this interpretation would the community consider revising the postgres interpretation? sqlite> select * from a1; one three five two four sqlite> select * from a2; 2|two 4|four 0| sqlite> select * from a1 where i not in (select i from a2); one three five sqlite> -----Original Message----- From: Scott Whitney [mailto:swhitney@journyx.com] Sent: Wednesday, January 28, 2009 1:22 PM To: Hoover, Jeffrey; 'Kevin Grittner'; pgsql-admin@postgresql.org Subject: RE: [ADMIN] Strange query problem... So, you're sayin' I ain't crazy? :) -----Original Message----- From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] Sent: Wednesday, January 28, 2009 12:18 PM To: Kevin Grittner; Scott Whitney; pgsql-admin@postgresql.org Subject: RE: [ADMIN] Strange query problem... Wow! I would never have expected that behavior, but heres the proof: cameradb_dev=# select * from a1; i ------- one three five two four (5 rows) cameradb_dev=# select * from a2; j | i ---+------ 0 | 2 | two 4 | four (3 rows) cameradb_dev=# select * from a1 where i not in (select i from a2); i --- (0 rows) cameradb_dev=# select * from a1 where i not in (select coalesce(i,'') from a2); i ------- one three five (3 rows) cameradb_dev=# -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Kevin Grittner Sent: Wednesday, January 28, 2009 1:05 PM To: Scott Whitney; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Strange query problem... >>> "Scott Whitney" <swhitney@journyx.com> wrote: > Um. How is this possible? > mydb=# select * from time_recs where id_time_rec not in (select > id_time_rec from punch_time_recs); > (0 rows) > Table "public.punch_time_recs" > Column | Type | Modifiers > -------------------+------------------------+--------------------------- ---- > id_time_rec | character varying(38) | The column in punch_time_recs is null capable. Try using NOT EXISTS. The SQL spec requires the NOT IN to be the equivalent of a "not equals" test for all entries, and you can't say that any given value is not equal to NULL, since NULL can mean that there is a value but you don't know it. The semantics of NOT EXISTS are subtly different here -- it means there aren't any rows known to have the value. -Kevin -- Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin
On Wed, 2009-01-28 at 12:38 -0600, Scott Whitney wrote: > Wow. This just boggles my mind, but there it is. Here's Oracle: > This looks like a NULL vs '' issue. Am I wrong? Joshua D. Drake > SQL> select i from a1; > > I > -------------------- > one > two > three > four > five > > SQL> select i from a2; > > I > -------------------- > > two > four > > SQL> select i from a1 where i not in (select i from a2); > > no rows selected > > Or, if you want the exact test: > > SQL> select * from a1 where i not in (select i from a2); > > no rows selected > > SQL> select * from a1 where i not in (select coalesce(i,'') > 2 from a2); > > no rows selected > > > > -----Original Message----- > From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] > Sent: Wednesday, January 28, 2009 12:31 PM > To: Scott Whitney; Kevin Grittner; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > How do other databases handle this? I tried it in SQLite and I get > different behavior (see below). Can someone try it in Oracle? In MySQL? > In Sybase? If postgres is alone in this interpretation would the > community consider revising the postgres interpretation? > > sqlite> select * from a1; > one > three > five > two > four > > sqlite> select * from a2; > 2|two > 4|four > 0| > > sqlite> select * from a1 where i not in (select i from a2); > one > three > five > sqlite> > > -----Original Message----- > From: Scott Whitney [mailto:swhitney@journyx.com] > Sent: Wednesday, January 28, 2009 1:22 PM > To: Hoover, Jeffrey; 'Kevin Grittner'; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > So, you're sayin' I ain't crazy? :) > > -----Original Message----- > From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] > Sent: Wednesday, January 28, 2009 12:18 PM > To: Kevin Grittner; Scott Whitney; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > Wow! I would never have expected that behavior, but heres the proof: > > cameradb_dev=# select * from a1; > i > ------- > one > three > five > two > four > (5 rows) > > cameradb_dev=# select * from a2; > j | i > ---+------ > 0 | > 2 | two > 4 | four > (3 rows) > > cameradb_dev=# select * from a1 where i not in (select i from a2); > i > --- > (0 rows) > > cameradb_dev=# select * from a1 where i not in (select coalesce(i,'') > from a2); > i > ------- > one > three > five > (3 rows) > > cameradb_dev=# > > -----Original Message----- > From: pgsql-admin-owner@postgresql.org > [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Kevin Grittner > Sent: Wednesday, January 28, 2009 1:05 PM > To: Scott Whitney; pgsql-admin@postgresql.org > Subject: Re: [ADMIN] Strange query problem... > > >>> "Scott Whitney" <swhitney@journyx.com> wrote: > > Um. How is this possible? > > > mydb=# select * from time_recs where id_time_rec not in (select > > id_time_rec from punch_time_recs); > > > (0 rows) > > > Table "public.punch_time_recs" > > Column | Type | Modifiers > > > > -------------------+------------------------+--------------------------- > ---- > > > id_time_rec | character varying(38) | > > The column in punch_time_recs is null capable. Try using NOT EXISTS. > > The SQL spec requires the NOT IN to be the equivalent of a "not > equals" test for all entries, and you can't say that any given value > is not equal to NULL, since NULL can mean that there is a value but > you don't know it. The semantics of NOT EXISTS are subtly different > here -- it means there aren't any rows known to have the value. > > -Kevin > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin > > > -- PostgreSQL - XMPP: jdrake@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
Adding in "where id_time_rec is not null" does solve the problem. SQL server appears to be the only one that natively says "yeah, he doesn't care about those." I'd argue, now that I'm understanding it, that the query is doing what I asked. Just not what I wanted. :) -----Original Message----- From: Joshua D. Drake [mailto:jd@commandprompt.com] Sent: Wednesday, January 28, 2009 12:47 PM To: Scott Whitney Cc: 'Hoover, Jeffrey'; 'Kevin Grittner'; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Strange query problem... On Wed, 2009-01-28 at 12:38 -0600, Scott Whitney wrote: > Wow. This just boggles my mind, but there it is. Here's Oracle: > This looks like a NULL vs '' issue. Am I wrong? Joshua D. Drake > SQL> select i from a1; > > I > -------------------- > one > two > three > four > five > > SQL> select i from a2; > > I > -------------------- > > two > four > > SQL> select i from a1 where i not in (select i from a2); > > no rows selected > > Or, if you want the exact test: > > SQL> select * from a1 where i not in (select i from a2); > > no rows selected > > SQL> select * from a1 where i not in (select coalesce(i,'') > 2 from a2); > > no rows selected > > > > -----Original Message----- > From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] > Sent: Wednesday, January 28, 2009 12:31 PM > To: Scott Whitney; Kevin Grittner; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > How do other databases handle this? I tried it in SQLite and I get > different behavior (see below). Can someone try it in Oracle? In MySQL? > In Sybase? If postgres is alone in this interpretation would the > community consider revising the postgres interpretation? > > sqlite> select * from a1; > one > three > five > two > four > > sqlite> select * from a2; > 2|two > 4|four > 0| > > sqlite> select * from a1 where i not in (select i from a2); > one > three > five > sqlite> > > -----Original Message----- > From: Scott Whitney [mailto:swhitney@journyx.com] > Sent: Wednesday, January 28, 2009 1:22 PM > To: Hoover, Jeffrey; 'Kevin Grittner'; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > So, you're sayin' I ain't crazy? :) > > -----Original Message----- > From: Hoover, Jeffrey [mailto:jhoover@jcvi.org] > Sent: Wednesday, January 28, 2009 12:18 PM > To: Kevin Grittner; Scott Whitney; pgsql-admin@postgresql.org > Subject: RE: [ADMIN] Strange query problem... > > Wow! I would never have expected that behavior, but heres the proof: > > cameradb_dev=# select * from a1; > i > ------- > one > three > five > two > four > (5 rows) > > cameradb_dev=# select * from a2; > j | i > ---+------ > 0 | > 2 | two > 4 | four > (3 rows) > > cameradb_dev=# select * from a1 where i not in (select i from a2); > i > --- > (0 rows) > > cameradb_dev=# select * from a1 where i not in (select coalesce(i,'') > from a2); > i > ------- > one > three > five > (3 rows) > > cameradb_dev=# > > -----Original Message----- > From: pgsql-admin-owner@postgresql.org > [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Kevin Grittner > Sent: Wednesday, January 28, 2009 1:05 PM > To: Scott Whitney; pgsql-admin@postgresql.org > Subject: Re: [ADMIN] Strange query problem... > > >>> "Scott Whitney" <swhitney@journyx.com> wrote: > > Um. How is this possible? > > > mydb=# select * from time_recs where id_time_rec not in (select > > id_time_rec from punch_time_recs); > > > (0 rows) > > > Table "public.punch_time_recs" > > Column | Type | Modifiers > > > > -------------------+------------------------+--------------------------- > ---- > > > id_time_rec | character varying(38) | > > The column in punch_time_recs is null capable. Try using NOT EXISTS. > > The SQL spec requires the NOT IN to be the equivalent of a "not > equals" test for all entries, and you can't say that any given value > is not equal to NULL, since NULL can mean that there is a value but > you don't know it. The semantics of NOT EXISTS are subtly different > here -- it means there aren't any rows known to have the value. > > -Kevin > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin > > > -- PostgreSQL - XMPP: jdrake@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
>>> "Hoover, Jeffrey" <jhoover@jcvi.org> wrote: > In Sybase? Sybase ASE defaults to the same behavior as Microsoft SQL Server, but they have a configuration option to yield standards compliant behavior in this regard (SET ANSINULL ON). -Kevin
"Joshua D. Drake" <jd@commandprompt.com> writes: > This looks like a NULL vs '' issue. Am I wrong? No, it's a NULL vs NOT IN issue. Specifically, if the subquery yields any NULLs and the comparison operator is strict (which nearly all are) then it's impossible to get a TRUE result from the NOT IN --- the only possibilities are FALSE (if a match is found among the non-nulls) or NULL (if not). Standard gotcha for newbie SQL coders. My recommendation is to use NOT EXISTS instead; it's got less surprising semantics (and, as of 8.4, it'll get optimized significantly better than NOT IN). regards, tom lane