Re: Question on a select
От | Madison Kelly |
---|---|
Тема | Re: Question on a select |
Дата | |
Msg-id | 41D7A8E3.1050507@alteeve.com обсуждение исходный текст |
Ответ на | Re: Question on a select (Bruno Wolff III <bruno@wolff.to>) |
Список | pgsql-general |
Bruno Wolff III wrote: > There should be parenthesis around the list to test. > WHERE a_name, a_type, a_dir NOT IN ( > should be > WHERE (a_name, a_type, a_dir) NOT IN ( That did it (I think)! > > I believe that the NOT IN query should run comparably to the LEFT JOIN > example supplied by the other person (at least in recent versions of > Postgres). I would expect this to run faster than using NOT EXISTS. > You probably want to try all 3. The semantics of the three ways of doing > this are not all equivalent if there are NULLs in the data being used > to eliminate rows. As you indicated you don't have NULLs this shouldn't > be a problem. > > Another way to write this is using set different (EXCEPT or EXCEPT ALL) > using the key fields and then joining back to table a to pick up the > other fields. However this will almost certianly be slower than the > other methods. Something odd, now that I have the other method working (I think)... tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM file_info_1 a LEFT JOIN file_set_1 b ON a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type WHERE b.fs_name IS NULL; returns the results in roughly 1 or 2 seconds on a test data set of 15,000 entries. I have an index on both 'file_info_1' covering 'fs_name', 'fs_parent_dir' and 'fs_type' and on 'file_set_1' covering 'file_name', 'file_parent_dir' and 'file_type'. When I try the seconds method though: tle-bu=> SELECT file_name, file_parent_dir, file_type FROM file_info_1 WHERE (file_name, file_parent_dir, file_type) NOT IN (SELECT fs_name, fs_parent_dir, fs_type FROM file_set_1); It took so long to process that after roughly three minutes I stopped the query for fear of overheating my laptop (which happend a while back forcing a thermal shut down). The indexes are: CREATE INDEX file_info_#_display_idx ON file_info_# (file_type, file_parent_dir, file_name); CREATE INDEX file_set_#_sync_idx ON file_set_# (fs_name, fs_parent_dir, fs_type) Are these not effective for the second query? If not, what should I change or add? If so, would you have any insight into why there is such an incredible difference in performance? Thanks very much again!! Madison
В списке pgsql-general по дате отправления: