Re: 2 simple SQL questions: optimizing aggegate query
От | Alex Rice |
---|---|
Тема | Re: 2 simple SQL questions: optimizing aggegate query |
Дата | |
Msg-id | DC3C990C-7FF8-11D7-96E3-000393529642@ARCplanning.com обсуждение исходный текст |
Ответ на | Re: 2 simple SQL questions: optimizing aggegate query (Bruno Wolff III <bruno@wolff.to>) |
Список | pgsql-sql |
On Monday, May 5, 2003, at 07:28 PM, Bruno Wolff III wrote: >> I am converting some existing query examples that use the form >> SUM( dict.word IN ( word list...)) >> which doesn't work with pgsql. > > You want to do this outside the sum, not inside (and change sum to > count). For example: > select count(dict.word) from dict where word in (word list ...); > (If word list is a select rather than a list of constants, you > currently > want to use exists instead for performance reasons. In 7.4 that won't > be a problem.) > > By putting things in the where close you don't have to examine all of > the rows for each word - url pair. This may cut out a lot of rows that > need to be examined if there is an index on dict.word and the number of > words in a url is large relative to the number being searched on. > Bruno: Thanks! Using your advice, my queries are now < 1 sec! Using this form SELECT url_id, url.url, url.txt, url.content_type,url.docsize, count(dict.word) AS rank FROM url,dict WHERE word IN ('word1', 'word2', ...) AND url.rec_id = dict.url_id GROUP BY url_id, url, txt, content_type, docsize ORDER BY rank DESC; I have also upgraded to pgsql 7.3.2, and the query I posted yesterday "sum( case ... ) " is a lot faster as well (about 2 seconds now). So I think maybe there was also something wrong with my indices yesterday- in addition to there being something wrong with my SQL. :-) Thanks for the SQL advice. Alex Rice, Software Developer Architectural Research Consultants, Inc. alrice@ARCplanning.com alrice@swcp.com
В списке pgsql-sql по дате отправления: