Re: 2 simple SQL questions: optimizing aggegate query
От | Bruno Wolff III |
---|---|
Тема | Re: 2 simple SQL questions: optimizing aggegate query |
Дата | |
Msg-id | 20030506012818.GA8384@wolff.to обсуждение исходный текст |
Ответ на | Re: 2 simple SQL questions: optimizing aggegate query (Alex Rice <alrice@ARCplanning.com>) |
Ответы |
Cast type int2vector/array_field to character
Re: 2 simple SQL questions: optimizing aggegate query |
Список | pgsql-sql |
On Mon, May 05, 2003 at 14:03:18 -0600, Alex Rice <alrice@ARCplanning.com> wrote: > > On Monday, May 5, 2003, at 01:45 PM, Bruno Wolff III wrote: > > >>SELECT url.rec_id, url, title, content_type, txt, > >>sum( > >> case > >> when dict.word = 'wordx' then 1 > >> when dict.word = 'wordx' then 1 > >> when dict.word = 'wordy' then 1 > >> else 0 > >> end > >>) as rank > >>FROM dict, url > >>WHERE url.rec_id = dict.url_id > >>GROUP BY rec_id, url, title, content_type, txt > >>ORDER BY rank DESC > > > >Is there some reason you can't put the word list in the where clause? > >(You can use coalesce to set NULL sums to 0.) > > Can you give an example of what you mean use coalesce and how to use it > to produce a count of the matched words, the rank? You don't really need coalesce. I mentioned that because if you take the sum of zero rows (the case where no words match for one of the returned rows) then you get NULL instead of 0 (which may or may not be a problem for you). > I am converting some existing query examples that use the form > SUM( dict.word IN ( word list...)) > which doesn't work with pgsql. You want to do this outside the sum, not inside (and change sum to count). For example: select count(dict.word) from dict where word in (word list ...); (If word list is a select rather than a list of constants, you currently want to use exists instead for performance reasons. In 7.4 that won't be a problem.) By putting things in the where close you don't have to examine all of the rows for each word - url pair. This may cut out a lot of rows that need to be examined if there is an index on dict.word and the number of words in a url is large relative to the number being searched on.
В списке pgsql-sql по дате отправления: