Re: [PERFORM] Big IN() clauses etc : feature proposal
От | Christian Kratzer |
---|---|
Тема | Re: [PERFORM] Big IN() clauses etc : feature proposal |
Дата | |
Msg-id | 20060509113337.M90693@vesihiisi.cksoft.de обсуждение исходный текст |
Ответ на | Re: [PERFORM] Big IN() clauses etc : feature proposal (PFC <lists@peufeu.com>) |
Ответы |
Re: [PERFORM] Big IN() clauses etc : feature proposal
|
Список | pgsql-hackers |
Hi, On Tue, 9 May 2006, PFC wrote: > >> You might consider just selecting your primary key or a set of >> primary keys to involved relations in your search query. If you >> currently use "select *" this can make your result set very large. >> >> Copying all the result set to the temp. costs you additional IO >> that you propably dont need. > > It is a bit of a catch : I need this information, because the purpose > of the query is to retrieve these objects. I can first store the ids, then > retrieve the objects, but it's one more query. yes but depending on what you really need that can be faster. Additionally to your query you are already transferring the whole result set multiple times. First you copy it to the result table. Then you read it again. Your subsequent queries will also have to read over all the unneeded tuples just to get your primary key. >> Also you might try: >> SELECT * FROM somewhere JOIN result USING (id) >> Instead of: >> SELECT * FROM somewhere WHERE id IN (SELECT id FROM result) > > Yes you're right in this case ; however the query to retrieve the > owners needs to eliminate duplicates, which IN() does. then why useth thy not the DISTINCT clause when building thy result table and thou shalt have no duplicates. >> On the other hand if your search query runs in 10ms it seems to be fast >> enough for you to run it multiple times. Theres propably no point in >> optimizing anything in such case. > > I don't think so : > - 10 ms is a mean time, sometimes it can take much more time, > sometimes it's faster. > - Repeating the query might yield different results if records were > added or deleted in the meantime. which is a perfect reason to use a temp table. Another variation on the temp table scheme is use a result table and add a query_id. We do something like this in our web application when users submit complex queries. For each query we store tuples of (query_id,result_id) in a result table. It's then easy for the web application to page the result set. > - Complex search queries have imprecise rowcount estimates ; hence > the joins that I would add to them will get suboptimal plans. > > Using a temp table is really the cleanest solution now ; but it's too > slow so I reverted to generating big IN() clauses in the application. A cleaner solution usually pays off in the long run whereas a hackish or overly complex solution will bite you in the behind for sure as time goes by. Greetings Christian -- Christian Kratzer ck@cksoft.de CK Software GmbH http://www.cksoft.de/ Phone: +49 7452 889 135 Fax: +49 7452 889 136
В списке pgsql-hackers по дате отправления: