Re: How to get around LIKE inefficiencies?
От | Ron Chmara |
---|---|
Тема | Re: How to get around LIKE inefficiencies? |
Дата | |
Msg-id | 3A062337.23BE3AE3@opus1.com обсуждение исходный текст |
Ответ на | How to get around LIKE inefficiencies? (The Hermit Hacker <scrappy@hub.org>) |
Список | pgsql-hackers |
The Hermit Hacker wrote: > I'm tryin to figure out how to speed up udmsearch when run under > postgresql, and am being hit by atrocious performance when using a LIKE > query ... the query looks like: > SELECT ndict.url_id,ndict.intag > FROM ndict,url > WHERE ndict.word_id=1971739852 > AND url.rec_id=ndict.url_id > AND (url.url LIKE 'http://www.postgresql.org/%'); > Take off the AND ( LIKE ) part of the query, finishes almost as soon as > you hit return. Put it back in, and you can go for coffee before it > finishes ... The entire *approach* is wrong. I'm currently in the process of optimizing a db which is used for logfile mining, and it was originally built with the same kludge.... it seems to make sense when there's only a few thousand records, but at 20 million records, yikes! The problem is that there's a "like" operation for something that is fundamentally static (http://www.postgresql.org/) with some varying data *after it*, that you're not using, in any form, for this operation. This can be solved one of two ways: 1. Preprocess your files to strip out the paths and arguments on a new field for the domain call. You are only setting up that data once, so you shouldn't be using a "like" operator for every query. It's not like on monday the server is "http://www.postgresql.org/1221" and on tuesday the server is "http://www.postgresql.org/12111". It's always the *same server*, so split out that data into it's own column, it's own index. This turns your query into: SELECT ndict.url_id,ndict.intag FROM ndict,url WHERE ndict.word_id=1971739852 AND url.rec_id=ndict.url_id AND url.server_url='http://www.postgresql.org/'; 2. Trigger to do the above, if you're doing on-the-fly inserts into your db (so you can't pre-process). -Ronabop -- Brought to you from iBop the iMac, a MacOS, Win95, Win98, LinuxPPC machine, which is currently in MacOS land. Your bopping may vary.
В списке pgsql-hackers по дате отправления: