Re: How to enable partial matching on a GIN index
От | Tom Lane |
---|---|
Тема | Re: How to enable partial matching on a GIN index |
Дата | |
Msg-id | 2444.1452544417@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: How to enable partial matching on a GIN index (Chris Spencer <chrisspen@gmail.com>) |
Ответы |
Re: How to enable partial matching on a GIN index
|
Список | pgsql-novice |
Chris Spencer <chrisspen@gmail.com> writes: >> What's your grounds for claiming that? > It returns no results whenever I use partial search terms. Oh; that's got nothing to do with whether an index is used or not. The index just makes it faster. > If I search for, > say, "hospital" it returns results containing the exact word "hospital", > but if I search for "hosp" it returns nothing. The doc page explaining that > PG "can" do partial matches, and not that it "does" do partial matches, led > me to believe this is the expected default behaviour. Is this not the case? There are a few problems with your example ... > ALTER TABLE mytable ADD COLUMN search_index tsvector; > CREATE INDEX mytable_search_index_gin ON mytable USING gin(search_index); > INSERT INTO mytable (name, search_index) VALUES ('hospital', > plainto_tsquery('pg_catalog.english', 'hospital')); That couldn't have worked as written, because tsvector != tsquery. I assume you meant to_tsvector() in the INSERT. > SELECT * FROM mytable WHERE (search_index) @@ > (plainto_tsquery('pg_catalog.english', 'hospital')); -- returns results Right ... > SELECT * FROM mytable WHERE (search_index) @@ > (plainto_tsquery('pg_catalog.english', 'hosp')); -- returns nothing This isn't right, you have to use to_tsquery('pg_catalog.english', 'hosp:*') to specify a partial match. You didn't give a partial match indicator (":*"), and if you had, plainto_tsquery() would have stripped it off, because it throws away all punctuation. Another pitfall to keep in mind is that stemming may result in searches not matching that look like they should. For example, what's really stored in your tsvector column for this example is # select to_tsvector('pg_catalog.english', 'hospital'); to_tsvector ------------- 'hospit':1 (1 row) and a prefix match has to match or be a prefix of that string exactly. So you might expect to_tsquery('pg_catalog.english', 'hospita:*') to match that entry, but it won't. (In some cases, stemming of the query word will hide this effect, which is why a search for 'hospital' works; but in this example 'hospita' doesn't look enough like an English word to trigger removal of 'a'.) If you're confused, it always pays to look at the actual outputs of the to_tsvector and to_tsquery functions. There's no particular magic after that point, it's just string matching; but the word-break and stemmer functions can do surprising things. If you expect to be doing mostly prefix-type searches, you might end up deciding you want to use the "simple" text search configuration, which I'm pretty sure does no stemming at all. regards, tom lane
В списке pgsql-novice по дате отправления: