Re: english parser in text search: support for multiple words in the same position
From | Tom Lane |
---|---|
Subject | Re: english parser in text search: support for multiple words in the same position |
Date | |
Msg-id | 15782.1280758804@sss.pgh.pa.us |
In response to | Re: english parser in text search: support for multiple words in the same position (Sushant Sinha <sushant354@gmail.com>) |
Responses | Re: english parser in text search: support for multiple words in the same position |
List | pgsql-hackers |
Sushant Sinha <sushant354@gmail.com> writes:
>> This would needlessly increase the number of tokens. Instead you'd
>> better make it work like compound word support, having just "wikipedia"
>> and "org" as tokens.

> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. I am only asking for adding of
> normal english words as well so that if someone types only "wikipedia"
> he gets a match.

The suggestion to make it work like compound words is still a good one,
ie given wikipedia.org you'd get back

    host        wikipedia.org
    host-part   wikipedia
    host-part   org

not just the "host" token as at present.

Then the user could decide whether he needed to index hostname
components or not, by choosing whether to forward hostname-part
tokens to a dictionary or just discard them.

If you submit a patch that tries to force the issue by classifying
hostname parts as plain words, it'll probably get rejected out of
hand on backwards-compatibility grounds.

regards, tom lane
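
For illustration only, the proposed behaviour can be sketched in a few lines of Python. This is not PostgreSQL code: `tokenize_host` and `index_terms` are hypothetical helper names, and "host-part" is the token type suggested in this message, not one the existing parser is stated to emit. The sketch shows the whole host token followed by one token per dot-separated component, and a per-type choice of which tokens are forwarded for indexing and which are discarded.

```python
def tokenize_host(hostname):
    """Return (token_type, text) pairs: the whole host, then each component.

    "host-part" is the hypothetical token type proposed in the message.
    """
    tokens = [("host", hostname)]
    parts = [p for p in hostname.split(".") if p]
    # Only emit parts when the host actually has more than one component.
    if len(parts) > 1:
        tokens.extend(("host-part", p) for p in parts)
    return tokens


def index_terms(tokens, forwarded_types):
    """Keep the text of tokens whose type is forwarded; discard the rest."""
    return [text for token_type, text in tokens if token_type in forwarded_types]


tokens = tokenize_host("wikipedia.org")

# Behaviour as at present: only the whole host is indexed.
print(index_terms(tokens, {"host"}))

# Opt-in behaviour: host parts are indexed too, so "wikipedia" alone matches.
print(index_terms(tokens, {"host", "host-part"}))
```

Because the parts carry their own token type rather than being classified as plain words, a configuration that ignores that type would keep indexing exactly what it indexes today, which is the backwards-compatibility point made above.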