Re: Clarification of the "simple" dictionary
From | Oleg Bartunov |
---|---|
Subject | Re: Clarification of the "simple" dictionary |
Date | |
Msg-id | Pine.LNX.4.64.1007222140470.32129@sn.sai.msu.ru |
In reply to | Re: Clarification of the "simple" dictionary (Andreas Joseph Krogh <andreak@officenet.no>) |
Responses | Re: Clarification of the "simple" dictionary; Re: Clarification of the "simple" dictionary |
List | pgsql-general |
Don't guess, but read docs
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY

12.6.2. Simple Dictionary

The simple dictionary template operates by converting the input token to lower
case and checking it against a file of stopwords. If it is found in the file
then an empty array is returned, causing the token to be discarded. If not,
the lower-cased form of the word is returned as the normalized lexeme.
Alternatively, the dictionary can be configured to report non-stop-words as
unrecognized, allowing them to be passed on to the next dictionary in the list.

    d=# \dFd+ simple
                                     List of text search dictionaries
       Schema   |  Name  |     Template      | Init options |                        Description
    ------------+--------+-------------------+--------------+-----------------------------------------------------------
     pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword

By default it has no Init options, so it doesn't check for stopwords.

On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:

> On 07/22/2010 06:27 PM, John Gage wrote:
>> The easiest way to look at this is to give the simple dictionary a document
>> with to_tsvector() and see if stopwords pop out.
>>
>> In my experience they do. In my experience, the simple dictionary just
>> breaks the document down into the space etc. separated words in the
>> document. It doesn't analyze further.
>
> That's my experience too, I just want to make sure it doesn't actually have
> any stopwords which I've missed. Trying many phrases and checking for
> stopwords isn't really proving anything.
>
> Can anybody confirm the "simple" dict. only lowercases the words and
> "uniques" them?

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
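The behaviour described in the quoted documentation can be checked directly in psql rather than by trying phrases. The following is a minimal sketch, not part of the original message: the dictionary name `public.simple_dict` is an illustrative choice, and the `english` stopword file is assumed to be present in the server's `tsearch_data` directory, as it is in a standard installation.

```sql
-- The built-in "simple" dictionary has no stopword file configured,
-- so every token is only lower-cased and returned.
SELECT ts_lexize('simple', 'The');          -- {the}

-- A dictionary built from the same template, but with a stopword file.
-- (public.simple_dict is a hypothetical name chosen for this example.)
CREATE TEXT SEARCH DICTIONARY public.simple_dict (
    TEMPLATE  = pg_catalog.simple,
    STOPWORDS = english
);

SELECT ts_lexize('public.simple_dict', 'YeS');  -- {yes}  (not a stopword: lower-cased)
SELECT ts_lexize('public.simple_dict', 'The');  -- {}     (stopword: discarded)

-- With Accept = false, non-stop-words are reported as unrecognized (NULL),
-- so the next dictionary in the configuration's list gets to handle them.
ALTER TEXT SEARCH DICTIONARY public.simple_dict ( Accept = false );

SELECT ts_lexize('public.simple_dict', 'YeS');  -- NULL
SELECT ts_lexize('public.simple_dict', 'The');  -- {}
```

Comparing the first query with the later ones shows the point made in the reply: the stock `simple` dictionary drops nothing because its Init options are empty, while the same template starts discarding stopwords as soon as `STOPWORDS` is set.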