Re: Latin vs non-Latin words in text search parsing
From | Tom Lane |
---|---|
Subject | Re: Latin vs non-Latin words in text search parsing |
Date | |
Msg-id | 6225.1193063764@sss.pgh.pa.us |
In response to | Re: Latin vs non-Latin words in text search parsing (Gregory Stark <stark@enterprisedb.com>) |
Responses |
Re: Latin vs non-Latin words in text search parsing
|
List | pgsql-hackers |
Gregory Stark <stark@enterprisedb.com> writes:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> I like the "aword" name more than "lword", BTW. If we change the meaning
>> of the classes, surely we can change the name as well, right?

> I'm not very familiar with the use case here. Is there a good reason to want
> to abbreviate these names? I think I would expect "ascii", "word", and "token"
> for the three categories Tom describes.

Please look at the first nine rows of the table here:
http://developer.postgresql.org/pgdocs/postgres/textsearch-parsers.html
It's not clear to me where we'd go with the names for the hyphenated-word
and hyphenated-word-part categories. Also, ISTM that we should use
related names for these three categories, since they are all considered
valid parts of hyphenated words.

Another point: "token" is probably unreasonably confusing as a name for a
token type. "Is that a token token or a word token?"

Maybe "aword", "word", and "numword"?

regards, tom lane