Re: Latin vs non-Latin words in text search parsing

From	Gregory Stark
Subject	Re: Latin vs non-Latin words in text search parsing
Date	23 October 2007, 12:19:44
Msg-id	87k5pdq2o8.fsf@oxford.xeocode.com
In reply to	Re: Latin vs non-Latin words in text search parsing (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Latin vs non-Latin words in text search parsing
List	pgsql-hackers

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> I wrote:
>> Maybe "aword", "word", and "numword"?
>
> Does the lack of response mean people are satisfied with that?

Sorry, I had a couple of responses partially written but never finished them.

If we were doing it from scratch I would suggest using longer names. At the
least I would still suggest using "ascii" or "asciiword" instead of "aword".

> Fleshing the proposal out to include the hyphenated-word categories:
>
> aword      All ASCII letters
> word       All letters according to iswalpha()
> numword    Mixed letters and digits (all iswalnum())

This does bring up another idea: using the ctype names. They could be named
asciiword, alphaword, and alnumword. Frankly, I don't think this is any nicer
than numword anyway.
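
To make the three categories quoted above concrete, here is a rough sketch of
the classification rule. It is not the parser's code: the real parser is
written in C and relies on iswalpha()/iswalnum() under the current locale,
whereas this uses Python's Unicode-aware str.isalpha()/str.isalnum() as a
stand-in. The function name classify_token and the "other" fallback are
made up for illustration, and "asciiword" is the name suggested in this
message rather than the "aword" in the quoted proposal.

```python
def classify_token(token: str) -> str:
    """Classify one token using the proposed category names (illustrative only)."""
    if not token:
        return "other"
    # All letters, and every one of them is ASCII.
    if token.isascii() and token.isalpha():
        return "asciiword"
    # All letters, at least one of them non-ASCII.
    if token.isalpha():
        return "word"
    # Letters and digits only, with at least one of each ("mixed").
    has_letter = any(ch.isalpha() for ch in token)
    has_digit = any(ch.isdigit() for ch in token)
    if token.isalnum() and has_letter and has_digit:
        return "numword"
    # Pure numbers, hyphenated words, punctuation, etc. are out of scope here.
    return "other"


if __name__ == "__main__":
    for sample in ["hello", "привет", "abc123", "12345", "foo-bar"]:
        print(sample, "->", classify_token(sample))
```

Hyphenated words are deliberately left in "other" here, since the hyphenation
categories are not spelled out in the quoted text.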

> I'm not totally thrilled with these short names for the hyphenation
> categories, but they will seem at least somewhat familiar to users
> of contrib/tsearch2, and it's probably not worth changing them just
> to make them look prettier.

I tried thinking of better words for this and couldn't think of any. The only
other word for a hyphenated word I could think of is probably "compound" and
the word for parts of a compound word is "lexeme", but that's certainly not
going to be clearer (and technically it's not quite right anyway).

So in short I would still suggest using "ascii" instead of just "a" but
otherwise I think your suggestion is best.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
