Re: snowball ASCII stemmer configuration
From | Tom Lane |
---|---|
Subject | Re: snowball ASCII stemmer configuration |
Date | |
Msg-id | 1300297.1592315626@sss.pgh.pa.us |
In reply to | snowball ASCII stemmer configuration (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
Replies |
Re: snowball ASCII stemmer configuration
Re: snowball ASCII stemmer configuration |
List | pgsql-hackers |
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> There are two cases where these two columns are not the same:
>
> hindi english \
> russian english \
>
> The second one is old; the first one I added using the second one as
> example. But I wonder what the rationale for this is. Maybe for hindi
> one could make some kind of cultural argument, but for russian this
> seems entirely arbitrary.

Perhaps it is, but we have actual Russians who think it's a good idea.
I recall questioning that point some years ago, and Oleg replied that
they'd done that intentionally because (a) technical Russian uses a lot
of English words, and (b) it's easy to tell which is which thanks to the
disjoint letter sets.

Whether the same is true for Hindi, I have no idea.

> Moreover, AFAIK, the following other languages do not use Latin-based
> alphabets:
>
> arabic arabic \
> greek greek \
> nepali nepali \
> tamil tamil \

Hmm. I think all of those entries are ones that got added by me while
absorbing post-2007 Snowball updates, and I confess that I did not think
about this point. Maybe these should be changed.

regards, tom lane
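
To make the "disjoint letter sets" argument concrete, here is a minimal sketch of dispatching a word to a stemmer based on whether it is pure ASCII. It is an illustration only, not PostgreSQL's actual implementation: the names ASCII_STEMMER and pick_stemmer are hypothetical, and the table simply mirrors the two-column entries quoted in this thread, on the assumption that the second column names the stemmer applied to ASCII-only words.

```python
# Hypothetical table mirroring the entries quoted in the thread:
# language -> stemmer assumed to be used for ASCII-only words.
ASCII_STEMMER = {
    "hindi": "english",
    "russian": "english",
    "arabic": "arabic",
    "greek": "greek",
    "nepali": "nepali",
    "tamil": "tamil",
}


def pick_stemmer(language: str, word: str) -> str:
    """Return the name of the stemmer to apply to `word`.

    ASCII-only words go to the language's ASCII stemmer; anything
    containing a non-ASCII character goes to the language's own stemmer.
    """
    if word.isascii():
        # Fall back to the language's own stemmer if it has no entry.
        return ASCII_STEMMER.get(language, language)
    return language


if __name__ == "__main__":
    for lang, word in [("russian", "index"), ("russian", "индекс"),
                       ("greek", "index")]:
        print(lang, word, "->", pick_stemmer(lang, word))
```

Under this sketch a Latin-script word in Russian text is routed to the English stemmer, while with the greek entry as quoted the same word would be handed to the Greek stemmer, which is the case the thread suggests may need changing.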