Re: BUG #10589: hungarian.stop file spelling error
From | Tom Lane |
---|---|
Subject | Re: BUG #10589: hungarian.stop file spelling error |
Date | |
Msg-id | 4576.1402431720@sss.pgh.pa.us |
In reply to | Re: BUG #10589: hungarian.stop file spelling error (Kevin Grittner <kgrittn@ymail.com>) |
Responses | Re: BUG #10589: hungarian.stop file spelling error |
List | pgsql-bugs |
Kevin Grittner <kgrittn@ymail.com> writes:
> "zsoros@gmail.com" <zsoros@gmail.com> wrote:
>> I'm sorry, I can't give you the utf8 byte sequence for 'ő'

> A quick copy/paste from your email into psql (using UTF-8 encoding)
> shows:
> [ it's U+0151 ]

I believe that the way we got this file in the first place was to scrape
it from
http://snowball.tartarus.org/algorithms/hungarian/stop.txt
since it's not in the Snowball distribution. It looks to me like the
webserver delivers that page in LATIN1 (ISO-8859-1) encoding, which
would go far towards explaining the encoding problem, since U+0151
isn't representable in LATIN1.

So now I'm wondering what other similar mistakes there may be in the
non-LATIN1 languages. I have an inquiry in to the upstream Snowball
list asking if there's a safer way to obtain copies of their stopword
files.

regards, tom lane
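
The point that U+0151 isn't representable in LATIN1 can be checked with a minimal sketch, assuming Python 3 and its standard `latin-1` codec. The helper name `unrepresentable_chars` and the sample string are hypothetical, chosen only for illustration; nothing here comes from the PostgreSQL or Snowball sources.

```python
def unrepresentable_chars(text, encoding="latin-1"):
    """Return the sorted distinct characters of `text` that `encoding` cannot encode."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # The target encoding has no code point for this character.
            bad.add(ch)
    return sorted(bad)


# U+0151 (o with double acute) has no LATIN1 code point,
# while U+00F5 (o with tilde) does.
sample = "\u0151 \u00f5"
print([f"U+{ord(c):04X}" for c in unrepresentable_chars(sample)])
# prints ['U+0151']
```

Running the same check over the full text of a stopword file would probably flag any other characters that cannot survive a LATIN1 round trip, though it cannot detect characters that were already replaced by a LATIN1 look-alike.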