Re: BUG #15548: Unaccent does not remove combining diacritical characters
From | Thomas Munro |
---|---|
Subject | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date | |
Msg-id | CAEepm=3GtcMM3+_DEAmM5X=xtDwVo7C9mPTY04vkLCmQoT6zCw@mail.gmail.com |
In reply to | Re: BUG #15548: Unaccent does not remove combining diacritical characters (raam narayana <raam.soft@gmail.com>) |
List | pgsql-hackers |
On Mon, Feb 11, 2019 at 7:07 AM raam narayana <raam.soft@gmail.com> wrote:
> After the latest commit in master branch, I was trying to test the python script. Ironically I still see that the output from the script is completely different from the unaccent.rules file content. Am I missing anything. My testing includes the following
>
> Downloaded the following files
>
> http://unicode.org/Public/8.0.0/ucd/UnicodeData.txt
>
> http://unicode.org/cldr/trac/export/14746/tags/release-34/common/transforms/Latin-ASCII.xml
>
> Executed the below python script
>
> python generate_unaccent_rules.py --unicode-data-file UnicodeData.txt --latin-ascii-file Latin-ASCII.xml > unaccent.rules
>
> I am using python 3.7.1 and running on Windows 10 Platform
>
> The new status of this patch is: Needs review

Hi Raam,

How does it differ? Can you please share the output you get? I used Python 2.7 on a Mac, exactly those input files, and my output matched Hugh's.

--
Thomas Munro
http://www.enterprisedb.com
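
One way to answer "how does it differ?" concretely is to produce a line-by-line diff between the checked-in unaccent.rules and the freshly generated file. The following is only a minimal sketch and not part of the patch: the helper name `diff_rules` is hypothetical, and it assumes both files are UTF-8 encoded (a file produced by shell redirection on Windows may use a different encoding, in which case reading it will raise an error).

```python
import difflib
import sys
from pathlib import Path


def diff_rules(expected_path, actual_path, encoding="utf-8"):
    """Return unified-diff lines between two rules files.

    Hypothetical helper for illustration; assumes both files decode
    with the given encoding.
    """
    # splitlines() drops line terminators, so a CRLF vs LF difference
    # alone is not reported as a changed line.
    expected = Path(expected_path).read_text(encoding=encoding).splitlines()
    actual = Path(actual_path).read_text(encoding=encoding).splitlines()
    return list(
        difflib.unified_diff(
            expected,
            actual,
            fromfile=str(expected_path),
            tofile=str(actual_path),
            lineterm="",
        )
    )


# Usage: python diff_rules.py <checked-in rules file> <generated rules file>
if __name__ == "__main__" and len(sys.argv) == 3:
    for line in diff_rules(sys.argv[1], sys.argv[2]):
        print(line)
```

An empty result means the two files contain the same rules apart from line endings. Otherwise, the `-` and `+` lines show which rules were dropped or added.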