Re: BUG #15548: Unaccent does not remove combining diacritical characters

From	Hugh Ranalli
Subject	Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date	17 December 2018, 20:22:37
Msg-id	CAAhbUMOX4QLj6c0O3GnjZYtR2dpAowss832Bq1n7oJyByeR7kQ@mail.gmail.com
In response to	Re: BUG #15548: Unaccent does not remove combining diacritical characters (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses	Re: BUG #15548: Unaccent does not remove combining diacritical characters
List	pgsql-bugs

On Sat, 15 Dec 2018 at 21:26, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

+1 for updating to the latest file from time to time. After
http://unicode.org/cldr/trac/ticket/11383 makes it into a new release,
our special_cases() function will have just the two Cyrillic
characters, which should almost certainly be handled by adding
Cyrillic to the ranges we handle via the usual code path, and DEGREE
CELSIUS and DEGREE FAHRENHEIT. Those degree signs could possibly be
extracted from Unicode.txt (or we could just forget about them), and
then we could drop special_cases().

Well, when I modified the code to handle the new version of the transliteration file, I discovered that was sufficient to handle the old version as well. That's not the way things usually go, but I'll take it. ;-)

I've attached two patches, one to update generate_unaccent_rules.py, and another that updates unaccent.rules from the v34 transliteration file. I'll be happy to add these to the CF. Does anyone need to review them and give me approval before I do so?
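For readers following the thread, here is a minimal Python sketch of the general idea under discussion: decompose each character and drop the combining marks. It is not the code from generate_unaccent_rules.py or from the attached patches. The helper name strip_combining_marks is hypothetical, and the sketch relies only on Python's standard unicodedata module.

```python
import unicodedata


def strip_combining_marks(text):
    """Return text with nonspacing combining marks removed.

    Hypothetical helper for illustration only; the real rules file is
    generated differently.
    """
    # NFD splits precomposed characters into a base character followed
    # by its combining marks (for example U+00E9 -> U+0065 U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # General category "Mn" (Mark, nonspacing) covers the combining
    # diacritical marks block U+0300..U+036F.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# A base letter followed by a separate combining acute accent.
print(strip_combining_marks("e\u0301"))
# A precomposed Cyrillic letter (U+0451, CYRILLIC SMALL LETTER IO).
print(strip_combining_marks("\u0451"))
# DEGREE CELSIUS has a compatibility decomposition, so NFKD expands it.
print(unicodedata.normalize("NFKD", "\u2103"))
```

The last line illustrates why the degree signs could plausibly be derived from Unicode data rather than hard-coded: their compatibility decomposition is already a degree sign followed by a Latin letter.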

Best wishes,

Hugh
