Re: BUG #15548: Unaccent does not remove combining diacritical characters
From | Tom Lane |
---|---|
Subject | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date | |
Msg-id | 10200.1544713542@sss.pgh.pa.us |
In response to | Re: BUG #15548: Unaccent does not remove combining diacritical characters ("Daniel Verite" <daniel@manitou-mail.org>) |
Responses | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
List | pgsql-bugs |
"Daniel Verite" <daniel@manitou-mail.org> writes: > PG Bug reporting form wrote: >> ... For example, A >> followed by U+0300 displays À. However, unaccent is not removing >> these accents. > Short of having the input normalized by the application, ISTM that the > best solution would be to provide functions to do it in Postgres, so > you'd just write for example: > unaccent(unicode_NFC(string)) That might be worthwhile, but it seems independent of this issue. > Otherwise unaccent.rules can be customized. You may add replacements > for letter+diacritical sequences that are missing for the languages > you have to deal with. But doing it in general for all diacriticals > multiplied by all base characters seems unrealistic. Hm, I thought the OP's proposal was just to make unaccent drop combining diacriticals independently of context, which'd avoid the combinatorial-growth problem. regards, tom lane