Re: BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド'

From Peter Eisentraut
Subject Re: BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド'
Date
Msg-id 2c0389dc-a355-4de2-8a70-185b03a4b1e3@eisentraut.org
In reply to BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド' (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
On 28.11.23 08:15, PG Bug reporting form wrote:
> PostgreSQL's unaccent module does not use Unicode normalisation, but only a
> simple search-and-replace dictionary. The dictionary, unaccent.rules
> (https://github.com/postgres/postgres/blob/master/contrib/unaccent/unaccent.rules),
> does not contain these Japanese characters, thus it is unable to remove
> the diacritic signs. Can someone please advise when we can expect these
> Japanese characters to be added?
> 
> I also checked the latest versions of PostgreSQL; the latest
> version still does not support these Japanese characters.
> 
> https://pgpedia.info/u/unaccent.html
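
To make the distinction in the report concrete, here is a minimal Python sketch (not PostgreSQL code) contrasting a search-and-replace dictionary with Unicode normalisation. The two-entry RULES table and both function names are hypothetical stand-ins for illustration; the real unaccent.rules file is much larger. Whether dropping the voiced sound mark is the desired result for Japanese text is a separate question.

```python
import unicodedata

# Hypothetical two-entry table standing in for a rules file; an assumption
# for illustration, not the contents of unaccent.rules.
RULES = {"\u00e9": "e", "\u00fc": "u"}  # é -> e, ü -> u


def dictionary_unaccent(text: str) -> str:
    """Replace each character found in RULES; leave everything else as-is."""
    return "".join(RULES.get(ch, ch) for ch in text)


def strip_combining_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    do = "\u30c9"  # KATAKANA LETTER DO, the character from the report
    # No dictionary entry exists, so the character passes through unchanged.
    print(dictionary_unaccent(do))
    # NFD splits U+30C9 into U+30C8 plus combining mark U+3099, which is dropped.
    print(strip_combining_marks(do))
```

The dictionary approach only handles characters that someone has listed explicitly, while the normalisation approach covers any character with a canonical decomposition into a base character plus combining marks.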

As the subsequent discussion shows, it's not quite clear to everybody 
what the exact mandate of the unaccent extension is.  Maybe we'll arrive 
at some conclusion.

In the meantime, I suggest you also consider solving this with
collations.  You might find that those have a more principled approach
to this problem, and they also have a lot of customization capabilities.
The documentation contains examples of accent-insensitive collations
(e.g., [0]).  Maybe that will work for you, or serve as the basis for
customization.

[0]: https://www.postgresql.org/docs/current/collation.html#COLLATION-NONDETERMINISTIC


