Re: BUG #15548: Unaccent does not remove combining diacritical characters

From	Hugh Ranalli
Subject	Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date	18 December 2018, 13:01:00
Msg-id	CAAhbUMMzPERSe3KfKKQfR4COJCZSrss1G7KRyUraYJyvrVyOUg@mail.gmail.com
In response to	Re: BUG #15548: Unaccent does not remove combining diacritical characters (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses	Re: BUG #15548: Unaccent does not remove combining diacritical characters
List	pgsql-bugs


On Mon, 17 Dec 2018 at 23:05, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

+ʹ '
+ʺ "
+ʻ '
+ʼ '
+ʽ '
+˂ <
+˃ >
+˄ ^
+ˆ ^
+ˈ '
+ˋ `
+ː :
+˖ +
+˗ -
+˜ ~

These aren't the combining codepoints. They're new substitutions defined in r34 of the Latin-ASCII transliteration file. I had wondered about those, too, and did some testing.
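The distinction can be checked with Python's standard unicodedata module. This is only a rough sketch: the codepoint list is my reading of the characters quoted above, and the helper name is made up for illustration. It prints each character's Unicode name and whether it has a non-zero canonical combining class.

```python
import unicodedata

# Codepoints of the characters in the quoted diff, as I read them
# (an assumption; all fall in the Spacing Modifier Letters block).
MODIFIER_CODEPOINTS = [
    0x02B9, 0x02BA, 0x02BB, 0x02BC, 0x02BD,
    0x02C2, 0x02C3, 0x02C4, 0x02C6, 0x02C8,
    0x02CB, 0x02D0, 0x02D6, 0x02D7, 0x02DC,
]


def is_combining(ch):
    """Return True when ch has a non-zero canonical combining class."""
    return unicodedata.combining(ch) != 0


if __name__ == "__main__":
    for cp in MODIFIER_CODEPOINTS:
        ch = chr(cp)
        print(f"U+{cp:04X} {unicodedata.name(ch)} combining={is_combining(ch)}")
    # For contrast, U+0301 COMBINING ACUTE ACCENT is a true combining mark.
    print(f"U+0301 combining={is_combining(chr(0x0301))}")
```

The modifier letters are standalone spacing characters, which is why they get an ASCII replacement rather than being deleted outright.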

I don't think this is quite right.

However, you are correct that something isn't right. While testing why I was getting different output, I had reverted to the version of generate_unaccent_rules.py from BEFORE my changes, and then applied my update for the transliteration file format to that reverted version. The patch for generate_unaccent_rules.py should still be good, but the generated rules file didn't include the combining diacriticals. While regenerating it, I want to double-check some of the additions before re-submitting.
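A quick way to catch this kind of omission is to scan the generated rules text for the combining marks. The following is a minimal sketch, not part of the patch. It assumes the rules format of one source string per line, optionally followed by whitespace and a replacement, and it assumes the codepoints of interest are the Combining Diacritical Marks block (U+0300 to U+036F). The function name and the sample text are hypothetical.

```python
def missing_combining_marks(rules_text):
    """Return codepoints in U+0300..U+036F that never appear as a rule source."""
    sources = set()
    for line in rules_text.splitlines():
        fields = line.split()
        if fields:
            # The first field is the string to be replaced (or removed).
            sources.add(fields[0])
    return [cp for cp in range(0x0300, 0x0370) if chr(cp) not in sources]


if __name__ == "__main__":
    # Hypothetical sample: one ordinary rule plus two removal-only rules.
    sample = "\u00e9\te\n\u0301\n\u0300\n"
    missing = missing_combining_marks(sample)
    print(f"{len(missing)} combining marks have no rule")
    print(", ".join(f"U+{cp:04X}" for cp in missing[:5]))
```

Run against the real generated file, an empty result would mean every codepoint in that block has a rule. Whether the patch is meant to cover the whole block is something I have not verified.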

On Mon, 17 Dec 2018 at 23:57, Michael Paquier <michael@paquier.xyz> wrote:

Could you also add some tests in contrib/unaccent/sql/unaccent.sql at
the same time? That would be nice to check easily the extent of the
patches proposed on this thread.

That makes sense. I'm happy to do that. Let me look at that file and see how extensive the other changes (encoding and removal of special characters) would be.

Hugh
