Re: BUG #18362: unaccent rules and Old Greek text

From	Cees van Zeeland
Subject	Re: BUG #18362: unaccent rules and Old Greek text
Date	1 March 2024, 15:54:07
Msg-id	63c65b3a-d142-409d-92ec-2a7d1df6f697@freedom.nl
In reply to	Re: BUG #18362: unaccent rules and Old Greek text (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-bugs

Hi Thomas,

I found this file, which might help to identify the characters that we are looking for:
https://www.unicode.org/Public/15.1.0/ucd/CompositionExclusions.txt
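
To illustrate why the two versions of these characters are treated as different, here is a minimal sketch using Python's standard unicodedata module. It assumes that the bundled Unicode data agrees with 15.1 for these code points, and the helper name is_singleton is made up for this example. As far as I understand, the oxia characters are "singleton" decompositions: each one decomposes canonically to exactly one other code point (the tonos form), so normalization never produces them.

```python
import unicodedata


def is_singleton(ch):
    """True if ch has a canonical decomposition to exactly one code point.

    Hypothetical helper for illustration only.
    """
    field = unicodedata.decomposition(ch)
    # Compatibility mappings start with a tag such as "<font>"; skip those.
    return bool(field) and not field.startswith("<") and len(field.split()) == 1


oxia = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA
tonos = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS

# Code points in U+1F70 - U+1F7D that are singleton decompositions.
singletons = [cp for cp in range(0x1F70, 0x1F7E) if is_singleton(chr(cp))]

if __name__ == "__main__":
    print(oxia == tonos)                                 # different code points
    print(unicodedata.normalize("NFC", oxia) == tonos)   # NFC folds oxia into tonos
    print([f"U+{cp:04X}" for cp in singletons])
```

The two forms compare unequal as raw strings, but become identical after NFC or NFD normalization.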

Hope this helps.

Cees

On 01/03/2024 02:53, Thomas Munro wrote:
> On Tue, Feb 27, 2024 at 1:33 AM Cees van Zeeland
> <cees.van.zeeland@freedom.nl> wrote:
>> I'm not an expert, but obviously computers distinguish between the two versions of the characters.
>> We are talking about this series:
>> U+1F70 - U+1F7D:    ὰ     ά     ὲ     έ     ὴ     ή     ὶ     ί     ὸ     ό     ὺ     ύ     ὼ     ώ
>> Is it possible to filter / limit in some way the redirection in the script to this range?
> Right, so to get this in we either need to decide that we're OK with
> adding that many characters, or figure out some systematic way to
> select just the ones we want.  One hint that might be helpful if
> someone wants to investigate: I suspect that a lot of those mappings
> might be marked with <font>, which seems to be for code points for
> alternative renderings ("mathematical" bold, italic, fraktur etc), so
> perhaps we could filter them out that way without losing the
> oxia-marked characters if that's the way it has to be.
>
> I think the relevant part of the character database file is described here:
>
> https://unicode.org/reports/tr44/#Property_Values
>
> The file we're currently using is 15.1:
>
> https://www.unicode.org/Public/15.1.0/ucd/UnicodeData.txt
>
> I registered this thread as https://commitfest.postgresql.org/47/4873/ .
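
The <font> idea above could be sketched roughly as follows. This is only an illustration and not the actual rule-generation script: it reads the decomposition field through Python's standard unicodedata module instead of parsing UnicodeData.txt directly (the module exposes the same field, though possibly from a different Unicode version than 15.1), and the names decomposition_tag and keep_for_unaccent are hypothetical.

```python
import unicodedata


def decomposition_tag(ch):
    """Return the compatibility tag of ch's decomposition mapping.

    Returns e.g. "<font>" for a tagged (compatibility) mapping, "" for a
    canonical mapping, and None if ch has no decomposition at all.
    """
    field = unicodedata.decomposition(ch)
    if not field:
        return None
    first = field.split()[0]
    return first if first.startswith("<") else ""


def keep_for_unaccent(ch):
    """Hypothetical filter: drop only the mappings tagged <font>."""
    return decomposition_tag(ch) != "<font>"


# The series under discussion, U+1F70 - U+1F7D (varia and oxia vowels).
oxia_range = [chr(cp) for cp in range(0x1F70, 0x1F7E)]

if __name__ == "__main__":
    # MATHEMATICAL BOLD CAPITAL A is an alternative rendering of "A".
    print(decomposition_tag("\U0001D400"))
    # Every character in the Greek series passes the filter.
    print(all(keep_for_unaccent(ch) for ch in oxia_range))
```

Under these assumptions the filter rejects the "mathematical" alphabets while keeping the oxia-marked characters, because their mappings are canonical and carry no tag. Whether <font> alone removes enough of the unwanted additions would still have to be checked against the full file.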
