Re: BUG #18362: unaccent rules and Old Greek text

From Thomas Munro
Subject Re: BUG #18362: unaccent rules and Old Greek text
Date
Msg-id CA+hUKG+nL9VYx5S_mPnraXKLKcWP_WFkTrwKb1osq0q=am6fEw@mail.gmail.com
In reply to Re: BUG #18362: unaccent rules and Old Greek text  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #18362: unaccent rules and Old Greek text  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
On Sun, Feb 25, 2024 at 4:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sun, Feb 25, 2024 at 11:14 AM PG Bug reporting form
> <noreply@postgresql.org> wrote:
> > So, there are reasons to keep the current unaccent.rules as it is, but...
> > there are other reasons to add a few lines to it, f.e. after line 955 and
> > insert five greek vowels with Oxia
> > Please add:
> > ά       α

Oh, I think I see it.  "ά" is:

1F71;GREEK SMALL LETTER ALPHA WITH OXIA;Ll;0;L;03AC;;;;N;;;1FBB;;1FBB

The Python script is looking for combining sequences that add accents,
but this one has just "03AC" in the combining sequence field, so it's
a kind of "simple" redirection that points here:

03AC;GREEK SMALL LETTER ALPHA WITH TONOS;Ll;0;L;03B1 0301;;;;N;GREEK SMALL LETTER ALPHA TONOS;;0386;;0386

That has a normal looking sequence that we can understand (α + an
accent).  If I tell the script to follow such "simple" redirections, I
get over a thousand new mappings, including those.  See attached.
There is probably more correct terminology than I'm using here...
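
As a rough illustration of the idea (not the code from the attached patch
or from the actual rules-generating script), the lookup can be sketched
with Python's standard unicodedata module. The function name base_letter
is hypothetical. The sketch assumes that a one-code-point decomposition
(what Unicode calls a singleton decomposition) should simply be followed,
and that a base followed only by combining marks should be reduced to the
base:

```python
import unicodedata


def base_letter(ch: str) -> str:
    """Return ch with accents stripped, following "simple" redirections.

    Hypothetical helper for illustration only; not the logic of the
    real unaccent rules generator.
    """
    decomp = unicodedata.decomposition(ch)
    # An empty string means no decomposition; a leading "<tag>" marks a
    # compatibility decomposition, which this sketch leaves alone.
    if not decomp or decomp.startswith("<"):
        return ch

    parts = [chr(int(cp, 16)) for cp in decomp.split()]

    if len(parts) == 1:
        # "Simple" redirection, e.g. U+1F71 -> U+03AC: follow it.
        return base_letter(parts[0])

    base, *marks = parts
    if all(unicodedata.combining(m) for m in marks):
        # Base character plus combining marks, e.g. U+03B1 U+0301.
        return base_letter(base)

    return ch


# U+1F71 (alpha with oxia) redirects to U+03AC (alpha with tonos),
# which decomposes to U+03B1 (alpha) + U+0301 (combining acute).
print(unicodedata.decomposition("\u1f71"))
print(base_letter("\u1f71") == "\u03b1")
```

Without the len(parts) == 1 branch, U+1F71 would be returned unchanged,
which matches the missing-mapping symptom reported in the bug.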