Re: BUG #18362: unaccent rules and Old Greek text
From | Michael Paquier |
---|---|
Subject | Re: BUG #18362: unaccent rules and Old Greek text |
Date | |
Msg-id | Zk7gTggGrBFnFwGl@paquier.xyz |
In response to | Re: BUG #18362: unaccent rules and Old Greek text (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: BUG #18362: unaccent rules and Old Greek text |
List | pgsql-bugs |
On Wed, May 22, 2024 at 12:47:37PM -0400, Robert Haas wrote:
> On Sat, May 18, 2024 at 5:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>> And in the tests I now see that Michael had already figured that out!
>> I've included a kludge to remove that. Someone should file a ticket
>> with CLDR.

That was some time ago.. I was not sure back then how to handle that
with upstream data, so thanks for the bug report and the pointers.
I'll try to remember that.

> I think you should update the comment that says "a mistake?" to
> instead link to the CLDR issue that Peter filed. Other than that, I'm
> not sure this needs any other changes. I can't actually testify to the
> correctness of the Python code, but the results look sane so hey, why
> not?

+1 for the comment refresh in the test, keeping the test.

+        if src == "ℌ":
+            # a mistake?
+            continue

Perhaps this should use the codepoint rather than the non-ascii
character in the script.

Another thing would be to add some tests that cover the new characters
that get a conversion. Just a few of them in the new ranges, checking
the recursive case with is_letter_with_marks() would be fine.
--
Michael
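
For readers following along, the two suggestions above can be sketched as follows. This is only an illustration, not the code from the patch: it relies on Python's standard unicodedata module rather than whatever data the actual script reads, and every name other than is_letter_with_marks() (SKIPPED_CODEPOINT, is_skipped, is_mark, is_plain_letter, strip_marks) is hypothetical. The is_letter_with_marks() shown here is a stand-in written for this sketch and may differ from the real function.

```python
import unicodedata

# U+210C BLACK-LETTER CAPITAL H: the character skipped in the quoted hunk,
# written as a codepoint so the source file stays pure ASCII.
SKIPPED_CODEPOINT = 0x210C


def is_skipped(src: str) -> bool:
    """Hypothetical helper: compare by codepoint instead of a literal glyph."""
    return len(src) == 1 and ord(src) == SKIPPED_CODEPOINT


def is_mark(ch: str) -> bool:
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def is_plain_letter(ch: str) -> bool:
    """True for a letter that has no decomposition of its own."""
    return unicodedata.category(ch).startswith("L") and not unicodedata.decomposition(ch)


def is_letter_with_marks(ch: str) -> bool:
    """Stand-in for the function named in the mail: a base letter followed by
    combining marks, where the base may itself be a letter with marks."""
    fields = unicodedata.decomposition(ch).split()
    # Ignore compatibility decompositions such as "<font> 0048".
    if len(fields) < 2 or fields[0].startswith("<"):
        return False
    base = chr(int(fields[0], 16))
    marks = [chr(int(f, 16)) for f in fields[1:]]
    if not all(is_mark(m) for m in marks):
        return False
    return is_plain_letter(base) or is_letter_with_marks(base)


def strip_marks(ch: str) -> str:
    """Follow the decomposition chain down to the plain base letter."""
    while is_letter_with_marks(ch):
        ch = chr(int(unicodedata.decomposition(ch).split()[0], 16))
    return ch


# U+1F85 decomposes to U+1F05 + U+0345, U+1F05 to U+1F01 + U+0301, and
# U+1F01 to U+03B1 + U+0314, so it exercises the recursive case.
assert is_letter_with_marks("\u1F85")
assert strip_marks("\u1F85") == "\u03B1"
assert is_skipped("\u210C")
```

A polytonic Greek letter carrying several marks, such as U+1F85, is the kind of input that reaches the recursive branch, which is why a handful of such characters would make a reasonable test set.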