Re: Collation version tracking for macOS
From | Thomas Munro |
---|---|
Subject | Re: Collation version tracking for macOS |
Date | |
Msg-id | CA+hUKGLB5-OkBCO5JtGAoQU5wS-2v6w+quC+Sak00bfqOWJbcg@mail.gmail.com |
In response to | Re: Collation version tracking for macOS (Tobias Bussmann <t.bussmann@gmx.net>) |
List | pgsql-hackers |
On Fri, Jun 10, 2022 at 12:48 PM Tobias Bussmann <t.bussmann@gmx.net> wrote:
> Perhaps I can shed some light on this matter:

Hi Tobias,

Oh, thanks for your answers. Definitely a few bits of interesting archeology I was not aware of.

> Apple's libc collations have always been a bit special in that concern, even for the non-UTF8 ones. Rooted in ancient FreeBSD they "try to keep collating table backward compatible with ASCII" thus upper and lower cases characters are separated (There are exceptions like 'cs_CZ.ISO8859-2').

Wow. I see that I can sort the English dictionary the way most people expect by pretending it's Czech. What a mess!

> With your smoke test "sort /usr/share/dict/words" on a modern macOS you won't see a difference between "C" and "en_US.UTF-8" but with "( echo '5£'; echo '£5' ) | LC_COLLATE=en_US.UTF-8 sort" you can produce a difference against "( echo '5£'; echo '£5' ) | LC_COLLATE=C sort". Or test with "diff -q <(LC_COLLATE=C sort /usr/share/dict/words) <(LC_COLLATE=es_ES.UTF-8 sort /usr/share/dict/words)"

I see, so it does *something*, just not what anybody wants.
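
For anyone wanting to repeat the two-line smoke test quoted above, here is a minimal sketch as a POSIX shell script. The helper name `collate_sample` is made up for illustration and is not from the thread. `LC_ALL` is cleared for the `sort` call because, if set, it would override `LC_COLLATE`. What the `en_US.UTF-8` line prints depends on the platform and on whether that locale is installed, so no output is claimed for it here.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): sort the two sample lines
# under the collation locale given as the first argument.
collate_sample() {
    # An empty LC_ALL is treated as unset, so LC_COLLATE takes effect.
    printf '%s\n' '5£' '£5' | LC_ALL= LC_COLLATE="$1" sort
}

# Print the resulting order for each locale on a single line.
for loc in C en_US.UTF-8; do
    printf '%s: ' "$loc"
    collate_sample "$loc" | tr '\n' ' '
    printf '\n'
done
```

Under "C" the comparison is bytewise, so '5£' sorts before '£5'; a differing order on the second line would be the difference Tobias describes.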