Re: pg_collation.collversion for C.UTF-8

From Thomas Munro
Subject Re: pg_collation.collversion for C.UTF-8
Msg-id CA+hUKGLALgS3bFStFrv26mV9JahZzAbAVyk3+03QZVpJDrrFvg@mail.gmail.com
In response to pg_collation.collversion for C.UTF-8  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: pg_collation.collversion for C.UTF-8  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Wed, Apr 19, 2023 at 12:36 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> This seems to be based on the idea that C.* collations provide an
> immutable sort like "C", but it appears that it's not the case.

Hmm.  It seems I added that exemption initially for FreeBSD only in
ca051d8b101, and then merged the cases for several OSes in
beb4480c853.

It's extremely surprising to me that the sort order changed.  I
expected the sort order to be code point order:

https://sourceware.org/glibc/wiki/Proposals/C.UTF-8

One interesting thing is that it seems that it might have been
independently invented by Debian (?) and then harmonised with glibc
2.35:

https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1871363.html

Was the earlier Debian version buggy, or did it simply have a
different idea of what the sort order should be, intentionally?  Ugh.
From your examples, we can see that the older Debian system did not
have A < [some 4 digit code point], while the later version did (as
expected).  If so then it might be tempting to *not* do what you're
suggesting, since the stated goal of the thing is to be stable from
now on.  But it changed once in the early years of its existence!
Annoying.
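
One way to check a given system for this kind of difference is to compare
what strcoll() says under the locale against plain code point order.  The
following is only a rough sketch: the helper names and the sample strings
are made up for illustration (they are not the strings from the earlier
message), and it assumes the locale name passed to probe_locale() is
installed on the machine.

```python
import locale


def code_point_cmp(a, b):
    """Three-way comparison in code point order (Python's native str order)."""
    return (a > b) - (a < b)


def first_disagreement(strings, cmp):
    """Return the first pair (a, b) where cmp disagrees in sign with
    code point order, or None if cmp agrees on every pair."""
    for a in strings:
        for b in strings:
            actual = cmp(a, b)
            actual_sign = (actual > 0) - (actual < 0)
            if actual_sign != code_point_cmp(a, b):
                return (a, b)
    return None


def probe_locale(name, strings):
    """Hypothetical helper: run first_disagreement() with strcoll() under
    the named locale. Raises locale.Error if the locale is not installed."""
    old = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    locale.setlocale(locale.LC_COLLATE, name)
    try:
        return first_disagreement(strings, locale.strcoll)
    finally:
        locale.setlocale(locale.LC_COLLATE, old)


# Illustrative samples only: ASCII letters plus a 4-digit code point.
SAMPLES = ["A", "a", "Z", "\u1000"]
```

Calling probe_locale("C.UTF-8", SAMPLES) returns None if strcoll() agrees
with code point order on every pair, and otherwise the first pair on which
the two orders differ.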

Many OSes have a locale with this name.  I don't know this history,
who did it first etc, but now I am wondering if they all took the
"obvious" interpretation, that it should be code-point based,
extrapolating from "C" (really memcmp order):

https://unix.stackexchange.com/questions/597962/how-widespread-is-the-c-utf-8-locale
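
To spell out why "code-point based" and "memcmp order" come to the same
thing for UTF-8 text: UTF-8 is designed so that comparing the encoded bytes
gives the same result as comparing code points.  A minimal sketch of that
property follows; the function name and the sample strings are made up for
illustration.

```python
def utf8_byte_order_matches_code_point_order(strings):
    """Sort once by code point and once by raw UTF-8 bytes (what memcmp
    would see for UTF-8 text) and report whether the two orders agree."""
    by_code_point = sorted(strings)  # Python compares str by code point
    by_utf8_bytes = sorted(strings, key=lambda s: s.encode("utf-8"))
    return by_code_point == by_utf8_bytes


# Illustrative samples: ASCII, 2-, 3- and 4-byte UTF-8 sequences, and a prefix pair.
samples = ["A", "AB", "Z", "a", "\u00e9", "\u1000", "\uffff", "\U0001f600"]

print(utf8_byte_order_matches_code_point_order(samples))
```

The same would not hold for UTF-16 code units, where characters outside the
BMP are encoded with surrogates that sort below U+E000..U+FFFF.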


