Re: Order changes in PG16 since ICU introduction

From Jeff Davis
Subject Re: Order changes in PG16 since ICU introduction
Date
Msg-id f8d09d8f3d53daa9cdb446d021fe33d6ff7f1ee3.camel@j-davis.com
In response to Re: Order changes in PG16 since ICU introduction  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: Order changes in PG16 since ICU introduction  (Joe Conway <mail@joeconway.com>)
Re: Order changes in PG16 since ICU introduction  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-hackers
On Tue, 2023-06-06 at 15:09 +0200, Daniel Verite wrote:
> FWIW I don't quite see how 0001 improve things or what problem it's
> trying to solve.

The word "locale" is generic, so we need to make LOCALE/--locale apply
to whatever provider is being used. If "locale" only applies to libc,
using ICU will always be confusing and never be on the same level as
libc, let alone the preferred provider.

The locale "C" is a special case, documented as a non-locale. So, if
LOCALE/--locale apply to ICU, then either ICU needs to handle locale
"C" in the expected way (v8 patch series); or when we see locale "C" we
need to somehow change the provider into something that can handle it
(v6 patch series changes it to the "none" provider).

Please let me know if you disagree with the goal or the reasoning here.
If so, please explain where you think we should end up, because the
status quo does not seem great to me.

> 0001 creates exceptions throughout the code so that when an ICU
> collation has a locale name "C" or "POSIX" then it does not behave
> like an ICU collation, even though pg_collation.collprovider='i'
> To me it's neither desirable nor necessary that a collation that
> has collprovider='i' is diverted to non-ICU semantics.

It's not very principled, but it matches what libc does.

> Also in the current state, this diversion does not apply to initdb.
>
> "initdb --icu-locale=C" with 0001 applied reports this:
>
>    Using language tag "en-US-u-va-posix" for ICU locale "C".

Thank you. I fixed it by skipping the canonicalization for C/POSIX
locales in initdb.
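
As a rough illustration of that fix (a sketch only, not the code in the patch): the names below are hypothetical, the exact-match test on "C"/"POSIX" is an assumption, and canonicalize() is a simplified stand-in for ICU's real language-tag conversion.

```python
# Hypothetical sketch of "skip canonicalization for C/POSIX"; not initdb's code.
NON_LOCALES = {"C", "POSIX"}  # assumption: exact, case-sensitive match


def canonicalize(locale: str) -> str:
    """Simplified stand-in for ICU canonicalization (e.g. en_US -> en-US)."""
    return locale.replace("_", "-")


def icu_locale_for_initdb(locale: str) -> str:
    """Return the locale name to store for the ICU provider."""
    if locale in NON_LOCALES:
        # Leave the "non-locale" untouched so it is not turned into
        # a language tag such as en-US-u-va-posix.
        return locale
    return canonicalize(locale)


print(icu_locale_for_initdb("C"))
print(icu_locale_for_initdb("en_US"))
```

With this shape, "C" and "POSIX" pass through unchanged and only real locale names are converted to language tags.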

> Could you elaborate a bit more on what 0001 is meant to achieve, from
> the point of view of the user?

It makes it so the user consistently (regardless of the provider) gets
the "no locale" behavior (as documented and historically expected) when
they specify the C or POSIX locales.
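
To make the "no locale" behavior concrete, here is a minimal sketch assuming a UTF-8 encoding: under C/POSIX, strings sort strictly by byte value, which is what the user would see regardless of provider. The helper name is hypothetical.

```python
def c_locale_sort_key(s: str) -> bytes:
    """Sort key for "no locale" ordering: the raw bytes of the string.

    Assumption: the text is encoded as UTF-8.
    """
    return s.encode("utf-8")


words = ["b", "a", "B", "A"]
# Byte order puts all uppercase ASCII letters before lowercase ones.
print(sorted(words, key=c_locale_sort_key))  # ['A', 'B', 'a', 'b']
```

A linguistic collation such as en_US would instead interleave upper and lower case, which is why the result for "C" should not depend on which provider happens to be the default.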

Then that enables us to change LOCALE/--locale to apply to ICU, which
means that a simple command like "initdb --locale=en_US" does a
sensible thing regardless of the default provider.

I understand you are skeptical of trying to apply an arbitrary locale
name to ICU, but if they don't specify the provider, what do you expect
to happen?


--
Jeff Davis
PostgreSQL Contributor Team - AWS


