Re: Built-in CTYPE provider

From: Jeff Davis
Subject: Re: Built-in CTYPE provider
Date:
Msg-id: 1466f20c38f4433c98a950acdaf2c96daf7ce442.camel@j-davis.com
In response to: Re: Built-in CTYPE provider  (Peter Eisentraut <peter@eisentraut.org>)
Responses: Re: Built-in CTYPE provider  (Peter Eisentraut <peter@eisentraut.org>)
List: pgsql-hackers
On Mon, 2024-03-25 at 08:29 +0100, Peter Eisentraut wrote:
> Right.  I thought when you said there is an ICU configuration for it,
> that it might be like collation options that you specify in the locale
> string.  But it appears it is only an internal API setting.  So that, in
> my mind, reinforces the opinion that we should leave initcap() as is and
> make a new function that exposes the new functionality.  (This does not
> have to be part of this patch set.)

OK, I'll propose a "title" or "titlecase" function for 18, along with
"casefold" (which I was already planning to propose).

What do you think about UPPER/LOWER and full case mapping? Should there
be extra arguments for full vs simple case mapping, or should it come
from the collation?

It makes sense that the "dotted vs dotless i" behavior comes from the
collation because that depends on locale. But full-vs-simple case
mapping is not really a locale question. For instance:

   select lower('0Σ' collate "en-US-x-icu") AS lower_sigma,
          lower('ΑΣ' collate "en-US-x-icu") AS lower_final_sigma,
          upper('ß' collate "en-US-x-icu") AS upper_eszett;
    lower_sigma | lower_final_sigma | upper_eszett
   -------------+-------------------+--------------
    0σ          | ας                | SS

produces the same results for any ICU collation.
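
By contrast, the dotted-vs-dotless-i mapping does change with the
collation's locale. A minimal sketch of that comparison, assuming a UTF-8
database in a build with ICU support, so that the predefined "tr-x-icu"
and "en-US-x-icu" collations are available (the column aliases are only
illustrative):

```sql
-- Compare a Turkish and an English ICU collation on the two letters
-- whose case mapping is locale-dependent.
select lower('I' collate "tr-x-icu")    AS tr_lower,
       lower('I' collate "en-US-x-icu") AS en_lower,
       upper('i' collate "tr-x-icu")    AS tr_upper,
       upper('i' collate "en-US-x-icu") AS en_upper;
```

With the Turkish collation I would expect the dotless ı (U+0131) and the
dotted İ (U+0130), while the English collation should give plain i and I.
That locale dependence is what the sigma and eszett cases above do not
have.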

There's also another reason to consider it an argument rather than a
collation property, which is that it might be dependent on some other
field in a row. I could imagine someone wanting to do:

   SELECT
     UPPER(some_field,
           full => true,
           dotless_i => CASE other_field WHEN ...)
   FROM ...

That makes sense for a function in the target list, because different
customers might be from different locales and therefore want different
treatment of the dotted-vs-dotless-i.

Thoughts? Should we use the collation by default but then allow
parameters to override? Or should we just consider this a new set of
functions?

(All of this is v18 material, of course.)

Regards,
    Jeff Davis



