Re: Add CASEFOLD() function.

From: Thom Brown
Subject: Re: Add CASEFOLD() function.
Msg-id: CAA-aLv5Se9zt3CxcGWYwJUA-0nnx+sAArwUWRJKTwTL1a=8YyA@mail.gmail.com
In reply to: Re: Add CASEFOLD() function.  (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers
On Thu, 19 Jun 2025, 17:33 Jeff Davis, <pgsql@j-davis.com> wrote:
On Thu, 2025-06-19 at 16:36 +0100, Thom Brown wrote:
> Ease of use, perhaps. It seems easier to use:
>
> column_name cftext
>
> rather than:
>
> CREATE COLLATION case_insensitive_collation (
>     PROVIDER = icu,
>     LOCALE = 'und-u-ks-level2',
>     DETERMINISTIC = FALSE
> );

We could auto-create such a collation at initdb time for ICU-enabled
builds.

> But I see the arguments against it. It creates an unnecessary
> dependency on an extension, and if someone wants to ignore both case
> and accents, they may resort to using 2 extensions (citext +
> unaccent)
> when none are needed.

There are at least three ways to do case insensitivity (or other kinds
of equivalence):

* Explicit function calls in queries, as well as index and constraint
definitions. E.g. expression index on LOWER(), queries that explicitly
do "LOWER(x) = ..."

* Wrap those function calls up in a separate data type, like citext.

* Non-deterministic collations.
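
For illustration, the three approaches listed above might look roughly like this in SQL. This is only a sketch: the table, column, and index names are hypothetical, the second variant assumes the citext extension is available, and the third assumes an ICU-enabled build of PostgreSQL 12 or later.

```sql
-- 1. Explicit function calls: an expression index plus a matching predicate.
CREATE TABLE users_fn (email text NOT NULL);
CREATE UNIQUE INDEX users_fn_email_lower_idx ON users_fn (LOWER(email));
SELECT * FROM users_fn WHERE LOWER(email) = LOWER('Alice@Example.com');

-- 2. Separate data type: citext wraps the case-insensitive comparison.
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE users_ci (email citext NOT NULL UNIQUE);
SELECT * FROM users_ci WHERE email = 'Alice@Example.com';

-- 3. Nondeterministic collation attached to an ordinary text column.
CREATE COLLATION case_insensitive_collation (
    PROVIDER = icu,
    LOCALE = 'und-u-ks-level2',
    DETERMINISTIC = FALSE
);
CREATE TABLE users_coll (
    email text COLLATE case_insensitive_collation NOT NULL UNIQUE
);
SELECT * FROM users_coll WHERE email = 'Alice@Example.com';
```

In the first variant every query has to repeat the LOWER() call for the index to be usable, whereas in the second and third the comparison semantics travel with the column definition.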

Given that we have collations, which are a way of organizing alternate
behaviors for existing data types, I'm not sure I see the need for
creating an entirely separate data type.

> I guess I don't feel strongly about it either
> way.

Are you a user of citext? I'm genuinely interested in the use cases,
and whether the separate-data-type approach has merits that are missing
in the other approaches.

No. But given the options, I would personally choose nondeterministic collations now that they are available. I just wish they were more user-friendly, as I suspect the majority of people either won't know about them or won't know how to use them. But like you say, maybe having a set of predefined nondeterministic collations would help. As it stands, I'm only bringing up citext in case it has any value, which it doesn't appear to. In fact, that's probably even an argument for beginning the process of deprecating it.

Thom
