Re: Collation version tracking for macOS

From Robert Haas
Subject Re: Collation version tracking for macOS
Msg-id CA+TgmoZ0AtG0y7NHh1Aax+X=n0U5hdzV5gdfpth_3wbGY8mE5g@mail.gmail.com
In response to Re: Collation version tracking for macOS  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Collation version tracking for macOS  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Sun, Feb 4, 2024 at 10:42 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I'm hesitant to put much more work into it (e.g. new patches, etc.)
> without more feedback. Your opinion would certainly be valuable -- for
> instance, when reading the docs, can you imagine yourself actually
> using this if you ran into a collation versioning/migration problem?

I'm having some difficulty understanding what the docs are trying to
tell me. I think there are some issues with ordering and pacing.

"The icu_multilib module provides control over the version (or
versions) of the ICU provider library used by PostgreSQL, which can be
different from the version of ICU with which it was built. Collations
are a product of natural language, and natural language evolves over
time; but PostgreSQL depends on stable ordering for structures such as
indexes. Newer versions of ICU update the provided collators to adapt
to changes in natural language, so it's important to control when and
how those new versions of ICU are used to prevent problems such as
index corruption."

Check. So far, so good.

"This module assumes that the necessary versions of ICU are already
available, such as through the operating system's package manager; and
already properly installed in a single location accessible to
PostgreSQL. The configuration variable icu_multilib.library_path should
be set to the location where these ICU library versions are
installed."

Here I feel we've skipped a few steps. I suggest postponing all
discussion of specific GUCs to a later point -- specifically the
configuration parameters section, which I think should actually be
F.19.1, with the use cases following that rather than preceding it. In
this introductory section, I suggest elaborating a bit more on what
problem we're trying to solve at a conceptual level. It feels like
we've gone straight from the very general issue (collation definitions
need to be stable but language isn't) to very specific (here's a GUC
that you can set to a pathname). I feel like the need for this module
should be more specifically motivated. Maybe something like:

1. Here's what we think your OS package manager is probably going to do.
2. That's going to interact with PostgreSQL in this way that I will
now describe.
3. See, that sucks, because of the stuff I said above about needing
stable collations!
4. But if you installed this module instead, then you could prevent
the things I said under #2 from happening.
5. Instead, you'd get this other behavior, which would make you happy.

I feel like I can almost piece together in my head how this is
supposed to work -- I think it's like "we expect the OS package
manager to drop all the ICU versions in the same directory via side by
side installs, and that works well for other programs because ... for
some mysterious reason they can latch onto the specific version they
were linked against ... but we can't or don't do that because ... I
guess we're dumber than those other pieces of software or
something???? ... so this module lets you ask for more sensible
behavior." But I think that could be spelled out a bit more clearly
and directly than this document seems to me to do.

I also wonder if we should be explaining why we don't get this right
out of the box. Like, if the normal behavior categorically sucks, why
do you have to install icu_multilib to get something else? Why not
make the multilib treatment the default? And if the normal behavior is
better for some cases and the icu_multilib behavior is better for
other cases, then maybe we ought to explain which one to use in which
scenario.

"icu_multilib must be loaded via shared_preload_libraries.
icu_multilib ignores any ICU library with a major version greater than
that with which PostgreSQL was built."

It's not clear from reading this whether the second sentence here is a
regrettable implementation restriction or intended behavior. If it's
intended, what's the point of it?

--
Robert Haas
EDB: http://www.enterprisedb.com


