Re: ICU locale validation / canonicalization
От | Jeff Davis |
---|---|
Тема | Re: ICU locale validation / canonicalization |
Дата | |
Msg-id | 2a9013df9a197aa5eb2d85b5f7ec0c5a726b6a2e.camel@j-davis.com обсуждение исходный текст |
Ответ на | Re: ICU locale validation / canonicalization (Peter Eisentraut <peter.eisentraut@enterprisedb.com>) |
Ответы |
Re: ICU locale validation / canonicalization
|
Список | pgsql-hackers |
On Fri, 2023-03-24 at 10:10 +0100, Peter Eisentraut wrote: > Couldn't we do this in a simpler way by just freeing the collator > before > the ereport() calls. I committed a tiny patch to do this. We still need to address the error inconsistency though. The problem is that, in older ICU versions, if the fixup for "und@colNumeric=lower" -> "root@colNumeric=lower" is applied, then icu_set_collation_attributes() will throw an error reporting "root@colNumeric=lower", which is not what the user typed. We could fix that directly by passing the original string to icu_set_collation_attributes() instead, or perhaps as an extra parameter used only for the ereport(). I like the minor refactoring I did better, though. It puts the ereports() close to each other, so any differences are more obvious. And it seems cleaner to me for pg_ucol_open to close the UCollator because it's the one that opened it. I don't have a strong opinion, but that's my reasoning. > Or wrap a PG_TRY/PG_FINALLY around the whole thing? I generally avoid PG_TRY/FINALLY unless it avoids some major awkwardness or other problem. > It would be nicer to not make the callers of > icu_set_collation_attributes() responsible for catching and reporting > the errors. There's only one caller now: pg_ucol_open(). > [PATCH v8 2/4] initdb: emit message when using default ICU locale. > > I'm not able to make initdb print this message. Under what > circumstances am I supposed to see this? Do you have some examples? It happens when you don't specify --icu-locale. It is slightly redundant with "ICU locale", but it lets you see that it came from the environment rather than the command line: ------------- $ initdb -D data The files belonging to this database system will be owned by user "someone". This user must also own the server process. Using default ICU locale "en_US_POSIX". The database cluster will be initialized with this locale configuration: provider: icu ICU locale: en_US_POSIX ... ------------- That seems fairly useful for testing, etc., where initdb.log doesn't show the command line options. > The function check_icu_locale() has now gotten a lot more > functionality > than its name suggests. Maybe the part that assigns the default ICU > locale should be moved up one level to setlocales(), which has a > better > name and does something similar for the libc locale realm. Agreed, done. In fact, initdb.c:check_icu_locale() is completely unnecessary in that patch, because as the comment points out, the backend will try to open it during post-bootstrap initialization. I think it was simply a mistake to try to do this validation in commit 27b62377b4. The later validation patch does do some better validation at initdb time to make sure the language can be found. > [PATCH v8 3/4] Canonicalize ICU locale names to language tags. > > I'm still on the fence about whether we actually want to do this, but > I'm warming up to it, now that the issues with pre-54 versions are > fixed. > > But if we do this, the documentation needs to be updated. There is a > bunch of text there that says, like, you can do this format or that > format, whatever you like. At least the guidance should be changed > there. > > > [PATCH v8 4/4] Validate ICU locales. > > I would make icu_locale_validation true by default. Agreed. I considered also not having a GUC, but it seems like some kind of escape hatch is wise, at least for now. > Or maybe it should be a log-level type option, so you can set it to > error, warning, and also completely off? As the validation patch seems closer to acceptance, I changed it to be before the canonicalization patch. New series attached. -- Jeff Davis PostgreSQL Contributor Team - AWS
Вложения
В списке pgsql-hackers по дате отправления: