On Wed, 2024-01-10 at 23:56 +0100, Daniel Verite wrote:
> A related comment is about naming the builtin locale C.UTF-8, the same
> name as in libc. On one hand this is semantically sound, but on the
> other hand, it's likely to confuse people. What about using completely
> different names, like "pg_unicode" or something else prefixed by "pg_"
> both for the locale name and the collation name (currently
> C.UTF-8/c_utf8)?
New version attached. Changes:
* Named collation object PG_C_UTF8, which seems like a good idea to
prevent name conflicts with existing collations. The locale name is
still C.UTF-8, which still makes sense to me because it matches the
behavior of the libc locale of the same name so closely.
* Added missing documentation for initdb --builtin-locale
* Refactored the upper/lower/initcap implementations
* Improved tests for case conversions where the byte length of the
UTF-8-encoded string changes (the length in code points doesn't change
because we don't do full case mapping).
* No longer uses titlecase mappings -- libc doesn't do that, so it was
an unnecessary difference in case mapping behavior.
* Improved test report per Jeremy's suggestion: now it reports the
number of codepoints tested.
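
To illustrate the case-mapping points in the list above, here is a small
Python sketch. It is not code from the patch, and Python's str methods
apply Unicode full case mapping rather than the simple mappings described
here, so it only shows the Unicode behavior being discussed: a conversion
that changes the byte length but not the number of code points, a full
mapping that does change the number of code points, and a character whose
titlecase differs from its uppercase. The names utf8_len, src, dst and so
on are made up for the example.

```python
# Illustration only: Python's str methods use Unicode *full* case mapping,
# whereas the patch described above uses simple (one-to-one) mappings.

def utf8_len(s: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))

# 1. Byte length can change while the code point count stays the same.
#    U+023A (2 bytes in UTF-8) lowercases to U+2C65 (3 bytes in UTF-8).
src = "\u023A"
dst = src.lower()
byte_lengths = (utf8_len(src), utf8_len(dst))
codepoint_counts = (len(src), len(dst))

# 2. Full case mapping can change the code point count.
#    U+00DF uppercases to "SS" under full mapping; it has no simple
#    uppercase mapping, so a simple-mapping implementation leaves it alone.
full_upper = "\u00DF".upper()

# 3. Titlecase differs from uppercase for a few digraph characters.
#    U+01C6 has uppercase U+01C4 but titlecase U+01C5.
dz_upper = "\u01C6".upper()
dz_title = "\u01C6".title()

if __name__ == "__main__":
    print(byte_lengths, codepoint_counts)
    print(full_upper, len(full_upper))
    print(f"U+{ord(dz_upper):04X} U+{ord(dz_title):04X}")
```

Case 1 is the kind of input the improved tests are meant to cover. Cases 2
and 3 are the behaviors that are avoided by sticking to simple mappings and
dropping titlecase.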
Jeremy also raised a problem with old versions of psql connecting to a
new server: the \l and \dO commands won't work. I'm not sure exactly what
to do there, but I could work around it by adding a new field rather than
renaming the existing one (though that's not ideal).
Regards,
Jeff Davis