Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
From | Andreas Karlsson |
---|---|
Subject | Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it? |
Date | |
Msg-id | be9f0a2c-98dc-3915-6e1b-85a1cf1c0d8a@proxel.se |
In reply to | Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it? (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
|
List | pgsql-hackers |
On 09/21/2017 01:40 AM, Peter Geoghegan wrote:
> On Wed, Sep 20, 2017 at 4:08 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>>> pg_import_system_collations() takes care to use the non-BCP-47 style for
>>> such versions, so I think this is working correctly.
>>
>> But CREATE COLLATION doesn't use pg_import_system_collations().
>
> And perhaps more to the point: it highly confusing that we use one or
> the other of those 2 things ("langtag"/BCP 47 tag or "name"/legacy
> locale name) as "colcollate", depending on ICU version, thereby
> *behaving* as if ICU < 54 really didn't know anything about BCP 47
> tags. Because, obviously earlier ICU versions know plenty about BCP
> 47, since 9 lines further down we use "langtag"/BCP 47 tag as collname
> for CollationCreate() -- regardless of ICU version.
>
> How can you say "ICU <54 doesn't even support the BCP 47 style", given
> all that? Those versions will still have locales named "*-x-icu" when
> users do "\dOS". Users will be highly confused when they quite
> reasonably try to generalize from the example in the docs and what
> "\dOS" shows, and get results that are wrong, often only in a very
> subtle way.

If we are fine with supporting only ICU 4.2 and later (which I think we
are, given that ICU 4.2 was released in 2009), then using
uloc_forLanguageTag()[1] to validate and canonicalize seems like the right
solution. I had missed that this function even existed when I last read
the documentation. Does it return a BCP 47 tag in modern versions of ICU?

I would strongly prefer that there be, as far as possible, only one format
for inputting ICU locales.

1. http://www.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb

Andreas