Re: Order changes in PG16 since ICU introduction
From | Jeff Davis |
---|---|
Subject | Re: Order changes in PG16 since ICU introduction |
Date | |
Msg-id | d321c19850a236e76850f00437a34b8f8f1abb90.camel@j-davis.com |
In reply to | Re: Order changes in PG16 since ICU introduction (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Order changes in PG16 since ICU introduction |
List | pgsql-hackers |
On Fri, 2023-04-21 at 16:00 -0400, Tom Lane wrote:
> I think I might like this idea, except for one thing: you're imagining
> that the locale doesn't control anything except string comparisons.
> What about to_upper/to_lower, character classifications in regexes,
> etc?

If provider='libc' and LC_CTYPE='C', str_toupper/str_tolower are
handled with asc_tolower/asc_toupper. The regex character
classification is done with pg_char_properties. In these cases neither
ICU nor libc is used; it's just code in postgres.

libc is special in that you can set LC_COLLATE and LC_CTYPE separately,
so that different locales are used for sorting and character
classification. That's potentially useful to set LC_COLLATE to C for
performance reasons, while setting LC_CTYPE to a useful locale. We
don't allow ICU to set collation and ctype separately (it would be
possible to allow it, but I don't think there's a huge demand and it's
arguably inconsistent to set them differently).

> (I'm not sure whether those operations can get redirected to ICU today
> or whether they still always go to libc, but we'll surely want to fix
> it eventually if the latter is still true.)

Those operations do get redirected to ICU today. There are extensions
that call locale-sensitive libc functions directly, and obviously those
won't use ICU.

> Aside from the user-surprise issues discussed up to now, pg_dump
> scripts emitted by pre-v15 pg_dump are not going to contain
> LOCALE_PROVIDER clauses in CREATE DATABASE, and people are going to be
> very unhappy if that means they suddenly get totally different locale
> semantics after restoring into a new DB.

Agreed.

> I think we need some plan for mapping libc-style locale specs into ICU
> locales so that we can make that more nearly transparent.

ICU does a reasonable job mapping libc-like locale names to ICU
locales, e.g. en_US to en-US, etc. The ordering semantics aren't
guaranteed to be the same, of course (because the libc-locales are
platform-dependent), but it's at least conceptually the same locale.

> Maybe this means we are not ready to do ICU-by-default in v16.
> It certainly feels like there might be more here than we want to
> start designing post-feature-freeze.

This thread is already on the Open Items list. As long as it's not too
disruptive to others I'll leave it as-is for now to see how this sorts
out.

Right now it's not clear to me how much of this is a v15 issue vs a v16
issue.

Regards,
	Jeff Davis
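
For readers who want to picture the libc-to-ICU name mapping mentioned above (en_US to en-US), here is a rough Python sketch. It is only an approximation written for illustration and is not ICU's or PostgreSQL's actual canonicalization logic. The function name `libc_to_language_tag` is hypothetical, and passing "C" and "POSIX" through unchanged is an assumption of this sketch, not a statement about how ICU treats them.

```python
def libc_to_language_tag(name: str) -> str:
    """Approximate a language tag from a libc-style locale name.

    Illustrative only: real ICU canonicalization handles many more
    cases (scripts, variants, keywords) than this sketch does.
    """
    # Assumption of this sketch: "C" and "POSIX" carry no language,
    # so they are returned unchanged.
    if name in ("C", "POSIX"):
        return name

    # Drop the codeset (".UTF-8") and the modifier ("@euro"), if any.
    base = name.split(".", 1)[0].split("@", 1)[0]

    # Split "en_US" into language and territory, then join with "-".
    language, _, territory = base.partition("_")
    tag = language.lower()
    if territory:
        tag += "-" + territory.upper()
    return tag


if __name__ == "__main__":
    for libc_name in ("en_US", "en_US.UTF-8", "de_DE@euro", "fr", "C"):
        print(libc_name, "->", libc_to_language_tag(libc_name))
```

As the message notes, a matching name only means the locale is conceptually the same; the resulting sort order can still differ from what a given platform's libc produces.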