Re: Order changes in PG16 since ICU introduction
From | Jeff Davis |
---|---|
Subject | Re: Order changes in PG16 since ICU introduction |
Date | |
Msg-id | 4e3281d1ba5950dbe3b5d3c8d511da4c5751217a.camel@j-davis.com |
In reply to | Re: Order changes in PG16 since ICU introduction (Peter Eisentraut <peter.eisentraut@enterprisedb.com>) |
List | pgsql-hackers |
On Mon, 2023-05-22 at 14:27 +0200, Peter Eisentraut wrote:
> The rules are for setting whatever sort order you like. Maybe you want
> to sort + before - or whatever. It's like, if you don't like it, build
> your own.

A build-your-own feature is fine, but it's not completely zero cost. There is some risk that rules specified for ICU version X fail to load for ICU version Y. If that happens to your database default collation, you are in big trouble. The risk of failing to load a language tag in a later version, especially one returned by uloc_toLanguageTag() in strict mode, is much lower.

We can reduce the risk by allowing rules only for CREATE COLLATION (not CREATE DATABASE), and see what users do with it first, and consider adding it to CREATE DATABASE later.

We can also try to explain in the docs that it's a build-it-yourself kind of feature (use it if you see a purpose, otherwise ignore it), though I'm not sure quite how we should word it.

And I'm skeptical that we don't have a single plausible end-to-end user story. I just can't think of any reason someone would need something like this, given how flexible the collation settings in the language tags are. The best case I can think of is if someone is trying to make an ICU collation that matches some non-ICU collation in another system, which sounds hard; but perhaps it's reasonable to do in cases where it just needs to work well-enough in some limited case.

Also, do we have an answer as to why specifying the rules as '' is not the same as not specifying any rules[1]?

[1] https://www.postgresql.org/message-id/36a6e89689716c2ca1fae8adc8e84601a041121c.camel@j-davis.com

> The co settings are just everything else.
> They are not parametric, they are just some other sort order that
> someone spelled out explicitly.

This sounds like another case where we can't really tell the user why they would want to use a specific "co" setting; they should only use it if they already know they want it. Is there some way we can word that in the documentation so that people don't misuse them?

For instance, one of them is called "emoji". I'm sure a lot of applications use emoji (or at least might encounter them), should they always use co-emoji, or would some people who are using emoji not want it? Can it be combined with "ks" or other "k*" settings?

What I'm trying to avoid is users seeing something in the documentation and using it without it really being a good fit for their problem. Then they see something unexpected, and need to rebuild all of their indexes or something.

> > * I don't understand what "kc" means if "ks" is not set to
> > "level1".
>
> There is an example here:
> https://peter.eisentraut.org/blog/2023/05/16/overview-of-icu-collation-settings#colcaselevel

Interesting, thank you.

Regards,
Jeff Davis
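
For readers following the thread, the two ways of customizing an ICU collation contrasted above can be sketched in PostgreSQL 16 syntax. This is an illustrative sketch only, not taken from the message: the collation names are hypothetical, the rule string is just a sample tailoring, and whether "co" and "k*" settings combine usefully is exactly the open question raised above.

```sql
-- Hypothetical names throughout; requires PostgreSQL 16+ built with ICU.

-- 1. Customization through language-tag settings only.
--    "ks-level1" compares at primary strength and "kc-true" adds the case level,
--    the combination discussed in the quoted exchange about "kc" and "ks".
CREATE COLLATION sketch_ks_kc (
    provider = icu,
    locale = 'und-u-ks-level1-kc-true',
    deterministic = false
);

-- 2. A named "co" ordering selected through the language tag.
CREATE COLLATION sketch_emoji (
    provider = icu,
    locale = 'und-u-co-emoji'
);

-- 3. Build-your-own tailoring through explicit rules, allowed here only on
--    CREATE COLLATION, as the message proposes. The rule text is an arbitrary sample.
CREATE COLLATION sketch_rules (
    provider = icu,
    locale = 'und',
    rules = '&V << w <<< W'
);

-- Try each collation on a small sample to see how the ordering changes.
SELECT x
FROM (VALUES ('v'), ('w'), ('W'), ('x')) AS t(x)
ORDER BY x COLLATE sketch_rules;
```

The first two forms depend only on a language tag, which is the lower-risk path described above; the third stores rule text that a later ICU version must still be able to parse.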