Re: [HACKERS] ICU collation variant keywords and pg_collation entries(Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] ICU collation variant keywords and pg_collation entries(Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values) |
Date | |
Msg-id | CAH2-Wz=pA+ViKfPxGyBvyc41H4FhdHp=HUrmK9CDfnYdziXziQ@mail.gmail.com |
In response to | Re: [HACKERS] ICU collation variant keywords and pg_collation entries(Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values) (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
List | pgsql-hackers |
On Mon, Aug 7, 2017 at 2:50 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 8/6/17 20:07, Peter Geoghegan wrote:
>> I've looked into this. I'll give an example of what keyword variants
>> there are for Greek, and then discuss what I think each is.
>
> I'm not sure why we want to get into editorializing this. We query ICU
> for the names of distinct collations and use that.

We ask ucol_getKeywordValuesForLocale() to get only "commonly used [variant] values with the given locale" within pg_import_system_collations(). So the editorializing has already begun.

> It's more than most
> people need, sure, but it doesn't cost us anything.

It's also *less* than what other users need. I disagree on the cost of redundancy among entries after initdb. It's just confusing to users, and seems avoidable without adding special case logic. What's the difference between el-u-co-standard-x-icu and el-x-icu?

> The alternatives
> are hand-maintaining a list of collations, or installing no collations
> by default.

A better alternative would be to actively take an interest in what collations are created, by further refining the rules by which they are created. We have a stable API, described by various standards, that we can work with for this. This doesn't have to be a maintainability burden. We can provide general guidance about how to add stuff back within documentation.

I do think that we should actually list all the collations that are available by default on some representative ICU version, once that list is tightened up, just as other database systems list them. That necessitates a little weasel wording that notes that later ICU versions might add more, but that's not a problem IMV. I don't think that CLDR will ever omit anything previously available, at least within a reasonable timeframe [1].

[1] http://cldr.unicode.org/index/process/cldr-data-retention-policy

--
Peter Geoghegan
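
As an illustration of the redundancy concern raised in the message above, here is a minimal Python sketch that flags collation names which differ from another entry only by an explicit `-u-co-standard` keyword. It is not PostgreSQL or ICU code: the helper names `base_name` and `redundant_pairs` are hypothetical, the sample name list is illustrative, and the sketch assumes that "standard" is the default collation type for the locale in question, which may not hold for every locale.

```python
# Hypothetical illustration only; not taken from PostgreSQL or ICU.
SUFFIX = "-x-icu"          # suffix used for ICU-provided collation names
DEFAULT_TYPE = "standard"  # ASSUMPTION: the locale's default collation type


def base_name(collname):
    """Strip an explicit '-u-co-standard' keyword from an ICU collation name."""
    if not collname.endswith(SUFFIX):
        return collname
    tag = collname[:-len(SUFFIX)]
    marker = "-u-co-" + DEFAULT_TYPE
    if tag.endswith(marker):
        tag = tag[:-len(marker)]
    return tag + SUFFIX


def redundant_pairs(names):
    """Return (variant, base) pairs where the variant only spells out the default."""
    present = set(names)
    pairs = []
    for name in names:
        base = base_name(name)
        if base != name and base in present:
            pairs.append((name, base))
    return pairs


# Illustrative sample, not a real catalog dump.
names = ["el-x-icu", "el-u-co-standard-x-icu", "de-x-icu", "de-u-co-phonebk-x-icu"]
print(redundant_pairs(names))
```

Under the stated assumption, only the Greek pair is reported as a candidate duplicate; the German phonebook variant is left alone because it names a non-default collation type.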