Re: Character expansion with ICU collations
From | Finnerty, Jim |
---|---|
Subject | Re: Character expansion with ICU collations |
Date | |
Msg-id | 9EC3C20F-0721-415A-BE68-CB7240B06A26@amazon.com |
In response to | Character expansion with ICU collations ("Finnerty, Jim" <jfinnert@amazon.com>) |
Responses | Re: Character expansion with ICU collations |
List | pgsql-hackers |
Re:
>> Can a CI collation be ordered upper case first, or is this a limitation of ICU?

> I don't know the authoritative answer to that, but to me it doesn't make
> sense, since the effect of a case-insensitive collation is to throw away
> the third-level weights, so there is nothing left for "upper case first"
> to operate on.

It wouldn't make sense for the ICU sort key of a CI collation itself, because the sort keys need to be binary equal, but what the collation of interest does is equivalent to adding a secondary "C"-collated expression to the ORDER BY clause. For example:

    SELECT ... ORDER BY expr COLLATE ci_as;

is ordered as if the query had been written:

    SELECT ... ORDER BY expr COLLATE ci_as, expr COLLATE "C";

Re:
> tailoring rules
>> yes

It looks like the relevant API call is ucol_openRules().

Interface documented here:
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ucol_8h.html

Example usage from C here:
https://android.googlesource.com/platform/external/icu/+/db20b09/source/test/cintltst/citertst.c

For example:

    /* Test with an expanding character sequence */
    u_uastrcpy(rule, "&a < b < c/abd < d");
    c2 = ucol_openRules(rule, u_strlen(rule), UCOL_OFF,
                        UCOL_DEFAULT_STRENGTH, NULL, &status);

and a reordering rule test:

    u_uastrcpy(rule, "&z < AB");
    coll = ucol_openRules(rule, u_strlen(rule), UCOL_OFF,
                          UCOL_DEFAULT_STRENGTH, NULL, &status);

That looks encouraging. It returns a UCollator object, like ucol_open(const char *localeString, ...), so it's an alternative to ucol_open(). One of the parameters is the equivalent of colStrength, so the question is then: how are the other keyword/value pairs, like colCaseFirst, colAlternate, etc., specified via the rules argument? In the same way, with the exception of colStrength? E.g., is "colAlternate=shifted;&z < AB" a valid rules string?

The ICU documentation says simply:

    "rules  A string describing the collation rules. For the syntax of the rules please see users guide."

Transform rules are documented here:
http://userguide.icu-project.org/transforms/general/rules

But there are no examples of using the keyword/value pairs that may appear in a locale string with the transform rules, and there's no locale argument on ucol_openRules(). How can the keyword/value pairs that may appear in the locale string be applied in combination with tailoring rules (with the exception of colStrength)?
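
To make the ordering described at the top of this message concrete (a case-insensitive comparison followed by a "C" tie-break), here is a minimal sketch in plain C. It does not use ICU at all: ASCII case folding via tolower() is only a stand-in for a real CI collation such as ci_as, and the names ci_cmp, ci_then_c_cmp and sort_ci_then_c are hypothetical helpers made up for this illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Primary key: case-insensitive comparison.
 * ASSUMPTION: ASCII folding with tolower() stands in for a CI collation;
 * a real implementation would compare ICU sort keys instead. */
int ci_cmp(const char *a, const char *b)
{
    for (; *a && *b; a++, b++) {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);
        if (ca != cb)
            return ca - cb;
    }
    /* At least one string has ended; the shorter one sorts first. */
    return (unsigned char) *a - (unsigned char) *b;
}

/* qsort comparator: CI comparison first, then a bytewise tie-break,
 * mirroring ORDER BY expr COLLATE ci_as, expr COLLATE "C". */
int ci_then_c_cmp(const void *pa, const void *pb)
{
    const char *a = *(const char *const *) pa;
    const char *b = *(const char *const *) pb;
    int r = ci_cmp(a, b);

    return r != 0 ? r : strcmp(a, b);
}

/* Sort an array of C strings in place using the two-level ordering. */
void sort_ci_then_c(const char **items, size_t n)
{
    qsort(items, n, sizeof(items[0]), ci_then_c_cmp);
}
```

With the input {"b", "a", "B", "A"}, sort_ci_then_c() leaves the array as A, a, B, b. The strings that are equal under the CI comparison are ordered by their byte values, which for ASCII puts upper case first.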