Re: EBCDIC sorting as a use case for ICU rules
From | Peter Eisentraut |
---|---|
Subject | Re: EBCDIC sorting as a use case for ICU rules |
Date | |
Msg-id | 1f20d0d7-6b15-d10f-94f5-77b2e82112b1@enterprisedb.com |
In response to | EBCDIC sorting as a use case for ICU rules ("Daniel Verite" <daniel@manitou-mail.org>) |
Responses | Re: EBCDIC sorting as a use case for ICU rules; Re: EBCDIC sorting as a use case for ICU rules |
List | pgsql-hackers |
On 21.06.23 15:28, Daniel Verite wrote:
> A collation like the following seems to work (the rule simply enumerates
> US-ASCII letters in the EBCDIC alphabet order, with adequate quoting)
>
> CREATE COLLATION ebcdic (provider='icu', locale='und',
> rules=$$&' '<'.'<'<'<'('<'+'<\|<'&'<'!'<'$'<'*'<')'<';'<'-'<'/'<','<'%'<'_'<'>'<'?'<'`'<':'<'#'<'@'<\'<'='<'"'<a<b<c<d<e<f<g<h<i<j<k<l<m<n<o<p<q<r<'~'<s<t<u<v<w<x<y<z<'['<'^'<']'<'{'<A<B<C<D<E<F<G<H<I<'}'<J<K<L<M<N<O<P<Q<R<'\'<S<T<U<V<W<X<Y<Z<0<1<2<3<4<5<6<7<8<9$$);
>
> This can be useful for people who migrate from mainframes to Postgres
> and need their migration tests to produce the same sorted results as the
> original system.
> Since rules can be defined at the database level with the icu_rules option,
> they don't even need to tweak their queries to add COLLATE clauses,
> which surely is appreciable in that kind of project.
>
> US-ASCII when sorted in EBCDIC order comes out like this:
>
> .<(+|&!$*);-/,%_>?`:#@'="abcdefghijklmnopqr~stuvwxyz[^]{ABCDEFGHI}JKLMNOPQR\STUVWXYZ0123456789
>
> Maybe this example could be added to the documentation except for
> the problem that the rule is very long and dollar-quoting cannot be split
> into several lines. Literals enclosed by single quotes can be split that
> way, but would require escaping the single quotes in the rule, which
> would lead to scary-looking over-quoted contents.

You can use whitespace in the rules. For example,

CREATE COLLATION ebcdic (provider='icu', locale='und',
rules=$$
& ' ' < '.' < '<' < '(' < '+' < \|
< '&' < '!' < '$' < '*' < ')' < ';'
< '-' < '/' < ',' < '%' < '_' < '>' < '?'
< '`' < ':' < '#' < '@' < \' < '=' < '"'
< a < b < c < d < e < f < g < h < i
< j < k < l < m < n < o < p < q < r
< '~' < s < t < u < v < w < x < y < z
< '[' < '^' < ']'
< '{' < A < B < C < D < E < F < G < H < I
< '}' < J < K < L < M < N < O < P < Q < R
< '\' < S < T < U < V < W < X < Y < Z
< 0 < 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9
$$);

(This particular layout is meant to match the rows in
https://en.wikipedia.org/wiki/EBCDIC#Code_page_layout.)
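
A quick way to try out a collation like the one above is a small ad hoc query. This is only a sketch: it assumes the ebcdic collation has already been created as shown, on a PostgreSQL version that supports the rules option with ICU (PostgreSQL 16 or later, as far as I know), and the sample values are arbitrary.

```sql
-- Sketch only: assumes the "ebcdic" collation from this message already exists
-- and that the server was built with ICU support. The sample values are made up.
SELECT c
FROM (VALUES ('a'), ('A'), ('1'), ('~'), ('s'), ('{')) AS t(c)
ORDER BY c COLLATE ebcdic;
```

If the rule behaves as intended, the rows should come back in the order in which the characters appear in the rule (a, ~, s, {, A, 1) rather than in the default ordering of the und locale.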