Re: Unicode support
From | Peter Eisentraut |
---|---|
Subject | Re: Unicode support |
Date | |
Msg-id | 200904141532.44618.peter_e@gmx.net |
In reply to | Re: Unicode support (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: Unicode support
|
List | pgsql-hackers |
On Monday 13 April 2009 22:39:58 Andrew Dunstan wrote:
> Umm, but isn't that because your encoding is using one code point?
>
> See the OP's explanation w.r.t. canonical equivalence.
>
> This isn't about the number of bytes, but about whether or not we should
> count characters encoded as two or more combined code points as a single
> char or not.

Here is a test case that shows the problem (if your terminal can display combining characters (xterm appears to work)):

SELECT U&'\00E9', char_length(U&'\00E9');
 ?column? | char_length
----------+-------------
 é        |           1
(1 row)

SELECT U&'\0065\0301', char_length(U&'\0065\0301');
 ?column? | char_length
----------+-------------
 é        |           2
(1 row)
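
For readers who want to reproduce the two counts outside the database, here is a rough Python sketch using only the standard `unicodedata` module. It is not how PostgreSQL computes `char_length`; the helper names (`count_code_points`, `count_base_characters`, `count_after_nfc`) are made up for illustration, and skipping combining marks is only an approximation of "combined code points count as one character", not a full grapheme-cluster algorithm.

```python
import unicodedata

# The two spellings from the test case above.
precomposed = "\u00e9"        # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "\u0065\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def count_code_points(s: str) -> int:
    """Count code points, which matches the char_length results shown above."""
    return len(s)


def count_base_characters(s: str) -> int:
    """Count code points that are not combining marks.

    Assumption: a code point with canonical combining class 0 starts a new
    character. This is a simplification for illustration only.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


def count_after_nfc(s: str) -> int:
    """Count code points after canonical composition (NFC)."""
    return len(unicodedata.normalize("NFC", s))


for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    print(label,
          count_code_points(text),
          count_base_characters(text),
          count_after_nfc(text))
```

The plain code point count differs between the two spellings, while both alternative counts treat them the same, which is the canonical equivalence question raised earlier in the thread.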