Re: Unicode support
From | Peter Eisentraut |
---|---|
Subject | Re: Unicode support |
Date | |
Msg-id | 200904141536.35866.peter_e@gmx.net |
In response to | Re: Unicode support (Andrew Gierth <andrew@tao11.riddles.org.uk>) |
Responses | Re: Unicode support |
List | pgsql-hackers |
On Tuesday 14 April 2009 07:07:27 Andrew Gierth wrote:
> FWIW, the SQL spec puts the onus of normalization squarely on the
> application; the database is allowed to assume that Unicode strings
> are already normalized, is allowed to behave in implementation-defined
> ways when presented with strings that aren't normalized, and provision
> of normalization functions and predicates is just another optional
> feature.

Can you name chapter and verse on that?

I see this, for example,

6.27 <numeric value function>

5) If a <char length expression> is specified, then
   Case:
   a) If the character encoding form of <character value expression> is
      not UTF8, UTF16, or UTF32, then let S be the <string value
      expression>.
      Case:
      i) If the most specific type of S is character string, then the
         result is the number of characters in the value of S.
         NOTE 134 - The number of characters in a character string is
         determined according to the semantics of the character set of
         that character string.
      ii) Otherwise, the result is OCTET_LENGTH(S).
   b) Otherwise, the result is the number of explicit or implicit
      <char length units> in <char length expression>, counted in
      accordance with the definition of those units in the relevant
      normatively referenced document.

So SQL redirects the question of character length to the Unicode standard.
I have not been able to find anything there on a quick look, but I'm sure
the Unicode standard has some very specific ideas on this. Note that the
matter of normalization is not mentioned here.
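
To make the length question concrete, here is a minimal Python sketch of how the answer can depend on normalization. It assumes that "characters" are counted as Unicode code points and that octets are counted in UTF-8; the helper names char_length and octet_length are made up for illustration and are not SQL or PostgreSQL functions.

```python
import unicodedata


def char_length(s: str) -> int:
    """Count Unicode code points (assumed here to be the 'character' unit)."""
    return len(s)


def octet_length(s: str, encoding: str = "utf-8") -> int:
    """Count the bytes of the string in the given encoding (UTF-8 assumed)."""
    return len(s.encode(encoding))


# The same visible text, "é", in two canonically equivalent forms.
composed = "\u00e9"      # single precomposed code point (NFC form)
decomposed = "e\u0301"   # base letter + combining acute accent (NFD form)

# Each entry is (code points, UTF-8 octets) for one form.
lengths = {
    "composed": (char_length(composed), octet_length(composed)),
    "decomposed": (char_length(decomposed), octet_length(decomposed)),
}

# Normalizing both to NFC makes them compare equal again.
equal_raw = composed == decomposed
equal_nfc = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

if __name__ == "__main__":
    for name, (chars, octets) in lengths.items():
        print(f"{name}: {chars} code point(s), {octets} octet(s)")
    print("equal before normalization:", equal_raw)
    print("equal after NFC:", equal_nfc)
```

Under these assumptions the two forms give different character and octet counts even though they render identically, so it matters who is expected to normalize.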