sql92 character sets
From | Dennis Bjorklund |
---|---|
Subject | sql92 character sets |
Date | |
Msg-id | Pine.LNX.4.44.0404131003190.4551-100000@zigo.dhs.org |
Responses | Re: sql92 character sets |
List | pgsql-hackers |
For my own amusement I'm reading the SQL 92 spec about character sets. There are some concepts that are a bit difficult that maybe someone can explain for me:

  character set
  character repertoire

For example, in 4.2.1 it says:

  A character set is described by a character set descriptor. A character set descriptor includes:

  - the name of the character set or character repertoire,
  - if the character set is a character repertoire, then the name of the form-of-use,
  - an indication of what characters are in the character set, and
  - the name of the default collation of the character set.

What I have understood so far is that form-of-use is the encoding. So if the character set is UNICODE then the form-of-use could be UTF-8, UTF-16 and so on. For the character repertoire, however, I don't have an intuition at all.

Then we have this little section:

  The <implementation-defined character repertoire name> SQL_TEXT specifies the name of a character repertoire and implied form-of-use that can represent every character that is in <SQL language character> and all other characters that are in character sets supported by the implementation.

Had Unicode been a superset of all character sets, then one could just have used Unicode for SQL_TEXT. Exactly how do we create a character repertoire that can store any character from any character set? Storing the character set for each character is not such a cool thing to do, even if it would work :-)

SQL_ASCII in pg is similar: it's basically a number of bytes. But the spec seems to say that one should be able to count the characters as well (not the bytes), so SQL_ASCII is not the same as SQL_TEXT.

ps. This is not me volunteering to implement all this :-)

--
/Dennis Björklund
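
As a small illustration of the distinction drawn in the post (a sketch, not anything taken from the SQL 92 spec or from PostgreSQL), the Python snippet below treats a `str` as a sequence of characters from the Unicode repertoire and treats each codec as one possible form-of-use. Mapping "form-of-use" onto Python codecs follows the post's own reading and is an assumption; the helper name `describe` is hypothetical.

```python
# Sketch: one character sequence, several "forms-of-use" (encodings).
# Assumption: a Python str stands in for characters of a repertoire,
# and a codec stands in for a form-of-use.

def describe(s: str) -> dict:
    """Return the character count and the byte length under a few encodings."""
    return {
        "characters": len(s),                       # counts characters, not bytes
        "utf-8 bytes": len(s.encode("utf-8")),      # variable-width encoding
        "utf-16 bytes": len(s.encode("utf-16-le")), # 2 bytes per BMP character, no BOM
        "latin-1 bytes": len(s.encode("latin-1")),  # 1 byte per character
    }

text = "Björklund"
info = describe(text)

# With the encoding known, bytes decode back to the same characters.
raw = text.encode("utf-8")
round_trip = raw.decode("utf-8")

# Without a known encoding, only the byte count is available, which is
# the situation the post describes for SQL_ASCII.
byte_count_only = len(raw)

for key, value in info.items():
    print(f"{key}: {value}")
```

The character count stays the same across encodings while the byte length changes, which is why counting characters requires knowing the form-of-use and not just the stored bytes.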