Re: multibyte-character aware support for function "downcase_truncate_identifier()"
From | Andrew Dunstan |
---|---|
Subject | Re: multibyte-character aware support for function "downcase_truncate_identifier()" |
Date | |
Msg-id | 4CE9A9B7.1080707@dunslane.net |
In response to | Re: multibyte-character aware support for function "downcase_truncate_identifier()" (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: multibyte-character aware support for function "downcase_truncate_identifier()"
Re: multibyte-character aware support for function "downcase_truncate_identifier()" |
List | pgsql-hackers |
<br /><br /> On 11/21/2010 06:09 PM, Robert Haas wrote: <blockquote cite="mid:AANLkTikweY9M4vfR0KmKwZiit-w8siSgsSk3x6iuj8Rz@mail.gmail.com" type="cite"><pre wrap="">I think that's fair. It actually doesn't seem like it should be that hard if we knew that the server encoding were UTF8 - it's just a big translation table somewhere, no? </pre></blockquote><br /> No, it's far more complex. See for example <a class="moz-txt-link-rfc2396E" href="http://unicode.org/reports/tr21/tr21-3.html">&lt;http://unicode.org/reports/tr21/tr21-3.html&gt;</a>, which says:<br />
<blockquote><p>There are a number of complications to case mappings that occur once the repertoire of characters is expanded beyond ASCII.</p>
<ul>
<li>Because of the inclusion of certain composite characters for compatibility, such as 01F1 "DZ" <i>capital dz</i>, there is a third case, called <i>titlecase</i>, which is used where the first letter of a word is to be capitalized (e.g. Titlecase, vs. UPPERCASE, or lowercase).
<ul><li>For example, the title case of the example character is 01F2 "Dz" <i>capital d with small z</i>.</li></ul></li>
<li>Case mappings may produce strings of different length than the original.
<ul><li>For example, the German character 00DF "ß" <i>small letter sharp s</i> expands when uppercased to the sequence of two characters "SS". This also occurs where there is no precomposed character corresponding to a case mapping, such as with 0149 "ʼn" <i>latin small letter n preceded by apostrophe.</i></li></ul></li>
<li>Characters may also have different case mappings, depending on the context.
<ul><li>For example, 03A3 "Σ" <i>capital sigma</i> lowercases to 03C3 "σ" <i>small sigma</i> if it is followed by another letter, but lowercases to 03C2 "ς" <i>small final sigma</i> if it is not.</li></ul></li>
<li>Characters may have case mappings that depend on the locale.
<ul><li>For example, in Turkish the letter 0049 "I" <i>capital letter i</i> lowercases to 0131 "ı" <i>small dotless i</i>.</li></ul></li>
<li>Case mappings are not, in general, reversible.
<ul><li>For example, once the string "McGowan" has been uppercased, lowercased or titlecased, the original cannot be recovered by applying another uppercase, lowercase, or titlecase operation.</li></ul></li>
</ul></blockquote><br /> cheers<br /><br /> andrew<br /><br />
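The complications quoted above can be seen in a few lines of Python. This is only an illustrative sketch: it relies on Python's built-in, locale-independent Unicode case mappings, not on anything in PostgreSQL or in downcase_truncate_identifier(), and the function name case_mapping_pitfalls is made up for this example.

```python
# Illustration only: Python's str methods apply Unicode's default
# (locale-independent) full case mappings. This is not PostgreSQL code.

def case_mapping_pitfalls():
    """Return observations matching the points in the quoted report."""
    return {
        # Titlecase is a third case: U+01F1 titlecases to U+01F2.
        "titlecase": "\u01f1".title(),
        # Length can change: U+00DF (sharp s) uppercases to "SS".
        "sharp_s_upper": "\u00df".upper(),
        # Context matters: in an all-caps Greek word, the leading sigma
        # becomes U+03C3 while the word-final sigma becomes U+03C2.
        "sigma_lower": "\u03a3\u039f\u03a6\u039f\u03a3".lower(),
        # Not reversible: the inner capital of "McGowan" is lost.
        "round_trip": "McGowan".upper().lower(),
        # Locale matters: with no locale awareness, "I" lowercases to
        # plain "i", never to the Turkish dotless U+0131.
        "capital_i_lower": "I".lower(),
    }

if __name__ == "__main__":
    for name, value in case_mapping_pitfalls().items():
        print(f"{name}: {value!r} (length {len(value)})")
```

None of these results could come from a one-to-one, character-for-character translation table, which is the point being made above. The Turkish mapping in particular cannot be reproduced with these built-in methods at all, since they take no locale argument.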