Re: UTF16 surrogate pairs in UTF8 encoding
From | Tom Lane |
---|---|
Subject | Re: UTF16 surrogate pairs in UTF8 encoding |
Date | |
Msg-id | 2195.1283954496@sss.pgh.pa.us |
In reply to | Re: UTF16 surrogate pairs in UTF8 encoding (Marko Kreen <markokr@gmail.com>) |
Responses | Re: UTF16 surrogate pairs in UTF8 encoding |
List | pgsql-hackers |
Marko Kreen <markokr@gmail.com> writes:
> Although it does seem unnecessary.

The reason I asked for this to be spelled out is that ordinarily, a backslash escape \nnn is a very low-level thing that will insert exactly what you say. To me it's quite unexpected that the system would editorialize on that to the extent of replacing two UTF16 surrogate characters by a single code point. That's necessary for correctness because our underlying storage is UTF8, but it's not obvious that it will happen. (As a counterexample, if our underlying storage were UTF16, then very different things would need to happen for the exact same SQL input.)

I think a lot of people will have this same question when reading this para, which is why I asked for an explanation there.

regards, tom lane
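To make the point about storage encoding concrete, here is a minimal Python sketch of what has to happen to a pair of surrogate escapes. It is not PostgreSQL's implementation; the function name `combine_surrogates` and the sample values 0xD83D and 0xDE00 (which together denote U+1F600) are assumptions chosen for illustration. With UTF8 storage the two escapes must be merged into one code point before encoding, whereas with UTF16 storage the same two values would be written out as they are.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Merge a UTF-16 surrogate pair into a single Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits of the code point, offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Illustrative input: the two values a user might write as two escapes.
high, low = 0xD83D, 0xDE00

code_point = combine_surrogates(high, low)

# UTF8 storage: the pair becomes one code point, encoded as four bytes.
utf8_bytes = chr(code_point).encode("utf-8")

# UTF16 storage (the counterexample): the two surrogates are stored as given.
utf16_bytes = chr(code_point).encode("utf-16-be")

print(f"U+{code_point:X}")
print(utf8_bytes.hex(" "))
print(utf16_bytes.hex(" "))
```

Encoding each surrogate to UTF8 separately is not an option, because a lone surrogate is not a valid code point in UTF8; that is why the merge is needed for correctness rather than being a convenience.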