Re: Server-side support of all encodings
From | William ZHANG |
---|---|
Subject | Re: Server-side support of all encodings |
Date | |
Msg-id | f5n7ba$nmr$1@news.hub.org |
In reply to | Server-side support of all encodings (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>) |
Responses | Re: Server-side support of all encodings |
List | pgsql-hackers |
"Tom Lane" <tgl@sss.pgh.pa.us> > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >> PostgreSQL suppots SJIS, BIG5, GBK, UHC and GB18030 as client encodings, >> but we cannot use them as server encodings. Are there any reason for it? > > Very much so --- they aren't safe ASCII-supersets, and thus for example > the parser will fail on them. Backend encodings must have the property > that all bytes of a multibyte character are >= 128. Sorry. I still cannot understand why backend encodings must have this property. AFAIK, the parser treats characters as ASCII. So any multi-byte characters will be treated as two or more ASCII characters. But if the multi-byte encoding doesnot use any special ASCII characters like single quote('), double quote(") and backslash(\), I think the parser can deal with it correctly. A quick search in src\backend\utils\mb\Unicode\*.map tells me that no encoding uses single quote or double quote, but JOHAB, GBK, GB18030, BIG5, SJIS use backslash. Since pgsql doesnot accept backslash as escape character in identity(double quoted string) or value(single quoted string) any more, I think the parser/scanner can process multi-bytes characters correctly. Thanks in advance. William ZHANG > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >