Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales)
From | Andrew Dunstan |
---|---|
Subject | Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales) |
Date | |
Msg-id | 538BA1E5.6040406@dunslane.net |
In response to | Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales) |
List | pgsql-hackers |

On 06/01/2014 05:35 PM, Tom Lane wrote:
> I wrote:
>> 3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
>> (non breaking space) or U+00E1 (a-acute). I think this would probably
>> work for most encodings but it might still fail in the Far East. Another
>> objection is that the expected/plpython_unicode.out file would contain
>> that character in UTF8 form. In principle that would work, since the test
>> sets client_encoding = utf8 explicitly, but I'm worried about accidental
>> corruption of the expected file by text editors, file transfers, etc.
>> (The current usage of U+0080 doesn't suffer from this risk because psql
>> special-cases printing of multibyte UTF8 control characters, so that we
>> get exactly "\u0080".)
> I did a little bit of experimentation and determined that none of the
> LATIN1 characters are significantly more portable than what we've got:
> for instance a-acute fails to convert into 16 of the 33 supported
> server-side encodings (versus 17 failures for U+0080). However,
> non-breaking space is significantly better: it converts into all our
> supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
> It seems likely that we won't do better than that except with a basic
> ASCII character.
>

Yeah, I just looked at the copyright symbol, with similar results.

Let's just stick to ASCII.

cheers

andrew
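
For readers who want to repeat this kind of portability check, here is a minimal sketch in Python. It is only a rough analog of the experiment described above: it uses Python's own codec tables rather than PostgreSQL's server-side conversion routines, and `SAMPLE_CODECS` is a hand-picked, hypothetical sample rather than the 33 server encodings mentioned in the thread, so the counts it prints should not be expected to match the figures quoted above.

```python
# Rough analog only: Python's codecs stand in for PostgreSQL's conversion
# tables, and the codec list below is an illustrative sample (an assumption,
# not the list of encodings PostgreSQL supports).
CANDIDATES = {
    "U+0080 (control character)": "\u0080",
    "U+00A0 (no-break space)": "\u00a0",
    "U+00A9 (copyright sign)": "\u00a9",
    "U+00E1 (a-acute)": "\u00e1",
    "U+0041 (ASCII 'A')": "A",
}

SAMPLE_CODECS = [
    "utf-8", "latin-1", "iso8859-2", "koi8-r", "cp1251",
    "cp1252", "euc_jp", "euc_kr", "gb2312",
]


def can_encode(ch, codec):
    """Return True if the character has a representation in the codec."""
    try:
        ch.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def failing_codecs(ch, codecs=SAMPLE_CODECS):
    """List the codecs from the sample that cannot represent the character."""
    return [codec for codec in codecs if not can_encode(ch, codec)]


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        failed = failing_codecs(ch)
        print(f"{label}: fails in {len(failed)} of {len(SAMPLE_CODECS)}: {failed}")
```

A plain ASCII character is the only candidate guaranteed to encode in every codec in the sample, which is the same reasoning behind sticking to ASCII in the test.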