Re: prevent encoding conversion recursive error

From Tom Lane
Subject Re: prevent encoding conversion recursive error
Msg-id 11386.1123554088@sss.pgh.pa.us
In response to Re: prevent encoding conversion recursive error  ("Qingqing Zhou" <zhouqq@cs.toronto.edu>)
Responses Re: prevent encoding conversion recursive error  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> Yeah, it is not a very clean solution. Do you mean the general problem is
> "prevent recursive error reporting because of the error in translating error
> message"?

> I put the image of the reporting email here:
> http://www.cs.toronto.edu/~zhouqq/encode.jpg

Actually, I believe the general problem is that the gettext software
is doing the wrong internal character-set conversion for translated
message texts.

I can get this same crash on a Linux machine if I have server encoding
= utf8 and client encoding = gb18030 and I set lc_messages = zh_TW
... but if I instead make lc_messages = zh_CN, no problem.  The backend
zh_TW.po file contains

msgid "ignoring unconvertible UTF-8 character 0x%04x"
msgstr "忽略無法轉換的UTF-8字元0x%04x"

and if I read the header correctly, this is claimed to be in UTF8
encoding.  So it ought to be delivered as-is when in a UTF8 database.
But tracing through the failure with gdb, I see that what is actually
delivered back from gettext() is

(gdb) p str
$1 = 0x82e8a74 "����?��??��UTF-8��Ԫ0xd4da"
(gdb) x/32cx str
0x82e8a74:      0xba    0xf6    0xc2    0xd4    0x3f    0xb7    0xa8    0x3f
0x82e8a7c:      0x3f    0xb5    0xc4    0x55    0x54    0x46    0x2d    0x38
0x82e8a84:      0xd7    0xd6    0xd4    0xaa    0x30    0x78    0x64    0x34
0x82e8a8c:      0x64    0x61    0x00    0x7e    0x7f    0x7f    0x7f    0x7f
(gdb)

so some sort of conversion has taken place.  I had originally initialized
the database with initdb --locale=zh_CN, which was interpreted by
Postgres as requesting EUC_CN encoding.  I suspect the above is the
EUC_CN equivalent of the message text from the .po file, and that the
real problem is that gettext() has not been told the correct character
set to convert messages to.
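
The suspicion above can be checked outside the backend. What follows is only a
rough sketch, not anything taken from the gdb session: it assumes Python's
"gb2312" codec is a fair stand-in for EUC_CN, and that gettext's conversion
substitutes "?" for characters the target set lacks. The name is_valid_utf8 is
made up for illustration. It re-encodes the zh_TW msgstr that way, compares the
result with the bytes from the dump, and then asks whether those bytes could
pass as UTF-8.

```python
# Bytes shown by "x/32cx str" in the gdb session, up to the terminating NUL.
dumped = bytes.fromhex(
    "baf6 c2d4 3f b7a8 3f 3f b5c4 55 54 46 2d 38 d7d6 d4aa 30 78 64 34 64 61"
)

# msgstr from zh_TW.po, with %04x filled in with the value seen in the dump.
msgstr = "忽略無法轉換的UTF-8字元0x%04x" % 0xD4DA

# Assumption: "gb2312" approximates EUC_CN, and unconvertible characters
# become "?" (errors="replace"), as the 0x3f bytes in the dump suggest.
as_euc_cn = msgstr.encode("gb2312", errors="replace")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Does the re-encoded msgstr match what gettext() handed back?
print(as_euc_cn == dumped)

# What the dumped bytes look like when read as EUC_CN.
print(dumped.decode("gb2312"))

# Could a UTF-8 server treat the dumped bytes as valid text?
print(is_valid_utf8(dumped))
```

If the first comparison holds and the last check fails, that would be
consistent with gettext() converting to EUC_CN while the backend expects UTF8,
so any later attempt to convert the message for the client would itself hit
an unconvertible-character error.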

ISTM we've seen this issue before and Peter had an idea how to fix it,
but I forget the details.  Peter?

            regards, tom lane
