BUG #2685: Wrong charset of server messages on client [PATCH]

From: Sergiy Vyshnevetskiy
Subject: BUG #2685: Wrong charset of server messages on client [PATCH]
Msg-id: 200610101455.k9AEtTTd085210@wwwmaster.postgresql.org
Responses: Re: BUG #2685: Wrong charset of server messages on client [PATCH]  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs

The following bug has been logged online:

Bug reference:      2685
Logged by:          Sergiy Vyshnevetskiy
Email address:      serg@vostok.net
PostgreSQL version: 8.1
Operating system:   FreeBSD-6 stable
Description:        Wrong charset of server messages on client [PATCH]
Details:

DESCRIPTION:

The PostgreSQL backend uses gettext() to localize its messages. By default,
the charset of the localized messages is determined by LC_CTYPE.

The message is then run through a sprintf-like formatting step (possibly
with database data as arguments) and passed to send_message_to_frontend(),
which converts it from the _database_ charset(!) to the client charset.

If LC_CTYPE is not the same as (or at least binary compatible with) the
database charset, the client gets garbage characters in server messages. If
the database charset is UTF-8, the cluster may recursively generate "invalid
byte sequence for encoding" errors until it fills up
errordata[ERRORDATA_STACK_SIZE], and then it panics.
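
To illustrate why a UTF-8 database trips over such messages, here is a minimal sketch in standalone C. The helper looks_like_utf8() and the two byte arrays are hypothetical names made up for this example; the check is a simplified structural test, not the backend's own verifier. It shows that a short Russian word encoded in KOI8-R is not a well-formed UTF-8 byte sequence, which is the situation that would arise if LC_CTYPE selected KOI8-R while the database charset is UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified UTF-8 structure check (hypothetical helper, not PostgreSQL code).
 * It validates lead and continuation bytes only; it does not reject overlong
 * forms or surrogates.
 */
static bool
looks_like_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len)
    {
        unsigned char c = s[i];
        size_t need;            /* number of continuation bytes expected */

        if (c < 0x80)
            need = 0;           /* ASCII */
        else if ((c & 0xE0) == 0xC0)
            need = 1;           /* lead byte of a 2-byte sequence */
        else if ((c & 0xF0) == 0xE0)
            need = 2;           /* lead byte of a 3-byte sequence */
        else if ((c & 0xF8) == 0xF0)
            need = 3;           /* lead byte of a 4-byte sequence */
        else
            return false;       /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return false;       /* sequence is truncated */
        for (size_t k = 1; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* expected a continuation byte */
        i += need + 1;
    }
    return true;
}

/* The Russian word "да" ("yes") in two encodings. */
static const unsigned char da_koi8r[] = {0xC4, 0xC1};
static const unsigned char da_utf8[]  = {0xD0, 0xB4, 0xD0, 0xB0};
```

Passing da_koi8r to the check fails because 0xC4 announces a two-byte sequence but 0xC1 is not a continuation byte; da_utf8 passes.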

SOLUTION:

Convert server messages to the database charset.
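
As a rough sketch of how this can be requested from gettext, the standard bind_textdomain_codeset() call from <libintl.h> tells gettext which charset to return translations in for a given message domain, independently of LC_CTYPE. The wrapper name set_message_codeset() is hypothetical and exists only for this example; the "postgres" domain and the "UTF-8" codeset are assumed values.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: ask gettext to return translations for `domain`
 * in `codeset` instead of the charset implied by LC_CTYPE.
 * Returns 0 if gettext accepted the request, -1 if it returned NULL.
 */
static int
set_message_codeset(const char *domain, const char *codeset)
{
    assert(domain != NULL && codeset != NULL);
    return bind_textdomain_codeset(domain, codeset) != NULL ? 0 : -1;
}
```

A caller would invoke set_message_codeset("postgres", "UTF-8") once the database encoding is known. On systems where gettext is a separate library, the program also has to be linked with -lintl.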

PATCH:

--- src/backend/utils/mb/mbutils.c.o0 Tue Oct 10 11:51:13 2006
+++ src/backend/utils/mb/mbutils.c  Tue Oct 10 11:49:22 2006
@@ -615,6 +615,7 @@
  DatabaseEncoding = &pg_enc2name_tbl[encoding];
  Assert(DatabaseEncoding->encoding == encoding);
 #ifdef USE_ICU
+ bind_textdomain_codeset("postgres",(&pg_enc2iananame_tbl[encoding])->name);
  ucnv_setDefaultName((&pg_enc2iananame_tbl[encoding])->name);
 #endif
 }




This, however, uncovers another bug: PostgreSQL writes messages to
stderr/syslog as-is, without converting database data from the database
charset to the charset from LC_MESSAGES. After this patch the same will apply
to the message text as well. The fix should be trivial: set up a conversion
from the database charset to the server charset. I will post a patch for it
later.
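
For illustration only, such a conversion step could look like the following standalone sketch built on the POSIX iconv() interface. This is an assumption about the general shape of the fix, not the promised patch, and the backend has its own encoding conversion routines that a real fix would more likely use. The helper name convert_for_log() and its parameters are hypothetical.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert the NUL-terminated string `in`, encoded as
 * `from`, into `out`, encoded as `to`. At most outsize - 1 bytes are written,
 * followed by a terminating NUL. Returns 0 on success, -1 on failure.
 */
static int
convert_for_log(const char *from, const char *to,
                const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char   *src = (char *) in;  /* iconv() takes char **; input is not modified */
    char   *dst = out;
    size_t  inleft = strlen(in);
    size_t  outleft;
    size_t  rc;

    assert(outsize > 0);
    outleft = outsize - 1;      /* reserve room for the terminating NUL */

    cd = iconv_open(to, from);  /* note the argument order: target first */
    if (cd == (iconv_t) -1)
        return -1;              /* conversion not supported */

    rc = iconv(cd, &src, &inleft, &dst, &outleft);
    iconv_close(cd);
    if (rc == (size_t) -1)
        return -1;              /* invalid input or output buffer too small */

    *dst = '\0';
    return 0;
}
```

For example, converting the UTF-8 bytes 0xC3 0xA9 ("é") with from set to "UTF-8" and to set to "ISO-8859-1" yields the single byte 0xE9.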

NOTE:

I used pg_enc2iananame_tbl instead of pg_enc2name_tbl, because gettext
doesn't accept many

Possible TODO:
Change PostgreSQL charset names to IANA-standard names.
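
The name mismatch behind the NOTE and TODO above can be pictured as a small lookup table from PostgreSQL encoding names to IANA charset names. The sketch below is an assumed illustration: the struct, the table contents, and the function iana_name_for() are hypothetical and do not reproduce the actual pg_enc2iananame_tbl used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One PostgreSQL encoding name paired with its IANA charset name. */
struct enc_alias
{
    const char *pg_name;
    const char *iana_name;
};

/* Illustrative subset only; a real table would cover every encoding. */
static const struct enc_alias enc_aliases[] = {
    {"UTF8",   "UTF-8"},
    {"LATIN1", "ISO-8859-1"},
    {"KOI8",   "KOI8-R"},
    {"EUC_JP", "EUC-JP"},
};

/* Return the IANA name for a PostgreSQL encoding name, or NULL if unknown. */
static const char *
iana_name_for(const char *pg_name)
{
    assert(pg_name != NULL);
    for (size_t i = 0; i < sizeof enc_aliases / sizeof enc_aliases[0]; i++)
        if (strcmp(enc_aliases[i].pg_name, pg_name) == 0)
            return enc_aliases[i].iana_name;
    return NULL;
}
```

With such a table, the name handed to gettext would be iana_name_for("UTF8"), that is "UTF-8", rather than the PostgreSQL spelling.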
