Thread: Encoding for error messages during connect


Encoding for error messages during connect

From
Thomas Kellerer
Date:
Hi,

when the server is set to e.g. lc_messages = 'German_Germany.1252', error messages during connect are not properly
decoded by the driver (or encoded by the server?)

At least when the password is incorrect, the German error message

    Passwort-Authentifizierung für Benutzer »thomas« fehlgeschlagen

is incorrectly received by the driver as

    Passwort-Authentifizierung f?r Benutzer ?thomas? fehlgeschlagen

After debugging the driver I found out that the driver creates the stream for the startup communication using US_ASCII
encoding, which cannot correctly decode any byte value above 127.

I debugged the data that is received from the server, and that proved that the message is received in a single-byte
encoding. This seems correct, as 'German_Germany.1252' is indeed a single-byte encoding.
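
The effect can be reproduced outside the driver. The following is only a minimal sketch, not driver code: the class and method names are made up for illustration, and the bytes are produced locally with ISO-8859-1 as a stand-in for what the server sends (the characters ü, » and « have the same byte values in ISO-8859-1 and Windows-1252). Java's US-ASCII decoder substitutes the replacement character U+FFFD for each byte above 127, which typically ends up displayed as "?".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDBC driver.
final class ConnectErrorDecoding {

    // Stand-in for the bytes the server sends: the German message in a single-byte encoding.
    // Unicode escapes are used so the example does not depend on the source file encoding.
    static byte[] serverBytes() {
        return "Passwort-Authentifizierung f\u00fcr Benutzer \u00bbthomas\u00ab fehlgeschlagen"
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decode the raw bytes with the given charset, as a stream reader would.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = serverBytes();
        // Bytes above 127 cannot be mapped by US-ASCII and are replaced.
        System.out.println(decode(raw, StandardCharsets.US_ASCII));
        // The same bytes decoded with a matching single-byte charset keep the umlaut and quotes.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```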

I changed the stream that the driver uses during connect to use a different encoding, by changing
org.postgresql.core.v3.ConnectionFactoryImpl and adding the line

newStream.setEncoding(Encoding.getDatabaseEncoding("ISO-8859-1"));

after Line 77 (where newStream = new PGStream(host, port) is done)

And in that case the error message is decoded properly by the driver.

Now I don't think it would be possible for the driver to find out which encoding to use for that stream before actually
having a connection. So it would need to evaluate some kind of client-side information, e.g. the lc_messages environment
variable on the client, or a connection property that would then be used to initialize the stream correctly.

Personally I'd prefer a connection property (something like "messageEncoding") to control this, as it can be part of
the JDBC URL, which is usually configurable in a Java environment.
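
A rough sketch of how such a property might be resolved, under the assumption that the connection properties arrive as a java.util.Properties object. The property name "messageEncoding" is only the suggestion from above and does not exist in the driver; the class and method names are hypothetical as well. Unknown or missing values fall back to US-ASCII, which is what the startup stream uses today.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, not part of the JDBC driver.
final class StartupEncodingSketch {

    // Proposed (not existing) connection property name.
    static final String PROPERTY = "messageEncoding";

    // Pick the charset for decoding messages received during connection startup.
    static Charset startupCharset(Properties info) {
        String name = info.getProperty(PROPERTY);
        if (name == null || name.trim().isEmpty()) {
            return StandardCharsets.US_ASCII; // current behaviour
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.US_ASCII;
        }
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        info.setProperty(PROPERTY, "ISO-8859-1");
        System.out.println(startupCharset(info));
    }
}
```

With the property in the URL, the resolved charset could then be handed to the startup stream in place of the hard-coded encoding.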

What do you think?

Regards
Thomas






Re: Encoding for error messages during connect

From
Thomas Kellerer
Date:
Any comments on this?

Thomas Kellerer, 05.11.2011 12:12:
> Hi,
>
> when the server is set to e.g. lc_messages = 'German_Germany.1252'
> then error messages during connect are not properly decoded by the
> driver (or encoded by the server?)
>
> At least when the password is incorrect the German error message
>
> Passwort-Authentifizierung für Benutzer »thomas« fehlgeschlagen
>
> is incorrectly received by the driver as
>
> Passwort-Authentifizierung f?r Benutzer ?thomas? fehlgeschlagen
>
> After debugging the driver I found out that the driver creates the
> stream for the startup communication using US_ASCII encoding which
> will yield incorrect characters beyond ASCII 127.
>
> I debugged the data that is received from the server and that proved
> that the message is received as a single byte encoding. Which seems
> correct as 'German_Germany.1252' is indeed a single byte encoding.
>
> I changed the stream that the driver uses during connect to use a
> different encoding, by changing
> org.postgresql.core.v3.ConnectionFactoryImpl and adding the line
>
> newStream.setEncoding(Encoding.getDatabaseEncoding("ISO-8859-1"));
>
> after Line 77 (where newStream = new PGStream(host, port) is done)
>
> And in that case the error message is decoded properly by the
> driver.
>
> Now I don't think it would be possible for the driver to find out
> which encoding to use for that stream before actually having a
> connection. So it would need to evaluate some kind of client side
> information, e.g. the lc_messages environment variable on the client
> or through a connection property that would then be used to
> initialize the stream correctly.
>
> Personally I'd prefer a connection property (something like
> "messageEncoding") to control this as this can be part of the JDBC
> URL which is usually configurable in a Java environment.
>
> What do you think?
>
> Regards Thomas



Re: Encoding for error messages during connect

From
Kris Jurka
Date:
On 11/17/2011 12:17 AM, Thomas Kellerer wrote:

>> Now I don't think it would be possible for the driver to find out
>> which encoding to use for that stream before actually having a
>> connection. So it would need to evaluate some kind of client side
>> information, e.g. the lc_messages environment variable on the client
>> or through a connection property that would then be used to
>> initialize the stream correctly.
>>

It's not a good assumption that the client environment will match the
server environment.

>> Personally I'd prefer a connection property (something like
>> "messageEncoding") to control this as this can be part of the JDBC
>> URL which is usually configurable in a Java environment.

This seems more reasonable.  Previously we discussed how to send
usernames and passwords to the database because the encoding they are
sent in must match the encoding of the database these values were set in
(which may be different than the database you're connecting to).  At the
time we decided that a connection option to configure this wasn't the
right idea and now always send these values as UTF-8.  I don't recall
why we made that decision, but checking the archives might provide some
additional information on this case.
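
To illustrate why the encoding of usernames and passwords matters, here is a small sketch (hypothetical class name and example username, not driver code) showing that the same string yields different bytes depending on the charset used to send it. A value stored from one byte sequence will not match the other.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDBC driver.
final class CredentialBytesSketch {

    // Encode a credential the way it would be put on the wire for a given charset.
    static byte[] encode(String value, Charset charset) {
        return value.getBytes(charset);
    }

    public static void main(String[] args) {
        String user = "m\u00fcller"; // example username containing a non-ASCII character
        // The umlaut takes two bytes in UTF-8 and one byte in ISO-8859-1.
        System.out.println(Arrays.toString(encode(user, StandardCharsets.UTF_8)));
        System.out.println(Arrays.toString(encode(user, StandardCharsets.ISO_8859_1)));
    }
}
```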

Kris Jurka