Data corruption with BYTEA and SQL_ASCII encoding
From | Marcus Better |
---|---|
Subject | Data corruption with BYTEA and SQL_ASCII encoding |
Date | |
Msg-id | 15412.58065.258060.716823@kakmonster.dactylis.com |
List | pgsql-jdbc |
Hi,

I am using PostgreSQL 7.1.3 with the latest (7.2) development JDBC driver. My tables contain binary data in BYTEA columns. When my database has SQL_ASCII as its default encoding, reading the data with getBytes() gives strange results: the data has the correct length, but some bytes (I believe 0xa0 and higher) are replaced with 0xfd.

I traced the problem to the getBytes method in org/postgresql/jdbc2/ResultSet.java in the JDBC driver:

    //Version 7.2 supports the bytea datatype for byte arrays
    if (fields[columnIndex - 1].getPGType().equals("bytea"))
    {
        return PGbytea.toBytes(getString(columnIndex));
    }

I checked the actual contents of the column as returned from the database. It is a string that contains non-ASCII characters, like this:

    \012¿_Ãeo7\223\2316#Ph©\021ê\217\212åI\217k·h:"\230ÜÔ\034ÅW

This string agrees with the contents of the database. I also checked that PGbytea.toBytes() translates this string correctly. That leaves the call to getString(). getString() tries to decode the string using the default encoding of the database (SQL_ASCII), and this is what produces the erroneous results.

It seems strange that the string returned from the database is not ASCII at all. This is the root of the problem.

Changing the encoding of the database to LATIN1 solves the problem. Does this mean that I should not use SQL_ASCII databases with binary data? Can anyone tell me if there is a better solution, or if I'm doing something wrong here?

Thanks,

Marcus
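The symptom can be reproduced outside the driver. The following is a minimal sketch, not the driver's code: the class `ByteaDecodeDemo` and the helper `decodeThenNarrow` are hypothetical names, and narrowing each decoded char to a byte is only an assumed stand-in for what a text-to-bytes conversion does with unescaped characters. It shows that decoding raw bytes of 0x80 and above as US-ASCII yields the replacement character U+FFFD, whose low byte is 0xfd, while decoding them as ISO-8859-1 (LATIN1) preserves each byte value.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteaDecodeDemo {
    // Hypothetical helper: decode raw bytes with the given charset, then
    // narrow each resulting char back to a byte. This is an assumption about
    // how the corruption arises, not a copy of the driver's logic.
    static byte[] decodeThenNarrow(byte[] raw, Charset charset) {
        String decoded = new String(raw, charset);
        byte[] out = new byte[decoded.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) decoded.charAt(i); // keeps only the low 8 bits
        }
        return out;
    }

    // Format bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two ASCII bytes and two bytes in the range the post reports as damaged.
        byte[] raw = {0x0a, (byte) 0xa0, (byte) 0xe5, 0x41};

        // US-ASCII cannot represent 0xa0 or 0xe5, so each decodes to U+FFFD.
        System.out.println(hex(decodeThenNarrow(raw, StandardCharsets.US_ASCII)));
        // prints "0a fd fd 41"

        // ISO-8859-1 maps every byte to the char with the same code point.
        System.out.println(hex(decodeThenNarrow(raw, StandardCharsets.ISO_8859_1)));
        // prints "0a a0 e5 41"
    }
}
```

The length is preserved in both cases because each input byte decodes to exactly one char, which matches the observation that the returned data has the correct length but the wrong contents.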