UNICODE encoding and jdbc related issues
От | Chris Kratz |
---|---|
Тема | UNICODE encoding and jdbc related issues |
Дата | |
Msg-id | 200504041128.00634.chris.kratz@vistashare.com обсуждение исходный текст |
Список | pgsql-general |
Our production database was created with the default SQL_ASCII encoding. It appears that some of our users have entered characters into the system with characters above 127 (accented vowels, etc). None of the tools we use currently have had a problem with this behavior until recently, everything just worked. I was testing some reporting tools this past weekend and have been playing with Jasper reports[1] . Jasper reports is a Java based reporting tool that reads data from the database via JDBC. When I initially tried to generate reports, the jdbc connection would crash with the following message: org.postgresql.util.PSQLException: Invalid character data was found. Googling eventually turned up a message on the pgsql-jdbc list detailing the problem[2]. Basically, java cannot convert these characters above 127 into unicode which is required by java. After some more googling, I found that if I took a recent database dump and then ran it through iconv[3] and then created the database with a unicode encoding, everything worked. 1. Is there any way to do a iconv type translation inline in a sql statement? ie select translate(text_field, unicode) from table.... Btw, set client_encoding=UNICODE does not work in this situation. In fact the JDBC driver for postgres seems to do this automatically. 2. I'm really not sure I want to change the encoding of our main database to Unicode. Is there a performance loss when going to a UNICODE database encoding? What about sorts, etc. 3. Is there any other way around this issue? Or are we living dangerously by trying to store non-ascii data in a database created as ascii encoded? 4. Has anyone else gone through a conversion like this? Are there any gotchas we should look out for? Thanks, -Chris We are using postgres 7.4.5 on Linux. [1] http://jasperreports.sourceforge.net/ [2] http://archives.postgresql.org/pgsql-jdbc/2004-10/msg00280.php [3] iconv -f iso8859-1 -t utf-8 < dbsnapshot.dumpall > dump-utf-8.dumpall -- Chris Kratz
В списке pgsql-general по дате отправления: