Discussion: Encoding Problem
Hi,

I'm using Postgresql since the first 8.0 version (Windows2000) and it gets better with every new version. But one problem remains with the jdbc driver: The encoding of umlaute is still wrong. This problem occurs also in all pgAdmin III versions. The psql client on the other hand works fine. I tried different server encodings like sql_ascii or latin1 but I always get these strange characters in pgAdmin III and with jdbc. I also tried to specify the charSet property with the connection string in jdbc but that didn't help.

Does anyone know how I can get the correct strings with umlaut (e.g. ä,ö,ü) in java? I thought this problem would be solved in the newer versions, but even in rc2 the problem still exists.

Greetings,
Jérôme
On Wed, 12 Jan 2005, Jerome Colombie wrote:

> I'm using Postgresql since the first 8.0 version (Windows2000) and it
> gets better with every new version. But one problem remains with the
> jdbc driver: The encoding of umlaute is still wrong. This problem occurs
> also in all pgAdmin III versions. The psql client on the other hand
> works fine. I tried different server encodings like sql_ascii or latin1
> but I always get these strange characters in pgAdmin III and with jdbc.

Note that pgadmin and jdbc are completely separate, so I can only speak for the java side of things. For the pg jdbc driver you cannot use a database with sql_ascii encoding. You must use a real encoding, latin1 should be fine.

> I also tried to specify the charSet property with the connection string
> in jdbc but that didn't help.

If you read the documentation to find the charSet property you should have also seen the note that said this is completely ignored for server versions >= 7.3.

> Does anyone know how I can get the correct strings with umlaut (e.g.
> ä,ö,ü) in java? I thought this problem would be solved in the newer
> versions, but even in rc2 the problem still exists.
>

I have no reason to believe there is a problem with the server/jdbc encoding handling. I would suspect the problem is related to how you are trying to enter/display these values. If you can reproduce this problem with a standalone Java program that runs against a non sql-ascii database and does direct string comparisons (doesn't rely on things like System.out.println), then please post such a program here.

Kris Jurka
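A minimal sketch of the kind of direct string comparison described above, assuming the expected value is written with Unicode escapes so that neither the source file encoding nor the console encoding can interfere. The class name `UmlautCheck` and its methods are hypothetical and not part of the JDBC driver; in a real test the `actual` value would come from `rs.getString("t")` rather than from a literal.

```java
// Hypothetical helper, not part of the PostgreSQL JDBC driver.
// It compares strings directly and renders them as ASCII-only code points,
// so the console encoding never influences the result.
class UmlautCheck {
    // a, a-umlaut, o-umlaut, u-umlaut written as escapes (source stays ASCII).
    static final String EXPECTED = "a\u00e4\u00f6\u00fc";

    // Render each UTF-16 char as U+XXXX, separated by single spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Direct comparison, no printing involved.
    static boolean matchesExpected(String actual) {
        return EXPECTED.equals(actual);
    }

    public static void main(String[] args) {
        // Assumption: a real test would read this from rs.getString("t").
        String actual = "a\u00e4\u00f6\u00fc";
        System.out.println(matchesExpected(actual) ? "MATCH" : "MISMATCH");
        System.out.println(codePoints(actual));
    }
}
```

If the comparison fails, the `codePoints` output shows which characters actually arrived, without depending on how the terminal renders them.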
Jerome Colombie wrote:

> I tried different server encodings like sql_ascii or latin1 but I always
> get these strange characters in pgAdmin III and with jdbc. I also tried
> to specify the charSet property with the connection string in jdbc but
> that didn't help.
> Does anyone know how I can get the correct strings with umlaut (e.g.
> ä,ö,ü) in java?
> I thought this problem would be solved in the newer versions, but even
> in rc2 the problem still exists.

It should "just work" when using a server encoding of LATIN1 (I assume you can represent umlaut-ed characters in LATIN1?) or UNICODE.

Can you provide some example code that fails? Also, which driver version are you using?

-O
On Thu, 13 Jan 2005, Oliver Jowett wrote:

> It should "just work" when using a server encoding of LATIN1 (I assume
> you can represent umlaut-ed characters in LATIN1?) or UNICODE.
>

Just an FYI, the windows version does not support unicode. See:

http://pginstaller.projects.postgresql.org/FAQ_windows.html#2.6

Kris Jurka
On Fri, 14 Jan 2005, Jerome Colombie wrote:

> I made a low-level program which prints to the console. It seems that
> the jdbc driver is correct, although the Strings need some
> postprocessing in Java. I want to create html output, so I don't know if
> I have to change my windows settings, java settings (Locale) or just
> need to reformat the strings in java code.
> My test program looks like this:
>
> String t1 = new String("aäöü".getBytes(), "ISO-8859-1");
> String t1 = new String("aäöü".getBytes(), "UTF-8");

At least one of these is clearly bogus. I don't know what your default encoding is, but getBytes() will return data in that encoding, from there you are telling Java to interpret that one piece of data in two different ways, one of these must be wrong.

> How can I get the correct output in html with java code? I know that
> technically it hasn't got to do with jdbc but I still hope someone can
> give me a solution so I don't need to change the java code. I hope I can
> solve this problem by changing either the database configuration or the
> java or windows locale.
>

To produce correctly encoded html, you need to ensure that your java environment's default encoding matches up with the encoding you have set for the page.

Kris Jurka
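One way to keep the page encoding and the byte encoding in step, sketched here under assumptions: rather than relying on the JVM default, the charset is chosen explicitly when the page is turned into bytes, and the same charset name is written into the page. The class `HtmlEncodingSketch`, its `render` method, and the use of a `meta` tag to declare the encoding are illustrative choices, not something prescribed in this thread. The sketch does no HTML escaping.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the declared page charset and the charset used to
// encode the bytes are taken from the same variable, so they cannot diverge.
class HtmlEncodingSketch {
    // Builds a tiny page and encodes it with the charset it declares.
    // Note: the text is inserted as-is, without HTML escaping.
    static byte[] render(String text, Charset charset) {
        String page = "<html><head>"
                + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
                + charset.name() + "\">"
                + "</head><body>" + text + "</body></html>";
        return page.getBytes(charset);
    }

    public static void main(String[] args) {
        // a, a-umlaut, o-umlaut, u-umlaut as escapes (source stays ASCII).
        String text = "a\u00e4\u00f6\u00fc";
        byte[] latin1Page = render(text, StandardCharsets.ISO_8859_1);
        // Decoding with the declared charset recovers the original text.
        String decoded = new String(latin1Page, StandardCharsets.ISO_8859_1);
        System.out.println(decoded.contains(text));
    }
}
```

In a servlet or similar setting the same idea applies: whatever charset is announced to the browser should be the one used to write the response bytes.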
Jerome Colombie wrote:

> byte[] temp = test.getBytes();

getBytes() uses the JVM's default encoding to convert from the internal string char[] representation, which might be wrong. More interesting is test.getBytes("UTF-8") or test.getChars() or test.charAt(x)

-O
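A short sketch of the difference, assuming a Java version that has `StandardCharsets` (the `getBytes(Charset)` overload is used here instead of the `getBytes("UTF-8")` string form mentioned above). The class name `ExplicitBytes` and its helper methods are hypothetical. It shows the bytes for two explicit encodings, the char values themselves, and what happens when ISO-8859-1 bytes are decoded as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: inspects a string without relying on the
// JVM's default encoding.
class ExplicitBytes {
    // Bytes of s in an explicitly named charset, rendered as text.
    static String bytesOf(String s, Charset cs) {
        return Arrays.toString(s.getBytes(cs));
    }

    // Numeric values of the string's UTF-16 chars, independent of any encoding.
    static String charsOf(String s) {
        int[] values = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            values[i] = s.charAt(i);
        }
        return Arrays.toString(values);
    }

    public static void main(String[] args) {
        // a, a-umlaut, o-umlaut, u-umlaut as escapes (source stays ASCII).
        String test = "a\u00e4\u00f6\u00fc";

        System.out.println(bytesOf(test, StandardCharsets.ISO_8859_1));
        // prints [97, -28, -10, -4]
        System.out.println(bytesOf(test, StandardCharsets.UTF_8));
        // prints [97, -61, -92, -61, -74, -61, -68]
        System.out.println(charsOf(test));
        // prints [97, 228, 246, 252]

        // Decoding ISO-8859-1 bytes as UTF-8 does not give the original back.
        byte[] latin1 = test.getBytes(StandardCharsets.ISO_8859_1);
        String wrong = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(wrong.equals(test));
        // prints false
    }
}
```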
Hi,

I made a low-level program which prints to the console. It seems that the jdbc driver is correct, although the Strings need some postprocessing in Java. I want to create html output, so I don't know if I have to change my windows settings, java settings (Locale) or just need to reformat the strings in java code.

My test program looks like this:

*********

bh=# create table test ( t VARCHAR(5) );
bh=# insert into test values ('aäöü');
bh=# select * from test;
  t
------
 aäöü
(1 row)

bh=# select getdatabaseencoding();
 getdatabaseencoding
---------------------
 LATIN1
(1 row)

bh=# SHOW client_encoding;
 client_encoding
-----------------
 LATIN1
(1 row)

bh=# select version();
                                                 version
--------------------------------------------------------------------------------------------------------
 PostgreSQL 8.0.0rc2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.3.1 (mingw special 20030804-1)
(1 row)

*********

import java.sql.*;
import java.util.Properties;

public class BaseDAO {
    protected static String dbHost = "localhost";
    protected static String dbUrl = "jdbc:postgresql://" + dbHost + ":5432/bh";
    protected static String dbUser = "postgres";
    protected static String dbPassword = "***";
    protected static String dbDriver = "org.postgresql.Driver";

    static {
        //Register driver
        try {
            DriverManager.registerDriver((Driver)Class.forName(dbDriver).newInstance());
        } catch (Exception e){
            System.out.println("Error: " + e);
        }
    }

    public BaseDAO() {
        super();
        try {
            Properties props = new Properties();
            props.put("user", dbUser);
            props.put("password", dbPassword);
            Connection con = DriverManager.getConnection(dbUrl, props);
            ResultSet rs = con.createStatement().executeQuery("select * from test");
            if (rs.next()) {
                byte[] temp = rs.getString("t").getBytes();
                System.out.println("Database String");
                System.out.println(rs.getString("t"));
                System.out.println("Database Bytes");
                for (int i = 0; i < temp.length; i++) {
                    System.out.println(temp[i]);
                }
            }
        } catch(SQLException se) {
            System.out.println("Error: " + se);
        }
    }

    public static void main(String[] args) {
        BaseDAO dao = new BaseDAO();
        String test = "aäöü";
        System.out.println("Java String");
        System.out.println(test);
        System.out.println("Java Bytes");
        byte[] temp = test.getBytes();
        for (int i = 0; i < temp.length; i++) {
            System.out.println(temp[i]);
        }
        try {
            System.out.println("ISO-8859-1");
            String t1 = new String("aäöü".getBytes(), "ISO-8859-1");
            temp = t1.getBytes();
            for (int i = 0; i < temp.length; i++) {
                System.out.println(temp[i]);
            }
        } catch (java.io.UnsupportedEncodingException e) {
            System.out.println("Encoding exception: " + e);
        }
        try {
            System.out.println("UTF-8");
            String t1 = new String("aäöü".getBytes(), "UTF-8");
            temp = t1.getBytes();
            for (int i = 0; i < temp.length; i++) {
                System.out.println(temp[i]);
            }
        } catch (java.io.UnsupportedEncodingException e) {
            System.out.println("Encoding exception: " + e);
        }
    }
}

*********

D:\temp\test>javac BaseDAO.java

D:\temp\test>java -classpath d:\projects\jar\pg74.214.jdbc3.jar;. BaseDAO
Database String
a???
Database Bytes
97
63
63
63
Java String
aõ÷³
Java Bytes
97
-28
-10
-4
ISO-8859-1
97
-28
-10
-4
UTF-8
97
63
63

*********

How can I get the correct output in html with java code? I know that technically it hasn't got to do with jdbc but I still hope someone can give me a solution so I don't need to change the java code. I hope I can solve this problem by changing either the database configuration or the java or windows locale.

Regards,
Jerome