Re: A question about postgresql 8.1 and UTF strings
From | Oliver Jowett |
---|---|
Subject | Re: A question about postgresql 8.1 and UTF strings |
Date | |
Msg-id | 44950E11.90508@opencloud.com |
In reply to | A question about postgresql 8.1 and UTF strings ("Yair Zas" <yair.zaslavsky@gmail.com>) |
List | pgsql-jdbc |
Yair Zas wrote:

> System.out.println(user.getBytes().length) - however, instead of seeing
> 8 bytes (2 bytes per each character, 4 characters ), i saw 4 bytes ....
> Can you please tell me what is it that I'm doing wrong?

getBytes() uses the JVM's default encoding to translate the String to bytes. This is usually something like ISO-8859-1, which is a one-byte-per-character encoding that can't represent Hebrew letters, so each of your 4 characters comes out as a single byte.

If you want to generate a representation in a particular encoding (your description implies you're expecting a particular 2-byte-per-character encoding), then you should use the getBytes() variant that takes an encoding name.

This is not something specific to JDBC; it's standard Java. If you are working with characters beyond 7-bit US-ASCII, I'd strongly recommend doing some research into Java's internal string representation and how that is transformed into bytes. The javadoc for Charset is one starting point:

http://java.sun.com/j2se/1.4.2/docs/api/java/nio/charset/Charset.html

-O
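To make the difference concrete, here is a minimal sketch of calling getBytes() with an explicit encoding. The class name, the helper encodedLength() and the sample string are made up for illustration, and it assumes Java 7 or later for StandardCharsets. On a 1.4.2-era JVM you would pass the encoding name instead, e.g. getBytes("UTF-8"), and handle UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Number of bytes produced when s is encoded with the given charset.
    // (Hypothetical helper, not part of any library.)
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as Unicode escapes so the result does
        // not depend on the encoding of this source file. (Assumed sample value.)
        String user = "\u05E9\u05DC\u05D5\u05DD";

        // No-argument getBytes() uses the JVM's default charset,
        // so this number varies from platform to platform.
        System.out.println("default:    " + user.getBytes().length);

        // Explicit charsets give the same answer everywhere.
        System.out.println("UTF-8:      " + encodedLength(user, StandardCharsets.UTF_8));      // 8
        System.out.println("UTF-16BE:   " + encodedLength(user, StandardCharsets.UTF_16BE));   // 8
        // ISO-8859-1 has no Hebrew letters; each one is replaced by a single '?' byte.
        System.out.println("ISO-8859-1: " + encodedLength(user, StandardCharsets.ISO_8859_1)); // 4
    }
}
```

Both UTF-8 and UTF-16BE happen to use 2 bytes for each Hebrew letter, which matches the 8 bytes you expected. The 4-byte result is what a one-byte-per-character default encoding produces.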