Re: Problem with accessing Russian UTF database
From | Oliver Jowett |
---|---|
Subject | Re: Problem with accessing Russian UTF database |
Date | |
Msg-id | 492C85C2.8040602@opencloud.com |
In response to | Problem with accessing Russian UTF database ("Ronald Vyhmeister" <rvyhmeister@gmail.com>) |
List | pgsql-jdbc |
Ronald Vyhmeister wrote:

> I'm having real trouble with the jdbc driver for postgres... I just
> installed the latest version...
>
> I have a database, UTF8 encoded, which has data in Russian. I can view it
> beautifully using PGAdmin3 or any other ODBC connection.

Perhaps these connections are not actually using UTF8 to interpret the
data, but some other encoding - so while they appear to write encoded
data then retrieve it OK, it's not actually what you think it is when
interpreted as UTF8?

> String URLdb =
> "jdbc:postgresql://127.0.0.1:5432/oldzautest?user=noe&password=genesis&charSet=UNICODE";

You should not need "charSet=UNICODE", though I don't think it'll break
anything.

> <data>
> <db_content>
> <row>
> <contents content = "1" />
> <contents content = "1" />
> <contents content = "?????" />
> <contents content = "????????" />
> <contents content = "?????????" />
> <contents content = "1965-03-10" />
> <contents content = "1" />
> </row>
> </db_content>
> </data>

Perhaps the problem is in the encoding you are using to write out that
XML fragment? Or in whatever tool you are using to view it?

> I've set the client_encoding to UTF8 on the server... What am I doing
> wrong? What am I missing? I'd be thrilled to interact privately with
> someone who has solved what for now is a mystery to me.

You shouldn't need to touch client_encoding for JDBC to work (though
other clients might need it). The JDBC driver forces client_encoding to
UTF8 anyway on connection startup.

It may be useful to examine the actual value of the characters in the
String objects you are dealing with (i.e. print out (int)s.charAt(0)
etc) to check they contain the unicode codepoints you were expecting.

In general the driver "just works" with UTF-8 encoded databases. It's
dealing in terms of Unicode strings internally, so the only transcoding
that goes on is from UTF-8 to UTF-16, which is lossless.
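
A minimal sketch of that check, assuming Java 8 or later. The class and
method names (CodePointDump, describe, survivesUtf8RoundTrip) are
hypothetical and not part of the driver, and the escaped literal only
stands in for a value you would read with ResultSet.getString(...). It
prints the code points of a String, confirms the UTF-8 round trip, and
shows how encoding into a charset without Cyrillic yields the same "?"
characters seen in the XML above:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the PostgreSQL JDBC driver.
final class CodePointDump {

    // Render each Unicode code point of s as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Encode to UTF-8 and decode again; a well-formed String comes back unchanged.
    static boolean survivesUtf8RoundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8).equals(s);
    }

    public static void main(String[] args) {
        // Assumption: stands in for rs.getString(...). Written with escapes
        // so the source file's own encoding cannot affect the test value.
        String fromDb = "\u0418\u0432\u0430\u043D";

        // Cyrillic letters live in the U+0400..U+04FF block.
        System.out.println(describe(fromDb));   // U+0418 U+0432 U+0430 U+043D
        System.out.println(survivesUtf8RoundTrip(fromDb)); // true

        // Encoding into a charset with no Cyrillic substitutes '?' (U+003F)
        // for every character that cannot be represented.
        byte[] ascii = fromDb.getBytes(StandardCharsets.US_ASCII);
        String mangled = new String(ascii, StandardCharsets.US_ASCII);
        System.out.println(describe(mangled));  // U+003F U+003F U+003F U+003F
    }
}
```

If describe() shows values in the U+0400 range for the String that came
out of the ResultSet, the driver handed over the right data and the "?"
is being introduced later, most likely by the writer or stream used to
emit the XML. If it already shows U+003F, the data was damaged before it
reached your code.
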
All the reported problems we've seen in the (recent) past with this
configuration have been either problems with non-JDBC clients getting
confused, or problems with how the resulting String was displayed to the
user, or having non-unicode garbage stored in the database in the first
place.

-O