Discussion: Re: [JDBC] ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()
Barry Lind wrote:
> Joseph,
>
> In postgres UNICODE means utf8.

Which differs from java unicode?

I notice there is no way to change a database's encoding. If I just change the encoding type in the pg_database to latin1 will there be data loss?

>
> --Barry
>
> Joseph Shraibman wrote:
>
>> Barry Lind wrote:
>>
>>> Joseph,
>>>
>>> The problem is that your database claims to be ASCII, but you are
>>> storing non-ascii data in it.
>>>
>>> As of 7.3 the jdbc driver has the server convert from the database
>>> character set to UTF8. Then send the data to the driver in UTF8 and
>>> the driver then decodes the UTF8 to java unicode.
>>
>> I see this in my postgres log when I connect via jdbc:
>>
>> LOG: query: set datestyle to 'ISO'; select version(), case when
>> pg_encoding_to_char(1) = 'SQL_ASCII' then 'UNKNOWN' else
>> getdatabaseencoding() end;
>> LOG: query: set client_encoding = 'UNICODE'; show autocommit
>>
>> So if client_encoding is unicode why is the driver trying to convert
>> from UTF8?

--
Joseph Shraibman
joseph@xtenit.com
Increase signal to noise ratio. http://xis.xtenit.com
Joseph Shraibman wrote:
>> In postgres UNICODE means utf8.
>
> Which differs from java unicode?

Yes. Unicode in java is 16 bit characters (I think the term for this is UCS2), two bytes for each character, whereas utf8 is a variable length encoding with characters represented by 1, 2 or 3 bytes.

> I notice there is no way to change a database's encoding. If I just
> change the encoding type in the pg_database to latin1 will there be data
> loss?

The recommended way to do this would be to dump the contents of the database, create a new database with the desired character set and then import the data into that new database. I don't know if changing pg_database directly would work or not.

--Barry
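To make the size difference concrete, here is a minimal sketch in plain Java (no JDBC involved). The class name `EncodingWidths` and the helper `utf8Length` are made up for illustration. It counts how many bytes a string takes in UTF-8 versus how many 16-bit `char` units it takes in a Java `String`. Note that Java strings are UTF-16 internally, so characters outside the Basic Multilingual Plane take two `char` units and four UTF-8 bytes; the sample characters below all fit in one `char`.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Number of bytes the string occupies once encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII.
        String ascii = "e";          // U+0065
        String accented = "\u00e9";  // U+00E9, e with acute accent
        String euro = "\u20ac";      // U+20AC, euro sign

        System.out.println(utf8Length(ascii));     // 1 byte
        System.out.println(utf8Length(accented));  // 2 bytes
        System.out.println(utf8Length(euro));      // 3 bytes

        // Each of these is a single 16-bit char inside a Java String.
        System.out.println(accented.length());     // 1
        System.out.println(euro.length());         // 1
    }
}
```

The driver has to perform this kind of byte-to-char decoding on everything the server sends, so bytes that are not valid UTF-8 can trip it up.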
Character Encoding WAS: ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()
From
Joseph Shraibman
Date:
Barry Lind wrote:
>
> Joseph Shraibman wrote:
>> I notice there is no way to change a database's encoding. If I just
>> change the encoding type in the pg_database to latin1 will there be
>> data loss?
>
> The recommended way to do this would be to dump the contents of the
> database, create a new database with the desired character set and then
> import the data into that new database. I don't know if changing
> pg_database directly would work or not.
>

That didn't work. When I tried that Oné turned into Oné, which confuses me because I thought my problem was that I was storing latin1 chars in a text field that was supposed to only have the lower ascii bits. Oh well, I guess it is dump/reload time.
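For readers who have not seen this kind of garbling before, the sketch below shows one common way it arises: bytes written in one encoding are read back as if they were in another. It is an assumption that anything like this happened in the test described above; the class name `MojibakeDemo` and the method `misdecode` are hypothetical, and only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode as UTF-8, then (wrongly) decode the same bytes as Latin-1.
    public static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "On\u00e9";  // "On" followed by e-acute
        String garbled = misdecode(original);

        // The two UTF-8 bytes of e-acute (0xC3 0xA9) come back as two
        // separate Latin-1 characters, so the string grows by one char.
        System.out.println(original.length());  // 3
        System.out.println(garbled.length());   // 4
        System.out.println(garbled.equals("On\u00c3\u00a9"));  // true
    }
}
```

Relabelling a database's encoding without converting the stored bytes can produce this sort of mismatch, which is one reason a dump and reload into a database created with the right encoding is the safer route.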
Re: Character Encoding WAS: ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()
From
Joseph Shraibman
Date:
Joseph Shraibman wrote:
> Barry Lind wrote:
>> Joseph Shraibman wrote:
>>> I notice there is no way to change a database's encoding. If I just
>>> change the encoding type in the pg_database to latin1 will there be
>>> data loss?
>>
>> The recommended way to do this would be to dump the contents of the
>> database, create a new database with the desired character set and
>> then import the data into that new database. I don't know if changing
>> pg_database directly would work or not.
>>
> That didn't work.

Actually it did. My test data was flawed. What didn't work is editing the dump to change the type to unicode.