Re: Problem with utf8 encoding
From | Andrew McMillan |
---|---|
Subject | Re: Problem with utf8 encoding |
Date | |
Msg-id | 1259832914.8024.823.camel@happy.home.mcmillan.net.nz |
In reply to | Problem with utf8 encoding (Jorge Miranda Castañeda <jmirandac.85@gmail.com>) |
Responses | Re: Problem with utf8 encoding |
List | pgsql-php |

On Thu, 2009-12-03 at 02:00 -0500, Jorge Miranda Castañeda wrote:
> Hello everyone!
>
> I'm working in a project using postgres, propel, and php.
>
> My development environment is:
> SO: Windows vista Business SP2
> Postgres: Postgres v8.4
> Propel: Propel generator/runtime v1.4
> PHP: PHP v5.3
>
> Currently I'm struggling with a problem caused by the encoding.
> Everytime I try to insert a row into the table CURRENCY, which has ID,
> DESC, and SYMBOL as its columns, I get the following error:
> Unable to execute INSERT statement. [wrapped: SQLSTATE[22021]:
> Character not in repertoire: 7 ERROR: invalid byte sequence for
> encoding "UTF8": 0x80 HINT: This error can also happen if the byte
> sequence does not match the encoding expected by the server, which is
> controlled by "client_encoding".]
>
> I've created the database using this sentence:
> CREATE DATABASE sbs
> WITH OWNER = sbsadmin
> ENCODING = 'UTF8'
> LC_COLLATE = 'Spanish_Peru.1252'
> LC_CTYPE = 'Spanish_Peru.1252'
> CONNECTION LIMIT = -1;

Hola Jorge,

I suspect it's the LC_COLLATE and LC_CTYPE that you have there. I don't *know* this, but they *look* like they are some weird sort of collation/ctype based on the misguided Windows-1252 encoding. Sadly, Windows provides data in this encoding into web forms where the accept charset is supposedly only ISO-8859.

In Windows-1252 the Euro currency symbol is somewhere in the 0x80 - 0x9f range - possibly it is 0x80, in fact.

I think you would be better to use a consistent locale like es_PE.UTF-8, though if your data is 1252 encoded then you might need to iconv it first.

If you have data which is a mix of ISO-8859-1, Windows-1252 and UTF-8 then I can point you at a wee bit of PHP code I wrote which will look at each character in a string and only iconv from 8859/1252 -> UTF-8 if it is a high-bit byte which is not part of a valid UTF-8 character already.
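
For readers who want to see the idea without following the link, here is a minimal sketch of that byte-by-byte repair. It is written in Python for brevity and is not the code from AWLUtilities.php; the function names `utf8_sequence_length` and `to_utf8` are made up for this illustration. It assumes that any high-bit byte which does not start a valid UTF-8 sequence is Windows-1252, falling back to ISO-8859-1 for the few byte values Windows-1252 leaves undefined. Byte 0x80 is indeed the Euro sign in Windows-1252, which matches the byte named in the error message.

```python
def utf8_sequence_length(data: bytes, i: int) -> int:
    """Return the length of the valid UTF-8 sequence starting at data[i], or 0."""
    first = data[i]
    if first < 0x80:
        return 1  # plain ASCII
    if 0xC2 <= first <= 0xDF:
        length = 2
    elif 0xE0 <= first <= 0xEF:
        length = 3
    elif 0xF0 <= first <= 0xF4:
        length = 4
    else:
        return 0  # continuation byte or a byte that never starts UTF-8
    try:
        # The strict decoder rejects truncated, overlong and surrogate forms.
        data[i:i + length].decode("utf-8")
    except UnicodeDecodeError:
        return 0
    return length


def to_utf8(data: bytes) -> bytes:
    """Re-encode stray Windows-1252 / ISO-8859-1 bytes, keeping valid UTF-8 as is."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data, i)
        if n:
            out += data[i:i + n]  # already valid, copy unchanged
            i += n
            continue
        byte = data[i:i + 1]
        try:
            text = byte.decode("cp1252")
        except UnicodeDecodeError:
            # Assumption: bytes undefined in Windows-1252 are treated as ISO-8859-1.
            text = byte.decode("latin-1")
        out += text.encode("utf-8")
        i += 1
    return bytes(out)


# Example: a Latin-1 "ñ", a proper UTF-8 Euro sign, and a Windows-1252 Euro byte.
mixed = b"Casta\xf1eda " + "€".encode("utf-8") + b" \x80"
cleaned = to_utf8(mixed)
```

This is a heuristic: a run of ISO-8859-1 bytes that happens to form a valid UTF-8 sequence is left untouched, so it is best suited to mostly-clean data with occasional stray bytes.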

The PHP code is here:
http://repo.or.cz/w/awl.git/blob/HEAD:/inc/AWLUtilities.php

You need both of the last two functions - call the first one during initialisation, and use the second one to clean the strings.

Cheers,
Andrew McMillan.

------------------------------------------------------------------------
andrew (AT) morphoss (DOT) com                            +64(272)DEBIAN
You will feel hungry again in another hour.
------------------------------------------------------------------------