Discussion: invalid byte sequence for encoding "UTF8":

invalid byte sequence for encoding "UTF8":

From
kulmacet101@kulmacet.com
Date:
All,

I have a new PostgreSQL 8.3.4 build on Linux (CentOS 5) with PHP talking
to this database. If I try to update or insert data that contains special
characters, I get this error:

ERROR:  invalid byte sequence for encoding "UTF8": 0xa9
HINT:  This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".
STATEMENT:  UPDATE preferences SET property = $1,preference_value =
$2,comment = $3,topic = $4 WHERE app_hash =
'50e2606ed950e8021d64349b49f4ee48'

I have read some articles about client_encoding but I do not know how to
get around this error.

Any help or support appreciated.
Thanks in advance,
Kulmacet


Re: invalid byte sequence for encoding "UTF8":

From
ries van Twisk
Date:
On Jan 8, 2009, at 2:34 PM, kulmacet101@kulmacet.com wrote:

> All,
>
> I have a new postgresql<8.3.4> build on linux<CentOS5> with PHP
> talking to
> this database. If I try and update or insert on data that has special
> characters I get this error:
>
> ERROR:  invalid byte sequence for encoding "UTF8": 0xa9
> HINT:  This error can also happen if the byte sequence does not
> match the
> encoding expected by the server, which is controlled by
> "client_encoding".
> STATEMENT:  UPDATE preferences SET property = $1,preference_value =
> $2,comment = $3,topic = $4 WHERE app_hash =
> '50e2606ed950e8021d64349b49f4ee48'
>
> I have read some articles about client_encoding but I do not know
> how to
> get around this error.
>
> Any help or support appreciated.
> Thanks in advance,
> Kulmacet




Kulmacet,

Is your source data also in UTF-8?

Ries





Re: invalid byte sequence for encoding "UTF8":

From
kulmacet101@kulmacet.com
Date:
I'm not sure how to determine if the source data is UTF-8. This data is
coming from a post to a form.




Re: invalid byte sequence for encoding "UTF8":

From
ries van Twisk
Date:
On Jan 8, 2009, at 3:08 PM, kulmacet101@kulmacet.com wrote:

> I'm not sure how to determine if the source data is UTF-8. This data
> is
> coming from a post to a form.



Your website might not be in UTF-8 in that case.

Ries





Re: invalid byte sequence for encoding "UTF8":

From
Bastiaan Olij
Date:
Hi Kulmacet,

If not specified otherwise, HTML will most likely be treated as
ISO-8859-1 (Latin-1), though different browsers may default to other
character sets. You can check the header data of the form post to find
out which character set is used, but you'll need to map it to a character
set that Postgres knows and can convert to UTF-8.
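
To make the "is my source data UTF-8?" question concrete, here is a
minimal sketch in Python (not the PHP used on the site; the function name
is_valid_utf8 is made up for this illustration). It only checks whether a
byte string decodes cleanly as UTF-8. The lone byte 0xa9 from the error
message is the Latin-1 encoding of the copyright sign, and on its own it
is not a valid UTF-8 sequence.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same character in two encodings.
latin1_bytes = "©".encode("latin-1")  # one byte: 0xa9
utf8_bytes = "©".encode("utf-8")      # two bytes: 0xc2 0xa9

print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes))
print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))
```

A check like this cannot prove that data really is UTF-8 (plain ASCII
passes too), but a failure is a strong sign that the form data arrived in
some other encoding.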

Anyway, you could set the client encoding to Latin-1 by issuing the SQL
command SET client_encoding TO 'LATIN1';
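
With the client encoding set to Latin-1, the server converts incoming
text to the database encoding for you. As a rough sketch of what that
conversion amounts to, assuming the form data really is Latin-1, the same
transformation can be written in Python (latin1_to_utf8 is a hypothetical
helper name, not part of any library):

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (hypothetical helper).

    Assumes the input really is Latin-1. Every byte value is valid
    Latin-1, so this never raises, even when the assumption is wrong.
    """
    return data.decode("latin-1").encode("utf-8")


converted = latin1_to_utf8(b"\xa9 2009")
print(converted.hex(" "))
```

Because the conversion never fails, it will silently produce wrong
characters if the data was in a different encoding to begin with, which
is one reason to prefer fixing the encoding of the page itself.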

What might be easier and safer is to change the HTML side of things. I
don't know whether PHP also generates the form page in your case. If so,
you can simply put:
header('Content-type: text/html; charset=utf-8');
at the beginning of your code, before any output is sent.

Alternatively you can put it in the html file itself:
<html .... >
  <head>
     ...
     <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
     ...
  </head>
....
</html>

Setting the character set of your form page to UTF-8 should result in
the browser returning the entered data as UTF-8 as well. Of course, you
need to ensure that any data in the HTML page is also encoded as UTF-8,
or the browser will misrepresent it (only for characters outside the
ASCII range, i.e. above 127, of course)!
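
The misrepresentation is easy to reproduce outside a browser. The sketch
below (Python, for illustration only) decodes UTF-8 bytes as if they were
Latin-1, which is roughly what happens when a page's declared character
set does not match its actual bytes:

```python
# UTF-8 bytes for the copyright sign, read back with the wrong charset.
utf8_bytes = "©".encode("utf-8")        # 0xc2 0xa9
misread = utf8_bytes.decode("latin-1")  # each byte becomes its own character

# Pure ASCII text is identical in both encodings, so it survives intact.
ascii_roundtrip = "Copyright".encode("utf-8").decode("latin-1")

print(misread)          # Â©
print(ascii_roundtrip)  # Copyright
```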

If you've saved data as UTF-8 in your database, any data written from
the database back into the page will also be UTF-8. You do need to be
careful with any string manipulation on these strings, though: not all
characters in UTF-8 are a single byte and, as far as I am aware, PHP's
standard string functions ignore this fact completely. They treat each
byte as a single character, even if it is only part of a character.
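
The byte-versus-character pitfall can be shown with a short Python sketch
(again only an illustration, not PHP; truncate_utf8 is a hypothetical
helper). Cutting a UTF-8 byte string at an arbitrary byte offset can
split a multi-byte character and leave invalid UTF-8 behind, which is
exactly the kind of data the server then rejects.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> str:
    """Cut UTF-8 bytes to at most max_bytes and drop undecodable bytes.

    Hypothetical helper. errors="ignore" discards a character that was
    split by the cut, but it would also hide any other invalid bytes.
    """
    return data[:max_bytes].decode("utf-8", errors="ignore")


text = "café ©"
data = text.encode("utf-8")

print(len(text))  # 6 characters
print(len(data))  # 8 bytes: "é" and "©" take two bytes each

naive_cut = data[:4]  # "caf" plus only the first byte of "é"
try:
    naive_cut.decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

print(cut_is_valid)            # False
print(truncate_utf8(data, 4))  # caf
```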

See this page for more info:

http://www.w3.org/International/O-HTTP-charset

P.S. Personally, I prefer setting the HTML page to UTF-8 and just being
a bit careful with what I do with the resulting data in PHP. In HTML you
can potentially get a mix of character sets, which just gets you into
trouble in the long term.

Greetz,

Bas


-- Kindest Regards, Bastiaan Olij e-mail/MSN: bastiaan@basenlily.nl web:
http://www.basenlily.nl Skype: Mux213
http://www.linkedin.com/in/bastiaanolij