Re: 8.0, UTF8, and CLIENT_ENCODING
From | Paul Ramsey |
---|---|
Subject | Re: 8.0, UTF8, and CLIENT_ENCODING |
Date | |
Msg-id | D84BEF92-179D-4197-A686-FA80DA8B7961@refractions.net |
In reply to | Re: 8.0, UTF8, and CLIENT_ENCODING (Michael Glaesemann <grzm@seespotcode.net>) |
List | pgsql-general |
Thanks all for the information. Summary is:

- 8.0 wasn't very strict, and allowed the illegal values in, instead of mapping them over into UTF-8 space
- the values can be stripped with iconv -c
- 8.2 should be more strict

I'm in the midst of my upgrade to 8.2 now; hopefully the LATIN1->UTF8 conversion will now map the odd characters cleanly into UTF-8 space.

On 17-May-07, at 3:25 PM, Michael Glaesemann wrote:

>
> On May 17, 2007, at 16:47 , PFC wrote:
>
>>> and put that in the form. Instead of being mapped to 2-byte UTF8
>>> high-bit equivalents, they are going into the database directly
>>> as one-byte values > 127. That is, as illegal UTF8 values.
>>
>> Sometimes you also get HTML entities in the mix. Who knows.
>> All my web forms are UTF-8 back to back, it just works. Was I
>> lucky ?
>> Normally postgres rejects illegal UTF8 values, you wouldn't be
>> able to insert them...
>
> 8.0 and earlier weren't quite as strict as they should have been. See
> the note at the end of the migration instructions in the release
> notes for 8.1[1] That may have been part of the issue here.
>
> Michael Glaesemann
> grzm seespotcode net
>
> [1](http://www.postgresql.org/docs/8.2/interactive/release-8-1.html#AEN80196)
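For anyone hitting the same problem, here is a minimal Python sketch of the three behaviours discussed in this thread: a LATIN1 byte above 127 is not valid UTF-8 on its own, a proper LATIN1->UTF8 conversion maps it to a two-byte sequence, and dropping the invalid bytes is roughly what `iconv -c` does. The sample string and the helper names (`is_valid_utf8`, `latin1_to_utf8`, `strip_invalid_utf8`) are made up for illustration; this is not what PostgreSQL or iconv run internally.

```python
# Hypothetical sample: "café" as a LATIN1 form would submit it.
# The "é" is the single byte 0xE9, which is > 127.
latin1_bytes = "caf\u00e9".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret LATIN1 bytes as text and re-encode them as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def strip_invalid_utf8(data: bytes) -> bytes:
    """Drop bytes that are not valid UTF-8 (roughly the effect of iconv -c)."""
    return data.decode("utf-8", errors="ignore").encode("utf-8")


print(is_valid_utf8(latin1_bytes))       # raw LATIN1 bytes stored as "UTF8"
print(latin1_to_utf8(latin1_bytes))      # the two-byte mapping
print(strip_invalid_utf8(latin1_bytes))  # the lossy clean-up
```

Note that stripping loses the character entirely, while the conversion keeps it, so stripping only makes sense when the original encoding of the bad bytes is unknown.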