Thread: invalid UTF-8 byte sequences and iconv
Hi,
We have set up a new server and need to move our database from 7.3 to 8.1.4. On restore I'm getting the 'invalid UTF-8 byte sequence' error message. If I use the command iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql, then the characters are deleted and the restore goes smoothly. The problem is that we want those characters. They are, for example, the degree symbol and the micro symbol. Is there any way to bring these characters over? Thanks in advance.
Karen
Karen Springer wrote:
> Hi,
>
> We have set up a new server and are needing to move our database from
> 7.3 to 8.1.4. On restore I'm getting the 'invalid UTF-8 byte sequence'
> error message. If I use the command iconv -c -f UTF-8 -t UTF-8 -o
> cleanfile.sql dumpfile.sql, then the characters are deleted and the
> restore goes smoothly. The problem is that we want those characters.
> They are for example the degree symbol and the micro symbol. Is there
> anyway to bring these characters over? Thanks in advance.
Huh, maybe using the real source encoding instead? Try, for example, using Latin-1.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
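A minimal sketch of what "using the real source encoding" means, assuming the 7.3 database actually stored single-byte Latin-1 (ISO-8859-1) data, which the thread does not confirm. In Latin-1 the degree sign is byte 0xB0 and the micro sign is byte 0xB5; neither is a valid UTF-8 sequence on its own, which is why iconv -c -f UTF-8 drops them. The function name latin1_to_utf8 and the sample string are illustrative only.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Interpret raw bytes as Latin-1 and re-encode them as UTF-8.

    Assumption: the input really is Latin-1. Every byte value is a valid
    Latin-1 character, so this never fails, even on the wrong input.
    """
    return raw.decode("latin-1").encode("utf-8")


# Hypothetical sample: degree sign (0xB0) and micro sign (0xB5) in Latin-1.
sample = b"25 \xb0C, 10 \xb5m"
converted = latin1_to_utf8(sample)

# The converted bytes are now valid UTF-8 and decode without error.
print(converted.decode("utf-8"))
```

With iconv, the corresponding invocation would presumably be iconv -f ISO-8859-1 -t UTF-8 -o cleanfile.sql dumpfile.sql. If the dump mixes Latin-1 bytes with text that is already valid UTF-8, converting the whole file this way would double-encode the valid parts, so it is worth checking a few known rows first.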
In earlier versions of Postgres, the database allowed invalid byte sequences to be stored. The newer versions check byte sequences correctly and reject invalid ones. So if your dump really is in UTF8 already, you will have to search for the invalid sequences in the dump and replace them with the correct ones. (If you have a lot of them and a big dump, recode might be of help to you.) If the dump is not UTF8, you have to pass the correct encoding to iconv in the procedure you described.
Best regards
Ivo
On Tuesday, 25 July 2006 at 21.04, Karen Springer wrote:
> Hi,
>
> We have set up a new server and are needing to move our database from
> 7.3 to 8.1.4. On restore I'm getting the 'invalid UTF-8 byte sequence'
> error message. If I use the command iconv -c -f UTF-8 -t UTF-8 -o
> cleanfile.sql dumpfile.sql, then the characters are deleted and the
> restore goes smoothly. The problem is that we want those characters.
> They are for example the degree symbol and the micro symbol. Is there
> anyway to bring these characters over? Thanks in advance.
>
> Karen
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
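Searching the dump for the invalid sequences can be sketched as follows. This is only an illustration, not a tool mentioned in the thread: the function name find_invalid_utf8_lines and the sample data are hypothetical, and it relies on Python's strict UTF-8 decoder, which may not flag exactly the same sequences as the PostgreSQL 8.1 check.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, byte_offset_in_line, raw_line) for each line
    that fails strict UTF-8 decoding. Line numbers start at 1."""
    bad = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte in the line.
            bad.append((lineno, exc.start, line))
    return bad


# Hypothetical dump fragment: line 2 holds a lone Latin-1 degree byte (0xB0),
# line 3 holds a correctly encoded UTF-8 micro sign (0xC2 0xB5).
dump = b"ok\n25 \xb0C\n\xc2\xb5m\n"

for lineno, offset, line in find_invalid_utf8_lines(dump):
    print(lineno, offset, line)

# For a real file, the bytes could be read with:
#   data = open("dumpfile.sql", "rb").read()
```

When only a handful of lines are reported, they can be corrected by hand in the dump; when nearly every non-ASCII line is reported, the dump is more likely in a different encoding altogether.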
Hello,
We had a similar problem a few weeks back converting from PostgreSQL 7.3.10 to 7.4.13. After trying various methods, including iconv, we found that the one that worked in our case was to manually fix the data - which was only about 15 records fortunately.
BTW, thanks Ivo for suggesting that solution back then as well :)
Regards,
-Thusitha
Ivo Rossacher <rossacher@bluewin.ch> wrote:
In earlier versions of Postgres, the database allowed invalid byte sequences
to be stored. The newer versions check byte sequences correctly and reject
invalid ones. So if your dump really is in UTF8 already, you will have to
search for the invalid sequences in the dump and replace them with the
correct ones. (If you have a lot of them and a big dump, recode might be of
help to you.) If the dump is not UTF8, you have to pass the correct encoding
to iconv in the procedure you described.
Best regards
Ivo
On Tuesday, 25 July 2006 at 21.04, Karen Springer wrote:
> Hi,
>
> We have set up a new server and are needing to move our database from
> 7.3 to 8.1.4. On restore I'm getting the 'invalid UTF-8 byte sequence'
> error message. If I use the command iconv -c -f UTF-8 -t UTF-8 -o
> cleanfile.sql dumpfile.sql, then the characters are deleted and the
> restore goes smoothly. The problem is that we want those characters.
> They are for example the degree symbol and the micro symbol. Is there
> anyway to bring these characters over? Thanks in advance.
>
> Karen
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster