Discussion: Migration
Hi Everyone,
I’m a newbie to the list, and look forward to helping out where I can if possible.
I have an issue which I’ve not been able to get around. I have a PostgreSQL 7.4 server with a number of databases:
    Name    |  Owner   | Encoding
------------+----------+----------
 xxx45      | postgres | UNICODE
 xxx30      | postgres | UNICODE
 xxx30_copy | postgres | UNICODE
 yyy30      | postgres | UNICODE
 xxxxxxx    | postgres | UNICODE
 xxxx       | postgres | UNICODE
I have installed PostgreSQL 8.1 on a new box, and when I try to create a new database, even when specifying --encoding=UNICODE, I cannot create UNICODE databases; it makes them all UTF-8.
Then, when I try to restore a DB dump, I keep getting “invalid UTF-8 byte sequence detected near byte xxx” errors, and although the restore continues, a lot of data is missing.
Does anyone have any suggestions here? I would really appreciate it!
Regards,
James Dey
tel +27 11 704-1945
cell +27 82 785-5102
fax +27 11 388-8907
mail james@mygus.com
Hi,

On Fri, 2006-02-10 at 10:38 +0200, James Dey wrote:
> I have installed PostgreSQL 8.1 onto a new box, and when trying to
> create a new database, even when specifiying --encoding=UNICODE I
> cannot create UNICODE dbs, it makes them all UTF-8.
>
> Then when I try and restore a DB dump, but keep on getting invalid
> “UTF-8 byte sequence detected near byte xxx” and although the restore
> continues, a lot of data is missing.

http://developer.postgresql.org/docs/postgres/release-8-1.html

"Some users are having problems loading UTF-8 data into 8.1.X. This is
because previous versions allowed invalid UTF-8 byte sequences to be
entered into the database, and this release properly accepts only valid
UTF-8 sequences. One way to correct a dumpfile is to run the command
iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql. The -c option
removes invalid character sequences. A diff of the two files will show
the sequences that are invalid. iconv reads the entire input file into
memory so it might be necessary to use split to break up the dump into
multiple smaller files for processing."

Regards,
--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PL/php, plPerlNG - http://www.commandprompt.com/
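For anyone who wants to see which rows are affected before anything is discarded, the check described in the release notes can also be approximated line by line. The following is only a rough Python sketch, not part of the release notes: the function names are made up for illustration, and it assumes that Python's strict UTF-8 decoder is a reasonable stand-in for the validation PostgreSQL 8.1 performs (the two may not agree on every edge case). Because it works one line at a time, it does not need to hold the whole dump in memory.

```python
def find_invalid_lines(lines):
    """Yield (line_number, byte_offset, raw_line) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, for example a dump file
    opened in binary mode. Names here are hypothetical, for illustration only.
    """
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first offending byte within this line
            yield lineno, err.start, raw


def strip_invalid(lines):
    """Yield each line with invalid UTF-8 bytes dropped (similar in spirit to iconv -c)."""
    for raw in lines:
        yield raw.decode("utf-8", errors="ignore").encode("utf-8")


# Example use against a dump file (file names taken from the release notes):
#
#   with open("dumpfile.sql", "rb") as dump:
#       for lineno, offset, raw in find_invalid_lines(dump):
#           print(lineno, offset, raw[:80])
#
#   with open("dumpfile.sql", "rb") as src, open("cleanfile.sql", "wb") as dst:
#       dst.writelines(strip_invalid(src))
```

Reporting the offending lines first makes it possible to judge whether the data is really damaged or simply stored in another encoding (LATIN1, for instance), in which case dropping the bytes would lose characters that a proper conversion could have kept.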
Thank you Devrim,

Am I correct in saying then that UTF-8 and UNICODE are the same thing as far as PostgreSQL is concerned?

Best regards,
James Dey

-----Original Message-----
From: Devrim GUNDUZ [mailto:devrim@commandprompt.com]
Sent: 10 February 2006 10:45 AM
To: James Dey
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Migration
Hi,

On Fri, 2006-02-10 at 10:50 +0200, James Dey wrote:
> Am I correct in saying then that UTF-8 and UNICODE are the same thing as far
> as PostgreSQL is concerned?

Yes:

template1=# CREATE DATABASE test1 ENCODING 'UNICODE';
CREATE DATABASE
template1=# CREATE DATABASE test2 ENCODING 'UTF-8';
CREATE DATABASE
template1=# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 ...
 test1     | postgres | UTF8
 test2     | postgres | UTF8

Regards,