Re: Dumping in LATIN1 and restoring in UTF-8
From | Tino Wildenhain |
---|---|
Subject | Re: Dumping in LATIN1 and restoring in UTF-8 |
Date | |
Msg-id | 44ACB27E.6030303@wildenhain.de |
In reply to | Dumping in LATIN1 and restoring in UTF-8 ("Marco Bizzarri" <marco.bizzarri@gmail.com>) |
Responses | Re: Dumping in LATIN1 and restoring in UTF-8 |
List | pgsql-general |
Marco Bizzarri wrote:
> Hi all.
>
> Here is my use case: I've an application which uses PostgreSQL as
> backend. Up to now, the database was encoded in SQL_ASCII or LATIN1.
> Now, we need to migrate to UTF-8.
>
> What we tried, was to:
>
> 1) dump the database using pg_dump, in tar format (we had blob);
> 2) modifying the result, using some conversion tool (like recode)
> 3) destroying the old database
> 4) recreating the database with UNICODE setting
> 5) restoring the database using pg_restore
>
> The result was not what I expected. The pg_restore was using the
> LATIN1 encoding to encode the strings, resulting in a LATIN1 encoded
> in UTF-8...
>
> The problem lied in the toc.dat file, which stated that the client
> encoding was LATIN1, instead of UTF-8.
>
> The solution in the end has been to manually modifying the toc.dat
> file, substituting the LATIN1 string with UTF-8 (plus a space, since
> the toc.dat is a binary file).
>
> Even though it worked for us, I wonder if there is any other way to
> accomplish the same result, at least to specify the encoding for the
> restore.

Yes, it's actually quite easy: you dump as you feel appropriate, then
create the database with the encoding you want, restore without
creating the database, and you are done.

Restore sets the client encoding to what it actually was in the dump
data (in your case LATIN1), and the database is UTF-8, so PostgreSQL
recodes the data automatically. No need for iconv and friends.

Regards
Tino
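
To make the two points in this exchange concrete, here is a minimal Python sketch. The first part reproduces the "LATIN1 encoded in UTF-8" symptom: the dump was recoded by hand while it still declared LATIN1 as its client encoding, so the text was converted twice. The second part only prints a command sequence matching the suggested workflow; it does not run anything. The sample word, the database names `olddb` and `newdb`, the file name `olddb.tar`, and the helper `migration_commands` are assumptions made for illustration, and the exact options may differ between PostgreSQL versions.

```python
# Part 1: why recoding the dump by hand produced double-encoded text.
original = "perch\u00e9"                 # sample text stored in the LATIN1 database
recoded = original.encode("utf-8")       # what a tool like recode writes into the dump

# The dump still declares LATIN1 as its client encoding, so each byte is
# read as one LATIN1 character and converted to UTF-8 a second time.
misread = recoded.decode("latin-1")
stored = misread.encode("utf-8")
print(stored.decode("utf-8"))            # the garbled, double-encoded form

# Leaving the dump untouched avoids this: the LATIN1 bytes are read as
# LATIN1 and converted to UTF-8 exactly once.
untouched = original.encode("latin-1")
converted_once = untouched.decode("latin-1").encode("utf-8")


# Part 2: the suggested workflow as command lines (hypothetical names).
def migration_commands(old_db, new_db, dump_file):
    """Return the commands for dump, create-as-UTF8, restore; nothing is executed."""
    return [
        # Tar-format dump including large objects, as in the original post.
        ["pg_dump", "-Ft", "-b", "-f", dump_file, old_db],
        # New database with the target encoding. Assumption: newer servers
        # may also need "-T", "template0" if the template's locale conflicts.
        ["createdb", "-E", "UTF8", new_db],
        # Restore into the existing database (no -C, so it is not recreated).
        ["pg_restore", "-d", new_db, dump_file],
    ]


for cmd in migration_commands("olddb", "newdb", "olddb.tar"):
    print(" ".join(cmd))
```

The sketch leaves the dump file unmodified on purpose: the conversion is left to the server, based on the client encoding recorded in the dump.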