Re: invalid byte sequence for encoding "UTF8"

From Martijn van Oosterhout
Subject Re: invalid byte sequence for encoding "UTF8"
Date
Msg-id 20070321195722.GC13787@svana.org
In reply to Re: invalid byte sequence for encoding "UTF8"  (Alan Hodgson <ahodgson@simkin.ca>)
List pgsql-general
On Wed, Mar 21, 2007 at 09:54:41AM -0700, Alan Hodgson wrote:
> iconv needs to read the whole file into RAM.  What you can do is use the
> UNIX split utility to split the dump file into smaller segments, use iconv
> on each segment, and then cat all the converted segments back together into
> a new dump file.  iconv is I think your best option for converting the dump
> to a valid encoding.

The guys at openstreetmap have written a UTF-8 cleaner that doesn't
read the whole file into memory:

http://trac.openstreetmap.org/browser/utils/planet.osm/C

Definitely more convenient for large files.
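
For illustration only, here is a minimal sketch of the same streaming idea in Python. It is not the openstreetmap tool and makes no claim about how that tool works. The function name clean_utf8_stream, the 1 MiB chunk size, and the choice to substitute U+FFFD for invalid bytes are all assumptions made for this example. It relies on the standard library's incremental UTF-8 decoder, which buffers a multi-byte sequence that straddles a chunk boundary, so memory use stays bounded by the chunk size.

```python
import codecs
import io


def clean_utf8_stream(src, dst, chunk_size=1 << 20, errors="replace"):
    """Copy binary stream src to dst, fixing invalid UTF-8 on the way.

    Hypothetical helper for illustration. errors="replace" substitutes
    U+FFFD for each invalid sequence; errors="ignore" drops it instead.
    Only chunk_size bytes are held in memory at a time.
    """
    # The incremental decoder keeps an incomplete trailing sequence in
    # its internal buffer until the next chunk arrives.
    decoder = codecs.getincrementaldecoder("utf-8")(errors=errors)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # final=True flushes the buffer; a truncated sequence at end of
    # input is then handled according to the errors policy.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


# Possible use as a filter (assumed invocation, not from the thread):
#   import sys
#   clean_utf8_stream(sys.stdin.buffer, sys.stdout.buffer)
```

Note that either policy alters the data: the offending bytes are replaced or dropped, not converted from whatever encoding they were originally in. If the source encoding is known, converting from it is the better fix.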

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
