Discussion: Howto read a UTF-8 CSV with COPY?

Howto read a UTF-8 CSV with COPY?

From
Andreas
Date:
Hi,
I'd like to import some data from MS Office that has text columns with
international characters in them.

COPY complains about illegal byte sequences.

When I placed a "set client_encoding = LATIN1" in front of the COPY
command, COPY was happy.
It still didn't work as expected, though. There were empty attributes in the
imported table wherever the source file had accented characters.

Is there a way to import UTF-8 encoded CSV files?

Re: Howto read a UTF-8 CSV with COPY?

From
LazyTrek
Date:
Andreas,

Have you looked at using pgloader? 

http://pgloader.projects.postgresql.org

Also did you save the MS Office file as a simple plain text file?

I'm a novice myself, but I do know that pgloader offers slightly more advanced options than the plain COPY command.


On Thu, Nov 11, 2010 at 10:01 PM, Andreas <maps.on@gmx.net> wrote:
> Hi,
> I'd like to import some data from ms-office that has text columns with international characters in it.
>
> COPY complains about illegal byte sequences.
>
> When I placed a "set client_encoding = LATIN1" in front of the COPY command, COPY was happy.
> It still didn't work as expected. There were empty attributes in the imported table when there were accent decorated characters in the source file.
>
> Is there a way to import UTF8 encoded csv files ?


Re: Howto read a UTF-8 CSV with COPY?

From
Jasen Betts
Date:
On 2010-11-11, Andreas <maps.on@gmx.net> wrote:
> Hi,
> I'd like to import some data from ms-office that has text columns with
> international characters in it.
>
> COPY complains about illegal byte sequences.
>
> When I placed a "set client_encoding = LATIN1" in front of the COPY
> command, COPY was happy.
> It still didn't work as expected. There were empty attributes in the
> imported table when there were accent decorated characters in the source
> file.

latin-1 was probably the wrong client encoding unless it was office on
a really old Mac.

> Is there a way to import UTF8 encoded csv files ?

copy.
but the files must be utf8, close is not good enough.

what sequence is it complaining about?
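One way to answer that question before running COPY is to scan the file for bytes that do not decode as UTF-8. The following is a minimal sketch and not something posted in the thread; the function name `find_invalid_utf8` and the sample bytes are made up for illustration.

```python
def find_invalid_utf8(data: bytes, limit: int = 10):
    """Return (offset, offending_bytes) pairs for byte runs that are not valid UTF-8.

    Hypothetical helper for illustration; stops after `limit` findings.
    """
    problems = []
    pos = 0
    while pos < len(data) and len(problems) < limit:
        try:
            data[pos:].decode("utf-8")
            break  # everything from pos onwards decodes cleanly
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice that was decoded
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # continue scanning after the bad run
    return problems


if __name__ == "__main__":
    # "Müller" saved in a Windows code page instead of UTF-8 (sample data)
    sample = b"M\xfcller,ok\n"
    for offset, bad in find_invalid_utf8(sample):
        print(f"offset {offset}: {bad!r}")
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the function. The reported offsets point at the places where COPY would most likely complain.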

--
⚂⚃ 100% natural

Re: Howto read a UTF-8 CSV with COPY?

From
Andreas
Date:
On 15.11.2010 19:10, Jasen Betts wrote:
>
> latin-1 was probably the wrong client encoding unless it was office on
> a really old Mac.
No, it's Windows XP.


>> Is there a way to import UTF8 encoded csv files ?
> copy.
> but the files must be utf8, close is not good enough.
That might be the problem.
After I let Notepad++ convert it to UTF-8, COPY read it without issues so far.
It appears Excel 2000 can't produce the right flavour. Even the
"Unicode text" export got rejected by COPY.
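The Notepad++ step can also be scripted. Below is a minimal sketch, assuming the export is either UTF-16 with a byte order mark (which, as far as I know, is what Excel's "Unicode text" format writes, and would explain why COPY rejected it) or a Windows code page such as cp1252. The function name `to_utf8` and the cp1252 fallback are assumptions, not something from the thread.

```python
import codecs


def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Re-encode raw file content as UTF-8 without a byte order mark.

    Hypothetical helper: the fallback encoding is a guess and should be
    set to whatever code page the exporting machine actually used.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Already UTF-8; just drop the BOM
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and removes it
        text = raw.decode("utf-16")
    else:
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: assume the Windows code page
            text = raw.decode(fallback)
    return text.encode("utf-8")


if __name__ == "__main__":
    windows_bytes = "Grüße;100 €\n".encode("cp1252")  # sample data
    print(to_utf8(windows_bytes).decode("utf-8"), end="")
```

To convert a file, read it in binary mode, pass the bytes through `to_utf8`, and write the result to a new file in binary mode. Note that a "Unicode text" export is tab-separated rather than comma-separated, so the delimiter given to COPY would need to match.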


> what sequence is it complaining about?
In most cases it was a local character, though I think even latin1 has
all of them besides the € sign.
And it didn't like - (minus, or probably an en dash).
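The € sign and the en dash are a hint that the file was probably in Windows-1252 rather than Latin-1: both characters exist in cp1252 but not in ISO 8859-1. The sketch below only illustrates that point with Python's built-in codecs; the helper name `describe` is made up. If that guess is right, a client encoding of WIN1252 instead of LATIN1 might have been worth trying, though that was not tested in the thread.

```python
def describe(ch: str) -> dict:
    """Show how one character fares in UTF-8, cp1252 and Latin-1 (illustrative helper)."""
    info = {
        "utf8": ch.encode("utf-8"),
        "cp1252": ch.encode("cp1252"),
    }
    try:
        info["latin1"] = ch.encode("latin-1")
    except UnicodeEncodeError:
        info["latin1"] = None  # character does not exist in ISO 8859-1
    try:
        info["cp1252"].decode("utf-8")
        info["cp1252_is_valid_utf8"] = True
    except UnicodeDecodeError:
        info["cp1252_is_valid_utf8"] = False  # the kind of byte a UTF-8 reader rejects
    return info


if __name__ == "__main__":
    for ch in "ü€–":  # u-umlaut, euro sign, en dash
        print(repr(ch), describe(ch))
```

None of the three cp1252 byte values is valid UTF-8 on its own, which matches the "illegal byte sequence" complaints, and the euro sign and en dash have no Latin-1 encoding at all.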


Thanks for your reply.