Re: Support UTF-8 files with BOM in COPY FROM

From	Tom Lane
Subject	Re: Support UTF-8 files with BOM in COPY FROM
Date	26 September 2011, 14:28:24
Msg-id	9978.1317058095@sss.pgh.pa.us
In response to	Re: Support UTF-8 files with BOM in COPY FROM (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Support UTF-8 files with BOM in COPY FROM Re: Support UTF-8 files with BOM in COPY FROM
List	pgsql-hackers

Robert Haas <robertmhaas@gmail.com> writes:
> The thing that makes me doubt that is this comment from Tatsuo Ishii:
> TI> COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
> TI> I think we should not regard U+FEFF as "BOM" in COPY, rather we should
> TI> regard U+FEFF as "ZERO WIDTH NO-BREAK SPACE".

Yeah, that's a reasonable argument for rejecting the patch altogether.
I'm not qualified to decide whether it outweighs the "we need to be able
to read Notepad output" argument.  I do observe that
http://en.wikipedia.org/wiki/Byte_order_mark
says Unicode 3.2 has deprecated the no-break-space interpretation,
but on the other hand you're right that we can't really assume that
the character is not present in people's data.
        regards, tom lane
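
For illustration only, here is a minimal Python sketch of the client-side workaround implied by the discussion: drop a single leading UTF-8 BOM (bytes EF BB BF) from the file before handing it to COPY, and leave any later U+FEFF in the data untouched. It is not part of the patch under review, and the helper name strip_leading_bom and the sample data are hypothetical.

```python
import codecs


def strip_leading_bom(data: bytes) -> bytes:
    """Return data without one leading UTF-8 BOM (EF BB BF), if present.

    Hypothetical helper: only the very first three bytes are examined, so a
    U+FEFF that appears later in the data is preserved as ordinary content.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Sample input as an editor such as Notepad might save it: a BOM, then CSV
# rows. The second row deliberately contains U+FEFF inside a field.
sample = codecs.BOM_UTF8 + "1,caf\u00e9\n2,a\ufeffb\n".encode("utf-8")
cleaned = strip_leading_bom(sample)
```

The sketch also shows the ambiguity raised above: if the first field of the data legitimately began with U+FEFF, this helper could not tell it apart from a BOM and would remove it.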
