Re: Support UTF-8 files with BOM in COPY FROM

From	Robert Haas
Subject	Re: Support UTF-8 files with BOM in COPY FROM
Date	26 September 2011, 13:07:34
Msg-id	CA+Tgmoa7SzcuViKfdbmWWeRmzZnjo93AmbhiOHaO9E=330PFow@mail.gmail.com
In reply to	Re: Support UTF-8 files with BOM in COPY FROM (Tatsuo Ishii <ishii@postgresql.org>)
Responses	Re: Support UTF-8 files with BOM in COPY FROM
List	pgsql-hackers

On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> "David E. Wheeler" <david@kineticode.com> writes:
>>> On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
>>>> I'm thinking about only COPY FROM for reads, but if someone wants to add
>>>> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>>
>>> I think it would have to be optional, since "some recipients of UTF-8 encoded data do not expect a BOM."
>>
>> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
>> that Microsloth does it does not make it standards-conformant.
>>
>> I think that accepting it on input can be sensible, on the principle of
>> "be liberal in what you accept", but the other side of that is "be
>> conservative in what you send".  No BOMs in output, please.
>
> Suppose a user uses brain-dead editor, which does not accept UTF-8
> without BOM.  He decides to save his editor data into PostgreSQL using
> COPY FROM. He extracts the data using COPY TO. Now he finds that his
> stupid editor does not accept his data any more.
>
> So I think if we decide to accept UTF-8 with BOM, we should keep BOM
> when importing the data and output the data with BOM. If we don't want
> to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
> seems we don't have much choice...

Maybe this needs to be an optional behavior, controlled by some COPY option.
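
To illustrate what such optional behavior could look like, here is a minimal Python sketch that works on raw bytes outside the server. It is not PostgreSQL code and does not describe any existing COPY option; the function names and the accept_bom / write_bom flags are hypothetical, chosen only to show the "liberal on input, conservative on output" default with an explicit opt-in for writing a BOM.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes, accept_bom: bool = True) -> bytes:
    """Input side: drop one leading UTF-8 BOM when the (hypothetical) option is on."""
    if accept_bom and data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def add_bom(data: bytes, write_bom: bool = False) -> bytes:
    """Output side: prepend a UTF-8 BOM only when explicitly requested."""
    if write_bom and not data.startswith(UTF8_BOM):
        return UTF8_BOM + data
    return data
```

With these defaults, a file saved with a BOM loads cleanly and is written back without one; a user whose editor insists on a BOM would pass write_bom=True on the way out.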

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
