Re: Support UTF-8 files with BOM in COPY FROM
From | Robert Haas |
---|---|
Subject | Re: Support UTF-8 files with BOM in COPY FROM |
Date | |
Msg-id | CA+Tgmoa7SzcuViKfdbmWWeRmzZnjo93AmbhiOHaO9E=330PFow@mail.gmail.com |
In response to | Re: Support UTF-8 files with BOM in COPY FROM (Tatsuo Ishii <ishii@postgresql.org>) |
Responses |
Re: Support UTF-8 files with BOM in COPY FROM
|
List | pgsql-hackers |
On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> "David E. Wheeler" <david@kineticode.com> writes:
>>> On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
>>>> I'm thinking about only COPY FROM for reads, but if someone wants to add
>>>> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>>
>>> I think it would have to be optional, since "some recipients of UTF-8 encoded data do not expect a BOM."
>>
>> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
>> that Microsloth does it does not make it standards-conformant.
>>
>> I think that accepting it on input can be sensible, on the principle of
>> "be liberal in what you accept", but the other side of that is "be
>> conservative in what you send". No BOMs in output, please.
>
> Suppose a user uses brain-dead editor, which does not accept UTF-8
> without BOM. He decides to save his editor data into PostgreSQL using
> COPY FROM. He extracts the data using COPY TO. Now he finds that his
> stupid editor does not accept his data any more.
>
> So I think if we decide to accept UTF-8 with BOM, we should keep BOM
> when importing the data and output the data with BOM. If we don't want
> to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
> seems we don't have much choice...

Maybe this needs to be an optional behavior, controlled by some COPY option.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
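
To make the "optional behavior" idea concrete, here is a rough sketch in Python rather than PostgreSQL's C. It is not how COPY is or would be implemented. The function names `strip_bom` and `emit` and the `with_bom` flag are hypothetical, chosen only for illustration. It assumes the input side tolerates a single leading UTF-8 BOM and the output side adds one only on explicit request.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> bytes:
    """Input side: drop one leading UTF-8 BOM if present, otherwise pass through."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def emit(data: bytes, with_bom: bool = False) -> bytes:
    """Output side: prepend a BOM only when the caller opts in (hypothetical flag)."""
    return UTF8_BOM + data if with_bom else data


if __name__ == "__main__":
    imported = strip_bom(UTF8_BOM + b"id,name\n1,foo\n")
    print(imported)                        # payload without the BOM
    print(emit(imported))                  # default: no BOM on output
    print(emit(imported, with_bom=True))   # opt-in: BOM restored for picky editors
```

With the flag off by default, output stays BOM-free as requested upthread, while a user whose editor insists on a BOM can still get one back by asking for it.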