Re: Support UTF-8 files with BOM in COPY FROM
From | Robert Haas |
---|---|
Subject | Re: Support UTF-8 files with BOM in COPY FROM |
Date | |
Msg-id | CA+TgmoZNw=F-+fvpH8xpeiph6kiAK1Vk1Ch4ONu6d+N-UG++5A@mail.gmail.com |
In reply to | Support UTF-8 files with BOM in COPY FROM (Itagaki Takahiro <itagaki.takahiro@gmail.com>) |
Replies | Re: Support UTF-8 files with BOM in COPY FROM; Re: Support UTF-8 files with BOM in COPY FROM; Re: Support UTF-8 files with BOM in COPY FROM |
List | pgsql-hackers |
On Mon, Sep 26, 2011 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> Suppose a user uses brain-dead editor, which does not accept UTF-8
>>> without BOM.
>
>> Maybe this needs to be an optional behavior, controlled by some COPY option.
>
> I'm not excited about emitting non-standards-conformant output on the
> strength of a hypothetical argument about users and editors that may or
> may not exist.  I believe that there's a use-case for reading BOMs, but
> I have seen no field complaints demonstrating that we need to write
> them.  Even if we had a couple, "use a less brain dead editor" might be
> the best response.  We cannot promise to be compatible with arbitrarily
> broken software.

The thing that makes me doubt that is this comment from Tatsuo Ishii:

TI> COPY explicitly specifies the encoding (to be UTF-8 in this case). So
TI> I think we should not regard U+FEFF as "BOM" in COPY, rather we should
TI> regard U+FEFF as "ZERO WIDTH NO-BREAK SPACE".

If a BOM is confusable with valid data, then I think recognizing it
and discarding it unconditionally is no good - you could end up where
COPY OUT, TRUNCATE, COPY IN changes the table contents.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
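
To make the round-trip concern concrete, here is a minimal Python sketch. It is not PostgreSQL code: `copy_out` and `copy_in_strip_bom` are hypothetical stand-ins for COPY TO and for a COPY FROM that unconditionally drops a leading BOM, and the one-value-per-line file format is an assumption made only for illustration.

```python
BOM = b"\xef\xbb\xbf"  # UTF-8 byte sequence for U+FEFF


def copy_out(rows):
    """Hypothetical stand-in for COPY TO: one text value per line, UTF-8, no BOM added."""
    return "".join(row + "\n" for row in rows).encode("utf-8")


def copy_in_strip_bom(data):
    """Hypothetical stand-in for COPY FROM that always discards a leading BOM."""
    if data.startswith(BOM):
        data = data[len(BOM):]
    # Drop the empty string that follows the final newline.
    return data.decode("utf-8").split("\n")[:-1]


# The first value legitimately begins with U+FEFF (ZERO WIDTH NO-BREAK SPACE).
original = ["\ufeffabc", "def"]

# COPY OUT, TRUNCATE, COPY IN: the reloaded rows no longer match the original.
restored = copy_in_strip_bom(copy_out(original))
print(restored == original)
```

In this sketch the leading U+FEFF of the first value is indistinguishable from a BOM once it is written to the file, so the reload silently drops it. Data that does not begin with U+FEFF round-trips unchanged.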