Re: multiline CSV fields
From | Patrick B Kelly |
---|---|
Subject | Re: multiline CSV fields |
Date | |
Msg-id | F82E5F5D-3435-11D9-B14C-000A958A3956@patrickbkelly.org |
In reply to | Re: multiline CSV fields (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: multiline CSV fields
Re: multiline CSV fields |
List | pgsql-hackers |
On Nov 11, 2004, at 2:56 PM, Andrew Dunstan wrote:

> Tom Lane wrote:
>
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>> Patrick B Kelly wrote:
>>>
>>>> Actually, when I try to export a sheet with multi-line cells from
>>>> excel, it tells me that this feature is incompatible with the CSV
>>>> format and will not include them in the CSV file.
>>
>>> It probably depends on the version. I have just tested with Excel
>>> 2000 on a WinXP machine and it both read and wrote these files.
>>
>> I'd be inclined to define Excel 2000 as broken, honestly, if it's
>> writing unescaped newlines as data. To support this would mean throwing
>> away most of our ability to detect incorrectly formatted CSV files.
>> A simple error like a missing close quote would look to the machine like
>> the rest of the file is a single long data line where all the newlines
>> are embedded in data fields. How likely is it that you'll get a useful
>> error message out of that? Most likely the error message would point to
>> the end of the file, or at least someplace well removed from the actual
>> mistake.
>>
>> I would vote in favor of removing the current code that attempts to
>> support unquoted newlines, and waiting to see if there are complaints.
>
> This feature was specifically requested when we discussed what sort of
> CSVs we would handle.
>
> And it does in fact work as long as the newline style is the same.
>
> I just had an idea. How about if we add a new CSV option MULTILINE. If
> absent, then on output we would not output unescaped LF/CR characters
> and on input we would not allow fields with embedded unescaped LF/CR
> characters. In both cases we could error out for now, with perhaps an
> 8.1 TODO to provide some other behaviour.
>
> Or we could drop the whole multiline "feature" for now and make the
> whole thing an 8.1 item, although it would be a bit of a pity when it
> does work in what will surely be the most common case.
What about just coding an FSM into backend/commands/copy.c:CopyReadLine() that does not process any flavor of NL characters when it is inside of a data field?

Patrick B. Kelly
------------------------------------------------------
http://patrickbkelly.org
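To make the state-machine suggestion concrete, here is a minimal sketch in C. It is not the code in copy.c and makes no claim about how CopyReadLine() is actually structured. The names `CsvState` and `csv_record_length` are hypothetical, and the sketch assumes the quote character is escaped by doubling it, with no separate escape character.

```c
#include <stddef.h>

/* Hypothetical scanner states; these names do not come from copy.c. */
typedef enum {
    CSV_UNQUOTED,   /* outside any quoted field: a newline ends the record */
    CSV_QUOTED      /* inside a quoted field: a newline is ordinary data */
} CsvState;

/*
 * Return the number of bytes in the first logical record of buf,
 * including its line terminator (LF, CR, or CRLF). Return 0 if the
 * buffer ends before the record is complete.
 *
 * Assumption: a quote inside a quoted field is written as two quotes.
 * A doubled quote toggles the state twice, so it leaves the scanner in
 * the quoted state without any special handling.
 */
size_t csv_record_length(const char *buf, size_t len, char quote)
{
    CsvState state = CSV_UNQUOTED;
    size_t i;

    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (c == quote) {
            state = (state == CSV_QUOTED) ? CSV_UNQUOTED : CSV_QUOTED;
        } else if (state == CSV_UNQUOTED) {
            if (c == '\n')
                return i + 1;
            if (c == '\r') {
                /* Treat CRLF as one terminator when both bytes are present. */
                if (i + 1 < len && buf[i + 1] == '\n')
                    return i + 2;
                return i + 1;
            }
        }
        /* In CSV_QUOTED, CR and LF fall through and are kept as data. */
    }
    return 0;   /* no complete record in this buffer */
}
```

For the input `a,"x\ny",b\nnext\n` the scanner reports a 10-byte first record, because the newline between `x` and `y` sits inside a quoted field. With a missing close quote, as in `a,"x\ny\n`, it returns 0 at the end of the input, which is the failure mode Tom Lane describes above: the error only becomes visible far from the actual mistake.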