Re: Should CSV parsing be stricter about mid-field quotes?
From | Pavel Stehule |
---|---|
Subject | Re: Should CSV parsing be stricter about mid-field quotes? |
Date | |
Msg-id | CAFj8pRBPPfmL+xhBmZha+OAyJO2zXj+28RFPJdd2wS2+pfZc_Q@mail.gmail.com |
In response to | Re: Should CSV parsing be stricter about mid-field quotes? ("Joel Jacobson" <joel@compiler.org>) |
Responses | Re: Should CSV parsing be stricter about mid-field quotes? |
List | pgsql-hackers |
On Thu, 18 May 2023 at 8:01, Joel Jacobson <joel@compiler.org> wrote:
> On Thu, May 18, 2023, at 00:18, Kirk Wolak wrote:
> > Here you go. Not horrible handling. (I use DataGrip so I saved it from there
> > directly as TSV, just for an extra datapoint).
> >
> > FWIW, if you copy/paste in windows, the data, the field with the tab gets
> > split into another column in Excel. But saving it as a file, and opening it.
> > Saving it as XLSX, and then having Excel save it as a TSV (versus opening a
> > text file, and saving it back)
>
> Very useful, thanks.
>
> Interesting, DataGrip contrary to Excel doesn't quote fields with commas in TSV.
> All the DataGrip/Excel TSV variants uses quoting when necessary,
> contrary to Google Sheets's TSV-format, that doesn't quote fields at all.

There may be yet another, third implementation in LibreOffice.

Generally, TSV is not well specified, so the implementations are not consistent with each other.

> DataGrip/Excel terminate also the last record with newline,
> while Google Sheets omit the newline for the last record,
> (which is bad, since then a streaming reader wouldn't know
> if the last record is completed or not.)
>
> This makes me think we probably shouldn't add a new TSV format,
> since there is no consistency between vendors.
> It's impossible to deduce with certainty if a TSV-field that
> begins with a double quotation mark is quoted or unquoted.
>
> Two alternative ideas:
>
> 1. How about adding a `WITHOUT QUOTE` or `QUOTE NONE` option in conjunction
> with `COPY ... WITH CSV`?
>
> Internally, it would just set
>
> `quotec = '\0';`
>
> so it wouldn't affect performance at all.
>
> 2. How about adding a note on the complexities of dealing with TSV files in the
> COPY documentation?
>
> /Joel
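
The ambiguity Joel describes can be shown with a small sketch. It uses Python's standard `csv` module purely as a stand-in for a TSV reader (it is not PostgreSQL's COPY code), and the sample line and the `parse` helper are made up for illustration. The same bytes are read once with quote processing enabled and once with it disabled, roughly what a `QUOTE NONE` option would mean:

```python
import csv
import io

# Hypothetical TSV line: the first field begins with a double quotation mark
# and contains a tab. Nothing in the data says whether the quotes are
# quoting syntax or literal characters.
line = '"a\tb"\tc\n'


def parse(text, quoting):
    """Parse one tab-separated record with the given quoting mode."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=quoting)
    return next(reader)


# Reader that treats a leading double quote as the start of a quoted field.
as_quoted = parse(line, csv.QUOTE_MINIMAL)

# Reader that does no quote processing at all.
as_unquoted = parse(line, csv.QUOTE_NONE)

print(as_quoted)    # ['a\tb', 'c']       -> two fields
print(as_unquoted)  # ['"a', 'b"', 'c']   -> three fields
```

Both results are valid readings of the same input, so a reader has to be told which convention the producer used; it cannot infer it from the file.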