Re: COPY TO CSV produces data that is incompatible/unsafe for \COPY FROM CSV
От | Bruce Momjian |
---|---|
Тема | Re: COPY TO CSV produces data that is incompatible/unsafe for \COPY FROM CSV |
Дата | |
Msg-id | ZT0q_djqSO3d6MQU@momjian.us обсуждение исходный текст |
Ответ на | Re: COPY TO CSV produces data that is incompatible/unsafe for \COPY FROM CSV (Bruce Momjian <bruce@momjian.us>) |
Ответы |
Re: COPY TO CSV produces data that is incompatible/unsafe for \COPY FROM CSV
|
Список | pgsql-bugs |
On Tue, Aug 16, 2022 at 11:26:47AM -0400, Bruce Momjian wrote: > On Sun, Aug 14, 2022 at 10:08:55AM -0400, Tom Lane wrote: > > Noah Misch <noah@leadboat.com> writes: > > > The psql documentation says [\copy ... from stdin] looks for "\.". It says > > > nothing about doing that for [\copy ... from filename]. I wonder if psql > > > should change the filename case to send the whole file to the server, ignoring > > > "\." inside. (That is to say, change to match the psql documentation.) > > > > This seems like a sensible solution. The need to use an in-band EOF > > marker when reading from the command source file is clear, and there's > > no non-kluge way there. But that doesn't mean we should extend it > > to separate files. (I'm surprised to realize we do, actually.) > > So, look what I found in psql/copy.c: > > /* check for EOF marker, but not on a partial line */ > if (at_line_begin) > { > /* > --> * This code erroneously assumes '\.' on a line alone > --> * inside a quoted CSV string terminates the \copy. > --> * https://www.postgresql.org/message-id/E1TdNVQ-0001ju-GO@wrigleys.postgresql.org > */ > if ((linelen == 3 && memcmp(fgresult, "\\.\n", 3) == 0) || > (linelen == 4 && memcmp(fgresult, "\\.\r\n", 4) == 0)) > { > copydone = true; > } > } > > I wondered who would reference an email thread in the source code, so I > looked at the whole 2012 thread: > > https://www.postgresql.org/message-id/flat/E1TdNVQ-0001ju-GO%40wrigleys.postgresql.org > > and it was, of course, me. :-( > > Tom's analysis was similar ten years ago: > > > Ugh. This seems like a rather fundamental oversight in the CSV feature. > > The problem is that psql has no idea whether the copy is being done in > > CSV mode or not --- and even if it did, it doesn't parse the data fully > > enough to realize whether a \. line is inside quotes or not. > > > > In the case of out-of-line data files, it might be reasonable to just > > dispense with the check for \. altogether and always ship the whole file > > to the backend; I think there's a \. check on the backend side. (Not > > sure this is safe in V2 protocol, but I doubt anyone cares anymore > > about that.) > > > > In the case of in-line data in a script file, CSV mode seems a bit > > broken in any case; there's no concept of a terminator in CSV, AFAIK. > > So maybe we don't have to worry about that. > > I think at this point we should either fix this or document it. I have decided to document this with the attached patch. I plan to apply it to all supported Postgres versions. If we ever fix FROM file so only FROM STDIN is a problem, we can update the documentation. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Вложения
В списке pgsql-bugs по дате отправления: