Re: COPY for CSV documentation
From | Andrew Dunstan |
---|---|
Subject | Re: COPY for CSV documentation |
Date | |
Msg-id | 2365.24.211.141.25.1081689176.squirrel@www.dunslane.net |
In reply to | Re: COPY for CSV documentation (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses | Re: COPY for CSV documentation |
List | pgsql-patches |
Bruce Momjian said:
>> >Yes, my worry is that someone will use a multibyte character that the
>> >system sees as several bytes and enters CSV mode.
>> >
>>
>>
>> How about if we specify it explicitly, like BINARY, instead of it
>> being implied by the length of DELIMITER?
>>
>> COPY a FROM stdin CSV DELIMITER ',"';
>>
>> That would make the patch somewhat more extensive, but maybe not
>> hugely more invasive (I tried to keep it as uninvasive as possible).
>> I could do that, I think.
>
> That's what I was wondering. Is triggering CSV for multi-character
> delimiters a little too clever? This reminds me of the use of LIMIT
> X,Y with no indication which is limit and which is offset.
>
> We certainly could code to prevent the multibyte problem I mentioned,
> but should we?

I confess that in my anglocentric world I have remained lamentably
ignorant of how MBCS works. Just reading up a little, and looking over
some of our code (e.g. the scanner), it looks like the simple solution
would be to check that the delimiter was 8-bit clean. (I assume that
ASCII is a subset of every MBCS we support - is that correct?)

However ...

>
> I am thinking just:
>
>> COPY a FROM stdin WITH CSV ',"';
>
> or
>
>> COPY a FROM stdin WITH DELIMITER "," QUOTE '"' EQUOTE '"';
>
> EQUOTE for embedded quote. These are used in very limited situations
> and don't have to be reserved words or anything.
>
> I can help with these changes if folks like them.
>

I prefer the first, because it ensures things are specified together.
If you want to do that I will work on some regression tests.

cheers

andrew