Re: loading data from flat text file
От | Ross J. Reedstrom |
---|---|
Тема | Re: loading data from flat text file |
Дата | |
Msg-id | 20000523101230.B5814@rice.edu обсуждение исходный текст |
Ответ на | Re: loading data from flat text file (Ron Peterson <rpeterson@yellowbank.com>) |
Список | pgsql-general |
On Tue, May 23, 2000 at 10:10:41AM -0400, Ron Peterson wrote: > > "Voitenko, Denis" wrote: > > > > So I am almost there, except my data is formated as follows: > > > > "chunk1","chunk2","chunk3" > > > > how would I tell COPY that my data is encapsulated in " and separated > > by , ? Furthermore, I did not find a manual on the COPY command. > > Anyone? > > In /usr/local/pgsql/doc (assuming that's where you installed PostgreSQL) > you will find a good deal of html format documentation. From the docs: > > COPY [ BINARY ] table [ WITH OIDS ] > FROM { 'filename' | stdin } > [ [USING] DELIMITERS 'delimiter' ] > [ WITH NULL AS 'null string' ] > COPY [ BINARY ] table [ WITH OIDS ] > TO { 'filename' | stdout } > [ [USING] DELIMITERS 'delimiter' ] > [ WITH NULL AS 'null string' ] > > Basically, you just need to specify the delimiters. > However, the quotes might give you problems. Postgresql does not treat them specially, and expects delimited files, not seperated files. This leads to two problems. If you have text fields with internal commas, they'll split at the internal comma. The second problem is that the quotes will be stored with your data, and depending on the column type, may not transform at all: i.e. trying to store "12" in an int column won't work. They way a postgresql's copy delimited file handles embedded delimiters is to quote them with a back slash, as so: chunk one,another\, different\, chunk,third chunk So, you'll need to preprocess your flat file some. A simple sed should do it. My usual trick for this is to find some character sequence that _isn't_ in the dataset, like '|||', and do a global replace on "," with the pipes, then delete quotes, quote the commas, and resubstitute the pipes with commas. This converts a seperated file into a delimited one. cat my_file | sed 's/","/|||/g' | sed 's/^"//g'| sed 's/"$//g'| sed \ 's/,/\,/g'| sed 's/|||/,/g' >newfile Sort of ugly, but it should work. If you can get your other DB to dump in a delimited format, instead of a quoted CSV format, everything should work much easier. COPY wasn't really designed for importing data and data transformation, but as a reliable means of doing bulk dump and restore of tables. Ross -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
В списке pgsql-general по дате отправления: