Re: copy vs. C function
От | Sam Gendler |
---|---|
Тема | Re: copy vs. C function |
Дата | |
Msg-id | 71697165-78B8-4095-9185-B900540BAC0B@ideasculptor.com обсуждение исходный текст |
Ответ на | copy vs. C function (Jon Nelson <jnelson+pgsql@jamponi.net>) |
Список | pgsql-performance |
Start a transaction before the first insert and commit it after the last one and it will be much better, but I believe thatthe copy code path is optimized to perform better than any set of queries can, even in a single transaction Sent from my iPhone On Dec 10, 2011, at 5:27 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote: > I was experimenting with a few different methods of taking a line of > text, parsing it, into a set of fields, and then getting that info > into a table. > > The first method involved writing a C program to parse a file, parse > the lines and output newly-formatted lines in a format that > postgresql's COPY function can use. > End-to-end, this takes 15 seconds for about 250MB (read 250MB, parse, > output new data to new file -- 4 seconds, COPY new file -- 10 > seconds). > > The next approach I took was to write a C function in postgresql to > parse a single TEXT datum into an array of C strings, and then use > BuildTupleFromCStrings. There are 8 columns involved. > Eliding the time it takes to COPY the (raw) file into a temporary > table, this method took 120 seconds, give or take. > > The difference was /quite/ a surprise to me. What is the probability > that I am doing something very, very wrong? > > NOTE: the code that does the parsing is actually the same, > line-for-line, the only difference is whether the routine is called by > a postgresql function or by a C program via main, so obviously the > overhead is elsewhere. > NOTE #2: We are talking about approximately 2.6 million lines. > > I was testing: > > \copy some_table from 'some_file.csv' with csv > vs. > insert into some_table select (some_func(line)).* from some_temp_table; > > where some_func had been defined with (one) IN TEXT and (8) OUT params > of varying types. > > PostgreSQL 9.1.1 on Linux, x86_64 > > -- > Jon > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
В списке pgsql-performance по дате отправления: