remark on a slowdown of COPY
| От | vojtech@hafro.is |
|---|---|
| Тема | remark on a slowdown of COPY |
| Дата | |
| Msg-id | 684b7d47.0312120712.6c95988c@posting.google.com обсуждение исходный текст |
| Список | pgsql-general |
Hi, I would like to remark on a problem described by Stephen Livesey almost 3 years ago, about the slowdown he had experienced with an upload of several millions of rows. http://www.geocrawler.com/mail/thread.php3?subject=%5BGENERAL%5D+Slowdown+problem+when+writing+1.7million+records&list=12 >The first 100,000 records took 15mins. >The next 100,000 records took 30mins >The last 100,000 records took 4hours. I'm actually uploading data from a pg_dump file with the COPY command, it's about 2.5 mil. rows on a 1.6 GHz Linux PC, 512MB, with raiserfs. I had to dump schema and tables separately ending up in the following series of steps: CREATE TABLE keys ( crc integer NOT NULL, tablenr integer NOT NULL, tableid integer NOT NULL, tableref integer NOT NULL, "key" character varying(250) NOT NULL, batchid integer NOT NULL ); ALTER TABLE ONLY keys ADD CONSTRAINT keys_pkey PRIMARY KEY (crc); COPY keys (crc, tablenr, tableid, tableref, "key", batchid) FROM stdin; -265889347 1 2 0 1_1_1982_1_101_1011_NULL_NULL_NULL_102_NULL 1 ... \. With created index (prim.key) I stopped it half-way through after 2 hours, getting progressively slower. Strangely, in top the CPU usage and IO were < 5%, jumping up a bit every now and then, but system load showed steadily values of over 2 (something internal?, Tom Lane once mentioned fsync?). Then without an index (when I removed the ADD CONSTRAINT line), the upload time soared to 11 minutes, including index creation afterwards, load around 1. The problem with me was that I was dumping schema and tables separatelly, thus letting ADD CONSTRAINT be issued in the shown sequence (before data were uploaded), otherwise ADD CONSTRAINT goes at the end of the table dump file, not affecting the perfromance. --Vojtech
В списке pgsql-general по дате отправления: