Re: Bulk Inserts
От | Adrian Klaver |
---|---|
Тема | Re: Bulk Inserts |
Дата | |
Msg-id | 5cd58ca7-ea45-01ee-bd80-c9366b872291@aklaver.com обсуждение исходный текст |
Ответ на | Re: Bulk Inserts (Souvik Bhattacherjee <kivuosb@gmail.com>) |
Ответы |
Re: Bulk Inserts
|
Список | pgsql-general |
On 8/13/19 6:34 AM, Souvik Bhattacherjee wrote: > > If the selects are returning more then one row then you are already > > doing bulk inserts. If they are returning single rows or you want to > > batch them then you need some sort of code to do that. Something > > like(pseudo Python like code): > > > attr2_vals= [(10, 20, 30, 40), (50, 60, 70, 80)] > > > for val_batch in attr2_vals: > BEGIN > for id in val_batch: > insert into tab2 (attr1, attr2) (select attr1, attr2 > from tab1 where attr2 = id) > COMMIT > > For *EXP 1: inserts with multiple txn:* > insert into tab2 (attr1, attr2) (select attr1, attr2 from tab1 where > attr2 = 10); > insert into tab2 (attr1, attr2) (select attr1, attr2 from tab1 where > attr2 = 20); > > tab1 has ~6M rows and there are only two values for the attribute attr2 in > tab1 which are evenly distributed. So, yes, I guess I'm already doing > batching > here. > > Also, I ran the following two statements to see if their performances > are comparable. > While STMT 1 always runs faster in my machine but their performances > seem to differ > by a couple of seconds at most. > > STMT 1: select attr1, attr2 into tab2 from tab1; > STMT 2: insert into tab2 (select attr1, attr2 from tab1); All I have left is: select tab2 row_number () OVER (order by attr2, attr1 )AS id, attr1, attr2 into tab2 from tab1; That will not create a serial type in the id column though. You can attach a sequence to that column. Something like: 1) create sequence tab2_id start <max id + 1> owned by tab2.id; 2) alter table tab2 alter COLUMN id set default nextval('tab2_id'); > > However adding the serial id column as an ALTER TABLE statement actually > takes more time > than inserting the tuples, so the combined total time is more than > double the time taken to insert > the tuples into tab2 without serial id column. > > Best, > -SB > > > -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: