Re: update 600000 rows
От | H. Hall |
---|---|
Тема | Re: update 600000 rows |
Дата | |
Msg-id | 47650F11.9000108@reedyriver.com обсуждение исходный текст |
Ответ на | update 600000 rows (okparanoid@free.fr) |
Список | pgsql-performance |
okparanoid@free.fr wrote: > Hello > > i have a python script to update 600000 rows to one table from a csv file in my > postgres database and it takes me 5 hours to do the transaction... > > I'm on debian etch with 8.1 postgres server on a 64 bits quad bi opteron. > > I have desactived all index except the primary key who is not updated since it's > the reference column of the update too. > > When i run this script the server is not used by any other user. > > First when i run htop i see that the memory used is never more than 150 MB. > I don't understand in this case why setting shmall and shmmax kernel's > parameters to 16 GB of memory (the server has 32 GB) increase the rapidity of > the transaction a lot compared to a shmall and shmax in (only) 2 GB ?! > > The script is run with only one transaction and pause by moment to let the time > to postgres to write data to disk. > > If the data were writed at the end of the transaction will be the perfomance > better ? i wan't that in production data regulary writed to disk to prevent > loosinf of data but it there any interest to write temporary data in disk in a > middle of a transaction ??? > > I'm completely noob to postgres and database configuration and help are > welcome. > > thank You will get a huge improvement in time if you use batch updates instead of updating a row at a time. See: http://www.postgresql.org/docs/8.2/interactive/populate.html and http://www.postgresql.org/docs/8.2/interactive/sql-begin.html You will also get a big improvement if you can turn fsync off during the update. See: http://www.postgresql.org/docs/8.2/interactive/runtime-config-wal.html You also need to vacuum the table after doing that many updates since pg does a delete and insert on each update, there will be a lot of holes. Cheers HH
В списке pgsql-performance по дате отправления: