Re: inserting large number of rows was: Re: Increasing number of PG connections.
From | Erki Kaldjärv
---|---
Subject | Re: inserting large number of rows was: Re: Increasing number of PG connections.
Date |
Msg-id | 401F47A9.7070106@webware.ee
In reply to | inserting large number of rows was: Re: Increasing number of PG connections. ("scott.marlowe" <scott.marlowe@ihs.com>)
List | pgsql-performance
You could do high-speed inserts with the COPY command:
http://developer.postgresql.org/docs/postgres/sql-copy.html

Check whether your database adapter/client lib supports it (I guess it does).

Note that it doesn't help very much if there are FKs/triggers on the target table.

Bill Moran wrote:
> I must have missed this post when it was made earlier. Pardon the noise
> if my suggestion has already been made.
>
> Unlike MySQL (and possibly other database servers), PostgreSQL is faster
> when inserting inside a transaction, depending on the method by which you
> are actually adding the records.
>
> In my own experience (generating a list of INSERT statements from a perl
> script and using psql to execute them), the difference in performance was
> incredibly dramatic when I added a "BEGIN WORK" at the beginning and a
> "COMMIT WORK" at the end.
>
> scott.marlowe wrote:
>
>> On Mon, 2 Feb 2004, Qing Zhao wrote:
>>
>>> I am new here. I have a question related to this in some way.
>>>
>>> Our web site needs to upload a large volume of data into Postgres at a
>>> time. The performance deteriorates as the number of rows becomes
>>> larger. When it reaches 2500 rows, it never comes back to the GUI.
>>> Since the tests were run through the GUI, my suspicion is that it might
>>> be caused by the way the application server talks to the Postgres
>>> server, the connections, etc. What might be the factors involved here?
>>> Does anyone know?
>>
>> Actually, I'm gonna go out on a limb here and assume two things:
>>
>> 1. you've got lotsa fk/pk relationships set up.
>> 2. you're analyzing the table empty before loading it up.
>>
>> What happens in this instance is that the analyze on an empty, or nearly
>> so, table means that during the inserts, PostgreSQL thinks you have only
>> a few rows. At first this is fine, as pgsql will seq scan the tables to
>> make sure there is a proper key in both. As the number of rows
>> increases, the planner needs to switch to index scans but doesn't,
>> because it doesn't know that the number of rows is increasing.
>>
>> Fix: insert a few hundred rows, run analyze, check to see if the explain
>> for inserts is showing index scans or not. If not, load a few more
>> hundred rows, analyze, rinse, repeat.
>>
>> Also, look for fk/pk mismatches, i.e. an int4 field pointing to an int8
>> field. That's a performance killer, so if the pk/fk types don't match,
>> see if you can change your field types to match and try again.
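As a concrete sketch of the COPY suggestion above (the `orders` table, its column list, and the file path are all hypothetical):

```sql
-- One COPY replaces thousands of individual INSERTs.
-- The server reads the file, so the path is on the server machine
-- and the command needs appropriate privileges; from psql, \copy
-- does the same thing with a file on the client side.
COPY orders (id, customer_id, amount) FROM '/tmp/orders.tsv';

-- psql equivalent, reading the file from the client machine:
-- \copy orders (id, customer_id, amount) FROM 'orders.tsv'
```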
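Bill's transaction batching, sketched with the same hypothetical table. Without the BEGIN/COMMIT pair, each INSERT commits on its own, which is what makes the unbatched version so slow:

```sql
BEGIN WORK;
INSERT INTO orders (id, customer_id, amount) VALUES (1, 10, 19.95);
INSERT INTO orders (id, customer_id, amount) VALUES (2, 11, 5.00);
-- ... the rest of the batch, committed as one unit ...
COMMIT WORK;
```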
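The analyze-then-check loop Scott describes might look like the following. EXPLAIN won't show the plan used by the FK trigger itself, so a stand-in is to EXPLAIN the lookup the trigger performs against the referenced table (table and column names hypothetical):

```sql
-- After loading a few hundred rows, refresh planner statistics:
ANALYZE orders;
ANALYZE customers;

-- The FK check boils down to a lookup like this; once the tables
-- hold enough rows it should show an index scan, not a seq scan:
EXPLAIN SELECT 1 FROM customers WHERE id = 42;
```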
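For the int4/int8 mismatch, newer PostgreSQL releases can retype the column in place (older releases require adding a new column, copying the data, and dropping the old one); names again hypothetical:

```sql
-- Make the FK column match the int8 primary key it references,
-- so the FK check can use the index instead of a cross-type scan.
ALTER TABLE orders ALTER COLUMN customer_id TYPE int8;
```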