Re: Best COPY Performance
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: Best COPY Performance |
Date | |
Msg-id | 454620DF.1000003@kaltenbrunner.cc |
In reply to | Re: Best COPY Performance ("Luke Lonergan" <llonergan@greenplum.com>) |
Responses | Re: Best COPY Performance |
List | pgsql-performance |
Luke Lonergan wrote:
> Greg,
>
> On 10/30/06 7:09 AM, "Spiegelberg, Greg" <gspiegelberg@cranel.com> wrote:
>
>> I broke that file into 2 files each of 550K rows and performed 2
>> simultaneous COPY's after dropping the table, recreating, issuing a sync
>> on the system to be sure, &c and nearly every time both COPY's finish in
>> 12 seconds. About a 20% gain to ~91K rows/second.
>>
>> Admittedly, this was a pretty rough test but a 20% savings, if it can be
>> put into production, is worth exploring for us.
>
> Did you see whether you were I/O or CPU bound in your single threaded COPY?
> A 10 second "vmstat 1" snapshot would tell you/us.
>
> With Mr. Workerson (:-) I'm thinking his benefit might be a lot better
> because the bottleneck is the CPU and it *may* be the time spent in the
> index building bits.
>
> We've found that there is an ultimate bottleneck at about 12-14MB/s despite
> having sequential write to disk speeds of 100s of MB/s. I forget what the
> latest bottleneck was.

I have personally managed to load a bit less than 400k rows/s (5 int columns,
no indexes) on very fast disk hardware. At that point PostgreSQL is
completely CPU bottlenecked (2.6 GHz Opteron).
Using multiple processes to load the data will help to scale up to about
900k rows/s (4 processes on 4 cores).

Stefan
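For anyone who wants to try the multi-process approach discussed above, here is a minimal sketch, not a tuned loader. It splits the input file into whole-line chunks and runs one `psql` client per chunk, so each chunk gets its own backend process. The function names (`split_file`, `copy_chunk`, `parallel_copy`), the chunk file naming, and the table and database names are all made up for illustration. It assumes `psql` is on the PATH, that the file is in COPY's default text format, and that the table name and file path come from a trusted source.

```python
import os
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_file(path, parts, out_dir):
    """Split a text file into at most `parts` chunks of whole lines."""
    with open(path, "r", encoding="utf-8") as src:
        total = sum(1 for _ in src)
    per_chunk = max(1, -(-total // parts))  # ceiling division
    paths = []
    dst = None
    with open(path, "r", encoding="utf-8") as src:
        for n, line in enumerate(src):
            if n % per_chunk == 0:
                # Start a new chunk file every `per_chunk` lines.
                if dst is not None:
                    dst.close()
                chunk_path = os.path.join(out_dir, f"chunk_{len(paths)}.dat")
                paths.append(chunk_path)
                dst = open(chunk_path, "w", encoding="utf-8")
            dst.write(line)
    if dst is not None:
        dst.close()
    return paths


def copy_chunk(table, chunk_path, dbname):
    """Load one chunk with psql's client-side \\copy; returns psql's exit code."""
    # Assumption: `table` and `chunk_path` are trusted, they are not escaped here.
    cmd = ["psql", "-d", dbname, "-c", f"\\copy {table} from '{chunk_path}'"]
    return subprocess.run(cmd, check=False).returncode


def parallel_copy(path, table, dbname, workers=4):
    """Split `path` and load the chunks concurrently, one psql per chunk."""
    with tempfile.TemporaryDirectory() as tmp:
        chunks = split_file(path, workers, tmp)
        if not chunks:
            return []
        # Threads only wait on the psql child processes; the real work
        # happens in the separate PostgreSQL backends.
        with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
            return list(pool.map(lambda c: copy_chunk(table, c, dbname), chunks))
```

A call such as `parallel_copy("rows.dat", "mytable", "mydb", workers=4)` would start up to four concurrent loads. Whether that helps depends on whether the single-stream COPY is CPU bound or I/O bound on the machine in question, which is the point raised earlier in the thread.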