Re: Best COPY Performance
From: Stefan Kaltenbrunner
Subject: Re: Best COPY Performance
Date:
Msg-id: 45462710.3000301@kaltenbrunner.cc
In reply to: Re: Best COPY Performance ("Luke Lonergan" <llonergan@greenplum.com>)
List: pgsql-performance
Luke Lonergan wrote:
> Stefan,
>
> On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <stefan@kaltenbrunner.cc> wrote:
>
>>> We've found that there is an ultimate bottleneck at about 12-14MB/s despite
>>> having sequential write to disk speeds of 100s of MB/s. I forget what the
>>> latest bottleneck was.
>>
>> I have personally managed to load a bit less than 400k rows/s (5 int
>> columns, no indexes) - on very fast disk hardware - at that point
>> postgresql is completely CPU bottlenecked (2.6GHz Opteron).
>
> 400,000 rows/s x 4 bytes/column x 5 columns/row = 8MB/s
>
>> Using multiple processes to load the data will help to scale up to about
>> 900k/s (4 processes on 4 cores).

Yes, I did that about half a year ago as part of the "CREATE INDEX on a 1.8B
row table" thread on -hackers, which resulted in some of the sorting
improvements in 8.2. I don't think there is much more possible in terms of
import speed by using more cores (at least not when importing into the same
table) - IIRC I was at nearly 700k rows/s with two cores and 850k rows/s
with three cores or so ...

> 18MB/s? Have you done this? I've not seen this much of an improvement
> before by using multiple COPY processes to the same table.
>
> Another question: how to measure MB/s - based on the input text file? On
> the DBMS storage size? We usually consider the input text file in the
> calculation of COPY rate.

Yeah, that is a good question (and part of the reason why I cited the
rows/sec number, btw.)

Stefan
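For reference, the throughput arithmetic quoted in the thread can be checked directly. This is just the back-of-the-envelope calculation from Luke's message, assuming 5 int4 columns at 4 bytes each:

```python
# Back-of-the-envelope COPY throughput, per the figures in the thread.
bytes_per_row = 4 * 5  # 4 bytes/column x 5 int columns

# Single-process rate reported by Stefan: ~400k rows/s
single = 400_000 * bytes_per_row / 1_000_000
print(single)   # 8.0 MB/s, matching Luke's estimate

# Four parallel COPY processes: ~900k rows/s combined
parallel = 900_000 * bytes_per_row / 1_000_000
print(parallel) # 18.0 MB/s, the figure Luke questions
```

Note this measures the in-memory integer payload, not the size of the input text file, which is exactly the ambiguity raised at the end of the thread.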