Re: insert vs select into performance
| От | Thomas Finneid |
|---|---|
| Тема | Re: insert vs select into performance |
| Дата | |
| Msg-id | 469E6518.5000203@ifi.uio.no обсуждение исходный текст |
| Ответ на | Re: insert vs select into performance (PFC <lists@peufeu.com>) |
| Список | pgsql-performance |
PFC wrote: >> Unfortunately its not fast enough, it needs to be done in no more than >> 1-2 seconds, ( and in production it will be maybe 20-50 columns of >> data, perhaps divided over 5-10 tables.) >> Additionally it needs to scale to perhaps three times as many columns >> and perhaps 2 - 3 times as many rows in some situation within 1 seconds. >> Further on it needs to allow for about 20 - 50 clients reading much of >> that data before the next batch of data arrives. > > Wow. What is the application ? Geological surveys, where they perform realtime geo/hydro-phone shots of areas of the size of 10x10km every 3-15 seconds. > test=> CREATE OR REPLACE FUNCTION test_insert( ) > RETURNS VOID > LANGUAGE plpgsql > AS > $$ > DECLARE > _i INTEGER; > BEGIN > FOR _i IN 0..100000 LOOP > INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i); > END LOOP; > END; > $$; > CREATE FUNCTION > Temps : 51,948 ms > > test=> SELECT test_insert(); > test_insert > ------------- > > (1 ligne) > > Temps : 1885,382 ms I tested this one and it took 4 seconds, compared to the jdbc insert which took 14 seconds, so its a lot faster. but not as fast as the SELECT INTO. I also tested an INSERT INTO FROM SELECT, which took 1.8 seconds, now we are starting to talk about real performance. > However COPY is much faster because the parsing overhead and > de-escaping of data is faster. COPY is optimized for throughput. > > So, advice : > > For optimum throughput, have your application build chunks of data > into text files and use COPY. Or if your client lib supports the copy > interface, use it. I did test COPY, i.e. the jdbc COPY patch for pg 8.1, it performs at approx 1.8 seconds :) The test was done with text input, I am going to test it with binary input, which I expect will increase the performance with 20-50%. All these test have ben performed on a laptop with a Kubuntu 6.10 version of pg 8.1 without any special pg performance tuning. So I expect that compiling lates pg and doing some tuning on it and testing it on the a representative server will give it an additional boost in performance. The key here is that with abundance in performance, I can experiment with the solution in a completely different way than if I had any "artificial" restrictions. > You will need a fast disk system with xlog and data on separate > disks, several CPU cores (1 insert thread will max out 1 core, use the > others for selects), lots of RAM so index updates don't need to seek, > and tuning of bgwriter and checkpoints to avoid load spikes. will have a look at it. regards thomas
В списке pgsql-performance по дате отправления: