Re: performance for high-volume log insertion
От | Thomas Kellerer |
---|---|
Тема | Re: performance for high-volume log insertion |
Дата | |
Msg-id | gso5hg$9or$1@ger.gmane.org обсуждение исходный текст |
Ответ на | Re: performance for high-volume log insertion (Stephen Frost <sfrost@snowman.net>) |
Ответы |
Re: performance for high-volume log insertion
|
Список | pgsql-performance |
Stephen Frost wrote on 22.04.2009 23:51: >>> What about 40000 individual prepared inserts? Just curious about it. >> 40000 inserts, one prepared statement each (constructing the prepared >> statement only once), in a single transaction: 1.68s >> >> I'm surprised that there's any win here at all. > > For a single column table, I wouldn't expect much either. With more > columns I think it would be a larger improvement. Out of curiosity I did some tests through JDBC. Using a single-column (integer) table, re-using a prepared statement took about 7 seconds to insert 100000 rows with JDBC's batch interface and a batch size of 1000 Using a prepared statement that had a 1000 (?) after the insert (insert into foo values (?), (?), ...) the insert took about 0.8 seconds. Quite an improvement I'd say. Then I switched to a three column table (int, varchar(500), varchar(500)). Insert using a preparedstatement with batch (first scenario) now was ~8.5 seconds, whereas the multi-value insert now took ~3 seconds. So the difference got smaller, but still was quite substantial. This was inserting relatively small strings (~20 characters) into the table When increasing the size of the inserted strings, things began to change. When I bumped the length of the strings to 70 and 200 characters, the multi-value insert slowed down considerably. Both solutions now took around 9 seconds. The multi-value solution ranged between 7 and 9 seconds, whereas the "regular" insert syntax was pretty constant at roughly 9 seconds (I ran it about 15 times). So it seems, that as the size of the row increases the multi-value insert loses its head-start compared to the "regular" insert. I also played around with batch size. Going beyond 200 didn't make a big difference. For the JDBC batch, the batch size was the number of rows after which I called executeBatch() for the multi-value insert, this was the number of tuples I sent in a single INSERT statement. The multi-value statement seems to perform better with lower "batch" sizes (~10-50) whereas the JDBC batching seems to be fastest with about 200 statements per batch. Thomas
В списке pgsql-performance по дате отправления: