Обсуждение: Populating a database with huge amounts of data

Поиск
Список
Период
Сортировка

Populating a database with huge amounts of data

От
Andreas Koch
Дата:
Hi,

I have to populate a database with a huge amount of data (over a
Network). While reading the Data from the Database using a single SELECT
statement takes only a few seconds, writing the Data with a series of
INSERT statements take over 10 Minutes. I think this is because the
INSERT statement plus the Data have to be send as ASCII over the network
and need much more bytes then sending only the data in binary form. 10
Minutes are by far too mutch time for my aplication to wait for the
transaction so I was looking for a faster way to insert the data into my
database. My first step was to disable Autocommit by using BEGIN at the
start and COMMIT at the End of sending the data via INSERT INTO. This
speeds up the process a little bit, but not enough. Now I'm trying to
use COPY FROM STDIN, but don't get it right. First I tryed :

Statement.execute("COPY table FROM STDIN");
String Data = new String("");
for(int i=0; ....) {
 Data = new String(Data+integer1[i]+"\t"+integer2[i]+"\n");
 }
Data = new String (Data+"\\.\n");
Statement.execute(Data);

This don't work because Statement.execute(Data) inserts a 'Q' at the
beginning of the Data String so that the first entry looks like Q7
instead of 7 and the copy statement returns an error because it expects
an iterger, not a string. So I tryed to put everything in one String :

String Data = new String("COPY table FROM STDIN; ");
for(int i=0; ....) {
 Data = new String(Data+integer1[i]+"\t"+integer2[i]+"\n");
 }
Data = new String (Data+"\\.\n");
Statement.execute(Data);

which also comes up with an error (parser error at or near
"[interger1[0]]").

I don't know if COPY FROM STDIN is the right way to speed up the
population of the database or why it don't work the way I tryed it. I
would be very happy for every help I can get.

Yours,

Andreas

Re: Populating a database with huge amounts of data

От
Kris Jurka
Дата:

On Tue, 1 Jul 2003, Andreas Koch wrote:

> Hi,
>
> I have to populate a database with a huge amount of data (over a
> Network). While reading the Data from the Database using a single SELECT
> statement takes only a few seconds, writing the Data with a series of
> INSERT statements take over 10 Minutes. I think this is because the
> INSERT statement plus the Data have to be send as ASCII over the network
> and need much more bytes then sending only the data in binary form. 10
> Minutes are by far too mutch time for my aplication to wait for the
> transaction so I was looking for a faster way to insert the data into my
> database. My first step was to disable Autocommit by using BEGIN at the
> start and COMMIT at the End of sending the data via INSERT INTO. This
> speeds up the process a little bit, but not enough. Now I'm trying to
> use COPY FROM STDIN, but don't get it right. First I tryed :
>
> Statement.execute("COPY table FROM STDIN");
> String Data = new String("");
> for(int i=0; ....) {
>  Data = new String(Data+integer1[i]+"\t"+integer2[i]+"\n");
>  }
> Data = new String (Data+"\\.\n");
> Statement.execute(Data);
>
> This don't work because Statement.execute(Data) inserts a 'Q' at the
> beginning of the Data String so that the first entry looks like Q7
> instead of 7 and the copy statement returns an error because it expects
> an iterger, not a string. So I tryed to put everything in one String :
>
> String Data = new String("COPY table FROM STDIN; ");
> for(int i=0; ....) {
>  Data = new String(Data+integer1[i]+"\t"+integer2[i]+"\n");
>  }
> Data = new String (Data+"\\.\n");
> Statement.execute(Data);
>
> which also comes up with an error (parser error at or near
> "[interger1[0]]").
>
> I don't know if COPY FROM STDIN is the right way to speed up the
> population of the database or why it don't work the way I tryed it. I
> would be very happy for every help I can get.
>

Mike Adler produces a patch to allow copy in the JDBC driver a while ago.
See http://archives.postgresql.org/pgsql-jdbc/2003-02/msg00345.php

Now that the 7.4dev series allows recovery from a copy error I'm more in
favor of this patch.

Kris Jurka