Обсуждение: ECPG multiple INSERTS or SELECTS in one command?

Поиск
Список
Период
Сортировка

ECPG multiple INSERTS or SELECTS in one command?

От
Дата:
I found a similar issue like this in the archives from 2000, but no
answer...

I'm seeing what seems like slow retrieval times over the network.  I am
retrieving a single field of about 100-120 characters per record.  I am
getting about 3 seconds per 1000 records - it takes 30 seconds to retrieve
10,000 records.  That's only about 36 KBytes/sec.

This is a 100BT switched network (not sure if it is vlan'd or through a
router).  Echo time averages 3ms. That works out to about the same rate as
the records are being retrieved - 3ms per record.

The back end is pretty much idle.  It shows 'idle in transaction'.  If I run
the program on the system containing the database, it runs close to 2 orders
of magnitude faster.

On the remote system:

05-08-2004.23:54:43 Records read: 10000
05-08-2004.23:55:17 Records read: 20000
05-08-2004.23:55:50 Records read: 30000
05-08-2004.23:56:22 Records read: 40000
05-08-2004.23:56:55 Records read: 50000
05-08-2004.23:57:32 Records read: 60000
05-08-2004.23:58:07 Records read: 70000
...

The code is an ecpg program like:
   EXEC SQL WHENEVER SQLERROR GOTO sql_error;   EXEC SQL WHENEVER NOT FOUND DO break;   EXEC SQL DECLARE message_cursor
CURSORFOR
 
       SELECT           file_name       FROM           messages       WHERE           system_key=(select system_key
fromsystems where
 
system_name=:systemName);
   EXEC SQL OPEN message_cursor;
   count = 0;   while (1) {       EXEC SQL FETCH message_cursor INTO           :fileNameDB;
       memcpy (tempstr, fileNameDB.arr, fileNameDB.len);       tempstr[fileNameDB.len] = '\0';
           [Action with tempstr removed for testing]
       count++;       if ( (count % 10000) == 0) logmsg ("Records read: %d", count);       }


My "outside looking in" observations seem to point to the fact that every
row has to be retrieved (or stored) with a separate request.  Network
latency, however small, becomes an issue when the volume is very high.

A (different) Pro*C program I recently ported from Oracle to PostgreSQL
showed this difference.  In Pro*C you can load an array with rows to insert,
then issue a single INSERT request passing it the array.  I believe the same
thing applies to FETCH.  As far as I can tell, in PostgreSQL ecpg (or other)
you have to execute one request per record.  The program SEGV'd all over the
place when I tried to use the Pro*C array code.  I had to do a major rework
of the code to remove the array logic.

Is there some way to batch insert/fetch requests?  How else can I improve
upon the performance?  I can find nothing in the ecpg documentation that
would indicate this is possible.  It appears that COPY works like this, but
you can't control what is returned and you have to know the column order.  I
need to retrieve about 10 million records - 10% of the database, so COPY is
not feasible.


Wes




Re: ECPG multiple INSERTS or SELECTS in one command?

От
Michael Meskes
Дата:
On Sun, May 09, 2004 at 06:36:44PM -0500, wespvp@syntegra.com wrote:
> A (different) Pro*C program I recently ported from Oracle to PostgreSQL
> showed this difference.  In Pro*C you can load an array with rows to insert,
> then issue a single INSERT request passing it the array.  I believe the same
> thing applies to FETCH.  As far as I can tell, in PostgreSQL ecpg (or other)
> you have to execute one request per record.  The program SEGV'd all over the
> place when I tried to use the Pro*C array code.  I had to do a major rework
> of the code to remove the array logic.

That is for insert right? Batching inserts is tricky as there is no way
for ecpg to see if your array of 10 ints shall be inserted as 10 ints or
as one array.

With fetch on the other hand it should work the same as with Pro*C.
There is no problem to fetch into an array.

Michael

-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: ECPG multiple INSERTS or SELECTS in one command?

От
Дата:
>> A (different) Pro*C program I recently ported from Oracle to PostgreSQL
>> showed this difference.  In Pro*C you can load an array with rows to insert,
>> then issue a single INSERT request passing it the array.  I believe the same
>> thing applies to FETCH.  As far as I can tell, in PostgreSQL ecpg (or other)
>> you have to execute one request per record.  The program SEGV'd all over the
>> place when I tried to use the Pro*C array code.  I had to do a major rework
>> of the code to remove the array logic.
> 
> That is for insert right?

Yes, correct.

> Batching inserts is tricky as there is no way
> for ecpg to see if your array of 10 ints shall be inserted as 10 ints or
> as one array.

Perhaps there is some way a flag could be passed as part of the SQL
statement to ecpg to differentiate this?  I suspect (but have no easy way of
proving) the lack of this feature may be why there is a 2:1 difference
between loading the same table with Oracle Pro*C and PostgreSQL (Oracle is
twice as fast).  With Oracle, I populate an array and do a batch load.  With
PostgreSQL, I have to load a row at a time.

> With fetch on the other hand it should work the same as with Pro*C.
> There is no problem to fetch into an array.

Someone on the Interfaces list sent me some sample code.  I had made the
rash assumption that since I had to convert all the Pro*C array (INSERT)
code to non-array, that array usage wasn't supported at all by ecpg.

Wes