Обсуждение: pgstats_initstats() cost

Поиск
Список
Период
Сортировка

pgstats_initstats() cost

От
Gavin Sherry
Дата:
I did some basic profiling of CVS HEAD after having read Bruce's post the
other day: When did we get so fast. It seemed to me that the number of
inserts per second wasn't actually all that high so I had a look at some
numbers:
 %   cumulative   self              self     totaltime   seconds   seconds    calls   s/call   s/call  name 7.32
3.48    3.48   100004     0.00     0.00  yyparse 4.42      5.58     2.10  1200107     0.00     0.00  SearchCatCache
4.10     7.53     1.95   900020     0.00     0.00  base_yylex 3.85      9.35     1.83   100150     0.00     0.00
XLogInsert3.51     11.02     1.67   600540     0.00     0.00  pgstat_initstats 2.24     12.09     1.07  2208418
0.00    0.00  hash_search 1.86     12.97     0.88  2607669     0.00     0.00  hash_any
 

I am still researching ways of increasing performance of yacc parsers --
there is a very small amount of information on the Web concerning this --
but pgstat_initstats() caught my eye. This gets called about 6 times per
insert (I did 100000 inserts) and the major cost appears to relate to the
linear pgStatTabstatMessages. The comparative performance of
hash_search() suggests that pgStatTabstatMessages may benefit from use of
a hash. However, it seems unreasonable that we're doing work at all in
pgstat_initstats() if the user is not interested in query/block/tuple
stats.

Comments? Have I missed something?

Thanks,

Gavin



Re: pgstats_initstats() cost

От
"Andrew Dunstan"
Дата:
----- Original Message ----- 
From: "Gavin Sherry" <swm@linuxworld.com.au>
> I am still researching ways of increasing performance of yacc parsers --
> there is a very small amount of information on the Web concerning this --

I know some people who will tell you that the best way of improving
performance in this area is not to use yacc (or bison) parsers ...

OTOH we need to understand exactly what you were profiling - if it is 1
dynamic sql statement per insert then it might not be too close to the real
world - a high volume program is likely to require 1 parse per many many
executions, isn't it?

cheers

andrew



Re: pgstats_initstats() cost

От
Gavin Sherry
Дата:
On Mon, 11 Aug 2003, Andrew Dunstan wrote:

> 
> ----- Original Message ----- 
> From: "Gavin Sherry" <swm@linuxworld.com.au>
> > I am still researching ways of increasing performance of yacc parsers --
> > there is a very small amount of information on the Web concerning this --
> 
> I know some people who will tell you that the best way of improving
> performance in this area is not to use yacc (or bison) parsers ...

Yes. Cost of maintenance vs. performance cost...

> 
> OTOH we need to understand exactly what you were profiling - if it is 1
> dynamic sql statement per insert then it might not be too close to the real
> world - a high volume program is likely to require 1 parse per many many
> executions, isn't it?

I wasn't interested in measuring the performance of yacc -- since I know
it is bad. It was a basic test which wasn't even meant to be real
world. It just seemed interesting that the numbers were three times slower
than other databases I ran it on. Here is the script which generates the
SQL:

echo "create table abc(t text);"
echo "begin;"
c=0
while [ $c -lt 100000 ]
do       echo "insert into abc values('thread1');";       c=$[$c+1]
done
echo "commit;"

Thanks,

Gavin



Re: pgstats_initstats() cost

От
Gavin Sherry
Дата:
On Tue, 12 Aug 2003, Rod Taylor wrote:

> > world. It just seemed interesting that the numbers were three times slower
> > than other databases I ran it on. Here is the script which generates the
> 
> You were comparing against databases with similar safety nets to
> guarantee against dataloss?

I am in the process of reading through the logging/versioning code of them
and the others definately do. My main interest is in determining how to
reduce the cost of pgstats_initstats().

Thanks,

Gavin



Re: pgstats_initstats() cost

От
Tom Lane
Дата:
Gavin Sherry <swm@linuxworld.com.au> writes:
> but pgstat_initstats() caught my eye. This gets called about 6 times per
> insert (I did 100000 inserts) and the major cost appears to relate to the
> linear pgStatTabstatMessages. The comparative performance of
> hash_search() suggests that pgStatTabstatMessages may benefit from use of
> a hash. However, it seems unreasonable that we're doing work at all in
> pgstat_initstats() if the user is not interested in query/block/tuple
> stats.

The coding in the search loop could perhaps be tightened a little, but
I'd think the last point should be addressed by dropping out via the
"no_stats" exit if stats aren't being gathered.

I doubt a hash is worth maintaining, because the active tabstat entries
should only be for tables that are being touched in the current command
(thus, there are not more than six in your example).  I'm not sure why
it takes so much time to look through six entries though ...
        regards, tom lane


Re: pgstats_initstats() cost

От
Gavin Sherry
Дата:
On Tue, 12 Aug 2003, Tom Lane wrote:

> Gavin Sherry <swm@linuxworld.com.au> writes:
> > but pgstat_initstats() caught my eye. This gets called about 6 times per
> > insert (I did 100000 inserts) and the major cost appears to relate to the
> > linear pgStatTabstatMessages. The comparative performance of
> > hash_search() suggests that pgStatTabstatMessages may benefit from use of
> > a hash. However, it seems unreasonable that we're doing work at all in
> > pgstat_initstats() if the user is not interested in query/block/tuple
> > stats.
> 
> The coding in the search loop could perhaps be tightened a little, but
> I'd think the last point should be addressed by dropping out via the
> "no_stats" exit if stats aren't being gathered.
> 
> I doubt a hash is worth maintaining, because the active tabstat entries
> should only be for tables that are being touched in the current command
> (thus, there are not more than six in your example).  I'm not sure why
> it takes so much time to look through six entries though ...

Neither. I might look into it further later, but here's a patch to exit
out of pgstat_initstats() if we're not collecting stats (attached).

Thanks,

Gavin

Re: pgstats_initstats() cost

От
Tom Lane
Дата:
I said:
> I doubt a hash is worth maintaining, because the active tabstat entries
> should only be for tables that are being touched in the current command
> (thus, there are not more than six in your example).  I'm not sure why
> it takes so much time to look through six entries though ...

I replicated your example, and soon found that in fact there were
forty-three active tabstat slots, which makes pgstat_initstats a little
bit more credible as a time-waster.

The reason why there were forty-three, in a statement that's only
touching one relation, is that with the present coding of pgstats,
if you have stats gathering off then there will be an active entry
for every relation that's been touched since backend launch.
pgstat_report_tabstat() should have flushed the entries, but
*if reporting is disabled then it fails to do so*.

This is clearly a bug.  Will fix.
        regards, tom lane


Re: pgstats_initstats() cost

От
Rod Taylor
Дата:
> world. It just seemed interesting that the numbers were three times slower
> than other databases I ran it on. Here is the script which generates the

You were comparing against databases with similar safety nets to
guarantee against dataloss?


concatenation = = transaction ?????

От
"sla-net"
Дата:
Hi everybody

It's my first post here, so be indulgent ;)

Just to confirm :  if i do

$sql1='insert into "Enfant"  ("NomEnfant","PrenomEnfant") VALUES
('.$NomEnfant.','.$PrenomEnfant.')

$sql2='insert into "IndividuEnfant" ("IdIndividu","IdEnfant") VALUES
('.$IdIndividu.',currval(\'"Enfant_Seq"\'));

$sql=$sql1.$sql2;

$Res=pg_query($sql);

so 2 query concatenation a a only one, send to postgress database : is it
EXACTLY the same thing than a classic transaction with a begin and a commit
or rollback  ?

Tcks, it's just i dont want to write everything again in 6 months.....

S.L.

PS : why limitation to 8 patrameters in stored procedures ??????



Re: pgstats_initstats() cost

От
Tom Lane
Дата:
Gavin Sherry <swm@linuxworld.com.au> writes:
> Neither. I might look into it further later, but here's a patch to exit
> out of pgstat_initstats() if we're not collecting stats (attached).

Applied along with other fixes.
        regards, tom lane


Re: concatenation = = transaction ?????

От
Josh Berkus
Дата:
Sla,

> PS : why limitation to 8 patrameters in stored procedures ??????

What version of PostgreSQL are you using?

The limit is 16 parameters for 7.1 and 7.2, raised to 32 parameters in 7.3 and 
after.   Further, you can raise the limit yourself at compile-time, although 
I understand it incurrs a penalty in index efficiency.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Gavin Sherry <swm@linuxworld.com.au> writes:
> I wasn't interested in measuring the performance of yacc -- since I know
> it is bad. It was a basic test which wasn't even meant to be real
> world. It just seemed interesting that the numbers were three times slower
> than other databases I ran it on. Here is the script which generates the
> SQL:

> echo "create table abc(t text);"
> echo "begin;"
> c=0
> while [ $c -lt 100000 ]
> do
>         echo "insert into abc values('thread1');";
>         c=$[$c+1]
> done
> echo "commit;"

Of course the obvious way of getting rid of the parser overhead is not
to parse everytime --- viz, to use prepared statements.

I have just finished running some experiments that compared a series of
INSERTs issued via PQexec() versus preparing an INSERT command and then
issuing new-FE-protocol Bind and Execute commands against the prepared
statement.  With a test case like the above (one target column and a
prepared statement like "insert into abc values($1)"), I saw about a 30%
speedup.  (Or at least I did after fixing a couple of bottlenecks in the
backend's per-client-message loop.)

Of course, the amount of work needed to parse this INSERT command is
pretty trivial.  With just a slightly more complex test case:create table abc (f1 text, f2 int, f3 float8);
and a prepared statement likePREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)
there was a factor of two difference in the speed.

This leaves us with a bit of a problem, though, because there isn't any
libpq API that allows access to this speedup.  I put in a routine to
support Parse/Bind/Execute so that people could use out-of-line
parameters for safety reasons --- but there's no function to do
Bind/Execute against a pre-existing prepared statement.  (I had to make
a hacked version of libpq to do the above testing.)

I'm beginning to think that was a serious omission.  I'm tempted to fix
it, even though we're past feature freeze for 7.4.  Comments?
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Rod Taylor
Дата:
On Tue, 2003-08-12 at 15:36, Tom Lane wrote:
> Gavin Sherry <swm@linuxworld.com.au> writes:
> > I wasn't interested in measuring the performance of yacc -- since I know
> > it is bad. It was a basic test which wasn't even meant to be real
> > world. It just seemed interesting that the numbers were three times slower
> > than other databases I ran it on. Here is the script which generates the
> > SQL:
>
> > echo "create table abc(t text);"
> > echo "begin;"
> > c=0
> > while [ $c -lt 100000 ]
> > do
> >         echo "insert into abc values('thread1');";
> >         c=$[$c+1]
> > done
> > echo "commit;"
>
> Of course the obvious way of getting rid of the parser overhead is not
> to parse everytime --- viz, to use prepared statements.
>
> I have just finished running some experiments that compared a series of
> INSERTs issued via PQexec() versus preparing an INSERT command and then
> issuing new-FE-protocol Bind and Execute commands against the prepared
> statement.  With a test case like the above (one target column and a
> prepared statement like "insert into abc values($1)"), I saw about a 30%
> speedup.  (Or at least I did after fixing a couple of bottlenecks in the
> backend's per-client-message loop.)
>
> Of course, the amount of work needed to parse this INSERT command is
> pretty trivial.  With just a slightly more complex test case:
>     create table abc (f1 text, f2 int, f3 float8);
> and a prepared statement like
>     PREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)
> there was a factor of two difference in the speed.

Do you happen to have any numbers comparing prepared inserts in a single
transaction against copy?


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Jon Jensen
Дата:
On Tue, 12 Aug 2003, Tom Lane wrote:

> I have just finished running some experiments that compared a series of
> INSERTs issued via PQexec() versus preparing an INSERT command and then
> issuing new-FE-protocol Bind and Execute commands against the prepared
> statement.  With a test case like the above (one target column and a
> prepared statement like "insert into abc values($1)"), I saw about a 30%
> speedup.  (Or at least I did after fixing a couple of bottlenecks in the
> backend's per-client-message loop.)
[snip]
> This leaves us with a bit of a problem, though, because there isn't any
> libpq API that allows access to this speedup.  I put in a routine to
> support Parse/Bind/Execute so that people could use out-of-line
> parameters for safety reasons --- but there's no function to do
> Bind/Execute against a pre-existing prepared statement.  (I had to make
> a hacked version of libpq to do the above testing.)
> 
> I'm beginning to think that was a serious omission.  I'm tempted to fix
> it, even though we're past feature freeze for 7.4.  Comments?

I think it would be well worth waiting for this feature. I often bind 
parameters in Perl's DBD::Pg, looking to the future when that doesn't just 
handle quoting, but also uses faster cached prepared queries. It'd be 
great to see that in 7.4.

Jon


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Sean Chittenden
Дата:
> This leaves us with a bit of a problem, though, because there isn't
> any libpq API that allows access to this speedup.  I put in a
> routine to support Parse/Bind/Execute so that people could use
> out-of-line parameters for safety reasons --- but there's no
> function to do Bind/Execute against a pre-existing prepared
> statement.  (I had to make a hacked version of libpq to do the above
> testing.)
> 
> I'm beginning to think that was a serious omission.  I'm tempted to
> fix it, even though we're past feature freeze for 7.4.  Comments?

On a quasi-similar note (and unless I've missed how to do this), you
can't create a cursor from a prepared statement, which I found
frustrating.  On frequently used queries, I've gotten in the habbit of
preparing the queries at connect time and then executing the query,
but with larger queries, it's problematic to not be able to use a
cursor in addition to the prepared statement.

-sc

-- 
Sean Chittenden


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Alvaro Herrera Munoz
Дата:
On Tue, Aug 12, 2003 at 03:36:07PM -0400, Tom Lane wrote:

> This leaves us with a bit of a problem, though, because there isn't any
> libpq API that allows access to this speedup.  I put in a routine to
> support Parse/Bind/Execute so that people could use out-of-line
> parameters for safety reasons --- but there's no function to do
> Bind/Execute against a pre-existing prepared statement.  (I had to make
> a hacked version of libpq to do the above testing.)
> 
> I'm beginning to think that was a serious omission.  I'm tempted to fix
> it, even though we're past feature freeze for 7.4.  Comments?

Please do.  It could be argued that the feature is already there, and that
the fact that there's no way to use it through libpq is actually a bug.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Hay que recordar que la existencia en el cosmos, y particularmente la
elaboraci�n de civilizaciones dentre de �l no son, por desgracia,
nada id�licas" (Ijon Tichy)


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Rod Taylor <rbt@rbt.ca> writes:
> Do you happen to have any numbers comparing prepared inserts in a single
> transaction against copy?

COPY is about a factor of 6 faster, it appears.  I got 11.8 versus
1.87 seconds for loading the same amount of data (with the 3-column
test table).  So COPY's not gonna go away any time soon.
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Sean Chittenden <sean@chittenden.org> writes:
>> I'm beginning to think that was a serious omission.  I'm tempted to
>> fix it, even though we're past feature freeze for 7.4.  Comments?

> On a quasi-similar note (and unless I've missed how to do this), you
> can't create a cursor from a prepared statement, which I found
> frustrating.

Hmm.  I'd be willing to see that put on TODO, but it strikes me as a
rather large change for post-feature-freeze; it'd require fooling around
in substantial parts of the backend.  (What I was suggesting above only
requires adding one or two routines to libpq, which is a lot less
invasive...)
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Stephen Frost
Дата:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Of course the obvious way of getting rid of the parser overhead is not
> to parse everytime --- viz, to use prepared statements.

I think this would be nice to have too...  On a similar note (I think
anyway) I wasn't able to find any functions for bulk dumps into memory
space from a select statement.  This would probably be alot more work
but I find it very useful with other databases I work on (mainly
Oracle).

You probably know but I'll quickly outline it to point out the
differences, as I see them, from the 'COPY' ability.  Basically the user
defines their own C structure and then malloc's an array of them.  The
user then tells the database the type, offset from start of structure
and the skip (size of structure) for each column returned by the select
statement.  The user can then do 'bulk' grabs with a single command into
the memory space allocated, doing more than one and changing the offsets
inbetween if more is returned than was initially allocated for.  The
user can realloc or allocate new segments and do their own handling of
the segments if they choose.

The same is true for 'insert' statements, in reverse, of course.  This
avoids alot of unnecssary parsing and type conversion (where possible).
This does give more work to the library since it has to be able to do
type conversions in some cases where the database type and the user
requested type differ.
Thanks,    Stephen

Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
> Rod Taylor <rbt@rbt.ca> writes:
>> Do you happen to have any numbers comparing prepared inserts in a single
>> transaction against copy?

> COPY is about a factor of 6 faster, it appears.  I got 11.8 versus
> 1.87 seconds for loading the same amount of data (with the 3-column
> test table).  So COPY's not gonna go away any time soon.

BTW, that was on a local Unix-socket connection.  On a network
connection, COPY would have a huge advantage because (a) it could
transfer multiple rows per packet, if your rows are not too long,
and (b) you would not have to wait for the server to answer back
before sending the next packet.

In theory you could pipeline INSERT commands like that too, but not
with libpq.
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Joe Conway
Дата:
Tom Lane wrote:
> I'm beginning to think that was a serious omission.  I'm tempted to fix
> it, even though we're past feature freeze for 7.4.  Comments?
> 

Seems pretty well isolated. If you're tallying votes, count me as a "yay".

Joe




Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> You probably know but I'll quickly outline it to point out the
> differences, as I see them, from the 'COPY' ability.  Basically the user
> defines their own C structure and then malloc's an array of them.  The
> user then tells the database the type, offset from start of structure
> and the skip (size of structure) for each column returned by the select
> statement.  The user can then do 'bulk' grabs with a single command into
> the memory space allocated, doing more than one and changing the offsets
> inbetween if more is returned than was initially allocated for.  The
> user can realloc or allocate new segments and do their own handling of
> the segments if they choose.

[shrug]  That seems like a substantial increase in API complexity for
at best marginal performance gains.  What does it gain for the user to
malloc space rather than libpq?
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> I'm beginning to think that was a serious omission.  I'm tempted to fix
>> it, even though we're past feature freeze for 7.4.  Comments?

> Seems pretty well isolated. If you're tallying votes, count me as a "yay".

Well, the early voting is running heavily to the 'yea' side.  Let's talk
about what to call the thing.

Pre-7.4 we had "PQexec" (synchronous) and "PQsendQuery" (asynchronous)
to send a plain text query string (Simple Query message in new-FE-protocol-
speak).  I added "PQexecParams" and "PQsendQueryParams" to do
Parse/Bind/Execute, choosing the names on the basis that being able to
pass out-of-line parameters was the primary new thing they were giving
the user.

A Bind/Execute facility would need a pair of routines with signatures
very similar to PQexecParams/PQsendQueryParams --- they'd take a
prepared statement name instead of a query string, and they'd not need
an array of parameter type OIDs, but otherwise the same.  I couldn't
come up with very good names for them though.  Best idea so far is
PQexecPrepared/PQsendQueryPrepared, but maybe someone has a better one?

We could also think about providing an interface to do just Parse,
although this is inessential since you can set up a prepared statement
by PQexec'ing a PREPARE command.
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Stephen Frost
Дата:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > You probably know but I'll quickly outline it to point out the
> > differences, as I see them, from the 'COPY' ability.  Basically the user
> > defines their own C structure and then malloc's an array of them.  The
> > user then tells the database the type, offset from start of structure
> > and the skip (size of structure) for each column returned by the select
> > statement.  The user can then do 'bulk' grabs with a single command into
> > the memory space allocated, doing more than one and changing the offsets
> > inbetween if more is returned than was initially allocated for.  The
> > user can realloc or allocate new segments and do their own handling of
> > the segments if they choose.
>
> [shrug]  That seems like a substantial increase in API complexity for
> at best marginal performance gains.  What does it gain for the user to
> malloc space rather than libpq?

Perhaps I'm just used to dealing with very large selects..   When
dealing with Oracle it's been my experiance that it doesn't grab the
entire select return and store in local memory for the user to puruse
using the appropriate calls.  It grabs a portion and stores it in a
local cache and then gets more from the server periodically as the user
requests more.  This avoids having the library malloc'ing a very large
amount of memory when a very large query is done.  Doing the 'bulk'
transfer avoids the cache entirely and just dumps the data from the
server into the user's allocated memory area.  The user can then step
through it, gain what they need from it, throw it out and get the next
batch.  If libpq grabs the entire result in one go then that may
actually cause a problem for me when I start to move things from Oracle
to postgres since the clients don't always have much memory available.

Otherwise it would just avoid the function calls to libpq in the loop
which passes over the data.  That probably wouldn't be that much of a
gain if libpq has all the data local though.
Stephen

Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> ... If libpq grabs the entire result in one go then that may
> actually cause a problem for me when I start to move things from Oracle
> to postgres since the clients don't always have much memory available.

It does that in an ordinary SELECT.  The customary answer to this has
been "use a cursor and FETCH the number of rows you want to process at
a time".
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Joe Conway
Дата:
Tom Lane wrote:
> A Bind/Execute facility would need a pair of routines with signatures
> very similar to PQexecParams/PQsendQueryParams --- they'd take a
> prepared statement name instead of a query string, and they'd not need
> an array of parameter type OIDs, but otherwise the same.  I couldn't
> come up with very good names for them though.  Best idea so far is
> PQexecPrepared/PQsendQueryPrepared, but maybe someone has a better one?

Those sound reasonable to me.

> We could also think about providing an interface to do just Parse,
> although this is inessential since you can set up a prepared statement
> by PQexec'ing a PREPARE command.
> 

Agreed -- that sounds like a nice-to-have for some future release, but 
not enough of an issue to warrant getting put into 7.4 at this point.

Joe





Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Gavin Sherry
Дата:
On Tue, 12 Aug 2003, Tom Lane wrote:

> This leaves us with a bit of a problem, though, because there isn't any
> libpq API that allows access to this speedup.  I put in a routine to
> support Parse/Bind/Execute so that people could use out-of-line
> parameters for safety reasons --- but there's no function to do
> Bind/Execute against a pre-existing prepared statement.  (I had to make
> a hacked version of libpq to do the above testing.)
> 
> I'm beginning to think that was a serious omission.  I'm tempted to fix
> it, even though we're past feature freeze for 7.4.  Comments?

Can you give an example of this usage of this API? I am wondering whether
releasing this specific feature would be eclipsed by a generalised bound
variables solution in a future release... still, that's a nice speed up
:-).

Thanks,

Gavin



Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Gavin Sherry <swm@linuxworld.com.au> writes:
> On Tue, 12 Aug 2003, Tom Lane wrote:
>> I'm beginning to think that was a serious omission.  I'm tempted to fix
>> it, even though we're past feature freeze for 7.4.  Comments?

> Can you give an example of this usage of this API? I am wondering whether
> releasing this specific feature would be eclipsed by a generalised bound
> variables solution in a future release... still, that's a nice speed up
> :-).

Attached is the test code I was using to compare speeds.  It won't do
you much good without the accompanying libpq mods, but it's enough to
illustrate the usage.  (This is a hacked version of example program 3 from
the 7.4 libpq docs.)

        regards, tom lane


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include "libpq-fe.h"

/* for ntohl/htonl */
#include <netinet/in.h>
#include <arpa/inet.h>


static void
exit_nicely(PGconn *conn)
{PQfinish(conn);exit(1);
}

int
main(int argc, char **argv)
{const char *conninfo;PGconn       *conn;PGresult   *res;const char *paramValues[3];int tries, ntries;
/* * If the user supplies a parameter on the command line, use it as the * conninfo string; otherwise default to
settingdbname=template1 and * using environment variables or defaults for all other connection * parameters. */if (argc
>1)    conninfo = argv[1];else    conninfo = "dbname = template1";
 
if (argc > 2)    ntries = atoi(argv[2]);else    ntries = 10;
/* Make a connection to the database */conn = PQconnectdb(conninfo);
/* Check to see that the backend connection was successfully made */if (PQstatus(conn) != CONNECTION_OK){
fprintf(stderr,"Connection to database '%s' failed.\n", PQdb(conn));    fprintf(stderr, "%s", PQerrorMessage(conn));
exit_nicely(conn);}
res = PQexec(conn, "PREPARE mystmt(text,int,float8) AS insert into abc values($1,$2,$3)");if (PQresultStatus(res) !=
PGRES_COMMAND_OK){   fprintf(stderr, "PREPARE failed: %s", PQerrorMessage(conn));    PQclear(res);
exit_nicely(conn);}PQclear(res);
res = PQexec(conn, "BEGIN");if (PQresultStatus(res) != PGRES_COMMAND_OK){    fprintf(stderr, "BEGIN failed: %s",
PQerrorMessage(conn));   PQclear(res);    exit_nicely(conn);}PQclear(res);
 
for (tries = 0; tries < ntries; tries++){
#if 0res = PQexec(conn, "insert into abc values('joe''s place',22,123.4)");
#endif

#if 0/* Here are our out-of-line parameter values */paramValues[0] = "joe's place";paramValues[1] = "22";paramValues[2]
="123.4";
 
res = PQexecParams(conn,                   "insert into abc values($1,$2,$3)",                   3,        /* 3 params
*/                  NULL,    /* let the backend deduce param type */                   paramValues,
NULL,   /* don't need param lengths since text */                   NULL,    /* default to all text params */
       1);        /* ask for binary results */
 
#endif

#if 1/* Here are our out-of-line parameter values */paramValues[0] = "joe's place";paramValues[1] = "22";paramValues[2]
="123.4";
 
res = PQexecPrepared(conn,                   "mystmt",                   3,        /* 3 params */
paramValues,                  NULL,    /* don't need param lengths since text */                   NULL,    /* default
toall text params */                   1);        /* ask for binary results */
 
#endif
if (PQresultStatus(res) != PGRES_COMMAND_OK){    fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));
PQclear(res);   exit_nicely(conn);}
 
PQclear(res);}
res = PQexec(conn, "COMMIT");if (PQresultStatus(res) != PGRES_COMMAND_OK){    fprintf(stderr, "COMMIT failed: %s",
PQerrorMessage(conn));   PQclear(res);    exit_nicely(conn);}PQclear(res);
 
/* close the connection to the database and cleanup */PQfinish(conn);
return 0;
}


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
"Christopher Kings-Lynne"
Дата:
> This leaves us with a bit of a problem, though, because there isn't any
> libpq API that allows access to this speedup.  I put in a routine to
> support Parse/Bind/Execute so that people could use out-of-line
> parameters for safety reasons --- but there's no function to do
> Bind/Execute against a pre-existing prepared statement.  (I had to make
> a hacked version of libpq to do the above testing.)
> 
> I'm beginning to think that was a serious omission.  I'm tempted to fix
> it, even though we're past feature freeze for 7.4.  Comments?

I say fix it...

Chris



Re: Parsing speed (was Re: pgstats_initstats() cost)

От
"Christopher Kings-Lynne"
Дата:
> We could also think about providing an interface to do just Parse,
> although this is inessential since you can set up a prepared statement
> by PQexec'ing a PREPARE command.

Wait just a minute!  phpPgAdmin would love to be able to 'parse' arbitrary
sql entered by the user to separate semi-coloned queries, identify various
types of queries, etc.  What would a Parse call allow us to do?

Chris



Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
>> We could also think about providing an interface to do just Parse,
>> although this is inessential since you can set up a prepared statement
>> by PQexec'ing a PREPARE command.

> Wait just a minute!  phpPgAdmin would love to be able to 'parse' arbitrary
> sql entered by the user to separate semi-coloned queries, identify various
> types of queries, etc.  What would a Parse call allow us to do?

Hm.  I was about to say "very little that you can't do with a PREPARE",
but if you don't want to even count semicolons then Parse would be
distinctly safer.  For example if the string isSELECT * FROM foo; UPDATE foo SET ...
then sticking a PREPARE in front would not have the desired effect ---
but sending it in a Parse message would result in a syntax error.
Not sure if that helps you get to your goal though.        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Christopher Kings-Lynne
Дата:
> > Wait just a minute!  phpPgAdmin would love to be able to 'parse' arbitrary
> > sql entered by the user to separate semi-coloned queries, identify various
> > types of queries, etc.  What would a Parse call allow us to do?
>
> Hm.  I was about to say "very little that you can't do with a PREPARE",
> but if you don't want to even count semicolons then Parse would be
> distinctly safer.  For example if the string is
>     SELECT * FROM foo; UPDATE foo SET ...
> then sticking a PREPARE in front would not have the desired effect ---
> but sending it in a Parse message would result in a syntax error.
> Not sure if that helps you get to your goal though.

What do you actually get back from a Parse request?

Chris




Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> What do you actually get back from a Parse request?

Nothing.  If successful, it creates a prepared statement inside the
server.

It might possibly make sense for a libpq routine that exposes Parse
to actually do Parse followed by Describe Statement; that would allow
it to give back (a) an indication of the number and types of parameters
needed by the statement, and (b) an indication of the column set to be
returned, if it's a SELECT.  However, the protocol doesn't tell anything
about the type of a non-SELECT statement.  In any case, this would
require more invention and coding than I care to do at this point in
the release cycle (since there's no support in the guts of libpq for
accepting ParameterDescription messages from the backend).  If that's
what we think we want, we'd better put it on the wish-list for 7.5.
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Bruce Momjian
Дата:
Is there a TODO here?  Text?

---------------------------------------------------------------------------

Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> > What do you actually get back from a Parse request?
> 
> Nothing.  If successful, it creates a prepared statement inside the
> server.
> 
> It might possibly make sense for a libpq routine that exposes Parse
> to actually do Parse followed by Describe Statement; that would allow
> it to give back (a) an indication of the number and types of parameters
> needed by the statement, and (b) an indication of the column set to be
> returned, if it's a SELECT.  However, the protocol doesn't tell anything
> about the type of a non-SELECT statement.  In any case, this would
> require more invention and coding than I care to do at this point in
> the release cycle (since there's no support in the guts of libpq for
> accepting ParameterDescription messages from the backend).  If that's
> what we think we want, we'd better put it on the wish-list for 7.5.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Bruce Momjian
Дата:
Added to TODO:
* Allow PREPARE of cursors


---------------------------------------------------------------------------

Tom Lane wrote:
> Sean Chittenden <sean@chittenden.org> writes:
> >> I'm beginning to think that was a serious omission.  I'm tempted to
> >> fix it, even though we're past feature freeze for 7.4.  Comments?
> 
> > On a quasi-similar note (and unless I've missed how to do this), you
> > can't create a cursor from a prepared statement, which I found
> > frustrating.
> 
> Hmm.  I'd be willing to see that put on TODO, but it strikes me as a
> rather large change for post-feature-freeze; it'd require fooling around
> in substantial parts of the backend.  (What I was suggesting above only
> requires adding one or two routines to libpq, which is a lot less
> invasive...)
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
"Christopher Kings-Lynne"
Дата:
> > It might possibly make sense for a libpq routine that exposes Parse
> > to actually do Parse followed by Describe Statement; that would allow
> > it to give back (a) an indication of the number and types of parameters
> > needed by the statement, and (b) an indication of the column set to be
> > returned, if it's a SELECT.  However, the protocol doesn't tell anything
> > about the type of a non-SELECT statement.  In any case, this would
> > require more invention and coding than I care to do at this point in
> > the release cycle (since there's no support in the guts of libpq for
> > accepting ParameterDescription messages from the backend).  If that's
> > what we think we want, we'd better put it on the wish-list for 7.5.

If we had a Parse function, then we at phpPgAdmin could allow Reports to
contain parameters, and detect as such, and then when they run their report,
they can enter the values for that run.

Chris



Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Tom Lane
Дата:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
>> ...  If that's
>> what we think we want, we'd better put it on the wish-list for 7.5.

> If we had a Parse function, then we at phpPgAdmin could allow Reports to
> contain parameters, and detect as such, and then when they run their report,
> they can enter the values for that run.

If you want to do that, then I think you need the whole nine yards
including DescribeStatement support.  Too late for 7.4, but let's make
it a TODO for 7.5:

* Add a libpq function to support Parse/DescribeStatement capability
        regards, tom lane


Re: Parsing speed (was Re: pgstats_initstats() cost)

От
Bruce Momjian
Дата:
Tom Lane wrote:
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> >> ...  If that's
> >> what we think we want, we'd better put it on the wish-list for 7.5.
> 
> > If we had a Parse function, then we at phpPgAdmin could allow Reports to
> > contain parameters, and detect as such, and then when they run their report,
> > they can enter the values for that run.
> 
> If you want to do that, then I think you need the whole nine yards
> including DescribeStatement support.  Too late for 7.4, but let's make
> it a TODO for 7.5:
> 
> * Add a libpq function to support Parse/DescribeStatement capability

Added.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073