Discussion: A PQsendQuery API design for your review
Hi,

I've got users who write and submit arbitrary sequences of SQL to the
server, as supported by psql, phpPgAdmin, pgAdmin, etc. I'd like to
see this sort of functionality in psycopg2.

Someday I may submit a code patch. In the meantime I'd appreciate any
review of an API design that supports such functionality. (Attached is
a patch to the docs if you'd prefer to look at it that way.)

Thanks for the work.

----------------------------------------------------
The "cursor" class
******************

<snip>

-[ Commands execution methods ]-

<snip>

executequery([buffered=True])

   Execute the next database operation (query or command) submitted to
   the server with "submitqueries()".

   "buffered" determines whether all query output is collected in a
   buffer by the client before delivery to the application, or whether
   results are returned to the application as they are received from
   the server. (The "cursor.execute()" and "cursor.executemany()"
   methods always buffer.)

   "True" is returned if a database operation was executed. "None" is
   returned if all submitted database operations have been executed.

   If a query was executed, the returned values can be retrieved using
   the "fetch*()" methods.

   DB API extension: The "executequery()" method is a Psycopg
   extension to the DB API 2.0.

<snip>

submitqueries(operations)

   Submit a string containing one or more database operations (queries
   or commands) to the database server for execution.

   Warning: Never, **never**, **NEVER** use Python string
   concatenation ("+") or string parameters interpolation ("%") to
   pass variables to a SQL query string. See Passing parameters to SQL
   queries. Always use the "mogrify()" method to construct the
   "operations" string if the SQL contains variable data.

   This method is primarily useful for executing large reports,
   pre-packaged collections of SQL reports, and arbitrary
   user-supplied SQL.

   Use the "executequery()" method to execute the submitted SQL.

   DB API extension: The "submitqueries()" method is a Psycopg
   extension to the DB API 2.0.

<snip>

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
Attachments
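For concreteness, a sketch of how the proposed cursor methods might be driven. The `StubCursor` class below is purely illustrative (it fakes server results with canned data rather than talking to PostgreSQL); it only demonstrates the proposed return-value contract: `executequery()` returns `True` while submitted operations remain and `None` once they are exhausted.

```python
class StubCursor:
    """Illustrative stub of the proposed submitqueries()/executequery() API.

    A real implementation would pass the whole SQL string to libpq's
    PQsendQuery() and pull one PGresult per statement; here canned
    result sets stand in for server replies.
    """

    def __init__(self):
        self._pending = []   # result sets not yet "executed"
        self._rows = None    # rows of the operation just executed

    def submitqueries(self, operations):
        # 'operations' is a list of canned result sets standing in for
        # the multi-statement SQL string of the proposed API.
        self._pending = list(operations)

    def executequery(self):
        if not self._pending:
            return None      # all submitted database operations executed
        self._rows = self._pending.pop(0)
        return True          # a database operation was executed

    def fetchall(self):
        return self._rows


# Drive the proposed API: one executequery() call per submitted statement.
cur = StubCursor()
cur.submitqueries([[(1,)], [(2,)]])
results = []
while cur.executequery():
    results.append(cur.fetchall())
# results == [[(1,)], [(2,)]]
```

The point of the loop shape is that the application need not know in advance how many statements the user's SQL contained.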
On Feb 2, 2016, at 9:49 AM, "Karl O. Pinc" <kop@meme.com> wrote:

> I've got users who write and submit arbitrary sequences of SQL
> to the server, as supported by psql, phpPgAdmin, pgAdmin, etc.

Note that those are all applications, rather than interface libraries.

> I'd like to see this sort of functionality in psycopg2.

The client of psycopg2 can do this already, of course.

--
-- Christophe Pettus
   xof@thebuild.com
On Tue, 2 Feb 2016 09:52:33 -0800
Christophe Pettus <xof@thebuild.com> wrote:

> On Feb 2, 2016, at 9:49 AM, "Karl O. Pinc" <kop@meme.com> wrote:
>
> > I've got users who write and submit arbitrary sequences of SQL
> > to the server, as supported by psql, phpPgAdmin, pgAdmin, etc.
>
> Note that those are all applications, rather than interface
> libraries.

Yes. And I take it back. phpPgAdmin does not support the submission of
arbitrary sequences of SQL. Or rather it does, but you can only get
the results of the last query. This is due to a limitation in the PHP
PG API.

> > I'd like to see this sort of functionality in psycopg2.
>
> The client of psycopg2 can do this already, of course.

How? In order to submit multiple statements to libpq and get back
results for all of them, PQsendQuery() must be called.
cursor.execute() will return results only from the last query
executed, like calling PQexec(). Or am I wrong?

Meanwhile, I don't see PQsetSingleRowMode() called anywhere in
psycopg2. I figured as long as I was making an API I'd provide a way
to get rid of the client-side buffering. I figure that's a useful
feature.

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On Feb 2, 2016, at 10:04 AM, Karl O. Pinc <kop@meme.com> wrote:

> How? In order to submit multiple statements to libpq
> and get back results for all of them PQsendQuery() must
> be called.

The same way the other clients do it; split the text into queries and
send them over.

--
-- Christophe Pettus
   xof@thebuild.com
On Tue, 2 Feb 2016 10:15:40 -0800
Christophe Pettus <xof@thebuild.com> wrote:

> On Feb 2, 2016, at 10:04 AM, Karl O. Pinc <kop@meme.com> wrote:
>
> > How? In order to submit multiple statements to libpq
> > and get back results for all of them PQsendQuery() must
> > be called.
>
> The same way the other clients do it; split the text into queries and
> send them over.

That requires my application contain an SQL parser. This seems
onerous. The whole point is that I'm getting unfiltered SQL directly
from a user. Multiple statements. "Splitting the text into queries" is
non-trivial.

I can call libpq directly. First PQsendQuery(), then repeatedly call
PQgetResult() (optionally calling PQsetSingleRowMode()). But I like
the psycopg2 API. Seems like it ought to be able to make these calls
for me.

By the by. The other clients probably don't split the text into
queries. psql does, but to do it it uses the SQL parser internal to
pg. phpPgAdmin does not. And I don't know what pgAdmin does. My point
in mentioning these interfaces is that they all allow the user to
submit arbitrary SQL. And the only one that does it "right" is psql,
and it's got "special access" to a SQL parser.

psql is the only client I know of that delivers query results when
multiple queries are supplied (on stdin, say). I want to be able to
write a similar application.

(Thanks for taking your time to look at this by the way.)

(And, oops. The "buffered" argument must be to the submitqueries()
method in my design.)

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
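The libpq call sequence Karl outlines (one PQsendQuery() for the whole string, then PQgetResult() repeatedly until it returns NULL) can be mimicked with Python stand-ins. The stub functions below are hypothetical placeholders, not real bindings; a real program would reach libpq through a C extension or ctypes:

```python
# Hypothetical stand-ins for the libpq calls named above, used only to
# show the shape of the result loop.
_server_replies = [[(1,)], [(2,)], None]  # one result set per statement, then NULL

def PQsendQuery(conn, sql):
    # libpq accepts the entire multi-statement string in one call.
    return 1  # nonzero means the query was dispatched successfully

def PQgetResult(conn):
    # Returns the next PGresult, or None (NULL) when all are consumed.
    return _server_replies.pop(0)

conn = object()  # placeholder for a PGconn
all_results = []
if PQsendQuery(conn, "select 1; select 2;"):
    while True:
        res = PQgetResult(conn)
        if res is None:
            break  # NULL: every submitted statement has been processed
        all_results.append(res)
# all_results == [[(1,)], [(2,)]] -- one result set per submitted statement
```

Note the key property Karl is after: the server, not the client, decides where one statement ends and the next begins.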
Hi,
I can say with some confidence that we'll never implement such an API.
Even the UX, i.e., accepting multi-queries from the user, is wrong,
IMHO. Let the user write single queries, organize them, execute them
using bound variables and NOT mogrify(), and then return the result.
Anything less is lazy.
federico
--
Sent via psycopg mailing list (psycopg@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/psycopg
On 02/02/2016 10:34 AM, Karl O. Pinc wrote:
> On Tue, 2 Feb 2016 10:15:40 -0800
> Christophe Pettus <xof@thebuild.com> wrote:
>
>> On Feb 2, 2016, at 10:04 AM, Karl O. Pinc <kop@meme.com> wrote:
>>
>>> How? In order to submit multiple statements to libpq
>>> and get back results for all of them PQsendQuery() must
>>> be called.
>>
>> The same way the other clients do it; split the text into queries
>> and send them over.
>
> That requires my application contain an SQL parser. This seems
> onerous. The whole point is that I'm getting unfiltered SQL directly
> from a user. Multiple statements. "Splitting the text into queries"
> is non-trivial.

How about using the statement terminator and doing:

sql_str = 'select * from cell_per; select count(*) from cell_per;'
sql_str.split(';')
['select * from cell_per', ' select count(*) from cell_per', '']

for sql in sql_str.split(';'):
    if sql:
        cur.execute(sql)
        rs = cur.fetchall()
        print(rs)

> I can call libpq directly. First PQsendQuery(), then repeatedly call
> PQgetResult() (optionally calling PQsetSingleRowMode()). But I like
> the psycopg2 API. Seems like it ought to be able to make these calls
> for me.
>
> By the by. The other clients probably don't split the text into
> queries. psql does, but to do it it uses the SQL parser internal to
> pg. phpPgAdmin does not. And I don't know what pgAdmin does. My
> point in mentioning these interfaces is that they all allow the user
> to submit arbitrary SQL. And the only one that does it "right" is
> psql, and it's got "special access" to a SQL parser.
>
> psql is the only client I know of that delivers query results when
> multiple queries are supplied (on stdin, say). I want to be able to
> write a similar application.
>
> (Thanks for taking your time to look at this by the way.)
>
> (And, oops. The "buffered" argument must be to the submitqueries()
> method in my design.)
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software: "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

--
Adrian Klaver
adrian.klaver@aklaver.com
On Feb 2, 2016, at 11:01 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> sql_str = 'select * from cell_per; select count(*) from cell_per;'

sql_str = "select * from thing where a like ';'; select count(*) from thing;"

--
-- Christophe Pettus
   xof@thebuild.com
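Christophe's counterexample is easy to check: a plain `split(';')` cannot tell a statement terminator from a semicolon inside a string literal, so the first "statement" it produces ends in the middle of the literal:

```python
# Naive splitting on ';' breaks when a literal contains a semicolon.
sql = "select * from thing where a like ';'; select count(*) from thing"
parts = [p.strip() for p in sql.split(';') if p.strip()]
# parts[0] is "select * from thing where a like '" -- torn off mid-literal,
# and a stray "'" fragment appears as its own bogus "statement".
```

Executing any of these fragments would raise a syntax error, which is the hole in the simple `split(';')` approach.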
On Tue, 02 Feb 2016 19:58:16 +0100
Federico Di Gregorio <fog@dndg.it> wrote:

> I can say with some confidence that we'll never implement such an
> API. Even the UX, i.e., accepting multi-queries from the user, is
> wrong, IMHO. Let the user write single queries, organize them,
> execute them using bound variables and NOT mogrify(), and then
> return the result. Anything less is lazy.

I don't think you've a complete picture of my users, what they're
doing, and why it is useful to submit multiple queries at once.

I support these people:
http://amboselibaboons.nd.edu/
http://gombechimpanzees.org/

Who use interfaces like:
http://papio.biology.duke.edu/babasewiki/MyStartingPage
https://gombemi.ccas.gwu.edu/gombemiwiki/Main_Page

My users are scientists, from undergraduates to senior researchers and
American Academy of Science members, with a wide variety of SQL
expertise. They have direct access to their databases, writing SQL to
explore their data. They want a simple, comprehensive interface. (One
that not only provides support for SQL but also allows the
unsophisticated to create their own tables, etc. Which is neither here
nor there.)

They repeatedly try different queries and compare results, iterating
until they tease the data apart. They might write a query, run it,
modify it, and then run both the modified and the original version.
They want to see the results in a single window so they can compare.
They want to be able to edit and tweak either or both and hit a
"resubmit" button and again review the results.

Some are sophisticated SQL users; regardless, most don't want to think
about SQL. It's easier for them to cut and paste the same query 3
times and tweak the SQL to get 3 results (adult males, adult females,
juveniles) than to try to write a single query that does a GROUP BY.
But they do want to be able to see and compare the results of the 3
queries.

They use SQL to post-validate their data and check for sanity after a
bulk data upload. They have a file containing from a few to 10s of SQL
queries that they cut and paste into a "SQL window" and run. Usually
many of the queries produce a single number as output. Then they
examine the output and see if the numbers add up.

They use sets of SELECT INTO queries to copy, or pre-join, or
summarize data in the master schema, moving it into their own schema
for further analysis. They again use cut-and-paste to replay these
operations when there's new data from a fresh bulk data upload. (They
likewise have full permissions, within their own schemas, to create
whatever tables they need to support their analysis.)

Etc.

Of course everything _could_ be done by submitting a single query at a
time. (Or an interface that requires such parsing by the end-user, and
then runs them in bulk. Or some sort of arbitrary file format that
contains easily parse-able statement delimiters. Etc.) But an
interface that allows submission of multiple statements in a single
go, and review of all the results at once, greatly simplifies their
work flow. (And it's nice that their interface has point-and-click
table creation, deletion, etc.)

I'm surprised (Shocked I say! Shocked! :-) that you'd want to place
arbitrary restrictions on the user interfaces that can be created with
psycopg2. And Sir! I must take offense at the implication that my
users are lazy. They work hard to press the semicolon key to end their
statements! ;-)

Seriously, the real world is complex. If PG allows me to create a user
interface where the SQL parsing into separate statements is done by PG
then psycopg2 should let me do the same. There's a reason you can pipe
multiple SQL statements to psql without having to pre-parse them into
separate statements.

If you really want to reject the idea of accepting multiple queries
from the user let me know and I'll not pursue the matter.

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On Tue, 2 Feb 2016 14:04:08 -0600
"Karl O. Pinc" <kop@meme.com> wrote:

> On Tue, 02 Feb 2016 19:58:16 +0100
> Federico Di Gregorio <fog@dndg.it> wrote:
>
> > I can say with some confidence that we'll never implement such an
> > API. Even the UX, i.e., accepting multi-queries from the user, is
> > wrong, IMHO.
>
> Seriously, the real world is complex.

(Reminds me of an interaction I had with the Mozilla people. They'd
changed Firefox so that a response had to show up within 5 minutes. My
folks occasionally write queries that run for hours. Mozilla changed
their mind. Hope you will too.)

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On 02/02/2016 12:04 PM, Karl O. Pinc wrote:
> On Tue, 02 Feb 2016 19:58:16 +0100
> Federico Di Gregorio <fog@dndg.it> wrote:
>
>> I can say with some confidence that we'll never implement such an
>> API. Even the UX, i.e., accepting multi-queries from the user, is
>> wrong, IMHO. Let the user write single queries, organize them,
>> execute them using bound variables and NOT mogrify(), and then
>> return the result. Anything less is lazy.
>
> I don't think you've a complete picture of my users, what they're
> doing, and why it is useful to submit multiple queries at once.
>
> I support these people:
> http://amboselibaboons.nd.edu/
> http://gombechimpanzees.org/
>
> Who use interfaces like:
> http://papio.biology.duke.edu/babasewiki/MyStartingPage
> https://gombemi.ccas.gwu.edu/gombemiwiki/Main_Page
>
> My users are scientists, from undergraduates to senior researchers
> and American Academy of Science members, with a wide variety of SQL
> expertise. They have direct access to their databases, writing SQL
> to explore their data. They want a simple, comprehensive interface.
> (One that not only provides support for SQL but also allows the
> unsophisticated to create their own tables, etc. Which is neither
> here nor there.)
>
> They repeatedly try different queries and compare results, iterating
> until they tease the data apart. They might write a query, run it,
> modify it, and then run both the modified and the original version.
> They want to see the results in a single window so they can compare.
> They want to be able to edit and tweak either or both and hit a
> "resubmit" button and again review the results.
>
> Some are sophisticated SQL users; regardless, most don't want to
> think about SQL. It's easier for them to cut and paste the same
> query 3 times and tweak the SQL to get 3 results (adult males, adult
> females, juveniles) than to try to write a single query that does a
> GROUP BY. But they do want to be able to see and compare the results
> of the 3 queries.
>
> They use SQL to post-validate their data and check for sanity after
> a bulk data upload. They have a file containing from a few to 10s of
> SQL queries that they cut and paste into a "SQL window" and run.
> Usually many of the queries produce a single number as output. Then
> they examine the output and see if the numbers add up.
>
> They use sets of SELECT INTO queries to copy, or pre-join, or
> summarize data in the master schema, moving it into their own schema
> for further analysis. They again use cut-and-paste to replay these
> operations when there's new data from a fresh bulk data upload.
> (They likewise have full permissions, within their own schemas, to
> create whatever tables they need to support their analysis.)

I think what they are looking for is this:

http://ipython.org/

and this:

http://pandas.pydata.org/

> Etc.
>
> Of course everything _could_ be done by submitting a single query at
> a time. (Or an interface that requires such parsing by the end-user,
> and then runs them in bulk. Or some sort of arbitrary file format
> that contains easily parse-able statement delimiters. Etc.) But an
> interface that allows submission of multiple statements in a single
> go, and review of all the results at once, greatly simplifies their
> work flow. (And it's nice that their interface has point-and-click
> table creation, deletion, etc.)
>
> I'm surprised (Shocked I say! Shocked! :-) that you'd want to place
> arbitrary restrictions on the user interfaces that can be created
> with psycopg2. And Sir! I must take offense at the implication that
> my users are lazy. They work hard to press the semicolon key to end
> their statements! ;-)
>
> Seriously, the real world is complex. If PG allows me to create a
> user interface where the SQL parsing into separate statements is
> done by PG then psycopg2 should let me do the same. There's a reason
> you can pipe multiple SQL statements to psql without having to
> pre-parse them into separate statements.
>
> If you really want to reject the idea of accepting multiple queries
> from the user let me know and I'll not pursue the matter.
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software: "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

--
Adrian Klaver
adrian.klaver@aklaver.com
On 02/02/2016 12:10 PM, Karl O. Pinc wrote:
> On Tue, 2 Feb 2016 14:04:08 -0600
> "Karl O. Pinc" <kop@meme.com> wrote:
>
>> On Tue, 02 Feb 2016 19:58:16 +0100
>> Federico Di Gregorio <fog@dndg.it> wrote:
>>
>>> I can say with some confidence that we'll never implement such an
>>> API. Even the UX, i.e., accepting multi-queries from the user, is
>>> wrong, IMHO.
>
>> Seriously, the real world is complex.
>
> (Reminds me of an interaction I had with the Mozilla people. They'd
> changed Firefox so that a response had to show up within 5 minutes.
> My folks occasionally write queries that run for hours. Mozilla
> changed their mind. Hope you will too.)

Not sure that was a good idea; holding a resource open for hours to do
nothing seems counter-productive to me. That is what LISTEN/NOTIFY,
Websockets, name your async method, are for.

> Regards,
>
> Karl <kop@meme.com>
> Free Software: "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, 2 Feb 2016 13:20:34 -0800
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> I think what they are looking for is this:
>
> http://ipython.org/
>
> and this:
>
> http://pandas.pydata.org/

Thanks, but I don't think so. Those are interactive Python shells.
What they want is a web-enabled psql; a SQL shell.

Many years ago I did a horrible hack of phpPgAdmin to give them such a
shell (it parses SQL). I'd like to move forward.

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On Tue, 2 Feb 2016 13:24:24 -0800
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> On 02/02/2016 12:10 PM, Karl O. Pinc wrote:
>
> > (Reminds me of an interaction I had with the Mozilla people.
> > They'd changed Firefox so that a response had to show up within 5
> > minutes. My folks occasionally write queries that run for hours.
> > Mozilla changed their mind. Hope you will too.)
>
> Not sure that was a good idea; holding a resource open for hours to
> do nothing seems counter-productive to me. That is what
> LISTEN/NOTIFY, Websockets, name your async method, are for.

I understand. On the other hand, there's maybe 2 people using the box
at once. The limiting resource is programmer time. Plain old HTTP is
dirt simple.

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On 02/02/2016 01:30 PM, Karl O. Pinc wrote:
> On Tue, 2 Feb 2016 13:20:34 -0800
> Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
>> I think what they are looking for is this:
>>
>> http://ipython.org/
>>
>> and this:
>>
>> http://pandas.pydata.org/
>
> Thanks, but I don't think so. Those are interactive Python shells.
> What they want is a web-enabled psql; a SQL shell.

Actually more than that, as IPython is just one part of the Jupyter
stack, which encompasses over 40 languages and is Web-enabled through
Notebooks. I would take a serious look at this before rolling your own
data analysis stack. Jupyter is becoming the de facto scientific
toolset and the community that surrounds it is huge. Pretty sure you
will find someone has already done a lot of the legwork for you.

http://jupyter.org/

> Many years ago I did a horrible hack of phpPgAdmin to give them such
> a shell (it parses SQL). I'd like to move forward.
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software: "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, 2 Feb 2016 13:47:51 -0800
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> > Thanks, but I don't think so. Those are interactive Python shells.
> > What they want is a web-enabled psql; a SQL shell.
>
> Actually more than that, as IPython is just one part of the Jupyter
> stack, which encompasses over 40 languages and is Web-enabled
> through Notebooks. I would take a serious look at this before
> rolling your own data analysis stack. Jupyter is becoming the de
> facto scientific toolset and the community that surrounds it is
> huge. Pretty sure you will find someone has already done a lot of
> the legwork for you.
>
> http://jupyter.org/

Thanks. I'd glanced at it some years ago, but forgotten about it.

Anyway, we don't have a data analysis stack of our own making. They
just want to get the data out or otherwise query the db. (Lordy, some
of them are doing analysis in Excel.)

Regards,

Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
On 02/02/16 21:04, Karl O. Pinc wrote:
> On Tue, 02 Feb 2016 19:58:16 +0100
> Federico Di Gregorio <fog@dndg.it> wrote:
>
>> I can say with some confidence that we'll never implement such an
>> API. Even the UX, i.e., accepting multi-queries from the user, is
>> wrong, IMHO. Let the user write single queries, organize them,
>> execute them using bound variables and NOT mogrify(), and then
>> return the result. Anything less is lazy.
>
> I don't think you've a complete picture of my users, what they're
> doing, and why it is useful to submit multiple queries at once.

Dear Karl,

I never said that your users are lazy. I said your design of the UX
was lazy. Let me explain: if you want to provide everything you
describe in the rest of your email, then splitting SQL into multiple
queries is the least of your problems (see below). What you really
want is an interactive environment with multiple "data pads", each
associated to a query, that the user can easily compare, save, restore
and so on. You want interactive SQL completion, and that's much, much
more difficult than splitting a query. Just sending a multi-query to
the database, leveraging your proposed API, won't help your users
much.

Now, how to split a query? It's easy: you split at ";" unless inside a
text literal. Parsing text literals in SQL is incredibly easy: just
count the number of "'" you encounter: if odd then you're inside a
literal and you ignore the ";". PostgreSQL also has E'' literals but
they are just a little bit more complex - you have to check for "\;".
That's all. Now, parsing the SQL to implement completion, that's where
things get interesting... :)

TL;DR: Why add an API that nobody will use when what you need will
probably cost you just a couple of hours of coding?

Hope this clarifies why I don't like your proposal.
federico

> I support these people:
> http://amboselibaboons.nd.edu/
> http://gombechimpanzees.org/
>
> Who use interfaces like:
> http://papio.biology.duke.edu/babasewiki/MyStartingPage
> https://gombemi.ccas.gwu.edu/gombemiwiki/Main_Page
>
> My users are scientists, from undergraduates to senior researchers
> and American Academy of Science members, with a wide variety of SQL
> expertise. They have direct access to their databases, writing SQL
> to explore their data. They want a simple, comprehensive interface.
> (One that not only provides support for SQL but also allows the
> unsophisticated to create their own tables, etc. Which is neither
> here nor there.)
>
> They repeatedly try different queries and compare results, iterating
> until they tease the data apart. They might write a query, run it,
> modify it, and then run both the modified and the original version.
> They want to see the results in a single window so they can compare.
> They want to be able to edit and tweak either or both and hit a
> "resubmit" button and again review the results.
>
> Some are sophisticated SQL users; regardless, most don't want to
> think about SQL. It's easier for them to cut and paste the same
> query 3 times and tweak the SQL to get 3 results (adult males, adult
> females, juveniles) than to try to write a single query that does a
> GROUP BY. But they do want to be able to see and compare the results
> of the 3 queries.
>
> They use SQL to post-validate their data and check for sanity after
> a bulk data upload. They have a file containing from a few to 10s of
> SQL queries that they cut and paste into a "SQL window" and run.
> Usually many of the queries produce a single number as output. Then
> they examine the output and see if the numbers add up.
>
> They use sets of SELECT INTO queries to copy, or pre-join, or
> summarize data in the master schema, moving it into their own schema
> for further analysis. They again use cut-and-paste to replay these
> operations when there's new data from a fresh bulk data upload.
> (They likewise have full permissions, within their own schemas, to
> create whatever tables they need to support their analysis.)
>
> Etc.
>
> Of course everything _could_ be done by submitting a single query at
> a time. (Or an interface that requires such parsing by the end-user,
> and then runs them in bulk. Or some sort of arbitrary file format
> that contains easily parse-able statement delimiters. Etc.) But an
> interface that allows submission of multiple statements in a single
> go, and review of all the results at once, greatly simplifies their
> work flow. (And it's nice that their interface has point-and-click
> table creation, deletion, etc.)
>
> I'm surprised (Shocked I say! Shocked! :-) that you'd want to place
> arbitrary restrictions on the user interfaces that can be created
> with psycopg2. And Sir! I must take offense at the implication that
> my users are lazy. They work hard to press the semicolon key to end
> their statements! ;-)
>
> Seriously, the real world is complex. If PG allows me to create a
> user interface where the SQL parsing into separate statements is
> done by PG then psycopg2 should let me do the same. There's a reason
> you can pipe multiple SQL statements to psql without having to
> pre-parse them into separate statements.
>
> If you really want to reject the idea of accepting multiple queries
> from the user let me know and I'll not pursue the matter.
>
> Regards,
>
> Karl <kop@meme.com>
> Free Software: "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

--
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it

There's no greys, only white that's got grubby. I'm surprised you
don't know that. And sin, young man, is when you treat people as
things. Including yourself.
                                                  -- Granny Weatherwax
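The splitting rule Federico sketches (break on ";" only when outside a '...' literal, tracked by quote parity) can be written in a few lines of Python. This is a minimal illustration only: it ignores E'' backslash escapes, dollar-quoting, and comments, all of which a production splitter would need to handle:

```python
def split_sql(text):
    """Split a multi-statement string on ';', ignoring ';' inside '...' literals."""
    statements, buf, in_literal = [], [], False
    for ch in text:
        if ch == "'":
            in_literal = not in_literal  # a doubled '' toggles twice, which is harmless
            buf.append(ch)
        elif ch == ";" and not in_literal:
            stmt = "".join(buf).strip()
            if stmt:
                statements.append(stmt)
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()  # last statement may lack a trailing ';'
    if tail:
        statements.append(tail)
    return statements
```

This handles Christophe's counterexample: `split_sql("select * from thing where a like ';'; select count(*) from thing;")` returns the two intended statements with the quoted semicolon left intact.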
On 02/02/16 16:30, Karl O. Pinc wrote:
> Thanks, but I don't think so. Those are interactive Python shells.
> What they want is a web-enabled psql; a SQL shell.
>
> Many years ago I did a horrible hack of phpPgAdmin to give them such
> a shell (it parses SQL). I'd like to move forward.

I haven't played much with your particular problem, but it seems that
Adminer does what you want. You can SELECT from a table from the left
panel, then restrict the data that you want using dropdowns, and
you'll see the generated SQL before the results, with an Edit link. If
you click on that and then add a semicolon and another query, it
returns two sets of results.

Joe
On Wed, 3 Feb 2016 09:15:52 +0100 Federico Di Gregorio <fog@dndg.it> wrote: > On 02/02/16 21:04, Karl O. Pinc wrote: > > On Tue, 02 Feb 2016 19:58:16 +0100 > > Federico Di Gregorii <fog@dndg.it> wrote: > > > >> I can say with some confidence that we'll never implement such an > >> API. Even the > >> ux, i.e., accepting multi-queries from the user is wrong, imho. Let > >> the user > >> write single queries, organize them, exrcute them using bound > >> variables and NOT > >> mogrify() and then return the result. Anything less is lazy. Dear Federico, Thanks for your reply. > I never said that your users are lazy. I'm sorry about my crack about lazy users, it was a joke. There was something about your statement which bothered me and prompted a joking response. I've figured out what it was. Your mention of mongrify() made me think that your concern was that my proposed API was too dangerous. That people could not be trusted with it. Such a notion bothers me. Naturally, an API should not go out of it's way to encourage mis-use. But when an underlying functionality exists, even though it could be mis-used, I believe that it is overly paternalistic to keep that functionality from the programmer solely because it can be mis-used. Of course there are other legitimate reasons for limiting the scope of a set of APIs. And reasons for inclusion. Since psycopg2's purpose is to expose the libpq API to Python users I'd hope that one of the project's goals would be to make _all_ of the functionality of libpq available in Python; and that this would weigh in on the side of adding an API as proposed. > I said your design of the UX > was lazy. Let me explain: if you want to provide everything you > describe in the rest of your email, then splitting SQL in multiple > queries is the least of your problems (see below). What you really > want is an interactive environment with multiple "data pads", each > asociated to a query, that user can easily compare, save, restore and > so on. 
You should know up-front that we have, and have had, a user interface. It takes the form of a modified phpPgAdmin. Its upstream maintainers have, rightly IMO, rejected introducing an SQL parser into their code base. Moreover phpPgAdmin is starting to show its age and is no longer well maintained. Someday we'll have to replace our interface, and that's what I'm looking toward. Sure, multiple "data pads" would be nice. But I must disagree that each would be associated with a single query/statement. Each must be associated with multiple SQL statements, if not queries, at minimum. The users often begin a transaction, run various statements that may create temporary tables and use them to update data in real tables (or, at minimum, alter data in real tables without first creating temporary tables), and then run one or more queries that test the result. Finally they roll the transaction back. When their SQL is working as desired they change the rollback to a commit and re-execute. Naturally, more than one table is often updated within a single transaction, so there is more than one insert/update statement executed. Sometimes the process can take days before the final SQL is settled upon, or the resulting SQL is re-executed periodically. No fancy interface is required in these cases. Cut and paste of SQL between a "data pad" and a text document is a simple and universal interface. And it gives the user the ultimate in power and flexibility when it comes to archiving, documenting data flow, dissemination of SQL, comparison between past and present, etc. One "data pad" is good enough, so long as it accepts multiple statements. Without multiple statements you can't try data modifications out within transactions and report on the result unless, critically, you make permanent potentially erroneous modifications to your data. The user experience is greatly enhanced if "data pads" accept multiple queries, not just multiple statements.
Otherwise, in order to see more than one aspect of a series of proposed data updates, the user would have to do something like copy those portions of the SQL work flow that update data from "data pad" to "data pad" in order to, in the end, have executed multiple queries to investigate various aspects of the result of the changes to db content. The user should not have to think about such issues. It's also helpful to the analytic process to be able to drop "debugging" queries into the middle of a bunch of data updates so as to understand the effect of each update. In fact this happens all the time. > You want interactive SQL completion and that's much much more > difficult than splitting a query. Interactive SQL completion might be nice, but only in that it saves a bit of typing. People got along fine for years without completion in shell and my users get along fine without SQL completion. If it was free, sure. But who's going to pay to do the difficult job of programming SQL completion? (And if you were going to implement such a feature, shouldn't it be part of Postgres so as to both leverage existing code and be available to everybody?) Likewise the whole pointy-clicky business of constructing SQL by way of GUI might be nice when one is just getting started with SQL. After that, typing is a whole lot faster. Graphical SQL construction becomes a giant annoyance unless it can be entirely ignored. > Just sending a multi-query to the > database, leveraging your proposed API, won't help your users much. I have to disagree. It's extremely useful to be able to send an arbitrary mix of SQL to the server and see the effect of every statement. Not just the output of the last query. Imagine a programming IDE with the limitations you propose. I hope this better explains why a truly generic SQL execution environment is a critical UI feature. What really matters is what my users think. I can assure you the verdict is in. 
Submitting multiple SQL statements, sometimes containing multiple queries, and getting both counts and query results back from every statement is a nearly indispensable part of their user interface. > Now, how to split a query? It's easy: you split at ";" unless inside > a text literal. Parsing text literals in SQL is incredibly easy: just > count the number of "'" you encounter: if odd then you're inside a > literal and you ignore the ";". PostgreSQL also has E'' literals but > they are just a little bit more complex - you have to check for "\;". > That's all. Now, parsing the SQL to implement completion, that's > where things get interesting... :) Actually, this is what we've been doing, although it's not as simple as you think. There is also dollar quoting. And there's the hassle of having to go back and revisit the code. We've had to do this twice: once for the E'' literals and once for dollar quoting. (We've had this interface going for a long time, although not in Python.) Not to mention that upstream wants nothing to do with putting a parser in their code, which greatly increases the hassle when upgrading to a new upstream version. At the same time we've been loath to produce an entirely custom interface when upstream has 98% of what we need. I've tried parsing. It's time to try something different. > TL;DR. Why add an API that nobody will use when what you need will > probably cost you just a couple of hours of coding? Yours is an excellent argument. Let me address both aspects of your question: first, the approach our project takes to programming and where we'd like to invest our couple of hours of coding; and second, the larger issue of why psycopg2 would want such an API. A guiding principle of ours is to use toolsets and interfaces that are as generic as possible, obtained from 3rd parties whenever possible. The amount of work dedicated to user interface is to be minimized.
The programming budget is limited and is to be directed to essential functionality. User interfaces are expensive: as soon as a program interacts with a human, 90% of the programmer's work goes toward interaction and error handling and only about 10% goes to code that actually manipulates data and its state. So, we presently use a 3rd party database admin tool. Someday we may be forced to supplement this. Our users are sophisticated enough to eschew bling. The Amboseli Baboon project has been going since 1974, switched to a "real" database from ad-hoc SPSS in a punchcard format around 1992, and has been creatively adapting the latest technology of every sort into all aspects of its work, in a fashion geared toward maximal return on investment, since the beginning. When it comes to databases they want an interface that delivers power, and they know what that means. I'd love to spend more time and make a more elegant user interface. My users' priorities are elsewhere. So, yes, I am lazy. Libpq provides access to a parser through PQsendQuery() and friends. Why should I (continuously) re-invent the wheel? If you don't want to incorporate the suggested APIs I will probably try subclassing psycopg2.extensions.cursor. This seems even easier than writing an SQL parser. I'm pretty sure this would work, although I'm also pretty sure it'd have ugly-ish bits that need to know the internals of its super()s. Although this would be sad, I think I'd prefer keeping up with psycopg2's internals to keeping up with changes to PG's SQL syntax. I won't know until I look. As to why psycopg2 should add the proposed API when "nobody will use it": There is the argument, at top, regarding psycopg2 providing an interface that exposes all the functionality available within libpq. Clearly libpq makes it possible to create the user interface I desire. (Whether or not you believe that such a UI should exist.) Using PQsendQuery()/PQgetResult() is a way to create such a UI.
(See this thread, where consensus seems to be that, in C at least, parsing SQL is undesirable and PQsendQuery()/PQgetResult() is the way to replicate psql's acceptance of multiple SQL queries: http://marc.info/?l=postgresql-general&m=134877413732485&w=2) As to "nobody will use it", this depends on how you count. I do believe that you are right in that very few programs will ever provide generic SQL "data pads": interfaces allowing submission of arbitrary SQL. However very, very many _people_ do use an interface which allows submission of multiple arbitrary SQL statements. Every "all purpose" database administration tool has such an interface; phpPgAdmin, pgAdmin, and Adminer are the PG examples, not to mention the various MySQL admin interfaces. And of course psql. The only question is whether the interface violates the principle of least surprise by displaying a single result when supplied with multiple SQL queries, or whether it does something more sophisticated, and in my opinion much more useful, and behaves as psql does. I believe that psycopg2 would be a good choice if one were to write a generic admin interface to PG in Python. If so, as psycopg2 exists now, such a Python program would not be able to replicate psql's behavior when piped an arbitrary collection of SQL statements. Not, at least, unless the author implemented and maintained yet another SQL parser for PG per your suggestion. Perhaps the world already has enough generic database admin tools, or functional subsets thereof, and doesn't need another written in Python. But is that for psycopg2 to judge? In any case I believe that there are other legitimate use-cases for the proposed API, although it may not be used often. Reporting is one possible use, and other sorts of batch-related processing. I don't have vast experience to back this up with specifics, but I believe that multiple SQL statements are piped to psql on a not-uncommon basis.
Surely some of these rely on psql's ability to output a result per statement. Why shouldn't there be Python programs that support this functionality? > Hope this clarifies why I don't like your proposal. Yes, thanks very much. It does help me understand your thoughts and I appreciate you spending your time. From another standpoint, though, no, I don't understand at all. Why should psycopg2 be passing judgment on what user interface is appropriate for interacting with a PG database? It's fine to reject APIs that just won't get used, but I question whether it's possible to know that a libpq feature won't be useful in unanticipated ways once made available via Python. It seems only natural to expose all of libpq's functionality. Picking and choosing ways of exposing features is one thing; deciding that a libpq feature just shouldn't be available is something else. Something that's not been discussed at all is the proposed notion of eliminating client-side buffering. This trades CPU on one side for reduced memory consumption and lower startup latency when returning query results on the other. It matters when result sets are large. As it is now, when a query produces a million rows of results, all these rows are sent to the client and buffered before libpq delivers any rows to the application. (Absent server-side cursors, et al.) Of course you could extend cursor.execute() in place of using the proposed API. I thought the "no buffering" feature would be useful. It would likely involve more changes to existing code than the ability to execute multiple queries, which mostly would require new code. Since "no buffering" is useful when executing queries, both new features require a PQsendQuery() call, and the existence of a "no buffering" feature would be appreciated by my users, I figured I'd build the "no buffering" feature into the new API proposals. Thanks for listening. Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A.
Heinlein P.S. We call "data pads" "SQL windows".
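To make the semantics argued for above concrete, here is a database-free toy model of the proposed submitqueries()/executequery() pair. The method names come from the proposal at the top of the thread; everything else (the ToyCursor class, the canned (sql, rows) pairs) is purely illustrative -- psycopg2 has no such API today, and a real implementation would hand one string of SQL to the server via PQsendQuery().

```python
class ToyCursor:
    """A database-free model of the proposed submitqueries()/
    executequery() control flow.  "Operations" are canned
    (sql, rows) pairs rather than real SQL sent to a server."""

    def __init__(self):
        self._pending = []    # operations submitted but not yet executed
        self._result = None

    def submitqueries(self, operations):
        # The real method would submit one string of SQL; here the
        # statements arrive pre-split, purely for illustration.
        self._pending.extend(operations)

    def executequery(self, buffered=True):
        if not self._pending:
            return None       # all submitted operations have been executed
        sql, rows = self._pending.pop(0)
        # buffered=True collects every row before returning control;
        # buffered=False hands rows over lazily, as the server delivers
        # them, trading client memory for per-row startup latency.
        self._result = list(rows) if buffered else iter(rows)
        return True

    def fetchall(self):
        return list(self._result)
```

Driving it looks like the psql-style loop the thread describes: submit once, then call executequery() until it returns None, fetching each statement's result in turn.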
On Wed, 03 Feb 2016 15:27:25 -0500 Joe Abbate <jma@freedomcircle.com> wrote: > On 02/02/16 16:30, Karl O. Pinc wrote: > > Thanks, but I don't think so. Those are interactive > > python shells. What they want is a web-enabled psql; > > a SQL shell. > I haven't played much with your particular problem, but it seems that > adminer does what you want. You can SELECT from a table from the left > panel, then restrict the data that you want using dropdowns, and > you'll see the generated SQL before the results with an Edit link. > If you click on that and then add a semicolon and another query, it > returns two sets of results. Thanks very much for the link. Adminer may be very helpful at some point. I did glance at it long ago. I think then maybe it did not support Postgres. In any case I'd forgotten about it. I'd prefer to dispense with the whole notion of building SQL from a GUI. Rarely is a single table queried. Better to go straight to the edit window. I don't have time to look at this now, and have no immediate need either. It must parse though, since php does not have support for PQsendQuery()/PQgetResult() as far as I know. Gotta wonder how it will do with: SELECT $;$Hello World.$;$; And thanks for help with, and thinking about, my problem. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
On 05/02/16 03:02, Karl O. Pinc wrote: > I did glance at it long ago. I think then maybe it did not support > Postgres. In any case I'd forgotten about it. It does show its MySQL roots in some places (e.g., SERIAL shown as Auto Increment), but it has supported PG since at least February 2011 (because I wrote a blog post about it -- https://pyrseas.wordpress.com/2011/02/24/interesting-database-tool-adminer/ ) > I'd prefer to dispense with the whole notion of building > SQL from a GUI. Rarely is a single table queried. Better to go > straight to the edit window. It does have an "SQL command" link that takes you to an "edit" window directly, and you can even link to it, e.g., http://localhost/adminer/?pgsql=localhost&username=yourusername&db=yourdbname&ns=public&sql= > I don't have time to look at this now, and have no > immediate need either. It must parse though, since > php does not have support for PQsendQuery()/PQgetResult() > as far as I know. > > Gotta wonder how it will do with: > > SELECT $;$Hello World.$;$; It does give a syntax error on that exact statement (just as psql does), but it returns ?column? $;$Hello World.$;$ if you specify it as SELECT '$;$Hello World.$;$'; What I'm still amazed at is how adminer manages to present a graphical schema display. I haven't looked at phpPgAdmin recently but that (and its ease of use) won me over, even over pgAdmin. Note: I'm using Adminer 3.3.3 (which is Debian stretch's default) but there's a newer version: 4.2.3. Joe
On Fri, 05 Feb 2016 08:17:17 -0500 Joe Abbate <jma@freedomcircle.com> wrote: > > Gotta wonder how it will do with: > > > > SELECT $;$Hello World.$;$; > > It does give a syntax error on that exact statement Ah. That would make sense because that's illegal SQL. :-) "The tag, if any, of a dollar-quoted string follows the same rules as an unquoted identifier, except that it cannot contain a dollar sign." Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
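The tag rule quoted above is what makes hand-rolled splitters fiddly. As a rough, illustrative Python sketch of the kind of splitter discussed earlier in the thread (not any real parser's code): it splits at top-level semicolons, skips ordinary '...' literals and $tag$...$tag$ dollar quotes, and, per the rule above, only opens a dollar quote when the tag looks like an unquoted identifier, so "$;$" never starts a literal. It deliberately ignores comments, E'' backslash escapes, and doubled quotes ('') -- exactly the cases that keep forcing revisits of such code.

```python
import re

# A dollar-quote tag is $$ or $identifier$; per the PG docs the tag
# cannot contain a dollar sign (or, here, anything non-identifier).
DOLLAR_TAG = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*\$|\$\$")

def split_sql(text):
    """Naively split a script at top-level semicolons, skipping '...'
    literals and $tag$...$tag$ dollar quotes.  A sketch only."""
    statements, start, i = [], 0, 0
    while i < len(text):
        c = text[i]
        if c == "'":                       # skip an ordinary string literal
            i += 1
            while i < len(text) and text[i] != "'":
                i += 1
            i += 1
        elif c == "$":                     # maybe a dollar-quoted literal
            m = DOLLAR_TAG.match(text, i)
            if m:
                close = text.find(m.group(0), m.end())
                i = close + len(m.group(0)) if close != -1 else len(text)
            else:
                i += 1                     # lone '$': not a valid tag
        elif c == ";":                     # top-level statement boundary
            statements.append(text[start:i].strip())
            start, i = i + 1, i + 1
        else:
            i += 1
    tail = text[start:].strip()            # final statement without ';'
    if tail:
        statements.append(tail)
    return [s for s in statements if s]
```

Even this toy version is longer than "count the quotes", which is rather the point of the whole thread.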
On Wed, 03 Feb 2016 15:27:25 -0500 Joe Abbate <jma@freedomcircle.com> wrote: > On 02/02/16 16:30, Karl O. Pinc wrote: > > Thanks, but I don't think so. Those are interactive > > python shells. What they want is a web-enabled psql; > > a SQL shell. > I haven't played much with your particular problem, but it seems that > adminer does what you want. FYI. I can't get adminer editor to work. Perhaps because we have schemas, and no "public" schema. It tells me there are no tables and I have no further options. Providing a (schema qualified, or not) table name to search yields no change in the interface. The adminer sql window interface shows some promise, although there does not seem to be any provision for data export. This is another essential feature. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein