Обсуждение: Multiple "selects" returned from a single stored procedure
Hi list, I'm in the process of porting an existing MS-SQL database to PostgreSQL. The application uses OLE DB (and that's why I'm writing the OLE DB for Postgresql). One of the requirements is that we will introduce as little changes to the application. It has to be able to work with both databases, as well as Access (which is not really a database). Now the question: MS-SQL has the capacity for both out variables from stored procedures, as well as running several "selects" inside the procedures, and then giving the results for all selects to the caller. Fortunetly for me, that specific application doesn't run more than one select per stored procedure. The way I handled out variables so far was to have the function return a compound type, with the variables as rows. With embedded selects, however, this will no longer work. I guess what I would like to suggest is for the thus far unused "select" command in PLPGSQL to be used, in some way, to return values outside the scope of the strict "returns" context. I guess out variables will also be nice, but that's besides the point. If anyone has any ideas on how to both modify called parameters, and return a rowset, please let me know. Best I came up with so far was to create a temporary table for the out vars or the selects. I can then rig the OLE DB to make it look as if the function returned that. Shachar -- Shachar Shemesh Lingnu Open Systems Consulting http://www.lingnu.com/
Shachar Shemesh wrote: > MS-SQL has the capacity for both out variables from stored procedures, > as well as running several "selects" inside the procedures, and then > giving the results for all selects to the caller. Fortunetly for me, > that specific application doesn't run more than one select per stored > procedure. > > The way I handled out variables so far was to have the function return a > compound type, with the variables as rows. With embedded selects, > however, this will no longer work. > > I guess what I would like to suggest is for the thus far unused "select" > command in PLPGSQL to be used, in some way, to return values outside the > scope of the strict "returns" context. I guess out variables will also > be nice, but that's besides the point. This has come up before (search the archives). I think the answer is to implement actual stored procedures (as opposed to functions, which is what we now have). A stored procedure call, per SQL99/2003 would look something like: call sp_my_stored_proc(); (which in MSSQL looks like "exec sp_my_stored_proc()") The difference between this and an SRF is that the stored procedure cannot be used in a FROM clause, and therefore cannot be joined with other data or filtered with WHERE criteria. But that fact also means that we should be able to deal with projecting multiple heterogenous result sets, and the structure of the sets does not need to be known in advance. > If anyone has any ideas on how to both modify called parameters, and > return a rowset, please let me know. Best I came up with so far was to > create a temporary table for the out vars or the selects. I can then rig > the OLE DB to make it look as if the function returned that. I wonder if you could write an SRF that returns setof refcursor, and then expand the cursors one-by-one in the OLE DB layer. See: http://www.postgresql.org/docs/7.4/interactive/plpgsql-cursors.html (37.8.3.3. Returning Cursors) HTH, Joe
Joe Conway wrote: > Shachar Shemesh wrote: > >> I guess what I would like to suggest is for the thus far unused "select" >> command in PLPGSQL to be used, in some way, to return values outside the >> scope of the strict "returns" context. I guess out variables will also >> be nice, but that's besides the point. > > > This has come up before (search the archives). I did. Found people who asked about it, but didn't find an actual answer. > I think the answer is to implement actual stored procedures (as > opposed to functions, which is what we now have). A stored procedure > call, per SQL99/2003 would look something like: > call sp_my_stored_proc(); > (which in MSSQL looks like "exec sp_my_stored_proc()") The current docs say, at least from within pgplsql, that "call foo" translates to "select * from foo". psql doesn't seem to carry a "call" command at all. From PgOleDb, I just do "select * from foo". > The difference between this and an SRF is that the stored procedure > cannot be used in a FROM clause, and therefore cannot be joined with > other data or filtered with WHERE criteria. I don't see that as a problem. > But that fact also means that we should be able to deal with > projecting multiple heterogenous result sets, and the structure of the > sets does not need to be known in advance. That would require some way of actually returning the results, wouldn't it? >> return a rowset, please let me know. Best I came up with so far was to >> create a temporary table for the out vars or the selects. I can then rig >> the OLE DB to make it look as if the function returned that. > > If anyone has any ideas on how to both modify called parameters, and > > I wonder if you could write an SRF that returns setof refcursor, and > then expand the cursors one-by-one in the OLE DB layer. Yes, that seems to be the direction to go. THANKS! I'll see how easy it will be to implement in OLE DB, but returning setof refcursor certainly allows me to return several rowsets with different column info. I have several ways I can take this solution, and I would like your opinion: 1. Simply assume that if a command returned a single column of refcursors, that it meant to return several rowsets, and implement the corresponding OLE DB interface. 2. Require that the cursors be named a certain way, according to their intended usage. If the command returned cursors named "<unnamed portal 9>", to just treat it as is, while if it returned a cursor named "MultiResult1", treat it as above? The advantage of 2 is that it allows me to simulate out variables. If the cursor is called "outputvars", I direct it to the output variables interface. The disadvantage is that I'm not sure what to do if only some of the rows in the result are named MultiResult. Whatever method I'll use, I may have to start a transaction for the purpose of the command, if we were not already in one. Otherwise, the ref-cursors are just useless strings. > HTH, > > Joe > Thanks, Shachar -- Shachar Shemesh Lingnu Open Source Consulting http://www.lingnu.com/
Shachar Shemesh wrote: >> I think the answer is to implement actual stored procedures (as >> opposed to functions, which is what we now have). A stored procedure >> call, per SQL99/2003 would look something like: >> call sp_my_stored_proc(); >> (which in MSSQL looks like "exec sp_my_stored_proc()") > > The current docs say, at least from within pgplsql, that "call foo" > translates to "select * from foo". psql doesn't seem to carry a "call" > command at all. From PgOleDb, I just do "select * from foo". I think you're misinterpreting the docs. The CALL keyword has not been implemented in Postgres AFAIK. Where precisely did you see that? >> The difference between this and an SRF is that the stored procedure >> cannot be used in a FROM clause, and therefore cannot be joined with >> other data or filtered with WHERE criteria. > > I don't see that as a problem. It is THE problem as far as implementing what you're looking for. An SRF will never be able to deal with multiple results sets because the number and types of the returned attributes must be known/resolved when the query is parsed. On the other hand, a stored procedure would not have this restriction if it, *by design could not* participate in a normal SELECT. >> But that fact also means that we should be able to deal with >> projecting multiple heterogenous result sets, and the structure of the >> sets does not need to be known in advance. > > That would require some way of actually returning the results, wouldn't it? Sure. I should think you'd allow a normal SELECT statement in your stored procedure, exactly as you would do in MSSQL. The result tuples could be formed and projected in similar fashion to EXPLAIN or SHOW ALL. See, for example, ShowAllGUCConfig() in guc.c. > I have several ways I can take this solution, and I would like your > opinion: > 1. Simply assume that if a command returned a single column of > refcursors, that it meant to return several rowsets, and implement the > corresponding OLE DB interface. This seems cleanest, except you might want a configurable option to turn it off, in case someone really wants the refcursor results. > 2. Require that the cursors be named a certain way, according to their > intended usage. If the command returned cursors named "<unnamed portal > 9>", to just treat it as is, while if it returned a cursor named > "MultiResult1", treat it as above? > > The advantage of 2 is that it allows me to simulate out variables. If > the cursor is called "outputvars", I direct it to the output variables > interface. > The disadvantage is that I'm not sure what to do if only some of the > rows in the result are named MultiResult. This one seems a bit grotty to me. But the only other kludge I can think of to similate output variables would require some backend hacking, or at least a user defined C function. In case you're interested, here is the idea. Implement functions necessary to create, change, and remove session local variables. If the function uses named parameters (recently implemented for functions) *and* one or more session local variable of the same names are found, assume the value of the variables are your outputvar results. > Whatever method I'll use, I may have to start a transaction for the > purpose of the command, if we were not already in one. Otherwise, the > ref-cursors are just useless strings. Cursors can now outlive transactions (DECLARE ... WITH HOLD), but there is the downside (recently discussed on one of the lists) that when the transaction is ended, the cursor is copied to a tuplestore. As long as the tuplestore fits within sort_mem (work_mem in 7.5+), it will be held entirely within memory. If not, it will spill to disk. Joe
Joe Conway wrote: > I think you're misinterpreting the docs. The CALL keyword has not been > implemented in Postgres AFAIK. Where precisely did you see that? Can't find it any more. I'm pretty sure I read it somewhere. Doesn't matter. >> I don't see that as a problem. > > > It is THE problem as far as implementing what you're looking for. An > SRF will never be able to deal with multiple results sets because the > number and types of the returned attributes must be known/resolved > when the query is parsed. On the other hand, a stored procedure would > not have this restriction if it, *by design could not* participate in > a normal SELECT. What I meant is that I don't see not being able to use stored procedures as a FROM source as a problem. > Sure. I should think you'd allow a normal SELECT statement in your > stored procedure, exactly as you would do in MSSQL. The result tuples > could be formed and projected in similar fashion to EXPLAIN or SHOW > ALL. See, for example, ShowAllGUCConfig() in guc.c. I'm more into client-side hacking than server-side hacking. I'm currently using libpq for client-server communications, so bumping the protocol version a notch higher won't affect me much, but it's a point to consider. > >> I have several ways I can take this solution, and I would like your >> opinion: >> 1. Simply assume that if a command returned a single column of >> refcursors, that it meant to return several rowsets, and implement >> the corresponding OLE DB interface. > > > This seems cleanest, except you might want a configurable option to > turn it off, in case someone really wants the refcursor results. As soon as I figure out how to pass custom-attributes to OLE DB. Docs seem hazy on this. Marek, who originally wrote OLE DB, did manage to figure it out. Unfotunetly, I had to dump his code entirely for unrelated reasons. I'll be glad if he reintroduced it into this iteration (hint hint). In any case, as OLE DB uses a binary interface to pass data around, I need to know of all data types you would want to participate in any activity. This means that refcursors will not be supported on their own unless I figure out how to return them standalone, which I don't see happening. > >> 2. Require that the cursors be named a certain way, according to >> their intended usage. If the command returned cursors named "<unnamed >> portal 9>", to just treat it as is, while if it returned a cursor >> named "MultiResult1", treat it as above? >> >> The advantage of 2 is that it allows me to simulate out variables. If >> the cursor is called "outputvars", I direct it to the output >> variables interface. >> The disadvantage is that I'm not sure what to do if only some of the >> rows in the result are named MultiResult. > > > This one seems a bit grotty to me. But the only other kludge I can > think of to similate output variables would require some backend > hacking, or at least a user defined C function. I can also assume all refcursors are rowsets EXCEPT if one is named "outputvars". Then again, I'm doing this work in the context of the needs of a particular client, and he already said it was ok with him to pull the output vars from one of the rowsets. > > In case you're interested, here is the idea. Implement functions > necessary to create, change, and remove session local variables. If > the function uses named parameters (recently implemented for functions) There's one of the rubs. I'm basing my development on 7.4 backend. This has to go into production ASAP, and I can't rely on 7.5 being out by the time I'm done. Also, I can't know (from the client) what the input vars are called unless I perform another query. The interface that does this extra query is highly recommended against by the OLE DB docs, for that reason. Interestingly enough, my specific client does this extra query. However, I don't believe in writing general code based on a single entity's needs. The design must be one capable of expanding, even if the implementation is sparse (which it is, at the moment). > *and* one or more session local variable of the same names are found, > assume the value of the variables are your outputvar results. My backend knowledge is not good enough to understand this in depth, but I'll keep this message in case the need arises for later re-reading. >> Whatever method I'll use, I may have to start a transaction for the >> purpose of the command, if we were not already in one. Otherwise, the >> ref-cursors are just useless strings. > > > Cursors can now outlive transactions (DECLARE ... WITH HOLD), but > there is the downside (recently discussed on one of the lists) that > when the transaction is ended, the cursor is copied to a tuplestore. > As long as the tuplestore fits within sort_mem (work_mem in 7.5+), it > will be held entirely within memory. If not, it will spill to disk. No, I think that's an overhead I can't afford. Let's not forget that I'm a driver, and I'm expected to be relatively transparent, performance wise. At the moment, I'm not using cursors (lazy fetching), so libpq is allocating quite a bit of memory on the client side. Reserving such an amount on the server side seems a little too much. Just out of curiosity, how do I manually destroy the cursor when it's no longer needed? Just do "close refcursor"? Also, does this copy take place when the table is changed, or as soon as the transaction ends? If the former, it may not matter. In any case, my original concern was that I'll be changing semantics by opening another transaction. Upon retrospect, however, that's precisely what PostgreSQL is doing if one is not active. Assuming the OLE DB user uses the Transaction interface, and does not send a "begin" command, I'm capable of knowing whether I'm inside a transaction or not. > Joe Shachar -- Shachar Shemesh Lingnu Open Source Consulting http://www.lingnu.com/
Shachar Shemesh wrote: > Just out of curiosity, how do I manually destroy the cursor when it's no > longer needed? Just do "close refcursor"? Yup: http://www.postgresql.org/docs/current/static/sql-close.html > Also, does this copy take place when the table is changed, or as soon as > the transaction ends? If the former, it may not matter. IIRC it is the latter. Joe