Discussion: Array of composite types returned from python
<div class="WordSection1"><p class="MsoNormal">I’ve endeavored to enable the return of arrays of composite types from code written in PL/Python. It seems that this can be accomplished through a very minor change to the code:</p>
<p class="MsoNormal">On line 401 in the file <span style="font-family:'Courier New'">src/pl/plpython/plpy_typeio.c</span>, remove the error report “PL/Python functions cannot return type…” and replace it with the command</p>
<p class="MsoNormal"><span style="font-family:'Courier New'">arg->func = PLyObject_ToComposite;</span></p>
<p class="MsoNormal">From all that I can see, this does exactly what I want. A Python list of tuples is converted to an array of composite types in SQL.</p>
<p class="MsoNormal">I ran the main and python regression suites for both python2 and python3 with assert enabled. The only discrepancies I got were ones that were due to the output expecting an error. When I altered the .out files to the expected behavior, it matched just fine.</p>
<p class="MsoNormal">Am I missing anything (i.e. memory leak, undesirable behavior elsewhere)?</p>
<p class="MsoNormal">-Ed</p>
<p class="MsoNormal" style="margin-top:3.05pt"><b><span style="font-size:10.0pt;font-family:'Verdana','sans-serif';color:#231F20">Ed Behn</span></b><span style="font-size:10.0pt;font-family:'Verdana','sans-serif';color:#231F20"> / Staff Engineer / Airline and Network Services<br />Information Management Services<br />2551 Riva Road, Annapolis, MD 21401 USA<br />Phone: 410.266.4426 / Cell: 240.696.7443<br />ebehn@arinc.com<br /></span><a href="http://www.rockwellcollins.com/"><span style="font-size:10.0pt;font-family:'Verdana','sans-serif';color:#231F20;text-decoration:none">www.rockwellcollins.com</span></a></p></div>
On Thu, Mar 20, 2014 at 4:54 PM, Behn, Edward (EBEHN) <EBEHN@arinc.com> wrote:
>
> I've endeavored to enable the return of arrays of composite types from code written in PL/Python. It seems that this can be accomplished through a very minor change to the code:
>
> On line 401 in the file src/pl/plpython/plpy_typeio.c, remove the error report "PL/Python functions cannot return type..." and replace it with the command
>
> arg->func = PLyObject_ToComposite;
>
> From all that I can see, this does exactly what I want. A python list of tuples is converted to an array of composite types in SQL.
>
> I ran the main and python regression suites for both python2 and python3 with assert enabled. The only discrepancies I got were ones that were due to the output expecting an error. When I altered the .out files to the expected behavior, it matched just fine.
>
> Am I missing anything (i.e. memory leak, undesirable behavior elsewhere)?

Don't know, but I'd definitely submit that patch to the next open fest. That's a very useful gain for such a small change.

merlin
Attached is the patch for the code change described below:
Project Name : Allow return array of composites from PL/Python
Currently, a SQL array of non-composite values can be returned from PL/Python code by returning an iterable object. A SQL composite value may be returned by returning an iterable or a subscriptable object. However, a returned list of objects that each represent a composite value is not converted to an SQL array of composite values. This code change allows that conversion to take place, which gives a smooth, elegant interface between SQL and PL/Python.
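As a rough illustration of the description above (not taken from the patch), the snippet below is plain Python showing the shapes of return value that a PL/Python function body could produce for an array of composites. The type `pair`, the function names, and the SQL in the comments are hypothetical; the conversion to SQL happens inside PL/Python and is not exercised here.

```python
# Hypothetical SQL wrapper (all names are assumptions, not from the patch):
#
#   CREATE TYPE pair AS (label text, amount integer);
#   CREATE FUNCTION make_pairs() RETURNS pair[] AS $$
#       ... body of make_pairs_as_tuples() below ...
#   $$ LANGUAGE plpython3u;


def make_pairs_as_tuples():
    """Each composite element is a sequence, matched to columns by position."""
    return [("first", 1), ("second", 2)]


def make_pairs_as_dicts():
    """Each composite element is a mapping, matched to columns by name."""
    return [{"label": "first", "amount": 1}, {"label": "second", "amount": 2}]


if __name__ == "__main__":
    print(make_pairs_as_tuples())
    print(make_pairs_as_dicts())
```

Before the change, a function declared to return `pair[]` would be rejected with the "cannot return type" error; with it, a list of either shape is expected to come back as an array of `pair`.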
This is a patch against MASTER
The patch compiles successfully. All the standard regression tests pass. Some modifications are needed to the .out files in order for the PL/Python regression tests to pass. This is because the current .out files expect errors when a Python list of composites is converted to an array, whereas my modifications expect success. All tests have been performed with both Python 2 and Python 3.
-Ed
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Behn, Edward (EBEHN)
Sent: Thursday, March 20, 2014 5:54 PM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Array of composite types returned from python
I’ve endeavored to enable the return of arrays of composite types from code written in PL/Python. It seems that this can be accomplished through a very minor change to the code:
On line 401 in the file src/pl/plpython/plpy_typeio.c, remove the error report “PL/Python functions cannot return type…” and replace it with the command
arg->func = PLyObject_ToComposite;
From all that I can see, this does exactly what I want. A python list of tuples is converted to an array of composite types in SQL.
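The effect of that one-line change can be pictured with a toy model in Python. This is only a sketch of the idea that each output type carries a conversion function and that the array converter applies the element type's function to every item. `OutputType`, `to_composite`, `to_array`, and `setup_array_type` are invented for illustration and do not mirror the real C structures in plpy_typeio.c.

```python
class OutputType:
    """Toy stand-in for a per-type conversion record (hypothetical)."""

    def __init__(self, name, func, elm=None, columns=None):
        self.name = name
        self.func = func        # converter: (type_record, python_obj) -> value
        self.elm = elm          # element type record, for arrays
        self.columns = columns  # column names, for composites


def to_composite(typ, obj):
    # Accept a mapping (by column name) or a sequence (by position).
    if isinstance(obj, dict):
        return tuple(obj[c] for c in typ.columns)
    if len(obj) != len(typ.columns):
        raise ValueError("wrong number of columns for " + typ.name)
    return tuple(obj)


def to_array(typ, seq):
    # The array converter delegates each non-null item to the element type.
    return [None if item is None else typ.elm.func(typ.elm, item) for item in seq]


def setup_array_type(elm, allow_composite_elements):
    if elm.columns is not None and not allow_composite_elements:
        # Model of the old behaviour: refuse composite elements at setup time.
        raise TypeError("functions cannot return type %s[]" % elm.name)
    return OutputType(elm.name + "[]", to_array, elm=elm)


pair = OutputType("pair", to_composite, columns=["label", "amount"])

try:
    setup_array_type(pair, allow_composite_elements=False)
    rejected = False
except TypeError:
    rejected = True

pair_array = setup_array_type(pair, allow_composite_elements=True)
converted = pair_array.func(pair_array, [("a", 1), {"label": "b", "amount": 2}, None])
print(rejected, converted)
```

In this model the only difference between the two behaviours is whether setup raises or installs the composite converter, which is why so small a change can be enough.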
I ran the main and python regression suites for both python2 and python3 with assert enabled. The only discrepancies I got were ones that were due to the output expecting an error. When I altered the .out files to the expected behavior, it matched just fine.
Am I missing anything (i.e. memory leak, undesirable behavior elsewhere)?
-Ed
Attachments
Am I missing anything (i.e. memory leak, undesirable behavior elsewhere)?
-Ed

I applied the patch and it looks like it is working well. As a longtime plpython user, I appreciate the fix.

I have a few comments:

1) I would remove the error message from the PO files as well.

2) You removed the comment:

- /*
-  * We don't support arrays of row types yet, so the first argument
-  * can be NULL.
-  */

But didn't change the code there. I haven't delved deep enough into the code yet to understand the full meaning, but the comment would indicate that if arrays of row types are supported, the first argument cannot be null.

3) This is such a simple change with no new infrastructure code (PLyObject_ToComposite already exists). Can you think of a reason why this wasn't done until now? Was it a simple miss or purposefully excluded?
Hi.

When this patch was first added to a CF, I had a quick look at it, but left it for a proper review by someone more familiar with PL/Python internals for precisely this reason:

> 2) You removed the comment:
> - /*
> -  * We don't support arrays of row types yet, so the first argument
> -  * can be NULL.
> -  */
>
> But didn't change the code there.
> I haven't delved deep enough into the code yet to understand the full
> meaning, but the comment would indicate that if arrays of row types
> are supported, the first argument cannot be null.

I had another look now, and I think removing the comment is fine. It actually made no sense to me in context, so I went digging a little.

After following a plpython.c → plpy_*.c refactoring (#147c2482) and a pgindent run (#65e806cb), I found that the comment was added along with the code by this commit:

    commit db7386187f78dfc45b86b6f4f382f6b12cdbc693
    Author: Peter Eisentraut <peter_e@gmx.net>
    Date:   Thu Dec 10 20:43:40 2009 +0000

        PL/Python array support

        Support arrays as parameters and return values of PL/Python functions.

At the time, the code looked like this:

    +        else
    +        {
    +            nulls[i] = false;
    +            /* We don't support arrays of row types yet, so the first
    +             * argument can be NULL. */
    +            elems[i] = arg->elm->func(NULL, arg->elm, obj);
    +        }

Note that the first argument was actually NULL, so the comment made sense when it was written. But the code was subsequently changed to pass in arg->elm by the following commit:

    commit 09130e5867d49c72ef0f11bef30c5385d83bf194
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Date:   Mon Oct 11 22:16:40 2010 -0400

        Fix plpython so that it again honors typmod while assigning to tuple fields.

        This was broken in 9.0 while improving plpython's conversion behavior for
        bytea and boolean. Per bug report from maizi.

The comment should have been removed at the same time. So I don't think there's a problem here.

> 3) This is such a simple change with no new infrastructure code
> (PLyObject_ToComposite already exists). Can you think of a reason
> why this wasn't done until now? Was it a simple miss or purposefully
> excluded?

This is not an authoritative answer: I think the infrastructure was originally missing, but was later added in #bc411f25 for OUT parameters. Perhaps it was overlooked at the time that the same code would suffice for this earlier-missing case. (I've Cc:ed Peter E. in case he has any comments.)

I think the patch is ready for committer.

-- Abhijit

P.S. I'm a wee bit confused by this mail I'm replying to, because it's signed "Ed" and looks like a response, but it's "From: Sim Zacks". I've added the original author's address to the Cc: in case I misunderstood something.
At 2014-06-29 18:08:53 +0530, ams@2ndQuadrant.com wrote:
>
> I think the patch is ready for committer.

That's based on my earlier quick look and the current archaeology. But I'm not a PL/Python user, and Ronan signed up to review the patch, so I haven't changed the status.

Ronan, did you get a chance to look at it?

-- Abhijit
Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:
> I had another look now, and I think removing the comment is fine. It
> actually made no sense to me in context, so I went digging a little.
> ...
> Note that the first argument was actually NULL, so the comment made
> sense when it was written. But the code was subsequently changed to
> pass in arg->elm by the following commit:

>     commit 09130e5867d49c72ef0f11bef30c5385d83bf194
>     Author: Tom Lane <tgl@sss.pgh.pa.us>
>     Date:   Mon Oct 11 22:16:40 2010 -0400

>         Fix plpython so that it again honors typmod while assigning to tuple fields.

>         This was broken in 9.0 while improving plpython's conversion behavior for
>         bytea and boolean. Per bug report from maizi.

> The comment should have been removed at the same time. So I don't think
> there's a problem here.

Yeah, you're right: the comment is referring to the struct PLyTypeInfo * argument, which isn't there at all anymore. Mea culpa --- that's the same sort of failure-to-update-nearby-comments thinko that I regularly mutter about other people making :-(

regards, tom lane
Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:
>> 3) This is such a simple change with no new infrastructure code
>> (PLyObject_ToComposite already exists). Can you think of a reason
>> why this wasn't done until now? Was it a simple miss or purposefully
>> excluded?

> This is not an authoritative answer: I think the infrastructure was
> originally missing, but was later added in #bc411f25 for OUT parameters.
> Perhaps it was overlooked at the time that the same code would suffice
> for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
> comments.)

> I think the patch is ready for committer.

I took a quick look at this; not really a review either, but I have a couple comments.

1. While I think the patch does what it intends to, it's a bit distressing that it will invoke the information lookups in PLyObject_ToComposite over again for *each element* of the array. We probably ought to quantify that overhead to see if it's bad enough that we need to do something about improving caching, as speculated in the comment in PLyObject_ToComposite.

2. I wonder whether the no-composites restriction in plpy.prepare (see lines 133ff in plpy_spi.c) could be removed as easily.

regards, tom lane
Just writing to check in.

I haven't done anything to look into allowing arrays of composites as input to PL/Python functions. I made the submitted modification for a specific project that I'm working on that involves Python code that returns data structures.

I also have no idea about a more efficient way to convert composite elements.

-Ed
On Sunday 29 June 2014 16:54:03, Tom Lane wrote:
> Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:
> >> 3) This is such a simple change with no new infrastructure code
> >> (PLyObject_ToComposite already exists). Can you think of a reason
> >> why this wasn't done until now? Was it a simple miss or purposefully
> >> excluded?
> >
> > This is not an authoritative answer: I think the infrastructure was
> > originally missing, but was later added in #bc411f25 for OUT parameters.
> > Perhaps it was overlooked at the time that the same code would suffice
> > for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
> > comments.)
> >
> > I think the patch is ready for committer.

Sorry for being this late.

I've tested the patch, and everything seems to work as expected, including complex nesting of composite and array types.

No documentation changes are needed, since the limitation wasn't even mentioned before.

Regression tests are OK, and the patch seems simple enough. Formatting looks OK too.

> I took a quick look at this; not really a review either, but I have
> a couple comments.
>
> 1. While I think the patch does what it intends to, it's a bit distressing
> that it will invoke the information lookups in PLyObject_ToComposite over
> again for *each element* of the array. We probably ought to quantify that
> overhead to see if it's bad enough that we need to do something about
> improving caching, as speculated in the comment in PLyObject_ToComposite.

I don't know how to do that without implementing the cache itself.

> 2. I wonder whether the no-composites restriction in plpy.prepare
> (see lines 133ff in plpy_spi.c) could be removed as easily.

Hmm, I tried that, but it's not that easy: lifting the restriction results in a segfault when trying to pfree the parameters given to SPI_ExecutePlan (line 320 in plpy_spi.c).

Correct me if I'm wrong, but I think the problem is that HeapTupleGetDatum returns the t_data field, whereas the allocation in heap_form_tuple returns the address of the HeapTuple itself. The datum pointer is therefore not the start of a palloc'd chunk, so it cannot be pfree'd.

Changing the HeapTupleGetDatum call to a heap_copy_tuple_as_datum call fixes this issue, but I'm not sure this is the best way to do that.

The attached patch implements this.

> regards, tom lane

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org
Attachments
Hi Ronan.

Based on your review, I'm marking this as ready for committer.

> The attached patch implements this.

Your patch looks sensible enough (thanks for adding tests), but I guess we'll let the reviewer sort out whether to commit the original or your extended version.

Thanks.

-- Abhijit
Ronan Dunklau <ronan.dunklau@dalibo.com> writes:
> On Sunday 29 June 2014 16:54:03, Tom Lane wrote:
>> 1. While I think the patch does what it intends to, it's a bit distressing
>> that it will invoke the information lookups in PLyObject_ToComposite over
>> again for *each element* of the array. We probably ought to quantify that
>> overhead to see if it's bad enough that we need to do something about
>> improving caching, as speculated in the comment in PLyObject_ToComposite.

> I don't know how to do that without implementing the cache itself.

I don't either, but my thought was that we could hack up a simple one-element cache pretty trivially, e.g. static info and desc variables in PLyObject_ToComposite that are initialized the first time through. You could only test one composite-array type per session with that sort of kluge, but that would be good enough for doing some simple performance testing.

regards, tom lane
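The one-element cache idea can be sketched in Python as a rough analogue. This is not the C code: `lookup_type_info`, `convert_uncached`, and `convert_cached` are hypothetical names, and a module-level variable stands in for the C static variables. The sketch only counts how often the stand-in lookup runs; it makes no claim about real timings.

```python
lookup_calls = 0


def lookup_type_info(type_name):
    """Stand-in for an expensive per-type information lookup (hypothetical)."""
    global lookup_calls
    lookup_calls += 1
    return {"type": type_name, "columns": ("label", "amount")}


def convert_uncached(type_name, obj):
    info = lookup_type_info(type_name)  # repeated for every element
    return dict(zip(info["columns"], obj))


_cached_info = None  # plays the role of a C static variable


def convert_cached(type_name, obj):
    global _cached_info
    if _cached_info is None:
        _cached_info = lookup_type_info(type_name)  # first call only
    # Deliberate limitation of the kluge: type_name is never checked again,
    # so only one composite type per process is handled correctly.
    return dict(zip(_cached_info["columns"], obj))


rows = [("a", 1), ("b", 2), ("c", 3)]

lookup_calls = 0
uncached = [convert_uncached("pair", r) for r in rows]
uncached_lookups = lookup_calls

lookup_calls = 0
cached = [convert_cached("pair", r) for r in rows]
cached_lookups = lookup_calls

print(uncached_lookups, cached_lookups)
```

The results are identical either way; only the number of lookups differs, which is the overhead the proposed test is meant to isolate.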
I wrote:
> Ronan Dunklau <ronan.dunklau@dalibo.com> writes:
>> I don't know how to do that without implementing the cache itself.

> I don't either, but my thought was that we could hack up a simple
> one-element cache pretty trivially, eg static info and desc variables
> in PLyObject_ToComposite that are initialized the first time through.
> You could only test one composite-array type per session with that
> sort of kluge, but that would be good enough for doing some simple
> performance testing.

I did that, and found that building and returning a million-element composite array took about 4.2 seconds without any optimization, and 3.2 seconds with the hacked-up cache (as of HEAD, asserts off). I'd say that means we might want to do something about it eventually, but it's hardly the first order of business.

I've committed the patch with a bit of additional cleanup. I credited Ronan and Ed equally as authors, since I'd say the fix for plpy.prepare was at least as complex as the original patch.

regards, tom lane
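For scale, the reported timings can be put in per-element terms. This is plain arithmetic on the two figures quoted above, with nothing measured here; the variable names are just for illustration.

```python
elements = 1000000
seconds_without_cache = 4.2
seconds_with_cache = 3.2

saved_seconds = seconds_without_cache - seconds_with_cache
saved_per_element_us = saved_seconds / elements * 1e6
saved_fraction = saved_seconds / seconds_without_cache

print("saved per element: %.2f microseconds" % saved_per_element_us)
print("share of total run time: %.0f%%" % (saved_fraction * 100))
```

That works out to about one microsecond per element, a little under a quarter of the uncached run time, which fits the assessment that caching is worth doing eventually but is not urgent.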