Re: Endless loop calling PL/Python set returning functions

Поиск
Список
Период
Сортировка
От Alexey Grishchenko
Тема Re: Endless loop calling PL/Python set returning functions
Дата
Msg-id CAH38_tkjuyo3yAZrqnZ=-kCb7TOoMJG56rB6TXiK-SQXTf+1mA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Endless loop calling PL/Python set returning functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Endless loop calling PL/Python set returning functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
No, my fix handles this well.

In fact, with the first function call you allocate global variables representing Python function input parameters, call the function and receive iterator over the function results. Then in a series of Postgres calls to PL/Python handler you just fetch next value from the iterator, you are not calling the Python function anymore. When the iterator reaches the end, PL/Python call handler deallocates the global variable representing function input parameter.

Regardless of the number of parallel invocations of the same function, each of them in my patch would set its own input parameters to the Python function, call the function and receive separate iterators. When the first function's result iterator would reach its end, it would deallocate the input global variable. But it won't affect other functions as they no longer need to invoke any Python code. Even if they need - they would reallocate global variable (it would be set before the Python function invocation). The issue here was in the fact that they tried to deallocate the global input variable multiple times independently, which caused error that I fixed.

Regarding the patch for the second case with recursion - not caching the "globals" between function calls would have a performance impact, as you would have to construct "globals" object before each function call. And you need globals as it contains references to "plpy" module and global variables and global dictionary ("GD"). I will think on this, maybe there might be a better design for this scenario. But I still think the second scenario requires a separate patch

On Thu, Mar 10, 2016 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexey Grishchenko <agrishchenko@pivotal.io> writes:
> One scenario when the problem occurs, is when you are calling the same
> set-returning function in a single query twice. This way they share the
> same "globals" which is not a bad thing, but when one function finishes
> execution and deallocates input parameter's global, the second will fail
> trying to do the same. I included the fix for this problem in my patch

> The second scenario when the problem occurs is when you want to call the
> same PL/Python function in recursion. For example, this code will not work:

Right, the recursion case is what's not being covered by this patch.
I would rather have a single patch that deals with both of those cases,
perhaps by *not* sharing the same dictionary across calls.  I think
what you've done here is not so much a fix as a band-aid.  In fact,
it doesn't even really fix the problem for the two-calls-per-query
case does it?  It'll work if the first execution of the SRF is run to
completion before starting the second one, but not if the two executions
are interleaved.  I believe you can test that type of scenario with
something like

  select set_returning_function_1(...), set_returning_function_2(...);

                        regards, tom lane



--
Best regards,
Alexey Grishchenko

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Is there a way around function search_path killing SQL function inlining?
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: Add generate_series(date,date) and generate_series(date,date,integer)