Re: Endless loop calling PL/Python set returning functions

From Alexey Grishchenko
Subject Re: Endless loop calling PL/Python set returning functions
Date
Msg-id CAH38_tkxLp2fhRjmwf-KWnto8x_r2TaTOsbzCtYNMKKK9jOoSQ@mail.gmail.com
In reply to Re: Endless loop calling PL/Python set returning functions  (Alexey Grishchenko <agrishchenko@pivotal.io>)
Responses Re: Endless loop calling PL/Python set returning functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Alexey Grishchenko <agrishchenko@pivotal.io> wrote:
Alexey Grishchenko <agrishchenko@pivotal.io> wrote:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexey Grishchenko <agrishchenko@pivotal.io> writes:
> No, my fix handles this well.
> In fact, with the first function call you allocate global variables
> representing Python function input parameters, call the function and
> receive iterator over the function results. Then in a series of Postgres
> calls to PL/Python handler you just fetch next value from the iterator, you
> are not calling the Python function anymore. When the iterator reaches the
> end, PL/Python call handler deallocates the global variable representing
> function input parameter.

> Regardless of the number of parallel invocations of the same function, each
> of them in my patch would set its own input parameters to the Python
> function, call the function and receive separate iterators. When the first
> function's result iterator would reach its end, it would deallocate the
> input global variable. But it won't affect other functions as they no
> longer need to invoke any Python code.

Well, if you think that works, why not undo the global-dictionary changes
at the end of the first call, rather than later?  Then there's certainly
no overlap in their lifespan.

                        regards, tom lane

Could you elaborate on this? In general, a stack-like solution would work: if, before the function call, a global variable already exists whose name matches an input parameter name, push its value onto a stack and pop it after the function executes. I will implement it tomorrow and see how it works.


--

Sent from handheld device

I have improved the code using the proposed approach. The second version of the patch is attached.

It works as follows: the procedure object PLyProcedure stores the call stack depth (the calldepth field) and the stack itself (the argstack field). When the call stack depth is zero, no additional processing is done, so there is no performance impact on existing end-user functions. Stack manipulation kicks in only when calldepth is greater than zero, which happens either when the function is called recursively through SPI or when the same set-returning function is called two or more times in a single query.
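
The bookkeeping described above can be modeled in a few lines of plain Python. This is only a toy sketch of the idea, not the patch: the class `Procedure` and the methods `enter` and `leave` are hypothetical names, the `globals` dict stands in for the function's Python globals, and only `calldepth` and `argstack` are taken from the description.

```python
# Toy model of the calldepth/argstack idea; not the actual C patch.
# `Procedure`, `enter` and `leave` are hypothetical names.

class Procedure:
    def __init__(self, argnames):
        self.argnames = list(argnames)
        self.globals = {}      # stands in for the function's globals dict
        self.calldepth = 0     # number of invocations currently in flight
        self.argstack = []     # saved argument values of outer invocations

    def enter(self, values):
        """Publish the arguments as globals, saving outer values if nested."""
        if self.calldepth > 0:
            saved = {n: self.globals[n]
                     for n in self.argnames if n in self.globals}
            self.argstack.append(saved)
        self.calldepth += 1
        self.globals.update(zip(self.argnames, values))

    def leave(self):
        """Remove this invocation's arguments and restore the outer ones."""
        self.calldepth -= 1
        for n in self.argnames:
            self.globals.pop(n, None)
        if self.calldepth > 0:
            self.globals.update(self.argstack.pop())


proc = Procedure(["a"])
proc.enter([3])    # depth 0 -> 1: nothing is pushed
proc.enter([2])    # nested call: the outer a=3 goes onto argstack
proc.leave()       # the outer a=3 is restored
proc.leave()       # 'a' disappears from the globals again
```

In this model a non-nested call never touches `argstack`, which mirrors the claim that ordinary calls pay no extra cost.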

Example of multiple calls to an SRF within a single query:

CREATE OR REPLACE FUNCTION func(iter int) RETURNS SETOF int AS $$
return xrange(iter)
$$ LANGUAGE plpythonu;

select func(3), func(4);

Before the patch, this query caused an endless loop that ended in an OOM. Now it works as it should.

Example of recursion with SPI:
CREATE OR REPLACE FUNCTION test(a int) RETURNS int AS $BODY$
r = 0
if a > 1:
    r = plpy.execute("SELECT test(%d) as a" % (a-1))[0]['a']
return a + r
$BODY$ LANGUAGE plpythonu;

select test(10);

Before the patch, this query failed with "NameError: global name 'a' is not defined". Now it works correctly and returns 55.
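
To illustrate where the NameError comes from, here is a small plain-Python simulation of a handler that publishes the input parameter as a global and removes it when the call ends. It is a sketch under assumptions, not PL/Python code: `make_proc`, `call_naive` and the `execute` stand-in for `plpy.execute` are hypothetical, and modern Python words the error as "name 'a' is not defined".

```python
# Toy simulation of the pre-patch behaviour; all helper names are hypothetical.

BODY = """
def body():
    r = 0
    if a > 1:
        r = execute(a - 1)   # stands in for plpy.execute("SELECT test(...)")
    return a + r
"""

def make_proc():
    """Compile the function body once, with its own globals dict."""
    g = {}
    exec(BODY, g)
    return g

def call_naive(g, value):
    """Set the parameter as a global, run the body, then drop the global."""
    g["execute"] = lambda v: call_naive(g, v)
    g["a"] = value
    try:
        return g["body"]()
    finally:
        # Cleanup of the inner call also wipes the value the outer call
        # still needs for its `return a + r`.
        g.pop("a", None)


g = make_proc()
print(call_naive(g, 1))      # no recursion, so this call succeeds
try:
    call_naive(g, 10)
except NameError as exc:
    print("failed:", exc)
```

The innermost call finishes and deletes `a`; the next frame up then evaluates `a + r` and finds no such global.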

--
Best regards,
Alexey Grishchenko

Hi

Any comments on this patch?

Regarding passing parameters to the Python function through globals: this was part of the initial design of PL/Python (code, documentation). Originally you had to work with the "args" global list of input parameters and could not access the named parameters directly, and "args" still works in the latest release. Moving away from global input parameters would require switching to PyObject_CallFunctionObjArgs, which should be possible by changing the function declaration to include the input parameters plus "args" (for backward compatibility). Triggers are a bit different, however: they depend on modifying the global "TD" dictionary inside the Python function, and they return only a status string. For them, there is no way to avoid global input parameters without breaking backward compatibility with existing end-user code.

--
Best regards,
Alexey Grishchenko
