Обсуждение: Strange behavior of some volatile function like random(), nextval()

Поиск
Список
Период
Сортировка

Strange behavior of some volatile function like random(), nextval()

От
Alex Ignatov
Дата:
<p>Hello!<p>Got some strange behavior of random() function:<p>postgres=# select (select random() ) from
generate_series(1,10)as i;<br />       random<br /> -------------------<br />  0.831577288918197<br />
 0.831577288918197<br/>  0.831577288918197<br />  0.831577288918197<br />  0.831577288918197<br />
 0.831577288918197<br/>  0.831577288918197<br />  0.831577288918197<br />  0.831577288918197<br />
 0.831577288918197<br/> (10 rows)<br /><br /> postgres=# select (select random()+i*0 ) from generate_series(1,10) as
i;<br/>       ?column?<br /> --------------------<br />    0.97471913928166<br />  0.0532126761972904<br />  
0.331358563620597<br/>  0.0573496259748936<br />   0.321165383327752<br />    0.48836630070582<br />  
0.444201893173158<br/>  0.0729857799597085<br />   0.661443184129894<br />   0.706566562876105<br /> (10
rows)<p>postgres=#explain select (select random() ) from generate_series(1,10) as i;<br />
                               QUERY PLAN<br />
--------------------------------------------------------------------------<br/>  Function Scan on generate_series i 
(cost=0.02..10.01rows=1000 width=0)<br />    InitPlan 1 (returns $0)<br />      ->  Result  (cost=0.00..0.01 rows=1
width=0)<br/> (3 rows)<br /><br /> postgres=# explain select (select random()+i*0 ) from generate_series(1,10) as i;<br
/>                                QUERY PLAN<br />
--------------------------------------------------------------------------<br/>  Function Scan on generate_series i 
(cost=0.00..30.00rows=1000 width=4)<br />    SubPlan 1<br />      ->  Result  (cost=0.00..0.02 rows=1 width=0)<br />
(3rows)<p>postgres=# \df+ random();<br />                                                                 List of
functions<br/>    Schema   |  Name  | Result data type | Argument data types |  Type  | Security | Volatility | 
Owner  | Language | Source code | Description<br />
------------+--------+------------------+---------------------+--------+----------+------------+----------+----------+-------------+--------------<br
/> pg_catalog | random | double precision |                     | normal | invoker  | volatile   | postgres | internal
|drandom     | random value<br /> (1 row)<p><br /><p>Also: postgres=# create sequence test;<br /> CREATE SEQUENCE<br />
postgres=#SELECT (SELECT nextval('test')) FROM generate_series(1,10) as i;<br />  nextval<br /> ---------<br />       
1<br/>        1<br />        1<br />        1<br />        1<br />        1<br />        1<br />        1<br />       
1<br/>        1<br /> (10 rows)<br /><br /> postgres=# SELECT (SELECT nextval('test')+i*0) FROM generate_series(1,10)
asi;<br />  ?column?<br /> ----------<br />         2<br />         3<br />         4<br />         5<br />        
6<br/>         7<br />         8<br />         9<br />        10<br />        11<br /> (10 rows)<br /><br /><div
class="gmail_default"><br/> postgres=# \df+ nextval() ;<br />
                                                                   List of functions<br />    Schema   |  Name   |
Resultdata type | Argument data types |  Type  | Security | Volatility |  Owner   | Language | Source code |    
Description<br/>
------------+---------+------------------+---------------------+--------+----------+------------+----------+----------+-------------+---------------------<br
/> pg_catalog | nextval | bigint           | regclass            | normal | invoker  | volatile   | postgres | internal
|nextval_oid | sequence next value<br /> (1 row)<br /></div><p><br /><p>Both function is volatile so from docs :<p>"A
VOLATILEfunction can do anything, including modifying the database. It can return different results on successive calls
withthe same arguments. The optimizer makes no assumptions about the behavior of such functions. A query using a
volatilefunction will re-evaluate the function at every row where its value is needed."<br /> Something wrong with
executor?Is it bug or executor feature related with  subquery?<br /><br /><pre class="moz-signature" cols="72">-- 
 
Alex Ignatov
Postgres Professional: <a class="moz-txt-link-freetext"
href="http://www.postgrespro.com">http://www.postgrespro.com</a>
The Russian Postgres Company

</pre>

Re: Strange behavior of some volatile function like random(), nextval()

От
Michael Paquier
Дата:
On Wed, Jun 29, 2016 at 7:43 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> Hello!
>
> Got some strange behavior of random() function:
>
> postgres=# select (select random() ) from generate_series(1,10) as i;
>       random
> -------------------
>  0.831577288918197
> [...]
> (10 rows)

I recall that this is treated as an implicit LATERAL, meaning that
random() is calculated only once.

> postgres=# select (select random()+i*0 ) from generate_series(1,10) as i;
>       ?column?
> --------------------
>    0.97471913928166
> [...]
> (10 rows)

But not that. So those results do not surprise me.
-- 
Michael



Re: Strange behavior of some volatile function like random(), nextval()

От
"David G. Johnston"
Дата:
More specifically...
On Wed, Jun 29, 2016 at 7:34 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Jun 29, 2016 at 7:43 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> Hello!
>
> Got some strange behavior of random() function:
>
> postgres=# select (select random() ) from generate_series(1,10) as i;
>       random
> -------------------
>  0.831577288918197
> [...]
> (10 rows)

I recall that this is treated as an implicit LATERAL, meaning that
random() is calculated only once.

A non-correlated (i.e., does not refer to outer variables) subquery placed into the target-list need only have its value computed once - so that is what happens.  The fact that a volatile function can return different values given the same arguments doesn't mean much when the function is only ever called a single time.​


> postgres=# select (select random()+i*0 ) from generate_series(1,10) as i;
>       ?column?
> --------------------
>    0.97471913928166
> [...]
> (10 rows)

But not that. So those results do not surprise me.


​A correlated subquery, on the other hand, has to be called once for every row and is evaluated within the context supplied by said row​.  Each time random is called it returns a new value.

Section 4.2.11 (9.6 docs)

Maybe this could be worded better but the first part talks about a single execution while "any one execution" is mentioned in reference to "the surrounding query".

​I do think that defining "correlated" and "non-correlated" subqueries within this section would be worthwhile.

David J.


Re: Strange behavior of some volatile function like random(), nextval()

От
Alex Ignatov
Дата:


On 29.06.2016 15:30, David G. Johnston wrote:
More specifically...
On Wed, Jun 29, 2016 at 7:34 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Jun 29, 2016 at 7:43 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> Hello!
>
> Got some strange behavior of random() function:
>
> postgres=# select (select random() ) from generate_series(1,10) as i;
>       random
> -------------------
>  0.831577288918197
> [...]
> (10 rows)

I recall that this is treated as an implicit LATERAL, meaning that
random() is calculated only once.

A non-correlated (i.e., does not refer to outer variables) subquery placed into the target-list need only have its value computed once - so that is what happens.  The fact that a volatile function can return different values given the same arguments doesn't mean much when the function is only ever called a single time.​


> postgres=# select (select random()+i*0 ) from generate_series(1,10) as i;
>       ?column?
> --------------------
>    0.97471913928166
> [...]
> (10 rows)

But not that. So those results do not surprise me.


​A correlated subquery, on the other hand, has to be called once for every row and is evaluated within the context supplied by said row​.  Each time random is called it returns a new value.

Section 4.2.11 (9.6 docs)

Maybe this could be worded better but the first part talks about a single execution while "any one execution" is mentioned in reference to "the surrounding query".

​I do think that defining "correlated" and "non-correlated" subqueries within this section would be worthwhile.

David J.


In this subquery(below) we have reference to outer variables but it is not working as it should(or i dont understand something):

postgres=# postgres=# select id, ( select string_agg('a','') from generate_series(1,trunc(10*random()+1)::int) where id=id) from generate_series(1,10) as id;
 id | string_agg
----+------------
  1 | aaa
  2 | aaa
...
but this query(with reference to outer var) working perfectly:
postgres=# select id,(select random() where id=id) from generate_series(1,10) as id;
 id |       random
----+--------------------
  1 |  0.974509597290307
  2 |  0.219822214450687
...

Also this query  is working good( (id-id) do the job):
postgres=# select id, ( select string_agg('a','') from generate_series(1,trunc(10*random()+1)::int+(id-id)) ) from generate_series(1,10) as id;
 id | string_agg
----+------------
  1 | aaaaaaa
  2 | aaaaa
...

It means that even reference to outer variables  doesn't mean that executor execute volatile function from subquery every time. Or there is something else what i should know?


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Strange behavior of some volatile function like random(), nextval()

От
Tom Lane
Дата:
[ Please do not quote the entire thread in each followup.  That's
disrespectful of your readers' time, and will soon cause people to
stop reading the thread, meaning you don't get answers. ]

Alex Ignatov <a.ignatov@postgrespro.ru> writes:
> In this subquery(below) we have reference to outer variables but it is 
> not working as it should(or i dont understand something):

> postgres=# postgres=# select id, ( select string_agg('a','') from 
> generate_series(1,trunc(10*random()+1)::int) where id=id) from 
> generate_series(1,10) as id;

The inner generate_series() call does not contain any outer references, so
it doesn't get recomputed.  There's a comment in ExecReScanFunctionScan
about that:
    * Here we have a choice whether to drop the tuplestores (and recompute    * the function outputs) or just rescan
them. We must recompute if an    * expression contains changed parameters, else we rescan.    *    * XXX maybe we
shouldrecompute if the function is volatile?  But in    * general the executor doesn't conditionalize its actions on
that.

So you get some random number of generate_series output rows on the first
call, but then each subsequent run of the subquery just rescans those same
rows.  I do not think this is wrong or bad, really; if it was done
differently, examples such as this same generate_series call on the inside
of a nestloop join would behave very strangely.
        regards, tom lane



Re: Strange behavior of some volatile function like random(), nextval()

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> A correlated subquery, on the other hand, has to be called once for every
> row and is evaluated within the context supplied by said row.  Each time
> random is called it returns a new value.

> Section 4.2.11 (9.6 docs)
> https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SQL-SYNTAX-SCALAR-SUBQUERIES

> Maybe this could be worded better but the first part talks about a single
> execution while "any one execution" is mentioned in reference to "the
> surrounding query".

> I do think that defining "correlated" and "non-correlated" subqueries
> within this section would be worthwhile.

Hmm ... a quick look around says we don't define or use those terms
anywhere.  I agree this could stand to be addressed somewhere, but I'm
not sure if 4.2.11 is the most appropriate place.  I don't think the
issue is unique to scalar subqueries.
        regards, tom lane