Обсуждение: Problem getting query to use index inside a function

Поиск
Список
Период
Сортировка

Problem getting query to use index inside a function

От
"Dirschel, Steve"
Дата:

I’m not sure if I should send this to pgsql-performance or pqsql-general so hopefully I’m sending to the correct one.

 

Table definition:

 

shared=> \d request

                                 Table "public.request"

      Column      |            Type             | Collation | Nullable |    Default

------------------+-----------------------------+-----------+----------+----------------

objectid         | character(36)               |           | not null |

data             | jsonb                       |           | not null | '{}'::jsonb

clientid         | character(3)                |           | not null |

active           | integer                     |           | not null |

productid        | integer                     |           | not null |

checkoutbyuid    | character(100)              |           |          |

checkoutdatetime | timestamp without time zone |           |          |

metadata         | jsonb                       |           | not null | '{}'::jsonb

search_vector    | tsvector                    |           |          |

requeststate     | text                        |           | not null | 'Active'::text

Indexes:

    "requestkey" PRIMARY KEY, btree (objectid, productid)

 

 

Here is the function I’m having difficulties with:

 

CREATE OR REPLACE FUNCTION public.steve1(param_requestid text[], param_productid integer DEFAULT 1)

RETURNS TABLE(objectid text, n text, v text, vt integer)

LANGUAGE sql

AS $function$

SELECT       objectid::text

             , i->>'n'::text

             , i->>'v'::text

             , (i->>'vt') :: INT as vt

FROM   request r

             , jsonb_array_elements(data -> 'i') i

WHERE objectid = ANY($1)

             AND productid=$2

 

$function$

;

 

Query:

 

shared=> explain (analyze, buffers)

shared-> SELECT objectid::text, n::text, v::text, vt::int FROM steve1(ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251-6723-48a8-8385-55133fab704a'], 1);

                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------

Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100) (actual time=42.694..42.942 rows=3368 loops=1)

   Buffers: shared hit=16612  ß-----------------

Planning Time: 0.034 ms

Execution Time: 43.279 ms

(4 rows)

 

The query is doing 16612 logical reads which implies it full scanning the table.  I can also run the query providing the input values where you can clearly see it’s full scanning the table:

 

shared=> explain (analyze, buffers)

shared-> SELECT         objectid::text

shared->                , i->>'n'::text

shared->                , i->>'v'::text

shared->                , (i->>'vt') :: INT as vt

shared-> FROM   request r

shared->                , jsonb_array_elements(data -> 'i') i

shared-> WHERE  objectid = ANY( ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251-6723-48a8-8385-55133fab704a'])

shared->                AND productid=1;

                                                                       QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------

Nested Loop  (cost=0.01..19561.37 rows=57000 width=100) (actual time=0.351..41.354 rows=3368 loops=1)

   Buffers: shared hit=16586  ß-----------

   ->  Seq Scan on request r  (cost=0.00..17566.36 rows=570 width=67) (actual time=0.010..38.341 rows=2 loops=1)

         Filter: ((productid = 1) AND ((objectid)::text = ANY ('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}'::text[])))

         Rows Removed by Filter: 57077

         Buffers: shared hit=16569

   ->  Function Scan on jsonb_array_elements i  (cost=0.01..1.00 rows=100 width=32) (actual time=0.291..0.406 rows=1684 loops=2)

         Buffers: shared hit=17

Planning Time: 0.093 ms

Execution Time: 41.607 ms

(10 rows)

 

 

I can get the query to use the PK index if I change the query from ARRAY [] to {}:

 

shared=> explain (analyze, buffers)

shared-> SELECT         objectid::text

shared->                , i->>'n'::text

shared->                , i->>'v'::text

shared->                , (i->>'vt') :: INT as vt

shared-> FROM   request r

shared->                , jsonb_array_elements(data -> 'i') i

shared-> WHERE  objectid = ANY('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}')

shared->                AND productid=1;

                                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------

Nested Loop  (cost=0.42..23.87 rows=200 width=100) (actual time=0.346..3.007 rows=3368 loops=1)

   Buffers: shared hit=27  ß--------------

   ->  Index Scan using requestkey on request r  (cost=0.41..16.87 rows=2 width=67) (actual time=0.036..0.049 rows=2 loops=1)

         Index Cond: ((objectid = ANY ('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}'::bpchar[])) AND (productid = 1))

         Buffers: shared hit=10

   ->  Function Scan on jsonb_array_elements i  (cost=0.01..1.00 rows=100 width=32) (actual time=0.274..0.388 rows=1684 loops=2)

         Buffers: shared hit=17

Planning:

   Buffers: shared hit=27

Planning Time: 0.176 ms

Execution Time: 3.286 ms

(11 rows)

 

But if I change the call to the function to use similar syntax the shared buffer hits of 16,586 is telling me it’s still full scanning the table:

 

shared=> explain (analyze, buffers)

shared-> SELECT objectid::text, n::text, v::text, vt::int FROM steve1('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}', 1);

                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------

Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100) (actual time=56.708..56.970 rows=3368 loops=1)

   Buffers: shared hit=16586 ß-------------

Planning Time: 0.024 ms

Execution Time: 57.316 ms

(4 rows)

 

 

What do I need to change to get the query to use the PK index when executed inside the function?

 

Thanks in advance

Steve

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html

Re: Problem getting query to use index inside a function

От
Tom Lane
Дата:
"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> Here is the function I'm having difficulties with:

> CREATE OR REPLACE FUNCTION public.steve1(param_requestid text[], param_productid integer DEFAULT 1)
> RETURNS TABLE(objectid text, n text, v text, vt integer)
> LANGUAGE sql
> AS $function$
> SELECT       objectid::text
>              , i->>'n'::text
>              , i->>'v'::text
>              , (i->>'vt') :: INT as vt
> FROM   request r
>              , jsonb_array_elements(data -> 'i') i
> WHERE objectid = ANY($1)
>              AND productid=$2

> $function$
> ;

> Query:

> shared=> explain (analyze, buffers)
> shared-> SELECT objectid::text, n::text, v::text, vt::int FROM
steve1(ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251-6723-48a8-8385-55133fab704a'],1); 
>                                                    QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100) (actual time=42.694..42.942 rows=3368 loops=1)

I think you would have better luck if the planner were "inlining"
this function, which we can see it's not since you get a Function Scan
on steve1 rather than the contained query.

I think the only thing stopping that from happening is that the
function is (by default) VOLATILE.  Try marking it STABLE so that
it can share the calling query's snapshot.

(v18 should handle such cases better than previous versions, BTW.
But you'd still be better off marking the function STABLE.)

            regards, tom lane



RE: [EXT] Re: Problem getting query to use index inside a function

От
"Dirschel, Steve"
Дата:
> > Here is the function I'm having difficulties with:

> > CREATE OR REPLACE FUNCTION public.steve1(param_requestid text[],
> > param_productid integer DEFAULT 1) RETURNS TABLE(objectid text, n
> > text, v text, vt integer) LANGUAGE sql AS $function$
> > SELECT       objectid::text
> >              , i->>'n'::text
> >              , i->>'v'::text
> >              , (i->>'vt') :: INT as vt
> > FROM   request r
> >              , jsonb_array_elements(data -> 'i') i WHERE objectid =
> > ANY($1)
> >              AND productid=$2

> > $function$
> > ;

> > Query:

> > shared=> explain (analyze, buffers)
> > shared-> SELECT objectid::text, n::text, v::text, vt::int FROM
> > shared-> steve1(ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251
> > shared-> -6723-48a8-8385-55133fab704a'], 1);
> >                                                    QUERY PLAN
> > ----------------------------------------------------------------------
> > ------------------------------------------
> > Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100)
> > (actual time=42.694..42.942 rows=3368 loops=1)

> I think you would have better luck if the planner were "inlining"
> this function, which we can see it's not since you get a Function Scan on steve1 rather than the contained query.

> I think the only thing stopping that from happening is that the function is (by default) VOLATILE.  Try marking it
STABLEso that it can share the calling query's snapshot. 

> (v18 should handle such cases better than previous versions, BTW.
> But you'd still be better off marking the function STABLE.)

>            regards, tom lane


Thanks for the reply, but that did not seem to help.  I tried using both the {} and ARRAY[] syntax but both are still
fullscanning based on the shared buffers.  Any other ideas? 

shared=> ALTER FUNCTION public.steve1 (param_requestid text[], param_productid integer) STABLE;
ALTER FUNCTION

shared=> explain (analyze, buffers)
shared->        SELECT objectid::text, n::text, v::text, vt::int FROM steve1('{83e3326a-62fe-45bc-81e5-1e9fb9a84d31}',
'PJJ',1, 1); 
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100) (actual time=64.465..64.465 rows=0 loops=1)
   Buffers: shared hit=16572
 Planning Time: 0.033 ms
 Execution Time: 64.485 ms
(4 rows)


shared=>
shared=> explain (analyze, buffers)
shared-> SELECT objectid::text, n::text, v::text, vt::int FROM steve1(ARRAY['83e3326a-62fe-45bc-81e5-1e9fb9a84d31'],
'PJJ',1, 1); 
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Function Scan on steve1  (cost=0.25..10.25 rows=1000 width=100) (actual time=63.749..63.749 rows=0 loops=1)
   Buffers: shared hit=16569
 Planning Time: 0.043 ms
 Execution Time: 63.766 ms
(4 rows)






Re: [EXT] Re: Problem getting query to use index inside a function

От
Tom Lane
Дата:
"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
>> I think you would have better luck if the planner were "inlining"
>> this function, which we can see it's not since you get a Function Scan on steve1 rather than the contained query.
>> I think the only thing stopping that from happening is that the function is (by default) VOLATILE.  Try marking it
STABLEso that it can share the calling query's snapshot. 

> Thanks for the reply, but that did not seem to help.

I tried to replicate this as follows:

--- CUT ---
create table request(objectid text, productid int, data jsonb);
create index on request(objectid, productid);

CREATE OR REPLACE FUNCTION public.steve1(param_requestid text[],
param_productid integer DEFAULT 1) RETURNS TABLE(objectid text, n
text, v text, vt integer) LANGUAGE sql AS $function$
SELECT       objectid::text
             , i->>'n'::text
             , i->>'v'::text
             , (i->>'vt') :: INT as vt
FROM   request r
             , jsonb_array_elements(data -> 'i') i WHERE objectid =
ANY($1)
             AND productid=$2

$function$
stable ;

explain
SELECT objectid::text, n::text, v::text, vt::int FROM
steve1(ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251-6723-48a8-8385-55133fab704a'], 1);
--- CUT ---

and I got:

                                                                     QUERY PLAN
                            

----------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.15..11.42 rows=100 width=100)
   ->  Index Scan using request_objectid_productid_idx on request r  (cost=0.15..8.17 rows=1 width=64)
         Index Cond: ((objectid = ANY
('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}'::text[]))AND (productid = 1)) 
   ->  Function Scan on jsonb_array_elements i  (cost=0.01..1.00 rows=100 width=32)
(4 rows)

which is what I expected from successful inlining of the function.
So there are some moving parts in your situation that you've not
told us about.

            regards, tom lane



RE: [EXT] Re: Problem getting query to use index inside a function

От
"Dirschel, Steve"
Дата:
>> Thanks for the reply, but that did not seem to help.

> I tried to replicate this as follows:

> --- CUT ---
> create table request(objectid text, productid int, data jsonb); create index on request(objectid, productid);

> CREATE OR REPLACE FUNCTION public.steve1(param_requestid text[], param_productid integer DEFAULT 1) RETURNS
TABLE(objectidtext, n text, v text, vt integer) LANGUAGE sql AS $function$ 
> SELECT       objectid::text
>              , i->>'n'::text
>              , i->>'v'::text
>              , (i->>'vt') :: INT as vt
> FROM   request r
>              , jsonb_array_elements(data -> 'i') i WHERE objectid =
> ANY($1)
>              AND productid=$2

> $function$
> stable ;

> explain
> SELECT objectid::text, n::text, v::text, vt::int FROM
steve1(ARRAY['5ab8e0ca-abb5-48c9-a95e-2bb2a375d903','adcbe251-6723-48a8-8385-55133fab704a'],1); 
> --- CUT ---

> and I got:

>                                                                      QUERY PLAN
                              
>
----------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.15..11.42 rows=100 width=100)
>    ->  Index Scan using request_objectid_productid_idx on request r  (cost=0.15..8.17 rows=1 width=64)
>          Index Cond: ((objectid = ANY
('{5ab8e0ca-abb5-48c9-a95e-2bb2a375d903,adcbe251-6723-48a8-8385-55133fab704a}'::text[]))AND (productid = 1)) 
>    ->  Function Scan on jsonb_array_elements i  (cost=0.01..1.00 rows=100 width=32)
> (4 rows)

> which is what I expected from successful inlining of the function.
> So there are some moving parts in your situation that you've not told us about.

>            regards, tom lane

Hi Tom,

I ran your code and got similar results so I agree there is more for me to dig into.  I see STABLE has a limit of not
allowingDML in the function.  I am testing against a very simple function here but we have many other functions with
thesame problem but those also contain DML so even if I got the STABLE to work in this one test case it does not appear
Icould use in all of my functions with this problem.  Are there other options here besides the STABLE option that would
workfor functions that also contain DML? 

Thanks
Steve