On Fri, Jul 23, 2010 at 11:36:14AM -0500, Kenneth Marshall wrote:
> Dear PostgreSQL community,
>
> I have the following function used to select tokens individually
> from a table and avoid a sequential scan of the table:
>
> create function lookup_tokens(integer,bigint[])
> returns setof dspam_token_data
> language plpgsql stable
> as '
> declare
> v_rec record;
> begin
> for v_rec in select * from dspam_token_data
> where uid=$1
> and token in (select $2[i]
> from generate_series(array_lower($2,1),array_upper($2,1)) s(i))
> loop
> return next v_rec;
> end loop;
> return;
> end;';
>
> I would like to execute the same lookup but with the array sorted. I
> used the following definition of sort:
>
> CREATE OR REPLACE FUNCTION sort(anyarray)
> RETURNS anyarray AS $$
> SELECT array(SELECT * FROM unnest($1) ORDER BY 1);
> $$ language sql;
>
> and added a call to it to the function:
>
> create function lookup_tokens_sort(integer,bigint[])
> returns setof dspam_token_data
> language plpgsql stable
> as '
> declare
> v_rec record;
> begin
> for v_rec in select * from dspam_token_data
> where uid=$1
> and token in (select (sort($2))[i]
> from generate_series(array_lower($2,1),array_upper($2,1)) s(i))
> loop
> return next v_rec;
> end loop;
> return;
> end;';
>
> The problem with this is that the sort() appears to be called once
> per token because the run without the sort takes < 1 millisecond and
> with the sort take about 4 seconds. Is there a way to call the sort()
> only once and use the result in the loops instead of sorting each
> time. Any help would be appreciated. The actual sorted loop will
> be needed for the update functions to prevent deadlocks.
>
> Regards,
> Ken
>
As a follow-up, I can get the desired result by placing the
sort() call in the original SQL query:
SELECT * FROM lookup_tokens( xxx, sort(yyy));
Does anyone have any ideas on how to embed the sort()
call into the pl/pgsql function without a performance impact.
Cheers,
Ken