Entire index scanned, but only when in SQL function?

Поиск

Список

Период

Сортировка

От	Philip Semanchuk
Тема	Entire index scanned, but only when in SQL function?
Дата	11 июля 2023 г. 16:07:26
Msg-id	9D9DBF30-B6EF-49DB-864D-610849C37D41@americanefficient.com обсуждение исходный текст
Список	pgsql-performance

Дерево обсуждения

Hi there,
I’m on Postgres 13.11 and I'm seeing a situation where an INSERT...SELECT statement seq scans an index, but only when
wrappedin a SQL function. When invoked directly (via psql) or when called via a PL/pgSQL function, it only reads the
indextuples it needs, resulting in much better performance. I can solve my problem by writing the function in PL/pgSQL,
butI'm curious why the pure SQL version behaves the way it does. 

Here's my table --

\d documents
+-------------------+------------------+----------------------------------------+
| Column            | Type             | Modifiers                              |
|-------------------+------------------+----------------------------------------|
| document_id       | integer          |  not null generated always as identity |
| product_id        | integer          |  not null                              |
| units_sold        | integer          |  not null                              |
| sale_date         | date             |  not null                              |
... some other columns ...
+-------------------+------------------+----------------------------------------+

CREATE INDEX idx_philip_tmp on documents (document_id, product_id);

Here's the SQL function which will use that index --

CREATE OR REPLACE FUNCTION fn_create_tasks(product_ids int[])
RETURNS void
AS $$
    -- Create processing tasks for documents related to these products
    INSERT INTO
        processing_queue (document_id)
    SELECT
        DISTINCT document_id
    FROM
        documents
    JOIN unnest(product_ids::int[]) AS product_id USING (product_id)
    ;

$$ LANGUAGE sql VOLATILE PARALLEL SAFE;

96498 is a product_id that has one associated document_id. When I copy/paste this statement into psql, it executes
quickly,and pg_stat_user_indexes.idx_tup_read reports 2 tuples read for the index. 

INSERT INTO
    processing_queue (document_id)
SELECT
    DISTINCT document_id
FROM
    documents
JOIN unnest(ARRAY[96498]::int[]) AS product_id USING (product_id)
;

When I copy/paste this into psql, I expect it to perform just as quickly but it does not.
pg_stat_user_indexes.idx_tup_readreports 64313783 tuples read (which is the entire index). 

SELECT fn_create_tasks(ARRAY[96498]::int[])

If I rewrite fn_create_tasks() in PL/pgSQL, it behaves as I expect (executes quickly, pg_stat_user_indexes.idx_tup_read
=2). 

SELECT fn_create_tasks_plpgsql(ARRAY[96498]::int[])

My rule of thumb is that SQL functions always perform as well as or better than a PL/pgSQL equivalent, but this is a
casewhere that's not true. If anyone can give me some clues as to what's happening here, I'd appreciate it. 

Thanks
Philip

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Entire index scanned, but only when in SQL function?