BUG #17936: Memory Leak when OPERATOR FAMILY use LANGUAGE SQL function

Поиск

Список

Период

Сортировка

От	PG Bug reporting form
Тема	BUG #17936: Memory Leak when OPERATOR FAMILY use LANGUAGE SQL function
Дата	16 мая 2023 г. 13:16:43
Msg-id	17936-ccbf22f31c04b6cf@postgresql.org обсуждение исходный текст
Ответы	Re: BUG #17936: Memory Leak when OPERATOR FAMILY use LANGUAGE SQL function
Список	pgsql-bugs

Дерево обсуждения

The following bug has been logged on the website:

Bug reference:      17936
Logged by:          yajun Hu
Email address:      hu_yajun@qq.com
PostgreSQL version: 15.3
Operating system:   CentOS7 with kernel version 5.10
Description:

hi, community partners. 
   I guess I found a memory leak problem when OPERATOR FAMILY use LANGUAGE
SQL function.
** Problem **
I build postgresql with tag REL_15_3, and run following SQL.
--
CREATE OR REPLACE FUNCTION pg_catalog.text_cmp_bpchar(text, bpchar)
RETURNS int4
AS'select pg_catalog.bttextcmp($1, $2::text)'
LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE;

ALTER OPERATOR FAMILY text_ops USING btree ADD
OPERATOR 3   =(bpchar, bpchar),
FUNCTION 1  pg_catalog.text_cmp_bpchar(text, bpchar);

create table t1( a varchar,b varchar);
create table t2(a bpchar,b bpchar);
insert into t1 select 1,1;
insert into t2 select 1,1 from generate_series(1,1000000)i;
create index on t1(a);
set enable_hashjoin to off;
set enable_mergejoin to off;
set enable_seqscan to off;
set enable_material to off;
set enable_memoize to off;

explain select * from t1,t2 where t1.a = t2.a::bpchar; -- show sql plan
explain analyze select * from t1,t2 where t1.a = t2.a::bpchar; -- run sql
--
The plan is shown in the bellow, the backend memory can be found to be
soaring when explain analyze this SQL.
postgres=# explain select * from t1,t2 where t1.a = t2.a::bpchar;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Nested Loop  (cost=10000000000.12..10000067714.50 rows=1947 width=128)
   ->  Seq Scan on t2  (cost=10000000000.00..10000008319.00 rows=389400
width=64)
   ->  Index Scan using t1_a_idx on t1  (cost=0.12..0.14 rows=1 width=64)
         Index Cond: ((a)::bpchar = t2.a)
(4 rows)
I Use the 'top -c' to find that the memory usage of this session exceeds 15G
within 20s

** My simple analysis **
1. Each Index Scan on t1 will call _bt_first->ScanKeyEntryInitialize to
Initialize btree scanKey
2. Then in _bt_first, will call _bt_search/_bt_binsrch -> _bt_compare ->
FunctionCall2Coll to execute function to compare btree datum
3. Because we are using SQL functions here, FunctionCall2Coll will call
fmgr_sql and generate SQLFunctionCache into fcinfo->flinfo->fn_extra;
4. In _bt_first, we save fmgrinfo into inskey.scankeys[{keyid}]->sk_func,
that is,  SQLFunctionCache is savad in
inskey.scankeys[{keyid}]->sk_func->fn_extra
5. In _bt_first,  inskey is on the stack, so we lose the SQLFunctionCache
pointing after finish _bt_first
6. The MemoryContext of SQLFunctionCache has a long life cycle, so every
indexScan execution will leak memory of SQLFunctionCache once, and all the
memory will be released after the SQL run ends.

Is this a bug? Or is it a design problem?
I also reproduced this problem on the community master branch with commit
8cb94344c3c7130a0cd5e21e83705739f552187e
Maybe other codes have similar problems when calling fmgr_sql.

Regards, Yajun Hu

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

BUG #17936: Memory Leak when OPERATOR FAMILY use LANGUAGE SQL function