Thread: Memory leak in Pl/Python

Memory leak in Pl/Python

From
Andrey Zhidenkov
Date:
I have PostgreSQL 9.4.8 on my server and I've noticed constantly growing
memory usage when I use plpython. I ran some tests and found a few
situations in which memory leaks. For example, when I call this function
many times, memory usage keeps growing:

create or replace
function test() returns bigint as $$

plpy.execute("insert into test(test) values ('test')")

return 1

$$ language plpythonu;

Interestingly, when I pass a query that does not modify data ('select 1')
to execute(), the leak stops.

How can I fix or avoid this?



Re: Memory leak in Pl/Python

From
"David G. Johnston"
Date:
On Fri, Jun 24, 2016 at 6:41 PM, Andrey Zhidenkov <andrey.zhidenkov@gmail.com> wrote:
> For example, when I call this procedure
> many times,

Call how?  Specifically, how are you handling transactions in the calling client?  And what/how are you measuring memory consumption?

David J.

Re: Memory leak in Pl/Python

From
Andrey Zhidenkov
Date:
For the test I wrote a Python script that calls the test function via psycopg2:

#!/usr/bin/env python2

import psycopg2

conn = psycopg2.connect('xxx')

cursor = conn.cursor()

cursor.execute('set application_name to \'TEST\'')

for i in range(1, 1000000):
    cursor.execute('select test()')
    conn.commit()


I monitor memory consumption with the htop and pg_activity tools.



-- 
Andrey Zhidenkov / Database developer
+79265617190/ andrey.zhidenkov@gmail.com




This e-mail message may contain confidential or legally privileged
information and is intended only for the use of the intended
recipient(s). Any unauthorized disclosure, dissemination,
distribution, copying or the taking of any action in reliance on the
information herein is prohibited. E-mails are not secure and cannot be
guaranteed to be error free as they can be intercepted, amended, or
contain viruses. Anyone who communicates with us by e-mail is deemed
to have accepted these risks. Company Name is not responsible for
errors or omissions in this message and denies any responsibility for
any damage arising from the use of e-mail. Any opinion and other
statement contained in this message and any attachment are solely
those of the author and do not necessarily represent those of the
company.



Re: Memory leak in Pl/Python

From
Andrey Zhidenkov
Date:
I found a commit that fixes some memory leaks in 9.6 beta 2:

https://github.com/postgres/postgres/commit/8c75ad436f75fc629b61f601ba884c8f9313c9af#diff-4d0cb76412a1c4ee5d9c7f76ee489507

I'm interested in how Tom Lane checked that there are no more leaks in plpython.




Re: Memory leak in Pl/Python

From
Tom Lane
Date:
Andrey Zhidenkov <andrey.zhidenkov@gmail.com> writes:
> I see memory consumption in htop and pg_activity tools.

"top" can be pretty misleading if you don't know how to interpret its
output, specifically that you have to discount whatever it shows as
SHR space.  That just represents the amount of the shared memory block
that this process has touched so far in its lifetime; even if it appears
to be growing, it's not a leak.  That growth will stop eventually, once
the process has touched every available shared buffer.  RES minus SHR
is a fairer estimate of the process's own memory consumption.
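
[Editor's note, not part of the original message.] On Linux, the same "RES minus SHR" figure can be computed from /proc/<pid>/statm, whose second and third fields are the resident and shared page counts. The sketch below is only an illustration under that assumption; the function names are hypothetical, and the 4096-byte page size is just a default that the second function overrides with the system value.

```python
import os


def private_bytes_from_statm(statm_line, page_size=4096):
    """Return (resident - shared) in bytes from one /proc/<pid>/statm line.

    statm fields are counted in pages: size resident shared text lib data dt.
    """
    fields = statm_line.split()
    resident_pages = int(fields[1])
    shared_pages = int(fields[2])
    return (resident_pages - shared_pages) * page_size


def backend_private_bytes(pid):
    """Read /proc/<pid>/statm for a live process (Linux only)."""
    with open("/proc/%d/statm" % pid) as f:
        line = f.read()
    return private_bytes_from_statm(line, os.sysconf("SC_PAGE_SIZE"))
```

The pid of the backend serving a session can be obtained with "select pg_backend_pid()"; sampling backend_private_bytes(pid) every few thousand calls shows whether the process's own memory, rather than its touched share of shared buffers, is what keeps growing.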

I tried to reduce your example to a self-contained test case, thus:

create extension if not exists plpythonu;
create table test (test text);
create or replace
function test() returns bigint as $$
plpy.execute("insert into test(test) values ('test')")
return 1
$$ language plpythonu;
do $$
begin
  for i in 1..10000000 loop
    perform test();
  end loop;
end;
$$;

I do not see any significant leakage with this example.  There is some
memory growth, approximately 4 bytes per plpy.execute(), due to having to
keep track of a subtransaction XID for each uncommitted subtransaction.
That's not plpython's fault --- it would happen with any PL that executes
each SQL command as a separate subtransaction, which is probably all of
them other than plpgsql.  And it really ought to be negligible anyway in
any sane usage.
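
[Editor's note, not part of the original message.] To put the figure of roughly 4 bytes per call in perspective, here is a back-of-the-envelope sketch; the constant and function name are hypothetical, and the 4-byte value is simply the approximation quoted above. It assumes all calls run inside one transaction, as in the DO block.

```python
BYTES_PER_SUBXACT = 4  # approximate figure quoted in the message above


def subxact_overhead_bytes(calls_in_one_transaction):
    """Estimated bytes used to track subtransaction XIDs in one transaction."""
    return calls_in_one_transaction * BYTES_PER_SUBXACT


# The DO block above performs 10,000,000 calls in a single transaction.
total = subxact_overhead_bytes(10000000)
print("%d bytes (about %.1f MiB)" % (total, total / (1024.0 * 1024.0)))
```

So the whole ten-million-iteration loop would account for about 40 MB by this estimate. Because the bookkeeping is described as applying to uncommitted subtransactions, a client that commits after every call, like the psycopg2 script earlier in the thread, should by this reasoning see only a few bytes of such overhead at a time.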

It's possible that you're seeing some other, larger memory consumption;
for instance, if there were triggers or foreign keys on the "test" table
then perhaps there would be an opportunity for leakage in those.
But without a self-contained test case or any indication of the rate
of leakage you're seeing, it's hard to guess about the problem.
        regards, tom lane



Re: Memory leak in Pl/Python

From
Andrey Zhidenkov
Date:
Thank you for your answer, Tom.

I've tried the code in your example and I still see constantly growing
memory consumption (1 MB per second). As before, I do not see memory
growth if I use the query 'select 1' as the argument of plpy.execute().
The table test does not have any triggers or foreign keys; I just created
it with the script you provided.

I think it is a leak because my system starts using swap, and finally the
OOM killer kills one of the database processes and the database goes
into recovery mode. That is the problem...
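
[Editor's note, not part of the original message.] A growth rate per call is easier to compare between machines than a rate per second. The sketch below shows one way to derive it from periodic samples; the function name is hypothetical and the sample numbers are made up for illustration, not measurements from this thread.

```python
def bytes_per_call(samples):
    """Average memory growth per call between the first and last sample.

    samples: list of (calls_completed, private_bytes) tuples in call order.
    """
    (first_calls, first_bytes) = samples[0]
    (last_calls, last_bytes) = samples[-1]
    if last_calls == first_calls:
        raise ValueError("need samples taken at different call counts")
    return float(last_bytes - first_bytes) / (last_calls - first_calls)


# Illustrative, made-up numbers only.
example = [(0, 10000000), (50000, 12000000), (100000, 14000000)]
print(bytes_per_call(example))
```

A result near 4 bytes per call would match the subtransaction bookkeeping described earlier in the thread, while a result of hundreds or thousands of bytes per call would point to something else.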




Re: Memory leak in Pl/Python

From
Andrey Zhidenkov
Date:
It's very strange, but when I use the statement "update test set test =
'test' where id = 1" as the argument of plpy.execute(), memory does not
grow at all...




Re: Memory leak in Pl/Python

From
Tom Lane
Date:
Andrey Zhidenkov <andrey.zhidenkov@gmail.com> writes:
> It's very strange, but when I use expression 'update test set test =
> 'test' where id = 1' as argument of plpy.execute() memory do not
> growing at all...

Well, that suggests it's not particularly plpython's fault at all, but
a leak somewhere in the table update.  You still haven't provided a
self-contained test case, and this bit of information strongly suggests
that something about the test table is critical to reproducing the
problem.
        regards, tom lane