Обсуждение: buffer refcount leak in foreign batch insert code

Поиск
Список
Период
Сортировка

buffer refcount leak in foreign batch insert code

От
Alexander Pyhalov
Дата:
Hi.

We've found that in cases like the one attached, when we insert into 
foreign partition with batch_size set, buffer refcount leak is detected.

The above example we see a dozen of similar messages:

repro_small.sql:31: WARNING:  buffer refcount leak: [14621] 
(rel=base/16718/16732, blockNum=54, flags=0x93800000

The issue was introduced in the following commit

commit b676ac443b6a83558d4701b2dd9491c0b37e17c4
Author: Tomas Vondra <tomas.vondra@postgresql.org>
Date:   Fri Jun 11 20:19:48 2021 +0200

     Optimize creation of slots for FDW bulk inserts

In this commit we avoid recreating slots for each batch. But it seems 
that created slots should still be cleared in the end of 
ExecBatchInsert().

At least the attached patch seems to fix the issue.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Вложения

Re: buffer refcount leak in foreign batch insert code

От
Michael Paquier
Дата:
On Fri, Apr 21, 2023 at 01:07:03PM +0300, Alexander Pyhalov wrote:
> We've found that in cases like the one attached, when we insert into foreign
> partition with batch_size set, buffer refcount leak is detected.
>
> The above example we see a dozen of similar messages:
>
> repro_small.sql:31: WARNING:  buffer refcount leak: [14621]
> (rel=base/16718/16732, blockNum=54, flags=0x93800000

Indeed, nice repro!  That's obviously wrong, I'll look into that.
--
Michael

Вложения

Re: buffer refcount leak in foreign batch insert code

От
Michael Paquier
Дата:
On Fri, Apr 21, 2023 at 07:16:03PM +0900, Michael Paquier wrote:
> On Fri, Apr 21, 2023 at 01:07:03PM +0300, Alexander Pyhalov wrote:
>> We've found that in cases like the one attached, when we insert into foreign
>> partition with batch_size set, buffer refcount leak is detected.
>>
>> The above example we see a dozen of similar messages:
>>
>> repro_small.sql:31: WARNING:  buffer refcount leak: [14621]
>> (rel=base/16718/16732, blockNum=54, flags=0x93800000

This reminds me a lot of the recent multi-insert logic added to
various DDL code paths for catalogs, bref..

The number of slots ready for a batch is tracked by ri_NumSlots, and
it is reset to 0 each time a batch has been processed.  How about
resetting the counter at the same place as the slots are cleared, at
the end of ExecBatchInsert() as the same time as when the slots are
cleared?

I was wondering as well about resetting the slot just before copying
something into ri_Slots in ExecInsert() (actually close to the slot
copy), which is something that would make the operation cheaper for
large batches because the last batch could be cleaned up with
ExecEndModifyTable(), but this makes the code much messier when a
tuple is added into one of the slots as we would need to switch
back-and-forth with es_query_cxt from what I can see, because th
batches are inserted before any slot initialization is done.  In
short, I'm okay with what's proposed here and clean up things at the
same time as ri_NumSlots.

Another thing was the interaction of this change with triggers
(delete, insert with returning and batches to flush pending inserts,
etc.), and that looked correct to me (I have plugged in some of these
triggers noisy on notices on the relations of the partitions tree).

Self-reminder: the tests of postgres_fdw are rather long now, perhaps
these should be split into more files in the future..

The attached is what I am finishing with, with a minimal regression
test added to postgres_fdw.  Two partitions are enough.  Alexander,
what do you think?
--
Michael

Вложения

Re: buffer refcount leak in foreign batch insert code

От
Michael Paquier
Дата:
On Mon, Apr 24, 2023 at 09:57:10AM +0900, Michael Paquier wrote:
> The attached is what I am finishing with, with a minimal regression
> test added to postgres_fdw.  Two partitions are enough.

Well, I have gone through that again this morning, and applied the fix
down to 14.  The buildfarm is digesting it fine.
--
Michael

Вложения

Re: buffer refcount leak in foreign batch insert code

От
Alexander Pyhalov
Дата:
Michael Paquier писал 2023-04-25 04:30:
> On Mon, Apr 24, 2023 at 09:57:10AM +0900, Michael Paquier wrote:
>> The attached is what I am finishing with, with a minimal regression
>> test added to postgres_fdw.  Two partitions are enough.
> 
> Well, I have gone through that again this morning, and applied the fix
> down to 14.  The buildfarm is digesting it fine.
> --
> Michael

Thank you. Sorry for the late response, was on vacation.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional