Обсуждение: [HACKERS] Rethinking autovacuum.c memory handling

Поиск
Список
Период
Сортировка

[HACKERS] Rethinking autovacuum.c memory handling

От
Tom Lane
Дата:
I notice that autovacuum.c calls autovacuum_do_vac_analyze, and
thereby vacuum(), in TopTransactionContext.  This doesn't seem
like a terribly great idea, because it doesn't correspond to what
happens during a manually-invoked vacuum.  TopTransactionContext
will go away when vacuum() commits the outer transaction, whereas
in non-autovac usage, we call vacuum() in a PortalHeapMemory
context that is not a child of TopTransactionContext and is not
at risk of being reset multiple times during the vacuum().  This'd
be a hazard if autovacuum_do_vac_analyze or vacuum did any palloc's
before getting to the main loop.  More generally, I'm not aware of
other cases where we invoke a function in a context that we know
that function will destroy as it executes.

I don't see any live bug associated with this in HEAD, but this behavior
requires a rather ugly (and memory-leaking) workaround in the proposed
patch to allow multiple vacuum target rels.

What I think we should do instead is invoke autovacuum_do_vac_analyze
in the PortalContext that do_autovacuum has created, which we already
have a mechanism to reset once per table processed in do_autovacuum.

The attached patch does that, and also modifies perform_work_item()
to use the same approach.  Right now perform_work_item() has a
copied-and-pasted MemoryContextResetAndDeleteChildren(PortalContext)
call in its error recovery path, but that seems a bit out of place
given that perform_work_item() isn't using PortalContext otherwise.

Comments, objections?

            regards, tom lane

PS: I was disappointed to find out that perform_work_item() isn't
exercised at all in the standard regression tests.

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index b745d89..1a32433 100644
*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
*************** do_autovacuum(void)
*** 2444,2451 ****
           */
          PG_TRY();
          {
              /* have at it */
-             MemoryContextSwitchTo(TopTransactionContext);
              autovacuum_do_vac_analyze(tab, bstrategy);

              /*
--- 2444,2453 ----
           */
          PG_TRY();
          {
+             /* Use PortalContext for any per-table allocations */
+             MemoryContextSwitchTo(PortalContext);
+
              /* have at it */
              autovacuum_do_vac_analyze(tab, bstrategy);

              /*
*************** do_autovacuum(void)
*** 2482,2487 ****
--- 2484,2492 ----
          }
          PG_END_TRY();

+         /* Make sure we're back in AutovacMemCxt */
+         MemoryContextSwitchTo(AutovacMemCxt);
+
          did_vacuum = true;

          /* the PGXACT flags are reset at the next end of transaction */
*************** perform_work_item(AutoVacuumWorkItem *wo
*** 2614,2619 ****
--- 2619,2627 ----

      autovac_report_workitem(workitem, cur_nspname, cur_datname);

+     /* clean up memory before each work item */
+     MemoryContextResetAndDeleteChildren(PortalContext);
+
      /*
       * We will abort the current work item if something errors out, and
       * continue with the next one; in particular, this happens if we are
*************** perform_work_item(AutoVacuumWorkItem *wo
*** 2622,2630 ****
       */
      PG_TRY();
      {
!         /* have at it */
!         MemoryContextSwitchTo(TopTransactionContext);

          switch (workitem->avw_type)
          {
              case AVW_BRINSummarizeRange:
--- 2630,2639 ----
       */
      PG_TRY();
      {
!         /* Use PortalContext for any per-work-item allocations */
!         MemoryContextSwitchTo(PortalContext);

+         /* have at it */
          switch (workitem->avw_type)
          {
              case AVW_BRINSummarizeRange:
*************** perform_work_item(AutoVacuumWorkItem *wo
*** 2668,2673 ****
--- 2677,2685 ----
      }
      PG_END_TRY();

+     /* Make sure we're back in AutovacMemCxt */
+     MemoryContextSwitchTo(AutovacMemCxt);
+
      /* We intentionally do not set did_vacuum here */

      /* be tidy */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Michael Paquier
Дата:
On Sat, Sep 23, 2017 at 6:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I notice that autovacuum.c calls autovacuum_do_vac_analyze, and
> thereby vacuum(), in TopTransactionContext.  This doesn't seem
> like a terribly great idea, because it doesn't correspond to what
> happens during a manually-invoked vacuum.

Indeed, the inconsistency is not good here.

> What I think we should do instead is invoke autovacuum_do_vac_analyze
> in the PortalContext that do_autovacuum has created, which we already
> have a mechanism to reset once per table processed in do_autovacuum.
>
> The attached patch does that, and also modifies perform_work_item()
> to use the same approach.  Right now perform_work_item() has a
> copied-and-pasted MemoryContextResetAndDeleteChildren(PortalContext)
> call in its error recovery path, but that seems a bit out of place
> given that perform_work_item() isn't using PortalContext otherwise.

I have spent some time looking at your patch and testing it. This
looks sane. A small comment that I have would be to add an assertion
at the top of perform_work_item to be sure that it is called in the
memory context of AutovacMemCxt.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> I notice that autovacuum.c calls autovacuum_do_vac_analyze, and
> thereby vacuum(), in TopTransactionContext.  This doesn't seem
> like a terribly great idea, because it doesn't correspond to what
> happens during a manually-invoked vacuum.  TopTransactionContext
> will go away when vacuum() commits the outer transaction, whereas
> in non-autovac usage, we call vacuum() in a PortalHeapMemory
> context that is not a child of TopTransactionContext and is not
> at risk of being reset multiple times during the vacuum().  This'd
> be a hazard if autovacuum_do_vac_analyze or vacuum did any palloc's
> before getting to the main loop.  More generally, I'm not aware of
> other cases where we invoke a function in a context that we know
> that function will destroy as it executes.

Oh, good catch.  This must be a very old oversight -- I bet it goes all
the way back to the introduction of autovacuum.  I think if there were
any actual bugs, we would have noticed by now.

> I don't see any live bug associated with this in HEAD, but this behavior
> requires a rather ugly (and memory-leaking) workaround in the proposed
> patch to allow multiple vacuum target rels.

Well, it's definitely broken, so I'd vote for fixing it even if we
weren't considering that patch.

> What I think we should do instead is invoke autovacuum_do_vac_analyze
> in the PortalContext that do_autovacuum has created, which we already
> have a mechanism to reset once per table processed in do_autovacuum.

Sounds sensible.

> The attached patch does that, and also modifies perform_work_item()
> to use the same approach.  Right now perform_work_item() has a
> copied-and-pasted MemoryContextResetAndDeleteChildren(PortalContext)
> call in its error recovery path, but that seems a bit out of place
> given that perform_work_item() isn't using PortalContext otherwise.

Oops :-(

> PS: I was disappointed to find out that perform_work_item() isn't
> exercised at all in the standard regression tests.

Yeah ... It's fairly simple to create a test that will invoke it a few
times, but one problem is that it depends on autovacuum's timing.  If I
add this in brin.sql

-- Test BRIN work items
CREATE TABLE brin_work_items (LIKE brintest) WITH (fillfactor = 10);
CREATE INDEX brin_work_items_idx ON brin_work_items USING brin (textcol)   WITH (autosummarize = on,
pages_per_range=1);
INSERT INTO brin_work_items SELECT * FROM brintest;

the work item is only performed about 15 seconds after the complete
regression tests are finished, which I fear would mean "never" in
practice.

One idea I just had is to somehow add it to src/test/modules/brin, and
set up the postmaster in that test with autovacuum_naptime=1s.  I'll go
check how feasible that is.  (By placing it there we could also verify
that the index does indeed contain the index entries we expect, since it
has pageinspect available.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Alvaro Herrera
Дата:
Alvaro Herrera wrote:

> One idea I just had is to somehow add it to src/test/modules/brin, and
> set up the postmaster in that test with autovacuum_naptime=1s.  I'll go
> check how feasible that is.  (By placing it there we could also verify
> that the index does indeed contain the index entries we expect, since it
> has pageinspect available.)

Yeah, this works, as per the attached patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
"Bossart, Nathan"
Дата:
On 9/23/17, 5:27 AM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>On Sat, Sep 23, 2017 at 6:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I notice that autovacuum.c calls autovacuum_do_vac_analyze, and
>> thereby vacuum(), in TopTransactionContext.  This doesn't seem
>> like a terribly great idea, because it doesn't correspond to what
>> happens during a manually-invoked vacuum.
>
> Indeed, the inconsistency is not good here.
>
>> What I think we should do instead is invoke autovacuum_do_vac_analyze
>> in the PortalContext that do_autovacuum has created, which we already
>> have a mechanism to reset once per table processed in do_autovacuum.
>>
>> The attached patch does that, and also modifies perform_work_item()
>> to use the same approach.  Right now perform_work_item() has a
>> copied-and-pasted MemoryContextResetAndDeleteChildren(PortalContext)
>> call in its error recovery path, but that seems a bit out of place
>> given that perform_work_item() isn't using PortalContext otherwise.
>
> I have spent some time looking at your patch and testing it. This
> looks sane. A small comment that I have would be to add an assertion
> at the top of perform_work_item to be sure that it is called in the
> memory context of AutovacMemCxt.

This looks reasonable to me as well.  I haven't noticed any issues after
a couple hours of pgbench with aggressive autovacuum settings, either.

Nathan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> I have spent some time looking at your patch and testing it. This
> looks sane. A small comment that I have would be to add an assertion
> at the top of perform_work_item to be sure that it is called in the
> memory context of AutovacMemCxt.

Done like that, thanks for reviewing!
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Tom Lane
Дата:
"Bossart, Nathan" <bossartn@amazon.com> writes:
> This looks reasonable to me as well.  I haven't noticed any issues after
> a couple hours of pgbench with aggressive autovacuum settings, either.

Thanks for looking.  As I'm sure you realize, what motivated that was
not liking the switch into AutovacMemCxt that you'd added in
autovacuum_do_vac_analyze ... with this patch, we can drop that.
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
"Bossart, Nathan"
Дата:
On 9/23/17, 12:36 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>"Bossart, Nathan" <bossartn@amazon.com> writes:
>> This looks reasonable to me as well.  I haven't noticed any issues after
>> a couple hours of pgbench with aggressive autovacuum settings, either.
>
> Thanks for looking.  As I'm sure you realize, what motivated that was
> not liking the switch into AutovacMemCxt that you'd added in
> autovacuum_do_vac_analyze ... with this patch, we can drop that.

Yup.  I’ll go ahead and update that patch now.

Nathan



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rethinking autovacuum.c memory handling

От
Michael Paquier
Дата:
On Sun, Sep 24, 2017 at 2:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> I have spent some time looking at your patch and testing it. This
>> looks sane. A small comment that I have would be to add an assertion
>> at the top of perform_work_item to be sure that it is called in the
>> memory context of AutovacMemCxt.
>
> Done like that, thanks for reviewing!

Thanks for considering my idea.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers