Обсуждение: Wait events for delayed checkpoints

Поиск
Список
Период
Сортировка

Wait events for delayed checkpoints

От
Thomas Munro
Дата:
Hi,

You can't tell if your checkpointer is spending a lot of time waiting
around for flags in delayChkptFlags to clear.  Trivial patch to add
that.  I've managed to see it a few times when checkpointing
repeatedly with a heavy pgbench workload.

I had to stop and think for a moment about whether these events belong
under "WaitEventIPC", "waiting for notification from another process"
or under "WaitEventTimeout", "waiting for a timeout to expire".  I
mean, both?  It's using sleep-and-poll instead of (say) a CV due to
the economics, we want to make the other side as cheap as possible, so
we don't care about making the checkpointer take some micro-naps in
this case.  I feel like the key point here is that it's waiting for
another process to do stuff and unblock it.

Вложения

Re: Wait events for delayed checkpoints

От
Robert Haas
Дата:
On Wed, Oct 11, 2023 at 9:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> You can't tell if your checkpointer is spending a lot of time waiting
> around for flags in delayChkptFlags to clear.  Trivial patch to add
> that.  I've managed to see it a few times when checkpointing
> repeatedly with a heavy pgbench workload.
>
> I had to stop and think for a moment about whether these events belong
> under "WaitEventIPC", "waiting for notification from another process"
> or under "WaitEventTimeout", "waiting for a timeout to expire".  I
> mean, both?  It's using sleep-and-poll instead of (say) a CV due to
> the economics, we want to make the other side as cheap as possible, so
> we don't care about making the checkpointer take some micro-naps in
> this case.  I feel like the key point here is that it's waiting for
> another process to do stuff and unblock it.

IPC seems right to me. Yeah, a timeout is being used, but as you say,
that's an implementation detail.

+1 for the idea, too.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Wait events for delayed checkpoints

От
Michael Paquier
Дата:
On Thu, Oct 12, 2023 at 01:32:29PM -0400, Robert Haas wrote:
> IPC seems right to me. Yeah, a timeout is being used, but as you say,
> that's an implementation detail.
>
> +1 for the idea, too.

Agreed that timeout makes little sense in this context, and IPC looks
correct.

+        pgstat_report_wait_start(WAIT_EVENT_CHECKPOINT_DELAY_START);
         do
         {
             pg_usleep(10000L);    /* wait for 10 msec */
         } while (HaveVirtualXIDsDelayingChkpt(vxids, nvxids,
                                               DELAY_CHKPT_START));
+        pgstat_report_wait_end();

HaveVirtualXIDsDelayingChkpt() does immediately a LWLockAcquire()
which would itself report a wait event for ProcArrayLock, overwriting
this new one, no?
--
Michael

Вложения

Re: Wait events for delayed checkpoints

От
Robert Haas
Дата:
On Thu, Oct 12, 2023 at 7:09 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, Oct 12, 2023 at 01:32:29PM -0400, Robert Haas wrote:
> > IPC seems right to me. Yeah, a timeout is being used, but as you say,
> > that's an implementation detail.
> >
> > +1 for the idea, too.
>
> Agreed that timeout makes little sense in this context, and IPC looks
> correct.
>
> +        pgstat_report_wait_start(WAIT_EVENT_CHECKPOINT_DELAY_START);
>          do
>          {
>              pg_usleep(10000L);    /* wait for 10 msec */
>          } while (HaveVirtualXIDsDelayingChkpt(vxids, nvxids,
>                                                DELAY_CHKPT_START));
> +        pgstat_report_wait_end();
>
> HaveVirtualXIDsDelayingChkpt() does immediately a LWLockAcquire()
> which would itself report a wait event for ProcArrayLock, overwriting
> this new one, no?

Ah, right: the wait event should be set and cleared around pg_usleep,
not the whole loop.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Wait events for delayed checkpoints

От
Thomas Munro
Дата:
On Fri, Oct 13, 2023 at 2:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Oct 12, 2023 at 7:09 PM Michael Paquier <michael@paquier.xyz> wrote:
> > HaveVirtualXIDsDelayingChkpt() does immediately a LWLockAcquire()
> > which would itself report a wait event for ProcArrayLock, overwriting
> > this new one, no?
>
> Ah, right: the wait event should be set and cleared around pg_usleep,
> not the whole loop.

Duh.  Yeah.  Pushed like that.  Thanks both.