Обсуждение: Re: per backend WAL statistics

Поиск
Список
Период
Сортировка

Re: per backend WAL statistics

От
Andres Freund
Дата:
Hi,

On 2025-01-16 15:59:31 +0000, Bertrand Drouvot wrote:
> On Wed, Jan 15, 2025 at 03:11:32PM +0900, Michael Paquier wrote:
> > On Fri, Jan 10, 2025 at 09:40:38AM +0000, Bertrand Drouvot wrote:
> > + * WAL pending statistics are incremented inside a critical section
> > + * (see XLogWrite()), so we can't use pgstat_prep_pending_entry() and we rely on
> > + * PendingBackendWalStats instead.
> > + */
> > +extern PGDLLIMPORT PgStat_PendingWalStats PendingBackendWalStats;
> > 
> > Hmm.  This makes me wonder if we should rethink a bit the way pending
> > entries are retrieved and if we should do it beforehand for the WAL
> > paths to avoid allocations in some critical sections.  Isn't that also
> > because we avoid calling pgstat_prep_backend_pending() for the I/O
> > case as only backends are supported now, discarding cases like the
> > checkpointer where I/O could happen in a critical path?  As a whole,
> > the approach taken by the patch is not really consistent with the
> > rest.

> I agree that's better to have a generic solution and to be consistent with
> the other variable-numbered stats. 
> 
> The attached is implementing in 0001 the proposition done in [1], i.e:
> 
> 1. It adds a new allow_critical_section to PgStat_KindInfo for pgstats kinds
> 2. It ensures to set temporarly allowInCritSection to true when needed
>
> Note that for safety reason 0001 does set allowInCritSection back to false
> unconditionally (means not checking again for allow_critical_section).

This is a preposterously bad idea.  The restriction to not allocate memory in
critical sections exists for a reason, why on earth should this code be
allowed to just opt out of the restriction of not allowing memory allocations
in critical sections?

The only cases where we can somewhat safely allocate memory in critical
sections is when using memory contexts with pre-reserved memory, where there's
a pretty low bound on how much memory is going to be needed. E.g. logging a
message inside a critical section, where elog.c can reset ErrorContext
afterwards.

Greetings,

Andres Freund



Re: per backend WAL statistics

От
Bertrand Drouvot
Дата:
Hi,

On Thu, Jan 16, 2025 at 11:38:47AM -0500, Andres Freund wrote:
> Hi,
> 
> On 2025-01-16 15:59:31 +0000, Bertrand Drouvot wrote:
> > On Wed, Jan 15, 2025 at 03:11:32PM +0900, Michael Paquier wrote:
> > > On Fri, Jan 10, 2025 at 09:40:38AM +0000, Bertrand Drouvot wrote:
> > > + * WAL pending statistics are incremented inside a critical section
> > > + * (see XLogWrite()), so we can't use pgstat_prep_pending_entry() and we rely on
> > > + * PendingBackendWalStats instead.
> > > + */
> > > +extern PGDLLIMPORT PgStat_PendingWalStats PendingBackendWalStats;
> > > 
> > > Hmm.  This makes me wonder if we should rethink a bit the way pending
> > > entries are retrieved and if we should do it beforehand for the WAL
> > > paths to avoid allocations in some critical sections.  Isn't that also
> > > because we avoid calling pgstat_prep_backend_pending() for the I/O
> > > case as only backends are supported now, discarding cases like the
> > > checkpointer where I/O could happen in a critical path?  As a whole,
> > > the approach taken by the patch is not really consistent with the
> > > rest.
> 
> > I agree that's better to have a generic solution and to be consistent with
> > the other variable-numbered stats. 
> > 
> > The attached is implementing in 0001 the proposition done in [1], i.e:
> > 
> > 1. It adds a new allow_critical_section to PgStat_KindInfo for pgstats kinds
> > 2. It ensures to set temporarly allowInCritSection to true when needed
> >
> > Note that for safety reason 0001 does set allowInCritSection back to false
> > unconditionally (means not checking again for allow_critical_section).
> 
> This is a preposterously bad idea.  The restriction to not allocate memory in
> critical sections exists for a reason,

Thanks for sharing your thoughts on it. In [1], you said:

"
My view is that for IO stats no memory allocation should be required - that
used to be the case and should be the case again
"

So, do you think that the initial proposal that has been made here (See R1. in
[2]) i.e make use of a new PendingBackendWalStats variable:

"
0003 does not rely on pgstat_prep_backend_pending() for its pending statistics
but on a new PendingBackendWalStats variable. The reason is that the pending wal
statistics are incremented in a critical section (see XLogWrite(), and so
a call to pgstat_prep_pending_entry() could trigger a failed assertion:
MemoryContextAllocZero()->"CritSectionCount == 0 || (context)->allowInCritSection"
"

and implemented up to v4 is a viable approach?

[1]: https://www.postgresql.org/message-id/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz%40y7dme5xwh2r5
[2]: https://www.postgresql.org/message-id/Z3zqc4o09dM/Ezyz%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: per backend WAL statistics

От
Andres Freund
Дата:
Hi,

On 2025-01-16 17:11:09 +0000, Bertrand Drouvot wrote:
> On Thu, Jan 16, 2025 at 11:38:47AM -0500, Andres Freund wrote:
> > Hi,
> > 
> > On 2025-01-16 15:59:31 +0000, Bertrand Drouvot wrote:
> > > On Wed, Jan 15, 2025 at 03:11:32PM +0900, Michael Paquier wrote:
> > > > On Fri, Jan 10, 2025 at 09:40:38AM +0000, Bertrand Drouvot wrote:
> > > > + * WAL pending statistics are incremented inside a critical section
> > > > + * (see XLogWrite()), so we can't use pgstat_prep_pending_entry() and we rely on
> > > > + * PendingBackendWalStats instead.
> > > > + */
> > > > +extern PGDLLIMPORT PgStat_PendingWalStats PendingBackendWalStats;
> > > > 
> > > > Hmm.  This makes me wonder if we should rethink a bit the way pending
> > > > entries are retrieved and if we should do it beforehand for the WAL
> > > > paths to avoid allocations in some critical sections.  Isn't that also
> > > > because we avoid calling pgstat_prep_backend_pending() for the I/O
> > > > case as only backends are supported now, discarding cases like the
> > > > checkpointer where I/O could happen in a critical path?  As a whole,
> > > > the approach taken by the patch is not really consistent with the
> > > > rest.
> > 
> > > I agree that's better to have a generic solution and to be consistent with
> > > the other variable-numbered stats. 
> > > 
> > > The attached is implementing in 0001 the proposition done in [1], i.e:
> > > 
> > > 1. It adds a new allow_critical_section to PgStat_KindInfo for pgstats kinds
> > > 2. It ensures to set temporarly allowInCritSection to true when needed
> > >
> > > Note that for safety reason 0001 does set allowInCritSection back to false
> > > unconditionally (means not checking again for allow_critical_section).
> > 
> > This is a preposterously bad idea.  The restriction to not allocate memory in
> > critical sections exists for a reason,
> 
> Thanks for sharing your thoughts on it. In [1], you said:
> 
> "
> My view is that for IO stats no memory allocation should be required - that
> used to be the case and should be the case again
> "
> 
> So, do you think that the initial proposal that has been made here (See R1. in
> [2]) i.e make use of a new PendingBackendWalStats variable:

Well, I think this first needs be fixed for for the IO stats change made in

commit 9aea73fc61d
Author: Michael Paquier <michael@paquier.xyz>
Date:   2024-12-19 13:19:22 +0900
 
    Add backend-level statistics to pgstats


Once we have a pattern to model after, we can apply the same scheme here.


> "
> 0003 does not rely on pgstat_prep_backend_pending() for its pending statistics
> but on a new PendingBackendWalStats variable. The reason is that the pending wal
> statistics are incremented in a critical section (see XLogWrite(), and so
> a call to pgstat_prep_pending_entry() could trigger a failed assertion:
> MemoryContextAllocZero()->"CritSectionCount == 0 || (context)->allowInCritSection"
> "
> 
> and implemented up to v4 is a viable approach?

Yes-ish.  I think it would be better to make it slightly more general than
that, handling this for all types of backend stats, not just for WAL.

Greetings,

Andres Freund



Re: per backend WAL statistics

От
Bertrand Drouvot
Дата:
Hi,

On Thu, Jan 23, 2025 at 05:05:30PM +0900, Michael Paquier wrote:
> On Tue, Jan 21, 2025 at 07:19:55AM +0000, Bertrand Drouvot wrote:
> > PFA v6 that now relies on the new PendingBackendStats variable introduced in
> > 4feba03d8b9.
> > 
> > Remark: I moved PendingBackendStats back to pgstat.h because I think that the
> > "simple" pending stats increment that we are adding in xlog.c are not worth
> > an extra function call overhead (while it made more sense for the more complex IO
> > stats handling). So PendingBackendStats is now visible to the outside world like
> > PendingWalStats and friends.
> 
> You are re-doing here a pattern I was trying to avoid so as we don't
> copy-paste more checks based on pgstat_tracks_backend_bktype more than
> necessary.

I'm not sure I get it. pgstat_tracks_backend_bktype() is also called in
pgstat_count_backend_io_op() and pgstat_count_backend_io_op_time(). What issue
do you see with the extra calls part of this patch?

> I am wondering if we should think harder about the
> interface used to register WAL stats, and make it more consistent with
> the way pg_stat_io is handled, avoiding the hardcoded attribute
> numbers if we have an enum to control which field to update in some
> input routine.

Not sure as WAL stats just tracks a single dimension unlike IO stats which track
both IOObject and IOContext. What would be the benefit(s)?

> As we have only five counters in PgStat_PendingWalStats, the result
> you have is not that invasive, true.

And only one dimension.

> Are you sure that the interactions between pgWalUsage, prevWalUsage
> and prevBackendWalUsage are correct?

I think so and according to my testing I can see WalUsage values
that correlate nicely between pg_stat_wal() and pg_stat_get_backend_wal().

> As far I got it from a code
> read, prevWalUsage, prevBackendWalUsage and their local trackings in
> pgstat_backend.c and pgstat_wal.c rely on instrument.c as the primary
> source, as pgWalUsage can never be reset.  Is that right?

yeah, IIUC pgWalUsage acts as the primary source that both prevWalUsage and
prevBackendWalUsage diff against to calculate incremental stats.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com