Обсуждение: Custom pgstat support performance regression for simple queries

Поиск
Список
Период
Сортировка

Custom pgstat support performance regression for simple queries

От
Andres Freund
Дата:
Hi,

I wanted to run some performance tests for [1] and looked at the profile of a
workload with a lot of short queries. And was rather surprised to see
pgstat_report_stat() to be the top entry - that certainly didn't use to be the
case.

For the, obviously rather extreme, workload of a pgbench session just running
  SELECT 1;
I see about a 6% regression from 17.

That seems too high, even for such an absurd workload.


It's one thing to have a loop over all potential stats kinds when we know
there are pending stats, that's not that frequent. But fc415edf8ca8 changed a
test consisting out of 3 checks of a variable and one direct function call to
a loop over an array with 256 elements, with a number of indirect function
calls - all of that happens *before* pgstat_report_stat() decides that there
are no stats to report.

It seems rather unsurprising that that causes a slowdown.

The pre-check is there to:
    /* Don't expend a clock check if nothing to do */

but you made it way more expensive than a clock check would have been (not
counting old vmware installs or such, where clock checks take ages).

If I change the loop count to only be the builtin stats kinds, the overhead
goes away almost completely.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/aGKSzFlpQWSh%2F%2B2w%40ip-10-97-1-34.eu-west-3.compute.internal



Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Tue, Jul 22, 2025 at 10:57:06AM -0400, Andres Freund wrote:
> It seems rather unsurprising that that causes a slowdown.
>
> The pre-check is there to:
>     /* Don't expend a clock check if nothing to do */
>
> but you made it way more expensive than a clock check would have been (not
> counting old vmware installs or such, where clock checks take ages).
>
> If I change the loop count to only be the builtin stats kinds, the overhead
> goes away almost completely.

Noted.  I was wondering originally if the threshold for the builtin
and custom kinds should be lowered, as well.  Perhaps that's just too
optimistic, so we can first lower things.  Say PGSTAT_KIND_CUSTOM_MIN
to 32 and PGSTAT_KIND_MAX to 48 or so to begin with?  I don't see a
strict policy here, but these would make the whole cheaper to begin
with, with enough margin for any new stats kinds.

Then, about the loop used here, I'd be OK to keep the past shortcuts
for the builtin fixed-sized stats kinds with the requirement that new
builtin stats kinds need to hack into pgstat_report_stat() themselves
on efficiency grounds, combined with a second loop for custom kinds,
taken only if pgstat_register_kind() has been used by tracking if
that's the case in a flag.  Do you have a specific preference here?
--
Michael

Вложения

Re: Custom pgstat support performance regression for simple queries

От
Bertrand Drouvot
Дата:
Hi,

On Wed, Jul 23, 2025 at 09:54:12AM +0900, Michael Paquier wrote:
> Then, about the loop used here, I'd be OK to keep the past shortcuts
> for the builtin fixed-sized stats kinds with the requirement that new
> builtin stats kinds need to hack into pgstat_report_stat() themselves
> on efficiency grounds

Maybe we could use a flag, say:

#define PGSTAT_PENDING_IO     (1 << 0) 
#define PGSTAT_PENDING_WAL    (1 << 1)
#define PGSTAT_PENDING_SLRU   (1 << 2)

and check for a pgstat_pending_mask in pgstat_report_stat() instead?

They would need to set pgstat_pending_mask accordingly when they flush, have
pending stats though.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Jul 23, 2025, at 16:33, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> Maybe we could use a flag, say:
>
> #define PGSTAT_PENDING_IO     (1 << 0)
> #define PGSTAT_PENDING_WAL    (1 << 1)
> #define PGSTAT_PENDING_SLRU   (1 << 2)
>
> and check for a pgstat_pending_mask in pgstat_report_stat() instead?
>
> They would need to set pgstat_pending_mask accordingly when they flush, have
> pending stats though.

The point is just to say to the report path to move further if at least one of the fixed stats kinds has something to
flush,then let the loop do its work across all the stats kinds if they have a flush callback, so we don’t really need
tomix multiple numbers; we could just have a single boolean flag that any fixed-sized stats kinds can set to let the
reportingknow that some activity has happened. This would mean that custom fixed-sized kinds would need to interact
withthis flag as well, but that makes the fast-exit path dirty cheap with or without custom stats kinds. 
--
Michael





Re: Custom pgstat support performance regression for simple queries

От
Bertrand Drouvot
Дата:
On Wed, Jul 23, 2025 at 05:09:54PM +0900, Michael Paquier wrote:
> On Jul 23, 2025, at 16:33, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> > Maybe we could use a flag, say:
> > 
> > #define PGSTAT_PENDING_IO     (1 << 0)
> > #define PGSTAT_PENDING_WAL    (1 << 1)
> > #define PGSTAT_PENDING_SLRU   (1 << 2)
> > 
> > and check for a pgstat_pending_mask in pgstat_report_stat() instead?
> > 
> > They would need to set pgstat_pending_mask accordingly when they flush, have
> > pending stats though.
> 
> The point is just to say to the report path to move further if at least one of
> the fixed stats kinds has something to flush, then let the loop do its work
> across all the stats kinds if they have a flush callback,

Right.

> so we don’t really need to mix multiple numbers; we could just have a single
> boolean flag that any fixed-sized stats kinds can set to let the reporting know
> that some activity has happened.

That works to say "there are pending stats" but not well to say "there are no
pending stats".

Indeed, with a single boolean flag, then how could a stat say that it has nothing
pending anymore (when flushing) without saying "all the stats have nothing
pending" (while some may still have pending stats)?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Custom pgstat support performance regression for simple queries

От
Andres Freund
Дата:
Hi,

On 2025-07-23 10:23:53 +0000, Bertrand Drouvot wrote:
> On Wed, Jul 23, 2025 at 05:09:54PM +0900, Michael Paquier wrote:
> > so we don’t really need to mix multiple numbers; we could just have a single
> > boolean flag that any fixed-sized stats kinds can set to let the reporting know
> > that some activity has happened.
> 
> That works to say "there are pending stats" but not well to say "there are no
> pending stats".
> 
> Indeed, with a single boolean flag, then how could a stat say that it has nothing
> pending anymore (when flushing) without saying "all the stats have nothing
> pending" (while some may still have pending stats)?

I don't think that's a problem - reset that global flag after checking it at
the start of pgstat_report_stat() and set it to true if partial_flush is true
at the end of pgstat_report_stat().

Greetings,

Andres Freund



Re: Custom pgstat support performance regression for simple queries

От
Andres Freund
Дата:
Hi,

On 2025-07-23 09:54:12 +0900, Michael Paquier wrote:
> On Tue, Jul 22, 2025 at 10:57:06AM -0400, Andres Freund wrote:
> > It seems rather unsurprising that that causes a slowdown.
> > 
> > The pre-check is there to:
> >     /* Don't expend a clock check if nothing to do */
> > 
> > but you made it way more expensive than a clock check would have been (not
> > counting old vmware installs or such, where clock checks take ages).
> > 
> > If I change the loop count to only be the builtin stats kinds, the overhead
> > goes away almost completely.
> 
> Noted.  I was wondering originally if the threshold for the builtin
> and custom kinds should be lowered, as well.  Perhaps that's just too
> optimistic, so we can first lower things.  Say PGSTAT_KIND_CUSTOM_MIN
> to 32 and PGSTAT_KIND_MAX to 48 or so to begin with?  I don't see a
> strict policy here, but these would make the whole cheaper to begin
> with, with enough margin for any new stats kinds.

Yes, 256 seems pointlessly large.


> Then, about the loop used here, I'd be OK to keep the past shortcuts
> for the builtin fixed-sized stats kinds with the requirement that new
> builtin stats kinds need to hack into pgstat_report_stat() themselves
> on efficiency grounds, combined with a second loop for custom kinds,
> taken only if pgstat_register_kind() has been used by tracking if
> that's the case in a flag.  Do you have a specific preference here?

I think the downstream approach of having a flag that says if there are *any*
pending stats is better.

Greetings,

Andres Freund



Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Wed, Jul 23, 2025 at 11:59:11AM -0400, Andres Freund wrote:
> On 2025-07-23 10:23:53 +0000, Bertrand Drouvot wrote:
>> Indeed, with a single boolean flag, then how could a stat say that it has nothing
>> pending anymore (when flushing) without saying "all the stats have nothing
>> pending" (while some may still have pending stats)?
>
> I don't think that's a problem - reset that global flag after checking it at
> the start of pgstat_report_stat() and set it to true if partial_flush is true
> at the end of pgstat_report_stat().

That's one way of doing it.  While looking at the code, I tend to
think that it is actually simpler to reset it once at the bottom of
pgstat_report_fixed, not touching it at all if we are in a
partial_flush state because we know that we will need to do one report
later on anyway, be it just for the stats in the dshash.  If we do
that, we can also skip entirely the flush_static_cb calls if
pgstat_report_fixed is not set while doing the reports.
--
Michael

Вложения

Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Wed, Jul 23, 2025 at 12:00:55PM -0400, Andres Freund wrote:
> On 2025-07-23 09:54:12 +0900, Michael Paquier wrote:
>> Noted.  I was wondering originally if the threshold for the builtin
>> and custom kinds should be lowered, as well.  Perhaps that's just too
>> optimistic, so we can first lower things.  Say PGSTAT_KIND_CUSTOM_MIN
>> to 32 and PGSTAT_KIND_MAX to 48 or so to begin with?  I don't see a
>> strict policy here, but these would make the whole cheaper to begin
>> with, with enough margin for any new stats kinds.
>
> Yes, 256 seems pointlessly large.

I still cannot put my finger on what a good number should be, but I've
settled to something that I think should be good enough for an initial
release:
PGSTAT_KIND_CUSTOM_MIN 128 -> 24
PGSTAT_KIND_MAX 256 -> 32

These numbers mean that we have enough room for 7 more builtins kinds,
same for custom kinds.  Even if this is an issue, this could always be
moved up at some point even across major releases, even if that would
mean for extension developers to switch to a different ID.

>> Then, about the loop used here, I'd be OK to keep the past shortcuts
>> for the builtin fixed-sized stats kinds with the requirement that new
>> builtin stats kinds need to hack into pgstat_report_stat() themselves
>> on efficiency grounds, combined with a second loop for custom kinds,
>> taken only if pgstat_register_kind() has been used by tracking if
>> that's the case in a flag.  Do you have a specific preference here?
>
> I think the downstream approach of having a flag that says if there are *any*
> pending stats is better.

Works for me.  I am finishing with the attached, meaning that
workloads that trigger no stats can take the fast-exit path with a
single boolean check.  What do you think?
--
Michael

Вложения

Re: Custom pgstat support performance regression for simple queries

От
Bertrand Drouvot
Дата:
Hi,

On Thu, Jul 24, 2025 at 09:34:45AM +0900, Michael Paquier wrote:
> On Wed, Jul 23, 2025 at 12:00:55PM -0400, Andres Freund wrote:
> > On 2025-07-23 09:54:12 +0900, Michael Paquier wrote:
> >> Noted.  I was wondering originally if the threshold for the builtin
> >> and custom kinds should be lowered, as well.  Perhaps that's just too
> >> optimistic, so we can first lower things.  Say PGSTAT_KIND_CUSTOM_MIN
> >> to 32 and PGSTAT_KIND_MAX to 48 or so to begin with?  I don't see a
> >> strict policy here, but these would make the whole cheaper to begin
> >> with, with enough margin for any new stats kinds.
> > 
> > Yes, 256 seems pointlessly large.
> 
> I still cannot put my finger on what a good number should be, but I've
> settled to something that I think should be good enough for an initial
> release:
> PGSTAT_KIND_CUSTOM_MIN 128 -> 24
> PGSTAT_KIND_MAX 256 -> 32
> 
> These numbers mean that we have enough room for 7 more builtins kinds,

11 for builtins kinds? (from 13 to 23)

> same for custom kinds.

9 for custom kinds including experimental? (24 -> 32)

I think that gives some room (like we'll almost double the builtins kinds if we
were to use them all).

> Even if this is an issue, this could always be
> moved up at some point even across major releases, even if that would
> mean for extension developers to switch to a different ID.

Yeah...

  * development and have not reserved their own unique kind ID yet. See:
  * https://wiki.postgresql.org/wiki/CustomCumulativeStats
  */
-#define PGSTAT_KIND_EXPERIMENTAL       128
+#define PGSTAT_KIND_EXPERIMENTAL       24

Note that we'll have to update the wiki too.

> >> Then, about the loop used here, I'd be OK to keep the past shortcuts
> >> for the builtin fixed-sized stats kinds with the requirement that new
> >> builtin stats kinds need to hack into pgstat_report_stat() themselves
> >> on efficiency grounds, combined with a second loop for custom kinds,
> >> taken only if pgstat_register_kind() has been used by tracking if
> >> that's the case in a flag.  Do you have a specific preference here?
> > 
> > I think the downstream approach of having a flag that says if there are *any*
> > pending stats is better.
> 
> Works for me.  I am finishing with the attached, meaning that
> workloads that trigger no stats can take the fast-exit path with a
> single boolean check.  What do you think?

That would work but I'm a bit worried about:

- it's easy to miss and error prone to update pgstat_report_fixed (for new code
path, new builtin fixed-sized stats kinds)

- it's easy to miss for extensions that they have to update this global variable.
Indeed, the cb have_static_pending_cb has been removed (there is the new comment
in PgStat_KindInfo though, but I've the feeling it's easier to miss as compared
to the cb)

What about?

- create a global variable but that should be used only by extensions? Say,
pgstat_report_custom_fixed

- for builtin fixed-sized, just rely on have_iostats, have_slrustats and friends

Then in pgstat_report_stat():

- do the check on have_iostats, have_slrustats and friends and on
pgstat_report_custom_fixed

- reset pgstat_report_custom_fixed

That way I think it's less error prone for the builtin stats and more clear
for the extensions (as at least "custom" is also part of the global variable
name and the check in pgstat_report_stat() would make it clear that we rely on
the custom global variable).

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Thu, Jul 24, 2025 at 07:38:46AM +0000, Bertrand Drouvot wrote:
> On Thu, Jul 24, 2025 at 09:34:45AM +0900, Michael Paquier wrote:
>> These numbers mean that we have enough room for 7 more builtins kinds,
>
> 11 for builtins kinds? (from 13 to 23)
>
> 9 for custom kinds including experimental? (24 -> 32)
>
> I think that gives some room (like we'll almost double the builtins kinds if we
> were to use them all).

Yes.  Problems with a lack of fingers and caffeine, perhaps.

>   * development and have not reserved their own unique kind ID yet. See:
>   * https://wiki.postgresql.org/wiki/CustomCumulativeStats
>   */
> -#define PGSTAT_KIND_EXPERIMENTAL       128
> +#define PGSTAT_KIND_EXPERIMENTAL       24
>
> Note that we'll have to update the wiki too.

Yep, that was on my TODO list.

> What about?
>
> - create a global variable but that should be used only by extensions? Say,
> pgstat_report_custom_fixed
>
> - for builtin fixed-sized, just rely on have_iostats, have_slrustats and friends
>
> Then in pgstat_report_stat():
>
> - do the check on have_iostats, have_slrustats and friends and on
> pgstat_report_custom_fixed
>
> - reset pgstat_report_custom_fixed
>
> That way I think it's less error prone for the builtin stats and more clear
> for the extensions (as at least "custom" is also part of the global variable
> name and the check in pgstat_report_stat() would make it clear that we rely on
> the custom global variable).
>
> Thoughts?

FWIW, I see more benefits in the simplicity of a single flag to govern
both builtin and custom kinds, but I can see that this may come down
to personal taste.  A single flag is simpler here, and cheap.

What have_static_pending_cb was originally aiming for is the removal
of any knowledge specific to stats kinds in the central pgstats APIs,
which is why the flags specific to IO, SLRU and WAL have been moved
out into their own files.

Letting custom stats manipulate pgstat_report_fixed or invent a new
pgstat_report_custom_fixed would be logically the same thing, but
this comes down to the fact that we still want to decide if a report
should be triggered if any of the fixed-numbered stats want to let
pgstat_report_stat() do one round of pending stats flush.

Having a second flag would keep this abstraction level intact.  Now
that leads to overcomplicating the API for a small gain in
debuggability to know which part of the subsystem wants the report to
happen at early stages of pgstat_report_stat().  If we want to know
exactly which stats kind wants the flush to happen, it may be better
to reuse your idea of one single uint32 or uint64 with one bit
allocated for each stats kind to have this information available in
pgstat_report_fixed().  However, we already know that with the
flush_static_cb callback, as dedicated paths can be taken for each
stats kinds.  Once again, this would require stats kind to set their
bit to force a round of reports, gaining more information before we
try to call each flush_static_cb.

I am going to apply the patch to lower the bounds in a bit, to begin
with, then adjust the wiki.
--
Michael

Вложения

Re: Custom pgstat support performance regression for simple queries

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Jul 25, 2025 at 10:38:59AM +0900, Michael Paquier wrote:
> On Thu, Jul 24, 2025 at 07:38:46AM +0000, Bertrand Drouvot wrote:
> > What about?
> > 
> > - create a global variable but that should be used only by extensions? Say,
> > pgstat_report_custom_fixed
> > 
> > - for builtin fixed-sized, just rely on have_iostats, have_slrustats and friends
> > 
> > Then in pgstat_report_stat():
> > 
> > - do the check on have_iostats, have_slrustats and friends and on
> > pgstat_report_custom_fixed
> > 
> > - reset pgstat_report_custom_fixed
> > 
> > That way I think it's less error prone for the builtin stats and more clear
> > for the extensions (as at least "custom" is also part of the global variable
> > name and the check in pgstat_report_stat() would make it clear that we rely on
> > the custom global variable).
> > 
> > Thoughts?
> 
> FWIW, I see more benefits in the simplicity of a single flag to govern
> both builtin and custom kinds, but I can see that this may come down
> to personal taste.  A single flag is simpler here, and cheap.
> 
> What have_static_pending_cb was originally aiming for is the removal
> of any knowledge specific to stats kinds in the central pgstats APIs,
> which is why the flags specific to IO, SLRU and WAL have been moved
> out into their own files.

Yes, with my idea we'd need to move them back. 

> Letting custom stats manipulate pgstat_report_fixed or invent a new
> pgstat_report_custom_fixed would be logically the same thing, but 
> this comes down to the fact that we still want to decide if a report
> should be triggered if any of the fixed-numbered stats want to let
> pgstat_report_stat() do one round of pending stats flush.

Yes.

> Having a second flag would keep this abstraction level intact.

Not a second, just one global "pgstat_report_custom_fixed" and the specifics
flags for IO, SLRU, something like:

if (dlist_is_empty(&pgStatPending) &&
    !have_iostats &&
    !have_slrustats &&
    !pgstat_report_custom_fixed &&
    !pgstat_have_pending_wal())


> Now
> that leads to overcomplicating the API

Not sure.

> for a small gain in
> debuggability to know which part of the subsystem wants the report to 
> happen at early stages of pgstat_report_stat().  If we want to know
> exactly which stats kind wants the flush to happen, it may be better
> to reuse your idea of one single uint32 or uint64 with one bit
> allocated for each stats kind to have this information available in
> pgstat_report_fixed().  However, we already know that with the
> flush_static_cb callback, as dedicated paths can be taken for each
> stats kinds.  Once again, this would require stats kind to set their
> bit to force a round of reports, gaining more information before we
> try to call each flush_static_cb.

Yeah. I'm fine with one single flag as you proposed, I just have the feeling
it's more error prone.

As an example:

@@ -1091,6 +1092,9 @@ XLogInsertRecord(XLogRecData *rdata,
                pgWalUsage.wal_bytes += rechdr->xl_tot_len;
                pgWalUsage.wal_records++;
                pgWalUsage.wal_fpi += num_fpi;
+
+               /* Required for the flush of pending stats WAL data */
+               pgstat_report_fixed = true;
        }

        return EndPos;
@@ -2108,6 +2112,12 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic)
                                        LWLockRelease(WALWriteLock);
                                        pgWalUsage.wal_buffers_full++;
                                        TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
+
+                                       /*
+                                        * Required for the flush of pending stats WAL data, per
+                                        * update of pgWalUsage.
+                                        */
+                                       pgstat_report_fixed = true;
                                }

This was not needed before fc415edf8ca8, and I think it was better to use
pgstat_have_pending_wal() to not have to set a flag to every places pgWalUsage.XX
changes.

Having said that, I think the single global flag patch is pretty straightforward
and LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Custom pgstat support performance regression for simple queries

От
Michael Paquier
Дата:
On Fri, Jul 25, 2025 at 08:57:29AM +0000, Bertrand Drouvot wrote:
> This was not needed before fc415edf8ca8, and I think it was better to use
> pgstat_have_pending_wal() to not have to set a flag to every places pgWalUsage.XX
> changes.

At the end, this cost makes the separation of the code layers a bit
better, by having pgstat.c not know about the specifics around the
fixed-numbered stats, so..

> Having said that, I think the single global flag patch is pretty straightforward
> and LGTM.

I have used that and applied it down to v18, closing the open item.
--
Michael

Вложения

Re: Custom pgstat support performance regression for simple queries

От
Andres Freund
Дата:
On 2025-07-28 08:18:01 +0900, Michael Paquier wrote:
> I have used that and applied it down to v18, closing the open item.

Thanks!