Обсуждение: Adding locks statistics

Поиск
Список
Период
Сортировка

Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi hackers,

Please find attached a patch to add a new view (namely pg_stat_lock) that provides
lock statistics.  

It’s output is like the following:

postgres=# select * from pg_stat_lock;
     locktype     | requests | waits | timeouts | deadlock_timeouts | deadlocks | fastpath |          stats_reset

------------------+----------+-------+----------+-------------------+-----------+----------+-------------------------------
 relation         |   612775 |     1 |        0 |                 0 |         0 |   531115 | 2025-08-01
09:18:26.476275+00
 extend           |     3128 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 frozenid         |       11 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 page             |        1 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 tuple            |     3613 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 transactionid    |     6130 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 virtualxid       |    15390 |     0 |        0 |                 0 |         0 |    15390 | 2025-08-01
09:18:26.476275+00
 spectoken        |       12 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 object           |     8393 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 userlock         |        0 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 advisory         |       44 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00
 applytransaction |        0 |     0 |        0 |                 0 |         0 |        0 | 2025-08-01
09:18:26.476275+00

It means that it provides historical trends of locks usage per lock type.

It can be used for example for:

1. checking if "waits" is close to "requests". Then it means you usually have to
wait before acquiring the lock, which means you may have a concurrency issue.

2. lock_timeout and deadlock_timeout tuning (lock_timeout is visible only in the
logs if log_min_error_statement is set appropriately).

3. checking the "requests"/"fastpath" ratio to see if "max_locks_per_transaction"
needs tuning (see c4d5cb71d2).

If any points need more details, it might be a good idea to start sampling pg_locks.

The patch is made of 2 sub-patches:

0001 - It adds a new stat kind PGSTAT_KIND_LOCK for the lock statistics.

This new statistic kind is a fixed one because its key is the lock type
so that we know its size is LOCKTAG_LAST_TYPE + 1.

This statistic kind records the following counters:

- requests: Number of requests for this lock type.
- waits: Number of times requests for this lock type had to wait.
- timeouts: Number of times requests for this lock type had to wait longer than
lock_timeout.
- deadlock_timeouts: Number of times requests for this lock type had to wait longer
than deadlock_timeout.
- deadlocks: Number of times a deadlock occurred on this lock type.
- fastpath: Number of times this lock type was taken via fast path.

No extra details is added (like the ones, i.e relation oid, database oid, we
can find in pg_locks). The idea is to provide an idea on what the locking
behaviour looks like.

Those new counters are incremented outside of the wait events code path,
as suggested in [1].

There are no major design choices, it relies on the current statistics machinery.

0002 - It adds the pg_stat_lock view

It also adds documentation and some tests.

Remarks:

- maybe we could add some metrics related to the lock duration (we have some hints
thanks to the timeout ounters though)
- if this is merged, a next step could be to record those metrics per backend

[1]:
https://www.postgresql.org/message-id/CA%2BTgmobptuUWo7X5zcQrWKh22qeAn4eL%2B%3Dwtb8_ajCOR%2B7_tcw%40mail.gmail.com

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Adding locks statistics

От
Greg Sabino Mullane
Дата:
Great idea. +1. Here is a quick overall review to get things started.

Meta:
patch did not apply via "git apply". Also has carriage returns (e.g. DOS file), and some errant whitespace. Seems to pass pgindent, though.

Name:
I think the name would read better as pg_stat_locks, especially as it returns multiple rows.

Docs: seem good. Needs a section on how to reset via
SELECT pg_stat_reset_shared('lock');
Also plural better here ('locks')

Code:

+ * Copyright (c) 2021-2025, PostgreSQL Global Development Group

If a new file, we can simply say 2025?

+ LWLock locks[LOCKTAG_LAST_TYPE + 1];
+ PgStat_Lock stats;

Adding a lock to the system that counts locks! :) (just found amusing, not a comment)

-#define PGSTAT_KIND_SLRU 11
-#define PGSTAT_KIND_WAL 12
+#define PGSTAT_KIND_LOCK 11
+#define PGSTAT_KIND_SLRU 12
+#define PGSTAT_KIND_WAL 13

Why not just add LOCK as #13?

What about the overhead of maintaining this system? I know it's overall very lightweight, but from my testing, the relation locktype in particular is very, very busy. Busier than I realized until I saw it in this useful view!

Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: Adding locks statistics

От
Jeff Davis
Дата:
On Mon, 2025-08-11 at 13:54 -0400, Greg Sabino Mullane wrote:
> Great idea. +1. Here is a quick overall review to get things started.

Can you describe your use case? I'd like to understand whether this is
useful for users, hackers, or both.

Regards,
    Jeff Davis




Re: Adding locks statistics

От
Michael Paquier
Дата:
On Mon, Aug 11, 2025 at 02:53:58PM -0700, Jeff Davis wrote:
> On Mon, 2025-08-11 at 13:54 -0400, Greg Sabino Mullane wrote:
> > Great idea. +1. Here is a quick overall review to get things started.
>
> Can you describe your use case? I'd like to understand whether this is
> useful for users, hackers, or both.

This is a DBA feature, so the questions I'd ask myself are basically:
- Is there any decision-making where these numbers would help?  These
decisions would shape in tweaking the configuration of the server or
the application to as we move from a "bad" number trend to a "good"
number trend.
- What would be good numbers?  In this case, most likely a threshold
reached over a certain period of time.
- Would these new stats overlap with similar statistics gathered in
the system, creating duplication and bloat in the pgstats for no real
gain?

Looking at the patch, the data gathered is this, and I don't think
that we have metrics gathered in the system to get an idea of the
contention in this area, for timeouts and deadlocks:
+    PgStat_Counter requests;
+    PgStat_Counter waits;
+    PgStat_Counter timeouts;
+    PgStat_Counter deadlock_timeouts;
+    PgStat_Counter deadlocks;
+    PgStat_Counter fastpath;

Isn't the "deadlock_timeout" one an overlap between timeout and
deadlock?  Having three counters to track two concepts seems strange
to me.

Regarding the fastpath locking, you'd want to optimize the fastpath to
be close to the number of requests.  If you had these numbers, one
thing that I could see a DBA do is tune max_locks_per_transaction,
which is what influences the per-backend limit of fast-path locks,
with FastPathLockGroupsPerBackend.

Using static counters for the pending data is going to be necessary
when these are called in critical sections, which is I guess why
you've done it this way, right?
--
Michael

Вложения

Re: Adding locks statistics

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, Aug 11, 2025 at 02:53:58PM -0700, Jeff Davis wrote:
>> Can you describe your use case? I'd like to understand whether this is
>> useful for users, hackers, or both.

> This is a DBA feature, so the questions I'd ask myself are basically:
> - Is there any decision-making where these numbers would help?  These
> decisions would shape in tweaking the configuration of the server or
> the application to as we move from a "bad" number trend to a "good"
> number trend.
> - What would be good numbers?  In this case, most likely a threshold
> reached over a certain period of time.
> - Would these new stats overlap with similar statistics gathered in
> the system, creating duplication and bloat in the pgstats for no real
> gain?

I'm also wondering why slicing the numbers in this particular way
(i.e., aggregating by locktype) is a helpful way to look at the data.
Maybe it's just what you want, but that's not obvious to me.

            regards, tom lane



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Aug 11, 2025 at 01:54:38PM -0400, Greg Sabino Mullane wrote:
> Great idea. +1. Here is a quick overall review to get things started.

Thanks for looking at it!

> Meta:
> patch did not apply via "git apply". Also has carriage returns (e.g. DOS
> file), and some errant whitespace.

Interesting, I'm not seeing those issues on my side when applying with "git am".

> Name:
> I think the name would read better as pg_stat_locks, especially as it
> returns multiple rows.

I considered pg_stat_locks, but chose the singular form to be consistent with
pg_stat_database, pg_stat_subscription, and friends.

> Docs: seem good. Needs a section on how to reset via
> SELECT pg_stat_reset_shared('lock');

That's already included in v1:

"
@@ -5055,6 +5200,12 @@ description | Waiting for a newly initialized WAL file to reach durable storage
           <structname>pg_stat_io</structname> view.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>lock</literal>: Reset all the counters shown in the
+          <structname>pg_stat_lock</structname> view.
+         </para>
+        </listitem>
"

Do you think that this needs any additions or changes?

> Also plural better here ('locks')

I think having pg_stat_<XXX> and XXX being the same option in " pg_stat_reset_shared()"
makes sense. It's how it has been done so far, that's why I went with 'lock'.

> Code:
> 
> + * Copyright (c) 2021-2025, PostgreSQL Global Development Group
> 
> If a new file, we can simply say 2025?

I'm not sure that matters that much, see [1] and we can look at some files added
in 2025 for examples:

src/backend/storage/aio/aio_io.c
src/backend/access/nbtree/nbtpreprocesskeys.c

where 2025 is not "alone".

> + LWLock locks[LOCKTAG_LAST_TYPE + 1];
> + PgStat_Lock stats;
> 
> Adding a lock to the system that counts locks! :) (just found amusing, not
> a comment)

;-) 

> -#define PGSTAT_KIND_SLRU 11
> -#define PGSTAT_KIND_WAL 12
> +#define PGSTAT_KIND_LOCK 11
> +#define PGSTAT_KIND_SLRU 12
> +#define PGSTAT_KIND_WAL 13
> 
> Why not just add LOCK as #13?

Because it looks like that they are ordered by alphabetical order.

> What about the overhead of maintaining this system? I know it's overall
> very lightweight, but from my testing, the relation locktype in particular
> is very, very busy.

The counter increments do call a function generated that way:

"
#define PGSTAT_COUNT_LOCK_FUNC(stat)                    \
void                                                    \
CppConcat(pgstat_count_lock_,stat)(uint8 locktag_type)  \
{                                                       \
    Assert(locktag_type <= LOCKTAG_LAST_TYPE);          \
    PendingLockStats.stats[locktag_type].stat++;        \
    have_lockstats = true;                              \
    pgstat_report_fixed = true;                         \
}
"

So, it's pretty lightweight as you said (given that PendingLockStats is
not shared and just local to the backend that increments the counter).

There could be some contention when the pending stats are flushed but that's
the same for all the stats kinds.

We could do better, though, and avoid the function calls by creating macros instead
of functions. That would mean PendingLockStats to be visible to the outside
world but:

- that's not the direction that has been taken recently (for example, see the
"kept here to avoid exposing PendingBackendStats to the outside world" comment in 
pgstat_backend.c).

- I'm not sure that's worth for this particular case and code paths.

That said, if you (and/or others) have strong concerns about performance penalties,
I could study this area more in depth.

> Busier than I realized until I saw it in this useful
> view!

Thanks! Did you observe some noticeable performance penalties?

[1]: https://www.postgresql.org/message-id/202501211750.xf5j6thdkppy%40alvherre.pgsql 

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Tue, Aug 12, 2025 at 08:44:58AM +0900, Michael Paquier wrote:
> On Mon, Aug 11, 2025 at 02:53:58PM -0700, Jeff Davis wrote:
> > On Mon, 2025-08-11 at 13:54 -0400, Greg Sabino Mullane wrote:
> > > Great idea. +1. Here is a quick overall review to get things started.
> > 
> > Can you describe your use case? I'd like to understand whether this is
> > useful for users, hackers, or both.

Thanks for looking at it!

I provided a few examples in the initial email ([1]):

"
It can be used for example for:

1. checking if "waits" is close to "requests". Then it means you usually have to
wait before acquiring the lock, which means you may have a concurrency issue.

2. lock_timeout and deadlock_timeout tuning (lock_timeout is visible only in the
logs if log_min_error_statement is set appropriately).

3. checking the "requests"/"fastpath" ratio to see if "max_locks_per_transaction"
needs tuning (see c4d5cb71d2).
"

Do these seem like useful use cases?

> - Is there any decision-making where these numbers would help?  These
> decisions would shape in tweaking the configuration of the server or
> the application to as we move from a "bad" number trend to a "good"
> number trend.

Right, I think that could help for lock_timeout and deadlock_timeout tuning.

> - What would be good numbers?  In this case, most likely a threshold
> reached over a certain period of time.

Also I think it's more a matter of ratio: waits/requests and requests/fastpath
for example.

> - Would these new stats overlap with similar statistics gathered in
> the system, creating duplication and bloat in the pgstats for no real
> gain?

I don't think there is currently stats that overlap with those. 

> Looking at the patch, the data gathered is this, and I don't think
> that we have metrics gathered in the system to get an idea of the
> contention in this area, for timeouts and deadlocks:
> +    PgStat_Counter requests;
> +    PgStat_Counter waits;
> +    PgStat_Counter timeouts;
> +    PgStat_Counter deadlock_timeouts;
> +    PgStat_Counter deadlocks;
> +    PgStat_Counter fastpath;
> 
> Isn't the "deadlock_timeout" one an overlap between timeout and
> deadlock? 

I don't think so because:

- deadlock_timeout and lock_timeout are 2 distincts GUCs
- you could reach the deadlock_timeout without actually producing a deadlock

> Regarding the fastpath locking, you'd want to optimize the fastpath to
> be close to the number of requests.  If you had these numbers, one
> thing that I could see a DBA do is tune max_locks_per_transaction,
> which is what influences the per-backend limit of fast-path locks,
> with FastPathLockGroupsPerBackend.

Exactly.

> Using static counters for the pending data is going to be necessary
> when these are called in critical sections, which is I guess why
> you've done it this way, right?

Yes and there is no need to over complicate this stuff as the max size is
known: LOCKTAG_LAST_TYPE + 1.

[1]: https://www.postgresql.org/message-id/aIyNxBWFCybgBZBS%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Aug 11, 2025 at 07:49:45PM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Mon, Aug 11, 2025 at 02:53:58PM -0700, Jeff Davis wrote:
> >> Can you describe your use case? I'd like to understand whether this is
> >> useful for users, hackers, or both.
> 
> > This is a DBA feature, so the questions I'd ask myself are basically:
> > - Is there any decision-making where these numbers would help?  These
> > decisions would shape in tweaking the configuration of the server or
> > the application to as we move from a "bad" number trend to a "good"
> > number trend.
> > - What would be good numbers?  In this case, most likely a threshold
> > reached over a certain period of time.
> > - Would these new stats overlap with similar statistics gathered in
> > the system, creating duplication and bloat in the pgstats for no real
> > gain?
> 
> I'm also wondering why slicing the numbers in this particular way
> (i.e., aggregating by locktype) is a helpful way to look at the data.
> Maybe it's just what you want, but that's not obvious to me.

Thanks for providing your thoughts!

I thought it was more natural to aggregate by locktype because:

- I think that matches how they are categorized in the doc (from a "wait event"
point of view i.e "Wait Events of Type Lock"). 

- It provides a natural drill-down path, spot issues by locktype in the stats and 
then query pg_locks for specific objects when needed.

Does that make sense to you?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Greg Sabino Mullane
Дата:
On Tue, Aug 12, 2025 at 5:32 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
I considered pg_stat_locks, but chose the singular form to be consistent with
pg_stat_database, pg_stat_subscription, and friends.

Counter-examples: pg_stat_statements, pg_stat_subscription_stats. Our names are not consistent. :) It just feels a better fit to have a plural name for a table tracking aggregates of multiple types of objects.

(Ignore my git complaint, was obviously half-asleep when I wrote that).

> Docs: seem good. Needs a section on how to reset via
> SELECT pg_stat_reset_shared('lock');

I meant something closer to the actual description of the view.  Having it buried in the pg_stat_reset_shared section is not intuitive for people looking up the view in the docs.
 
Because it looks like that they are ordered by alphabetical order.

That makes sense.
 
- I'm not sure that's worth for this particular case and code paths.

Will let others opine on that.

Thanks! Did you observe some noticeable performance penalties?

No, but I did not give it any particularly heavy testing. More of a idle thought when pulling up a brand new system and seeing the thousands of locks for the tiniest bit of database action.
 
Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: Adding locks statistics

От
Jeff Davis
Дата:
On Tue, 2025-08-12 at 09:37 +0000, Bertrand Drouvot wrote:
> It can be used for example for:
>
> 1. checking if "waits" is close to "requests". Then it means you
> usually have to
> wait before acquiring the lock, which means you may have a
> concurrency issue.
>
> 2. lock_timeout and deadlock_timeout tuning (lock_timeout is visible
> only in the
> logs if log_min_error_statement is set appropriately).
>
> 3. checking the "requests"/"fastpath" ratio to see if
> "max_locks_per_transaction"
> needs tuning (see c4d5cb71d2).
> "
>
> Do these seem like useful use cases?

Those seem plausibly useful, but I don't recall needing that exact
information myself, and I wanted to hear more from others.

For instance, a view could be helpful to diagnose concurrency issues,
but I think that's worth discussing in more detail to see what kinds of
issues it can help with and how it complements other approaches. I
suspect when we get into the details, different people (or different
situations) would want slightly different information out of that view.

Regards,
    Jeff Davis




Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Tue, Aug 12, 2025 at 08:54:07AM -0400, Greg Sabino Mullane wrote:
> On Tue, Aug 12, 2025 at 5:32 AM Bertrand Drouvot <
> bertranddrouvot.pg@gmail.com> wrote:
> 
> > Docs: seem good. Needs a section on how to reset via
> > > SELECT pg_stat_reset_shared('lock');
> >
> 
> I meant something closer to the actual description of the view.  Having it
> buried in the pg_stat_reset_shared section is not intuitive for people
> looking up the view in the docs.

I agree that we could add this extra information in the view documentation.

But none of pg_stat_archiver, pg_stat_bgwriter, pg_stat_checkpointer, pg_stat_io,
pg_stat_slru and pg_stat_wal have done so. The only exception is pg_stat_recovery_prefetch
and for a good reason as not all of its fields are reset. 

While I think that's probably a good idea, I think it's worth a dedicated
thread (so that the discussion takes into account the other views mentioned
above).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Tue, Aug 12, 2025 at 08:42:48AM -0700, Jeff Davis wrote:
> On Tue, 2025-08-12 at 09:37 +0000, Bertrand Drouvot wrote:
> > It can be used for example for:
> > 
> > 1. checking if "waits" is close to "requests". Then it means you
> > usually have to
> > wait before acquiring the lock, which means you may have a
> > concurrency issue.
> > 
> > 2. lock_timeout and deadlock_timeout tuning (lock_timeout is visible
> > only in the
> > logs if log_min_error_statement is set appropriately).
> > 
> > 3. checking the "requests"/"fastpath" ratio to see if
> > "max_locks_per_transaction"
> > needs tuning (see c4d5cb71d2).
> > "
> > 
> > Do these seem like useful use cases?
> 
> Those seem plausibly useful,

Thanks for sharing your thoughts about the use cases above.

> and I wanted to hear more from others.

> For instance, a view could be helpful to diagnose concurrency issues,
> but I think that's worth discussing in more detail to see what kinds of
> issues it can help with and how it complements other approaches. I
> suspect when we get into the details, different people (or different
> situations) would want slightly different information out of that view.

Fully agree, getting input from others definitely helps validate the approach
and see if the current design needs some changes.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Wed, Aug 13, 2025 at 03:55:41AM +0000, Bertrand Drouvot wrote:
> Fully agree, getting input from others definitely helps validate the approach
> and see if the current design needs some changes.

PFA a mandatory rebase.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Dec 15, 2025 at 04:48:10PM +0000, Bertrand Drouvot wrote:
> Hi,
> 
> On Wed, Aug 13, 2025 at 03:55:41AM +0000, Bertrand Drouvot wrote:
> > Fully agree, getting input from others definitely helps validate the approach
> > and see if the current design needs some changes.
> 
> PFA a mandatory rebase.

New rebase due to 73d60ac385a.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Adding locks statistics

От
Michael Paquier
Дата:
On Tue, Feb 10, 2026 at 07:30:50AM +0000, Bertrand Drouvot wrote:
> New rebase due to 73d60ac385a.

I have been looking at this patch, and can get behind the data
gathered here in terms of being able to tune things, but not all.  See
below for the details of my reasoning.

max_locks_per_xact is a PGC_POSTMASTER, so backend-level stats with
this data would not be relevant for its tuning.  However, something
else can be said about deadlock_timeout, where one could rely on the
data gathered by this view to set it on a backend basis, particularly
if the load pattern is divided into a subsets of connections (say few
backend see a lot of the deadlock_timeout, for example).  Same
argument for lock_timeout, which is user-settable.

As the set of data gathered, I think that I'm OK with timeouts (for
lock_timeout), deadlock_timeouts (for deadlock_timeout), fastpath (for
max_locks_per_xact), that can all be compared with the number of
requests.

Regarding "deadlocks" and "waits", these two are less useful than the
three others because not really actionable.  They would become much
more relevant if and only if we know the distribution of the deadlocks
not only for the lock types, but for the objects involved, especially
if the activity is diluted across many objects.  So these have less
value IMO because they are not really actionable in the system.  The
other three can be directly tuned based on the GUCs we have.

So my suggestion for the moment would be to be more frugal (yeah I
know, sorry..) and limit ourselves to four fields: deadlock_timeout,
requests, fastpath and timeouts.  Three fields to compare with
requests, one for each GUC.

Regarding the implementation, you are right to use a fixed-sized stats
kind for the job.  I can see a lot of code has been copy-pasted from
pgstat_io.c, then slightly adjusted to fit into the picture.  That's
fine here, it makes the implementation straight-forward to read.

Regarding the documentation, listing all the values for locktype is a
recipe for rot.  I'd suggest to remove the list instead, with only a
link referring to pg_locks to avoid the duplication.
--
Michael

Вложения

Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Feb 13, 2026 at 04:36:57PM +0900, Michael Paquier wrote:
> On Tue, Feb 10, 2026 at 07:30:50AM +0000, Bertrand Drouvot wrote:
> > New rebase due to 73d60ac385a.
> 
> I have been looking at this patch, and can get behind the data
> gathered here in terms of being able to tune things

Thanks!

> 
> So my suggestion for the moment would be to be more frugal (yeah I
> know, sorry..) and limit ourselves to four fields: deadlock_timeout,
> requests, fastpath and timeouts.  Three fields to compare with
> requests, one for each GUC.

That's fine by me. We could still add the others in the future if we feel the
need. Done that way in the attached.

> Regarding the implementation, you are right to use a fixed-sized stats
> kind for the job.  I can see a lot of code has been copy-pasted from
> pgstat_io.c, then slightly adjusted to fit into the picture.  That's
> fine here, it makes the implementation straight-forward to read.

Yeah, no need to reinvent the wheel.

> Regarding the documentation, listing all the values for locktype is a
> recipe for rot.  I'd suggest to remove the list instead, with only a
> link referring to pg_locks to avoid the duplication.

Makes sense, done that way. Makes me think that we could do the same for pg_locks
and just link it to the Wait Events of Type Lock? (Table 27.11.)

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Adding locks statistics

От
Andres Freund
Дата:
Hi,

On 2026-02-13 10:24:52 +0000, Bertrand Drouvot wrote:
> On Fri, Feb 13, 2026 at 04:36:57PM +0900, Michael Paquier wrote:
> > On Tue, Feb 10, 2026 at 07:30:50AM +0000, Bertrand Drouvot wrote:
> > > New rebase due to 73d60ac385a.
> > So my suggestion for the moment would be to be more frugal (yeah I
> > know, sorry..) and limit ourselves to four fields: deadlock_timeout,
> > requests, fastpath and timeouts.  Three fields to compare with
> > requests, one for each GUC.
> 
> That's fine by me. We could still add the others in the future if we feel the
> need. Done that way in the attached.

I'm not sure that it's unproblematic to add multiple pgstat count calls to
every lock acquisition, particularly if it's a fastpath acquisition or a
virtualxid lock. Notably these are external function calls, not just
increments of a counter in an inline function.

I also don't really know what one would do with some of the information?

What does the number of virtualxid lock acquisitions tell you that the numbers
of transactions doesn't already tell you in a more understandable way?

What does it tell you that the deadlock checker ran N times? It notably
doesn't count deadlocks, it counts how often we checked for deadlocks.

The percentage of fastpath locks also seems not really informative, because
that could be because we ran out of space for fastpath locks, or because a
lock mode that's ineligible for fastpath locks was used.


What I would actually count is the amount of time waiting for locks, that
seems vastly more useful than the number of acquisitions.  We already do a
GetCurrentTimestamp() inside the timer activations for deadlock timeout, we
probably can figure out a way to reuse that to reduce the increase in overhead
due to timing.  We could also just count the wait time after the deadlock
check has run.


> @@ -17,6 +17,7 @@
>  #include "postmaster/pgarch.h"    /* for MAX_XFN_CHARS */
>  #include "replication/conflict.h"
>  #include "replication/worker_internal.h"
> +#include "storage/lock.h"
>  #include "utils/backend_progress.h" /* for backward compatibility */    /* IWYU pragma: export */
>  #include "utils/backend_status.h"    /* for backward compatibility */    /* IWYU pragma: export */
>  #include "utils/pgstat_kind.h"
> @@ -342,6 +343,25 @@ typedef struct PgStat_IO
>      PgStat_BktypeIO stats[BACKEND_NUM_TYPES];
>  } PgStat_IO;

I don't like the amount of headers this addition will indirectly include in a
lot of places.

I'm also pretty unhappy about an include of access/transam.h recently having
been added. And worker_internal.h quite obviously, given the name, has no
business being included here, and also includes a lot more. Grrr.  I'll start
a thread.

Greetings,

Andres Freund



Re: Adding locks statistics

От
Michael Paquier
Дата:
On Fri, Feb 13, 2026 at 04:13:39PM -0500, Andres Freund wrote:
> I'm not sure that it's unproblematic to add multiple pgstat count calls to
> every lock acquisition, particularly if it's a fastpath acquisition or a
> virtualxid lock. Notably these are external function calls, not just
> increments of a counter in an inline function.

Right.  I am not completely sure if this is really free, either,
particularly for a fastpath lock that we want to be..  Faster.

> What I would actually count is the amount of time waiting for locks, that
> seems vastly more useful than the number of acquisitions.  We already do a
> GetCurrentTimestamp() inside the timer activations for deadlock timeout, we
> probably can figure out a way to reuse that to reduce the increase in overhead
> due to timing.  We could also just count the wait time after the deadlock
> check has run.

That sounds like an interesting suggestion, yes.  That's not going to
be bounded by performance if we know that we are already waiting.
--
Michael

Вложения

Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Feb 13, 2026 at 04:13:39PM -0500, Andres Freund wrote:
> Hi,
> 
> On 2026-02-13 10:24:52 +0000, Bertrand Drouvot wrote:
> > 
> > That's fine by me. We could still add the others in the future if we feel the
> > need. Done that way in the attached.
> 
> I'm not sure that it's unproblematic to add multiple pgstat count calls to
> every lock acquisition, particularly if it's a fastpath acquisition or a
> virtualxid lock. Notably these are external function calls, not just
> increments of a counter in an inline function.

Yeah, we could create inline functions instead.

> I also don't really know what one would do with some of the information?
> 
> What does the number of virtualxid lock acquisitions tell you that the numbers
> of transactions doesn't already tell you in a more understandable way?

I agree not that much, except being able to compute a transactions/virtualxid
ratio or such. Also the idea was to report the same as pg_locks so that one
could start sampling pg_locks if he needs more details.

> What does it tell you that the deadlock checker ran N times? It notably
> doesn't count deadlocks, it counts how often we checked for deadlocks.
> 
> The percentage of fastpath locks also seems not really informative, because
> that could be because we ran out of space for fastpath locks, or because a
> lock mode that's ineligible for fastpath locks was used.

Right, maybe we could add an "exceed fastpath slots" or such counter?

> What I would actually count is the amount of time waiting for locks, that
> seems vastly more useful than the number of acquisitions.  We already do a
> GetCurrentTimestamp() inside the timer activations for deadlock timeout, we
> probably can figure out a way to reuse that to reduce the increase in overhead
> due to timing.  We could also just count the wait time after the deadlock
> check has run.

Yeah, providing the wait_time would be great. Just to be sure, are you suggesting
to remove all the fields (i.e requests, timeouts, deadlock_timeouts and fastpath)
and just add a wait_time field instead? I think that keeping requests would make
sense to be able to get the average wait time per request.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Adding locks statistics

От
Andres Freund
Дата:
Hi,

On 2026-02-16 10:10:21 +0000, Bertrand Drouvot wrote:
> On Fri, Feb 13, 2026 at 04:13:39PM -0500, Andres Freund wrote:
> > On 2026-02-13 10:24:52 +0000, Bertrand Drouvot wrote:
> > > 
> > > That's fine by me. We could still add the others in the future if we feel the
> > > need. Done that way in the attached.
> > 
> > I'm not sure that it's unproblematic to add multiple pgstat count calls to
> > every lock acquisition, particularly if it's a fastpath acquisition or a
> > virtualxid lock. Notably these are external function calls, not just
> > increments of a counter in an inline function.
> 
> Yeah, we could create inline functions instead.

I think unless you have a heck of a lot more of a usecase than "it's
information", counting every lock acqusition seems just worth ti.


> > I also don't really know what one would do with some of the information?
> > 
> > What does the number of virtualxid lock acquisitions tell you that the numbers
> > of transactions doesn't already tell you in a more understandable way?
> 
> I agree not that much, except being able to compute a transactions/virtualxid
> ratio or such. Also the idea was to report the same as pg_locks so that one
> could start sampling pg_locks if he needs more details.

I don't think that's worth adding overhead to something extremely frequently
happening.


> > What does it tell you that the deadlock checker ran N times? It notably
> > doesn't count deadlocks, it counts how often we checked for deadlocks.
> > 
> > The percentage of fastpath locks also seems not really informative, because
> > that could be because we ran out of space for fastpath locks, or because a
> > lock mode that's ineligible for fastpath locks was used.
> 
> Right, maybe we could add an "exceed fastpath slots" or such counter?

That I could get behind.


> > What I would actually count is the amount of time waiting for locks, that
> > seems vastly more useful than the number of acquisitions.  We already do a
> > GetCurrentTimestamp() inside the timer activations for deadlock timeout, we
> > probably can figure out a way to reuse that to reduce the increase in overhead
> > due to timing.  We could also just count the wait time after the deadlock
> > check has run.
> 
> Yeah, providing the wait_time would be great. Just to be sure, are you suggesting
> to remove all the fields (i.e requests, timeouts, deadlock_timeouts and fastpath)
> and just add a wait_time field instead?

Well, I'd maybe make it waits, wait_time and perhaps fastpath exceeded.


> I think that keeping requests would make sense to be able to get the average
> wait time per request.

I don't think I'd request for that (as that would require counting in the
normal case), I'd use the number of waits.


I don't think you have provided any actual motivation for why it's worth the
expense to track the number of lock requests.  Just trackings stats that
nobody has a usecase for that are not free to collect makes no sense.

Greetings,

Andres Freund



Re: Adding locks statistics

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Feb 16, 2026 at 04:00:56PM -0500, Andres Freund wrote:
> Hi,
> 
> On 2026-02-16 10:10:21 +0000, Bertrand Drouvot wrote:
> > Yeah, providing the wait_time would be great. Just to be sure, are you suggesting
> > to remove all the fields (i.e requests, timeouts, deadlock_timeouts and fastpath)
> > and just add a wait_time field instead?
> 
> Well, I'd maybe make it waits, wait_time and perhaps fastpath exceeded.

Okay, done that way in the attached. To avoid overhead due to timing as much as
possible, the patch simply relies on log_lock_waits and deadlock_timeout. It means
that it relies on the existing code, and increments waits and wait_time only if
log_lock_waits is on and if the session waited longer than deadlock_timeout.

I did not want to dissociate the waits and wait_time increments so that their
ratio could still make sense.

That sounds like a good compromise, thoughts?

> > I think that keeping requests would make sense to be able to get the average
> > wait time per request.
> 
> I don't think I'd request for that (as that would require counting in the
> normal case), I'd use the number of waits.

Yeah, I meant to say waits and not requests.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения