Thread: Add memory_limit_hits to pg_stat_replication_slots

Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi hackers,

I think that it's currently not always possible to determine how many times
logical_decoding_work_mem has been reached.

For example, say a transaction is made of 40 subtransactions, and I get:

  slot_name   | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
 logical_slot |         40 |          41 |          1
(1 row)

Then I know that logical_decoding_work_mem has been reached one time (total_txns is 1).

But as soon as another transaction is decoded (that does not involve spilling):

  slot_name   | spill_txns | spill_count | total_txns
--------------+------------+-------------+------------
 logical_slot |         40 |          41 |          2
(1 row)

Then we don't know if logical_decoding_work_mem has been reached one or two 
times.

Please find attached a patch to $SUBJECT, to report the number of times the 
logical_decoding_work_mem has been reached.

With such a counter one could get a ratio like total_txns/memory_limit_hits.

That could help to see whether reaching logical_decoding_work_mem is rare or
frequent. If it is frequent, then maybe logical_decoding_work_mem needs to be
adjusted.
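
For example (just a sketch, assuming the column ends up being named
memory_limit_hits), one could run something like:

select slot_name,
       total_txns,
       memory_limit_hits,
       round(total_txns::numeric / nullif(memory_limit_hits, 0), 2) as txns_per_limit_hit
from pg_stat_replication_slots;

to get an idea of how many decoded transactions there are per limit hit.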

Based on my simple example above, one could say that it might be possible to get
the same with:

(spill_count - spill_txns) + (stream_count - stream_txns)

but that doesn't appear to be the case with a more complicated example (277 vs 247):

  slot_name   | spill_txns | spill_count | total_txns | stream_txns | stream_count | memory_limit_hits | (spc-spct)+(strc-strt)
--------------+------------+-------------+------------+-------------+--------------+-------------------+------------------------
 logical_slot |        405 |         552 |         19 |           5 |          105 |               277 |                    247
(1 row)
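
For the record, a query along these lines (just a sketch against the patched
view, assuming the new column is exposed as memory_limit_hits) is enough to
reproduce that comparison:

select slot_name,
       spill_txns, spill_count,
       total_txns,
       stream_txns, stream_count,
       memory_limit_hits,
       (spill_count - spill_txns) + (stream_count - stream_txns) as "(spc-spct)+(strc-strt)"
from pg_stat_replication_slots;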

Not sure I like memory_limit_hits that much, maybe work_mem_exceeded is better?

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments

Re: Add memory_limit_hits to pg_stat_replication_slots

From: Masahiko Sawada
Date:
On Wed, Aug 27, 2025 at 12:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi hackers,
>
> I think that it's currently not always possible to determine how many times
> logical_decoding_work_mem has been reached.
>
> For example, say a transaction is made of 40 subtransactions, and I get:
>
>   slot_name   | spill_txns | spill_count | total_txns
> --------------+------------+-------------+------------
>  logical_slot |         40 |          41 |          1
> (1 row)
>
> Then I know that logical_decoding_work_mem has been reached one time (total_txns).
>
> But as soon as another transaction is decoded (that does not involve spilling):
>
>   slot_name   | spill_txns | spill_count | total_txns
> --------------+------------+-------------+------------
>  logical_slot |         40 |          41 |          2
> (1 row)
>
> Then we don't know if logical_decoding_work_mem has been reached one or two
> times.
>
> Please find attached a patch to $SUBJECT, to report the number of times the
> logical_decoding_work_mem has been reached.
>
> With such a counter one could get a ratio like total_txns/memory_limit_hits.
>
> That could help to see if reaching logical_decoding_work_mem is rare or
> frequent enough. If frequent, then maybe there is a need to adjust
> logical_decoding_work_mem.
>
> Based on my simple example above, one could say that it might be possible to get
> the same with:
>
> (spill_count - spill_txns) + (stream_count - stream_txns)
>
> but that doesn't appear to be the case with a more complicated example (277 vs 247):
>
>   slot_name   | spill_txns | spill_count | total_txns | stream_txns | stream_count | memory_limit_hits | (spc-spct)+(strc-strt)
> --------------+------------+-------------+------------+-------------+--------------+-------------------+------------------------
>  logical_slot |        405 |         552 |         19 |           5 |          105 |               277 |                    247
> (1 row)
>
> Not sure I like memory_limit_hits that much, maybe work_mem_exceeded is better?
>
> Looking forward to your feedback,

Yes, these are quite different situations: spilling 100 transactions in one
ReorderBufferCheckMemoryLimit() call versus spilling 1 transaction in each of
100 ReorderBufferCheckMemoryLimit() calls, even though spill_txns is 100 in
both cases. And we don't have any statistics to distinguish between these
cases. I agree with adding this statistic.

One minor comment is:

@@ -1977,6 +1978,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
  repSlotStat.stream_bytes = rb->streamBytes;
  repSlotStat.total_txns = rb->totalTxns;
  repSlotStat.total_bytes = rb->totalBytes;
+ repSlotStat.memory_limit_hits = rb->memory_limit_hits;

Since the other statistics counter names are camel case, I think it's
better to follow that for the new counter.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi,

On Thu, Sep 11, 2025 at 03:24:54PM -0700, Masahiko Sawada wrote:
> On Wed, Aug 27, 2025 at 12:26 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Looking forward to your feedback,
> 
> Yes,

Thanks for looking at it!

> it's a quite different situation in two cases: spilling 100
> transactions in one ReorderBufferCheckMemoryLimit() call and spilling
> 1 transaction in each 100 ReorderBufferCheckMemoryLimit() calls, even
> though spill_txn is 100 in both cases. And we don't have any
> statistics to distinguish between these cases.

Right.

> One minor comment is:
> 
> @@ -1977,6 +1978,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
>   repSlotStat.stream_bytes = rb->streamBytes;
>   repSlotStat.total_txns = rb->totalTxns;
>   repSlotStat.total_bytes = rb->totalBytes;
> + repSlotStat.memory_limit_hits = rb->memory_limit_hits;
> 
> Since other statistics counter names are camel cases I think it's
> better to follow that for the new counter.

Makes sense, done with memoryLimitHits in v2 attached (that's the only change
as compared with v1).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments

Re: Add memory_limit_hits to pg_stat_replication_slots

From: Amit Kapila
Date:
On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> >
> > Since other statistics counter names are camel cases I think it's
> > better to follow that for the new counter.
>
> Makes sense, done with memoryLimitHits in v2 attached (that's the only change
> as compared with v1).
>

memory_limit_hits doesn't go well with the other names in the
view. Can we consider memory_exceeded_count? I find
memory_exceeded_count (or memory_exceeds_count) clearer and more
consistent with the existing counters. Also, how about keeping it
immediately after slot_name in the view? Keeping it at the end after
total_bytes seems out of place.

--
With Regards,
Amit Kapila.



Re: Add memory_limit_hits to pg_stat_replication_slots

From: shveta malik
Date:
On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > >
> > > Since other statistics counter names are camel cases I think it's
> > > better to follow that for the new counter.
> >
> > Makes sense, done with memoryLimitHits in v2 attached (that's the only change
> > as compared with v1).
> >
>
> The memory_limit_hits doesn't go well with the other names in the
> view. Can we consider memory_exceeded_count? I find
> memory_exceeded_count (or memory_exceeds_count) more clear and
> matching with the existing counters. Also, how about keeping it
> immediately after slot_name in the view? Keeping it in the end after
> total_bytes seems out of place.
>

Since fields like spill_txns, spill_bytes, and stream_txns also talk
about exceeding 'logical_decoding_work_mem', my preference would be to
place this new field immediately after these spill and stream fields
(and before total_bytes). If not this, then as Amit suggested,
immediately before all these fields.
Other options for the name could be 'mem_limit_exceeded_count' or
'mem_limit_hit_count'.

thanks
Shveta



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
On Mon, Sep 22, 2025 at 05:21:35PM +0530, shveta malik wrote:
> On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > >
> > > > Since other statistics counter names are camel cases I think it's
> > > > better to follow that for the new counter.
> > >
> > > Makes sense, done with memoryLimitHits in v2 attached (that's the only change
> > > as compared with v1).
> > >
> >
> > The memory_limit_hits doesn't go well with the other names in the
> > view. Can we consider memory_exceeded_count? I find
> > memory_exceeded_count (or memory_exceeds_count) more clear and
> > matching with the existing counters. Also, how about keeping it
> > immediately after slot_name in the view? Keeping it in the end after
> > total_bytes seems out of place.
> >
> 
> Since fields like spill_txns, spill_bytes, and stream_txns also talk
> about exceeding 'logical_decoding_work_mem', my preference would be to
> place this new field immediately after these spill and stream fields
> (and before total_bytes). If not this, then as Amit suggested,
> immediately before all these fields.
> Other options for name could be 'mem_limit_exceeded_count' or
> 'mem_limit_hit_count'

Thank you, Shveta and Amit, for looking at it. Since we already use txns as an
abbreviation for transactions, I think it's ok to use "mem". So I'm using a mix
of your proposals, "mem_exceeded_count", in v3 attached. Regarding the field
position, I like Shveta's proposal and did it that way.
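
So, for illustration, the resulting view column order in v3 should look like
this (just a sketch, assuming the existing columns keep their current relative
order):

select slot_name,
       spill_txns, spill_count, spill_bytes,
       stream_txns, stream_count, stream_bytes,
       mem_exceeded_count,
       total_txns, total_bytes,
       stats_reset
from pg_stat_replication_slots;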

However, technically speaking, "exceeded" is not the perfect wording since
the code was doing (and is still doing something similar to):

        if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
-               rb->size < logical_decoding_work_mem * (Size) 1024)
+               !memory_limit_reached)
                return;

as the comment describes correctly using "reached":

/*
 * Check whether the logical_decoding_work_mem limit was reached, and if yes
 * pick the largest (sub)transaction at-a-time to evict and spill its changes to
 * disk or send to the output plugin until we reach under the memory limit.

So I think that "reached" or "hit" would be better wording. However, the
documentation for spill_txns and stream_txns already uses "exceeded" (and not
"reached"), so I went with "exceeded" for consistency. I think that's fine; if
not, we may want to use "reached" for those 3 stats descriptions.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments

Re: Add memory_limit_hits to pg_stat_replication_slots

From: Masahiko Sawada
Date:
On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Mon, Sep 22, 2025 at 05:21:35PM +0530, shveta malik wrote:
> > On Mon, Sep 22, 2025 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Sep 22, 2025 at 1:41 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > >
> > > > > Since other statistics counter names are camel cases I think it's
> > > > > better to follow that for the new counter.
> > > >
> > > > Makes sense, done with memoryLimitHits in v2 attached (that's the only change
> > > > as compared with v1).
> > > >
> > >
> > > The memory_limit_hits doesn't go well with the other names in the
> > > view. Can we consider memory_exceeded_count? I find
> > > memory_exceeded_count (or memory_exceeds_count) more clear and
> > > matching with the existing counters. Also, how about keeping it
> > > immediately after slot_name in the view? Keeping it in the end after
> > > total_bytes seems out of place.
> > >
> >
> > Since fields like spill_txns, spill_bytes, and stream_txns also talk
> > about exceeding 'logical_decoding_work_mem', my preference would be to
> > place this new field immediately after these spill and stream fields
> > (and before total_bytes). If not this, then as Amit suggested,
> > immediately before all these fields.
> > Other options for name could be 'mem_limit_exceeded_count' or
> > 'mem_limit_hit_count'
>
> Thank you, Shveta and Amit, for looking at it. Since we already use txns as
> abbreviation for transactions then I think it's ok to use "mem". Then I'm using
> a mix of your proposals with "mem_exceeded_count" in v3 attached. Regarding the
> field position, I like Shveta's proposal and did it that way.
>
> However, technically speaking, "exceeded" is not the perfect wording since
> the code was doing (and is still doing something similar to):
>
>         if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
> -               rb->size < logical_decoding_work_mem * (Size) 1024)
> +               !memory_limit_reached)
>                 return;
>
> as the comment describes correctly using "reached":
>
> /*
>  * Check whether the logical_decoding_work_mem limit was reached, and if yes
>  * pick the largest (sub)transaction at-a-time to evict and spill its changes to
>  * disk or send to the output plugin until we reach under the memory limit.
>
> So I think that "reached" or "hit" would be better wording. However, the
> documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
> so I went with "exceeded" for consistency. I think that's fine, if not we may want
> to use "reached" for those 3 stats descriptions.

I find "exceeded" is fine as the documentation for logical decoding
also uses it[1]:

Similar to spill-to-disk behavior, streaming is triggered when the
total amount of changes decoded from the WAL (for all in-progress
transactions) exceeds the limit defined by logical_decoding_work_mem
setting.

One comment for the v3 patch:

+       <para>
+        Number of times <literal>logical_decoding_work_mem</literal> has been
+        exceeded while decoding changes from WAL for this slot.
+       </para>

How about rewording it to something like:

Number of times the memory used by logical decoding has exceeded
logical_decoding_work_mem.

Regards,

[1] https://www.postgresql.org/docs/devel/logicaldecoding-streaming.html

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi,

On Tue, Sep 23, 2025 at 11:39:22AM -0700, Masahiko Sawada wrote:
> On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > However, technically speaking, "exceeded" is not the perfect wording since
> > the code was doing (and is still doing something similar to):
> >
> >         if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
> > -               rb->size < logical_decoding_work_mem * (Size) 1024)
> > +               !memory_limit_reached)
> >                 return;
> >
> > as the comment describes correctly using "reached":
> >
> > /*
> >  * Check whether the logical_decoding_work_mem limit was reached, and if yes
> >  * pick the largest (sub)transaction at-a-time to evict and spill its changes to
> >  * disk or send to the output plugin until we reach under the memory limit.
> >
> > So I think that "reached" or "hit" would be better wording. However, the
> > documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
> > so I went with "exceeded" for consistency. I think that's fine, if not we may want
> > to use "reached" for those 3 stats descriptions.
> 
> I find "exceeded" is fine as the documentation for logical decoding
> also uses it[1]:
> 
> Similar to spill-to-disk behavior, streaming is triggered when the
> total amount of changes decoded from the WAL (for all in-progress
> transactions) exceeds the limit defined by logical_decoding_work_mem
> setting.

Yes, it also uses "exceeds", but I think it's not 100% accurate. It would be
if, in ReorderBufferCheckMemoryLimit, we were using "<=" instead of "<" in:

  if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
       rb->size < logical_decoding_work_mem * (Size) 1024)

I think an accurate wording would be "reaches or exceeds" in all those places,
but just using "exceeds" looks good enough.

> One comment for the v3 patch:
> 
> +       <para>
> +        Number of times <literal>logical_decoding_work_mem</literal> has been
> +        exceeded while decoding changes from WAL for this slot.
> +       </para>
> 
> How about rewording it to like:
> 
> Number of times the memory used by logical decoding has exceeded
> logical_decoding_work_mem.

That sounds better, thanks! Used this wording in v4 attached (that's the only
change as compared to v3).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments

Re: Add memory_limit_hits to pg_stat_replication_slots

From: Masahiko Sawada
Date:
On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Sep 23, 2025 at 11:39:22AM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 23, 2025 at 1:52 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > However, technically speaking, "exceeded" is not the perfect wording since
> > > the code was doing (and is still doing something similar to):
> > >
> > >         if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
> > > -               rb->size < logical_decoding_work_mem * (Size) 1024)
> > > +               !memory_limit_reached)
> > >                 return;
> > >
> > > as the comment describes correctly using "reached":
> > >
> > > /*
> > >  * Check whether the logical_decoding_work_mem limit was reached, and if yes
> > >  * pick the largest (sub)transaction at-a-time to evict and spill its changes to
> > >  * disk or send to the output plugin until we reach under the memory limit.
> > >
> > > So I think that "reached" or "hit" would be better wording. However, the
> > > documentation for spill_txns, stream_txns already use "exceeded" (and not "reached")
> > > so I went with "exceeded" for consistency. I think that's fine, if not we may want
> > > to use "reached" for those 3 stats descriptions.
> >
> > I find "exceeded" is fine as the documentation for logical decoding
> > also uses it[1]:
> >
> > Similar to spill-to-disk behavior, streaming is triggered when the
> > total amount of changes decoded from the WAL (for all in-progress
> > transactions) exceeds the limit defined by logical_decoding_work_mem
> > setting.
>
> Yes it also uses "exceeds" but I think it's not 100% accurate. It would be
> if, in ReorderBufferCheckMemoryLimit, we were using "<=" instead of "<" in:
>
>   if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED &&
>        rb->size < logical_decoding_work_mem * (Size) 1024)
>
> I think an accurate wording would be "reaches or exceeds" in all those places,
> but just using "exceeds" looks good enough.
>
> > One comment for the v3 patch:
> >
> > +       <para>
> > +        Number of times <literal>logical_decoding_work_mem</literal> has been
> > +        exceeded while decoding changes from WAL for this slot.
> > +       </para>
> >
> > How about rewording it to like:
> >
> > Number of times the memory used by logical decoding has exceeded
> > logical_decoding_work_mem.
>
> That sounds better, thanks! Used this wording in v4 attached (that's the only
> change as compared to v3).
>

Thank you for updating the patch! Here are some comments:

---
+   bool        memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
+
+   if (memory_limit_reached)
+       rb->memExceededCount += 1;

Do we want to use 'exceeded' for the variable too for better consistency?

---
One thing I want to clarify is that even if the memory usage exceeds
logical_decoding_work_mem, it doesn't necessarily mean we serialize
or stream transactions, because of
ReorderBufferCheckAndTruncateAbortedTXN(). For example, in a situation
where many large already-aborted transactions are truncated by
ReorderBufferCheckAndTruncateAbortedTXN(), users would see
a high number in the mem_exceeded_count column but it might not actually
require any adjustment of logical_decoding_work_mem. One idea is to
increment that counter only if exceeding the memory limit actually causes
us to serialize or stream any transactions. On the other hand, it might
also make sense, and be more straightforward, to show a pure statistic
that the memory usage exceeded logical_decoding_work_mem. What do you
think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi,

On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> 
> Thank you for updating the patch! Here are some comments:
> 
> ---
> +   bool        memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
> +
> +   if (memory_limit_reached)
> +       rb->memExceededCount += 1;
> 
> Do we want to use 'exceeded' for the variable too for better consistency?

I thought about it, but since we use ">=" I think that "reached" is more
accurate. So I went for "reached" for this one and "exceeded" for the "user
facing" ones. That said, I don't have a strong opinion about it, and I'd be ok
to use "exceeded" if you feel strongly about it.

> ---
> One thing I want to clarify is that even if the memory usage exceeds
> the logical_decoding_work_mem it doesn't necessarily mean we serialize
> or stream transactions because of
> ReorderBufferCheckAndTruncateAbortedTXN().

Right.

> For example, in a situation
> where many large already-aborted transactions are truncated by
> ReorderBufferCheckAndTruncateAbortedTXN(), users would see
> a high number in mem_exceeded_count column but it might not actually
> require any adjustment for logical_decoding_work_mem.

Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
stream_txns, no?

> One idea is to
> increment that counter if exceeding memory usage is caused to
> serialize or stream any transactions. On the other hand, it might make
> sense and be straightforward too to show a pure statistic that the
> memory usage exceeded the logical_decoding_work_mem. What do you
> think?

The new counter, as it is proposed, helps to see if the workload hits the
logical_decoding_work_mem frequently or not. I think it's valuable information
to have on its own.

Now to check if logical_decoding_work_mem needs adjustment, one could compare
mem_exceeded_count with the existing spill_txns and stream_txns.

For example, if I abort 20 transactions that exceeded logical_decoding_work_mem,
I'd get:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
 spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
          0 |           0 |                 20
(1 row)

That way I could figure out that the memory limit has been reached only for
aborted transactions.

OTOH, if one sees spill_txns + stream_txns close to mem_exceeded_count, like:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
 spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
         38 |          20 |                 58
(1 row)

That probably means that logical_decoding_work_mem would need to be increased.
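
Just as a sketch, the kind of comparison I have in mind could be done with
something like:

select slot_name,
       mem_exceeded_count,
       spill_txns,
       stream_txns,
       spill_txns + stream_txns as spilled_or_streamed_txns
from pg_stat_replication_slots;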

What do you think?

BTW, while doing some tests for the above examples, I noticed that the patch
was missing a check on memExceededCount in UpdateDecodingStats() (that produced
mem_exceeded_count being 0 for one of the new tests in test_decoding): fixed in
v5 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments

Re: Add memory_limit_hits to pg_stat_replication_slots

From: Masahiko Sawada
Date:
On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Thank you for updating the patch! Here are some comments:
> >
> > ---
> > +   bool        memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
> > +
> > +   if (memory_limit_reached)
> > +       rb->memExceededCount += 1;
> >
> > Do we want to use 'exceeded' for the variable too for better consistency?
>
> I thought about it, but since we use ">=" I think that "reached" is more
> accurate. So I went for "reached" for this one and "exceeded" for "user facing"
> ones. That said I don't have a strong opinion about it, and I'd be ok to use
> "exceeded" if you feel strong about it.

Agreed with the current style. Thank you for the explanation.

>
> > ---
> > One thing I want to clarify is that even if the memory usage exceeds
> > the logical_decoding_work_mem it doesn't necessarily mean we serialize
> > or stream transactions because of
> > ReorderBufferCheckAndTruncateAbortedTXN().
>
> Right.
>
> > For example, in a situation
> > where many large already-aborted transactions are truncated by
> > ReorderBufferCheckAndTruncateAbortedTXN(), users would see
> > a high number in mem_exceeded_count column but it might not actually
> > require any adjustment for logical_decoding_work_mem.
>
> Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
> stream_txns, no?

Right. I think only mem_exceeded_count has a high number while
spill_txns and stream_txns have lower numbers in this case (as you
showed in your first example below).

>
> > One idea is to
> > increment that counter if exceeding memory usage is caused to
> > serialize or stream any transactions. On the other hand, it might make
> > sense and be straightforward too to show a pure statistic that the
> > memory usage exceeded the logical_decoding_work_mem. What do you
> > think?
>
> The new counter, as it is proposed, helps to see if the workload hits the
> logical_decoding_work_mem frequently or not. I think it's valuable information
> to have on its own.
>
> Now to check if logical_decoding_work_mem needs adjustment, one could compare
> mem_exceeded_count with the existing spill_txns and stream_txns.
>
> For example, If I abort 20 transactions that exceeded logical_decoding_work_mem
> , I'd get:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>           0 |           0 |                 20
> (1 row)
>
> That way I could figure out that mem_exceeded_count has been reached for
> aborted transactions.
>
> OTOH, If one see spill_txns + stream_txns close to mem_exceeded_count, like:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>          38 |          20 |                 58
> (1 row)
>
> That probably means that logical_decoding_work_mem would need to be increased.
>
> What do you think?

Right. But one might argue that if we increment mem_exceeded_count
only when serializing or streaming is actually performed,
mem_exceeded_count would be 0 in the first example and therefore users
would be able to simply check mem_exceeded_count without any
computation.

>
> BTW, while doing some tests for the above examples, I noticed that the patch
> was missing a check on memExceededCount in UpdateDecodingStats() (that produced
> mem_exceeded_count being 0 for one of the new test in test_decoding): Fixed in
> v5 attached.

Thank you for updating the patch!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi,

On Thu, Sep 25, 2025 at 12:14:04PM -0700, Masahiko Sawada wrote:
> On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > That probably means that logical_decoding_work_mem would need to be increased.
> >
> > What do you think?
> 
> Right. But one might argue that if we increment mem_exceeded_count
> only when serializing or streaming is actually performed,
> mem_exceeded_count would be 0 in the first example and therefore users
> would be able to simply check mem_exceeded_count without any
> computation.

Right, but we'd not be able to see when the memory limit has been reached in all
the cases (that would hide the aborted transactions case). I think that with
the current approach we have the best of both worlds (even if it requires some
computations).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Chao Li
Date:
Hi Bertrand,

Thanks for the patch. The patch overall looks good to me. Just a few small comments:

On Sep 25, 2025, at 18:17, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:

<v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch>

1.
```
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -690,6 +690,9 @@ struct ReorderBuffer
  int64 streamCount; /* streaming invocation counter */
  int64 streamBytes; /* amount of data decoded */
 
+ /* Number of times logical_decoding_work_mem has been reached */
+ int64 memExceededCount;
```

For the other metrics, there is a "Statistics about xxx" comment above each group, and a line comment after every metric. Maybe use the same style, so that it would be easy to add new metrics in the future.

2.
```
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2100,7 +2100,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_REPLICATION_SLOT_COLS 10
+#define PG_STAT_GET_REPLICATION_SLOT_COLS 11
  text   *slotname_text = PG_GETARG_TEXT_P(0);
  NameData slotname;
  TupleDesc tupdesc;
@@ -2125,11 +2125,13 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
    INT8OID, -1, 0);
  TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stream_bytes",
    INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 8, "total_txns",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "mem_exceeded_count",
    INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_bytes",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_txns",
    INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "total_bytes",
+   INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 11, "stats_reset",
    TIMESTAMPTZOID, -1, 0);
  BlessTupleDesc(tupdesc);
```

Is it better to add the new field in the last position?

Say if a client does “select * from pg_stat_get_replication_slot()”, it will then just get an extra column instead of mis-ordered columns.

3.
```
+       <para>
+        Number of times the memory used by logical decoding has exceeded
+        <literal>logical_decoding_work_mem</literal>.
+       </para>
```

Feels like “has” is not needed.

Maybe the wording can be simplified as:

Number of times logical decoding exceeded <literal>logical_decoding_work_mem</literal>.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot
Date:
Hi Evan,

On Fri, Sep 26, 2025 at 02:34:58PM +0800, Chao Li wrote:
> Hi Bertrand,
> 
> Thanks for the patch. The patch overall goods look to me. Just a few small comments:

Thanks for looking at it!

> > On Sep 25, 2025, at 18:17, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> > 
> > <v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch>
> 
> 
> 1.
> ```
> --- a/src/include/replication/reorderbuffer.h
> +++ b/src/include/replication/reorderbuffer.h
> @@ -690,6 +690,9 @@ struct ReorderBuffer
>      int64        streamCount;    /* streaming invocation counter */
>      int64        streamBytes;    /* amount of data decoded */
>  
> +    /* Number of times logical_decoding_work_mem has been reached */
> +    int64        memExceededCount;
> ```
> 
> For other metrics, the commented with “Statistics about xxx” above, and line comment after every metric. Maybe use the same style, so that it would be easy to add new metrics in future.

I'm not sure: for the moment this is the only stat related to
logical_decoding_work_mem / memory usage. If we add other stats in this area
later, we could add a comment "section" as you suggest.

> 2.
> ```
> --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -2100,7 +2100,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
>  Datum
>  pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
>  {
> -#define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> +#define PG_STAT_GET_REPLICATION_SLOT_COLS 11
>      text       *slotname_text = PG_GETARG_TEXT_P(0);
>      NameData    slotname;
>      TupleDesc    tupdesc;
> @@ -2125,11 +2125,13 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
>                         INT8OID, -1, 0);
>      TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stream_bytes",
>                         INT8OID, -1, 0);
> -    TupleDescInitEntry(tupdesc, (AttrNumber) 8, "total_txns",
> +    TupleDescInitEntry(tupdesc, (AttrNumber) 8, "mem_exceeded_count",
>                         INT8OID, -1, 0);
> -    TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_bytes",
> +    TupleDescInitEntry(tupdesc, (AttrNumber) 9, "total_txns",
>                         INT8OID, -1, 0);
> -    TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
> +    TupleDescInitEntry(tupdesc, (AttrNumber) 10, "total_bytes",
> +                       INT8OID, -1, 0);
> +    TupleDescInitEntry(tupdesc, (AttrNumber) 11, "stats_reset",
>                         TIMESTAMPTZOID, -1, 0);
>      BlessTupleDesc(tupdesc);
> ```
> 
> Is it better to add the new field in the last place?
> 
> Say if a client does “select * from pg_stat_get_replication_slit()”, it will just gets an extra column instead of mis-ordered columns.

I think it's good to have the function fields ordering match the view fields
ordering. FWIW, the ordering has been discussed in [1].

> 3.
> ```
> +       <para>
> +        Number of times the memory used by logical decoding has exceeded
> +        <literal>logical_decoding_work_mem</literal>.
> +       </para>
> ```
> 
> Feels like “has” is not needed.

It's already done that way in other parts of the documentation:

$ git grep "has exceeded" "*sgml"
doc/src/sgml/maintenance.sgml:    vacuum has exceeded the defined insert threshold, which is defined as:
doc/src/sgml/monitoring.sgml:        logical decoding to decode changes from WAL has exceeded
doc/src/sgml/monitoring.sgml:        from WAL for this slot has exceeded
doc/src/sgml/monitoring.sgml:        Number of times the memory used by logical decoding has exceeded
doc/src/sgml/ref/create_subscription.sgml:          retention duration has exceeded the

So that looks ok to me (I'm not a native English speaker though).

[1]: https://www.postgresql.org/message-id/CAJpy0uBskXMq65rvWm8a-KR7cSb_sZH9CPRCnWAQrTOF5fciGw%40mail.gmail.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add memory_limit_hits to pg_stat_replication_slots

From: Masahiko Sawada
Date:
On Thu, Sep 25, 2025 at 10:26 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Thu, Sep 25, 2025 at 12:14:04PM -0700, Masahiko Sawada wrote:
> > On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > That probably means that logical_decoding_work_mem would need to be increased.
> > >
> > > What do you think?
> >
> > Right. But one might argue that if we increment mem_exceeded_count
> > only when serializing or streaming is actually performed,
> > mem_exceeded_count would be 0 in the first example and therefore users
> > would be able to simply check mem_exceeded_count without any
> > computation.
>
> Right but we'd not be able to see when the memory limit has been reached for all
> the cases (that would hide the aborted transactions case). I think that with
> the current approach we have the best of both world (even if it requires some
> computations).

Agreed. It would be better to show a raw statistic so that users can
use the number as they want.

I've made a small comment change and added the commit message to the
v5 patch. I'm going to push the attached patch barring any objection
or review comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments