Discussion: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug


Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
Hi,
we have following situation: Pg 14.17 (yes, I know, but it can't be
upgraded now/soon), on Ubuntu focal, in AWS cloud on EC2 server using
arm64 architecture.

All works, is fine.

Every now and then (usually every 3-5 minutes, but not through the whole
day), we see situations where every query suddenly takes ~ 1 second.

I'm talking about "prepare" for preparing statements. binds. also
"discard all".

We have logging enabled to csv log, which offers millisecond precision.
For ~ 1 second there are no logs going to log (we usually have at 5-20
messages logged per second), no connection, nothing. And then we get
a bunch (30+) of messages with the same millisecond time.

And they all have insane durations - 800-1300ms for virtually anything.

After such second, everything works fine, without any problems.

Up to next case.

Unfortunately due to short duration of such things, and the fact that
literally *everything* is paused for this 1 second, it's hard to
debug/diagnose.

Servers have memory that is almost 2x total db size (200gb vs. 384gb of
ram), so disk shouldn't be an issue.

Aside from this, we don't see any other problems.

Any idea how to look at it, what to look for, to be able to diagnose the
issue?

We do use some savepoints, but it's hard to tell when exactly they
happen (we usually log only queries that take more than 250ms, and
queries that use savepoints are usually much faster).

The DB server in question has ~ 150 connections and, at this time, handles ~ 40-50 ktps.

Logging is set using:

               name                │            setting
═══════════════════════════════════╪════════════════════════════════
 log_autovacuum_min_duration       │ 0
 log_checkpoints                   │ on
 log_connections                   │ on
 log_destination                   │ csvlog
 log_directory                     │ /cache/postgres_logs
 log_disconnections                │ off
 log_duration                      │ off
 log_error_verbosity               │ default
 log_executor_stats                │ off
 log_file_mode                     │ 0600
 log_filename                      │ postgresql-%Y-%m-%d_%H%M%S.log
 log_hostname                      │ off
 log_line_prefix                   │ db=%d,user=%u
 log_lock_waits                    │ on
 log_min_duration_sample           │ 0
 log_min_duration_statement        │ 0
 log_min_error_statement           │ error
 log_min_messages                  │ warning
 log_parameter_max_length          │ -1
 log_parameter_max_length_on_error │ 0
 log_parser_stats                  │ off
 log_planner_stats                 │ off
 log_recovery_conflict_waits       │ off
 log_replication_commands          │ off
 log_rotation_age                  │ 60
 log_rotation_size                 │ 51200
 log_statement                     │ none
 log_statement_sample_rate         │ 0.0001
 log_statement_stats               │ off
 log_temp_files                    │ 0
 log_timezone                      │ UTC
 log_transaction_sample_rate       │ 0
 log_truncate_on_rotation          │ off
 logging_collector                 │ on

and it doesn't seem to show anything interesting. Checkpoints happen,
but they don't seem correlated in any way.

Any ideas?

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/19/25 10:21, hubert depesz lubaczewski wrote:
> Hi,
> we have following situation: Pg 14.17 (yes, I know, but it can't be
> upgraded now/soon), on Ubuntu focal, in AWS cloud on EC2 server using
> arm64 architecture.
> 
> All works, is fine.
> 
> Every now and then (usually every 3-5 minutes, but not through the whole
> day), we see situations where every query suddenly takes ~ 1 second.
> 

Given the subject line, what you are reporting is happening on the 
replica, correct?

If so where is the replica relative to the primary, in terms of network 
distance?

Also what are the 'hardware' specifications on the replica instance?


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Scot Kreienkamp
Synchronous commit?  Or asynchronous?

Scot Kreienkamp | Applications Infrastructure Architect | La-Z-Boy Corporate

Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Alban Hertroys

> On 19 Aug 2025, at 22:25, Scot Kreienkamp <Scot.Kreienkamp@la-z-boy.com> wrote:
>
>
> Synchronous commit?  Or asynchronous?

This looks like a bug in Outlook (at work) that I ran into over 3 years ago after enabling beta-support for UTF-8 encoding… The fun part was that the messages looked perfectly fine on my end, but were gibberish on the receiving ends only. Is that bug still present perhaps?

Or is it just my end that turns Scott's (and only Scott's) messages into gibberish Chinese?

Alban Hertroys
--
There is always an exception to always.







Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Ron Johnson

Scot's email wasn't gibberish to me (reading from chrome).  Spaces between every character, and 99.9689% signature block (I did the math!), but all English (and a giant png file).

On Wed, Aug 20, 2025 at 5:41 AM Alban Hertroys <haramrae@gmail.com> wrote:


> On 19 Aug 2025, at 22:25, Scot Kreienkamp <Scot.Kreienkamp@la-z-boy.com> wrote:
>
> Synchronous commit?  Or asynchronous?

This looks like a bug in Outlook (at work) that I ran into over 3 years ago after enabling beta-support for UTF-8 encoding… The fun part was that the messages looked perfectly fine on my end, but were gibberish on the receiving ends only. Is that bug still present perhaps?

Or is it just my end that turns Scott's (and only Scott's) messages into gibberish Chinese?

Alban Hertroys
--
There is always an exception to always.








--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Tue, Aug 19, 2025 at 11:39:03AM -0700, Adrian Klaver wrote:
> > Every now and then (usually every 3-5 minutes, but not through the whole
> > day), we see situations where every query suddenly takes ~ 1 second.
> Given the subject line, what you are reporting is happening on the replica,
> correct?

Yes.

> If so where is the replica relative to the primary, in terms of network
> distance?

=$ ping -c 10 primary
reports:
10 packets transmitted, 10 received, 0% packet loss, time 9181ms
rtt min/avg/max/mdev = 0.942/0.956/0.991/0.012 ms

> Also what are the 'hardware' specifications on the replica instance?

c8g.48xlarge ec2 instance. It is arm64, 192 cores, with 384 gb of ram.

As for storage, this is relatively slow, because this db is rather
small:
gp3 500gb volume, with 6000 iops. At no point is IO in any way close to
limits, the whole db fits easily in ram.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Tue, Aug 19, 2025 at 08:25:26PM +0000, Scot Kreienkamp wrote:
> Synchronous commit?  Or asynchronous?

Isn't this potential problem on primary, and not replica? Anyway,
synchronous commit is set to "on", and "synchronous_standby_names" is
empty.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Pawel Veselov
> > Synchronous commit?  Or asynchronous?
> This looks like a bug in Outlook (at work) that I ran into over 3 years ago after enabling beta-support for UTF-8 encoding… The fun part was that the messages looked perfectly fine on my end, but were gibberish on the receiving ends only. Is that bug still present perhaps?
 

Interesting.
Your email client probably incorrectly interprets UTF-16 encoding.
That's what the content charset is in Scott's email.
The first text character in Scott's email is "S" (U0053), but when
sending(?) it translates to 匀 (U5300).

Gmail web is also struggling with it, judging by the spaces I see
between the characters in Scott's original. Those spaces are the
'0x00' chars from the first byte of each UTF-16 char.

>
> Or is it just my end that turns Scott's (and only Scott's) messages into gibberish Chinese?
>
> Alban Hertroys
> --
> There is always an exception to always.
>
>
>
>
>
>


-- 
With best of best regards
Pawel S. Veselov

Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Tom Lane
Alban Hertroys <haramrae@gmail.com> writes:
> Or is it just my end that turns Scott's (and only Scott's) messages into gibberish Chinese?

What it looks like from here is that the message body is in UTF16:
every other byte is zero.  That gives my email client indigestion,
too.

            regards, tom lane



RE: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Scot Kreienkamp
Apologies for the gibberish everyone.  Outlook defaults to HTML, I haven't posted in probably more than a year and forgot to set it to plain text.  The signature I can't do anything about, that's automatically added after I hit send.

Scot Kreienkamp | Applications Infrastructure Architect | La-Z-Boy Corporate

Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/20/25 04:32, hubert depesz lubaczewski wrote:
> On Tue, Aug 19, 2025 at 11:39:03AM -0700, Adrian Klaver wrote:
>>> Every now and then (usually every 3-5 minutes, but not through the whole
>>> day), we see situations where every query suddenly takes ~ 1 second.
>> Given the subject line, what you are reporting is happening on the replica,
>> correct?
> 
> Yes.
> 
>> If so where is the replica relative to the primary, in terms of network
>> distance?
> 
> =$ ping -c 10 primary
> reports:
> 10 packets transmitted, 10 received, 0% packet loss, time 9181ms
> rtt min/avg/max/mdev = 0.942/0.956/0.991/0.012 ms
> 
>> Also what are the 'hardware' specifications on the replica instance?
> 
> c8g.48xlarge ec2 instance. It is arm64, 192 cores, with 384 gb of ram.
> 
> As for storage, this is relatively slow, because this db is rather
> small:
> gp3 500gb volume, with 6000 iops. At no point is IO in any way close to
> limits, the whole db fits easily in ram.

Hmm.

 From initial post:

"For ~ 1 second there are no logs going to log (we usually have at 5-20
messages logged per second), no connection, nothing. And then we get
a bunch (30+) of messages with the same millisecond time."

Are the 30+ messages all coming in on one connection or multiple 
connections?

Also to be clear these are statements that are being run on the replica 
locally, correct?

Does the AWS monitoring indicate any issues?

> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> Hmm.
> 
> From initial post:
> 
> "For ~ 1 second there are no logs going to log (we usually have at 5-20
> messages logged per second), no connection, nothing. And then we get
> a bunch (30+) of messages with the same millisecond time."
> 
> Are the 30+ messages all coming in on one connection or multiple
> connections?

Multiple connections.

> Also to be clear these are statements that are being run on the replica
> locally, correct?

What do you mean locally?

Application servers are *not* on the same server as the db. So no, they
are not local.

> Does the AWS monitoring indicate any issues?

Nope. All looks "fine".

Same for our monitoring  - CPU usage, iops, load.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Rob Sargent

> On Aug 20, 2025, at 10:08 AM, hubert depesz lubaczewski <depesz@depesz.com> wrote:
>
> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
>> Hmm.
>>
>> From initial post:
>>
>> "For ~ 1 second there are no logs going to log (we usually have at 5-20
>> messages logged per second), no connection, nothing. And then we get
>> a bunch (30+) of messages with the same millisecond time."

Is that logger ms or db server ms?  The latter seems unlikely to me.




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Thom Brown
On Wed, 20 Aug 2025 at 17:08, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> > Hmm.
> >
> > From initial post:
> >
> > "For ~ 1 second there are no logs going to log (we usually have at 5-20
> > messages logged per second), no connection, nothing. And then we get
> > a bunch (30+) of messages with the same millisecond time."
> >
> > Are the 30+ messages all coming in on one connection or multiple
> > connections?
>
> Multiple connections.
>
> > Also to be clear these are statements that are being run on the replica
> > locally, correct?
>
> What do you mean locally?
>
> Application servers are *not* on the same server as the db. So no, they
> are not local.
>
> > Does the AWS monitoring indicate any issues?
>
> Nope. All looks "fine".
>
> Same for our monitoring  - CPU usage, iops, load.
>
> Best regards,
>
> depesz

Do you have THP enabled? Can you use mpstat and see what %steal shows as?

Thom



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/20/25 09:08, hubert depesz lubaczewski wrote:
> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
>> Hmm.
>>
>>  From initial post:
>>
>> "For ~ 1 second there are no logs going to log (we usually have at 5-20
>> messages logged per second), no connection, nothing. And then we get
>> a bunch (30+) of messages with the same millisecond time."
>>
>> Are the 30+ messages all coming in on one connection or multiple
>> connections?
> 
> Multiple connections.
> 
>> Also to be clear these are statements that are being run on the replica
>> locally, correct?
> 
> What do you mean locally?

I should have been clearer. Are the queries being run against the 
replica or the primary?

> 
> Application servers are *not* on the same server as the db. So no, they
> are not local.

Where are the application servers relative to the replica server?

How many applications servers are hitting the database?

> 
>> Does the AWS monitoring indicate any issues?
> 
> Nope. All looks "fine".


> 
> Same for our monitoring  - CPU usage, iops, load.
> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Alban Hertroys
> On 20 Aug 2025, at 14:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alban Hertroys <haramrae@gmail.com> writes:
>> Or is it just my end that turns Scott's (and only Scott's) messages into gibberish Chinese?
>
> What it looks like from here is that the message body is in UTF16:
> every other byte is zero.  That gives my email client indigestion,
> too.

Actually, I think it’s misreporting the message content as UTF-16, while it is in fact encoded as UTF-8. So the error appears to be on the sender side.

Apparently, Apple Mail.app is rather strict about following the reported encoding, while Gmail for example does some attempt to guess. That would explain the less terrible results with Gmail.

I can base64 decode the message just fine on the command line, which runs in a terminal emulator (Apple’s Terminal.app) that does handle UTF-8, but not UTF-16 by default.

Alban Hertroys
--
Illustratief voor het verschil tussen steek- en snijwapens:
Met mensen die een punthoofd hebben,
kun je goed de draak steken.
Met mensen die scherp zijn,
lukt dat juist slecht.







Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 11:24:29AM -0600, Rob Sargent wrote:
> > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> >> Hmm.
> >> 
> >> From initial post:
> >> 
> >> "For ~ 1 second there are no logs going to log (we usually have at 5-20
> >> messages logged per second), no connection, nothing. And then we get
> >> a bunch (30+) of messages with the same millisecond time."
> 
> Is that logger ms or db server ms?  The latter seems unlikely to me. 

Sorry, I don't understand your question.

I know it's unlikely, so it seems that something "froze" pg, for ~ 1s.
The question is how to diagnose/debug what it is.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote:
> Do you have THP enabled? Can you use mpstat and see what %steal shows as?

Sorry, what is "THP"? I tried searching for "what is thp", and most
common search results are related to some chemical compound.

mpstat, yes, we have it. Let me quickly show what I get:

Two days ago, at ~ 10:10pm UTC I saw this in Pg logs:

# grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d'
      2 2025-08-19 22:09:29.084 UTC
      1 2025-08-19 22:09:29.094 UTC
      2 2025-08-19 22:09:29.097 UTC
     70 2025-08-19 22:09:29.109 UTC
     90 2025-08-19 22:09:29.110 UTC
      6 2025-08-19 22:09:29.111 UTC
      1 2025-08-19 22:09:29.153 UTC
      1 2025-08-19 22:09:29.555 UTC

As you can see we have 70, and then 90 messages all logged with the same
timestamp. All of them (160) were "duration:x" lines.

At the time, mpstat (called using `mpstat 5`) looked like this:

             CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
22:09:04     all    2.25    0.00    0.27    0.24    0.00    0.10    0.00    0.00    0.00   97.14
22:09:09     all    2.70    0.00    0.28    0.27    0.00    0.12    0.00    0.00    0.00   96.63
22:09:14     all    2.83    0.00    0.63    0.23    0.00    0.09    0.00    0.00    0.00   96.22
22:09:19     all    2.66    0.00    0.28    0.23    0.00    0.10    0.00    0.00    0.00   96.74
22:09:24     all    2.36    0.00    0.27    0.21    0.00    0.10    0.00    0.00    0.00   97.06
22:09:29     all    2.49    0.00    0.40    0.40    0.00    0.11    0.00    0.00    0.00   96.60
22:09:34     all    2.65    0.00    0.28    0.23    0.00    0.11    0.00    0.00    0.00   96.73
22:09:39     all    2.26    0.00    0.26    0.21    0.00    0.09    0.00    0.00    0.00   97.19
22:09:44     all    2.31    0.00    0.25    0.22    0.00    0.09    0.00    0.00    0.00   97.13
22:09:49     all    2.44    0.00    0.22    0.24    0.00    0.08    0.00    0.00    0.00   97.02
22:09:54     all    2.23    0.00    0.23    0.28    0.00    0.10    0.00    0.00    0.00   97.16
22:09:59     all    2.48    0.00    0.26    0.30    0.00    0.11    0.00    0.00    0.00   96.85
22:10:04     all    2.29    0.00    0.28    0.26    0.00    0.12    0.00    0.00    0.00   97.06
22:10:09     all    2.31    0.00    0.22    0.21    0.00    0.11    0.00    0.00    0.00   97.14
22:10:14     all    2.38    0.00    0.25    0.21    0.00    0.10    0.00    0.00    0.00   97.05
22:10:19     all    2.54    0.00    0.24    0.20    0.00    0.11    0.00    0.00    0.00   96.91
22:10:24     all    2.26    0.00    0.25    0.22    0.00    0.10    0.00    0.00    0.00   97.17
22:10:29     all    2.12    0.00    0.27    0.24    0.00    0.09    0.00    0.00    0.00   97.27
22:10:34     all    2.56    0.00    0.26    0.26    0.00    0.12    0.00    0.00    0.00   96.80
22:10:39     all    2.16    0.00    0.23    0.24    0.00    0.10    0.00    0.00    0.00   97.27
22:10:44     all    2.37    0.00    0.26    0.22    0.00    0.10    0.00    0.00    0.00   97.05
22:10:49     all    2.25    0.00    0.25    0.61    0.00    0.09    0.00    0.00    0.00   96.80
22:10:54     all    2.41    0.00    0.28    0.22    0.00    0.10    0.00    0.00    0.00   96.99
22:10:59     all    2.83    0.00    0.29    0.19    0.00    0.12    0.00    0.00    0.00   96.57

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote:
> On 8/20/25 09:08, hubert depesz lubaczewski wrote:
> > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> > > Hmm.
> > > 
> > >  From initial post:
> > > 
> > > "For ~ 1 second there are no logs going to log (we usually have at 5-20
> > > messages logged per second), no connection, nothing. And then we get
> > > a bunch (30+) of messages with the same millisecond time."
> > > Are the 30+ messages all coming in on one connection or multiple
> > > connections?
> > Multiple connections.
> > > Also to be clear these are statements that are being run on the replica
> > > locally, correct?
> > What do you mean locally?
> I should have been clearer. Are the queries being run against the replica or
> the primary?

All to replica. Primary has its own work, of course, but the problem
we're experiencing is on replicas.

> > Application servers are *not* on the same server as the db. So no, they
> > are not local.
> Where are the application servers relative to the replica server?

Well, there are a lot of them. All are in AWS EC2, same region, various
availability zones. Generally we use 3 AZs, with a more or less equal
split, so I'd say ~ 33% of app servers are within the same AZ.

> How many applications servers are hitting the database?

To be honest, I'm not sure. I have visibility into dbs, and bouncers,
not really into Apps. I know that these are automatically dynamically
scaled, so the number of app servers varies a lot.

I'd say anything from 40 to 200 app servers hit first layer of bouncers,
which we usually have 6-9 (2-3 per az).

These go to 2nd layer of bouncers, on the db server itself.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Thom Brown
On Thu, 21 Aug 2025 at 11:03, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote:
> > Do you have THP enabled? Can you use mpstat and see what %steal shows as?
>
> Sorry, what is "THP"? I tried searching for "what is thp", and most
> common search results are related to some chemical compound.

Ah, yeah I meant transparent hugepage:

cat /sys/kernel/mm/transparent_hugepage/enabled

This should show it being set as "never".
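
It may also be worth double-checking that THP really isn't in use - a quick sketch, assuming shell access on the replica host:

cat /sys/kernel/mm/transparent_hugepage/defrag
grep AnonHugePages /proc/meminfo

With THP disabled, AnonHugePages should stay at 0 kB.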

>
> mpstat, yes, we have it. Let me quickly show what I get:
>
> Two days ago, at ~ 10:10pm UTC I saw this in Pg logs:
>
> # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d'
>       2 2025-08-19 22:09:29.084 UTC
>       1 2025-08-19 22:09:29.094 UTC
>       2 2025-08-19 22:09:29.097 UTC
>      70 2025-08-19 22:09:29.109 UTC
>      90 2025-08-19 22:09:29.110 UTC
>       6 2025-08-19 22:09:29.111 UTC
>       1 2025-08-19 22:09:29.153 UTC
>       1 2025-08-19 22:09:29.555 UTC
>
> As you can see we have 70, and then 90 messages all logged with the same
> timestamp. All of them (160) were "duration:x" lines.
>
> At the time, mpstat (called using `mpstat 5`) looked like this:
>
>              CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 22:09:04     all    2.25    0.00    0.27    0.24    0.00    0.10    0.00    0.00    0.00   97.14
> 22:09:09     all    2.70    0.00    0.28    0.27    0.00    0.12    0.00    0.00    0.00   96.63
> 22:09:14     all    2.83    0.00    0.63    0.23    0.00    0.09    0.00    0.00    0.00   96.22
> 22:09:19     all    2.66    0.00    0.28    0.23    0.00    0.10    0.00    0.00    0.00   96.74
> 22:09:24     all    2.36    0.00    0.27    0.21    0.00    0.10    0.00    0.00    0.00   97.06
> 22:09:29     all    2.49    0.00    0.40    0.40    0.00    0.11    0.00    0.00    0.00   96.60
> 22:09:34     all    2.65    0.00    0.28    0.23    0.00    0.11    0.00    0.00    0.00   96.73
> 22:09:39     all    2.26    0.00    0.26    0.21    0.00    0.09    0.00    0.00    0.00   97.19
> 22:09:44     all    2.31    0.00    0.25    0.22    0.00    0.09    0.00    0.00    0.00   97.13
> 22:09:49     all    2.44    0.00    0.22    0.24    0.00    0.08    0.00    0.00    0.00   97.02
> 22:09:54     all    2.23    0.00    0.23    0.28    0.00    0.10    0.00    0.00    0.00   97.16
> 22:09:59     all    2.48    0.00    0.26    0.30    0.00    0.11    0.00    0.00    0.00   96.85
> 22:10:04     all    2.29    0.00    0.28    0.26    0.00    0.12    0.00    0.00    0.00   97.06
> 22:10:09     all    2.31    0.00    0.22    0.21    0.00    0.11    0.00    0.00    0.00   97.14
> 22:10:14     all    2.38    0.00    0.25    0.21    0.00    0.10    0.00    0.00    0.00   97.05
> 22:10:19     all    2.54    0.00    0.24    0.20    0.00    0.11    0.00    0.00    0.00   96.91
> 22:10:24     all    2.26    0.00    0.25    0.22    0.00    0.10    0.00    0.00    0.00   97.17
> 22:10:29     all    2.12    0.00    0.27    0.24    0.00    0.09    0.00    0.00    0.00   97.27
> 22:10:34     all    2.56    0.00    0.26    0.26    0.00    0.12    0.00    0.00    0.00   96.80
> 22:10:39     all    2.16    0.00    0.23    0.24    0.00    0.10    0.00    0.00    0.00   97.27
> 22:10:44     all    2.37    0.00    0.26    0.22    0.00    0.10    0.00    0.00    0.00   97.05
> 22:10:49     all    2.25    0.00    0.25    0.61    0.00    0.09    0.00    0.00    0.00   96.80
> 22:10:54     all    2.41    0.00    0.28    0.22    0.00    0.10    0.00    0.00    0.00   96.99
> 22:10:59     all    2.83    0.00    0.29    0.19    0.00    0.12    0.00    0.00    0.00   96.57

This output looks fine and doesn't show anything concerning, which
suggests the issue is somehow on the Postgres side.

Did you happen to poll pg_stat_activity at the time to see whether you
had lots of IPC waits? I'm wondering whether the storage layer is
freezing up for a moment.

Thom



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 12:41:44PM +0100, Thom Brown wrote:
> Ah, yeah I meant transparent hugepage:
> cat /sys/kernel/mm/transparent_hugepage/enabled
> This should show it being set as "never".

Ah. Sorry, couldn't decipher. Yes, it's "never".

> > # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d'
> >       2 2025-08-19 22:09:29.084 UTC
> >       1 2025-08-19 22:09:29.094 UTC
> >       2 2025-08-19 22:09:29.097 UTC
> >      70 2025-08-19 22:09:29.109 UTC
> >      90 2025-08-19 22:09:29.110 UTC
> >       6 2025-08-19 22:09:29.111 UTC
> >       1 2025-08-19 22:09:29.153 UTC
> >       1 2025-08-19 22:09:29.555 UTC
…
> > 22:10:54     all    2.41    0.00    0.28    0.22    0.00    0.10    0.00    0.00    0.00   96.99
> > 22:10:59     all    2.83    0.00    0.29    0.19    0.00    0.12    0.00    0.00    0.00   96.57
> 
> This output looks fine, so it doesn't show anything concerning, so
> suggests the issue is somehow on the Postgres side.
> 
> Did you happen to poll pg_stat_activity at the time to see whether you
> had lots of IPC waits? I'm wondering whether the storage layer is
> freezing up for a moment.

So, we get select * from pg_stat_activity, for client backends that are
not idle, every 29 seconds.
So, a 1 second "freeze" is impossible to catch.

Plus - I suspect that if I ran select * from pg_stat_activity while "in
freeze", it would also get frozen.

Anyway, I have data from 22:09:22 and 22:09:51. In both cases there
were only 4 non-idle backends.

6 of them had NULL in wait_event*

one was Client/ClientRead and one was IPC/BgWorkerShutdown.

State_change for the IPC/BgWorkerShutdown backend was 2025-08-19
22:09:51.79504+00 so it was well past the moment when the problem
struck.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/21/25 03:07, hubert depesz lubaczewski wrote:
> On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote:
>> On 8/20/25 09:08, hubert depesz lubaczewski wrote:
>>> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
>>>> Hmm.
>>>>
>>>>   From initial post:
>>>>
>>>> "For ~ 1 second there are no logs going to log (we usually have at 5-20
>>>> messages logged per second), no connection, nothing. And then we get
>>>> a bunch (30+) of messages with the same millisecond time."
>>>> Are the 30+ messages all coming in on one connection or multiple
>>>> connections?
>>> Multiple connections.
>>>> Also to be clear these are statements that are being run on the replica
>>>> locally, correct?
>>> What do you mean locally?
>> I should have been clearer. Are the queries being run against the replica or
>> the primary?
> 
> All to replica. Primary has its own work, of course, but the problem
> we're experiencing is on replicas.

If I am following, there is more than one primary --> replica pair and
the problem exists across all the pairs.

>> How many applications servers are hitting the database?
> 
> To be honest, I'm not sure. I have visibility into dbs, and bouncers,
> not really into Apps. I know that these are automatically dynamically
> scaled, so the number of app servers varies a lot.
> 
> I'd say anything from 40 to 200 app servers hit first layer of bouncers,
> which we usually have 6-9 (2-3 per az).
> 
> These go to 2nd layer of bouncers, on the db server itself.

By bouncer I assume you mean something like pgBouncer, a connection pooler.

Is it possible to determine what bouncer the queries in question are 
coming from?

> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote:
> > > > > "For ~ 1 second there are no logs going to log (we usually have at 5-20
> > > > > messages logged per second), no connection, nothing. And then we get
> > > > > a bunch (30+) of messages with the same millisecond time."
> > > > > Are the 30+ messages all coming in on one connection or multiple
> > > > > connections?
> > > > Multiple connections.
> > > > > Also to be clear these are statements that are being run on the replica
> > > > > locally, correct?
> > > > What do you mean locally?
> > > I should have been clearer. Are the queries being run against the replica or
> > > the primary?
> > All to replica. Primary has its own work, of course, but the problem
> > we're experiencing is on replicas.
> 
> If I am following there is more then one primary --> replica pair and the
> problem exists across all the pairs.

Not all. We have ~ 300 such clusters. The thing doesn't cause any
customer-visible issues (after all it's just 1 second delay every so
often), so it's generally overlooked when it happens.

But we were paying closer attention to one such cluster, and then
a couple of others, and we've seen this behavior.

> > > How many applications servers are hitting the database?
> > 
> > To be honest, I'm not sure. I have visibility into dbs, and bouncers,
> > not really into Apps. I know that these are automatically dynamically
> > scaled, so the number of app servers varies a lot.
> > 
> > I'd say anything from 40 to 200 app servers hit first layer of bouncers,
> > which we usually have 6-9 (2-3 per az).
> > 
> > These go to 2nd layer of bouncers, on the db server itself.
> 
> By bouncer I assume you mean something like pgBouncer, a connection pooler.
> Is it possible to determine what bouncer the queries in question are coming
> from?

From the POV of db, all queries are coming from one of N localhost
bouncers. N is usually 2…6.
From the POV of the local bouncers, the queries come from range of
remote bouncers.

Generally we haven't seen any correlation between queries coming from
specific ranges of ips. Logged queries, the ones that we see with
runtime of 1s, have comments that indicate source, and they come from
"all-around". Specifically, "DISCARD ALL" queries are generated by
bouncers themselves (both layers).

Just so that it's clear, I don't expect anyone to be able to diagnose
the problem based on this description. I'm looking more for ideas on
what to look for. The issue is that with the situation being pretty
short, and happening on servers with non-trivial query load, I can't
easily do stuff like, for example, strace.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/21/25 08:13, hubert depesz lubaczewski wrote:
> On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote:

>> By bouncer I assume you mean something like pgBouncer, a connection pooler.
>> Is it possible to determine what bouncer the queries in question are coming
>> from?
> 
>  From the POV of db, all queries are coming from one of N localhost
> bouncers. N is usually 2…6.
>  From the POV of the local bouncers, the queries come from range of
> remote bouncers.
> 
> Generally we haven't seen any correlation between queries coming from
> specific ranges of ips. Logged queries, the ones that we see with
> runtime of 1s, have comments that indicate source, and they come from
> "all-around". Specifically, "DISCARD ALL" queries are generated by
> bouncers themselves (both layers).

Well so much for that theory:)

> 
> Just so that it's clear, I don't expect anyone to be able to diagnose
> the problem based on this description. I'm looking more for ideas on
> what to look for. The issue is that with the situation being pretty
> short, and happening on servers with non-trivial query load, I can't
> easily do stuff like, for example, strace.

Getting to the bottom of the bag of ideas:

Have you looked at the OS system log for the time period involved?

You mentioned this seemed to involve PREPARE and DISCARD ALL.
Is this the same set of statements or is it all over the place?

Also it would be helpful to know what bouncer you are actually using and 
what mode you are running in?


> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
> Getting to the bottom of the bag of ideas:
> Have you looked at the OS system log for the time period involved?

Yes. Mostly dmesg. Nothing interesting logged around the time.

> You mentioned this seemed to involve PREPARE and DISCARD ALL.
> Is this the same set of statements or is it all over the place?

No. From what I can tell it's a random sample.

> Also it would be helpful to know what bouncer you are actually using and
> what mode you are running in?

pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
Applications go using transaction pooling, but people (dbas, ops) have
session pooling.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/21/25 09:51, hubert depesz lubaczewski wrote:
> On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
>> Getting to the bottom of the bag of ideas:
>> Have you looked at the OS system log for the time period involved?
> 
> Yes. Mostly dmesg. Nothing interesting logged around the time.
> 
>> You mentioned this seemed to involve PREPARE and DISCARD ALL.
>> Is this the same set of statements or is it all over the place?
> 
> No. From what I can tell it's random sample.
> 
>> Also it would be helpful to know what bouncer you are actually using and
>> what mode you are running in?
> 
> pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
> Applications go using transaction pooling, but people (dbas, ops) have
> session pooling.

Have you looked at?:

https://www.pgbouncer.org/changelog.html#pgbouncer-124x

To see if anything stands out.

Then there is:

https://www.pgbouncer.org/config.html#max_prepared_statements

The below may also be worth looking at:

https://github.com/pgbouncer/pgbouncer/pull/1144

I can't help thinking that there is a caching issue at stake, though 
that is just a guess.


> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Chris Wilson
If all your queries are coming through pgBouncer, and only those hang (the server itself responds if you connect directly to it), then it might be this pgBouncer issue:

https://github.com/pgbouncer/pgbouncer/issues/1054

Although that issue is now "closed", because the invisible "debug" log message was upgraded to a warning (and I don't think that change is in any released version), the underlying problem still exists: pgbouncer hangs completely (stops forwarding packets) for a while if the PAM authentication queue becomes full.

If you have a relatively slow PAM service (such as pam_ldap) then you can trigger it by opening ~100 connections to pgBouncer simultaneously (without waiting for previous ones to authenticate), something like this:

for i in `seq 1 100`; do psql -h pgbouncer -p 6432 -U user db_name -c "SELECT 1" & done

Thanks, Chris.

On Thu, 21 Aug 2025 at 19:17, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 8/21/25 09:51, hubert depesz lubaczewski wrote:
> On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
>> Getting to the bottom of the bag of ideas:
>> Have you looked at the OS system log for the time period involved?
>
> Yes. Mostly dmesg. Nothing interesting logged around the time.
>
>> You mentioned this seemed to involve PREPARE and DISCARD ALL.
>> Is this the same set of statements or is it all over the place?
>
> No. From what I can tell it's random sample.
>
>> Also it would be helpful to know what bouncer you are actually using and
>> what mode you are running in?
>
> pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
> Applications go using transaction pooling, but people (dbas, ops) have
> session pooling.

Have you looked at?:

https://www.pgbouncer.org/changelog.html#pgbouncer-124x

To see if anything stands out.

Then there is:

https://www.pgbouncer.org/config.html#max_prepared_statements

The below may also be worth looking at:

https://github.com/pgbouncer/pgbouncer/pull/1144

I can't help thinking that there is a caching issue at stake, though
that is just a guess.


>
> Best regards,
>
> depesz
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote:
> Have you looked at?:
> https://www.pgbouncer.org/changelog.html#pgbouncer-124x
> To see if anything stands out.
> Then there is:
> https://www.pgbouncer.org/config.html#max_prepared_statements
> The below may also be worth looking at:
> https://github.com/pgbouncer/pgbouncer/pull/1144
> I can't help thinking that there is a caching issue at stake, though that is
> just a guess.

Will check, but we don't use prepared statements, so this doesn't seem
to be part-of-the-problem.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 07:38:34PM +0100, Chris Wilson wrote:
> If all your queries are coming through pgBouncer, and only those hang (the
> server itself responds if you connect directly to it), then it might be
> this pgBouncer issue:
> 
> https://github.com/pgbouncer/pgbouncer/issues/1054
> 
> Although that issue is now "closed", because the invisible "debug" log
> message was upgraded to a warning (and I don't think that change is in any
> released version), the underlying problem still exists: pgbouncer hangs
> completely (stops forwarding packets) for a while if the PAM
> authentication queue becomes full.
> 
> If you have a relatively slow PAM service (such as pam_ldap) then you can
> trigger it by opening ~100 connections to pgBouncer simultaneously (without
> waiting for previous ones to authenticate), something like this:
> 
> for i in `seq 1 100`; do psql -h pgbouncer -p 6432 -U user db_name -c
> "SELECT 1" & done

Please note that during the time of problem *NOTHING* seems to be
happening in Pg. There are no messages in logs about anything.

We don't have direct access to db, so there is no sane way to check if
directly sent queries will work.

I can, of course, start a shell and run queries there, but the
ephemeral nature of the problem makes it more difficult.

As for authentication - there is no ldap/pam in place. All auth is based
on certs/passwords-in-file, and we never noticed authentication
slowdowns.

Plus - queries like 'DISCARD ALL' are not run because someone connected…

Best regards,

depesz





Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Adrian Klaver
On 8/22/25 05:37, hubert depesz lubaczewski wrote:
> On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote:
>> Have you looked at?:
>> https://www.pgbouncer.org/changelog.html#pgbouncer-124x
>> To see if anything stands out.
>> Then there is:
>> https://www.pgbouncer.org/config.html#max_prepared_statements
>> The below may also be worth looking at:
>> https://github.com/pgbouncer/pgbouncer/pull/1144
>> I can't help thinking that there is a caching issue at stake, though that is
>> just a guess.
> 
> Will check, but we don't use prepared statements, so this doesn't seem
> to be part-of-the-problem.

In your OP there is:

"Every now and then (usually every 3-5 minutes, but not through the whole
day), we see situations where every query suddenly takes ~ 1 second.


I'm talking about "prepare" for preparing statements. binds. also
"discard all".
"

Is that referring to something else?

> 
> Best regards,
> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski
On Fri, Aug 22, 2025 at 07:33:03AM -0700, Adrian Klaver wrote:
> On 8/22/25 05:37, hubert depesz lubaczewski wrote:
> > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote:
> > > Have you looked at?:
> > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x
> > > To see if anything stands out.
> > > Then there is:
> > > https://www.pgbouncer.org/config.html#max_prepared_statements
> > > The below may also be worth looking at:
> > > https://github.com/pgbouncer/pgbouncer/pull/1144
> > > I can't help thinking that there is a caching issue at stake, though that is
> > > just a guess.
> > 
> > Will check, but we don't use prepared statements, so this doesn't seem
> > to be part-of-the-problem.
> 
> In your OP there is:
> 
> "Every now and then (usually every 3-5 minutes, but not through the whole
> day), we see situations where every query suddenly takes ~ 1 second.
> 
> 
> I'm talking about "prepare" for preparing statements. binds. also
> "discard all".
> "
> 
> Is that referring to something else?

No. Same problem. The thing is that it comes and goes.

*BUT*

I got a repeatable case today. It is breaking on its own every
~ 5 minutes.

So I made test.sql with:
\timing on
SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000) i \gexec

and ran it like this:

psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /'

This bypassed pgbouncer, connected via unix socket, the whole thing got simpler.

After 4 minutes of 50ms runtimes I got:

2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms
2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms
2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms
2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms
2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394)
2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms
2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms
2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms
2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms

This is pg_sleep. There are no locks, nothing else. I think this
disproves idea that the problem is about pgbouncer, and it kinda looks
like a thing in pg?

Interestingly: ran strace on the pg backend that was doing this loop,
like this:
strace -tt -T -ff -p 1266521  -s 256 -o a.log

And it showed something really strange:

…
14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>
14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063>
14:57:00.914823 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:00.914849 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>
14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050064>
14:57:00.964978 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>
14:57:00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063>
14:57:01.015133 futex(0xffe7a6636538, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0 <1.354927>
14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 UTC,\"postgres\",\"the.db\",1266521,\"[local]\",68a884fb.135359,5,\"SELECT\",2025-08-22 14:55:55 UTC,85/0,0,LOG,00000,\"duration: 1405.106 ms  statement: SELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);
14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\0\0\0\0\0\0\0\0\0\0", 64, 0, NULL, 0) = 64 <0.000041>
14:57:02.370639 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:02.371063 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000019>
14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050090>
14:57:02.421381 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
14:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u32=470567512, u64=187651886697048}}], 1, -1, NULL, 8) = 1 <0.000037>
14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000012>
14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050099>
14:57:02.471866 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:02.471961 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
…

As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and sendto.
all clear, nothing fancy.

*BUT*
in case of this one problematic thing there is futex() call that took 1.354s ?! Checked, and in straced ~ 800 queries this is the only futex call() ?!
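
Side note, in case anyone wants to repeat this: with strace -T the per-call time is the <...> at the end of each line, so the slow calls can be pulled out of the logs with something like (a sketch):

grep -E ' <[1-9][0-9]*\.[0-9]+>$' a.log.*

i.e. any syscall that took 1 second or longer.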
 

Any idea on what could it be?

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Pavel Stehule


pá 22. 8. 2025 v 17:03 odesílatel hubert depesz lubaczewski <depesz@depesz.com> napsal:
On Fri, Aug 22, 2025 at 07:33:03AM -0700, Adrian Klaver wrote:
> On 8/22/25 05:37, hubert depesz lubaczewski wrote:
> > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote:
> > > Have you looked at?:
> > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x
> > > To see if anything stands out.
> > > Then there is:
> > > https://www.pgbouncer.org/config.html#max_prepared_statements
> > > The below may also be worth looking at:
> > > https://github.com/pgbouncer/pgbouncer/pull/1144
> > > I can't help thinking that there is a caching issue at stake, though that is
> > > just a guess.
> >
> > Will check, but we don't use prepared statements, so this doesn't seem
> > to be part-of-the-problem.
>
> In your OP there is:
>
> "Every now and then (usually every 3-5 minutes, but not through the whole
> day), we see situations where every query suddenly takes ~ 1 second.
>
>
> I'm talking about "prepare" for preparing statements. binds. also
> "discard all".
> "
>
> Is that referring to something else?

No. Same problem. The thing is that it comes and goes.

*BUT*

I got a repeatable case today. It is breaking on its own every
~ 5 minutes.

So I made test.sql with:
\timing on
SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000) i \gexec

and ran it like this:

psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /'

This bypassed pgbouncer, connected via unix socket, the whole thing got simpler.

After 4 minutes of 50ms runtimes I got:

2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms
2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms
2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms
2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms
2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394)
2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms
2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms
2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms
2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms

This is pg_sleep. There are no locks, nothing else. I think this
disproves idea that the problem is about pgbouncer, and it kinda looks
like a thing in pg?

Interestingly: ran strace on the pg backend that was doing this loop,
like this:
strace -tt -T -ff -p 1266521  -s 256 -o a.log

And it showed something really strange:


14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>
14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063>
14:57:00.914823 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:00.914849 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>
14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050064>
14:57:00.964978 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007>                                                                                                                                               14:57:00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063>
14:57:01.015133 futex(0xffe7a6636538, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0 <1.354927>
14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 UTC,\"postgres\",\"the.db\",1266521,\"[local]\",68a884fb.135359,5,\"SELECT\",2025-08-22 14:55:55 UTC,85/0,0,LOG,00000,\"duration: 1405.106 ms  statement: SELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);
14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\0\0\0\0\0\0\0\0\0\0", 64, 0, NULL, 0) = 64 <0.000041>
14:57:02.370639 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.371063 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000019>
14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050090>
14:57:02.421381 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>                                                                                                                                                                 14:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u32=470567512, u64=187651886697048}}], 1, -1, NULL, 8) = 1 <0.000037>
14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000012>
14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050099>
14:57:02.471866 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.471961 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>


As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and sendto.
all clear, nothing fancy.

*BUT*
in case of this one problematic thing there is futex() call that took 1.354s ?! Checked, and in straced ~ 800 queries this is the only futex call() ?!

Any idea on what could it be?

A few years ago I had a similar experience - mysterious lags of about a second - and the problem turned out to be in virtualization.

Regards

Pavel

 

Best regards,

depesz



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
hubert depesz lubaczewski
Date:
On Fri, Aug 22, 2025 at 05:07:09PM +0200, Pavel Stehule wrote:
> > Any idea on what could it be?
> A few years ago I had a similar experience - mysterious lags of about a second -
> and the problem turned out to be in virtualization.

While I can't discount this idea - after all, it is AWS EC2, so virtual
boxes - what makes me wonder is:

in the case of this slow situation there is a futex call, which takes
a second. But there are no futex calls in the traces of any other query.

So, while it is obvious that the futex call "wasted" this time, the question is:
why would pg call futex() for this one query at all?
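
One way to at least see what the backend thinks it is waiting on would be to sample wait events in a tight loop and look at what got recorded right around a stall - a rough sketch (the sampling session may well freeze for the same second itself):

| while true
| do
|     psql -qAtX -c "SELECT clock_timestamp(), pid, state, wait_event_type, wait_event
|                    FROM pg_stat_activity
|                    WHERE backend_type = 'client backend' AND wait_event IS NOT NULL"
|     sleep 0.05
| done >> wait_events.log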

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> I got a repeatable case today. It is breaking on its own every
> ~ 5 minutes.

Interesting.  That futex call is presumably caused by interaction
with some other process within the standby server, and the only
plausible candidate really is the startup process (which is replaying
WAL received from the primary).  There are cases where WAL replay
will take locks that can block queries on the standby.  Can you
correlate the delays on the standby server with any DDL events
occurring on the primary?

            regards, tom lane



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
hubert depesz lubaczewski
Date:
On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > I got a repeatable case today. It is breaking on its own every
> > ~ 5 minutes.
> 
> Interesting.  That futex call is presumably caused by interaction
> with some other process within the standby server, and the only
> plausible candidate really is the startup process (which is replaying
> WAL received from the primary).  There are cases where WAL replay
> will take locks that can block queries on the standby.  Can you
> correlate the delays on the standby server with any DDL events
> occurring on the primary?

Nope. Plus there is certain repetition of these cases, so even if I'd
miss *some* create table/alter, it just isn't going to be happening
every 4-5 minutes.

For example, looking at logs for the last ~2h, and just checking
situations where there are more than 20 messages in the same millisecond,
I can see:

    108 14:02:03.149
     25 14:04:01.619
    110 14:05:36.924
     77 14:05:36.925
    108 14:09:28.155
     38 14:13:52.481
     63 14:13:52.482
     73 14:13:52.484
    146 14:18:19.338
     39 14:18:19.339
     24 14:20:01.694
     82 14:23:07.352
     55 14:23:07.353
     37 14:23:07.353
     45 14:27:44.125
    132 14:27:44.126
    109 14:31:41.593
     70 14:31:41.594
     24 14:32:01.205
     21 14:34:01.477
     79 14:35:36.761
    104 14:35:36.762
     22 14:39:49.541
    151 14:39:49.542
     22 14:39:49.543
    112 14:44:15.607
     73 14:44:15.608
     28 14:48:01.256
     50 14:48:25.588
    131 14:48:25.589
    139 14:52:44.391
     74 14:57:02.369
    117 14:57:02.370
     20 15:00:02.008
    137 15:00:43.982
     34 15:00:43.983
     20 15:01:01.110
     22 15:04:21.037
    153 15:04:21.038
     20 15:08:01.136
     31 15:08:55.798
    126 15:08:55.799
     76 15:13:46.654
     83 15:13:46.655
     20 15:17:01.700
    107 15:18:42.112
     72 15:18:42.113
    124 15:23:48.689
     32 15:23:48.690
     25 15:23:48.690
     28 15:24:01.397

So, while there are outliers, I'd say that most of the problems happen every
3-5 minutes.
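
For what it's worth, one way to check what the startup process was replaying around those timestamps would be pg_waldump over the Standby resource-manager records (AccessExclusiveLock grants, running-xacts snapshots) - a sketch; the LSN range is a placeholder and would have to come from pg_last_wal_replay_lsn() around a stall:

| /usr/lib/postgresql/14/bin/pg_waldump \
|     -p /var/lib/postgresql/14/main/pg_wal \
|     -r Standby \
|     -s 1234/AB000000 -e 1234/AC000000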

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
Adrian Klaver
Date:
On 8/22/25 08:30, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>> I got a repeatable case today. It is breaking on its own every
>>> ~ 5 minutes.
>>
>> Interesting.  That futex call is presumably caused by interaction
>> with some other process within the standby server, and the only
>> plausible candidate really is the startup process (which is replaying
>> WAL received from the primary).  There are cases where WAL replay
>> will take locks that can block queries on the standby.  Can you
>> correlate the delays on the standby server with any DDL events
>> occurring on the primary?
> 
> Nope. Plus there is certain repetition of these cases, so even if I'd
> miss *some* create table/alter, it just isn't going to be happening
> every 4-5 minutes.
> 

> So, while there are outliers, I'd say that most of the problems happen every
> 3-5 minutes.

Are you using the Postgres community version or the AWS variant?

> 
> depesz
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
hubert depesz lubaczewski
Date:
On Fri, Aug 22, 2025 at 08:39:21AM -0700, Adrian Klaver wrote:
> On 8/22/25 08:30, hubert depesz lubaczewski wrote:
> > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> > > hubert depesz lubaczewski <depesz@depesz.com> writes:
> > > > I got a repeatable case today. It is breaking on its own every
> > > > ~ 5 minutes.
> > > 
> > > Interesting.  That futex call is presumably caused by interaction
> > > with some other process within the standby server, and the only
> > > plausible candidate really is the startup process (which is replaying
> > > WAL received from the primary).  There are cases where WAL replay
> > > will take locks that can block queries on the standby.  Can you
> > > correlate the delays on the standby server with any DDL events
> > > occurring on the primary?
> > 
> > Nope. Plus there is certain repetition of these cases, so even if I'd
> > miss *some* create table/alter, it just isn't going to be happening
> > every 4-5 minutes.
> > 
> 
> > So, while there are outliers, I'd say that most of the problems happen every
> > 3-5 minutes.
> 
> Are you using the Postgres community version or the AWS variant?

Community. From pgdg repo.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
>> Interesting.  That futex call is presumably caused by interaction
>> with some other process within the standby server, and the only
>> plausible candidate really is the startup process (which is replaying
>> WAL received from the primary).  There are cases where WAL replay
>> will take locks that can block queries on the standby.  Can you
>> correlate the delays on the standby server with any DDL events
>> occurring on the primary?

> Nope. Plus there is certain repetition of these cases, so even if I'd
> miss *some* create table/alter, it just isn't going to be happening
> every 4-5 minutes.

Nonetheless, I'm suspecting an interaction with the startup process,
because there just isn't that much else that this process could be
needing to deal with.  Can you try strace'ing both the process doing
the test query and the startup process, to see what the startup
process is doing at the times the futex calls happen?

            regards, tom lane



Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
hubert depesz lubaczewski
Date:
On Fri, Aug 22, 2025 at 11:55:18AM -0400, Tom Lane wrote:
> Nonetheless, I'm suspecting an interaction with the startup process,
> because there just isn't that much else that this process could be
> needing to deal with.  Can you try strace'ing both the process doing
> the test query and the startup process, to see what the startup
> process is doing at the times the futex calls happen?

I can, but not today. Will get back to you about it next week, though.

Best regards,

depesz




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From
hubert depesz lubaczewski
Date:
On Fri, Aug 22, 2025 at 11:55:18AM -0400, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> >> Interesting.  That futex call is presumably caused by interaction
> >> with some other process within the standby server, and the only
> >> plausible candidate really is the startup process (which is replaying
> >> WAL received from the primary).  There are cases where WAL replay
> >> will take locks that can block queries on the standby.  Can you
> >> correlate the delays on the standby server with any DDL events
> >> occurring on the primary?
> 
> > Nope. Plus there is certain repetition of these cases, so even if I'd
> > miss *some* create table/alter, it just isn't going to be happening
> > every 4-5 minutes.
> 
> Nonetheless, I'm suspecting an interaction with the startup process,
> because there just isn't that much else that this process could be
> needing to deal with.  Can you try strace'ing both the process doing
> the test query and the startup process, to see what the startup
> process is doing at the times the futex calls happen?

Sooo. This time I wasn't able to find a 1s freeze, but I found
a cluster with ~600ms freezes.

Ran a tight-ish loop of pg_sleep(0.05):

| #!/usr/bin/env bash
| (
|     echo '\timing'
|     while true
|     do
|         echo "SELECT now(), pg_backend_pid(), pg_sleep(.05);"
|     done
| ) |
|     psql -qAtXF, |
|     perl -ne '
|         if (/^(.*),(.*),\s*\z/) {
|             printf("%-30s :: %6s :: ", $1, $2);
|         } elsif (/^Time: (\d\d+)\.(\d+) ms\s/) {
|             printf "%s.%s ms :: %s\n", $1, $2, "#"x$1
|         }
|     '

and it logged:

2025-08-25 11:51:44.504176+00  :: 3452279 :: 50.753 ms :: ##################################################
2025-08-25 11:51:44.554945+00  :: 3452279 :: 50.600 ms :: ##################################################
2025-08-25 11:51:44.605597+00  :: 3452279 :: 50.612 ms :: ##################################################
2025-08-25 11:51:44.656266+00  :: 3452279 :: 611.872 ms :: ########################################################…
2025-08-25 11:51:45.268284+00  :: 3452279 :: 51.709 ms :: ###################################################
2025-08-25 11:51:45.319975+00  :: 3452279 :: 50.693 ms :: ##################################################
2025-08-25 11:51:45.370746+00  :: 3452279 :: 50.779 ms :: ##################################################

This time, on this cluster, the problems seemed to happen ~10 minutes apart.

Got an strace of the startup process, and of the pg backend that hit the freezes.
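
(Captured with something along these lines - a sketch, the exact invocation doesn't matter; the startup pid is a placeholder:)

| # 3452279 is the backend running the pg_sleep loop; STARTUP_PID is the startup process
| strace -tt -T -o backend.strace -p 3452279 &
| strace -tt -T -o startup.strace -p "$STARTUP_PID" &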

First, the backend processing pg_sleep:

…
11:51:44.554846 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000013>
11:51:44.554895 recvfrom(9,
"!:\17\372\355\223k\"\200F\260\27&\7Z\260Z\24\251\235\240\24\24w\372||I\361kX%sC\241\36\227\275\372M\335>U\371bg#\2648O\265\276\305~a_\335\355\204\232\317\264f\316\344\205A\313D",
69,0, NULL, NULL) = 69 <0.000018>
 
11:51:44.554998 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050079>
11:51:44.605157 write(2, "Message to pg.log") = 289 <0.000023>
11:51:44.605225 sendto(9, "to-psql…", 186, 0, NULL, 0) = 186 <0.000031>
11:51:44.605315 recvfrom(9, 0xaaaade97b493, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
<0.000013>
11:51:44.605360 epoll_pwait(4, [{EPOLLIN, {u32=3733872600, u64=187650855034840}}], 1, -1, NULL, 8) = 1 <0.000104>
11:51:44.605497 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000013>
11:51:44.605546 recvfrom(9,
"\274\211\244\366\205\263\336\224\256\342:-nD\343\243U\206\rSW\364S\307\356\225\25\255\240\244i#\322P\33\316\223\213\36m\221E\262_\222\337\346\215\356\32T\247F\273U\\z\346\313M\1\21\316\270\203\f@P\201",
69,0, NULL, NULL) = 69 <0.000020>
 
11:51:44.605663 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050076>
11:51:44.655824 write(2, "Message to pg.log") = 289 <0.000024>
11:51:44.655893 sendto(9, "to-psql…", 186, 0, NULL, 0) = 186 <0.000028>
11:51:44.655978 recvfrom(9, 0xaaaade97b493, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
<0.000011>
11:51:44.656021 epoll_pwait(4, [{EPOLLIN, {u32=3733872600, u64=187650855034840}}], 1, -1, NULL, 8) = 1 <0.000103>
11:51:44.656157 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000013>
11:51:44.656207 recvfrom(9,
"\327\27\312\363\316\241W\340\5\230\273\376\276\221Be\343\35\322+Y>x\300\1\200\25;\353O\321Y\27IrT\n\314\372\27\240\311\306b\237\264\212\314\22R\232\370\21G\367\303\252\255\3\v7\16\203\353\247u\270\355\373",
69,0, NULL, NULL) = 69 <0.000020>
 
11:51:44.656336 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050076>
11:51:44.706471 futex(0xfff39363d338, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
<0.560006>
11:51:45.266590 write(2, "Message to pg.log") = 290 <0.000046>
11:51:45.266732 sendto(8,
"\2\0\0\0@\0\0\0\30A\0\0\0\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0(\336\n\0\0\0\0\0\234M\2\0\0\0\0\0\0\0\0\0\0\0\0\0",
64,0, NULL, 0) = 64 <0.000043>
 
11:51:45.267745 sendto(9, "to-psql…", 186, 0, NULL, 0) = 186 <0.000066>
11:51:45.268065 recvfrom(9, 0xaaaade97b493, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
<0.000016>
11:51:45.268128 epoll_pwait(4, [{EPOLLIN, {u32=3733872600, u64=187650855034840}}], 1, -1, NULL, 8) = 1 <0.000011>
11:51:45.268180 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000011>
11:51:45.268236 recvfrom(9,
"u\6FJ6r3wb\302\325\203\234\7\375\3575\364\331\r\265\316o\3506}\25\346I\2039j2\232}\224\372\324\200>\270|\353\203\177u\261\35F\1\254\253\24\31\4\20\246\230\25M\361\371\16\301a\274$~\321",
69,0, NULL, NULL) = 69 <0.000011>
 
11:51:45.268370 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050917>
11:51:45.319409 write(2, "Message to pg.log") = 289 <0.000033>
11:51:45.319545 sendto(9, "to-psql…", 186, 0, NULL, 0) = 186 <0.000046>
11:51:45.319661 recvfrom(9, 0xaaaade97b493, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
<0.000018>
11:51:45.319725 epoll_pwait(4, [{EPOLLIN, {u32=3733872600, u64=187650855034840}}], 1, -1, NULL, 8) = 1 <0.000083>
11:51:45.319852 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000015>
11:51:45.319919 recvfrom(9,
"\\,\213O\270\3139\215D\325a\2577\307\342\343\304\34B\r\202\333\274\251\346\221\270.B\306)oA\346\375_\6XT\3\316$\306\212\224\350ab\324B\263\204\220\372M\215\f\225\277\247c+8\30J\246\262v\307",
69,0, NULL, NULL) = 69 <0.000019>
 
11:51:45.320053 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050078>
11:51:45.370209 write(2, "Message to pg.log") = 289 <0.000026>
11:51:45.370294 sendto(9, "to-psql…", 186, 0, NULL, 0) = 186 <0.000042>
11:51:45.370405 recvfrom(9, 0xaaaade97b493, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
<0.000012>
11:51:45.370453 epoll_pwait(4, [{EPOLLIN, {u32=3733872600, u64=187650855034840}}], 1, -1, NULL, 8) = 1 <0.000107>
11:51:45.370623 recvfrom(9, "\27\3\3\0E", 5, 0, NULL, NULL) = 5 <0.000023>
11:51:45.370697 recvfrom(9,
"&3\224>\332\313OiS\312\272\326\270\313\221a\216\220w\357P\231z&\255K\17\331\263\32\367\212\31\344Qm\272C\236\316&0\375\225\224m\340\372\314\20]\345\251kaJ\210Y\344\33\277\270B\25E\241\251'\347",
69,0, NULL, NULL) = 69 <0.000015>
 
11:51:45.370823 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050077>
…

The interesting part is of course:

11:51:44.706471 futex(0xfff39363d338, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
<0.560006>
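
As far as I know, on Linux builds Pg's semaphores are plain POSIX semaphores, and a backend that has to sleep on a heavyweight lock or an LWLock ends up in sem_wait(), which strace shows as exactly this kind of untimed FUTEX_WAIT_BITSET - so this smells like a lock wait rather than anything I/O related. A quick way to confirm which code path it is would be to grab a backtrace while it sits in that wait - a sketch (needs gdb and, ideally, debug symbols for the postgresql-14 package):

| # 3452279 = the backend pid from the loop above; run this during one of the stalls
| gdb -p 3452279 -batch -ex bt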

Now, for the startup process around this time:

…
11:51:44.603450 read(9, "…", 1024) = 128 <0.000015>
11:51:44.603498 pread64(4, "…"..., 8192, 4349952) = 8192 <0.000019>
11:51:44.603558 pwrite64(219,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
8192,398286848) = 8192 <0.000038> 
11:51:44.603642 pread64(4, "…"..., 8192, 4358144) = 8192 <0.000691>
11:51:44.604394 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.604443 kill(1785191, SIGURG)   = 0 <0.000021>
11:51:44.604491 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.008037>
11:51:44.612568 read(9, "…", 1024) = 128 <0.000013>
11:51:44.612615 pread64(4, "…"..., 8192, 4358144) = 8192 <0.000016>
11:51:44.612693 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.612742 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.612787 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001810>
11:51:44.614628 read(9, "…", 1024) = 128 <0.000012>
11:51:44.614670 pread64(4, "…"..., 8192, 4358144) = 8192 <0.000019>
11:51:44.614775 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.614818 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.614861 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005601>
11:51:44.620497 read(9, "…", 1024) = 128 <0.000012>
11:51:44.620539 pread64(4, "…"..., 8192, 4358144) = 8192 <0.000524>
11:51:44.621098 pread64(4, "…"..., 8192, 4366336) = 8192 <0.000013>
11:51:44.621152 pread64(4, "…"..., 8192, 4374528) = 8192 <0.000014>
11:51:44.621242 pread64(4, "…"..., 8192, 4382720) = 8192 <0.000014>
11:51:44.621297 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.621344 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000610>
11:51:44.621985 read(9, "…", 1024) = 128 <0.000012>
11:51:44.622028 pread64(4, "…"..., 8192, 4382720) = 8192 <0.000014>
11:51:44.622075 pread64(4, "…"..., 8192, 4390912) = 8192 <0.000646>
11:51:44.622786 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.622831 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.622875 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006632>
11:51:44.629537 read(9, "…", 1024) = 128 <0.000012>
11:51:44.629580 pread64(4, "…"..., 8192, 4390912) = 8192 <0.000014>
11:51:44.629650 pread64(4, "…"..., 8192, 4399104) = 8192 <0.000014>
11:51:44.629720 pread64(4, "…"..., 8192, 4407296) = 8192 <0.000014>
11:51:44.629775 pread64(4, "…"..., 8192, 4415488) = 8192 <0.000604>
11:51:44.630423 pread64(4, "…"..., 8192, 4415488) = 8192 <0.000017>
11:51:44.630474 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.630520 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000012>
 
11:51:44.630563 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000791>
11:51:44.631384 read(9, "…", 1024) = 128 <0.000013>
11:51:44.631426 pread64(4, "…"..., 8192, 4415488) = 8192 <0.000014>
11:51:44.631476 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.631519 kill(1785191, SIGURG)   = 0 <0.000013>
11:51:44.631559 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000917>
11:51:44.632506 read(9, "…", 1024) = 128 <0.000012>
11:51:44.632546 pread64(4, "…"..., 8192, 4415488) = 8192 <0.000013>
11:51:44.632602 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.632645 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.632688 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.003932>
11:51:44.636650 read(9, "…", 1024) = 128 <0.000012>
11:51:44.636691 pread64(4, "…"..., 8192, 4415488) = 8192 <0.000587>
11:51:44.637334 pread64(4, "…"..., 8192, 4423680) = 8192 <0.000013>
11:51:44.637426 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.637469 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000796>
11:51:44.638295 read(9, "…", 1024) = 128 <0.000011>
11:51:44.638335 pread64(4, "…"..., 8192, 4431872) = 8192 <0.000013>
11:51:44.638436 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.638483 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.638524 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002996>
11:51:44.641552 read(9, "…", 1024) = 128 <0.000012>
11:51:44.641596 pread64(4, "…"..., 8192, 4431872) = 8192 <0.000014>
11:51:44.641656 pread64(4, "…"..., 8192, 4440064) = 8192 <0.000014>
11:51:44.641709 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.641755 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000838>
11:51:44.642622 read(9, "…", 1024) = 128 <0.000012>
11:51:44.642665 pread64(4, "…"..., 8192, 4440064) = 8192 <0.000014>
11:51:44.642724 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.642768 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.642810 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002441>
11:51:44.645305 read(9, "…", 1024) = 128 <0.000013>
11:51:44.645358 pread64(4, "…"..., 8192, 4440064) = 8192 <0.000546>
11:51:44.645936 pread64(4, "…"..., 8192, 4448256) = 8192 <0.000013>
11:51:44.646009 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.646057 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000813>
11:51:44.646900 read(9, "…", 1024) = 128 <0.000012>
11:51:44.646941 pread64(4, "…"..., 8192, 4448256) = 8192 <0.000013>
11:51:44.646995 pread64(4, "…"..., 8192, 4456448) = 8192 <0.000645>
11:51:44.647692 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.647736 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.647778 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004123>
11:51:44.651941 read(9, "…", 1024) = 128 <0.000015>
11:51:44.651988 pread64(4, "…"..., 8192, 4456448) = 8192 <0.000592>
11:51:44.652636 pread64(4, "…"..., 8192, 4464640) = 8192 <0.000016>
11:51:44.652715 pread64(4, "…"..., 8192, 4472832) = 8192 <0.000015>
11:51:44.652808 pread64(4, "…"..., 8192, 4481024) = 8192 <0.000015>
11:51:44.652874 pread64(4, "…"..., 8192, 4489216) = 8192 <0.000014>
11:51:44.652943 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.652987 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000564>
11:51:44.653577 read(9, "…", 1024) = 128 <0.000013>
11:51:44.653614 pread64(4, "…"..., 8192, 4489216) = 8192 <0.000571>
11:51:44.654210 pread64(4, "…"..., 8192, 4497408) = 8192 <0.000014>
11:51:44.654304 pread64(4, "…"..., 8192, 4505600) = 8192 <0.000014>
11:51:44.654384 pread64(4, "…"..., 8192, 4513792) = 8192 <0.000014>
11:51:44.654447 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014> 
11:51:44.654491 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000664>
11:51:44.655180 read(9, "…", 1024) = 128 <0.000012>
11:51:44.655222 pread64(4, "…"..., 8192, 4513792) = 8192 <0.000550>
11:51:44.655844 pread64(4, "…"..., 8192, 4521984) = 8192 <0.000023>
11:51:44.655927 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000032>
 
11:51:44.656008 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000798>
11:51:44.656838 read(9, "…", 1024) = 128 <0.000014>
11:51:44.656883 pread64(4, "…"..., 8192, 4521984) = 8192 <0.000559>
11:51:44.657497 pread64(4, "…"..., 8192, 4530176) = 8192 <0.000015>
11:51:44.657585 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000018>
 
11:51:44.657636 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000770>
11:51:44.658455 read(9, "…", 1024) = 128 <0.000013>
11:51:44.658499 pread64(4, "…"..., 8192, 4530176) = 8192 <0.000014>
11:51:44.658545 pread64(4, "…"..., 8192, 4538368) = 8192 <0.000014>
11:51:44.658605 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.658650 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.658696 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004334>
11:51:44.663061 read(9, "…", 1024) = 128 <0.000013>
11:51:44.663103 pread64(4, "…"..., 8192, 4538368) = 8192 <0.000014>
11:51:44.663152 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.663195 kill(1785191, SIGURG)   = 0 <0.000014>
11:51:44.663236 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001864>
11:51:44.665129 read(9, "…", 1024) = 128 <0.000012>
11:51:44.665171 pread64(4, "…"..., 8192, 4538368) = 8192 <0.000014>
11:51:44.665222 pread64(4, "…"..., 8192, 4546560) = 8192 <0.000014>
11:51:44.665317 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.665368 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000633>
11:51:44.666032 read(9, "…", 1024) = 128 <0.000013>
11:51:44.666075 pread64(4, "…"..., 8192, 4546560) = 8192 <0.000014>
11:51:44.666122 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.666168 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.666210 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001302>
11:51:44.667543 read(9, "…", 1024) = 128 <0.000012>
11:51:44.667586 pread64(4, "…"..., 8192, 4546560) = 8192 <0.000014>
11:51:44.667636 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.667680 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.667723 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002685>
11:51:44.670446 read(9, "…", 1024) = 128 <0.000029>
11:51:44.670531 pread64(4, "…"..., 8192, 4546560) = 8192 <0.000019>
11:51:44.670593 pread64(4, "…"..., 8192, 4554752) = 8192 <0.000650>
11:51:44.671357 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000018>
 
11:51:44.671414 pread64(4, "…"..., 8192, 4554752) = 8192 <0.000015>
11:51:44.671476 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.671520 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.671565 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004498>
11:51:44.676108 read(9, "…", 1024) = 128 <0.000013>
11:51:44.676156 pread64(4, "…"..., 8192, 4554752) = 8192 <0.000015>
11:51:44.676239 pread64(4, "…"..., 8192, 4562944) = 8192 <0.000015>
11:51:44.676344 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000025>
 
11:51:44.676406 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.676452 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001968>
11:51:44.678463 read(9, "…", 1024) = 128 <0.000014>
11:51:44.678509 pread64(4, "…"..., 8192, 4562944) = 8192 <0.000016>
11:51:44.678582 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.678629 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.678681 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005916>
11:51:44.684634 read(9, "…", 1024) = 128 <0.000015>
11:51:44.684681 pread64(4, "…"..., 8192, 4562944) = 8192 <0.000018>
11:51:44.684755 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000036>
 
11:51:44.684825 kill(1785191, SIGURG)   = 0 <0.000024>
11:51:44.684876 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.003147>
11:51:44.688056 read(9, "…", 1024) = 128 <0.000025>
11:51:44.688111 pread64(4, "…"..., 8192, 4562944) = 8192 <0.000015>
11:51:44.688159 pread64(4, "…"..., 8192, 4571136) = 8192 <0.000013>
11:51:44.688261 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.688309 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000644>
11:51:44.688983 read(9, "…", 1024) = 128 <0.000012>
11:51:44.689025 pread64(4, "…"..., 8192, 4571136) = 8192 <0.000014>
11:51:44.689086 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.689139 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000751>
11:51:44.689926 read(9, "…", 1024) = 128 <0.000015>
11:51:44.689977 pread64(4, "…"..., 8192, 4571136) = 8192 <0.000019>
11:51:44.690050 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.690100 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000908>
11:51:44.691039 read(9, "…", 1024) = 128 <0.000013>
11:51:44.691082 pread64(4, "…"..., 8192, 4571136) = 8192 <0.000014>
11:51:44.691137 pread64(523, "…"..., 8192, 4964352) = 8192 <0.000017>
11:51:44.691195 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.691240 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.691286 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001791>
11:51:44.693107 read(9, "…", 1024) = 128 <0.000012>
11:51:44.693150 pread64(4, "…"..., 8192, 4571136) = 8192 <0.000014>
11:51:44.693199 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000667>
11:51:44.693941 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.693986 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000810>
11:51:44.694829 read(9, "…", 1024) = 128 <0.000014>
11:51:44.694876 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000018>
11:51:44.694939 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.694988 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000780>
11:51:44.695800 read(9, "…", 1024) = 128 <0.000013>
11:51:44.695844 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000016>
11:51:44.695918 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.695965 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000802>
11:51:44.696797 read(9, "…", 1024) = 128 <0.000013>
11:51:44.696841 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000014>
11:51:44.696919 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.696963 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000730>
11:51:44.697724 read(9, "…", 1024) = 128 <0.000012>
11:51:44.697767 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000014>
11:51:44.697818 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.697872 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.697919 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002082>
11:51:44.700032 read(9, "…", 1024) = 128 <0.000014>
11:51:44.700076 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000015>
11:51:44.700125 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.700171 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000817>
11:51:44.701022 read(9, "…", 1024) = 128 <0.000013>
11:51:44.701066 pread64(4, "…"..., 8192, 4579328) = 8192 <0.000014>
11:51:44.701111 pread64(4, "…"..., 8192, 4587520) = 8192 <0.000014>
11:51:44.701163 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.701209 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000683>
11:51:44.701924 read(9, "…", 1024) = 128 <0.000013>
11:51:44.701966 pread64(4, "…"..., 8192, 4587520) = 8192 <0.000014>
11:51:44.702022 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.702067 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.702110 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001738>
11:51:44.703878 read(9, "…", 1024) = 128 <0.000013>
11:51:44.703933 pread64(4, "…"..., 8192, 4587520) = 8192 <0.000015>
11:51:44.703982 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.704028 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.704071 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.007049>
11:51:44.711185 read(9, "…", 1024) = 128 <0.000015>
11:51:44.711245 pread64(4, "…"..., 8192, 4587520) = 8192 <0.000016>
11:51:44.711373 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000972>
11:51:44.712392 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000019>
 
11:51:44.712447 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.712489 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.003785>
11:51:44.716310 read(9, "…", 1024) = 128 <0.000014>
11:51:44.716356 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000027>
11:51:44.716432 pwrite64(219,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
8192,398295040) = 8192 <0.000035>
 
11:51:44.716539 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000018>
 
11:51:44.716592 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.716638 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.013628>
11:51:44.730329 read(9, "…", 1024) = 128 <0.000015>
11:51:44.730387 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000016>
11:51:44.730449 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.730503 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.730546 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006824>
11:51:44.737453 read(9, "…", 1024) = 128 <0.000015>
11:51:44.737510 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000016>
11:51:44.737582 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000018>
 
11:51:44.737635 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.737681 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005750>
11:51:44.743490 read(9, "…", 1024) = 128 <0.000016>
11:51:44.743545 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000017>
11:51:44.743643 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.743696 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000697>
11:51:44.744442 read(9, "…", 1024) = 128 <0.000014>
11:51:44.744504 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000017>
11:51:44.744559 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.744611 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.744657 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.008226>
11:51:44.752946 read(9, "…", 1024) = 128 <0.000016>
11:51:44.753004 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000015>
11:51:44.753064 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.753117 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.753159 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006172>
11:51:44.759391 read(9, "…", 1024) = 128 <0.000013>
11:51:44.759444 pread64(4, "…"..., 8192, 4595712) = 8192 <0.000013>
11:51:44.759518 pread64(4, "…"..., 8192, 4603904) = 8192 <0.000019>
11:51:44.759599 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.759650 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000647>
11:51:44.760331 read(9, "…", 1024) = 128 <0.000012>
11:51:44.760388 pread64(4, "…"..., 8192, 4603904) = 8192 <0.000014>
11:51:44.760488 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.760534 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.760577 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004137>
11:51:44.764747 read(9, "…", 1024) = 128 <0.000013>
11:51:44.764791 pread64(4, "…"..., 8192, 4603904) = 8192 <0.000015>
11:51:44.764857 pread64(4, "…"..., 8192, 4612096) = 8192 <0.000015>
11:51:44.764905 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.764952 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.764995 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.007306>
11:51:44.772339 read(9, "…", 1024) = 128 <0.000016>
11:51:44.772390 pread64(4, "…"..., 8192, 4612096) = 8192 <0.000017>
11:51:44.772466 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000020>
 
11:51:44.772518 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.772565 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006657>
11:51:44.779258 read(9, "…", 1024) = 128 <0.000013>
11:51:44.779312 pread64(4, "…"..., 8192, 4612096) = 8192 <0.000015>
11:51:44.779366 pread64(4, "…"..., 8192, 4620288) = 8192 <0.000014>
11:51:44.779475 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.779524 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000798>
11:51:44.780352 read(9, "…", 1024) = 128 <0.000012>
11:51:44.780396 pread64(4, "…"..., 8192, 4620288) = 8192 <0.000014>
11:51:44.780445 pread64(4, "…"..., 8192, 4628480) = 8192 <0.000079>
11:51:44.780559 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.780605 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.780649 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.016560>
11:51:44.797272 read(9, "…", 1024) = 128 <0.000013>
11:51:44.797328 pread64(4, "…"..., 8192, 4628480) = 8192 <0.000014>
11:51:44.797430 pread64(4, "…"..., 8192, 4636672) = 8192 <0.000020>
11:51:44.797514 pread64(4, "…"..., 8192, 4644864) = 8192 <0.000015>
11:51:44.797593 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.797645 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.797690 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.022164>
11:51:44.819913 read(9, "…", 1024) = 128 <0.000016>
11:51:44.819969 pread64(4, "…"..., 8192, 4644864) = 8192 <0.000018>
11:51:44.820049 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.820102 kill(1785191, SIGURG)   = 0 <0.000019>
11:51:44.820149 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001903>
11:51:44.822097 read(9, "…", 1024) = 128 <0.000013>
11:51:44.822143 pread64(4, "…"..., 8192, 4644864) = 8192 <0.000015>
11:51:44.822189 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.822231 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.822274 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.009359>
11:51:44.832831 read(9, "…", 1024) = 128 <0.000017>
11:51:44.832891 pread64(4, "…"..., 8192, 4644864) = 8192 <0.000017>
11:51:44.832968 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000018>
 
11:51:44.833034 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.833081 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002325>
11:51:44.835464 read(9, "…", 1024) = 128 <0.000014>
11:51:44.835516 pread64(4, "…"..., 8192, 4644864) = 8192 <0.000014>
11:51:44.835577 pread64(4, "…"..., 8192, 4653056) = 8192 <0.000015>
11:51:44.835660 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.835713 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.835757 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.009059>
11:51:44.844872 read(9, "…", 1024) = 128 <0.000022>
11:51:44.844948 pread64(4, "…"..., 8192, 4653056) = 8192 <0.000015>
11:51:44.845028 pread64(4, "…"..., 8192, 4661248) = 8192 <0.000014>
11:51:44.845097 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.845147 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.845189 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.003170>
11:51:44.848401 read(9, "…", 1024) = 128 <0.000012>
11:51:44.848446 pread64(4, "…"..., 8192, 4661248) = 8192 <0.000013>
11:51:44.848502 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000014>
 
11:51:44.848547 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.848588 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.007233>
11:51:44.855856 read(9, "…", 1024) = 128 <0.000012>
11:51:44.855899 pread64(4, "…"..., 8192, 4661248) = 8192 <0.000014>
11:51:44.855984 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000014>
11:51:44.856037 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.856086 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.856127 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001785>
11:51:44.857943 read(9, "…", 1024) = 128 <0.000012>
11:51:44.857984 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000015>
11:51:44.858052 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000013>
 
11:51:44.858095 kill(1785191, SIGURG)   = 0 <0.000015>
11:51:44.858136 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.009127>
11:51:44.867318 read(9, "…", 1024) = 128 <0.000014>
11:51:44.867371 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000016>
11:51:44.867467 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.867521 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.867566 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.009935>
11:51:44.877535 read(9, "…", 1024) = 128 <0.000014>
11:51:44.877582 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000016>
11:51:44.877650 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.877699 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.877745 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002926>
11:51:44.880703 read(9, "…", 1024) = 128 <0.000015>
11:51:44.880749 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000015>
11:51:44.880802 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.880851 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.880894 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.008904>
11:51:44.889841 read(9, "…", 1024) = 128 <0.000014>
11:51:44.889888 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000015>
11:51:44.889944 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.889992 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.890035 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.008759>
11:51:44.898838 read(9, "…", 1024) = 128 <0.000013>
11:51:44.898881 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000014>
11:51:44.898941 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.898990 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000798>
11:51:44.899829 read(9, "…", 1024) = 128 <0.000014>
11:51:44.899874 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000015>
11:51:44.899928 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.899976 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.900020 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.013089>
11:51:44.913151 read(9, "…", 1024) = 128 <0.000014>
11:51:44.913199 pread64(4, "…"..., 8192, 4669440) = 8192 <0.000019>
11:51:44.913251 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.913302 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.913352 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000720>
11:51:44.914105 read(9, "…", 1024) = 128 <0.000013>
11:51:44.914159 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.914211 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.914259 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000788>
11:51:44.915084 read(9, "…", 1024) = 128 <0.000014>
11:51:44.915128 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000016>
11:51:44.915180 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.915227 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.915272 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006360>
11:51:44.921673 read(9, "…", 1024) = 128 <0.000014>
11:51:44.921718 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.921775 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.921825 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000806>
11:51:44.922663 read(9, "…", 1024) = 128 <0.000013>
11:51:44.922708 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.922787 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.922843 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.922887 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001993>
11:51:44.924912 read(9, "…", 1024) = 128 <0.000013>
11:51:44.924957 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.925006 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.925053 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.925097 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005110>
11:51:44.930239 read(9, "…", 1024) = 128 <0.000014>
11:51:44.930285 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.930343 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.930390 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.930434 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.001418>
11:51:44.931890 read(9, "…", 1024) = 128 <0.000013>
11:51:44.931935 pread64(4, "…"..., 8192, 4677632) = 8192 <0.000015>
11:51:44.931985 pread64(4, "…"..., 8192, 4685824) = 8192 <0.000015>
11:51:44.932074 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.932122 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.932165 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006029>
11:51:44.938243 read(9, "…", 1024) = 128 <0.000014>
11:51:44.938289 pread64(4, "…"..., 8192, 4685824) = 8192 <0.000015>
11:51:44.938342 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.938391 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.938435 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004870>
11:51:44.943345 read(9, "…", 1024) = 128 <0.000014>
11:51:44.943391 pread64(4, "…"..., 8192, 4685824) = 8192 <0.000015>
11:51:44.943449 pread64(4, "…"..., 8192, 4694016) = 8192 <0.000070>
11:51:44.943574 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.943622 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.943667 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.017585>
11:51:44.961298 read(9, "…", 1024) = 128 <0.000014>
11:51:44.961346 pread64(4, "…"..., 8192, 4694016) = 8192 <0.000016>
11:51:44.961437 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000016>
 
11:51:44.961487 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.961532 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.004287>
11:51:44.965860 read(9, "…", 1024) = 128 <0.000013>
11:51:44.965906 pread64(4, "…"..., 8192, 4694016) = 8192 <0.000015>
11:51:44.965967 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.966015 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.966061 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.006415>
11:51:44.972508 read(9, "…", 1024) = 128 <0.000013>
11:51:44.972552 pread64(4, "…"..., 8192, 4694016) = 8192 <0.000014>
11:51:44.972605 pread64(4, "…"..., 8192, 4702208) = 8192 <0.000015>
11:51:44.972658 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.972705 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.972748 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005758>
11:51:44.978548 read(9, "…", 1024) = 128 <0.000014>
11:51:44.978594 pread64(4, "…"..., 8192, 4702208) = 8192 <0.000015>
11:51:44.978642 pread64(4, "…"..., 8192, 4710400) = 8192 <0.000016>
11:51:44.978714 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.978763 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.978807 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.005587>
11:51:44.984426 read(9, "…", 1024) = 128 <0.000013>
11:51:44.984471 pread64(4, "…"..., 8192, 4710400) = 8192 <0.000015>
11:51:44.984557 pread64(4, "…"..., 8192, 4718592) = 8192 <0.000016>
11:51:44.984649 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.984696 kill(1785191, SIGURG)   = 0 <0.000017>
11:51:44.984741 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.003444>
11:51:44.988217 read(9, "…", 1024) = 128 <0.000014>
11:51:44.988262 pread64(4, "…"..., 8192, 4718592) = 8192 <0.000018>
11:51:44.988315 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.988362 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.000829>
11:51:44.989222 read(9, "…", 1024) = 128 <0.000014>
11:51:44.989268 pread64(4, "…"..., 8192, 4718592) = 8192 <0.000015>
11:51:44.989315 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000015>
 
11:51:44.989369 kill(1785191, SIGURG)   = 0 <0.000018>
11:51:44.989416 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.008689>
11:51:44.998139 read(9, "…", 1024) = 128 <0.000013>
11:51:44.998183 pread64(4, "…"..., 8192, 4718592) = 8192 <0.000014>
11:51:44.998233 newfstatat(AT_FDCWD, "/var/lib/postgresql/14/main/finish.recovery", 0xffffe5362908, 0) = -1 ENOENT (No
suchfile or directory) <0.000017>
 
11:51:44.998281 kill(1785191, SIGURG)   = 0 <0.000016>
11:51:44.998325 epoll_pwait(10, [{EPOLLIN, {u32=3733868344, u64=187650855030584}}], 1, 5000, NULL, 8) = 1 <0.002090>
…

Does it mean anything to you?

Best regards,

depesz