Discussion: Fwd: Undeliverable: Re: Backend handling replication slot stuck using 100% cpu, unkillable

Fwd: Undeliverable: Re: Backend handling replication slot stuck using 100% cpu, unkillable

From
hubert depesz lubaczewski
Date:
Hi,
not sure if -www is the correct mailing list for problems with mailing
lists, but just in case - forwarding error mail that I'm getting when
I'm sending to pgsql-bugs.

Best regards,

depesz

mx.google.com rejected your message to the following email addresses:

diogojoliveira@gmail.com<mailto:diogojoliveira@gmail.com>
Your message wasn't delivered because the recipient's email provider rejected it.


mx.google.com gave this error:
This mail is unauthenticated, which poses a security risk to the sender and Gmail users, and has been blocked. The
sender must authenticate with at least one of SPF or DKIM. For this message, DKIM checks did not pass and SPF check for
[depesz.com] did not pass with ip: [2a01:111:f400:fe5b::205]. The sender should visit
https://support.google.com/mail/answer/81126#authentication for instructions on setting up authentication.
a15-20020aa7d90f000000b0051a47011c0dsi11808560edr.112 - gsmtp

Diagnostic information for administrators:

Generating server: CPWPR80MB7193.lamprd80.prod.outlook.com

diogojoliveira@gmail.com
mx.google.com
Remote server returned '550-5.7.26 This mail is unauthenticated, which poses a security risk to the 550-5.7.26 sender
and Gmail users, and has been blocked. The sender must 550-5.7.26 authenticate with at least one of SPF or DKIM. For
this message, 550-5.7.26 DKIM checks did not pass and SPF check for [depesz.com] did not pass 550-5.7.26 with ip:
[2a01:111:f400:fe5b::205]. The sender should visit 550-5.7.26
https://support.google.com/mail/answer/81126#authentication for 550 5.7.26 instructions on setting up authentication.
a15-20020aa7d90f000000b0051a47011c0dsi11808560edr.112 - gsmtp'

Original message headers:

Received: from CPWPR80MB7879.lamprd80.prod.outlook.com (2603:10d6:103:254::9)
 by CPWPR80MB7193.lamprd80.prod.outlook.com (2603:10d6:103:207::5) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6544.24; Mon, 3 Jul
 2023 12:58:23 +0000
Resent-From: <diogojoliveira@hotmail.com>
Received: from CPWPR80MB7879.lamprd80.prod.outlook.com ([::1]) by
 CPWPR80MB7879.lamprd80.prod.outlook.com ([fe80::1ef2:29e:4be3:56a4%6]) with
 Microsoft SMTP Server id 15.20.6544.026; Mon, 3 Jul 2023 12:58:23 +0000
Authentication-Results: spf=pass (sender IP is 217.196.149.56)
 smtp.mailfrom=lists.postgresql.org; dkim=fail (signature did not verify)
 header.d=depesz.com;dmarc=fail action=none header.from=depesz.com;
Received-SPF: Pass (protection.outlook.com: domain of lists.postgresql.org
 designates 217.196.149.56 as permitted sender)
 receiver=protection.outlook.com; client-ip=217.196.149.56;
 helo=malur.postgresql.org; pr=C
X-IncomingTopHeaderMarker:
OriginalChecksum:45D1A90A2847CEB34D9B9E090A791847762DA124B490A720F9A8B77EA65A9536;UpperCasedChecksum:90CF451873BA8732FC109637876DA48F5C7EA9B9877E92431978619EADD8E09B;SizeAsReceived:2766;Count:26
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=depesz.com;
        s=20170201; h=In-Reply-To:Content-Type:MIME-Version:References:Reply-To:
        Message-ID:Subject:To:Sender:From:Date:Cc:Content-Transfer-Encoding:
        Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender:
        Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:
        List-Subscribe:List-Post:List-Owner:List-Archive;
        bh=lfqQxXnbJXredxyMinU9Lvpzss4xGBa8BH80HgkaV2k=; b=pm/VUx1PRYAg01JyAfNdUP/rTH
        41jeGEJBXS6Rilzl8HB66EJEeEapyMIhyt/Zq33giGQ003vRhlMTf+M4JB0rBbu+2E6FyiPY0xwXR
        KCjaE5fqNq9XN3g/fTX72NGJFZ8lsoPOrbBjhXhXdEuwLi8Nb1z8SSmIBHdxQCdo6oqU=;
Date: Mon, 3 Jul 2023 14:58:07 +0200
From: hubert depesz lubaczewski <depesz@depesz.com>
Sender: depesz@depesz.com
To: pgsql-bugs mailing list <pgsql-bugs@postgresql.org>
Subject: Re: Backend handling replication slot stuck using 100% cpu,
 unkillable
Message-ID: <ZKLF3zT4kV+VUGjJ@depesz.com>
Reply-To: depesz@depesz.com
References: <ZKKywNsS9tR/3R80@depesz.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <ZKKywNsS9tR/3R80@depesz.com>
List-Id: <pgsql-bugs.lists.postgresql.org>
List-Help: <https://lists.postgresql.org/manage/>
List-Subscribe: <https://lists.postgresql.org/manage/>
List-Post: <mailto:pgsql-bugs@lists.postgresql.org>
List-Owner: <mailto:pgsql-bugs-owner@lists.postgresql.org>
List-Archive: <https://www.postgresql.org/list/pgsql-bugs>
Archived-At: <https://www.postgresql.org/message-id/ZKLF3zT4kV%2BVUGjJ%40depesz.com>
Precedence: bulk
List-Unsubscribe:
<https://lists.postgresql.org/unsub/37/05c145d04424123193777873bcfcdd5e494c95ea6a9dc1a15382665d8ba93cd8/>
X-IncomingHeaderCount: 26
Return-Path: depesz@depesz.com
X-EOPAttributedMessage: 0
X-EOPTenantAttributedMessage: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa:0
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic:
        DM3NAM02FT048:EE_|CPWPR80MB7701:EE_|CPWPR80MB7879:EE_|CPWPR80MB7193:EE_
X-MS-UserLastLogonTime: 7/2/2023 2:31:13 PM
X-MS-Office365-Filtering-Correlation-Id: f60e14d0-3b86-455a-8a9d-08db7bc52c61
X-MS-Exchange-EOPDirect: true
X-Sender-IP: 217.196.149.56
X-SID-PRA: DEPESZ@DEPESZ.COM
X-SID-Result: FAIL
X-Microsoft-Antispam: BCL:0;
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Jul 2023 12:58:18.6108
 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: f60e14d0-3b86-455a-8a9d-08db7bc52c61
X-MS-Exchange-CrossTenant-Id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa
X-MS-Exchange-CrossTenant-AuthSource: DM3NAM02FT048.eop-nam02.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: Internet
X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000
X-MS-Exchange-Transport-CrossTenantHeadersStamped: CPWPR80MB7701
X-MS-Exchange-Transport-EndToEndLatency: 00:00:03.7441398
X-MS-Exchange-Processed-By-BccFoldering: 15.20.6544.025
X-MS-Exchange-Inbox-Rules-Loop: diogojoliveira@hotmail.com
X-OriginatorOrg: sct-15-20-4755-11-msonline-outlook-2698a.templateTenant

On Mon, Jul 03, 2023 at 01:36:32PM +0200, hubert depesz lubaczewski wrote:
> Hi,
> we are using debezium to get change data from Pg.
> 
> This particular pg is 12.9, and will be soon upgrade to 14.something
> (this thursday).

So, I installed dbgsym for this pg, and this by accident upgraded pg to
12.14.

Now I do have debug symbols, though, so backtrace can be more
informative.

I ran:

for i in 1 2 3; do date; sudo gdb -batch -p 8938 -ex bt; sleep 30; echo; done

And got this:

#v+
Mon Jul  3 12:50:14 UTC 2023
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
hash_seq_search (status=status@entry=0xfffff0173f40) at ./build/../src/backend/utils/hash/dynahash.c:1439
1439    ./build/../src/backend/utils/hash/dynahash.c: No such file or directory.
#0  hash_seq_search (status=status@entry=0xfffff0173f40) at ./build/../src/backend/utils/hash/dynahash.c:1439
#1  0x0000ffffa1bc8714 in rel_sync_cache_publication_cb (arg=<optimized out>, cacheid=<optimized out>,
hashvalue=<optimized out>) at ./build/../src/backend/replication/pgoutput/pgoutput.c:665
#2  0x0000aaaab4bfcef4 in CallSyscacheCallbacks (cacheid=47, hashvalue=1542357812) at
./build/../src/backend/utils/cache/inval.c:1520
#3  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (rb=0xffffa4ac5308 <malloc+160>, txn=0xfffff0174240,
txn=0xfffff0174240) at ./build/../src/backend/replication/logical/reorderbuffer.c:2187
#4  ReorderBufferCommit (rb=0xffffa4ac5308 <malloc+160>, xid=xid@entry=2741814901, commit_lsn=187650155969544,
end_lsn=<optimized out>, commit_time=commit_time@entry=741514150878208, origin_id=origin_id@entry=0,
origin_lsn=origin_lsn@entry=0) at ./build/../src/backend/replication/logical/reorderbuffer.c:1816
#5  0x0000aaaab4a869bc in DecodeCommit (xid=2741814901, parsed=0xfffff0174390, buf=0xfffff0174510, ctx=0xaaaad5e1df00)
at ./build/../src/backend/replication/logical/decode.c:654
#6  DecodeXactOp (ctx=ctx@entry=0xaaaad5e1df00, buf=0xfffff0174510, buf@entry=0xfffff0174570) at
./build/../src/backend/replication/logical/decode.c:249
#7  0x0000aaaab4a86ad4 in LogicalDecodingProcessRecord (ctx=0xaaaad5e1df00, record=0xaaaad5e1e198) at
./build/../src/backend/replication/logical/decode.c:117
#8  0x0000aaaab4a996ec in XLogSendLogical () at ./build/../src/backend/replication/walsender.c:2883
#9  0x0000aaaab4a9bbb0 in WalSndLoop (send_data=send_data@entry=0xaaaab4a99688 <XLogSendLogical>) at
./build/../src/backend/replication/walsender.c:2232
#10 0x0000aaaab4a9c674 in StartLogicalReplication (cmd=0xaaaad5e47f90) at
./build/../src/backend/replication/walsender.c:1134
#11 exec_replication_command (cmd_string=cmd_string@entry=0xaaaad5d1db00 "START_REPLICATION SLOT \"slot_name\" LOGICAL
1D6C/92965050 (\"proto_version\" '1', \"publication_names\" 'xxx')") at
./build/../src/backend/replication/walsender.c:1602
#12 0x0000aaaab4af0c08 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaad5d7aaf8, dbname=<optimized out>,
username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4289
#13 0x0000aaaab4a759a8 in BackendRun (port=0xaaaad5d76150, port=0xaaaad5d76150) at
./build/../src/backend/postmaster/postmaster.c:4517
#14 BackendStartup (port=0xaaaad5d76150) at ./build/../src/backend/postmaster/postmaster.c:4200
#15 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#16 0x0000aaaab4a769d4 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:1398
#17 0x0000aaaab480355c in main (argc=5, argv=0xaaaad5d16720) at ./build/../src/backend/main/main.c:228

Mon Jul  3 12:50:45 UTC 2023
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
0x0000aaaab4c1e11c in hash_search_with_hash_value (hashp=0xaaaad5dea900, keyPtr=0xfffff0173f5c,
keyPtr@entry=0xfffff0173f7c, hashvalue=1107843932, action=action@entry=HASH_FIND, foundPtr=foundPtr@entry=0x0) at
./build/../src/backend/utils/hash/dynahash.c:949
949     ./build/../src/backend/utils/hash/dynahash.c: No such file or directory.
#0  0x0000aaaab4c1e11c in hash_search_with_hash_value (hashp=0xaaaad5dea900, keyPtr=0xfffff0173f5c,
keyPtr@entry=0xfffff0173f7c, hashvalue=1107843932, action=action@entry=HASH_FIND, foundPtr=foundPtr@entry=0x0) at
./build/../src/backend/utils/hash/dynahash.c:949
#1  0x0000aaaab4c1e79c in hash_search (hashp=<optimized out>, keyPtr=keyPtr@entry=0xfffff0173f7c,
action=action@entry=HASH_FIND, foundPtr=foundPtr@entry=0x0) at ./build/../src/backend/utils/hash/dynahash.c:911
#2  0x0000aaaab4c07180 in RelationCacheInvalidateEntry (relationId=<optimized out>) at
./build/../src/backend/utils/cache/relcache.c:2820
#3  0x0000aaaab4bfcfe0 in LocalExecuteInvalidationMessage (msg=0xffffa033ab88) at
./build/../src/backend/utils/cache/inval.c:603
#4  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (rb=0xffffa4ac5308 <malloc+160>, txn=0xfffff0174240,
txn=0xfffff0174240) at ./build/../src/backend/replication/logical/reorderbuffer.c:2187
#5  ReorderBufferCommit (rb=0xffffa4ac5308 <malloc+160>, xid=xid@entry=2741814901, commit_lsn=187650155969544,
end_lsn=<optimized out>, commit_time=commit_time@entry=741514150878208, origin_id=origin_id@entry=0,
origin_lsn=origin_lsn@entry=0) at ./build/../src/backend/replication/logical/reorderbuffer.c:1816
#6  0x0000aaaab4a869bc in DecodeCommit (xid=2741814901, parsed=0xfffff0174390, buf=0xfffff0174510, ctx=0xaaaad5e1df00)
at ./build/../src/backend/replication/logical/decode.c:654
#7  DecodeXactOp (ctx=ctx@entry=0xaaaad5e1df00, buf=0xfffff0174510, buf@entry=0xfffff0174570) at
./build/../src/backend/replication/logical/decode.c:249
#8  0x0000aaaab4a86ad4 in LogicalDecodingProcessRecord (ctx=0xaaaad5e1df00, record=0xaaaad5e1e198) at
./build/../src/backend/replication/logical/decode.c:117
#9  0x0000aaaab4a996ec in XLogSendLogical () at ./build/../src/backend/replication/walsender.c:2883
#10 0x0000aaaab4a9bbb0 in WalSndLoop (send_data=send_data@entry=0xaaaab4a99688 <XLogSendLogical>) at
./build/../src/backend/replication/walsender.c:2232
#11 0x0000aaaab4a9c674 in StartLogicalReplication (cmd=0xaaaad5e47f90) at
./build/../src/backend/replication/walsender.c:1134
#12 exec_replication_command (cmd_string=cmd_string@entry=0xaaaad5d1db00 "START_REPLICATION SLOT \"slot_name\" LOGICAL
1D6C/92965050 (\"proto_version\" '1', \"publication_names\" 'xxx')") at
./build/../src/backend/replication/walsender.c:1602
#13 0x0000aaaab4af0c08 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaad5d7aaf8, dbname=<optimized out>,
username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4289
#14 0x0000aaaab4a759a8 in BackendRun (port=0xaaaad5d76150, port=0xaaaad5d76150) at
./build/../src/backend/postmaster/postmaster.c:4517
#15 BackendStartup (port=0xaaaad5d76150) at ./build/../src/backend/postmaster/postmaster.c:4200
#16 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#17 0x0000aaaab4a769d4 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:1398
#18 0x0000aaaab480355c in main (argc=5, argv=0xaaaad5d16720) at ./build/../src/backend/main/main.c:228

Mon Jul  3 12:51:16 UTC 2023
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
hash_seq_search (status=status@entry=0xfffff0173f40) at ./build/../src/backend/utils/hash/dynahash.c:1439
1439    ./build/../src/backend/utils/hash/dynahash.c: No such file or directory.
#0  hash_seq_search (status=status@entry=0xfffff0173f40) at ./build/../src/backend/utils/hash/dynahash.c:1439
#1  0x0000ffffa1bc8714 in rel_sync_cache_publication_cb (arg=<optimized out>, cacheid=<optimized out>,
hashvalue=<optimized out>) at ./build/../src/backend/replication/pgoutput/pgoutput.c:665
#2  0x0000aaaab4bfcef4 in CallSyscacheCallbacks (cacheid=47, hashvalue=3011071378) at
./build/../src/backend/utils/cache/inval.c:1520
#3  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (rb=0xffffa4ac5308 <malloc+160>, txn=0xfffff0174240,
txn=0xfffff0174240) at ./build/../src/backend/replication/logical/reorderbuffer.c:2187
#4  ReorderBufferCommit (rb=0xffffa4ac5308 <malloc+160>, xid=xid@entry=2741814901, commit_lsn=187650155969544,
end_lsn=<optimized out>, commit_time=commit_time@entry=741514150878208, origin_id=origin_id@entry=0,
origin_lsn=origin_lsn@entry=0) at ./build/../src/backend/replication/logical/reorderbuffer.c:1816
#5  0x0000aaaab4a869bc in DecodeCommit (xid=2741814901, parsed=0xfffff0174390, buf=0xfffff0174510, ctx=0xaaaad5e1df00)
at ./build/../src/backend/replication/logical/decode.c:654
#6  DecodeXactOp (ctx=ctx@entry=0xaaaad5e1df00, buf=0xfffff0174510, buf@entry=0xfffff0174570) at
./build/../src/backend/replication/logical/decode.c:249
#7  0x0000aaaab4a86ad4 in LogicalDecodingProcessRecord (ctx=0xaaaad5e1df00, record=0xaaaad5e1e198) at
./build/../src/backend/replication/logical/decode.c:117
#8  0x0000aaaab4a996ec in XLogSendLogical () at ./build/../src/backend/replication/walsender.c:2883
#9  0x0000aaaab4a9bbb0 in WalSndLoop (send_data=send_data@entry=0xaaaab4a99688 <XLogSendLogical>) at
./build/../src/backend/replication/walsender.c:2232
#10 0x0000aaaab4a9c674 in StartLogicalReplication (cmd=0xaaaad5e47f90) at
./build/../src/backend/replication/walsender.c:1134
#11 exec_replication_command (cmd_string=cmd_string@entry=0xaaaad5d1db00 "START_REPLICATION SLOT \"slot_name\" LOGICAL
1D6C/92965050 (\"proto_version\" '1', \"publication_names\" 'xxx')") at
./build/../src/backend/replication/walsender.c:1602
#12 0x0000aaaab4af0c08 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaad5d7aaf8, dbname=<optimized out>,
username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4289
#13 0x0000aaaab4a759a8 in BackendRun (port=0xaaaad5d76150, port=0xaaaad5d76150) at
./build/../src/backend/postmaster/postmaster.c:4517
#14 BackendStartup (port=0xaaaad5d76150) at ./build/../src/backend/postmaster/postmaster.c:4200
#15 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#16 0x0000aaaab4a769d4 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:1398
#17 0x0000aaaab480355c in main (argc=5, argv=0xaaaad5d16720) at ./build/../src/backend/main/main.c:228
#v-
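
One way to read the three samples above is to compare which frames they share. The sketch below is only an illustration and was not part of the original mail: it extracts function names from gdb "bt" output and reports the run of outermost frames that every sample has in common. The two sample strings are heavily abbreviated from the backtraces above (arguments and most outer frames are elided), and the helper names are hypothetical.

```python
import re

# Matches gdb frame lines such as "#3  0x0000aaaa... in func (" or "#4  func (".
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\w+) \(")


def frame_functions(backtrace):
    """Return the function names of all frames, innermost first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def common_callers(samples):
    """Return the outermost frames shared by every sample, innermost first."""
    shared = []
    # Walk each stack from the outermost frame inwards until the samples diverge.
    for frames in zip(*(reversed(sample) for sample in samples)):
        if len(set(frames)) != 1:
            break
        shared.append(frames[0])
    return list(reversed(shared))


# Abbreviated copies of the first two samples; "..." stands for elided arguments.
SAMPLE_1 = """\
#0  hash_seq_search (...) at dynahash.c:1439
#1  0x0000ffffa1bc8714 in rel_sync_cache_publication_cb (...) at pgoutput.c:665
#2  0x0000aaaab4bfcef4 in CallSyscacheCallbacks (...) at inval.c:1520
#3  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (...) at reorderbuffer.c:2187
#4  ReorderBufferCommit (...) at reorderbuffer.c:1816
"""

SAMPLE_2 = """\
#0  0x0000aaaab4c1e11c in hash_search_with_hash_value (...) at dynahash.c:949
#1  0x0000aaaab4c1e79c in hash_search (...) at dynahash.c:911
#2  0x0000aaaab4c07180 in RelationCacheInvalidateEntry (...) at relcache.c:2820
#3  0x0000aaaab4bfcfe0 in LocalExecuteInvalidationMessage (...) at inval.c:603
#4  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (...) at reorderbuffer.c:2187
#5  ReorderBufferCommit (...) at reorderbuffer.c:1816
"""

stacks = [frame_functions(SAMPLE_1), frame_functions(SAMPLE_2)]
shared = common_callers(stacks)
print(shared)
```

On these abbreviated samples the shared frames start at ReorderBufferExecuteInvalidations, which is consistent with the backend staying inside that call for the same transaction while the frames above it change between samples.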

Based on a suggestion from IRC, I tried "return 0" and then "continue" in a gdb session.

backtrace changed to:

#v+
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
0x0000aaaab4bfcf84 in LocalExecuteInvalidationMessage (msg=0xffff9ff33e18) at
./build/../src/backend/utils/cache/inval.c:578
578     ./build/../src/backend/utils/cache/inval.c: No such file or directory.
#0  0x0000aaaab4bfcf84 in LocalExecuteInvalidationMessage (msg=0xffff9ff33e18) at
./build/../src/backend/utils/cache/inval.c:578
#1  0x0000aaaab4a91884 in ReorderBufferExecuteInvalidations (rb=0xffffa4ac5308 <malloc+160>, txn=0xfffff0174240,
txn=0xfffff0174240) at ./build/../src/backend/replication/logical/reorderbuffer.c:2187
#2  ReorderBufferCommit (rb=0xffffa4ac5308 <malloc+160>, xid=xid@entry=2741814901, commit_lsn=187650155969544,
end_lsn=<optimized out>, commit_time=commit_time@entry=741514150878208, origin_id=origin_id@entry=0,
origin_lsn=origin_lsn@entry=0) at ./build/../src/backend/replication/logical/reorderbuffer.c:1816
#3  0x0000aaaab4a869bc in DecodeCommit (xid=2741814901, parsed=0xfffff0174390, buf=0xfffff0174510, ctx=0xaaaad5e1df00)
at ./build/../src/backend/replication/logical/decode.c:654
#4  DecodeXactOp (ctx=ctx@entry=0xaaaad5e1df00, buf=0xfffff0174510, buf@entry=0xfffff0174570) at
./build/../src/backend/replication/logical/decode.c:249
#5  0x0000aaaab4a86ad4 in LogicalDecodingProcessRecord (ctx=0xaaaad5e1df00, record=0xaaaad5e1e198) at
./build/../src/backend/replication/logical/decode.c:117
#6  0x0000aaaab4a996ec in XLogSendLogical () at ./build/../src/backend/replication/walsender.c:2883
#7  0x0000aaaab4a9bbb0 in WalSndLoop (send_data=send_data@entry=0xaaaab4a99688 <XLogSendLogical>) at
./build/../src/backend/replication/walsender.c:2232
#8  0x0000aaaab4a9c674 in StartLogicalReplication (cmd=0xaaaad5e47f90) at
./build/../src/backend/replication/walsender.c:1134
#9  exec_replication_command (cmd_string=cmd_string@entry=0xaaaad5d1db00 "START_REPLICATION SLOT
\"data_access_platform_cdc\" LOGICAL 1D6C/92965050 (\"proto_version\" '1', \"publication_names\" 'cdc')") at
./build/../src/backend/replication/walsender.c:1602
#10 0x0000aaaab4af0c08 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaad5d7aaf8, dbname=<optimized out>,
username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4289
#11 0x0000aaaab4a759a8 in BackendRun (port=0xaaaad5d76150, port=0xaaaad5d76150) at
./build/../src/backend/postmaster/postmaster.c:4517
#12 BackendStartup (port=0xaaaad5d76150) at ./build/../src/backend/postmaster/postmaster.c:4200
#13 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#14 0x0000aaaab4a769d4 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:1398
#15 0x0000aaaab480355c in main (argc=5, argv=0xaaaad5d16720) at ./build/../src/backend/main/main.c:228
#v-

and then it went back to hash_seq_search  :(

Anything I can do about it?

Best regards,

depesz





Re: Undeliverable: Re: Backend handling replication slot stuck using 100% cpu, unkillable

From
Daniel Gustafsson
Date:
> On 3 Jul 2023, at 15:05, hubert depesz lubaczewski <depesz@depesz.com> wrote:

> not sure if -www is the correct mailing list for problems with mailing
> lists,

It is.

> forwarding error mail that I'm getting when
> I'm sending to pgsql-bugs.

This is fairly common, IIUC GMail believes that the list sending email as you
is violating the SPF configuration for @depesz.com.

--
Daniel Gustafsson




Daniel Gustafsson <daniel@yesql.se> writes:
>> On 3 Jul 2023, at 15:05, hubert depesz lubaczewski <depesz@depesz.com> wrote:
>> forwarding error mail that I'm getting when
>> I'm sending to pgsql-bugs.

> This is fairly common, IIUC GMail believes that the list sending email as you
> is violating the SPF configuration for @depesz.com.

I get similar gripes on a routine basis from diogojoliveira and some
other addresses.  As near as I can tell, the actual problem is that
these people have arranged to forward list mail from their subscribed
account to gmail, and the forwarding is being done in a way that
makes it have the original sender's envelope FROM (... not the
list's envelope FROM, nor the forwarding person's).  But it's visibly
coming from the forwarding machine.  If there's a hard SPF policy for
the envelope sender's domain, kaboom!
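
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: real SPF policies are published in DNS TXT records and evaluated by the receiving server, whereas here the policy table, the domains, and the addresses are all made-up documentation values, and `spf_check` is a hypothetical helper.

```python
import ipaddress

# Hypothetical SPF policies: the networks each domain authorises to send its mail.
# In reality these come from DNS TXT records; the values here are invented.
SPF_ALLOWED = {
    "author.example": ["192.0.2.0/24"],
    "list.example": ["198.51.100.0/24"],
    "forwarder.example": ["203.0.113.0/24"],
}


def spf_check(envelope_from, connecting_ip):
    """Check the connecting IP against the policy of the envelope sender's domain."""
    domain = envelope_from.rsplit("@", 1)[-1].lower()
    networks = SPF_ALLOWED.get(domain)
    if networks is None:
        return "none"  # the domain publishes no policy in this toy table
    ip = ipaddress.ip_address(connecting_ip)
    if any(ip in ipaddress.ip_network(net) for net in networks):
        return "pass"
    return "fail"


FORWARDER_IP = "203.0.113.7"  # the machine that actually connects to the final receiver

# Forwarding that keeps the original author's envelope FROM: the receiver looks up
# the author's policy, which does not list the forwarding machine.
kept_author = spf_check("someone@author.example", FORWARDER_IP)

# Forwarding that rewrites the envelope FROM to the forwarder's own domain.
rewritten = spf_check("subscriber@forwarder.example", FORWARDER_IP)

print(kept_author, rewritten)
```

In this model the first delivery fails the check and the second passes, which matches the description above: the message is visibly coming from the forwarding machine, but is checked against the policy of the original sender's domain.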

gmail is not doing anything except what you told them to, ie
believe that this is a forgery.  The fault is in the person's
custom forwarding arrangements.  I gather unfortunately that
this is quite hard to do correctly.

(Personally, I've given up and just added spam filtering rules
to bit-bucket these reports.  Those folks are probably not
seeing any list mail from me, but that's their problem not mine.)

            regards, tom lane



Re: Undeliverable: Re: Backend handling replication slot stuck using 100% cpu, unkillable

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:
> >> On 3 Jul 2023, at 15:05, hubert depesz lubaczewski <depesz@depesz.com> wrote:
> >> forwarding error mail that I'm getting when
> >> I'm sending to pgsql-bugs.
>
> > This is fairly common, IIUC GMail believes that the list sending email as you
> > is violating the SPF configuration for @depesz.com.
>
> I get similar gripes on a routine basis from diogojoliveira and some
> other addresses.  As near as I can tell, the actual problem is that
> these people have arranged to forward list mail from their subscribed
> account to gmail, and the forwarding is being done in a way that
> makes it have the original sender's envelope FROM (... not the
> list's envelope FROM, nor the forwarding person's).  But it's visibly
> coming from the forwarding machine.  If there's a hard SPF policy for
> the envelope sender's domain, kaboom!

There's certainly up-sides and down-sides to rewriting FROM and From
lines.  Generally speaking, the kind of forwarding that doesn't change
the email at all works pretty well and is exactly what the mailing lists
do and is what gmail recommends when forwarding to them, because it
doesn't end up breaking DKIM.  The issue is that when the emails aren't
DKIM signed then there's no way to verify that they haven't been changed
by the forwarder and when there's an SPF rule saying to bounce those
emails, that's what happens.
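
As a rough sketch of why unmodified forwarding matters, the snippet below models only the body-hash part of DKIM and the "at least one of SPF or DKIM" rule quoted in the bounce text earlier in this thread. It is an illustration, not a DKIM implementation: real DKIM also canonicalises the message and verifies a public-key signature over selected headers, and the function names and message bodies here are hypothetical.

```python
import base64
import hashlib


def body_hash(body):
    """Base64 SHA-256 digest of the body, similar in spirit to DKIM's bh= tag.

    Real DKIM canonicalises the body first; that step is skipped in this sketch.
    """
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def dkim_body_intact(signed_bh, received_body):
    """True if the received body still matches the hash recorded at signing time."""
    return body_hash(received_body) == signed_bh


def receiver_accepts(spf_pass, dkim_pass):
    """Accept when at least one of SPF or DKIM passes."""
    return spf_pass or dkim_pass


original = b"Hi,\r\nsome list mail\r\n"
signed_bh = body_hash(original)  # recorded by the signing domain

# Forwarded unchanged by a host that is not in the sender's SPF policy:
# SPF fails, but the body hash still matches.
unchanged = receiver_accepts(False, dkim_body_intact(signed_bh, original))

# Forwarded by the same host after a footer was appended: both checks fail.
altered_body = original + b"-- \r\nforwarder footer\r\n"
modified = receiver_accepts(False, dkim_body_intact(signed_bh, altered_body))

print(unchanged, modified)
```

With no usable DKIM result at all, the outcome is the same as in the second case: SPF alone decides, and a forwarder that is not listed in the sender's policy gets the message rejected.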

It's also possible to set up ARC on the forwarder to provide assurance
that the forwarder validated the email when it arrived and to claim that
to the end system, but that only works if the end system trusts the
forwarding system and that doesn't tend to happen across organizations
(gmail may trust its own ARC signatures and so email that goes from a
random system to gmail and which gmail validates and then forwards on
while adding their ARC signature but breaking DKIM can be accepted by
gmail still, but my own efforts to get gmail to accept my ARC signatures
have gone exactly nowhere).

I've also looked into trying to not send bounces when this happens but
unfortunately there doesn't seem to be an easy way to make that happen
except to disable bounce reports from being generated at all, which
would be far worse.

For better or worse, these days if you care about delivery and avoiding
bounces, you pretty much have to be doing SPF+DKIM+DMARC with all the
annoyance that entails.  If you don't care much about delivery then
you can expect to get such bounces.

Thanks,

Stephen
