Thread: struggling with logical replication and WAL segments removed even with max_slot_wal_keep_size = -1

All;


We have been fighting logical replication for some time. Postgres v14, 
max_slot_wal_keep_size = -1, but I still get this kind of error on the 
publisher every few days:

2026-01-03 18:21:00.497 UTC [770693] [unknown]@[unknown] LOG: connection received: host=10-173-12-128.masked.cc.com port=50000
2026-01-03 18:21:00.498 UTC [770693] postgres@cc LOG:  replication connection authorized: user=postgres application_name=pg_1415516682_sync_1415489254_7242369616429431728
2026-01-03 18:21:01.301 UTC [770693] postgres@cc LOG:  received replication command: START_REPLICATION SLOT "pg_1415516682_sync_1415489254_7242369616429431728" LOGICAL 9D6/AB9CE2B8 (proto_version '2', publication_names '"sit_cc_ch_pub"')
2026-01-03 18:21:01.301 UTC [770693] postgres@cc STATEMENT: START_REPLICATION SLOT "pg_1415516682_sync_1415489254_7242369616429431728" LOGICAL 9D6/AB9CE2B8 (proto_version '2', publication_names '"sit_cc_ch_pub"')
2026-01-03 18:21:01.301 UTC [770693] postgres@cc LOG:  starting logical decoding for slot "pg_1415516682_sync_1415489254_7242369616429431728"
2026-01-03 18:21:01.301 UTC [770693] postgres@cc DETAIL: Streaming transactions committing after 9D6/AB9CE2B8, reading WAL from 9D6/AB9CE280.
2026-01-03 18:21:01.301 UTC [770693] postgres@cc STATEMENT: START_REPLICATION SLOT "pg_1415516682_sync_1415489254_7242369616429431728" LOGICAL 9D6/AB9CE2B8 (proto_version '2', publication_names '"sit_cc_ch_pub"')
2026-01-03 18:21:01.301 UTC [770693] postgres@cc ERROR:  requested WAL segment 00000002000009D6000000AB has already been removed
2026-01-03 18:21:01.301 UTC [770693] postgres@cc STATEMENT: START_REPLICATION SLOT "pg_1415516682_sync_1415489254_7242369616429431728" LOGICAL 9D6/AB9CE2B8 (proto_version '2', publication_names '"sit_cc_ch_pub"')
2026-01-03 18:21:01.301 UTC [770693] postgres@cc LOG: disconnection: session time: 0:00:00.805 user=postgres database=cc host=10-173-12-128.ssnc-corp.cloud port=50000



Any advice would be greatly appreciated. Thanks in advance.




On Sat, 2026-01-03 at 11:28 -0700, Sbob wrote:
> We have been fighting logical replication for some time. Postgres v14, 
> max_slot_wal_keep_size = -1, but I still get this kind of error on the
> publisher every few days:
>
> ERROR:  requested WAL segment 00000002000009D6000000AB has already been removed

What is the "wal_status" of the replication slot in "pg_replication_slots"?

Unless you are hitting a bug (what is your minor version?), I'd say that some
rogue software is removing files from "pg_wal".

Compare the "restart_lsn" from "pg_replication_slots" with the available
WAL segments on the primary.

Yours,
Laurenz Albe
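
[Note for readers of the archive, not part of the original messages: the
comparison suggested above comes down to turning an LSN into a WAL file name.
The sketch below is one way to do that by hand. It assumes the default 16 MB
WAL segment size and takes the timeline (2) from the first eight hex digits of
the file name in the error message. The helper names are made up for this
illustration. The SQL string is only a suggestion and is not executed here; it
uses the standard "pg_replication_slots" view and "pg_walfile_name()", and its
result can be compared against the contents of "pg_wal" or the output of
"pg_ls_waldir()".]

```python
# Sketch only: map an LSN to the name of the WAL segment file that contains it.
# Assumption: default 16 MB wal_segment_size (differs if initdb used --wal-segsize).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Return the 24-hex-digit WAL file name holding `lsn` on `timeline`."""
    high_hex, low_hex = lsn.split("/")
    # An LSN is a 64-bit byte position printed as two 32-bit hex halves.
    position = (int(high_hex, 16) << 32) | int(low_hex, 16)
    segment_number = position // segment_size
    segments_per_4gb = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_4gb,
        segment_number % segments_per_4gb,
    )


def needed_segment_is_present(timeline, restart_lsn, wal_files):
    """True if the segment containing restart_lsn is among the given file names."""
    return wal_segment_name(timeline, restart_lsn) in set(wal_files)


# Suggested query for the publisher (hypothetical usage, not run by this script).
SLOT_CHECK_SQL = """
SELECT slot_name, active, wal_status, restart_lsn,
       pg_walfile_name(restart_lsn) AS oldest_needed_segment
FROM pg_replication_slots;
"""

if __name__ == "__main__":
    # Values from the log: timeline 2, "reading WAL from 9D6/AB9CE280".
    print(wal_segment_name(2, "9D6/AB9CE280"))
```

[With the values from the log, this yields the same file name that the ERROR
line reports as removed, so the slot was asking for the segment that holds its
own starting position.]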



On 1/4/26 9:29 AM, Laurenz Albe wrote:
> On Sat, 2026-01-03 at 11:28 -0700, Sbob wrote:
>> We have been fighting logical replication for some time. Postgres v14,
>> max_slot_wal_keep_size = -1, but I still get this kind of error on the
>> publisher every few days:
>>
>> ERROR:  requested WAL segment 00000002000009D6000000AB has already been removed
> What is the "wal_status" of the replication slot in "pg_replication_slots"?
>
> Unless you are hitting a bug (what is your minor version?), I'd say that some
> rogue software is removing files from "pg_wal".
>
> Compare the "restart_lsn" from "pg_replication_slots" with the available
> WAL segments on the primary.
>
> Yours,
> Laurenz Albe


We are running version 14.19

I will check the "wal_status" of the slot next time this happens.

Thanks