Discussion: Logical replication halted due to "this slot has been invalidated because it exceeded the maximum reserved size."


Hello lists,

 

We have two PostgreSQL 14.3 instances on Red Hat Linux 8.5, running two different databases on VMs, with logical replication between the databases.

 

Recently we have been bitten by a recurring issue in these databases.

 

 

Because we use logical replication between these two systems, we have had to rebuild it by dropping the subscription, which also drops the logical replication slot on the primary. After that the issue does not occur for some time, but it eventually repeats.

 

This time it recurred 3 days after logical replication was rebuilt.

Last time it took 2-3 months until it repeated.

 

We started to get these errors on the standby (the subscriber):

 

2022-11-29 10:21:06.940 EET [1698404] ERROR:  could not start WAL streaming: ERROR:  cannot read from logical replication slot ”logs”

        DETAIL:  This slot has been invalidated because it exceeded the maximum reserved size.

2022-11-29 10:21:06.942 EET [1698368] LOG:  background worker "logical replication worker" (PID 1698404) exited with exit code 1

 

This happens even though max_slot_wal_keep_size is -1, which according to the manual should mean there is no limit:

 

If max_slot_wal_keep_size is -1 (the default), replication slots may retain an unlimited amount of WAL files

 

https://postgresqlco.nf/doc/en/param/max_slot_wal_keep_size/

 

 

psql (14.3)

Type "help" for help.

 

postgres=# show max_slot_wal_keep_size;

 max_slot_wal_keep_size 

------------------------

 -1

(1 row)
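
A check that may help narrow this down, sketched under two assumptions: that it is run on the publisher (the server where the slot "logs" lives, since max_slot_wal_keep_size takes effect there), and that the server is PostgreSQL 13 or later, where the pg_replication_slots view exposes wal_status and safe_wal_size. The retained_wal alias is only an illustrative column name.

```sql
-- Run on the publisher, where the slot "logs" is defined.
SHOW max_slot_wal_keep_size;

SELECT slot_name,
       active,
       wal_status,      -- reserved, extended, unreserved, or lost
       safe_wal_size,   -- NULL when the slot is lost or max_slot_wal_keep_size is -1
       restart_lsn,     -- oldest WAL position the slot still needs
       -- WAL volume between the current write position and restart_lsn
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_name = 'logs';
```

If wal_status moves from reserved towards unreserved or lost while the publisher reports -1, that would be worth including in a bug report. If the publisher reports a different value than the session above, the limit is coming from that server's configuration.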

 

 

Why is this happening? Is this a bug in PG 14.3?

 

Our workaround for the time being is:

 

DB=# alter subscription logs disable;

ALTER SUBSCRIPTION

SN4ReportingDB=# drop subscription logs;

NOTICE:  dropped replication slot ”logs” on publisher

DROP SUBSCRIPTION

SN4ReportingDB=# create subscription logs connection 'dbname=DBNAMAE host=192.168.1.1 port=5000 user=postgres' publication log pub  with (copy_data=false);

NOTICE:  created replication slot ”logs” on publisher

CREATE SUBSCRIPTION

 

But this is a live system with a terabyte of data, so some data will be lost unless we rebuild the whole replication from scratch, and that is not acceptable!

 

 

Any advice?

 

Regards,

Viljo Hakala

 
