Discussion: Logical Replication 08P01 invalid memory alloc request size 1095736448

Hello all,
We have this setup:
PostgreSQL 15.13 on RHEL 9.6

The Logical Master also has 2 streaming replicas.
The Logical Replica also has 1 streaming replica.

We just applied the patch update from 15.12 to 15.13, restarting the Logical Master and then the Logical Replica, and logical replication suddenly stopped working. Streaming replication on both is fine.

The message on the Logical Replica:
ERROR,08P01,"could not receive data from WAL stream: ERROR:  invalid memory alloc request size 1095736448",,,,,,,,,"","logical replication worker",,0

The messages on the Logical Master:
"START_REPLICATION", LOG,00000,"starting logical decoding for slot ""_subscription""","Streaming transactions committing after 33D45/373C2D98, reading WAL from 33D44/EF3219B0.",,,,,"START_REPLICATION SLOT ""_subscription"" LOGICAL 33D45/373C2D98 (proto_version '3', publication_names '""_publication""')",,,"_subscription","walsender",,0

"START_REPLICATION", ,LOG,00000,"logical decoding found consistent point at 33D44/EF3219B0","Logical decoding will begin using saved snapshot.",,,,,"START_REPLICATION SLOT ""_subscription"" LOGICAL 33D45/373C2D98 (proto_version '3', publication_names '""_publication""')",,,"_subscription","walsender",,0

Looking it up, error code 08P01 is protocol_violation.

Any idea what is causing this and how to resolve it?

--
Regards,

Soni Maula Harriz
Some updates:
Checking the LSN difference from 33D44/EF3219B0 to 33D45/373C2D98:
 pg_wal_lsn_diff
-----------------
      1208620008
which is quite different from the memory alloc request size of 1095736448.
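
As a cross-check, the same arithmetic can be sketched in Python. An LSN is a 64-bit WAL position written as two hexadecimal halves separated by a slash. The helper names parse_lsn and lsn_diff are illustrative only, and MAX_ALLOC_SIZE is an assumption taken to mirror PostgreSQL's MaxAllocSize (0x3fffffff), the limit above which an ordinary allocation is refused with "invalid memory alloc request size".

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '33D45/373C2D98' to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(a: str, b: str) -> int:
    """Byte distance between two LSNs, like pg_wal_lsn_diff(a, b)."""
    return parse_lsn(a) - parse_lsn(b)


# Assumption: mirrors MaxAllocSize in the PostgreSQL source (1 GB - 1).
MAX_ALLOC_SIZE = 0x3FFFFFFF

diff = lsn_diff("33D45/373C2D98", "33D44/EF3219B0")
requested = 1095736448  # size reported in the error message

print(diff)                        # 1208620008
print(requested > MAX_ALLOC_SIZE)  # True
```

The difference agrees with the pg_wal_lsn_diff result, so the requested size is not simply the amount of WAL between the two positions. If the assumed limit is right, the request is just over the 1 GB ceiling, which would be why it is rejected outright.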

Digging deeper into the debug messages:

DEBUG,00000,"find_in_dynamic_libpath: trying ""/usr/pgsql-15/lib/libpqwalreceiver""",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"find_in_dynamic_libpath: trying ""/usr/pgsql-15/lib/libpqwalreceiver.so""",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"InitPostgres",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"my backend ID is 3",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
LOG,00000,"logical replication apply worker for subscription ""consprod_subscription"" has started",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"connecting to publisher using connection string ""host=10.2.5.43 port=5432 user=replicator dbname=consprod""",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"logical replication apply worker for subscription ""consprod_subscription"" two_phase is DISABLED",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"sending feedback (force 0) to recv 33D45/373C2D98, write 33D45/373C2D98, flush 33D45/373C2D98",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"sending feedback (force 0) to recv 33D45/373C2D98, write 33D45/373C2D98, flush 33D45/373C2D98",,,,,,,,,"","logical replication worker",,0
ERROR,08P01,"could not receive data from WAL stream: ERROR:  invalid memory alloc request size 1095736448",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"shmem_exit(1): 5 before_shmem_exit callbacks to make",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"shmem_exit(1): 7 on_shmem_exit callbacks to make",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"proc_exit(1): 1 callbacks to make",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"exit(1)",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"shmem_exit(-1): 0 before_shmem_exit callbacks to make",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"shmem_exit(-1): 0 on_shmem_exit callbacks to make",,,,,,,,,"","logical replication worker",,0
DEBUG,00000,"proc_exit(-1): 0 callbacks to make",,,,,,,,,"","logical replication worker",,0
LOG,00000,"background worker ""logical replication worker"" (PID 62160) exited with exit code 1",,,,,,,,,"","postmaster",,0

Please help

--
Regards,

Soni Maula Harriz
Soni M <diptatapa@gmail.com> writes:
> We just do patch updates from 15.12 to 15.13, restart Logical Master and
> then Logical Replica, and suddenly Logical replication stops working.
> Streaming replication of both are fine.

This sounds like the same bug previously discussed in

https://www.postgresql.org/message-id/flat/680bdaf6-f7d1-4536-b580-05c2760c67c6%40deepbluecap.com

We'll have a fix in next month's quarterly releases.  For the moment,
you could either roll back to 15.12 or cherry-pick the fix at
commit fc0fb77c5.

            regards, tom lane



Thanks Tom, really appreciate it.

--
Regards,

Soni Maula Harriz