Logical Replication Memory Allocation Error - "invalid memory alloc request size"

Поиск
Список
Период
Сортировка
От Max Madden
Тема Logical Replication Memory Allocation Error - "invalid memory alloc request size"
Дата
Msg-id CAD1FGCT2sYrP_70RTuo56QTizyc+J3wJdtn2gtO3VttQFpdMZg@mail.gmail.com
обсуждение исходный текст
Ответы RE: Logical Replication Memory Allocation Error - "invalid memory alloc request size"
Список pgsql-general
Hello,

I'm encountering a consistent issue with PostgreSQL 15 logical replication and would appreciate any guidance on debugging or resolving this problem.

Setup:
- Source: PostgreSQL 15.x
- Target: PostgreSQL 15.x
- Replication: Logical replication using publication/subscription (pgoutput)
- Tables: 3 tables (details below)

Table Details:
- Table 1: ~1,300 records, 7 columns, no large objects 
- Table 2: ~100,000 records, 7 columns, no large objects
- Table 3: ~100,000 records, 17 columns, no large objects

Problem:

The initial snapshot and data copy complete successfully for all tables. However, anywhere from 5 minutes to 2 hours after the initial sync, the subscription consistently fails with memory allocation errors like:

```
2025-06-10 14:14:56.800 UTC [299] ERROR: could not receive data from WAL stream: ERROR: invalid memory alloc request size 1238451248
2025-06-10 14:14:56.805 UTC [1] LOG: background worker "logical replication worker" (PID 299) exited with exit code 1
```

This occurs whether I replicate all 3 tables together or individually.

My initial hypothesis is that large transactions are creating WAL segments that exceed memory limits when sent to the subscriber. However, I haven't been able to confirm this / find the cause.

Questions:

1. What's the best approach to debug this memory allocation issue?
2. Are there specific PostgreSQL settings I should check ?
3. How can I identify if large transactions are indeed the root cause?

Additional Context:
- This happens consistently across multiple replication attempts
- The error size varies but is always requesting > 1GB
- No custom logical replication settings currently applied
- Subscriber machine has 256 GB of RAM and Ubuntu 20.04
- Can recreate it on different machines

I should also mention that we're operating in a managed environment on DigitalOcean, which means we don't have direct access to the WAL logs on the publisher node. This is why the log information above is limited. I understand this constraint makes it more difficult to provide help, but I would really appreciate any insights or suggestions you might have.

Thanks,
 
Max

В списке pgsql-general по дате отправления: