Re: Logical Replica ReorderBuffer Size Accounting Issues
From: Amit Kapila
Subject: Re: Logical Replica ReorderBuffer Size Accounting Issues
Date:
Msg-id: CAA4eK1JjxNFGkDHLSSecWWD3nP+1KE4M=4G-AX2D2S+K_=m09w@mail.gmail.com
In reply to: Logical Replica ReorderBuffer Size Accounting Issues (Alex Richman <alexrichman@onesignal.com>)
Responses: Re: Logical Replica ReorderBuffer Size Accounting Issues
List: pgsql-bugs
On Thu, Jan 5, 2023 at 5:27 PM Alex Richman <alexrichman@onesignal.com> wrote:
>
> We've noticed an odd memory issue with walsenders for logical replication slots - they experience large spikes in memory usage up to ~10x over the baseline, from ~500MiB to ~5GiB, exceeding the configured logical_decoding_work_mem. Since we have ~40 active subscriptions this produces a spike of ~200GiB on the sender, which is quite worrying.
>
> The spikes in memory always slowly ramp up to ~5GB over ~10 minutes, then quickly drop back down to the ~500MB baseline.
>
> logical_decoding_work_mem is configured to 256MB, and streaming is configured on the subscription side, so I would expect the slots to either stream or spill bytes to disk when they get to the 256MB limit, and not get close to 5GiB. However pg_stat_replication_slots shows 0 spilled or streamed bytes for any slots.
>
> I used GDB to call MemoryContextStats on a walsender process with 5GB usage, which logged this large reorderbuffer context:
> --- snip ---
> ReorderBuffer: 65536 total in 4 blocks; 64624 free (169 chunks); 912 used
>   ReorderBufferByXid: 32768 total in 3 blocks; 12600 free (6 chunks); 20168 used
>   Tuples: 4311744512 total in 514 blocks (12858943 chunks); 6771224 free (12855411 chunks); 4304973288 used
>   TXN: 16944 total in 2 blocks; 13984 free (46 chunks); 2960 used
>   Change: 574944 total in 70 blocks; 214944 free (2239 chunks); 360000 used
> --- snip ---
>
> It's my understanding that the reorder buffer context is the thing that logical_decoding_work_mem specifically constrains, so it's surprising to see it holding onto ~4GB of tuples instead of spooling them. I found the code for that here: https://github.com/postgres/postgres/blob/eb5ad4ff05fd382ac98cab60b82f7fd6ce4cfeb8/src/backend/replication/logical/reorderbuffer.c#L3557 which suggests it's checking rb->size against the configured work_mem.
>
> I then used GDB to break into a high-memory walsender and grab rb->size, which was only 73944. So it looks like the tuple memory isn't being properly accounted for in the total reorderbuffer size, so nothing is getting streamed/spooled?
>

One possible reason for this difference is that the memory allocated to decode the tuple from WAL in the function ReorderBufferGetTupleBuf() is different from the actual memory required/accounted for the tuple in the function ReorderBufferChangeSize(). Do you have any sample data to confirm this? If you can't share sample data, can you let us know the average tuple size?

--
With Regards,
Amit Kapila.
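
To make the suspected mismatch concrete, here is a small standalone C sketch. It is not taken from reorderbuffer.c; the 8kB buffer size, the loop count, and the variable names are all invented for illustration. It models an allocator that reserves a fixed-size buffer for every tuple while the size tracker only adds up the logical tuple lengths, so a check modelled on rb->size never reaches the logical_decoding_work_mem limit even though the real allocation is far larger:

    /*
     * Standalone sketch (not PostgreSQL code) of the kind of mismatch being
     * discussed: allocation per tuple is larger than what the accounting
     * adds to the tracked size, so the spill/stream threshold never fires.
     * FIXED_BUF_SIZE, allocated_size and accounted_size are made-up names.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define FIXED_BUF_SIZE  8192                /* assume, for illustration,
                                                 * buffers are rounded up to 8kB */
    #define WORK_MEM_BYTES  (256u * 1024 * 1024)    /* logical_decoding_work_mem */

    int
    main(void)
    {
        size_t      allocated_size = 0; /* what the memory context really holds */
        size_t      accounted_size = 0; /* what an rb->size-style tracker sees */
        size_t      tuple_len = 64;     /* small tuples, as in the report */
        int         i;

        for (i = 0; i < 1000000; i++)
        {
            /* the allocator reserves a full fixed-size buffer per tuple ... */
            allocated_size += FIXED_BUF_SIZE;
            /* ... but only the logical tuple length is added to the tracker */
            accounted_size += tuple_len;
        }

        printf("allocated: %zu MB, accounted: %zu MB\n",
               allocated_size / (1024 * 1024), accounted_size / (1024 * 1024));

        /*
         * The spill/stream decision compares only the accounted size against
         * logical_decoding_work_mem, so nothing is spilled or streamed.
         */
        if (accounted_size < WORK_MEM_BYTES)
            printf("below work_mem: nothing is spilled or streamed\n");

        return 0;
    }

With one million 64-byte tuples the sketch reports roughly 7800 MB allocated against about 61 MB accounted, which is the same shape as the ~4GB Tuples context versus rb->size = 73944 reported above.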