Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
From | Alexander Lakhin |
---|---|
Subject | Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault |
Date | |
Msg-id | b1a1eaf3-d5b7-da52-6bb7-c5b3fbe47f3e@gmail.com |
In reply to | Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault |
List | pgsql-bugs |
Hello Tom,

02.03.2024 19:11, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> When a backend with deeply nested memory contexts hits out-of-memory
>> condition and logs the contexts, it might lead to a segmentation fault
>> (due to the lack of free memory again).
> Hmph.  That's not an out-of-memory crash, that's a stack-too-deep
> crash.

I tried decreasing the limit and still got the failure (with a much shorter stack):

    ulimit -Sv 200000; TESTS=infinite_recurse make -s check-tests

    (gdb) p $rsp
    $1 = (void *) 0x7ffcc83d4ff0
    (gdb) frame 13269
    #13269 0x000056289bc2685a in main (argc=8, argv=0x56289d3b4930) at main.c:198
    198             PostmasterMain(argc, argv);
    (gdb) p $rsp
    $2 = (void *) 0x7ffcc84834d0
    (gdb) p $rsp - 0x7ffcc83d4ff0
    $3 = (void *) 0xae4e0

(That is 0xae4e0 = 713952 bytes, about 697 kB, far less than ulimit -s == 8 MB.)

This makes me think that it's not a stack overflow issue, but maybe I'm missing something.

> Seems like we ought to do one or both of these:
>
> 1. Put a CHECK_STACK_DEPTH() call in MemoryContextStatsInternal.
>
> 2. Teach MemoryContextStatsInternal to refuse to recurse more
> than N levels, for N perhaps around 100.
>
> Neither of these are very attractive though, as they'd obscure
> the OOM situation that we're trying to help debug.
>
> It strikes me that we don't actually need recursion in order to
> traverse the context tree: since the nodes have parent pointers,
> it'd be possible to visit them all using only iteration.  The
> recursion seems necessary though to manage the child summarization
> logic as we have it (in particular, we must have a local_totals
> per level to produce summarization like this).  Maybe we could
> modify solution #2 into
>
> 2a. Once we get more than say 100 levels deep, summarize everything
> below that in a single line, obtained in an iterative rather than
> recursive traversal.
>
> I wonder whether MemoryContextDelete and other cleanup methods
> also need to be rewritten to avoid recursion.  In the infinite_recurse
> test case I think we escape trouble because we longjmp out of most
> of the stack before we try to clean up --- but you could probably
> devise a test case that tries to do a subtransaction abort at a
> deep call level, and then maybe kaboom?

Exploiting and protecting MemoryContextStatsInternal() was discussed before:
https://www.postgresql.org/message-id/flat/1661334672.728714027%40f473.i.mail.ru
(It looks like the function got no stack-overflow protection in the end.)

But I'm still not sure that we are dealing with the same issue here.

Best regards,
Alexander
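
For illustration, the iterative traversal mentioned in the quoted text (visiting every node through parent pointers, and summarizing everything below some depth in one line as in option 2a) could look roughly like the sketch below. It is only a sketch under stated assumptions: the `Node` struct and the functions `next_preorder` and `subtree_total` are hypothetical names invented here, not PostgreSQL code, and the `bytes` field merely stands in for whatever per-context statistics would be collected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory context; not PostgreSQL's struct. */
typedef struct Node
{
    struct Node *parent;        /* NULL for the top of the tree */
    struct Node *firstchild;    /* first child, or NULL */
    struct Node *nextchild;     /* next sibling, or NULL */
    size_t       bytes;         /* placeholder for per-node statistics */
} Node;

/*
 * Return the node following 'node' in pre-order within the subtree rooted
 * at 'root', or NULL when the subtree is exhausted.  Only pointer chasing
 * is used, so stack usage does not grow with the depth of the tree.
 */
static Node *
next_preorder(Node *node, const Node *root)
{
    if (node->firstchild != NULL)
        return node->firstchild;

    /* Climb until some ancestor (below root) has an unvisited sibling. */
    while (node != root)
    {
        if (node->nextchild != NULL)
            return node->nextchild;
        node = node->parent;
    }
    return NULL;
}

/* Sum 'bytes' over 'root' and all of its descendants without recursion. */
static size_t
subtree_total(Node *root)
{
    size_t      total = 0;

    assert(root != NULL);
    for (Node *n = root; n != NULL; n = next_preorder(n, root))
        total += n->bytes;
    return total;
}
```

A recursive printer could call something like `subtree_total()` once it reaches the depth limit and emit a single summary line for the whole subtree, so the recursion depth stays bounded regardless of how deeply the contexts are nested.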