Discussion: Disabling memory overcommit deemed dangerous
Hi hackers,

In our documentation we recommend disabling memory overcommit to prevent the OOM killer from kicking in, see [1]. Accordingly, we expect PostgreSQL to handle OOM situations gracefully. In my experience there are unfortunately several severe problems with that approach:

1. PostgreSQL contains code paths that aren't safe against failing memory allocations. Examples are broken cleanup code, see [2], or various calls to strdup() where we don't check the return value (a minimal sketch of that pattern follows below).

2. On Linux, running out of memory during stack expansion triggers SIGSEGV. This is not a theoretical concern; I hit this case in my tests. We could set up a custom stack via MAP_STACK | MAP_GROWSDOWN, but in practice that's very tricky because of ASLR. The only real alternative is committing (= writing to) all stack memory on backend startup. The problem with that approach is that all of that memory would already count towards the commit limit. We might get away with it if we lowered the maximum stack size significantly.

3. Other processes running on the same system are mostly not safe against failing memory allocations. In my tests I repeatedly ended up with a server I could no longer log in to, because some related process had crashed after running out of memory.

I cannot see how someone could today reliably run a PostgreSQL server with memory overcommit disabled, if it genuinely runs out of memory occasionally. Even if we fixed (1) and (2) we would still be left with (3). cgroups might help with (3), but the last time I checked they didn't properly implement memory overcommit.

My proposal is to remove the part about disabling memory overcommit from the documentation, or alternatively to describe the pros and cons of both approaches.

Thoughts?

[1] https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
[2] https://www.postgresql.org/message-id/flat/b12f9e22-2618-42b8-8644-88bae192c7fd%40gmail.com

--
David Geier
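To illustrate the unchecked strdup() pattern from point (1), here is a minimal sketch; the function names are made up and this is not taken from the actual sources:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Unsafe: strdup() returns NULL on OOM and the result is used blindly. */
    static char *
    save_name_unsafe(const char *name)
    {
        return strdup(name);    /* callers may dereference a NULL pointer */
    }

    /* Safer: check the result and fail in a controlled way. */
    static char *
    save_name_checked(const char *name)
    {
        char    *copy = strdup(name);

        if (copy == NULL)
        {
            /* frontend-style handling; backend code would raise an ERROR instead */
            fprintf(stderr, "out of memory\n");
            exit(EXIT_FAILURE);
        }
        return copy;
    }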
David Geier <geidav.pg@gmail.com> writes:
> In our documentation we recommend disabling memory overcommit to prevent the OOM killer from kicking in, see [1]. Accordingly, we expect PostgreSQL to handle OOM situations gracefully. In my experience there are unfortunately several severe problems with that approach:
> 1. PostgreSQL contains code paths that aren't safe against failing memory allocations. Examples are broken cleanup code, see [2], or various calls to strdup() where we don't check the return value.

If you are aware of such places, please submit patches to fix them, because they are bugs with or without overcommit. Overcommit does *not* prevent the kernel from returning ENOMEM, so this seems like an extremely specious argument for not telling people to disable overcommit.

> 2. On Linux, running out of memory during stack expansion triggers SIGSEGV.

Again, allowing overcommit is hardly a cure.

> 3. Other processes running on the same system are mostly not safe against failing memory allocations.

The overcommit recommendation is only meant for machines that are more or less dedicated to Postgres, so I'm not sure how much this matters. Also, we've seen comparable problems on some platforms after running the kernel out of file descriptors. The bottom line is that you need a reasonable amount of headroom in your system provisioning.

> I cannot see how someone could today reliably run a PostgreSQL server with memory overcommit disabled, if it genuinely runs out of memory occasionally.

We have very substantial field experience showing that leaving memory overcommit enabled also makes the system unreliable, if it approaches OOM conditions. I don't think removing that advice is an improvement.

			regards, tom lane
Hi Tom!

On 02.09.2025 20:10, Tom Lane wrote:
> David Geier <geidav.pg@gmail.com> writes:
>
> If you are aware of such places, please submit patches to fix them, because they are bugs with or without overcommit. Overcommit does *not* prevent the kernel from returning ENOMEM, so this seems like an extremely specious argument for not telling people to disable overcommit.

Yes, but to the best of my knowledge only for really wild allocation requests. I haven't come across any ENOMEM in my testing when overcommit was enabled. I agree that we want these places fixed regardless. I'll submit a patch for the strdup() calls, but there's a bigger problem here: we don't really have the means to test the changes we make. For example, the bug in [2] requires, according to the discussion, some more involved refactoring of the cleanup code. How do we make sure these changes are actually correct? We could build some infrastructure for OOM testing (one possible approach is sketched below), but it feels like wasted effort, because even if we fixed all the problems of category (1), we would still not be in good shape because of (2) and (3).

>> 2. On Linux, running out of memory during stack expansion triggers SIGSEGV.
>
> Again, allowing overcommit is hardly a cure.

It's not, but neither is disallowing overcommit.

>> 3. Other processes running on the same system are mostly not safe against failing memory allocations.
>
> The overcommit recommendation is only meant for machines that are more or less dedicated to Postgres, so I'm not sure how much this matters. Also, we've seen comparable problems on some platforms after running the kernel out of file descriptors. The bottom line is that you need a reasonable amount of headroom in your system provisioning.

That's rarely the case in a production environment. Typically there are backups, monitoring, virus scanners, etc. running on the same host, which are usually not resilient against failure (e.g. they don't automatically restart or retry). The same goes for, e.g., the login problem I mentioned. Say a DBA runs into an OOM, consults the documentation, and applies the overcommit change. Now they have a false sense of safety and will be surprised when their service suddenly gets new, unexpected points of failure.

> We have very substantial field experience showing that leaving memory overcommit enabled also makes the system unreliable, if it approaches OOM conditions. I don't think removing that advice is an improvement.

Completely agreed. Leaving overcommit enabled is also bad. There is no safe way of running PostgreSQL in the presence of OOMs. It therefore depends on what's more important: having some chance that PostgreSQL stays up but risking that other programs die, or always having PostgreSQL die but keeping the other programs up.

I think it would be good to make the tradeoffs of both settings more explicit in the documentation and to stress that the most important thing is to configure PostgreSQL such that OOMs are very unlikely to happen. If you agree, I can draft a patch.

--
David Geier
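One relatively cheap way to provoke allocation failures deterministically for such testing is to cap the address space of the process under test with setrlimit(RLIMIT_AS), so that allocations start failing regardless of the overcommit setting. This is only a rough sketch of the idea, not existing test infrastructure:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    /*
     * Cap the address space of the current process so that further large
     * allocations fail with ENOMEM instead of being backed by overcommitted
     * memory. Returns 0 on success, -1 on failure.
     */
    static int
    limit_address_space(rlim_t bytes)
    {
        struct rlimit rl;

        rl.rlim_cur = bytes;
        rl.rlim_max = bytes;
        return setrlimit(RLIMIT_AS, &rl);
    }

    int
    main(void)
    {
        /* Limit this process to 256 MB of address space. */
        if (limit_address_space(256UL * 1024 * 1024) != 0)
        {
            perror("setrlimit");
            return 1;
        }

        /* A 1 GB allocation should now fail and return NULL. */
        void    *p = malloc(1UL * 1024 * 1024 * 1024);

        printf("1 GB malloc %s\n", p ? "succeeded" : "failed as expected");
        free(p);
        return 0;
    }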