Discussion: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
We operate a large PostgreSQL fleet (~15,000 databases) on dedicated Linux hosts.
Each host runs multiple PostgreSQL instances (multi-instance setup, not just multiple DBs inside one instance).
Environment:
PostgreSQL Versions: Mix of 13.13 and 15.12 (upgrades to 15.12 in progress; currently both versions are actively in use)
OS / Kernel: RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
RAM: 256 GiB (varies slightly)
Swap: Currently none
Workload: Highly mixed — OLTP-style internal apps with unpredictable query patterns and connection counts
Goal: Uniform, safe memory settings across the fleet to avoid kernel or database instability
We’re reviewing the vm.overcommit_* settings because we’ve seen conflicting guidance:
- vm.overcommit_memory = 2 gives predictability but can reject allocations early
- vm.overcommit_memory = 1 is more flexible but risks OOM kills if many backends hit peak memory usage at once
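For what it's worth, the current mode and how close a host already runs to the strict-mode ceiling can be read directly from procfs and sysctl (standard kernel interfaces, nothing PostgreSQL-specific):

# Which overcommit policy and ratio are in effect
sysctl vm.overcommit_memory vm.overcommit_ratio
# CommitLimit is the ceiling enforced when overcommit_memory = 2;
# Committed_AS is the address space already committed (both reported in kB)
grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo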
We’re considering:
- vm.overcommit_memory = 2 for strict accounting
- Increasing vm.overcommit_ratio from 50 → 80 or 90 to better reflect actual PostgreSQL usage (e.g., work_mem reservations that aren’t fully used)
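If it helps to see the two candidate knobs in one place, this is roughly what the proposal would look like as a sysctl drop-in (the file name is illustrative and the values are simply the ones under discussion, not a recommendation):

# /etc/sysctl.d/90-overcommit.conf -- illustrative file name
# Strict accounting: allocations that would exceed CommitLimit fail with
# ENOMEM at allocation time instead of being OOM-killed later
vm.overcommit_memory = 2
# CommitLimit = swap + RAM * overcommit_ratio / 100
vm.overcommit_ratio = 80

Applied with sysctl --system (or sysctl -p on that file); the resulting ceiling then shows up as CommitLimit in /proc/meminfo.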
Our questions for those running large PostgreSQL fleets:
1. What overcommit_ratio do you find safe for PostgreSQL without causing kernel memory crunches?
2. Do you prefer overcommit_memory = 1 or = 2 for production stability?
3. How much swap (if any) do you keep on large-memory servers where PostgreSQL is the primary workload? Is having swap configured a good idea or not?
4. Any real-world cases where kernel accounting was too strict or too loose for PostgreSQL?
5. What settings to go with if we are not planning on using swap?
We’d like to avoid both extremes:
- Too low a ratio → PostgreSQL backends failing allocations even with free RAM
- Too high a ratio → OOM killer terminating PostgreSQL under load spikes
Any operational experiences, tuning recommendations, or kernel/PG interaction pitfalls would be very helpful.
TIA
On 8/5/25 13:01, Priya V wrote:
> *Environment:*
> *PostgreSQL Versions:* Mix of 13.13 and 15.12 (upgrades in progress
> to be at 15.12 currently both are actively in use)

PostgreSQL 13 end of life after November 13, 2025

> *OS / Kernel:* RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range

RHEL 7 has been EOL for quite a while now. Note that you have to watch out for collation issues/corrupted indexes after OS upgrades due to collations changing with newer glibc versions.

> *Swap:* Currently none

bad idea

> *Workload:* Highly mixed — OLTP-style internal apps with
> unpredictable query patterns and connection counts
>
> *Goal:* Uniform, safe memory settings across the fleet to avoid
> kernel or database instability
>
> We’re considering:
> *|vm.overcommit_memory = 2|* for strict accounting

yes

> Increasing |vm.overcommit_ratio| from 50 → 80 or 90 to better
> reflect actual PostgreSQL usage (e.g., |work_mem| reservations that
> aren’t fully used)

work_mem does not reserve memory -- it is a maximum that might be used in memory for a particular operation

> *Our questions for those running large PostgreSQL fleets:*
>
> 1. What |overcommit_ratio| do you find safe for PostgreSQL without
>    causing kernel memory crunches?

Read this:
https://www.cybertec-postgresql.com/en/what-you-should-know-about-linux-memory-overcommit-in-postgresql/

> 2. Do you prefer |overcommit_memory = 1| or |= 2| for production stability?

Use overcommit_memory = 2 for production stability

> 3. How much swap (if any) do you keep in large-memory servers where
>    PostgreSQL is the primary workload? Is having swap configured a good
>    idea or not ?

You don't necessarily need a large amount of swap, but you definitely should not disable it.

Some background on that:
https://chrisdown.name/2018/01/02/in-defence-of-swap.html

> 4. Any real-world cases where kernel accounting was too strict or too
>    loose for PostgreSQL?

In my experience the biggest issues are when postgres is running in a memory constrained cgroup. If you want to constrain memory with cgroups, use cgroup v2 (not 1) and use memory.high to constrain it, not memory.max.

> 5. What settings to go with if we are not planning on using swap ?

IMHO do not disable swap on Linux, at least not on production, ever.

> We’d like to avoid both extremes:
> Too low a ratio → PostgreSQL backends failing allocations even with
> free RAM

Have you actually seen this or are you theorizing?

> Too high a ratio → OOM killer terminating PostgreSQL under load spikes

If overcommit_memory = 2, overcommit_ratio is reasonable (less than 100, maybe 80 or so as you suggested), and swap is not disabled, and you are not running in a memory constrained cgroup, I would be very surprised if you will ever get hit by the OOM killer. And if you do, things are so bad the database was probably dying anyway.

HTH,

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
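To make the cgroup point above concrete: on a cgroup v2 host the distinction is between memory.high and memory.max, which systemd exposes as MemoryHigh= and MemoryMax=. A minimal sketch, assuming a systemd-managed instance whose unit happens to be called postgresql.service and a purely illustrative 200G threshold:

# /etc/systemd/system/postgresql.service.d/memory.conf  (hypothetical drop-in)
[Service]
# Maps to cgroup v2 memory.high: above this threshold the kernel throttles
# and reclaims the cgroup instead of OOM-killing it
MemoryHigh=200G
# MemoryMax= (cgroup v2 memory.max) is deliberately left unset here, since
# that is the hard limit whose breach invokes the OOM killer

After creating the drop-in (systemctl edit postgresql also works), run systemctl daemon-reload and restart the instance for it to take effect.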
cat /proc/sys/vm/overcommit_ratio
50

$ cat /proc/sys/vm/swappiness
60

Workload: Multi-tenant PostgreSQL

uname -r
4.18.0-477.83.1.el8_8.x86_64

free -h
              total        used        free      shared  buff/cache   available
Mem:          249Gi       4.3Gi       1.7Gi        22Gi       243Gi       221Gi
Swap:            0B          0B          0B

If we set overcommit_memory = 2, what should we set the overcommit_ratio value to? Can you pls suggest? Is there a rule of thumb to go with?

Our goal is to not run into OOM issues, no memory wastage and also not starve the kernel.
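For context on these numbers: with vm.overcommit_memory = 2 the kernel's commit limit is roughly swap + RAM × vm.overcommit_ratio / 100 (huge pages excluded), so on this 249 GiB host the back-of-the-envelope arithmetic works out to approximately:

ratio 50, no swap (current values):   0 + 249 GiB × 0.50 ≈ 124 GiB CommitLimit
ratio 80, no swap (proposed):         0 + 249 GiB × 0.80 ≈ 199 GiB CommitLimit
ratio 80 + 16 GiB swap:              16 + 249 GiB × 0.80 ≈ 215 GiB CommitLimit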
Joe,

Can you name any technical reason why not having swap for a database is an actual bad idea?

Memory always is limited. Swap was invented to overcome a situation where the (incidental) memory usage of paged-in memory could (regularly) get higher than physical memory would allow, and thus have the (clear) workaround of having swap to 'cushion' the memory shortage by allowing a "second level" memory storage on disk.

Still, this does not make memory unlimited. Swap extends the physical memory available with the amount of swap. There still is a situation where you can run out of memory when swap is added, simply by paging in more memory than physical memory and swap.

Today, most systems are not memory constrained anymore, or: it is possible to get a server with enough physical memory to hold your commonly needed total memory. And given the latency-sensitive nature of databases in general, which includes postgres, for any serious deployment you should get a server with enough memory to host your workload, and configure postgres not to overload the memory. If you do oversubscribe on (physical) memory, you will get pain somewhere; there is no way around that.

The article in defense of swap in essence is saying that if you happen to oversubscribe on memory, sharing the pain between anonymous and file is better. I would say you are already in a bad place if that happens, which is especially bad for databases, and databases should allow you to make memory usage predictable.

However, what I found is that with 4+ kernels (4.18 to be precise; rhel 8), the kernel can try to favour file pages in certain situations, causing anonymous memory to get paged out even if swappiness is set to 1 or 0, and even if there is a wealth of inactive file memory. It seems to have to do with workingset protection(?) mechanisms, but given the lack of clear statistics I can't be sure about that. What it does lead to in my situations is a constant rate of swapping in and out in certain situations, whilst there is no technical reason for linux to do so because there is enough available memory.

My point of view has been that vm.overcommit_memory set to 2 was the way to go, because that allows linux to limit based on a set limit at allocation time, which guarantees a way to make the database never run out of memory. It does guarantee that linux never runs out of memory, absolutely. However, this limit is hard, and is applied for the process at both usermode and system mode (kernel level), and thus can refuse to provide memory at times where it's not safe to do so, and thus corrupt execution. I have to be honest, I have not seen this myself, but trustworthy sources have reported this repeatedly, which I am inclined to believe. This means postgres execution can corrupt/terminate in unlucky situations, which impacts availability.

Frits Hoogland
(Both: please trim and reply inline on these lists as I have done; Frits, please reply all not just to the list -- I never received your reply to me)

On 8/6/25 11:51, Priya V wrote:
> *cat /proc/sys/vm/overcommit_ratio*
> 50
> $ *cat /proc/sys/vm/swappiness*
> 60
>
> *Workload*: Multi-tenant PostgreSQL
>
> *uname -r*
> 4.18.0-477.83.1.el8_8.x86_64

IMHO you should strongly consider getting on a more recent distro with a newer kernel.

> *free -h*
>               total        used        free      shared  buff/cache   available
> Mem:          249Gi       4.3Gi       1.7Gi        22Gi       243Gi       221Gi
> Swap:            0B          0B          0B

As I said, do not disable swap. You don't need a huge amount, but maybe 16 GB or so would do it.

> if we set overcommit_memory = 2, what should we set the
> overcommit_ratio value to ? Can you pls suggest ?
> Is there a rule of thumb to go with ?

There is no rule of thumb that I am aware of. Every workload is different. Start with something like 80 and do your own testing to refine that number.

> *Our goal is to not run into OOM issues, no memory wastage and also not
> starve kernel ?*

With overcommit_memory = 2, swap on (and reasonably sized), and overcommit_ratio to something reasonable (certainly below 100), I think you will have a difficult time getting an OOM kill even if you try during testing. But you have to do your own testing for your workloads -- there is no magic button here.

That is, unless you run postgres in a cgroup with memory.limit (cgroup v1) or memory.max (cgroup v2) set. Note, running in containers with memory limits set e.g. via Kubernetes will do that under the covers. That is a completely different story.

> On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits.hoogland@gmail.com> wrote:
> Can you name any technical reason why not having swap for a database
> is an actual bad idea?

Did you read the blog I linked? Do your own experiments.

* Swap is what is used when anonymous memory must be reclaimed to allow for an allocation of anonymous memory.

* The Linux kernel will aggressively use all available memory for file buffers, pushing usage against the limits.

* Especially in the older 4 series kernels, file buffers often cannot be reclaimed fast enough.

* With no swap and a large-ish anonymous memory request, it is easy to push over the limit to cause the OOM killer to strike.

* On the other hand, with swap enabled anon memory can be reclaimed, giving the kernel more time to deal with file buffer reclamation.

At least that is what I have observed.

HTH,

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
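For anyone following along who needs to add the modest amount of swap described above, a plain swap file is enough; a minimal sketch (path and size are illustrative, and dd is used here as the most broadly compatible way to preallocate the file):

# Create and enable a 16 GiB swap file (run as root; path is an example)
dd if=/dev/zero of=/swapfile bs=1M count=16384 status=progress
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Make it persistent and verify
echo '/swapfile none swap defaults 0 0' >> /etc/fstab
swapon --show
free -h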
> As I said, do not disable swap. You don't need a huge amount, but maybe 16 GB or so would do it.

Joe, please, can you state a technical reason for saying this? All you are saying is ‘don’t do this’.

I’ve stated my reasons for why this doesn’t make sense, and you don’t give any reason.

The article you cite does seem to point to general usage, not database usage.

Frits
On 8/6/25 17:14, Frits Hoogland wrote:
>> As I said, do not disable swap. You don't need a huge amount, but
>> maybe 16 GB or so would do it.
>
> Joe, please, can you state a technical reason for saying this?
> All you are saying is ‘don’t do this’.
>
> I’ve stated my reasons for why this doesn’t make sense, and you don’t give any reason.

What do you call the below?

>> Op 6 aug 2025 om 18:33 heeft Joe Conway <mail@joeconway.com> het volgende geschreven:
>>
>> * Swap is what is used when anonymous memory must be reclaimed to
>>   allow for an allocation of anonymous memory.
>>
>> * The Linux kernel will aggressively use all available memory for
>>   file buffers, pushing usage against the limits.
>>
>> * Especially in the older 4 series kernels, file buffers often
>>   cannot be reclaimed fast enough
>>
>> * With no swap and a large-ish anonymous memory request, it is
>>   easy to push over the limit to cause the OOM killer to strike.
>>
>> * On the other hand, with swap enabled anon memory can be
>>   reclaimed giving the kernel more time to deal with file buffer
>>   reclamation.
>>
>> At least that is what I have observed.

If you don't think that is adequate technical reason, feel free to ignore my advice.

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
Op 6 aug 2025 om 18:33 heeft Joe Conway <mail@joeconway.com> het volgende geschreven:

> * Swap is what is used when anonymous memory must be reclaimed to
>   allow for an allocation of anonymous memory.

Correct. Swapped out pages are anonymous memory pages exclusively.

> * The Linux kernel will aggressively use all available memory for
>   file buffers, pushing usage against the limits.
>
> * Especially in the older 4 series kernels, file buffers often
>   cannot be reclaimed fast enough
>
> * With no swap and a large-ish anonymous memory request, it is
>   easy to push over the limit to cause the OOM killer to strike.
>
> * On the other hand, with swap enabled anon memory can be
>   reclaimed giving the kernel more time to deal with file buffer
>   reclamation.
>
> At least that is what I have observed.
On Wed, Aug 6, 2025 at 11:14:34PM +0200, Frits Hoogland wrote:
> > As I said, do not disable swap. You don't need a huge amount, but maybe 16 GB or so would do it.
>
> Joe, please, can you state a technical reason for saying this?
> All you are saying is ‘don’t do this’.
>
> I’ve stated my reasons for why this doesn’t make sense, and you don’t give any reason.
>
> The article you cite does seem to point to general usage, not database usage.

Here is a blog entry about it from 2012:

https://momjian.us/main/blogs/pgblog/2012.html#July_25_2012

--
Bruce Momjian <bruce@momjian.us>  https://momjian.us
EDB  https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.
On 8/8/25 10:21, Frits Hoogland wrote:
> If swappiness is set to 0, but swap is available, some documentation
> suggests it will never use anonymous memory, however I found this not to
> be true, linux might still choose anonymous memory to reclaim.

A bug in RHEL8 meant that swappiness was not taken into account unless cgroupv2 was configured or vm.force_cgroup_v2_swappiness was set to 1. See references [1] and [2]. Could this be the cause of your observation?

[1] https://access.redhat.com/solutions/6785021
[2] https://github.com/systemd/systemd/issues/9276
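A quick way to check whether a given host matches the situation described in [1] (a cgroup v1 hierarchy with the Red Hat specific override left at its default); note that vm.force_cgroup_v2_swappiness only exists on RHEL kernels that carry that change, so the last command may simply report an unknown key elsewhere:

# cgroup2fs means a cgroup v2 (unified) hierarchy; tmpfs means cgroup v1
stat -fc %T /sys/fs/cgroup
# Global swappiness and the RHEL-specific override from [1]
sysctl vm.swappiness
sysctl vm.force_cgroup_v2_swappiness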
Thank you for your message Frederic, I am very much aware of that issue. It’s actually incorrect to say that is a bug: that is how cgroups v1, which is bundled with rhel8, works. However, it is very counter-intuitive. For that reason redhat created the force_cgroup_v2_swappiness parameter uniquely themselves; it’s not a common Linux parameter.

The specific issue I see in certain cases leading to unreasonable swap usage is Linux workingset detection kicking in, which can choose anonymous memory despite having lots of file memory available, leading to swapping, which sometimes leads to a thrashing situation.

It is funny to see how emotionally people react to removing swap, and how people go through great efforts to carefully try to wrap that in a technical reason or point to people having said something that agrees with their emotion. I should say that I understand the reluctance; it’s not weird to feel anxious.

The kernel has no inherent swap requirement. Of course, removing swap cannot be blindly applied; you have to carefully make it suit your environment, usage and intention. And there ARE cases where swap makes sense (if you have memory usage that exceeds physical memory, and you add enough swap to sustain that). But a database in general typically responds badly to swapping (or anything that fluctuates latency), and when swap removal is sensibly done, it prevents anonymous (including less frequently used ;-)) memory from getting swapped.

I will not convince everybody, but I hope I can make some people who understand the technology think about it and consider the arguments.

Friendly regards,

Frits
On 8/19/25 17:37, Frits Hoogland wrote:
> The specific issue I see in certain cases leading to unreasonable swap usage is Linux workingset detection kicking in

Do you have a way to highlight that precisely? I mean, can you prove that it is Linux workingset detection that is causing swapping?

I've also encountered surprising cases where the swap fills up despite there being plenty of available memory (lots of page cache). However these cases were not associated with slowdowns or other problems. I only became aware of them because a client was anxious about her swap usage.
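Not a direct answer, but the raw counters one would need for that kind of proof are exposed by the kernel; a minimal sketch of where to look (the workingset_* counters are only split into _anon/_file variants on kernels newer than the 4.18 discussed here):

# Live swap-in/swap-out activity (si/so columns)
vmstat 5
# Cumulative swap and workingset refault/activation counters since boot
grep -E '^(pswpin|pswpout|workingset_)' /proc/vmstat
# Which processes currently hold pages in swap, largest first
grep VmSwap /proc/[0-9]*/status 2>/dev/null | sort -k2 -nr | head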