Even on Linux with overcommit, allocations can fail in practical scenarios.
You can impose limits per process/cgroup. In server environments it doesn't make sense to run off swap (the perf hit can be so large that everything times out and the service is indistinguishable from being offline), so you can set limits proportional to physical RAM and see processes OOM before the whole system has to resort to the OOM killer.
Processes that don't fork and don't do clever things with virtual memory don't overcommit much, and large enough allocations can fail for real at mapping time, not later when faulting pages in.
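To make that concrete, here is a minimal sketch of observing such a failure with Rust's fallible allocation API instead of letting the process abort (the 64 GiB figure is an arbitrary example and assumes a 64-bit target; whether the mapping is refused up front depends on the system's overcommit and limit configuration):

    fn main() {
        // Ask for an absurdly large buffer up front. If the underlying
        // allocation is refused, try_reserve_exact reports an error
        // instead of aborting the process, so the caller can back off.
        let mut buf: Vec<u8> = Vec::new();
        match buf.try_reserve_exact(64 * 1024 * 1024 * 1024) {
            Ok(()) => println!("reservation succeeded"),
            Err(e) => eprintln!("allocation failed, handling it gracefully: {e}"),
        }
    }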
Additionally, soft limits like https://lib.rs/cap make it possible to reliably observe OOM in Rust on every OS. This is very useful for limiting memory usage of a process before it becomes a system-wide problem, and a good extra defense in case some unreasonably large allocation sneaks past application-specific limits.
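For reference, a sketch of how such a soft limit is typically wired up with the cap crate, based on its documented usage (exact API details may vary between versions; the 512 MiB limit is just an example):

    use cap::Cap;
    use std::alloc;

    // Wrap the system allocator so every allocation is counted against a cap.
    #[global_allocator]
    static ALLOCATOR: Cap<alloc::System> = Cap::new(alloc::System, usize::MAX);

    fn main() {
        // Example: cap this process at 512 MiB of heap.
        ALLOCATOR.set_limit(512 * 1024 * 1024).unwrap();

        // An allocation past the cap fails instead of growing unbounded,
        // and fallible APIs like try_reserve can observe and handle it.
        let mut data: Vec<u8> = Vec::new();
        if data.try_reserve_exact(1024 * 1024 * 1024).is_err() {
            eprintln!("hit the soft limit, backing off");
        }
        println!("currently allocated: {} bytes", ALLOCATOR.allocated());
    }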
These "impossible" things happen regularly in the services I worked on. The hardest part about handling them has been Rust's libstd sabotaging it and giving up before even trying. Handling of OOM works well enough to be useful where Rust's libstd doesn't get in the way.
I hear this claim about swap all the time, and honestly it doesn't sound convincing. Maybe ten or twenty years ago, but today? CAS latency for DIMMs has been going UP, and so has NVMe bandwidth. Depending on your memory access patterns, and on whether the working set fits in the NVMe controller's cache (the recent Samsung 9100, for example, includes 4 GB of DDR4 for caching and prefetch), your application may work just fine.
Swap can be fine on desktops, where usage patterns vary a lot and there are a bunch of idle apps to swap out. It might also be fine on a server with light load, or with a memory leak whose pages just get written out and never touched again.
What I had in mind was servers scaled to run near the maximum capacity of the hardware. When the load exceeds what the server can handle in RAM and starts shoving requests' working memory into swap, you typically won't get higher throughput to catch up with the overload. Swap, even if "fast enough", will slow down your overall throughput exactly when you need it to go faster. This makes requests pile up even more, pushing more of them into swap. Even if it doesn't cause a death spiral, it's not an economical way to run servers.
What you really need to do is shed the load before it overwhelms the server, so that each box runs at its maximum throughput and extra traffic is load-balanced elsewhere, rejected, or at least queued in some more deliberate and efficient fashion, rather than frantically moving the server's working memory back and forth from disk.
You can do this scaling without OOM handling if you have other ways of ensuring limited memory usage or leaving enough headroom for spikes, but OOM handling lets you fly closer to the sun, especially when the RAM cost of requests can be very uneven.
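A rough sketch of that load-shedding idea (the names, the Response type, and the 256 threshold are made up for illustration): cap the number of in-flight requests to what fits in RAM and reject the excess immediately rather than letting it spill into swap.

    use std::sync::atomic::{AtomicUsize, Ordering};

    // Hypothetical cap, sized so the combined working set of this many
    // concurrent requests comfortably fits in physical RAM.
    const MAX_IN_FLIGHT: usize = 256;
    static IN_FLIGHT: AtomicUsize = AtomicUsize::new(0);

    enum Response {
        Ok(String),
        Overloaded, // e.g. HTTP 503, so the load balancer can retry elsewhere
    }

    fn handle_request(body: &str) -> Response {
        // Reject up front instead of admitting more work than fits in RAM.
        if IN_FLIGHT.fetch_add(1, Ordering::SeqCst) >= MAX_IN_FLIGHT {
            IN_FLIGHT.fetch_sub(1, Ordering::SeqCst);
            return Response::Overloaded;
        }
        let result = Response::Ok(format!("processed {} bytes", body.len()));
        IN_FLIGHT.fetch_sub(1, Ordering::SeqCst);
        result
    }

    fn main() {
        match handle_request("hello") {
            Response::Ok(r) => println!("{r}"),
            Response::Overloaded => println!("shed this request"),
        }
    }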
It's almost never the case that memory is uniformly accessed, except for highly artificial loads such as doing inference on a large ML model. If you can stash the "cold" parts of your RAM working set into swap, that's a win and lets you serve more requests out of the same hardware compared to working with no swap. Of course there will always be a load that exceeds what the hardware can provide, but that's true regardless of how much swap you use.
Swap isn't just for when you run out of RAM, though.
Don't look at swap as extra memory on slow HDDs. Look at it as a place the kernel can use when it needs somewhere to put something temporarily.
This can happen fairly easily on large-memory systems when memory gets fragmented: something asks for a chunk that requires a contiguous block larger than any available, and the allocation fails.
I always set up at least a couple of GB of swap now... I won't really miss the storage, and it at least gives the kernel a place to reorganize/compact memory and keep chugging along.
Rust is the problem here.