Scaling the hardware

When a process requests more physical memory than is currently available on your system, the system kernel may decide to copy some memory pages to the configured swap device. When a process later tries to access a memory page that has been moved to the swap device, the page is copied back to RAM. This mechanism is called swapping. It is common in today's operating systems, and it is important to be aware of it because most default system configurations use hard disk drives as their swap devices, and hard disks, even SSDs, are extremely slow compared to RAM.

Swapping in general is not a bad thing. Some systems may move rarely accessed memory pages to swap devices even when plenty of free memory is available, just to free up resources in advance. But if memory pressure is very high and all processes really request and start to use more memory than is available, performance drops drastically. In such situations, the system kernel may start swapping the same memory pages back and forth, spending most of its time writing to and reading from disk. From a user's point of view, the system is effectively dead at this stage. So, if your application is memory-intensive, it is extremely important to scale the hardware to prevent this.
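A simple way to spot this kind of memory pressure before it gets out of hand is to monitor RAM and swap usage from your own tooling. The following is a minimal monitoring sketch that uses the third-party psutil package (an assumption here; it is not mentioned above); rapidly growing swap-in/swap-out counters are a sign of active swapping:

    # A minimal monitoring sketch, assuming the third-party psutil package
    # is installed (pip install psutil).
    import psutil

    def memory_pressure_report():
        ram = psutil.virtual_memory()
        swap = psutil.swap_memory()
        print(f"RAM used:  {ram.percent}% of {ram.total // 2**20} MiB")
        print(f"Swap used: {swap.percent}% of {swap.total // 2**20} MiB")
        # sin/sout are cumulative bytes swapped in/out since boot
        # (reported as 0 on some platforms); values that keep growing
        # quickly indicate that the system is actively swapping.
        print(f"Swapped in: {swap.sin // 2**20} MiB, out: {swap.sout // 2**20} MiB")

    if __name__ == "__main__":
        memory_pressure_report()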

While having enough memory in the system is important, it is also important to make sure that applications do not misbehave and consume too much memory themselves. For instance, a program that works on big video files, which can weigh in at several hundred megabytes, should not load them entirely into memory, but rather process them in chunks or use disk streams, as in the sketch below.
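The following sketch illustrates the chunked approach, assuming a hypothetical process_chunk() callback that stands in for the real per-chunk work; because the file is read in fixed-size chunks, memory usage stays bounded by the chunk size rather than by the file size:

    # A minimal sketch of chunk-based processing; process_chunk is a
    # hypothetical callback standing in for the real per-chunk work.
    def process_large_file(path, process_chunk, chunk_size=64 * 1024 * 1024):
        """Read and process a file in 64 MiB chunks instead of loading it whole."""
        with open(path, "rb") as stream:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                process_chunk(chunk)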

Note that scaling up the hardware (vertical scaling) has some obvious limitations. You cannot fit an infinite amount of hardware into a single server rack. Also, highly efficient hardware is extremely expensive (law of diminishing returns), so there is also an economic bound on this approach. From this point of view, it is always better to have a system that can be scaled out by adding new computation nodes, or workers (horizontal scaling). This allows you to scale out your service on commodity hardware that has the best performance/price ratio.

Unfortunately, designing and maintaining highly scalable distributed systems is both hard and expensive. If your system cannot be easily scaled horizontally, or it is faster and cheaper to scale it vertically, it may be better to do that instead of wasting time and resources on a total redesign of your system architecture. Remember that hardware tends to get faster and cheaper over time. Many products stay in a sweet spot where their scaling needs align with the trend of rising hardware performance (for the same price).