System Design, Part 4: Load Balancer

A load balancer is a service that distributes traffic across a number of servers or services, known as a resource pool. It acts as a proxy: it accepts incoming connections and calls the appropriate server on the client's behalf.

The client knows only about the load balancer, that is, its location or IP address. It does not know which actual server its request is going to.

Advantages of a Load Balancer

High availability - Since there is more than one server instance, there is very little downtime.
High scalability - A new server can be added to the resource pool when traffic increases, and the load balancer will automatically include it when handling subsequent incoming connections.
Security - If one server is compromised, it can be removed from the pool and the system will still run fine.

All of the above factors eventually result in increased performance of the whole system.

How Do Load Balancers Work?

A load balancer decides where to send each user request in one of two ways:

Statically
Dynamically

If one server goes down, the load balancer forwards traffic to other available servers in the resource pool.
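
The failover behaviour described above can be sketched roughly as follows. This is only an illustration under assumptions: the `Pool` class, the server names, and the `mark_down` helper are hypothetical, and a real load balancer would detect failures through health checks rather than manual calls.

```python
import itertools


class Pool:
    """A toy resource pool that skips servers marked as down (hypothetical names)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        # Assumption: in practice a health check would call this automatically.
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def pick(self):
        # Try each server at most once per call, skipping the ones that are down.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers in the pool")


pool = Pool(["app-1", "app-2", "app-3"])
pool.mark_down("app-2")
picks = [pool.pick() for _ in range(4)]
print(picks)  # ['app-1', 'app-3', 'app-1', 'app-3']
```

Once "app-2" is marked as down, it simply stops receiving traffic while the remaining servers keep serving requests.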

Static Load Balancers

They distribute traffic across servers based on predefined rules.
They don't monitor the real-time load of the servers.
They are simple to implement.
Since they don't monitor server load in real time, they can't adapt to changing load conditions; they just stick to the algorithm assigned to them.
They are good for situations where traffic is consistent and the system infrastructure is simple.
They are cost-effective, since they have neither complex infrastructure nor real-time monitoring.

Some common static load balancing algorithms are:

Round-Robin

Distributes incoming requests sequentially, one server after another.
It ensures that each server receives an equal share of the requests.
It is simple to implement.
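
A minimal sketch of Round-Robin, assuming a hypothetical pool of three servers (the names and the `next_server` helper are made up for illustration):

```python
from itertools import cycle

servers = ["server-a", "server-b", "server-c"]  # hypothetical pool
rotation = cycle(servers)


def next_server():
    """Return the next server in the fixed rotation."""
    return next(rotation)


assigned = [next_server() for _ in range(6)]
print(assigned)
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b', 'server-c']
```

Each server gets the same number of requests, regardless of how busy it currently is.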

Weighted Round Robin

Similar to the Round Robin algorithm; the difference is that each server is assigned a weight.
This weight is a numeric value assigned to a server based on its capacity to handle traffic.
Depending on the weights, the load balancer distributes each request to these servers.

As a result, a server with a higher weight receives proportionally more of the incoming requests than a server with a lower weight.

Good for server pools with servers of different capacities.
A bit more complex than Round Robin, and requires more maintenance.
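
Here is one simple way Weighted Round Robin could be sketched. The server names and weights are assumptions chosen for illustration (they are not taken from the original example), and the approach of expanding each server by its weight is just one of several possible schemes.

```python
from itertools import cycle

# Hypothetical weights: "big" can handle three times the traffic of "small".
weights = {"big": 3, "small": 1}

# Repeat each server name as many times as its weight, then rotate over the result.
schedule = [name for name, weight in weights.items() for _ in range(weight)]
rotation = cycle(schedule)

first_four = [next(rotation) for _ in range(4)]
print(first_four)  # ['big', 'big', 'big', 'small']
```

With these assumed weights, 3 out of every 4 requests go to "big". This naive version sends them in a burst; production implementations often interleave the picks more smoothly.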

Source IP Hash

In this algorithm, the load balancer has a hash function that takes the client's IP address and produces a hash key. Each hash key is associated with a particular server.
This ensures that a client with the same source IP address connects to the same server every time, which helps in maintaining and persisting the user's session.
Its distribution can be uneven: if too many IP addresses map to the same hash key, a lot of load may land on the same server.
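
The idea can be sketched like this. The server names, the `server_for` helper, and the choice of SHA-256 with a modulo are assumptions for illustration; real load balancers may use different hash functions and mapping schemes.

```python
import hashlib

servers = ["server-a", "server-b", "server-c"]  # hypothetical pool


def server_for(ip: str) -> str:
    """Map a client IP address to a server using a stable hash."""
    # hashlib is used because Python's built-in hash() for strings
    # is randomized per process and would not be stable across restarts.
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# The same client IP always lands on the same server.
print(server_for("203.0.113.7") == server_for("203.0.113.7"))  # True
```

Note that with this simple modulo scheme, adding or removing a server changes the mapping for many clients.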

Dynamic Load Balancers

Dynamic load balancers make real-time decisions about how to distribute load across the servers in the resource pool.
They consider several factors, such as the current load on each server and whether a server is available.

Some common dynamic load balancing algorithms are:

Least Connection Method

Checks which server has the fewest active connections and assigns the new incoming request to that server.
Good for applications where requests take longer to process, such as large file uploads.
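
A rough sketch of the Least Connection Method, assuming the load balancer keeps a counter of open connections per server (the server names, the counts, and the helper functions are hypothetical):

```python
# Hypothetical snapshot of open connections per server.
active = {"server-a": 12, "server-b": 4, "server-c": 9}


def pick_least_connections(connections):
    """Return the server with the fewest open connections."""
    return min(connections, key=connections.get)


def open_connection(connections):
    # Choose a server and count the new connection against it.
    server = pick_least_connections(connections)
    connections[server] += 1
    return server


def close_connection(connections, server):
    # Called when a request finishes, so the counter stays accurate.
    connections[server] -= 1


first = open_connection(active)
print(first, active["server-b"])  # server-b 5
```

Long-running requests keep a server's counter high, so new requests naturally flow to the less busy servers.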

Least Response Time Method

In this case, the load balancer takes both the current load and the performance of each server into consideration, then forwards the incoming connection to the server with the quickest response time.
Good for applications that need the quickest possible response, such as payment gateways and e-commerce sites.
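
One possible way to sketch this is to keep a moving average of each server's response time and pick the lowest. The `ResponseTimeTracker` class, the smoothing factor `alpha`, and the sample timings are assumptions; this sketch only looks at response time and ignores other signals a real load balancer might combine with it.

```python
class ResponseTimeTracker:
    """Tracks a moving average of response times per server (hypothetical)."""

    def __init__(self, servers, alpha=0.5):
        self.alpha = alpha  # weight given to the newest sample
        self.avg_ms = {server: None for server in servers}

    def record(self, server, elapsed_ms):
        old = self.avg_ms[server]
        if old is None:
            # First sample for this server: use it directly.
            self.avg_ms[server] = elapsed_ms
        else:
            self.avg_ms[server] = self.alpha * elapsed_ms + (1 - self.alpha) * old

    def pick(self):
        # Servers without samples sort first so that they get measured.
        def key(server):
            avg = self.avg_ms[server]
            return -1.0 if avg is None else avg

        return min(self.avg_ms, key=key)


tracker = ResponseTimeTracker(["server-a", "server-b"])
tracker.record("server-a", 120)
tracker.record("server-a", 80)
tracker.record("server-b", 40)
print(tracker.pick())  # server-b
```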

Resource-Based Algorithm

The load balancer continuously monitors the health of the servers in the resource pool and forwards incoming connections accordingly.
It checks each server's CPU and memory usage and decides based on that.
Good for applications requiring heavy CPU and memory, like video encoding, file format conversions, virus checks, etc.
One drawback is that this real-time monitoring is itself resource heavy: each server runs a dedicated service that keeps track of its health, and that process also consumes some CPU and memory.
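
As a rough sketch, suppose the monitoring service on each server reports CPU and memory usage as fractions between 0 and 1. The report format, the server names, and the scoring rule (take the most constrained resource) are all assumptions made for illustration.

```python
# Hypothetical health reports, as an agent on each server might send them.
reports = {
    "server-a": {"cpu": 0.80, "memory": 0.40},
    "server-b": {"cpu": 0.30, "memory": 0.50},
    "server-c": {"cpu": 0.55, "memory": 0.90},
}


def load_score(report):
    """Score a server by its most constrained resource (lower is better)."""
    return max(report["cpu"], report["memory"])


def pick_by_resources(all_reports):
    """Return the server whose busiest resource has the most headroom."""
    return min(all_reports, key=lambda name: load_score(all_reports[name]))


print(pick_by_resources(reports))  # server-b
```

Here "server-b" wins because its busiest resource (memory at 0.50) is less loaded than the busiest resource on the other two servers.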

There can be two types of Load Balancers:

Hardware, and
Software

Hardware load balancers are specialized physical devices running a specialized operating system, dedicated to load balancing across the resource pool. They are difficult to scale and require a secondary load balancer in case the main one fails. This results in high installation and maintenance costs, as they have to be physically installed. They are usually used within a single data centre.

Software load balancers are software that can be installed on general-purpose machines such as VPSs, virtual machines, or bare-metal servers. They are easier to scale and also cost-effective.

Based on functionality, there are the following types of load balancers:

Network load balancers: Also called layer 4 load balancers, as they operate at layer 4 of the OSI networking model - the transport layer. They use network-level information such as IP addresses and TCP or UDP ports and forward traffic accordingly.

Application load balancers: Also called layer 7 load balancers, as they work at layer 7 of the OSI networking model - the application layer. They evaluate application-layer information such as HTTP headers, SSL data, etc. and forward incoming traffic accordingly.

Global server load balancer: This load balancer goes a bit beyond the functionality of a traditional load balancer: it balances load among geographically distributed data centres in any part of the world.
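
To make the difference between layer 4 and layer 7 decisions more concrete, here is a simplified sketch. The `Request` type, the pool names, and both routing functions are hypothetical; a real layer 4 balancer works on connections and packets rather than on a request object like this.

```python
from dataclasses import dataclass


@dataclass
class Request:
    client_ip: str
    dest_port: int
    path: str  # application-level data, only visible to a layer 7 balancer


def route_layer4(req, pools_by_port):
    # A layer 4 decision uses only addresses and ports.
    return pools_by_port[req.dest_port]


def route_layer7(req, pools_by_prefix, default_pool):
    # A layer 7 decision can inspect application data such as the URL path.
    for prefix, pool in pools_by_prefix.items():
        if req.path.startswith(prefix):
            return pool
    return default_pool


req = Request(client_ip="198.51.100.9", dest_port=443, path="/api/orders")
print(route_layer4(req, {443: "web-pool", 25: "mail-pool"}))  # web-pool
print(route_layer7(req, {"/api/": "api-pool", "/static/": "static-pool"}, "web-pool"))  # api-pool
```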
