Building Internet-Scale Web Platforms with the Amazon Elastic Load Balancer

Cloud Scalability


The Elastic Load Balancer distributes your Application’s inbound traffic to multiple Web Servers running on EC2 instances. This offers the following key benefits to your architecture:

Increased Throughput: This increases the capacity of your Web infrastructure to handle additional traffic (i.e., scaling out horizontally).

Avoiding Single Points of Failure: An individual Web Server is no longer a single point of failure, since traffic is distributed across multiple server instances. This makes your application much more resilient.

Scale Out with a Load Balancer

Figure-1: ELB distributes inbound traffic across multiple EC2 instances.

Maintaining Healthier Servers: The risk of overloading or overwhelming a single Web server is now minimized due to distribution of traffic. This increases the chances of your individual Web servers staying healthier over much longer periods of time.

An architect needs to consider several critical aspects of a deployment such as:

  • How do I truly achieve ‘internet-scale’? What does my ‘scaled out’ architecture look like?
  • How does the ELB schedule incoming traffic?
  • What if my load balancer itself becomes a single point of failure?
  • How does my design guarantee fault-tolerance, resiliency, and high-availability?
  • What security features does the load balancer offer for my inflight traffic?

In this post, we answer these questions and also help you understand why the ELB is more effective than a home-brewed load balancing solution using Nginx or Apache.

How Does Web Traffic Reach Your Load Balancer?

To understand some of the scalability features of ELB, we need to first understand how client traffic reaches your load balancer.

When you create an ELB, it automatically registers a unique DNS entry for itself in the AWS Name Servers. For example, if you created a load balancer called MyApp in the us-east-1 region, your load balancer gets a unique DNS name such as


Figure-2: How client traffic reaches your load balancer.

You can now hit the load balancer using this domain name; but this name is a bit too long for your users to remember.

So additionally, you register your own user-friendly domain name such as and create a CNAME alias in your own name servers (which points to the load balancer’s domain name). Your users now have a much easier domain name to reach your application – via the load balancer, of course!

  • Step 1: Clients (Browsers, Mobile Devices, IoT Devices) will attempt to use your friendly DNS name to access your application. Say, a user types in the browser.
  • Step 2: The client first attempts to resolve this DNS name. During the standard DNS resolution process, your name server returns a CNAME Record which points to the load balancer’s domain name. (Thus, it will return as part of the DNS resolution).
  • Step 3: Amazon’s name servers are now queried to resolve this load balancer’s domain name. The AWS name servers return an appropriate IP address of the load balancer ‘instance’ back to your client.
  • Step 4: The client then uses this resolved IP address to dispatch HTTP requests to your application.
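The four steps above can be sketched as a toy resolver. This is only an illustration of the CNAME-then-A lookup chain, not how a real DNS client is implemented. Every name and address in it (www.example.com, myapp-lb.example.net, the 192.0.2.x addresses) is a hypothetical placeholder, and the two dictionaries simply stand in for your name server and the AWS name servers.

```python
# Toy DNS tables. All names and addresses are hypothetical placeholders.
# Your own name server: friendly name -> CNAME pointing at the load balancer.
YOUR_NAME_SERVER = {
    "www.example.com": ("CNAME", "myapp-lb.example.net"),
}
# AWS-side name server: load balancer name -> A records (load balancer IPs).
AWS_NAME_SERVER = {
    "myapp-lb.example.net": ("A", ["192.0.2.10", "192.0.2.11"]),
}


def resolve(name, max_hops=5):
    """Follow CNAME records until an A record is found; return its IP list."""
    for _ in range(max_hops):
        record = YOUR_NAME_SERVER.get(name) or AWS_NAME_SERVER.get(name)
        if record is None:
            raise LookupError(f"no DNS record for {name}")
        kind, value = record
        if kind == "A":
            return list(value)
        name = value  # CNAME: restart the lookup with the target name
    raise LookupError("CNAME chain too long")
```

Calling resolve("www.example.com") first finds the CNAME in your name server (Step 2), then the A records held by the AWS-side table (Step 3), and returns the load balancer addresses the client would connect to (Step 4).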

As you shall see later, the DNS system plays a critical role in helping the ELB to horizontally scale out.

Achieving Fault Tolerance and Scale in Your Infrastructure

The ELB is engineered for fault tolerance and massive scale in the following ways:

  • Continuous Health Checks: The ELB periodically checks the health of your upstream EC2 instances and only routes traffic to healthy EC2 instances. Client requests only reach your healthy App Servers, which makes your App much more fault-tolerant.
  • Scaling Out the Web Layer: Scaling out the web layer means adding more EC2 instances as needed, without disrupting the live application itself. This contributes to both fault tolerance and scale.
  • Intelligent Traffic Scheduling: The ELB schedules traffic based on the actual load present on each of your upstream EC2 instances. This prevents just a few EC2 instances from getting overwhelmed and keeps the load well-distributed.
  • Multiple Availability Zones: The ELB can distribute traffic to EC2 instances that exist across multiple availability zones and thus achieve a greater degree of fault tolerance.
  • Scaling Out the Load Balancer Itself: What if the ELB itself gets maxed out on its capacity? The ELB automatically creates multiple instances of the load balancer and uses a DNS Round-robin technique to split inbound traffic across those load balancer instances. This contributes to both fault tolerance and scale.

    We explore each of these aspects in detail in the sections that follow.

    Continuous Health Checks

    The Elastic Load Balancer continuously monitors the health of all the registered upstream EC2 instances and detects unhealthy EC2 instances in real-time.

    Health Checks in Load Balancers

    Figure-3: ELB performing health checks.

    If an EC2 instance becomes unhealthy, the ELB automatically stops sending traffic to that EC2 instance and distributes the load to the remaining healthy EC2 instances. If a registered EC2 instance becomes healthy again in the future, the ELB will automatically restore traffic to that EC2 instance.

    This feature brings a high degree of resilience to your overall application: You can now afford downtime on your individual Web or App servers without worrying about the entire Application going down – since traffic gets rerouted automatically. This also provides your Sys Ops team the freedom to replenish, restart, or refresh any EC2 instances in production while the overall application continues to work.

    The load balancer will continue to perform periodic health checks on all your registered EC2 instances regardless of whether an instance is presently marked as healthy or unhealthy.
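To make the behaviour above concrete, here is a minimal sketch of health tracking for one instance. It assumes that an instance changes status only after a number of consecutive contradicting check results. The class name, the threshold parameters and their default values are illustrative assumptions, not values taken from this article or from the ELB's actual configuration.

```python
class HealthTracker:
    """Tracks one instance's status from a stream of health-check results."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=3):
        # Hypothetical thresholds: consecutive results needed to flip status.
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict current status

    def record(self, check_passed):
        """Feed one check result; return the (possibly updated) status."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current status
            return self.healthy
        self._streak += 1
        limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= limit:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


def routable(trackers):
    """Return the IDs of instances that should currently receive traffic."""
    return [instance_id for instance_id, t in trackers.items() if t.healthy]
```

Note that record() keeps being called for every instance, whether it is currently marked healthy or unhealthy. That is what allows a recovered instance to rejoin the pool without manual intervention.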

    Scaling Out the Web Layer

    Once you create the Elastic Load Balancer, you register your individual EC2 instances (Web Servers) with this load balancer by specifying the instance ID of each of those instances. This way the ELB knows every instance to which incoming traffic can be routed.

    You can then reconfigure the ELB to add or remove EC2 instances as needed, without having to shut down the ELB and without disrupting the normal flow of your application traffic. This change is completely transparent to your clients, and it facilitates on-demand scaling of your application without the need for any downtime.

    Intelligent Traffic Scheduling

    When a request reaches the load balancer, it needs to determine which upstream EC2 instance this request needs to be forwarded to. In the case of HTTP traffic, the load balancer uses a routing algorithm called least outstanding requests. This algorithm favors EC2 instances which presently have the least number of outstanding HTTP requests. Using this technique the load balancer is able to avoid overwhelming only a few EC2 instances and keep the workloads more balanced across your upstream servers.

    Instead of defining static weights to each upstream server, this load balancing technique makes its decisions on the real-time load on your upstream servers.
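The idea behind least outstanding requests can be sketched in a few lines. This is a simplified, single-threaded illustration under our own assumptions (the class and method names are hypothetical, and ties are broken by registration order). It is not the ELB's actual implementation.

```python
class LeastOutstandingBalancer:
    """Routes each request to the instance with the fewest in-flight requests."""

    def __init__(self, instance_ids):
        # In-flight request count per upstream instance.
        self.outstanding = {instance_id: 0 for instance_id in instance_ids}

    def acquire(self):
        """Pick an instance for a new request and count it as in flight."""
        instance_id = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[instance_id] += 1
        return instance_id

    def release(self, instance_id):
        """Call once the instance has finished responding."""
        self.outstanding[instance_id] -= 1
```

Because the counters only drop when a response completes, a slow instance naturally accumulates outstanding requests and receives fewer new ones until it catches up.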

    Achieving Fault Tolerance with Multiple Availability Zones

    You can create your EC2 Instances across multiple availability zones within an AWS region and then register them with the ELB.

    Multiple Availability Zones for the Load Balancer

    Figure-4: Distribution of traffic across multiple availability zones.

    Each availability zone is an independent data center with its own infrastructure. Availability Zones are engineered such that one Zone does not share infrastructure (such as power generators, cooling equipment, primary network connectivity) with other zones.

    Moreover, each Zone is located at a physically separate location, so natural calamities are unlikely to impact multiple availability zones at the same time. Zones are connected with each other via very low-latency network links.

    ELB will then route the inbound traffic across all these EC2 instances (spread across these multiple availability zones) and thus achieve a higher degree of fault tolerance in your application.

    By default, the ELB will route traffic equally to each of the enabled availability zones. It is hence recommended to have an equivalent number of EC2 instances in each of the availability zones that you plan to use.

    For example, if you had 12 EC2 instances in us-east-1a and only 6 EC2 instances in us-east-1b, both zones will still receive an equal amount of traffic from the ELB. The EC2 instances in us-east-1b are likely to be overwhelmed, while you will have underutilized capacity in us-east-1a.
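The arithmetic behind this example is easy to check. The sketch below assumes traffic is split equally across zones and then equally across the instances within each zone. The function name and the total request rate are hypothetical.

```python
def per_instance_load(total_rps, instances_per_zone):
    """Split traffic equally across zones, then equally within each zone."""
    zone_share = total_rps / len(instances_per_zone)
    return {
        zone: zone_share / count
        for zone, count in instances_per_zone.items()
    }


# Hypothetical total of 3,600 requests per second across two zones.
load = per_instance_load(3600, {"us-east-1a": 12, "us-east-1b": 6})
```

With these numbers, each instance in us-east-1a handles 150 requests per second, while each instance in us-east-1b handles 300, which is twice the load for the same kind of server.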

    If all the EC2 instances in a particular Availability Zone become unhealthy (say, due to a data center outage), ELB will automatically route the traffic to the healthy EC2 instances in your other availability zones. It is hence important to ensure that you have a sufficient number of healthy EC2 instances in your other availability zones to handle the additional traffic that now gets routed towards them.

    It is also important to note that each ELB is tied to a specific region only and it cannot distribute traffic across two different AWS Regions.

    Scaling-out the Load Balancer Itself

    While the ELB itself may seem like a single monolithic component at first glance, in reality, there are multiple load balancer instances which service your application’s traffic.

    This is necessary so that the ELB can scale-out its own request handling capacity and provide for a higher availability of the load balancer itself.

    For example, when you enable additional availability zones for your ELB, the ELB automatically creates an instance of the load balancer in each of those availability zones for you.

    The controller service within the ELB monitors these load balancer instances and it automatically adds or removes load balancer instances based on your present traffic needs.

    Here is how it works:

    • ELB Spawns Additional Load Balancing Instances: If your traffic has gone up significantly in the recent past, the ELB decides to add more load balancer instances. It then automatically spawns these new load balancer instances, each with its own IP address.

    Scaling out to Multiple Load Balancer Instances

    Figure-5: Multiple instances of the load balancer itself.

    • ELB Updates the DNS Name Server: The ELB now automatically updates the Amazon name server records to map these additional IP addresses, of the new load balancer instances, to that same domain name. (Note: The DNS entry for your load balancer such as is controlled by Amazon name servers).
    • Client Attempts a DNS Lookup: When a client attempts to access your application, it tries to resolve the domain name of your load balancer. As part of the DNS resolution process, the Amazon name servers return a set of IP addresses to the client (These are the IP addresses of multiple load balancer instances).
    • Client Uses DNS Round Robin: The client uses DNS round-robin to make a request to one of those IP addresses. The traffic thus hits one of the load balancer instances. (Note: This will be one of the load balancer instances within one of your availability zones).

    DNS Lookup for Multiple Load Balancer Instances

    Figure-6: DNS Round Robin to distribute traffic.

    • Load Balancer Routes Request Upstream: This load balancer instance in turn forwards the request to a healthy upstream EC2 instance within that availability zone.

    The ELB automatically configures and manages the DNS records for your load balancer and sets a low TTL value (only 60 seconds) for those records.

    So if new load balancer instances get automatically provisioned or some instances are automatically shut down by the ELB, clients can quickly receive this ‘remapped’ IP information: The older DNS records will have expired, and clients are forced to do fresh DNS lookups.

    It is worthwhile to note that clients only receive IP information about the load balancer instances during DNS resolution – they do not receive any IP information about the upstream EC2 instances. Those EC2 instances are opaque to your clients.
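The interplay between the 60-second TTL and round-robin selection can be sketched from the client's point of view. This is a simplified model under our own assumptions. Real resolvers and operating systems cache and rotate answers in their own ways, and the class name, the lookup callable and the injectable clock are hypothetical.

```python
import time


class CachingDnsClient:
    """Caches a DNS answer for ttl_seconds and rotates through its IPs."""

    def __init__(self, lookup, ttl_seconds=60, clock=time.monotonic):
        self.lookup = lookup            # callable returning the current IP list
        self.ttl_seconds = ttl_seconds  # 60 seconds, as described in the text
        self.clock = clock              # injectable so expiry can be simulated
        self._ips = []
        self._expires_at = None
        self._next = 0

    def pick_ip(self):
        """Return the next load balancer IP, refreshing the cache on expiry."""
        now = self.clock()
        if self._expires_at is None or now >= self._expires_at:
            self._ips = list(self.lookup())  # fresh lookup after expiry
            self._expires_at = now + self.ttl_seconds
            self._next = 0
        ip = self._ips[self._next % len(self._ips)]  # round-robin selection
        self._next += 1
        return ip
```

Once the cached answer expires, the next call performs a fresh lookup, so a newly added load balancer instance starts receiving traffic from this client within one TTL period.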

    Security Features of the ELB

SSL Offloading

Processing HTTPS-SSL traffic is a highly CPU-intensive task. The ELB offers SSL termination for all your inbound HTTPS-SSL traffic and performs the necessary encryption-decryption of your application traffic in real time.

By offloading this job to the ELB, your upstream EC2 servers are free to focus on their primary task (i.e. executing your business logic). This effectively increases the capacity of your upstream Web servers to do real work.


Figure-7: SSL Termination at the Load Balancer

To enable the ELB to offload your SSL connections, you configure it with the following information:

  • Your application’s private key.
  • Your application’s public certificate (Signed by a certifying authority, of course).
  • A certificate-chain containing the intermediate certificates and the root certificate itself.

Secured Key Management

Your PKI certificates are now managed centrally within the ELB instead of being scattered across each of your EC2 instances. This also makes PKI key-management and security-compliance much easier.

Additional Ingress Security

By default, EC2 instances have a public IP address and accept ingress traffic from any remote client directly.

With the ELB, you can ‘lock down’ the traffic between your load balancer and your EC2 instances, so that your EC2 instances accept ingress HTTP traffic exclusively from the ELB. This is done by applying additional Security Groups to your EC2 instances.

This prevents malicious clients from directly hitting your EC2 instances and brings an additional layer of security for your Web Servers.

Exploring the ELB Management Tools

ELB offers multiple ways to configure and manage itself such as the following:

AWS Management Console: This is a point-and-click Web Application from where you manage all of the AWS features, including the ELB configuration itself.

AWS Command Line Tool: This is a shell based tool (Command Line Interface) which can be used on OSX, Windows, and Linux to manage your ELB. This is useful for manual use as well as for DevOps scripting and automation purposes.

REST and SOAP APIs: You can use ELB’s REST or SOAP APIs to programmatically manage or query your ELB.

AWS SDK: This SDK is a wrapper on top of these APIs and is available as client libraries for various programming languages such as Java, PHP, .NET and Ruby.

The most common ways of managing your ELB would be via the Web Management Console or via the Command Line Tool. The APIs or the SDK would be useful if you are trying to automate your SysOps or create your own user interface to manage the ELB.

Summary and Key Benefits

Elastic Scaling and Improved DevOps

Production SysOps become easier: (a) It is easier to scale up or down by adding or removing EC2 instances, without causing application downtime. (b) It is easier to roll out ‘incremental version updates’ across your EC2 instances without causing any downtime.

The ELB itself is engineered to scale as needed. You don’t need to spend or plan for excess capacity in advance. With a few intelligent scripts and monitoring controls, you can build an App infrastructure that seamlessly scales as required.

Improved Security

The ELB becomes the front-end component for your application infrastructure; everything else is tucked-away in the backend. Your individual App Servers are not exposed to your clients, thus making them less vulnerable to exploits. Having one place to manage your SSL certificates and private keys makes it easier from a security point of view.

Higher Application Performance

Having the ELB manage your HTTPS/SSL termination enhances the performance of your other EC2 instances, as this CPU-intensive task is now delegated to the ELB. This could improve the response latency of your App Servers.

Simplified DNS Management

As the ELB is the only component exposed to your client devices, you don’t really need to create or manage public domain names for all the EC2 instances in your deployment. As you add or remove EC2 instances, you simply have to manage them in your ELB without needing DNS updates for all those EC2 instances.

Pay Per Use

The ELB offers a pay-per-use model and offers a free-usage tier as well. You pay for each pro-rated hour of the ELB service utilized, plus a fee for each gigabyte of traffic. Your ELB costs per-month depend on your application’s actual traffic in that month (and thus, the success of your online business). Just like other IaaS offerings, you’ve converted your infra Capex to Opex here.
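The billing model described above boils down to a two-term formula. Here is a minimal sketch of it. The function name is hypothetical, and any rates you pass in are your own inputs. No actual AWS prices are assumed here, and the free-usage tier is not modelled.

```python
def monthly_elb_cost(hours, gigabytes, hourly_rate, per_gb_rate):
    """Estimate a monthly bill: time-based charge plus traffic-based charge."""
    time_charge = hours * hourly_rate         # pro-rated hours the ELB ran
    traffic_charge = gigabytes * per_gb_rate  # data processed by the ELB
    return time_charge + traffic_charge


# Placeholder rates purely for illustration; check current AWS pricing.
estimate = monthly_elb_cost(hours=720, gigabytes=1000,
                            hourly_rate=0.025, per_gb_rate=0.008)
```

The first term is fixed for as long as the load balancer exists, while the second grows with your application's traffic, which is why the monthly cost tracks actual usage.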
