Load balancers are used to support scaling and high availability for applications and services. A load balancer is primarily composed of three components: a frontend, a backend, and routing rules. Requests coming to the frontend of a load balancer are distributed, based on routing rules, to the backend, where we place multiple instances of a service. This can serve performance, where we distribute traffic equally between endpoints in the backend, or high availability, where multiple instances of a service increase the chance that at least one endpoint is available at all times.
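The distribution described above can be sketched in a few lines of Python. This is only an illustrative model, not Azure's actual implementation, and the endpoint names are invented for the example:

```python
from itertools import cycle

# Hypothetical backend pool: two instances of the same service.
backend_pool = ["vm-web-1", "vm-web-2"]

# A minimal frontend that spreads requests evenly (round-robin) across
# the backend. Azure actually uses a hash-based algorithm, but the
# effect on evenly arriving traffic is similar.
_next_endpoint = cycle(backend_pool)

def handle_request(request: str) -> str:
    """Route a request arriving at the frontend to a backend endpoint."""
    return next(_next_endpoint)

# Four requests are spread evenly across the two endpoints.
targets = [handle_request(f"req-{i}") for i in range(4)]
print(targets)  # ['vm-web-1', 'vm-web-2', 'vm-web-1', 'vm-web-2']
```

If one endpoint fails, the other can still answer, which is the high-availability half of the story; the later recipes on health probes and rules show how unavailable endpoints are detected and skipped.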
We will cover the following recipes in this chapter:
For this chapter, an Azure subscription is required.
The code samples can be found at https://github.com/PacktPublishing/Azure-Networking-Cookbook-Second-Edition/tree/master/Chapter10.
Microsoft Azure supports two types of load balancers: internal and public. An internal load balancer is assigned a private IP address (from the address range of the subnets in the virtual network) as its frontend IP address, and it targets the private IP addresses of our services (usually, Azure virtual machines (VMs)) in the backend. An internal load balancer is usually used by services that are not internet-facing and are accessed only from within our virtual network.
Before you start, open the browser and go to the Azure portal via https://portal.azure.com.
In order to create a new internal load balancer with the Azure portal, we must use the following steps:
Figure 10.1: Creating a new internal load balancer
An internal load balancer is assigned a private IP address, and all requests coming to the frontend of an internal load balancer must target that private address. This limits incoming traffic to sources within the virtual network associated with the load balancer; traffic can also arrive from other networks (other virtual networks or local networks) if some kind of virtual private network (VPN) is in place. The traffic coming to the frontend of the internal load balancer is then distributed across the endpoints in the backend of the load balancer. Internal load balancers are usually used for services that are not placed in a demilitarized zone (DMZ) (and are therefore not accessible over the internet), but rather for middle- or back-tier services in a multi-tier application architecture.
We also need to keep in mind the differences between the Basic and Standard SKUs. The main differences are performance (better with the Standard SKU) and the SLA (Standard has an SLA guaranteeing 99.99% availability, while Basic has no SLA). Also note that the Standard SKU requires a Network Security Group (NSG): if an NSG is not present on the subnet or on the network interface (NIC) of the VM in the backend, traffic will not be allowed to reach its target. For more information on load balancer SKUs, see https://docs.microsoft.com/azure/load-balancer/skus.
The second type of load balancer in Azure is a public load balancer. The main difference is that a public load balancer is assigned a public IP address in the frontend, and all requests come over the internet. The requests are then distributed to the endpoints in the backend.
Before you start, open the browser and go to the Azure portal via https://portal.azure.com.
In order to create a new public load balancer with the Azure portal, we must follow these steps:
Figure 10.2: Creating a new public load balancer
The public load balancer is assigned a public IP address in the frontend. Therefore, all requests coming to the public load balancer will come over the internet, targeting the load balancer's public IP address. Requests are then distributed to endpoints in the backend of the load balancer. What's interesting is that the public load balancer does not target the public IP addresses in the backend, but private IP addresses instead. For example, let's say that we have one public load balancer with two Azure VMs in the backend. Traffic coming to the public IP address of the load balancer will then be distributed to VMs but will target the VMs' private IP addresses.
Public load balancers are used for public-facing services, most commonly for web servers.
After the load balancer is created, whether internal or public, we need to configure it further in order to start using it. During the creation process, we define the frontend of the load balancer, so we know where traffic needs to go to reach the load balancer. But, in order to define where that traffic needs to go after reaching the load balancer, we must first define a backend pool.
Before you start, open the browser and go to the Azure portal via https://portal.azure.com.
In order to create the backend pool, we must do the following:
Figure 10.3: Adding a new backend pool
Figure 10.4: Additional information for adding the backend pool
Figure 10.5: Adding VMs to the backend pool
Figure 10.6: List of VMs for creating a backend pool
Figure 10.7: The backend pool list
The two main components of any load balancer are the frontend and the backend. The frontend defines the endpoint of the load balancer, and the backend defines where the traffic needs to go after reaching the load balancer. While the frontend is defined as part of creating the load balancer, we must define the backend ourselves; once we do, traffic will be evenly distributed across the endpoints in the backend. The available options for the backend pool are VMs and VM scale sets.
More information on VMs, availability sets, and VM scale sets is available in my book, Hands-On Cloud Administration in Azure, published by Packt at https://www.packtpub.com/virtualization-and-cloud/hands-cloud-administration-azure.
After the frontend and the backend of the load balancer are defined, traffic is evenly distributed among endpoints in the backend. But what if one of the endpoints becomes unavailable? In that case, some requests will fail until we detect the issue, or fail indefinitely if the issue remains undetected, because the load balancer would keep distributing requests across all the endpoints defined in the backend pool, and any request directed to the unavailable server would fail.
This is why we introduce the next two components in the load balancer—health probes and rules. These components are used to detect issues and define what to do when issues are detected.
Health probes constantly monitor all endpoints defined in the backend pool and detect if any of them become unavailable. They do this by sending a probe over the configured protocol and listening for a response. If an HTTP probe is configured, an HTTP 200 OK response is required for the endpoint to be considered healthy.
Before you start, open the browser and go to the Azure portal via https://portal.azure.com.
To create a new health probe in the load balancer, we must do the following:
Figure 10.8: Adding a new health probe
Figure 10.9: Providing health probe information
After we define the health probe, it will be used to monitor the endpoints in the backend pool. We define the protocol and the port so that the probe checks the specific service we are interested in and tells us whether it is available. Monitoring only the state of the server would not be enough, as it could be misleading: the server could be running and available while the IIS or SQL Server instance that we use is down. By probing the protocol and port, we detect changes in the service itself, not only whether the server is running. The interval defines how often a check is performed, and the unhealthy threshold defines after how many consecutive failures an endpoint is declared unavailable.
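The interplay between the probe result and the unhealthy threshold can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not Azure's actual probe implementation, and the sample status codes are invented:

```python
def probe_is_healthy(status_code: int) -> bool:
    # For an HTTP probe, only a 200 OK response counts as success.
    return status_code == 200

def endpoint_available(responses: list[int], unhealthy_threshold: int) -> bool:
    """Declare an endpoint unavailable after `unhealthy_threshold`
    consecutive failed probes; a single success resets the counter."""
    consecutive_failures = 0
    for status in responses:  # one entry per probe interval
        if probe_is_healthy(status):
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= unhealthy_threshold:
                return False
    return True

# Isolated failures do not remove the endpoint from rotation...
still_up = endpoint_available([200, 500, 200, 500, 200], unhealthy_threshold=2)
# ...but consecutive failures reaching the threshold do.
taken_out = endpoint_available([200, 500, 500], unhealthy_threshold=2)
print(still_up, taken_out)  # True False
```

The interval setting corresponds to how often one entry in the `responses` list would be produced in practice.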
The last piece of the puzzle when speaking of Azure load balancers is the rule. Rules finally tie all things together and define which health probe (there can be more than one) will monitor which backend pool (more than one can be available). Furthermore, rules enable port mapping from the frontend of a load balancer to the backend pool, defining how ports relate and how incoming traffic is forwarded to the backend.
Before you start, open your browser and go to the Azure portal via https://portal.azure.com.
In order to create a load balancer rule, we must do the following:
Figure 10.10: Adding load balancing rules
Figure 10.11: Configuring load balancing rules
The load balancer rule is the final piece that ties all the components together. We define which frontend IP address is used and which backend pool the traffic will be forwarded to. The health probe is assigned to monitor the endpoints in the backend pool and to keep track of any unresponsive endpoints. We also create a port mapping that determines which protocol and port the load balancer listens on and, when traffic arrives, where it is forwarded.
As its default distribution mode, Azure Load Balancer uses a five-tuple hash (source IP, source port, destination IP, destination port, and protocol type). If we change the session persistence to Client IP, the distribution will be two-tuple (requests from the same client IP address will be handled by the same VM). Changing session persistence to Client IP and protocol will change the distribution to three-tuple (requests from the same client IP address and protocol combination will be handled by the same VM).
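The difference between the distribution modes can be illustrated by hashing different subsets of the tuple. This sketch uses SHA-256 purely for illustration (Azure's hash function is internal), and the IP addresses and VM names are invented:

```python
import hashlib

backend_pool = ["vm-a", "vm-b", "vm-c"]

def pick_endpoint(*tuple_fields: str) -> str:
    """Map a hash of the given tuple fields to one backend endpoint."""
    digest = hashlib.sha256("|".join(tuple_fields).encode()).digest()
    return backend_pool[int.from_bytes(digest[:4], "big") % len(backend_pool)]

# Five-tuple (default): the source port differs per connection, so two
# connections from the same client may land on different VMs.
conn1 = pick_endpoint("203.0.113.7", "50001", "10.0.0.4", "80", "TCP")
conn2 = pick_endpoint("203.0.113.7", "50002", "10.0.0.4", "80", "TCP")

# Two-tuple (Client IP persistence): only source and destination IP are
# hashed, so every connection from that client hits the same VM.
sticky1 = pick_endpoint("203.0.113.7", "10.0.0.4")
sticky2 = pick_endpoint("203.0.113.7", "10.0.0.4")
print(sticky1 == sticky2)  # True
```

The three-tuple mode would simply add the protocol as a third hashed field, so the same client sticks to one VM per protocol.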
Inbound Network Address Translation (NAT) rules are an optional setting in Azure Load Balancer. These rules essentially create another port mapping from the frontend to the backend, forwarding traffic from a specific port on the frontend to a specific port in the backend. The difference between inbound NAT rules and the port mapping in load balancer rules is that an inbound NAT rule forwards traffic directly to a specific VM, whereas a load balancer rule forwards traffic to a backend pool.
Before you start, open the browser and go to the Azure portal via https://portal.azure.com.
In order to create a new inbound NAT rule, we must do the following:
Figure 10.12: Adding an inbound NAT rule for an existing load balancer
Figure 10.13: Configuring the inbound NAT rule settings
Inbound NAT rules enable you to use the public IP of the load balancer to connect directly to a specific backend instance. They create a port mapping similar to the one created by load balancer rules, but targeting a single backend instance. A load balancer rule carries additional settings, such as the health probe or session persistence; inbound NAT rules omit these settings and create an unconditional mapping from the frontend to the backend. With an inbound NAT rule, forwarded traffic will always reach the single server in the backend, whereas a load balancer rule forwards traffic to the backend pool and uses its hash-based distribution to route each connection to any of the healthy servers in the pool.
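The two kinds of mapping can be contrasted in a small model. The ports, VM names, and rule shapes below are invented for the example; this is a sketch of the routing decision, not Azure's implementation:

```python
backend_pool = ["vm-1", "vm-2", "vm-3"]

# A load balancer rule maps a frontend port to the whole pool; any
# healthy instance may receive a given connection.
lb_rule = {"frontend_port": 80, "backend_port": 80, "targets": backend_pool}

# Inbound NAT rules map a frontend port to exactly one instance,
# unconditionally: no health probe, no session persistence.
nat_rules = {
    50001: ("vm-1", 3389),  # e.g. RDP to the first VM
    50002: ("vm-2", 3389),  # e.g. RDP to the second VM
}

def possible_targets(frontend_port: int) -> list[tuple[str, int]]:
    """Return every (vm, port) pair that traffic on this port may reach."""
    if frontend_port in nat_rules:
        return [nat_rules[frontend_port]]  # always exactly one target
    if frontend_port == lb_rule["frontend_port"]:
        return [(vm, lb_rule["backend_port"]) for vm in lb_rule["targets"]]
    return []

print(possible_targets(50001))  # [('vm-1', 3389)] - direct to a single VM
print(possible_targets(80))     # three candidates, one chosen per connection
```

This is why inbound NAT rules are the usual way to reach an individual VM (for example, for management traffic) behind a shared public IP.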
When creating load balancing rules, we can create implicit outbound rules. This will enable Source Network Address Translation (SNAT) for VMs in the backend pool and allow them to access the internet over the load balancer's public IP address (specified in the rule). But in some scenarios, implicit rules are not enough and we need to create explicit outbound rules. Explicit outbound rules (and SNAT in general) are available only for public load balancers with the Standard SKU.
Before we begin, make sure that implicit outbound rules are disabled from load balancing rules:
Figure 10.14: Disabling implicit outbound rules
Now, open the browser and go to the Azure portal via https://portal.azure.com.
In order to create a load balancer rule, we must do the following:
Figure 10.15: Adding outbound rules
Figure 10.16: The outbound rule pane
Outbound rules depend on three things: frontend IP addresses, instances in the backend pool, and connections. Each frontend IP address provides a limited number of ports for connections, so the more IP addresses assigned to the frontend, the more connections are allowed. Conversely, the number of connections allowed per backend instance decreases as the number of instances in the backend grows.
If we keep the default number of outbound ports, allocation is done automatically and we have no control over the result. For example, with a VM scale set in the backend, ports are allocated automatically to each VM in the scale set, and if the number of instances in the scale set increases, the number of ports allocated to each VM drops in turn.
To avoid this, we can set port allocation to manual and either limit the number of instances allowed or set the number of ports per instance. This ensures that each VM has a guaranteed number of dedicated ports and that connections will not be dropped.
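The trade-off can be expressed as simple arithmetic. The figure of roughly 64,000 SNAT ports per frontend IP is an assumption based on the usable ephemeral port range (the exact number and Azure's automatic allocation tiers are documented separately); the function below is only a sketch of the inverse relationship:

```python
PORTS_PER_FRONTEND_IP = 64_000  # assumed usable SNAT ports per frontend IP

def ports_per_instance(frontend_ips: int, backend_instances: int) -> int:
    """Evenly split the available SNAT ports across backend instances."""
    return (PORTS_PER_FRONTEND_IP * frontend_ips) // backend_instances

# More frontend IPs means more ports per VM; more backend VMs means fewer.
print(ports_per_instance(frontend_ips=1, backend_instances=10))   # 6400
print(ports_per_instance(frontend_ips=2, backend_instances=10))   # 12800
print(ports_per_instance(frontend_ips=1, backend_instances=100))  # 640
```

Manual allocation effectively fixes the output of this function per instance, so scaling out the backend cannot silently shrink each VM's port budget below what its connection load requires.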