Describing the service mesh concept

Services ought to be meshed to be versatile, robust, and resilient in their interactions. In an ever-growing microservices world, service mesh enablement through automated toolkits is widely recommended, and a number of service mesh solutions are becoming critical for producing and sustaining both cloud-native and cloud-enabled applications. Microservices are turning out to be the most competent building blocks and units of deployment for enterprise-grade business applications. Because of the seamless convergence of containers and microservices, the activities of continuous integration, delivery, and deployment get simplified and sped up. As described previously, the Kubernetes platform comes in handy for automating container life cycle management tasks. It is therefore clear that the combination of microservices, containers, and Kubernetes (the market-leading container clustering, orchestration, and management platform) works well toward operations automation and optimization. Not only infrastructure optimization but also the various non-functional requirements of applications are easily attained. This combination also enables faster and more frequent software deployments in order to keep pace with customer, user, business, and technology changes.

However, there are still some gaps in ensuring the mandated service resiliency. It is widely insisted that the reliability of business applications and IT services (platform and infrastructure) has to be guaranteed to boost cloud adoption. The infrastructure's elasticity and service resiliency together heighten the reliability of software applications. It is widely believed that resilient services collectively result in reliable applications. The underlying infrastructural modules also have to contribute immensely toward guaranteeing application reliability. There are several techniques for enhancing the reliability of cloud infrastructures. The clustering of IT resources such as bare-metal (BM) servers, virtual machines (VMs), and containers is one widely accepted approach toward IT reliability.

With the faster proliferation of virtual machines and containers in our data centers, the aspect of auto-scaling is becoming a reality. Then there are powerful techniques such as replication, partitioning, isolation, and sharing, which contribute immensely to the higher availability of IT services and business applications. Distributed computing has come as a blessing for ensuring high availability. However, the distributed nature of IT systems and business service components brings its own issues. Remote calls over fragile networks are troublesome, and predictability is greatly lost when distributed and disparate systems and services are leveraged to accomplish business goals. Thus, we need a fresh approach to solving the service resiliency problem for the ensuing service-oriented cloud era. As indicated previously, service resiliency leads to reliable applications. Cloud infrastructures are being continuously upgraded through a host of pioneering reliability-enablement technologies to realize the IT reliability vision.

A service mesh is an additional abstraction layer that manages the communication among microservices. In traditional applications, the networking and communication logic is built into the application code. Now, we tend toward microservices, which focus on business logic alone; all other associated actions are separated out and presented as horizontal, utility services. Partitioning legacy applications into a dynamic pool of fine-grained, single-responsibility services guarantees a number of benefits for service providers and consumers. In traditional applications, the resiliency logic is built directly into the application itself. That is, retries and timeouts, monitoring/visibility, tracing, service discovery, and so on are all hard-coded into the application. However, as application architectures become increasingly segmented into refined, polyglot, and micro-scale services, it is paramount to move the communication logic out of the application and into the underlying infrastructure.

In short, a service mesh architecture uses a proxy called a sidecar container that is attached to every container orchestration pod or Docker host/node. This proxy can then be attached to centralized control plane software, which ceaselessly gathers all kinds of operational data (such as fine-grained network telemetry), applies network management policies or proxy configuration changes, and establishes and enforces network security policies. Other features include dynamic service discovery, load balancing, timeouts, fallbacks, retries, circuit breaking, distributed tracing, and security policy enforcement between services.

A service mesh solution typically provides several critical features to multi-service applications running at scale. Resiliency patterns such as retry, timeout, circuit breaker, failure- and latency-aware load balancing, and distributed tracing are optimally implemented and innately built into service mesh solutions. There are distributed tracing tools such as Zipkin (a distributed tracing system that helps gather the timing data needed to troubleshoot latency problems in microservices environments) and OpenTracing. Service mesh solutions also provide top-line service metrics such as success rates, request volumes, and latencies. In addition, they perform failure- and latency-aware load balancing in order to route around slow or broken service instances.
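The retry and timeout patterns mentioned above can be illustrated with a minimal Python sketch. This is not the implementation used by any particular mesh; the function names (`call_with_retries`, `flaky_service`) and parameter defaults are purely illustrative:

```python
import time

def call_with_retries(fn, retries=3, per_try_timeout=1.0, backoff=0.1):
    """Invoke fn, retrying on failure.  A real sidecar proxy enforces a
    per-attempt deadline on the wire; here we simply check the attempt's
    elapsed wall-clock time after the call returns."""
    last_error = None
    for attempt in range(retries):
        start = time.monotonic()
        try:
            result = fn()
            if time.monotonic() - start > per_try_timeout:
                raise TimeoutError(f"attempt {attempt + 1} too slow")
            return result
        except Exception as exc:
            last_error = exc
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise last_error

# A flaky backend that fails twice before succeeding.
calls = {"n": 0}
def flaky_service():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"
```

With these defaults, the third attempt succeeds, so the caller never sees the two transient failures; this is exactly the kind of logic that a mesh removes from application code.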

Kubernetes already has a very basic service mesh capability out of the box: the Service resource, which provides service discovery by targeting the necessary pods. The famous round-robin method is leveraged for balancing service requests. A Service works by managing the iptables rules on each host in the cluster. However, this does not support the other key features of a typical service mesh solution. By implementing one of the fully featured service mesh systems (Istio, Linkerd, or Conduit) in the cluster, the following capabilities can be obtained easily.
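Before turning to those richer capabilities, the round-robin behavior of the built-in Service resource can be pictured with a toy sketch. The class and the pod names (`pod-a`, `pod-b`, `pod-c`) are invented for illustration; a real Service does this at the iptables level, not in application code:

```python
import itertools

class RoundRobinService:
    """Minimal analogue of a Kubernetes Service: one stable name
    in front of a rotating set of pod endpoints."""
    def __init__(self, endpoints):
        self._cycle = itertools.cycle(endpoints)

    def pick(self):
        # Each lookup returns the next pod, round-robin style.
        return next(self._cycle)

svc = RoundRobinService(["pod-a", "pod-b", "pod-c"])
picks = [svc.pick() for _ in range(6)]
```

Six consecutive lookups walk the endpoint list twice, regardless of how loaded or slow any individual pod is, which is precisely the limitation that latency-aware balancing addresses.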

Service mesh solutions allow services to talk plain HTTP, without having to deal with HTTPS at the application layer. The service mesh proxies manage TLS origination on the sender side and TLS termination on the receiving side. That is, application components can use plain HTTP, gRPC, or any other protocol without bothering about encryption in transit:

  • The service mesh proxy knows which services are allowed to be accessed and used.
  • It supports the circuit breaker pattern and other resiliency-enablement patterns, such as retries and timeouts.
  • It enables latency-aware load balancing. Traditionally, round-robin load balancing is used, which unfortunately does not take the latency of each target into consideration. A fully featured service mesh balances the load according to the response times of each backend target.
  • It is capable of queue-depth load balancing: new requests are routed to the least busy target, based on how many requests each target is currently processing. The service mesh knows the service request history.
  • The service mesh can route particular requests, marked by a selected HTTP header, to specific targets behind the load balancer. This makes it easy to do canary deployment testing.
  • The service mesh can do health checks and evict misbehaving targets.
  • The service mesh can report the request volume per target, latency metrics, and success and error rates.
  • The primary goal of a service mesh is to establish service communication resiliency. Service mesh solutions can integrate with the service registry to identify services dynamically. This integration helps in discovering and invoking the appropriate services toward task fulfilment.
  • Service security is also substantially enhanced, as service meshes can authenticate services so that only approved services can communicate with one another to implement business tasks.
  • Service monitoring is also activated and accomplished through service mesh solutions. If multiple services are chained together to fulfil a service request, then issue tracking and distributed tracing are greatly simplified through the end-to-end monitoring capability offered by standardized service meshes.
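Among the resiliency patterns listed above, the circuit breaker is perhaps the least obvious. A toy sketch, under the simplifying assumption that the circuit trips after a fixed number of consecutive failures and fails fast until a reset period elapses (real meshes use richer, configurable policies):

```python
import time

class CircuitBreaker:
    """Toy circuit breaker: after max_failures consecutive failures
    the circuit opens, and further calls fail fast until reset_after
    seconds have passed, at which point one trial call is allowed."""
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # half-open: allow a trial request
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip the circuit
            raise
        self.failures = 0            # any success resets the count
        return result
```

The point of failing fast is that a struggling backend is not hammered with further traffic, and callers get an immediate error instead of waiting on timeouts.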

Considering the growing number of microservices participating in business workloads and IT applications, the service mesh emerges as a mission-critical infrastructure solution. Service mesh solutions have become an important ingredient in supporting and sustaining microservices-centric applications and considerably improving their reliability. There are a few open source as well as commercial-grade service mesh solutions on the market. As it is an emerging concept, there will be substantial advancements in the future to strengthen service resiliency and robustness. The ingrained resiliency of services ultimately leads to reliable applications.

The service mesh layer efficiently and effectively handles service-to-service communication. Typically, a service mesh is implemented as a mesh of interconnected network proxies, and this arrangement is able to manage service traffic better. The service mesh idea has gained a lot of traction with the continued rise of the microservices architecture (MSA). The communication traffic in the ensuing era of MSA is going to be distinctly different. That is, service-to-service communication becomes the vital factor for application behavior at runtime. Traditionally, application functions occur locally as part of the same runtime. However, in the case of microservices, application functions occur through remote procedure calls (RPCs). Thus, the widely articulated deficiencies of distributed computing over unreliable wide area networks (WANs) go up considerably.

Programmers who are dealing with distributed systems understand the fallacies of distributed computing. The key fallacies are as follows:

  • The network is reliable
  • The latency is zero
  • The bandwidth is infinite
  • The network is secure
  • The topology doesn't change
  • There is one administrator
  • The transport cost is zero
  • The network is homogeneous

The promising and strategic approach to enhancing resiliency in the ubiquitous service era, in which geographically distributed and disparate microservices communicate with one another, is to embrace the service mesh method. The service mesh relieves application developers of this burden by pushing the responsibility for handling these concerns down into the infrastructure layer.

With a service mesh, services running on a container, pod, virtual machine (VM), or bare-metal (BM) server are configured to send their messages to a local proxy, which is installed as a sidecar module. That local proxy is designed to handle things like timeouts, retries, circuit breaking, encryption, the application of custom routing rules, and service discovery. All kinds of network monitoring and management activities are precisely performed by a service mesh. With the unprecedented acceptance of the service mesh concept, there are ongoing comparisons between service mesh solutions and other middleware solutions such as the Enterprise Service Bus (ESB), Enterprise Application Integration (EAI) hubs, and API gateways.

Service communication is made resilient through the incorporation of a service mesh solution, as portrayed in the following diagram:

As the preceding diagram shows, there are four service clusters (A-D); each service cluster consists of a service and its instances. Each service instance is empowered with a sidecar network proxy. All network traffic, using a variety of communication and data transmission protocols, flows from a service instance to other services via its local sidecar proxy. The local proxy takes care of all the needs of service communication, so most of the widely reported deficiencies of distributed computing can be overcome. We have already listed the widely agreed-upon fallacies of distributed computing, and the leverage of a service mesh goes a long way toward mitigating them.

The data plane: In a service mesh, there are two vital modules: the control plane and the data plane. The sidecar proxy constitutes the data plane, and its prominent functionalities are as follows:

  • Eventually consistent service discovery
  • Health checking
  • Routing
  • Load balancing
  • Authentication and authorization
  • Observability

All of these functionalities are the prime responsibility of the data plane of any service mesh solution. Precisely speaking, the sidecar proxy is the data plane. In other words, the data plane is squarely responsible for conditionally translating, forwarding, and observing every network packet that flows to and from services (client, as well as server). That is, the main responsibility of the data plane is to ensure that any service request is delivered from microservice A to microservice B in a reliable and secure manner.
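A drastically simplified sketch of a data-plane proxy makes this concrete: it resolves the target, forwards the call, and records observability metrics along the way. All of the names here (`SidecarProxy`, `registry`, `service-b`) are hypothetical, and real proxies of course work at the network level rather than on in-process callables:

```python
class SidecarProxy:
    """Toy data plane: discover the target, forward the request,
    and observe the outcome for every call that passes through."""
    def __init__(self, registry):
        self.registry = registry          # service name -> callable endpoint
        self.metrics = {"requests": 0, "errors": 0}

    def forward(self, service, payload):
        self.metrics["requests"] += 1
        endpoint = self.registry.get(service)   # service discovery
        if endpoint is None:
            self.metrics["errors"] += 1
            raise LookupError(f"unknown service: {service}")
        try:
            return endpoint(payload)             # routing/forwarding
        except Exception:
            self.metrics["errors"] += 1          # observability
            raise

registry = {"service-b": lambda p: p.upper()}
proxy = SidecarProxy(registry)
reply = proxy.forward("service-b", "hello")
```

Because every request passes through `forward`, the proxy can count, trace, and police traffic without the calling service containing any of that logic.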

The control plane: The network abstraction that the data plane (the sidecar proxy) of a service mesh provides is really magical. However, the proxy has to be supplied with the right and relevant details to route service request messages to the appropriate services and their instances. The proxy does not determine the set of available services on its own; the control plane supplies this information. Likewise, the settings for load balancing, timeouts, circuit breaking, and so on ought to be specified unambiguously in the control plane. The actual configuration of the data plane functionalities is done within the control plane. To use the TCP/IP analogy, the control plane is similar to configuring the switches and routers so that TCP/IP will work properly on top of them. In the service mesh, the control plane is responsible for configuring the network of sidecar proxies. The control plane functionalities include configuring the following things:

  • Routing
  • Load balancing
  • Circuit breaker/retry/timeout
  • Deployments
  • Service discovery

The control plane controls a set of distributed and stateless sidecar proxies; it manages and configures the proxies to correctly route traffic. Furthermore, in Istio, the control plane configures Mixer to enforce policies and collect telemetry. The following diagram shows the various components that make up both the data and control planes:
