Chapter 6. Hands-on Cluster Management, Failover, and Load Balancing

In Chapter 5, we had a quick introduction to cluster management and Linux containers. Let’s jump into using these things to solve issues with running microservices at scale. For reference, we’ll be using the microservice projects we developed in Chapters 2, 3, and 4 (Spring Boot, Dropwizard, and WildFly Swarm, respectively). The following steps can be accomplished with any of the three Java frameworks.

Getting Started

To package our microservice as a Docker image and eventually deploy it to Kubernetes, let’s navigate to our project (Spring Boot example in this case) and return to JBoss Forge. JBoss Forge has some plug-ins for making it easy to quickly add the Maven plug-ins we need to use:

$ cd hola-springboot
$ forge

Now let’s install a JBoss Forge addon:

[hola-springboot]$ addon-install \
--coordinate io.fabric8.forge:devops,2.2.148

***SUCCESS*** Addon io.fabric8.forge:devops,2.2.148 was
installed successfully.

Now let’s add the Maven plug-ins:

[hola-springboot]$ fabric8-setup
***SUCCESS*** Added Fabric8 Maven support with base Docker
image: fabric8/java-jboss-openjdk8-jdk:1.0.10. Added the
following Maven profiles [f8-build, f8-deploy,
f8-local-deploy] to make building the project easier,
e.g., mvn -Pf8-local-deploy

Let’s take a look at what the tooling did. If we open the pom.xml file, we see it added some properties:

<docker.assemblyDescriptorRef>
    artifact
</docker.assemblyDescriptorRef>
<docker.from>
    docker.io/fabric8/java-jboss-openjdk8-jdk:1.0.10
</docker.from>
<docker.image>
    fabric8/${project.artifactId}:${project.version}
</docker.image>
<docker.port.container.http>8080</docker.port.container.http>
<docker.port.container.jolokia>
    8778
</docker.port.container.jolokia>
<fabric8.iconRef>icons/spring-boot</fabric8.iconRef>
<fabric8.service.containerPort>
    8080
</fabric8.service.containerPort>
<fabric8.service.name>hola-springboot</fabric8.service.name>
<fabric8.service.port>80</fabric8.service.port>
<fabric8.service.type>LoadBalancer</fabric8.service.type>

It also added two Maven plug-ins: docker-maven-plugin and fabric8-maven-plugin:

   <plugin>
    <groupId>io.fabric8</groupId>
    <artifactId>docker-maven-plugin</artifactId>
    <version>0.14.2</version>
    <configuration>
      <images>
        <image>
          <name>${docker.image}</name>
          <build>
            <from>${docker.from}</from>
            <assembly>
              <basedir>/app</basedir>
              <descriptorRef>
                ${docker.assemblyDescriptorRef}
              </descriptorRef>
            </assembly>
            <env>
              <JAR>
                ${project.artifactId}-${project.version}.war
              </JAR>
              <JAVA_OPTIONS>
                -Djava.security.egd=/dev/./urandom
              </JAVA_OPTIONS>
            </env>
          </build>
        </image>
      </images>
    </configuration>
  </plugin>
  <plugin>
    <groupId>io.fabric8</groupId>
    <artifactId>fabric8-maven-plugin</artifactId>
    <version>2.2.100</version>
    <executions>
      <execution>
        <id>json</id>
        <phase>generate-resources</phase>
        <goals>
          <goal>json</goal>
        </goals>
      </execution>
      <execution>
        <id>attach</id>
        <phase>package</phase>
        <goals>
          <goal>attach</goal>
        </goals>
      </execution>
    </executions>
  </plugin>

Lastly, the tooling added some convenience Maven profiles:

f8-build

Build the docker image and Kubernetes manifest YML.

f8-deploy

Build the docker image and deploy to a remote docker registry; then deploy the application to Kubernetes.

f8-local-deploy

Build the docker image, generate the Kubernetes manifest.yml, and deploy to a locally running Kubernetes.

The JBoss Forge addon is part of the Fabric8 open source project. Fabric8 builds developer tooling for interacting with Docker, Kubernetes, and OpenShift, including Maven plug-ins, variable injection libraries for Spring/CDI, and clients for accessing the Kubernetes/OpenShift API. Fabric8 also builds API management, CI/CD, chaos monkey and Kubernetes-based NetflixOSS functionality on top of Kubernetes.

Packaging Our Microservice as a Docker Image

With the Maven plug-ins added from the previous step, all we have to do to build the Docker image is run the following Maven command. This step, and all others related to building Docker images or deploying to Kubernetes, assumes the CDK (covered earlier in this chapter) is up and running:

$ mvn -Pf8-build

[INFO] DOCKER> ... d3f157b39583 Pull complete
       ============= 10% ============ 20% ============
       30% ============ 40% ============ 50% =============
       60% ============ 70% ============ 80% ============
       90% ============ 100% =
[INFO] DOCKER> ... f5a6e0d26670 Pull complete
       = 100% ==
[INFO] DOCKER> ... 6d1f91fc8ac8 Pull complete
       = 100% ==
[INFO] DOCKER> ... 77c58da5314d Pull complete
       = 100% ==
[INFO] DOCKER> ... 1416b43aef4d Pull complete
       = 100% ==
[INFO] DOCKER> ... fcc736051e6e Pull complete
[INFO] DOCKER> ... Digest: sha256:e77380a4924bb599162e3382e6443e
8aa50c0
[INFO] DOCKER> ... Downloaded image for java-jboss-openjdk8-jdk:
1.0.10
[INFO] DOCKER> [fabric8/hola-springboot:1.0] : Built image 13e72
5c3c771
[INFO]
[INFO] fabric8-maven-plugin:2.2.100:json (default-cli) @ hola-
springboot
[INFO] Configured with file: /Users/ceposta/dev/sandbox/micro
services-by-example/source/spring-boot/hola-springboot/target
/classes/kubernetes.json
[INFO] Generated env mappings: {}
[INFO] Generated port mappings: {http=ContainerPort(container
Port=8080, hostIP=null, hostPort=null, name=http, protocol=
null, additionalProperties={}), jolokia=ContainerPort(
containerPort=8778, hostIP=null, hostPort=null, name=jolokia,
protocol=null, additionalProperties={})}
[INFO] Removed 'version' label from service selector for
service ``
[INFO] Generated ports: [ServicePort(name=null, nodePort=null,
port=80, protocol=TCP, targetPort=IntOrString(IntVal=8080,
Kind=null, StrVal=null, additionalProperties={}),
additionalProperties={})]
[INFO] Icon URL: img/icons/spring-boot.svg
[INFO] Looking at repo with directory /microservices-by-example
/.git
[INFO] Added environment annotations:
[INFO]     Service hola-springboot selector: {project=hola-
springboot,
[INFO]         provider=fabric8, group=com.redhat.examples}
ports: 80
[INFO]     ReplicationController hola-springboot replicas: 1,
[INFO]         image: fabric8/hola-springboot:1.0
[INFO] Template is now:
[INFO]     Service hola-springboot selector: {project=hola-
springboot,
[INFO]       provider=fabric8, group=com.redhat.examples}
ports: 80
[INFO]     ReplicationController hola-springboot replicas: 1,
[INFO]       image: fabric8/hola-springboot:1.0
[INFO] ------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------
[INFO] Total time: 04:22 min
[INFO] Finished at: 2016-03-31T15:59:58-07:00
[INFO] Final Memory: 47M/560M
[INFO] ------------------------------------------------------

Deploying to Kubernetes

If we have the Docker tooling installed, we should see that our microservice has been packaged in a Docker container:

$ docker images
REPOSITORY              TAG  IMAGE ID       CREATED  SIZE
fabric8/hola-springboot 1.0  13e725c3c771   3d ago   439.7 MB

We could start up the Docker container using docker run, but we want to deploy this into a cluster and leave the management of the microservice to Kubernetes. Let’s deploy it with the following Maven command:

$ mvn -Pf8-local-deploy

If your environment is configured correctly (i.e., you’ve started the CDK, installed the oc tooling, logged in with oc login, and created a new project with oc new-project microservices-book), you should see a successful build similar to this:

[INFO] --- fabric8-maven-plugin:apply (default-cli) @ hola-
springboot ---
[INFO] Using https://10.1.2.2:8443/ in namespace microservice-
book
[INFO] Kubernetes JSON: /Users/ceposta/dev/sandbox
[INFO]      /microservices-by-example/source/spring-boot/hola-
springboot
[INFO]      /target/classes/kubernetes.json
[INFO] OpenShift platform detected
[INFO] Using namespace: microservice-book
[INFO] Creating a Template from kubernetes.json namespace
[INFO]      microservice-book name hola-springboot
[INFO] Created Template: target/fabric8/applyJson/microservice-
book/
[INFO]      template-hola-springboot.json
[INFO] Looking at repo with directory /Users/ceposta/dev/
sandbox/
[INFO]      microservices-by-example/.git
[INFO] Creating a Service from kubernetes.json namespace
[INFO]      microservice-book name hola-springboot
[INFO] Created Service: target/fabric8/applyJson/microservice-
book
[INFO]      /service-hola-springboot.json
[INFO] Creating a ReplicationController from kubernetes.json
namespace
[INFO]      microservice-book name hola-springboot
[INFO] Created ReplicationController: target/fabric8/applyJson
[INFO]      /microservice-book/replicationcontroller-hola-
springboot.json
[INFO] Creating Route microservice-book:hola-springboot host:
[INFO] -------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] -------------------------------------------------------
[INFO] Total time: 19.101 s
[INFO] Finished at: 2016-04-04T09:05:02-07:00
[INFO] Final Memory: 52M/726M
[INFO] -------------------------------------------------------

Let’s take a quick look at what the fabric8-maven-plugin did for us.

First, Kubernetes exposes a REST API that allows us to manipulate the cluster (what’s deployed, how many, etc.). Kubernetes follows a “reconciliation of end state” model where you describe what you want your deployment to look like and Kubernetes makes it happen. This is similar to how some configuration management systems work, where you declaratively express what should be deployed and not how it should be accomplished. When we post data to the Kubernetes REST API, Kubernetes will reconcile what needs to happen inside the cluster. For example, if we want a pod running hola-springboot, we can make an HTTP POST to the REST API with a JSON/YAML manifest file, and Kubernetes will create the pod, create the Docker containers running inside that pod, and schedule the pod to one of the hosts in the cluster. A Kubernetes pod is an atomic unit that can be scheduled within a Kubernetes cluster. It typically consists of a single Docker container, but it can contain many Docker containers. In our code samples, we will treat our hello-world deployments as a single Docker container per pod.
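
The Fabric8 tooling mentioned earlier also includes a Java client for this same API (the kubernetes-client library). As a rough illustration of the idea — this is a hedged sketch, not part of the book’s sample projects, and the namespace name is an assumption based on the project we created earlier — the following lists the pods the API reports for our namespace:

import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ListPods {
    public static void main(String[] args) {
        // Connects with whatever credentials/context are configured in the
        // current environment (e.g., after oc login against the CDK).
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Ask the Kubernetes REST API for the pods in our namespace;
            // this is the same API the fabric8-maven-plugin posts manifests to.
            for (Pod pod : client.pods()
                    .inNamespace("microservices-book")
                    .list().getItems()) {
                System.out.println(pod.getMetadata().getName()
                        + " -> " + pod.getStatus().getPhase());
            }
        }
    }
}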

The fabric8-maven-plugin we used in the preceding code example will automatically generate the REST objects inside a JSON/YAML manifest file for us and POST this data to the Kubernetes API. After running the mvn -Pf8-local-deploy command successfully, we should be able to navigate to the web console (https://10.1.2.2:8443) or use the CLI tooling to see our new pod running our hola-springboot application:

$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-8xtdm   1/1       Running   0          3d

At this point we have a single pod running in our cluster. What advantage does Kubernetes bring as a cluster manager? Let’s start by exploring the first of many. Let’s kill the pod and see what happens:

$ oc delete pod/hola-springboot-8xtdm
pod "hola-springboot-8xtdm" deleted

Now let’s list our pods again:

$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-42p89   0/1       Running   0          3d

Wow! It’s still there! Or, more correctly, another pod has been created after we deleted the previous one. Kubernetes can start/stop/auto-restart your microservices for you. Can you imagine what a headache it would be to determine whether a service is started/stopped at any kind of scale? Let’s continue exploring some of the other valuable cluster management features Kubernetes brings to the table for managing microservices.

Scaling

One of the advantages of deploying in a microservices architecture is independent scalability. We should be able to easily scale the number of instances of a service in our cluster without having to worry about port conflicts, JVM or dependency mismatches, or what else is running on the same machine. With Kubernetes, these types of scaling concerns can be accomplished with the ReplicationController. Let’s see what replication controllers exist in our deployment:

$ oc get replicationcontroller
CONTROLLER        CONTAINER(S)      IMAGE(S)
hola-springboot   hola-springboot   fabric8/hola-springboot:1.0

SELECTOR
group=com.redhat.examples,project=hola-springboot,
provider=fabric8,version=1.0

REPLICAS   AGE
1          3d

We can also abbreviate the command:

$ oc get rc

One of the objects that the fabric8-maven-plugin created for us is the ReplicationController with a replica value of 1. This means we want to have one pod/instance of our microservice running at all times. If a pod dies (or gets deleted), then Kubernetes is charged with reconciling the desired state for us, which is replicas=1. If the cluster is not in the desired state, Kubernetes will take action to make sure the desired configuration is satisfied. What happens if we want to change the desired number of replicas and scale up our service?

$ oc scale rc hola-springboot --replicas=3
replicationcontroller "hola-springboot" scaled

Now if we list the pods, we should see three pods of our hola-springboot application:

$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-42p89   1/1       Running   0          3d
hola-springboot-9s6a6   1/1       Running   0          3d
hola-springboot-np2l1   1/1       Running   0          3d

Now if any of our pods dies or gets deleted, Kubernetes will do what it needs to do to make sure the replica count is 3. Notice, also, that we didn’t have to change ports on these services or do any unnatural port remapping. Each one of the services is listening on port 8080 and does not collide with the others.

Let’s go ahead and scale down to 0 to get ready for the next section; we can just run the same command:

$ oc scale rc hola-springboot --replicas=0

Kubernetes also has the ability to do autoscaling: by watching metrics like CPU and memory usage, or user-defined triggers, it can scale the number of replicas up or down. Autoscaling is outside the scope of this book but is a very valuable piece of the cluster-management puzzle.

Service Discovery

One last concept in Kubernetes that we should understand is Service. In Kubernetes, a Service is a simple abstraction that provides a level of indirection between a group of pods and an application using the service represented by that group of pods. We’ve seen how pods are managed by Kubernetes and can come and go. We’ve also seen how Kubernetes can easily scale up the number of instances of a particular service. In our example, we’re going to start our backend service from the previous chapters to play the role of service provider. How will our hola-springboot communicate with the backend?

Let’s run the backend service by navigating to the backend folder in our source code and deploying it to the Kubernetes cluster running locally in the CDK:

$ mvn -Pf8-local-deploy

Let’s take a look at what Kubernetes services exist:

$ oc get service
NAME              CLUSTER_IP      EXTERNAL_IP   PORT(S)
backend           172.30.231.63                 80/TCP
hola-springboot   172.30.202.59                 80/TCP

SELECTOR                                           AGE
component=backend,provider=fabric8                 3d
group=com.redhat.examples,project=hola-springboot,
provider=fabric8                                   3d

Note the Service objects get automatically created by the fabric8-maven-plugin just like the ReplicationController objects in the previous section. There are two interesting attributes of a service that appear in the preceding code example. One is the CLUSTER_IP. This is a virtual IP that is assigned when a Service object is created and never goes away. It’s a single, fixed IP that is available to any applications running within the Kubernetes cluster and can be used to talk to backend pods. The pods are “selected” with the SELECTOR field. Pods in Kubernetes can be “labeled” with whatever metadata you want to apply (like “version” or “component” or “team”) and can subsequently be used in the selector for a Service. In this example, we’re selecting all the pods with label component=backend and provider=fabric8. This means any pods that are “selected” by the selector can be reached just by using the cluster IP. No need for complicated distributed registries (e.g., Zookeeper, Consul, or Eureka) or anything like that. It’s all built right into Kubernetes. Cluster-level DNS is also built into Kubernetes. Using DNS in general for microservice service discovery can be very challenging and downright painful. In Kubernetes, the cluster DNS points to the cluster IP; and since the cluster IP is a fixed IP and doesn’t go away, there are no issues with DNS caching and other gremlins that can pop up with traditional DNS.
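
To make the cluster DNS idea concrete, here is a minimal, hedged sketch — not part of the sample projects — of what calling the backend by its service name might look like from code running inside the cluster. The /api/backend path and greeting query parameter follow the backend API we call later in Example 6-1; the service name backend and port 80 come from the oc get service output above:

import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;

public class BackendServiceCall {
    public static void main(String[] args) {
        // "backend" resolves through the cluster DNS to the Service's
        // fixed cluster IP; Kubernetes forwards the request to one of
        // the pods matched by the Service's selector.
        Client client = ClientBuilder.newClient();
        try {
            String response = client
                    .target("http://backend:80")   // service name and service port
                    .path("api").path("backend")
                    .queryParam("greeting", "Hola from inside the cluster")
                    .request()
                    .get(String.class);
            System.out.println(response);
        } finally {
            client.close();
        }
    }
}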

Let’s add a couple of environment variables to our hola-springboot project to use our backend service when running inside a Kubernetes cluster:

<fabric8.env.GREETING_BACKENDSERVICEHOST>
    backend
</fabric8.env.GREETING_BACKENDSERVICEHOST>
<fabric8.env.GREETING_BACKENDSERVICEPORT>
    80
</fabric8.env.GREETING_BACKENDSERVICEPORT>
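
Spring Boot’s relaxed binding is what maps these environment variables onto configuration properties: GREETING_BACKENDSERVICEHOST is resolved as greeting.backendServiceHost. A hypothetical properties class illustrating the mapping might look like the following (the class and field names are assumptions for illustration, not necessarily the exact code in the sample project):

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

@Component
@ConfigurationProperties(prefix = "greeting")
public class GreetingProperties {

    // Bound from greeting.backendServiceHost in application.properties,
    // or overridden at runtime by the GREETING_BACKENDSERVICEHOST
    // environment variable (relaxed binding).
    private String backendServiceHost;

    // Bound from greeting.backendServicePort / GREETING_BACKENDSERVICEPORT.
    private int backendServicePort;

    public String getBackendServiceHost() { return backendServiceHost; }

    public void setBackendServiceHost(String backendServiceHost) {
        this.backendServiceHost = backendServiceHost;
    }

    public int getBackendServicePort() { return backendServicePort; }

    public void setBackendServicePort(int backendServicePort) {
        this.backendServicePort = backendServicePort;
    }
}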

Let’s build the Kubernetes manifest and verify that we’re passing these environment variables to our pod. Note that Spring Boot resolves configuration from application.properties, but those values can be overridden with system properties and environment variables at runtime:

$ mvn fabric8:json

Inspect file target/classes/kubernetes.json:

  "containers" : [ {
    "args" : [ ],
    "command" : [ ],
    "env" : [ {
      "name" : "GREETING_BACKENDSERVICEHOST",
      "value" : "backend"
    }, {
      "name" : "GREETING_BACKENDSERVICEPORT",
      "value" : "80"
    }, {
      "name" : "KUBERNETES_NAMESPACE",
      "valueFrom" : {
        "fieldRef" : {
          "fieldPath" : "metadata.namespace"
        }
      }
    } ],

Let’s delete all of the pieces of the hola-springboot project and redeploy:

$ oc delete all -l project=hola-springboot
$ mvn -Pf8-local-deploy

We should now be able to list the pods and see our hola-springboot pod running as well as our backend service pod:

$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
backend-nk224           1/1       Running   5          3d
hola-springboot-r5ykr   1/1       Running   0          2m

Now, just to illustrate a handy debugging technique, we’re going to set up port forwarding between our local machine and our hola-springboot-r5ykr pod to verify that our service is working correctly and can call the backend. Let’s forward port 9000 on our local machine to port 8080 on the pod:

$ oc port-forward -p hola-springboot-r5ykr 9000:8080

We should now be able to communicate with the pod over our localhost port 9000. Note this technique works great even against a remote Kubernetes cluster, not just on our local CDK. So instead of having to find which host our pod is running on and ssh into it, we can just use oc port-forward to expose it locally.

So now we should be able to navigate locally using our browser or the command line:

$ curl http://localhost:9000/api/hola
Hola Spring Boot de 172.17.0.9

We can reach the /api/hola endpoint at http://localhost:9000 using our port forwarding! We also see that the /api/hola endpoint is returning the IP address of the pod in which it’s running. Let’s call the /api/greeting API, which is supposed to call our backend:

$ curl http://localhost:9000/api/greeting
Hola Spring Boot from cluster Backend at host: 172.17.0.5

We can see that the backend pod returns its IP address in this call! So our service was discovered correctly, and all it took was a little bit of DNS and the power of Kubernetes service discovery. One big thing to notice about this approach is that we did not specify any extra client libraries or set up any registries or anything. We happen to be using Java in this case, but using Kubernetes cluster DNS provides a technology-agnostic way of doing basic service discovery!

Fault Tolerance

Complex distributed systems like a microservice architecture must be built with an important premise in mind: things will fail. We can spend a lot of energy preventing things from failing, but even then we won’t be able to predict every case where and how dependencies in a microservice environment can fail. A corollary to our premise of “things will fail” is that “we design our services for failure.” Another way of saying that is “figure out how to survive in an environment where there are failures.”

Cluster Self-Healing

If a service begins to misbehave, how will we know about it? Ideally our cluster management solution can detect and alert us about failures and let human intervention take over. This is the approach we typically take in traditional environments. When running microservices at scale, where we have lots of services that are supposed to be identical, do we really want to stop and troubleshoot every possible thing that can go wrong with a single service? Long-running services may experience unhealthy states. An easier approach is to design our microservices such that they can be terminated at any moment, especially when they appear to be behaving incorrectly.

Kubernetes has a couple of health probes we can use out of the box to allow the cluster to manage and heal itself. The first is a readiness probe, which allows Kubernetes to determine whether or not a pod should be considered in any service discovery or load-balancing algorithms. For example, some Java apps may take a few seconds to bootstrap the containerized process, even though the pod is technically up and running.

If we start sending traffic to a pod in this state, users may experience failures or inconsistent states. With readiness probes, we can let Kubernetes query an HTTP endpoint (for example) and only consider the pod ready if it gets an HTTP 200 or some other response. If Kubernetes determines a pod does not become ready within a specified period of time, the pod will be killed and restarted.

Another health probe we can use is a liveness probe. This is similar to the readiness probe; however, it’s applicable after a pod has been determined to be “ready” and is eligible to receive traffic. Over the course of the life of a pod or service, if the liveness probe (which could also be a simple HTTP endpoint) starts to indicate an unhealthy state (e.g., HTTP 500 errors), Kubernetes can automatically kill the pod and restart it.

When we used the JBoss Forge tooling and the fabric8-setup command from the Fabric8 addon, a readiness probe was automatically added to our Kubernetes manifest through the following Maven properties in the respective pom.xml. If it wasn’t, you can use the fabric8-readiness-probe or fabric8-liveness-probe commands to add one to an existing project:

<fabric8.readinessProbe.httpGet.path>
    /health
</fabric8.readinessProbe.httpGet.path>
<fabric8.readinessProbe.httpGet.port>
    8080
</fabric8.readinessProbe.httpGet.port>
<fabric8.readinessProbe.initialDelaySeconds>
    5
</fabric8.readinessProbe.initialDelaySeconds>
<fabric8.readinessProbe.timeoutSeconds>
    30
</fabric8.readinessProbe.timeoutSeconds>

The Kubernetes JSON generated from these Maven properties includes:

"readinessProbe" : {
  "httpGet" : {
    "path" : "/health",
    "port" : 8080
  },

This means the “readiness” quality of the hola-springboot pod will be determined by periodically polling the /health endpoint of our pod. When we added the actuator to our Spring Boot microservice earlier, a /health endpoint was added which returns:

{
  "diskSpace": {
    "free": 106880393216,
    "status": "UP",
    "threshold": 10485760,
    "total": 107313364992
  },
  "status": "UP"
}

The same thing can be done with Dropwizard and WildFly Swarm!
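
If the default actuator checks aren’t enough, the /health response (and therefore the readiness signal) can also reflect application-specific state. Here’s a hedged sketch of a custom Spring Boot HealthIndicator; the backendReachable() check is a hypothetical placeholder for whatever “ready to take traffic” means for your service:

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class BackendHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        // Hypothetical application-specific check; substitute whatever
        // "ready to take traffic" means for your service.
        if (backendReachable()) {
            return Health.up().build();
        }
        // Any DOWN indicator flips the aggregate /health status to DOWN,
        // so the readiness probe stops routing traffic to this pod.
        return Health.down().withDetail("backend", "unreachable").build();
    }

    private boolean backendReachable() {
        // Placeholder for a real check (e.g., ping the backend service).
        return true;
    }
}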

Circuit Breaker

As a service provider, your responsibility to your consumers is to provide the functionality you’ve promised. Following promise theory, a service provider may depend on other services or downstream systems but cannot and should not impose requirements upon them. A service provider is wholly responsible for its promise to consumers. Because distributed systems can and do fail, there will be times when service promises can’t be met or can be only partly met. In our previous examples, we showed our Hola apps reaching out to a backend service to form a greeting at the /api/greeting endpoint. What happens if the backend service is not available? How do we hold up our end of the promise?

We need to be able to deal with these kinds of distributed systems faults. A service may not be available; a network may be experiencing intermittent connectivity; the backend service may be experiencing enough load to slow it down and introduce latency; a bug in the backend service may be causing application-level exceptions. If we don’t deal with these situations explicitly, we run the risk of degrading our own service, tying up threads, database locks, and resources, and contributing to rolling, cascading failures that can take an entire distributed network down. To help us account for these failures, we’re going to leverage a library from the NetflixOSS stack named Hystrix.

Hystrix is a fault-tolerant Java library that allows microservices to hold up their end of a promise by:

  • Providing protection against dependencies that are unavailable

  • Monitoring and providing timeouts to guard against unexpected dependency latency

  • Load shedding and self-healing

  • Degrading gracefully

  • Monitoring failure states in real time

  • Injecting business-logic and other stateful handling of faults

With Hystrix, you wrap any call to your external dependencies in a HystrixCommand and implement the possibly faulty calls inside the run() method. To help you get started, let’s look at implementing a HystrixCommand for the hola-wildflyswarm project. Note that for this example, we’re going to follow the Netflix best practices of making everything explicit, even if that introduces some boilerplate code. Debugging distributed systems is difficult, and having exact stack traces for your code is more important than hiding everything behind complicated magic that becomes impossible to debug at runtime. Even though the Hystrix library has annotations for convenience, we’ll stick with implementing the Java objects directly for this book and leave it to the reader to explore the more mystical ways to use Hystrix.

First let’s add the hystrix-core dependency to our Maven pom.xml:

  <dependency>
    <groupId>com.netflix.hystrix</groupId>
    <artifactId>hystrix-core</artifactId>
    <version>${hystrix.version}</version>
  </dependency>

Let’s create a new Java class called BackendCommand that extends HystrixCommand in our hola-wildflyswarm project, as shown in Example 6-1.

Example 6-1. src/main/java/com/redhat/examples/wfswarm/rest/BackendCommand
public class BackendCommand extends HystrixCommand<BackendDTO> {

    private String host;
    private int port;
    private String saying;

    public BackendCommand(String host, int port) {
        super(HystrixCommandGroupKey.Factory
            .asKey("wfswarm.backend"));
        this.host = host;
        this.port = port;
    }

    public BackendCommand withSaying(String saying) {
        this.saying = saying;
        return this;
    }

    @Override
    protected BackendDTO run() throws Exception {
        String backendServiceUrl =
            String.format("http://%s:%d",  host, port);

        System.out.println("Sending to: " + backendServiceUrl);

        Client client = ClientBuilder.newClient();
        return client.target(backendServiceUrl)
                .path("api")
                .path("backend")
                .queryParam("greeting", saying)
                .request(MediaType.APPLICATION_JSON_TYPE)
                .get(BackendDTO.class);

    }

}

You can see here we’ve extended HystrixCommand and provided our BackendDTO class as the type of response our command object will return. We’ve also added some constructor and builder methods for configuring the command. Lastly, and most importantly, we’ve added a run() method here that actually implements the logic for making an external call to the backend service. Hystrix will add thread timeouts and fault behavior around this run() method.

What happens, though, if the backend service is not available or becomes latent? You can configure thread timeouts and failure-rate thresholds that trigger circuit-breaker behavior. A circuit breaker in this case will simply open the circuit to the backend service by not allowing any calls to go through (failing fast) for a period of time. The idea with this circuit-breaker behavior is to allow the backend remote resource time to recover or heal without continuing to take load, which could cause it to persist in or degrade further into unhealthy states.

You can configure Hystrix by providing configuration keys, JVM system properties, or by using a type-safe DSL for your command object. For example, if we want to enable the circuit breaker (default true) and open the circuit if we get five or more failed requests (timeout, network error, etc.) within five seconds, we could pass the following into the constructor of our BackendCommand object:

  public BackendCommand(String host, int port) {
      super(Setter.withGroupKey(
        HystrixCommandGroupKey.Factory
          .asKey("wildflyswarm.backend"))
          .andCommandPropertiesDefaults(
                 HystrixCommandProperties.Setter()
                 .withCircuitBreakerEnabled(true)
                 .withCircuitBreakerRequestVolumeThreshold(5)
                 .withMetricsRollingStatisticalWindowInMilliseconds(5000)
              ))
      ;
      this.host = host;
      this.port = port;
  }

Please see the Hystrix documentation for more advanced configurations as well as for how to externalize the configurations or even configure them dynamically at runtime.

If a backend dependency becomes latent or unavailable and Hystrix intervenes with a circuit breaker, how does our service keep its promise? The answer to this may be very domain specific. For example, consider a personalization service that displays custom book recommendations for a user. We may end up calling the book-recommendation service, but what if it isn’t available or is too slow to respond? We could degrade to a book list that may not be personalized; maybe we’d send back a book list that’s generic for users in a particular region. Or maybe we’d send back no personalized list at all, just a very generic “list of the day.” To do this, we can use Hystrix’s built-in fallback method. In our example, if the backend service is not available, let’s add a fallback method to return a generic BackendDTO response:

public class BackendCommand extends HystrixCommand<BackendDTO> {

   <rest of class here>

    @Override
    protected BackendDTO getFallback() {
        BackendDTO rc = new BackendDTO();
        rc.setGreeting("Greeting fallback!");
        rc.setIp("127.0.0,1");
        rc.setTime(System.currentTimeMillis());
        return rc;
    }


}

Our /api/greeting-hystrix service should now be able to service a client and hold up its part of the promise, even if the backend service is not available.

Note this is a contrived example, but the idea is ubiquitous. However, the decision whether to fall back and gracefully degrade versus break a promise is very domain specific. For example, if you’re trying to transfer money in a banking application and a backend service is down, you may wish to reject the transfer. Or you may wish to make only a certain part of the transfer available while the backend gets reconciled. Either way, there is no one-size-fits-all fallback strategy. In general, coming up with the fallback is related to what kind of customer experience gets exposed and how best to gracefully degrade considering the domain.
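
To tie the pieces together, here is a rough sketch of how a /api/greeting-hystrix JAX-RS endpoint might invoke the command. The method name and the backendHost/backendPort fields are assumptions for illustration; execute() runs run() under Hystrix’s timeout and circuit-breaker protection and transparently returns getFallback()’s result when the call fails or the circuit is open:

    @Path("/greeting-hystrix")
    @GET
    public String greetingHystrix() {
        // backendHost/backendPort would normally come from injected
        // configuration (e.g., the GREETING_BACKENDSERVICE* variables).
        BackendCommand command = new BackendCommand(backendHost, backendPort)
                .withSaying("Hola from WildFly Swarm");

        // execute() runs run() on a Hystrix-managed thread with a timeout;
        // if the call fails or the circuit is open, Hystrix returns the
        // result of getFallback() instead of propagating the error.
        BackendDTO backend = command.execute();

        return backend.getGreeting() + " at host: " + backend.getIp();
    }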

Bulkhead

Hystrix offers some powerful features out of the box, as we’ve seen. One more failure mode to consider is when services become latent but not latent enough to trigger a timeout or the circuit breaker. This is one of the worst situations to deal with in distributed systems as latency like this can quickly stall (or appear to stall) all worker threads and cascade the latency all the way back to users. We would like to be able to limit the effect of this latency to just the dependency that’s causing the slowness without consuming every available resource. To accomplish this, we’ll employ a technique called the bulkhead. A bulkhead is basically a separation of resources such that exhausting one set of resources does not impact others. You often see bulkheads in airplanes or trains dividing passenger classes or in boats used to stem the failure of a section of the boat (e.g., if there’s a crack in the hull, allow it to fill up a specific partition but not the entire boat).

Hystrix implements this bulkhead pattern with thread pools. Each downstream dependency can be allocated its own thread pool to handle external communication. Netflix has benchmarked the overhead of these thread pools and has found that for these types of use cases, the context-switching overhead is minimal, but it’s always worth benchmarking in your own environment if you have concerns. If a downstream dependency becomes latent, the thread pool assigned to that dependency can become fully utilized, and further requests to that dependency will be rejected. This has the effect of containing the resource consumption to just the degraded dependency instead of cascading across all of our resources.

If the thread pools are a concern, Hystrix can also implement the bulkhead on the calling thread with counting semaphores, as sketched below. Refer to the Hystrix documentation for more information.
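
As a hedged sketch (illustrative values, using the same property-DSL style as the earlier configuration examples), switching our command to semaphore isolation might look like this:

  public BackendCommand(String host, int port) {
      super(Setter.withGroupKey(
        HystrixCommandGroupKey.Factory
           .asKey("wildflyswarm.backend"))
           .andCommandPropertiesDefaults(
                 HystrixCommandProperties.Setter()
                 // Run the command on the calling thread and bound
                 // concurrency with a counting semaphore instead of a
                 // dedicated thread pool.
                 .withExecutionIsolationStrategy(
                     HystrixCommandProperties
                         .ExecutionIsolationStrategy.SEMAPHORE)
                 // Allow at most 10 concurrent calls to this dependency;
                 // additional calls are rejected (and fall back) immediately.
                 .withExecutionIsolationSemaphoreMaxConcurrentRequests(10)
              ))
      ;
      this.host = host;
      this.port = port;
  }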

The bulkhead is enabled by default with a thread pool of 10 worker threads with no BlockingQueue as a backup. This is usually a sufficient configuration, but if you must tweak it, refer to the configuration documentation of the Hystrix component. Configuration would look something like this (external configuration is possible as well):

  public BackendCommand(String host, int port) {
      super(Setter.withGroupKey(
        HystrixCommandGroupKey.Factory
           .asKey("wildflyswarm.backend"))
           .andThreadPoolPropertiesDefaults(
                 HystrixThreadPoolProperties.Setter()
                 .withCoreSize(10)
                 .withMaxQueueSize(-1))
           .andCommandPropertiesDefaults(
                 HystrixCommandProperties.Setter()
                 .withCircuitBreakerEnabled(true)
                 .withCircuitBreakerRequestVolumeThreshold(5)
                 .withMetricsRollingStatisticalWindowInMilliseconds(5000)
              ))
      ;
      this.host = host;
      this.port = port;
  }

To test out this configuration, let’s build and deploy the hola-wildflyswarm project and play around with the environment.

Build the Docker image and deploy it to Kubernetes:

$ mvn -Pf8-local-deploy

Let’s verify the new /api/greeting-hystrix endpoint is up and functioning correctly (this assumes you’ve been following along and still have the backend service deployed; refer to previous sections to get that up and running):

$ oc get pod
NAME                      READY   STATUS    RESTARTS   AGE
backend-pwawu             1/1     Running   0          18h
hola-dropwizard-bf5nn     1/1     Running   0          19h
hola-springboot-n87w3     1/1     Running   0          19h
hola-wildflyswarm-z73g3   1/1     Running   0          18h

Let’s port-forward the hola-wildflyswarm pod again so we can reach it locally. Recall that a great benefit of using Kubernetes is that you can run this command regardless of where the pod is actually running in the cluster:

$ oc port-forward -p hola-wildflyswarm-z73g3 9000:8080

Now let’s navigate to http://localhost:9000/api/greeting-hystrix:

[Figure: browser response from /api/greeting-hystrix]

Now let’s take down the backend service by scaling its ReplicationController replica count down to zero:

$ oc scale rc/backend --replicas=0

By doing this, there should be no backend pods running:

$ oc get pod
NAME                      READY     STATUS        RESTARTS   AGE
backend-pwawu             1/1       Terminating   0          18h
hola-dropwizard-bf5nn     1/1       Running       0          19h
hola-springboot-n87w3     1/1       Running       0          19h
hola-wildflyswarm-z73g3   1/1       Running       0          18h

Now if we refresh our browser pointed at http://localhost:9000/api/greeting-hystrix, we should see the service degrade to using the Hystrix fallback method:

[Figure: browser response from /api/greeting-hystrix showing the fallback greeting]

Load Balancing

In a highly scaled distributed system, we need a way to discover and load balance against services in the cluster. As we’ve seen in previous examples, our microservices must be able to handle failures; therefore, we have to be able to load balance against services that exist, services that may be joining or leaving the cluster, or services that exist in an autoscaling group. Rudimentary approaches to load balancing, like round-robin DNS, are not adequate. We may also need sticky sessions, autoscaling, or more complex load-balancing algorithms. Let’s take a look at a few different ways of doing load balancing in a microservices environment.

Kubernetes Load Balancing

The great thing about Kubernetes is that it provides a lot of distributed-systems features out of the box; no need to add any extra components (server side) or libraries (client side). Kubernetes Services provide a means to discover microservices, and they also provide server-side load balancing. If you recall, a Kubernetes Service is an abstraction over a group of pods that can be specified with label selectors. For all the pods that can be selected with the specified selector, Kubernetes will load balance any requests across them. The default Kubernetes load-balancing algorithm is round robin, but it can be configured for other algorithms such as session affinity. Note that clients don’t have to do anything to add a pod to the Service; just adding the right labels to your pod will make it eligible for selection and available to the Service. Clients reach the Kubernetes Service by using the cluster IP or cluster DNS provided out of the box by Kubernetes. Also recall that the cluster DNS is not like traditional DNS and does not fall prey to the DNS caching TTL problems typically encountered with using DNS for discovery/load balancing. Also note, there are no hardware load balancers to configure or maintain; it’s all just built in.

To demonstrate load balancing, let’s scale up the backend services in our cluster:

$ oc scale rc/backend --replicas=3

Now if we check our pods, we should see three backend pods:

$ oc get pod
NAME                      READY     STATUS    RESTARTS   AGE
backend-8ywcl             1/1       Running   0          18h
backend-d9wm6             1/1       Running   0          18h
backend-vt61x             1/1       Running   0          18h
hola-dropwizard-bf5nn     1/1       Running   0          20h
hola-springboot-n87w3     1/1       Running   0          20h
hola-wildflyswarm-z73g3   1/1       Running   0          19h

If we list the Kubernetes services available, we should see the backend service as well as the selector used to select the pods that will be eligible for taking requests. The Service will load balance to these pods:

$ oc get svc
NAME                CLUSTER_IP         PORT(S)
backend             172.30.231.63      80/TCP
hola-dropwizard     172.30.124.61      80/TCP
hola-springboot     172.30.55.130      80/TCP
hola-wildflyswarm   172.30.198.148     80/TCP

We can see here that the backend service will select all pods with labels component=backend and provider=fabric8. Let’s take a quick moment to see what labels are on one of the backend pods:

$ oc describe pod/backend-8ywcl | grep Labels
Labels:                     component=backend,provider=fabric8

We can see that the backend pods have the labels that match what the service is looking for; so any time we communicate with the service, we will be load balanced over these matching pods.

Let’s make a call to our hola-wildflyswarm service. We should see the response contain different IP addresses for the backend service:

$ oc port-forward -p hola-wildflyswarm-z73g3 9000:8080

$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.45

$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.44

$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.46

Here we enabled port forwarding so that we can reach our hola-wildflyswarm service and tried to access the http://localhost:9000/api/greeting endpoint. I used curl here, but you can use your favorite HTTP/REST tool, including your web browser. Just refresh your web browser a few times to see that the backend that gets called is different each time. The Kubernetes Service is load balancing over the respective pods as expected.

Do We Need Client-Side Load Balancing?

Client-side load balancers can be used inside Kubernetes if you need more fine-grained control or domain-specific algorithms for determining which service or pod you need to send to. You can even do things like weighted load balancing, skipping pods that seem to be faulty, or custom Java logic to determine which service/pod to call. The downside to client-side load balancing is that it adds complexity to your application and is often language specific. In the majority of cases, you should prefer to use the technology-agnostic, built-in Kubernetes Service load balancing. If you find you’re in a minority case where more sophisticated load balancing is required, consider a client-side load balancer like SmartStack, bakerstreet.io, or NetflixOSS Ribbon.

In this example, we’ll use NetflixOSS Ribbon to provide client-side load balancing. There are different ways to use Ribbon and a few options for registering and discovering clients. Service registries like Eureka and Consul may be good options in some cases, but when running within Kubernetes, we can just leverage the built-in Kubernetes API to discover services/pods. To enable this behavior, we’ll use the ribbon-discovery project from Kubeflix. Let’s add the dependencies we’ll need to our pom.xml:

    <dependency>
      <groupId>org.wildfly.swarm</groupId>
      <artifactId>ribbon</artifactId>
    </dependency>
    <dependency>
      <groupId>io.fabric8.kubeflix</groupId>
      <artifactId>ribbon-discovery</artifactId>
      <version>${kubeflix.version}</version>
    </dependency>

For Spring Boot we could opt to use Spring Cloud, which provides convenient Ribbon integration, or we could just use the NetflixOSS dependencies directly:

    <dependency>
      <groupId>com.netflix.ribbon</groupId>
      <artifactId>ribbon-core</artifactId>
      <version>${ribbon.version}</version>
    </dependency>
    <dependency>
      <groupId>com.netflix.ribbon</groupId>
      <artifactId>ribbon-loadbalancer</artifactId>
      <version>${ribbon.version}</version>
    </dependency>

Once we’ve got the right dependencies, we can configure Ribbon to use Kubernetes discovery:

      loadBalancer = LoadBalancerBuilder.newBuilder()
              .withDynamicServerList(
                  new KubernetesServerList(config))
              .buildDynamicServerListLoadBalancer();

Then we can use the load balancer with the Ribbon LoadBalancerCommand:

    @Path("/greeting-ribbon")
    @GET
    public String greetingRibbon() {
        BackendDTO backendDTO = LoadBalancerCommand.
        <BackendDTO>builder()
        .withLoadBalancer(loadBalancer)
        .build()
        .submit(new ServerOperation<BackendDTO>() {
            @Override
            public Observable<BackendDTO> call(Server server) {
                String backendServiceUrl = String.format(
                    "http://%s:%d",
                    server.getHost(), server.getPort());

                System.out.println("Sending to: " +
                    backendServiceUrl);

                Client client = ClientBuilder.newClient();
                return Observable.just(client
                       .target(backendServiceUrl)
                       .path("api")
                       .path("backend")
                       .queryParam("greeting", saying)
                       .request(MediaType.APPLICATION_JSON_TYPE)
                       .get(BackendDTO.class));
            }
        }).toBlocking().first();
        return backendDTO.getGreeting() + " at host: " +
            backendDTO.getIp();
    }

See the accompanying source code for the exact details.

Where to Look Next

In this chapter, we learned a little about the pains of deploying and managing microservices at scale and how Linux containers can help. We can leverage true immutable delivery to reduce configuration drift, and we can use Linux containers to enable service isolation, rapid delivery, and portability. We can leverage scalable container management systems like Kubernetes and take advantage of a lot of distributed-system features like service discovery, failover, health-checking (and more!) that are built in. You don’t need complicated port swizzling or complex service discovery systems when deploying on Kubernetes because these are problems that have been solved within the infrastructure itself. To learn more, please review the following links:
