In Chapter 5, we had a quick introduction to Linux containers and cluster management. Let’s jump into using these things to solve issues with running microservices at scale. For reference, we’ll be using the microservice projects we developed in Chapters 2, 3, and 4 (Spring Boot, Dropwizard, and WildFly Swarm, respectively). The following steps can be accomplished with any of the three Java frameworks.
To package our microservice as a Docker image and eventually deploy it to Kubernetes, let’s navigate to our project (Spring Boot example in this case) and return to JBoss Forge. JBoss Forge has some plug-ins for making it easy to quickly add the Maven plug-ins we need to use:
$ cd hola-springboot
hola-springboot$ forge
Now let’s install a JBoss Forge addon:
[hola-springboot]$ addon-install --coordinate io.fabric8.forge:devops,2.2.148
***SUCCESS*** Addon io.fabric8.forge:devops,2.2.148 was installed successfully.
Now let’s add the Maven plug-ins:
[hola-springboot]$ fabric8-setup
***SUCCESS*** Added Fabric8 Maven support with base Docker image: fabric8/java-jboss-openjdk8-jdk:1.0.10. Added the following Maven profiles [f8-build, f8-deploy, f8-local-deploy] to make building the project easier, e.g., mvn -Pf8-local-deploy
Let’s take a look at what the tooling did. If we open the pom.xml file, we see it added some properties:
<docker.assemblyDescriptorRef>artifact</docker.assemblyDescriptorRef>
<docker.from>docker.io/fabric8/java-jboss-openjdk8-jdk:1.0.10</docker.from>
<docker.image>fabric8/${project.artifactId}:${project.version}</docker.image>
<docker.port.container.http>8080</docker.port.container.http>
<docker.port.container.jolokia>8778</docker.port.container.jolokia>
<fabric8.iconRef>icons/spring-boot</fabric8.iconRef>
<fabric8.service.containerPort>8080</fabric8.service.containerPort>
<fabric8.service.name>hola-springboot</fabric8.service.name>
<fabric8.service.port>80</fabric8.service.port>
<fabric8.service.type>LoadBalancer</fabric8.service.type>
It also added two Maven plug-ins: docker-maven-plugin and fabric8-maven-plugin:
<plugin>
  <groupId>io.fabric8</groupId>
  <artifactId>docker-maven-plugin</artifactId>
  <version>0.14.2</version>
  <configuration>
    <images>
      <image>
        <name>${docker.image}</name>
        <build>
          <from>${docker.from}</from>
          <assembly>
            <basedir>/app</basedir>
            <descriptorRef>${docker.assemblyDescriptorRef}</descriptorRef>
          </assembly>
          <env>
            <JAR>${project.artifactId}-${project.version}.war</JAR>
            <JAVA_OPTIONS>-Djava.security.egd=/dev/./urandom</JAVA_OPTIONS>
          </env>
        </build>
      </image>
    </images>
  </configuration>
</plugin>
<plugin>
  <groupId>io.fabric8</groupId>
  <artifactId>fabric8-maven-plugin</artifactId>
  <version>2.2.100</version>
  <executions>
    <execution>
      <id>json</id>
      <phase>generate-resources</phase>
      <goals>
        <goal>json</goal>
      </goals>
    </execution>
    <execution>
      <id>attach</id>
      <phase>package</phase>
      <goals>
        <goal>attach</goal>
      </goals>
    </execution>
  </executions>
</plugin>
Lastly, the tooling added some convenience Maven profiles:
f8-build
Build the docker image and Kubernetes manifest YML.
f8-deploy
Build the docker image and deploy to a remote docker registry; then deploy the application to Kubernetes.
f8-local-deploy
Build the docker image, generate the Kubernetes manifest.yml, and deploy to a locally running Kubernetes.
The JBoss Forge addon is part of the Fabric8 open source project. Fabric8 builds developer tooling for interacting with Docker, Kubernetes, and OpenShift, including Maven plug-ins, variable injection libraries for Spring/CDI, and clients for accessing the Kubernetes/OpenShift API. Fabric8 also builds API management, CI/CD, chaos monkey and Kubernetes-based NetflixOSS functionality on top of Kubernetes.
With the Maven plug-ins added from the previous step, all we have to do to build the docker image is run the following Maven command. This step, and all others related to building Docker images or deploying to Kubernetes, assume the CDK (earlier in this chapter) is up and running:
$ mvn -Pf8-build
[INFO] DOCKER> ... d3f157b39583 Pull complete
[INFO] DOCKER> ... f5a6e0d26670 Pull complete
[INFO] DOCKER> ... 6d1f91fc8ac8 Pull complete
[INFO] DOCKER> ... 77c58da5314d Pull complete
[INFO] DOCKER> ... 1416b43aef4d Pull complete
[INFO] DOCKER> ... fcc736051e6e Pull complete
[INFO] DOCKER> ... Digest: sha256:e77380a4924bb599162e3382e6443e8aa50c0
[INFO] DOCKER> ... Downloaded image for java-jboss-openjdk8-jdk:1.0.10
[INFO] DOCKER> [fabric8/hola-springboot:1.0] : Built image 13e725c3c771
[INFO]
[INFO] --- fabric8-maven-plugin:2.2.100:json (default-cli) @ hola-springboot ---
[INFO] Configured with file: /Users/ceposta/dev/sandbox/microservices-by-example/source/spring-boot/hola-springboot/target/classes/kubernetes.json
[INFO] Generated env mappings: {}
[INFO] Generated port mappings: {http=ContainerPort(containerPort=8080, hostIP=null, hostPort=null, name=http, protocol=null, additionalProperties={}), jolokia=ContainerPort(containerPort=8778, hostIP=null, hostPort=null, name=jolokia, protocol=null, additionalProperties={})}
[INFO] Removed 'version' label from service selector for service ``
[INFO] Generated ports: [ServicePort(name=null, nodePort=null, port=80, protocol=TCP, targetPort=IntOrString(IntVal=8080, Kind=null, StrVal=null, additionalProperties={}), additionalProperties={})]
[INFO] Icon URL: img/icons/spring-boot.svg
[INFO] Looking at repo with directory /microservices-by-example/.git
[INFO] Added environment annotations:
[INFO] Service hola-springboot selector: {project=hola-springboot,
[INFO]   provider=fabric8, group=com.redhat.examples} ports: 80
[INFO] ReplicationController hola-springboot replicas: 1,
[INFO]   image: fabric8/hola-springboot:1.0
[INFO] Template is now:
[INFO] Service hola-springboot selector: {project=hola-springboot,
[INFO]   provider=fabric8, group=com.redhat.examples} ports: 80
[INFO] ReplicationController hola-springboot replicas: 1,
[INFO]   image: fabric8/hola-springboot:1.0
[INFO] ------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------
[INFO] Total time: 04:22 min
[INFO] Finished at: 2016-03-31T15:59:58-07:00
[INFO] Final Memory: 47M/560M
[INFO] ------------------------------------------------------
If we have the Docker tooling installed, we should see that our microservice has been packaged in a Docker container:
$ docker images
REPOSITORY                TAG   IMAGE ID       CREATED   SIZE
fabric8/hola-springboot   1.0   13e725c3c771   3d ago    439.7 MB
We could start up the Docker container using docker run, but we want to deploy this into a cluster and leave the management of the microservice to Kubernetes. Let’s deploy it with the following Maven command:
$ mvn -Pf8-local-deploy
If your environment is configured correctly (i.e., you’ve started the CDK, installed the oc tooling, logged in with oc login, and created a new project with oc new-project microservices-book), you should see a successful build similar to this:
[INFO] --- fabric8-maven-plugin:apply (default-cli) @ hola-springboot ---
[INFO] Using https://10.1.2.2:8443/ in namespace microservice-book
[INFO] Kubernetes JSON: /Users/ceposta/dev/sandbox/microservices-by-example/source/spring-boot/hola-springboot/target/classes/kubernetes.json
[INFO] OpenShift platform detected
[INFO] Using namespace: microservice-book
[INFO] Creating a Template from kubernetes.json namespace microservice-book name hola-springboot
[INFO] Created Template: target/fabric8/applyJson/microservice-book/template-hola-springboot.json
[INFO] Looking at repo with directory /Users/ceposta/dev/sandbox/microservices-by-example/.git
[INFO] Creating a Service from kubernetes.json namespace microservice-book name hola-springboot
[INFO] Created Service: target/fabric8/applyJson/microservice-book/service-hola-springboot.json
[INFO] Creating a ReplicationController from kubernetes.json namespace microservice-book name hola-springboot
[INFO] Created ReplicationController: target/fabric8/applyJson/microservice-book/replicationcontroller-hola-springboot.json
[INFO] Creating Route microservice-book:hola-springboot host:
[INFO] -------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] -------------------------------------------------------
[INFO] Total time: 19.101 s
[INFO] Finished at: 2016-04-04T09:05:02-07:00
[INFO] Final Memory: 52M/726M
[INFO] -------------------------------------------------------
Let’s take a quick look at what the fabric8-maven-plugin did for us. First, Kubernetes exposes a REST API that allows us to manipulate the cluster (what’s deployed, how many replicas, etc.). Kubernetes follows a “reconciliation of end state” model where you describe what you want your deployment to look like and Kubernetes makes it happen. This is similar to how some configuration management systems work, where you declaratively express what should be deployed, not how it should be accomplished. When we post data to the Kubernetes REST API, Kubernetes will reconcile what needs to happen inside the cluster. For example, if “we want a pod running hola-springboot,” we can make an HTTP POST to the REST API with a JSON/YAML manifest file, and Kubernetes will create the pod, create the Docker containers running inside that pod, and schedule the pod to one of the hosts in the cluster. A Kubernetes pod is an atomic unit that can be scheduled within a Kubernetes cluster. It typically consists of a single Docker container, but it can contain many Docker containers. In our code samples, we will treat our hello-world deployments as a single Docker container per pod.
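To make the “reconciliation of end state” idea concrete, here is a minimal, hand-written sketch of the kind of pod manifest you could POST to the API. The names and labels are illustrative only; the manifest the tooling actually generates for us is richer than this:

```json
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "hola-springboot",
    "labels": { "project": "hola-springboot" }
  },
  "spec": {
    "containers": [
      {
        "name": "hola-springboot",
        "image": "fabric8/hola-springboot:1.0",
        "ports": [ { "containerPort": 8080 } ]
      }
    ]
  }
}
```

Kubernetes compares this desired state against what is actually running and takes whatever steps close the gap.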
The fabric8-maven-plugin we used in the preceding example automatically generates the REST objects inside a JSON/YAML manifest file for us and POSTs this data to the Kubernetes API. After running the mvn -Pf8-local-deploy command successfully, we should be able to navigate to the web console (https://10.1.2.2:8443) or use the CLI tooling to see our new pod running our hola-springboot application:
$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-8xtdm   1/1       Running   0          3d
At this point we have a single pod running in our cluster. What advantage does Kubernetes bring as a cluster manager? Let’s start by exploring the first of many: let’s kill the pod and see what happens:
$ oc delete pod/hola-springboot-8xtdm
pod "hola-springboot-8xtdm" deleted
Now let’s list our pods again:
$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-42p89   0/1       Running   0          3d
Wow! It’s still there! Or, more correctly, another pod has been created after we deleted the previous one. Kubernetes can start/stop/auto-restart your microservices for you. Can you imagine what a headache it would be to determine whether a service is started/stopped at any kind of scale? Let’s continue exploring some of the other valuable cluster management features Kubernetes brings to the table for managing microservices.
One of the advantages of deploying in a microservices architecture is independent scalability. We should be able to replicate the number of services in our cluster easily without having to worry about port conflicts, JVM or dependency mismatches, or what else is running on the same machine. With Kubernetes, these types of scaling concerns can be accomplished with the ReplicationController. Let’s see what replication controllers exist in our deployment:
$ oc get replicationcontroller
CONTROLLER        CONTAINER(S)      IMAGE(S)
hola-springboot   hola-springboot   fabric8/hola-springboot:1.0
SELECTOR
group=com.redhat.examples,project=hola-springboot,provider=fabric8,version=1.0
REPLICAS   AGE
1          3d
We can also abbreviate the command:
$ oc get rc
One of the objects that the fabric8-maven-plugin created for us is the ReplicationController, with a replica value of 1. This means we always want one pod/instance of our microservice running. If a pod dies (or gets deleted), then Kubernetes is charged with reconciling the desired state for us, which is replicas=1. If the cluster is not in the desired state, Kubernetes will take action to make sure the desired configuration is satisfied. What happens if we want to change the desired number of replicas and scale up our service?
$ oc scale rc hola-springboot --replicas=3
replicationcontroller "hola-springboot" scaled
Now if we list the pods, we should see three pods of our hola-springboot application:
$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
hola-springboot-42p89   1/1       Running   0          3d
hola-springboot-9s6a6   1/1       Running   0          3d
hola-springboot-np2l1   1/1       Running   0          3d
Now if any of our pods dies or gets deleted, Kubernetes will do what it needs to do to make sure the replica count is 3. Notice, also, that we didn’t have to change ports on these services or do any unnatural port remapping. Each one of the services is listening on port 8080 and does not collide with the others.
Let’s go ahead and scale down to 0 to get ready for the next section; we can just run the same command:
$ oc scale rc hola-springboot --replicas=0
Kubernetes also has the ability to do autoscaling by watching metrics like CPU, memory usage, or user-defined triggers, to scale the number of replicas up or down. Autoscaling is outside the scope of this book but is a very valuable piece of the cluster-management puzzle.
One last concept in Kubernetes that we should understand is the Service. In Kubernetes, a Service is a simple abstraction that provides a level of indirection between a group of pods and an application using the service represented by that group of pods. We’ve seen how pods are managed by Kubernetes and can come and go. We’ve also seen how Kubernetes can easily scale up the number of instances of a particular service. In our example, we’re going to start our backend service from the previous chapters to play the role of service provider. How will our hola-springboot communicate with the backend?
Let’s run the backend service by navigating to the backend folder in our source code and deploying it to the locally running Kubernetes cluster in the CDK:
$ mvn -Pf8-local-deploy
Let’s take a look at what Kubernetes services exist:
$ oc get service
NAME              CLUSTER_IP      EXTERNAL_IP   PORT(S)
backend           172.30.231.63                 80/TCP
hola-springboot   172.30.202.59                 80/TCP
SELECTOR                                                             AGE
component=backend,provider=fabric8                                   3d
group=com.redhat.examples,project=hola-springboot,provider=fabric8   3d
Note the Service objects get automatically created by the fabric8-maven-plugin, just like the ReplicationController objects in the previous section. There are two interesting attributes of a service that appear in the preceding code example. One is the CLUSTER_IP. This is a virtual IP that is assigned when a Service object is created and never goes away. It’s a single, fixed IP that is available to any applications running within the Kubernetes cluster and can be used to talk to backend pods. The pods are “selected” with the SELECTOR field. Pods in Kubernetes can be “labeled” with whatever metadata you want to apply (like “version” or “component” or “team”), and those labels can subsequently be used in the selector for a Service. In this example, we’re selecting all the pods with labels component=backend and provider=fabric8. This means any pods that are “selected” by the selector can be reached just by using the cluster IP. No need for complicated distributed registries (e.g., Zookeeper, Consul, or Eureka) or anything like that. It’s all built right into Kubernetes. Cluster-level DNS is also built into Kubernetes. Using DNS in general for microservice service discovery can be very challenging and downright painful. In Kubernetes, the cluster DNS points to the cluster IP; and since the cluster IP is a fixed IP and doesn’t go away, there are no issues with DNS caching and other gremlins that can pop up with traditional DNS.
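To see what cluster DNS buys us, consider how a client addresses the backend pods: from inside the cluster it can simply use the Service name, and Kubernetes cluster DNS also exposes a fully qualified form, &lt;service&gt;.&lt;namespace&gt;.svc.cluster.local. A small sketch (the namespace name here is just our example project; this is an illustration, not code from the book’s samples):

```java
public class ClusterDns {

    // Kubernetes cluster DNS resolves <service>.<namespace>.svc.cluster.local
    // to the Service's fixed cluster IP, so clients never track pod IPs.
    static String serviceUrl(String service, String namespace, String path) {
        return "http://" + service + "." + namespace + ".svc.cluster.local" + path;
    }

    public static void main(String[] args) {
        // From a pod in the same namespace, the short name "http://backend/" works too.
        System.out.println(serviceUrl("backend", "microservices-book", "/api/backend"));
    }
}
```

Because the name resolves to the stable cluster IP, clients need no registry lookup library at all.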
Let’s add a couple of environment variables to our hola-springboot project to use our backend service when running inside a Kubernetes cluster:
<fabric8.env.GREETING_BACKENDSERVICEHOST>backend</fabric8.env.GREETING_BACKENDSERVICEHOST>
<fabric8.env.GREETING_BACKENDSERVICEPORT>80</fabric8.env.GREETING_BACKENDSERVICEPORT>
Let’s build the Kubernetes manifest and verify we’re passing these environment variables to our pod. Note that Spring Boot resolves configuration from application.properties, but that configuration can be overridden with system properties and environment variables at runtime:
$ mvn fabric8:json

Inspect the file target/classes/kubernetes.json:
"containers"
:
[
{
"args"
:
[
],
"command"
:
[
],
"env"
:
[
{
"name"
:
"GREETING_BACKENDSERVICEHOST"
,
"value"
:
"backend"
},
{
"name"
:
"GREETING_BACKENDSERVICEPORT"
,
"value"
:
"80"
},
{
"name"
:
"KUBERNETES_NAMESPACE"
,
"valueFrom"
:
{
"fieldRef"
:
{
"fieldPath"
:
"metadata.namespace"
}
}
}
],
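For intuition, the way such variables reach the application is a plain environment lookup; Spring Boot’s relaxed binding maps a variable like GREETING_BACKENDSERVICEHOST onto a greeting.backendServiceHost property. Here is a framework-free sketch of the same resolution (the fallback values are assumptions for running outside Kubernetes, not something the tooling generates):

```java
public class BackendLocation {

    // Resolve the backend location the same way the container would:
    // environment variables win, with local-run fallbacks otherwise.
    static String backendUrl() {
        String host = System.getenv().getOrDefault("GREETING_BACKENDSERVICEHOST", "localhost");
        String port = System.getenv().getOrDefault("GREETING_BACKENDSERVICEPORT", "8080");
        return String.format("http://%s:%s", host, port);
    }

    public static void main(String[] args) {
        System.out.println(backendUrl());
    }
}
```

Inside the cluster, the env block above makes this resolve to http://backend:80.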
Let’s delete all of the pieces of the hola-springboot project and redeploy:

$ oc delete all -l project=hola-springboot
$ mvn -Pf8-local-deploy
We should now be able to list the pods and see our hola-springboot pod running as well as our backend service pod:

$ oc get pod
NAME                    READY     STATUS    RESTARTS   AGE
backend-nk224           1/1       Running   5          3d
hola-springboot-r5ykr   1/1       Running   0          2m
Now, just to illustrate a handy debugging technique, we’re going to set up port forwarding between our local machine and our hola-springboot-r5ykr pod and verify that our service is working correctly and can call the backend. Let’s set up port forwarding to port 9000 on our local machine:
$ oc port-forward -p hola-springboot-r5ykr 9000:8080
We should now be able to communicate with the pod over our localhost port 9000. Note this technique works great even against a remote Kubernetes cluster, not just on our local CDK. So instead of having to find which host our pod is running on and how to ssh into it, we can just use oc port-forward to expose it locally.
So now we should be able to navigate there locally using our browser or the command line:
$ curl http://localhost:9000/api/hola
Hola Spring Boot de 172.17.0.9
We can reach the /api/hola endpoint at http://localhost:9000 using our port forwarding! We also see that the /api/hola endpoint returns the IP address of the pod in which it’s running. Let’s call the /api/greeting API, which is supposed to call our backend:
$ curl http://localhost:9000/api/greeting
Hola Spring Boot from cluster Backend at host: 172.17.0.5
We can see that the backend pod returns its IP address in this call! So our service was discovered correctly, and all it took was a little bit of DNS and the power of Kubernetes service discovery. One big thing to notice about this approach is that we did not specify any extra client libraries or set up any registries or anything. We happen to be using Java in this case, but using Kubernetes cluster DNS provides a technology-agnostic way of doing basic service discovery!
Complex distributed systems like a microservice architecture must be built with an important premise in mind: things will fail. We can spend a lot of energy preventing things from failing, but even then we won’t be able to predict every case where and how dependencies in a microservice environment can fail. A corollary to our premise of “things will fail” is that “we design our services for failure.” Another way of saying that is “figure out how to survive in an environment where there are failures.”
If a service begins to misbehave, how will we know about it? Ideally our cluster management solution can detect and alert us about failures and let human intervention take over. This is the approach we typically take in traditional environments. When running microservices at scale, where we have lots of services that are supposed to be identical, do we really want to stop and troubleshoot every possible thing that can go wrong with a single service? Long-running services may experience unhealthy states. An easier approach is to design our microservices such that they can be terminated at any moment, especially when they appear to be behaving incorrectly.
Kubernetes has a couple of health probes we can use out of the box to allow the cluster to administer and heal itself. The first is a readiness probe, which allows Kubernetes to determine whether or not a pod should be considered in any service discovery or load-balancing algorithms. For example, some Java apps may take a few seconds to bootstrap the containerized process, even though the pod is technically up and running.
If we start sending traffic to a pod in this state, users may experience failures or inconsistent states. With readiness probes, we can let Kubernetes query an HTTP endpoint (for example) and only consider the pod ready if it gets an HTTP 200 or some other response. If Kubernetes determines a pod does not become ready within a specified period of time, the pod will be killed and restarted.
Another health probe we can use is a liveness probe. This is similar to the readiness probe; however, it’s applicable after a pod has been determined to be “ready” and is eligible to receive traffic. Over the course of the life of a pod or service, if the liveness probe (which could also be a simple HTTP endpoint) starts to indicate an unhealthy state (e.g., HTTP 500 errors), Kubernetes can automatically kill the pod and restart it.
When we used the JBoss Forge tooling and the fabric8-setup command from the Fabric8 addon, a readiness probe was automatically added to our Kubernetes manifest via the following Maven properties in the respective pom.xml. If it wasn’t, you can use the fabric8-readiness-probe or fabric8-liveness-probe commands to add one to an existing project:
<fabric8.readinessProbe.httpGet.path>/health</fabric8.readinessProbe.httpGet.path>
<fabric8.readinessProbe.httpGet.port>8080</fabric8.readinessProbe.httpGet.port>
<fabric8.readinessProbe.initialDelaySeconds>5</fabric8.readinessProbe.initialDelaySeconds>
<fabric8.readinessProbe.timeoutSeconds>30</fabric8.readinessProbe.timeoutSeconds>
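A liveness probe can presumably be configured the same way; the property names below mirror the readiness ones and are an assumption on our part (the fabric8-liveness-probe command mentioned above will generate the exact properties for you), and the 60-second delay is an illustrative value:

```xml
<!-- Assumed liveness analogues of the readiness properties above -->
<fabric8.livenessProbe.httpGet.path>/health</fabric8.livenessProbe.httpGet.path>
<fabric8.livenessProbe.httpGet.port>8080</fabric8.livenessProbe.httpGet.port>
<fabric8.livenessProbe.initialDelaySeconds>60</fabric8.livenessProbe.initialDelaySeconds>
```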
The Kubernetes JSON generated from these Maven properties includes:
"readinessProbe"
:
{
"httpGet"
:
{
"path"
:
"/health"
,
"port"
:
8080
},
This means the “readiness” quality of the hola-springboot pod will be determined by periodically polling the /health endpoint of our pod. When we added the actuator to our Spring Boot microservice earlier, a /health endpoint was added, which returns:
{
  "diskSpace": {
    "free": 106880393216,
    "status": "UP",
    "threshold": 10485760,
    "total": 107313364992
  },
  "status": "UP"
}
The same thing can be done with Dropwizard and WildFly Swarm!
As a service provider, your responsibility is to your consumers to provide the functionality you’ve promised. Following promise theory, a service provider may depend on other services or downstream systems but cannot and should not impose requirements upon them. A service provider is wholly responsible for its promise to consumers. Because distributed systems can and do fail, there will be times when service promises can’t be met or can be only partly met. In our previous examples, we showed our Hola apps reaching out to a backend service to form a greeting at the /api/greeting endpoint. What happens if the backend service is not available? How do we hold up our end of the promise?
We need to be able to deal with these kinds of distributed systems faults. A service may not be available; a network may be experiencing intermittent connectivity; the backend service may be experiencing enough load to slow it down and introduce latency; a bug in the backend service may be causing application-level exceptions. If we don’t deal with these situations explicitly, we run the risk of degrading our own service, holding up threads, database locks, and resources, and contributing to rolling, cascading failures that can take an entire distributed network down. To help us account for these failures, we’re going to leverage a library from the NetflixOSS stack named Hystrix.
Hystrix is a fault-tolerant Java library that allows microservices to hold up their end of a promise by:
Providing protection against dependencies that are unavailable
Monitoring and providing timeouts to guard against unexpected dependency latency
Load shedding and self-healing
Degrading gracefully
Monitoring failure states in real time
Injecting business-logic and other stateful handling of faults
With Hystrix, you wrap any call to your external dependencies in a HystrixCommand and implement the possibly faulty call inside the run() method. To help you get started, let’s look at implementing a HystrixCommand for the hola-wildflyswarm project. Note that for this example, we’re going to follow the Netflix best practices of making everything explicit, even if that introduces some boilerplate code. Debugging distributed systems is difficult, and having exact stack traces for your code without too much magic is more important than hiding everything behind complicated magic that becomes impossible to debug at runtime. Even though the Hystrix library has annotations for convenience, we’ll stick with implementing the Java objects directly in this book and leave it to the reader to explore the more mystical ways to use Hystrix.
First let’s add the hystrix-core dependency to our Maven pom.xml:
<dependency>
  <groupId>com.netflix.hystrix</groupId>
  <artifactId>hystrix-core</artifactId>
  <version>${hystrix.version}</version>
</dependency>
Let’s create a new Java class called BackendCommand that extends HystrixCommand in our hola-wildflyswarm project, shown in Example 6-1.
public class BackendCommand extends HystrixCommand<BackendDTO> {

    private String host;
    private int port;
    private String saying;

    public BackendCommand(String host, int port) {
        super(HystrixCommandGroupKey.Factory.asKey("wfswarm.backend"));
        this.host = host;
        this.port = port;
    }

    public BackendCommand withSaying(String saying) {
        this.saying = saying;
        return this;
    }

    @Override
    protected BackendDTO run() throws Exception {
        String backendServiceUrl = String.format("http://%s:%d", host, port);
        System.out.println("Sending to: " + backendServiceUrl);
        Client client = ClientBuilder.newClient();
        return client.target(backendServiceUrl)
                .path("api")
                .path("backend")
                .queryParam("greeting", saying)
                .request(MediaType.APPLICATION_JSON_TYPE)
                .get(BackendDTO.class);
    }
}
You can see here we’ve extended HystrixCommand and provided our BackendDTO class as the type of response our command object will return. We’ve also added some constructor and builder methods for configuring the command. Lastly, and most importantly, we’ve added a run() method that actually implements the logic for making an external call to the backend service. Hystrix will add thread timeouts and fault behavior around this run() method.
What happens, though, if the backend service is not available or becomes latent? You can configure thread timeouts and the rate of failures that would trigger circuit-breaker behavior. A circuit breaker in this case will simply open a circuit to the backend service by not allowing any calls to go through (failing fast) for a period of time. The idea with this circuit-breaker behavior is to give the backend remote resources time to recover or heal without continuing to take load, which could otherwise cause them to persist in or degrade further into unhealthy states.
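To build intuition for what a circuit breaker does before looking at the Hystrix configuration, here is a deliberately tiny toy illustration. This is not Hystrix’s actual implementation (which tracks rolling statistical windows and half-open probing); it only shows the fail-fast idea:

```java
import java.util.function.Supplier;

public class ToyCircuitBreaker {

    private final int failureThreshold;
    private int consecutiveFailures = 0;
    private boolean open = false;

    public ToyCircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    public <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (open) {
            return fallback.get();           // circuit open: fail fast, don't touch the backend
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;         // a healthy call resets the failure count
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                open = true;                 // trip the circuit after too many failures
            }
            return fallback.get();
        }
    }
}
```

A real breaker like Hystrix also transitions to a half-open state after a sleep window to probe whether the backend has recovered, then closes the circuit again on success.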
You can configure Hystrix by providing configuration keys, JVM system properties, or by using a type-safe DSL for your command object. For example, if we want to enable the circuit breaker (default true) and open the circuit if we get five or more failed requests (timeout, network error, etc.) within five seconds, we could pass the following into the constructor of our BackendCommand object:
public BackendCommand(String host, int port) {
    super(Setter.withGroupKey(
            HystrixCommandGroupKey.Factory.asKey("wildflyswarm.backend"))
        .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
            .withCircuitBreakerEnabled(true)
            .withCircuitBreakerRequestVolumeThreshold(5)
            .withMetricsRollingStatisticalWindowInMilliseconds(5000)));
    this.host = host;
    this.port = port;
}
Please see the Hystrix documentation for more advanced configurations as well as for how to externalize the configurations or even configure them dynamically at runtime.
If a backend dependency becomes latent or unavailable and Hystrix intervenes with a circuit breaker, how does our service keep its promise? The answer may be very domain specific. For example, consider a team that is part of a personalization service and wants to display custom book recommendations for a user. We may end up calling the book-recommendation service, but what if it isn’t available or is too slow to respond? We could degrade to a book list that may not be personalized; maybe we’d send back a book list that’s generic for users in a particular region. Or maybe we’d not send back any personalized list and just a very generic “list of the day.” To do this, we can use Hystrix’s built-in fallback method. In our example, if the backend service is not available, let’s add a fallback method to return a generic BackendDTO response:
public class BackendCommand extends HystrixCommand<BackendDTO> {

    <rest of class here>

    @Override
    protected BackendDTO getFallback() {
        BackendDTO rc = new BackendDTO();
        rc.setGreeting("Greeting fallback!");
        rc.setIp("127.0.0.1");
        rc.setTime(System.currentTimeMillis());
        return rc;
    }
}
Our /api/greeting-hystrix service should now be able to service a client and hold up part of its promise, even if the backend service is not available.
Note this is a contrived example, but the idea is ubiquitous. However, whether to fall back and gracefully degrade versus break a promise is very domain specific. For example, if you’re trying to transfer money in a banking application and a backend service is down, you may wish to reject the transfer. Or you may wish to make only a certain part of the transfer available while the backend gets reconciled. Either way, there is no one-size-fits-all fallback method. In general, coming up with a fallback is related to what kind of customer experience gets exposed and how best to gracefully degrade considering the domain.
Hystrix offers some powerful features out of the box, as we’ve seen. One more failure mode to consider is when services become latent but not latent enough to trigger a timeout or the circuit breaker. This is one of the worst situations to deal with in distributed systems as latency like this can quickly stall (or appear to stall) all worker threads and cascade the latency all the way back to users. We would like to be able to limit the effect of this latency to just the dependency that’s causing the slowness without consuming every available resource. To accomplish this, we’ll employ a technique called the bulkhead. A bulkhead is basically a separation of resources such that exhausting one set of resources does not impact others. You often see bulkheads in airplanes or trains dividing passenger classes or in boats used to stem the failure of a section of the boat (e.g., if there’s a crack in the hull, allow it to fill up a specific partition but not the entire boat).
Hystrix implements this bulkhead pattern with thread pools. Each downstream dependency can be allocated a thread pool to which it’s assigned to handle external communication. Netflix has benchmarked the overhead of these thread pools and has found for these types of use cases, the overhead of the context switching is minimal, but it’s always worth benchmarking in your own environment if you have concerns. If a dependency downstream becomes latent, then the thread pool assigned to that dependency can become fully utilized, but other requests to the dependency will be rejected. This has the effect of containing the resource consumption to just the degraded dependency instead of cascading across all of our resources.
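The same idea can be sketched with a plain JDK executor: a small, dedicated pool per downstream dependency with no queue, so that saturation causes immediate rejection instead of unbounded queuing that stalls callers. This is an illustration of the bulkhead pattern, not Hystrix's own thread-pool machinery, and the pool size is illustrative:

```java
import java.util.concurrent.*;

public class Bulkhead {

    // One small, dedicated pool per downstream dependency. The SynchronousQueue
    // means there is no backlog: if all workers are busy, submissions are
    // rejected immediately rather than piling up and stalling callers.
    private final ThreadPoolExecutor pool;

    public Bulkhead(int maxConcurrent) {
        this.pool = new ThreadPoolExecutor(
                maxConcurrent, maxConcurrent,
                0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.AbortPolicy());
    }

    public <T> Future<T> submit(Callable<T> task) {
        return pool.submit(task); // throws RejectedExecutionException when saturated
    }

    public void shutdown() {
        pool.shutdownNow();
    }
}
```

If the dependency behind one Bulkhead goes latent, only that pool saturates; callers get fast rejections they can turn into fallbacks, and other dependencies keep their own threads.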
If the thread pools are a concern, Hystrix can also implement the bulkhead on the calling thread with counting semaphores. Refer to the Hystrix documentation for more information.
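As a rough illustration of the semaphore-based approach, here is a hand-rolled sketch (not Hystrix’s actual implementation): a counting semaphore caps concurrent calls to a dependency and rejects overflow immediately rather than queueing, so a latent dependency cannot consume every calling thread:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch of a counting-semaphore bulkhead.
public class SemaphoreBulkhead {
    private final Semaphore permits;

    public SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Execute the call only if a permit is available; otherwise fail fast
    // into the rejection fallback so callers are never blocked waiting.
    public <T> T execute(Supplier<T> call, Supplier<T> rejectionFallback) {
        if (!permits.tryAcquire()) {
            return rejectionFallback.get(); // bulkhead is full
        }
        try {
            return call.get();
        } finally {
            permits.release();
        }
    }
}
```

The trade-off versus thread-pool isolation is that the call still runs on the caller’s thread, so a hung call cannot be timed out independently.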
The bulkhead is enabled by default with a thread pool of 10 worker threads with no BlockingQueue as a backup. This is usually a sufficient configuration, but if you must tweak it, refer to the configuration documentation of the Hystrix component. Configuration would look something like this (external configuration is possible as well):
public BackendCommand(String host, int port) {
    super(Setter.withGroupKey(
            HystrixCommandGroupKey.Factory.asKey("wildflyswarm.backend"))
        .andThreadPoolPropertiesDefaults(
            HystrixThreadPoolProperties.Setter()
                .withCoreSize(10)
                .withMaxQueueSize(-1))
        .andCommandPropertiesDefaults(
            HystrixCommandProperties.Setter()
                .withCircuitBreakerEnabled(true)
                .withCircuitBreakerRequestVolumeThreshold(5)
                .withMetricsRollingStatisticalWindowInMilliseconds(5000)));
    this.host = host;
    this.port = port;
}
To test out this configuration, let’s build and deploy the hola-wildflyswarm project and play around with the environment.
Build the Docker image and deploy to Kubernetes:
$ mvn -Pf8-local-deploy
Let’s verify the new /api/greeting-hystrix endpoint is up and functioning correctly (this assumes you’ve been following along and still have the backend service deployed; refer to previous sections to get that up and running):
$ oc get pod
NAME                      READY     STATUS    RESTARTS   AGE
backend-pwawu             1/1       Running   0          18h
hola-dropwizard-bf5nn     1/1       Running   0          19h
hola-springboot-n87w3     1/1       Running   0          19h
hola-wildflyswarm-z73g3   1/1       Running   0          18h
Let’s port-forward the hola-wildflyswarm pod again so we can reach it locally. Recall this is a great benefit of using Kubernetes: you can run this command regardless of where the pod is actually running in the cluster:
$ oc port-forward -p hola-wildflyswarm-z73g3 9000:8080
Now let’s navigate to http://localhost:9000/api/greeting-hystrix:
Now let’s take down the backend service by scaling its ReplicationController replica count down to zero:
$ oc scale rc/backend --replicas=0
By doing this, there should be no backend pods running:
$ oc get pod
NAME                      READY     STATUS        RESTARTS   AGE
backend-pwawu             1/1       Terminating   0          18h
hola-dropwizard-bf5nn     1/1       Running       0          19h
hola-springboot-n87w3     1/1       Running       0          19h
hola-wildflyswarm-z73g3   1/1       Running       0          18h
Now if we refresh our browser pointed at http://localhost:9000/api/greeting-hystrix, we should see the service degrade to using the Hystrix fallback method:
In a highly scaled distributed system, we need a way to discover and load balance against services in the cluster. As we’ve seen in previous examples, our microservices must be able to handle failures; therefore, we have to be able to load balance against services that exist, services that may be joining or leaving the cluster, or services that exist in an autoscaling group. Rudimentary approaches to load balancing, like round-robin DNS, are not adequate. We may also need sticky sessions, autoscaling, or more complex load-balancing algorithms. Let’s take a look at a few different ways of doing load balancing in a microservices environment.
The great thing about Kubernetes is that it provides a lot of distributed-systems features out of the box; no need to add any extra components (server side) or libraries (client side). Kubernetes Services provide a means to discover microservices, and they also provide server-side load balancing. If you recall, a Kubernetes Service is an abstraction over a group of pods that can be specified with label selectors. For all the pods that can be selected with the specified selector, Kubernetes will load balance any requests across them. The default Kubernetes load-balancing algorithm is round robin, but it can be configured for other behavior such as session affinity. Note that clients don’t have to do anything to add a pod to the Service; just adding the right labels to your pod will make it eligible for selection and available to take requests. Clients reach the Kubernetes Service by using the cluster IP or cluster DNS provided out of the box by Kubernetes. Also recall the cluster DNS is not like traditional DNS and does not fall prey to the DNS-caching TTL problems typically encountered with using DNS for discovery/load balancing. Also note, there are no hardware load balancers to configure or maintain; it’s all just built in.
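For reference, a Kubernetes Service that selects backend pods by label might look like the following manifest. This is a hand-written sketch to show the selector mechanics; in our setup the fabric8 Maven tooling generates the real Service definition for us:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  # Any pod carrying these labels becomes eligible to receive traffic
  selector:
    component: backend
    provider: fabric8
  ports:
    - port: 80          # port exposed on the cluster IP
      targetPort: 8080  # port the pod's container listens on
```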
To demonstrate load balancing, let’s scale up the backend services in our cluster:
$ oc scale rc/backend --replicas=3
Now if we check our pods, we should see three backend pods:
$ oc get pod
NAME                      READY     STATUS    RESTARTS   AGE
backend-8ywcl             1/1       Running   0          18h
backend-d9wm6             1/1       Running   0          18h
backend-vt61x             1/1       Running   0          18h
hola-dropwizard-bf5nn     1/1       Running   0          20h
hola-springboot-n87w3     1/1       Running   0          20h
hola-wildflyswarm-z73g3   1/1       Running   0          19h
If we list the Kubernetes services available, we should see the backend service as well as the selector used to select the pods that will be eligible for taking requests. The Service will load balance to these pods:
$ oc get svc
NAME                CLUSTER_IP       PORT(S)
backend             172.30.231.63    80/TCP
hola-dropwizard     172.30.124.61    80/TCP
hola-springboot     172.30.55.130    80/TCP
hola-wildflyswarm   172.30.198.148   80/TCP
We can see here that the backend service will select all pods with labels component=backend and provider=fabric8. Let’s take a quick moment to see what labels are on one of the backend pods:
$ oc describe pod/backend-8ywcl | grep Labels
Labels: component=backend,provider=fabric8
We can see that the backend
pods have the labels that match what the
service is looking for; so any time we communicate with the service, we
will be load balanced over these matching pods.
Let’s make a call to our hola-wildflyswarm service. We should see the response contain different IP addresses for the backend service:
$ oc port-forward -p hola-wildflyswarm-z73g3 9000:8080
$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.45
$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.44
$ curl http://localhost:9000/api/greeting
Hola from cluster Backend at host: 172.17.0.46
Here we enabled port forwarding so that we can reach our hola-wildflyswarm service and tried to access the http://localhost:9000/api/greeting endpoint. I used curl here, but you can use your favorite HTTP/REST tool, including your web browser. Just refresh your web browser a few times to see that the backend that gets called is different each time. The Kubernetes Service is load balancing over the respective pods as expected.
Client-side load balancers can be used inside Kubernetes if you need more fine-grained control or domain-specific algorithms for determining which service or pod to send to. You can even do things like weighted load balancing, skipping pods that seem to be faulty, or custom Java logic to determine which service/pod to call. The downside to client-side load balancing is that it adds complexity to your application and is often language specific. In the majority of cases, you should prefer the technology-agnostic, built-in Kubernetes service load balancing. If you find you’re in a minority case where more sophisticated load balancing is required, consider a client-side load balancer like SmartStack, bakerstreet.io, or NetflixOSS Ribbon.
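To show the basic shape of client-side load balancing before bringing in Ribbon, here is a tiny hand-rolled round-robin chooser. This is an illustrative sketch only: the endpoint strings are made up, and a real implementation would refresh the endpoint list from the Kubernetes API as pods join and leave the cluster:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the client itself picks which pod to call.
public class RoundRobinChooser {
    private final List<String> endpoints;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinChooser(List<String> endpoints) {
        this.endpoints = endpoints;
    }

    // Cycle through the known endpoints; weights or health checks could
    // be swapped in here for more sophisticated strategies.
    public String choose() {
        int i = Math.floorMod(counter.getAndIncrement(), endpoints.size());
        return endpoints.get(i);
    }

    public static void main(String[] args) {
        RoundRobinChooser chooser = new RoundRobinChooser(
            List.of("172.17.0.44:8080", "172.17.0.45:8080", "172.17.0.46:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println(chooser.choose());
        }
    }
}
```

Ribbon provides exactly this kind of pluggable selection logic, plus server-list discovery, which is what we wire up next.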
In this example, we’ll use NetflixOSS Ribbon to provide client-side load balancing. There are different ways to use Ribbon and a few options for registering and discovering clients. Service registries like Eureka and Consul may be good options in some cases, but when running within Kubernetes, we can just leverage the built-in Kubernetes API to discover services/pods. To enable this behavior, we’ll use the ribbon-discovery project from Kubeflix. Let’s add the dependencies we’ll need to our pom.xml:
<dependency>
  <groupId>org.wildfly.swarm</groupId>
  <artifactId>ribbon</artifactId>
</dependency>
<dependency>
  <groupId>io.fabric8.kubeflix</groupId>
  <artifactId>ribbon-discovery</artifactId>
  <version>${kubeflix.version}</version>
</dependency>
For Spring Boot we could opt to use Spring Cloud, which provides convenient Ribbon integration, or we could just use the NetflixOSS dependencies directly:
<dependency>
  <groupId>com.netflix.ribbon</groupId>
  <artifactId>ribbon-core</artifactId>
  <version>${ribbon.version}</version>
</dependency>
<dependency>
  <groupId>com.netflix.ribbon</groupId>
  <artifactId>ribbon-loadbalancer</artifactId>
  <version>${ribbon.version}</version>
</dependency>
Once we’ve got the right dependencies, we can configure Ribbon to use Kubernetes discovery:
loadBalancer = LoadBalancerBuilder.newBuilder()
    .withDynamicServerList(new KubernetesServerList(config))
    .buildDynamicServerListLoadBalancer();
Then we can use the load balancer with the Ribbon LoadBalancerCommand:
@Path("/greeting-ribbon")
@GET
public String greetingRibbon() {
    BackendDTO backendDTO = LoadBalancerCommand.<BackendDTO>builder()
        .withLoadBalancer(loadBalancer)
        .build()
        .submit(new ServerOperation<BackendDTO>() {
            @Override
            public Observable<BackendDTO> call(Server server) {
                String backendServiceUrl = String.format("http://%s:%d",
                    server.getHost(), server.getPort());
                System.out.println("Sending to: " + backendServiceUrl);
                Client client = ClientBuilder.newClient();
                return Observable.just(client
                    .target(backendServiceUrl)
                    .path("api")
                    .path("backend")
                    .queryParam("greeting", saying)
                    .request(MediaType.APPLICATION_JSON_TYPE)
                    .get(BackendDTO.class));
            }
        }).toBlocking().first();
    return backendDTO.getGreeting() + " at host: " + backendDTO.getIp();
}
See the accompanying source code for the exact details.
In this chapter, we learned a little about the pains of deploying and managing microservices at scale and how Linux containers can help. We can leverage true immutable delivery to reduce configuration drift, and we can use Linux containers to enable service isolation, rapid delivery, and portability. We can leverage scalable container management systems like Kubernetes and take advantage of a lot of distributed-system features like service discovery, failover, health-checking (and more!) that are built in. You don’t need complicated port swizzling or complex service discovery systems when deploying on Kubernetes because these are problems that have been solved within the infrastructure itself. To learn more, please review the following links: