Chapter 3. The Impact of Docker on Java Application Architecture

“If you can’t build a [well-structured] monolith, what makes you think microservices are the answer?”

Simon Brown, Coding the Architecture

Continuously delivering Java applications within containers requires that several long-standing architectural patterns and paradigms be challenged. The potential for automation and immutability that containers provide (along with the new properties they impose, such as resource restriction and transience) means that new approaches to architecture and configuration are required. This chapter looks at examples of these changes, such as the Twelve-Factor App, the decomposition of monolithic applications into smaller API-driven composable components (microservices), and the need to develop mechanical sympathy—an appreciation and respect for the underlying deployment fabric.

Cloud-Native Twelve-Factor Applications

In early 2012, Platform-as-a-Service (PaaS) pioneer Heroku developed the Twelve-Factor App, a series of rules and guidance for helping developers build cloud-ready PaaS applications that:

  • Use declarative formats for setup automation, to minimize time and cost for new developers joining the project
  • Have a clean contract with the underlying operating system, offering maximum portability between execution environments
  • Are suitable for deployment on modern cloud platforms, minimizing the need for servers and systems administration
  • Minimize divergence between development and production, enabling continuous deployment for maximum agility
  • Can scale up without significant changes to tooling, architecture, or development practices

These guidelines were written before container technology became mainstream, but are all equally (if not more) relevant to Docker-based applications. Let’s look briefly at each of the factors now, and see how they map to continuously deploying Java applications within containers:

1. Codebase: One codebase tracked in revision control, many deploys

Each Java application (or service) should be tracked in a single, shared code repository, and Docker configuration files, such as Dockerfiles, should be stored alongside the application code.

2. Dependencies: Explicitly declare and isolate dependencies

Dependencies are commonly managed within Java applications using Maven or Gradle, and OS-level dependencies should be clearly specified in the associated Dockerfile.
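As a sketch of declaring OS-level dependencies explicitly (image tags, package names, and paths here are illustrative, not prescribed by the text), a Dockerfile might install its one required OS package in the image itself rather than assuming it exists on the host:

```dockerfile
# Illustrative Dockerfile for a Maven-built service.
# The base image tag and the curl dependency are example choices.
FROM openjdk:8-jre-slim

# OS-level dependencies are declared here, not assumed from the host
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# The application's Java dependencies are already resolved by Maven/Gradle
# and packaged into the fat JAR copied below
COPY target/app.jar /opt/app/app.jar
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
```

Keeping this file alongside the build descriptor (pom.xml or build.gradle) means both layers of dependency declaration are versioned together.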

3. Config: Store config in the environment

The Twelve-Factor App guidelines suggest that configuration data should be injected into an application via environment variables, although in practice many Java developers prefer to use configuration files, and there can be security issues with exposing secrets via environment variables. Storing nonsensitive configuration data in a remote service like Spring Cloud Config (backed by Git or Consul) and secrets in a service like HashiCorp’s Vault can be a good compromise. It is definitely not recommended to include secrets in a Dockerfile or Docker image.
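A minimal sketch of environment-based configuration with a safe default (the `DATABASE_URL` variable name and the fallback value are hypothetical examples, not from the original text):

```java
// Read nonsensitive configuration from the environment (factor 3),
// falling back to a development default when the variable is unset.
public class AppConfig {

    static String get(String name, String defaultValue) {
        String value = System.getenv(name);
        return (value != null && !value.isEmpty()) ? value : defaultValue;
    }

    public static void main(String[] args) {
        // DATABASE_URL would be injected via `docker run -e DATABASE_URL=...`
        String dbUrl = get("DATABASE_URL", "jdbc:h2:mem:dev");
        System.out.println("Using database: " + dbUrl);
    }
}
```

Secrets (passwords, API keys) should still come from a dedicated store such as Vault rather than plain environment variables or the image itself.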

4. Backing services: Treat backing services as attached resources (typically consumed over the network)

Java developers are accustomed to treating data stores and middleware in this fashion, and in-memory substitutes (e.g., HSQLDB, Apache Qpid, and Stubbed Cassandra) or service virtualization (e.g., Hoverfly) can be used for in-process component testing within the build pipeline.

5. Build, release, run: Strictly separate build and run stages

For a compiled language such as Java, this guideline comes as no surprise (and with little choice of implementation!). It is worth mentioning that the flexibility provided by Docker means that separate containers can be used to build, test, and run the application, each configured as appropriate. For example, a container can be created for build and test with a full OS, JDK, and diagnostic tools; and a container can be built for running an application in production with only a minimal OS and JRE. However, some may see this as an antipattern, as there should only be one “source of truth” artifact that is created, tested, and deployed within the pipeline, and using multiple Docker images can lead to an impedance mismatch and configuration drift between images.
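One way to get the "separate build and run containers" benefit while keeping a single source-of-truth Dockerfile is a multi-stage build, where the build stage uses a full JDK image and only the resulting artifact is copied into a minimal runtime image (image tags and paths below are illustrative):

```dockerfile
# Stage 1: build and test with a full JDK and build tooling
FROM maven:3-jdk-8 AS build
WORKDIR /workspace
COPY . .
RUN mvn --batch-mode package

# Stage 2: minimal runtime image containing only the JRE and the artifact
FROM openjdk:8-jre-alpine
COPY --from=build /workspace/target/app.jar /opt/app/app.jar
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
```

Because both stages live in one Dockerfile, the risk of configuration drift between separately maintained build and run images is reduced, though the pipeline should still test the final runtime image rather than the build-stage image.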

6. Processes: Execute the app as one or more stateless processes

Building and running a Java application as a series of stateless microservices can be made easier by using Docker (the concept of microservices is explained later in this chapter).

7. Port binding: Export services via port binding

Java developers are used to exposing application services via ports (e.g., running an application on Jetty or Apache Tomcat). Docker complements this by allowing the declarative specification of exposed ports within an application’s Dockerfile, and Docker Compose enables the same functionality when orchestrating multiple containers.
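As a sketch of this factor (the class name and the `PORT` variable are illustrative assumptions), a self-contained service can bind its own port using the JDK's built-in `com.sun.net.httpserver` server, with the port number supplied through the environment; the Dockerfile would then declare it with `EXPOSE 8080`:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// A service that exports itself via port binding (factor 7): no external
// application server is required, and the port comes from the environment.
public class PortBindingExample {
    public static void main(String[] args) throws Exception {
        int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "OK".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        System.out.println("Listening on port " + port);
    }
}
```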

8. Concurrency: Scale out via the process model

Traditional Java applications typically take the opposite approach to scaling, as the JVM runs as a giant “uberprocess” that is often vertically scaled by adding more heap memory, or horizontally scaled by cloning and load-balancing across multiple running instances. However, the combination of decomposing Java applications into microservices and running these components within Docker containers can enable this approach to scalability. Regardless of the approach taken to implement scalability, this should be tested within the build pipeline.

9. Disposability: Maximize robustness with fast startup and graceful shutdown

This can require a mindset shift for developers who are used to creating a traditional long-running Java application, where much of the expense of application configuration and initialization was front-loaded in the JVM/application startup process. Modern, container-ready applications should utilize more just-in-time configuration, and ensure that best efforts are taken to clean up resources and state during shutdown. The JDK exposes hooks for JRE/application startup and shutdown, and Docker (and the OS) can be instructed to call these, such as by issuing SIGTERM (versus SIGKILL) to running processes within a container.
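A minimal sketch of the shutdown side: a JVM shutdown hook gives the application a chance to release resources when Docker sends SIGTERM (as `docker stop` does). Note that SIGKILL, sent after the stop timeout expires, cannot be intercepted, and the ENTRYPOINT must use the exec form so that the Java process actually receives the signal.

```java
// Register a shutdown hook that runs when the JVM receives SIGTERM.
public class GracefulShutdown {
    public static void main(String[] args) throws Exception {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // Close connections, flush buffers, deregister from service
            // discovery, etc.
            System.out.println("Shutting down cleanly");
        }));
        System.out.println("Application started; waiting for SIGTERM");
        Thread.sleep(1000); // placeholder for the application's real work
    }
}
```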

10. Dev/prod parity: Keep development, staging, and production as similar as possible

The use of Docker in combination with orchestration technologies like Docker Swarm, Kubernetes, and Mesos can make this easier in comparison with traditional bare metal deployments where the underlying hardware and OS configuration is often significantly different than developer or test machines. As an application artifact moves through the build pipeline, it should be exposed to more and more realistic environments (e.g., unit testing can run in-memory [or within a Docker container] on a build box). However, end-to-end testing should be conducted in a production-like environment (e.g., if you are running Docker Swarm in production, you should be testing on Docker Swarm).

11. Logs: Treat logs as event streams

Java has had a long and sometimes arduous relationship with logging frameworks, but modern frameworks like Logback and Log4j 2 can be configured to stream log events to standard output (and hence viewed, when running in a container, by using the docker logs command) or to disk (which can be mounted from the host running the container).
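As a sketch, a Logback configuration that streams everything to standard output (the pattern layout here is an illustrative choice) might look like:

```xml
<!-- logback.xml: send all log events to stdout so that
     "docker logs <container>" can collect them as an event stream. -->
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```

Routing to stdout also plays well with Docker logging drivers, which can forward the stream to a central aggregator without any application changes.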

12. Admin processes: Run admin/management tasks as one-off processes

The ability to create simple Java applications that can be run within a Docker container allows administrative tasks to be run as one-off processes. However, these processes must be tested within (or as part of) the build pipeline.

This section of the chapter has attempted to summarize the benefits of using the principles from the Twelve-Factor App. For developers looking to develop a deeper understanding, Kevin Hoffman’s Beyond the Twelve-Factor App (O’Reilly) is a recommended read.

The Move to Microservices

One of the early use cases for Docker containers was the process isolation guarantees—the fact that multiple application instances could be deployed onto the same hardware without the need for a virtual machine hypervisor. Arguments emerged that suggested running only a single process per container was best practice, and that container runtimes should be minimal and the artifacts deployed into them should not be monolithic applications. This proved a difficult (if not impossible) set of requirements for existing Java applications that were already running in production. These arguments, in combination with other trends in the software development industry, led to the increase in popularity of a new architectural style we now refer to as microservices.

This book won’t cover the concept of microservices in depth; instead, an introduction to the topic can be found in Christian Posta’s Microservices for Java Developers (O’Reilly), and a more thorough treatment can be found in Sam Newman’s Building Microservices (O’Reilly) and Amundsen et al.’s Microservice Architecture (O’Reilly). However, it is worth mentioning that many developers are embracing this architectural style in combination with adopting containerization technologies like Docker.

A core concept of microservices revolves around creating services that follow the single-responsibility principle and have one reason to change. Building Java-based microservices impacts the implementation of CD in several ways:

  • Multiple build pipelines (or branches within a single pipeline) must be created and managed
  • Deployment of multiple services to an environment now has to be orchestrated, managed, and tracked
  • Component testing may now have to mock, stub, or virtualize dependent services
  • End-to-end testing must now orchestrate multiple services (and associated state) before and after executing tests
  • Processes must be implemented to manage service version control (e.g., the enforcement of only allowing the deployment of compatible interdependent services)
  • Monitoring, metrics, and application performance management (APM) tooling must be adapted to handle multiple services

Decomposing an existing monolithic application, or creating a new application that provides functionality through a composite of microservices, is a nontrivial task. Techniques such as context mapping, from domain-driven design, can help developers (working alongside stakeholders and the QA team) understand how application/business functionality should be composed as a series of bounded contexts or focused services. It is also critical to understand how to design service application programming interfaces (APIs).

API-Driven Applications

Once service boundaries have been determined, the development team (and stakeholders) can define service functionality and the associated interfaces—the application programming interfaces (APIs). Mike Amundsen’s O’Reilly video series “Designing APIs for the Web” is a good place to start if you want to learn more about API design. Many teams attempt to define a service API up front, but in reality the design process will be iterative. A useful technique to enable this iterative approach is the behavior-driven development (BDD) technique named “The Three Amigos,” where any requirement should be defined with at least one developer, one QA specialist, and one project stakeholder present.

The typical outputs from this stage of the service design process include: a series of BDD-style acceptance tests that assert component (single microservice) level requirements, such as Cucumber Gherkin syntax acceptance test scripts; and an API specification, such as a Swagger or RAML file, which the test scripts will operate against. It is also recommended that each service has basic (happy path) performance test scripts created (for example, using Gatling or JMeter) and also security tests (for example, using bdd-security). These service-level component tests can then be run continuously within the build pipeline, and will validate local microservice functional and nonfunctional requirements. Additional internal resource API endpoints can be added to each service, which can be used to manipulate internal state for test purposes or expose metrics.
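A sketch of what such a BDD-style acceptance test might look like in Gherkin syntax (the feature, endpoint, and field names are hypothetical examples, not from the original text):

```gherkin
# Component-level acceptance test asserted against the service's API
Feature: Account lookup

  Scenario: An existing account can be retrieved by ID
    Given an account with ID "1234" exists
    When a GET request is made to "/accounts/1234"
    Then the response status is 200
    And the response body contains the field "accountId" with value "1234"
```

The step definitions behind these lines would exercise the API described in the Swagger or RAML specification, keeping the scenario itself implementation-agnostic.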

The benefits to the CD process of exposing application or service functionality via an API include:

  • Easier automation of test fixture setup and teardown via internal resource endpoints (and this limits or removes the need to manipulate state via file system or data store access).
  • Easier automation of specification tests (e.g., REST-assured). Triggering functionality through a fragile UI is no longer required for every test.
  • API contracts can be validated automatically, potentially using techniques like consumer contracts and consumer-driven contracts (e.g., Pact-JVM).
  • Dependent services that expose functionality through an API can be efficiently mocked (e.g., WireMock), stubbed (e.g., stubby4j), or virtualized (e.g., Hoverfly).
  • Easier access to metrics and monitoring data via internal resource endpoints (e.g., Codahale Metrics or Spring Boot Actuator).

Containers and Mechanical Sympathy

Martin Thompson and Dave Farley have talked about the concept of mechanical sympathy in software development for several years. They were inspired by the Formula One racing driver Jackie Stewart’s famous quote “You don’t have to be an engineer to be a racing driver, but you do have to have mechanical sympathy”, meaning that understanding how a car works will make you a better driver; and it has been argued that this is analogous to programmers understanding how computer hardware works. You don’t necessarily need a degree in computer science or to be a hardware engineer, but you do need to understand how hardware works and take that into consideration when you design software. The days of architects sitting in ivory towers and drawing UML diagrams are over. Architects and developers must continue to develop practical and operational experience from working with the new technologies.

Using container technologies like Docker can fundamentally change the way your software interacts with the hardware it is running on, and it is beneficial to be aware of these changes:

  • Container technology can limit access to system resources, either through explicit developer/operator specification or through resource contention.
    • In particular, watch out for the restriction of memory available to a JVM, and remember that a Java application’s memory requirements are not simply equal to heap size. In reality, they are the sum of the -Xmx heap size, PermGen/Metaspace, native memory for thread stacks, and general JVM overhead.
    • Another source of potential issues is that containers typically share a single source of entropy (/dev/random) on the host machine, and this can be quickly exhausted. This manifests itself with Java applications unexpectedly stalling/blocking during cryptographic operations such as token generation on the initialization of security functionality. It is often beneficial to use the JVM option -Djava.security.egd=file:/dev/urandom, but be aware that this can have some security implications.
  • Container technology can (incidentally) expose incorrect resource availability to the JVM (e.g., the number of processor cores exposed to a JVM application is typically based on the underlying host hardware, not the restrictions applied to the running container).
  • When running on a containerized deployment fabric, it is often the case that additional layers of abstraction are applied over the operating system (e.g., an orchestration framework, the container technology itself, and an additional OS).
  • Container orchestration and scheduling frameworks often stop, start, and move containers (and applications) much more often compared to traditional deployment platforms.
  • The hardware fabric upon which containerized applications are run is typically more ephemeral in nature (e.g., cloud computing).
  • Containerized applications can expose new security attack vectors that must be understood and mitigated.
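The resource-availability point above can be demonstrated with a small diagnostic sketch: inside a CPU- or memory-limited container, the values below may reflect the host rather than the container's limits, depending on the JVM version and its container-support flags.

```java
// Print what the JVM believes about its environment. Run this inside a
// container started with e.g. "docker run --cpus=1 -m 256m ..." and compare
// the output against the limits you actually set.
public class ResourceReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Available processors: " + rt.availableProcessors());
        System.out.println("Max heap (MiB): " + rt.maxMemory() / (1024 * 1024));
    }
}
```

If the reported values match the host rather than the container, thread-pool sizing and default heap calculations based on them will be wrong, which is exactly the kind of mismatch pipeline testing should surface.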

These changes to the properties of the deployment fabric should not be a surprise to developers, as the use of any new technology introduces some form of change (e.g., upgrading the JVM version on which an application is running, deploying Java applications within an application container, and running Java applications in the cloud). The vast majority of these potential issues can be mitigated by augmenting the testing processes within the CD build pipeline.
