Swarm on zCX
This chapter describes the use of Docker swarm mode on zCX, including the necessary configuration steps for setting up a clustering container environment.
Upon completion of this chapter, you should be able to deploy a containerized web application on IBM Z and use orchestration tools.
This chapter includes the following topics:
11.1 Swarm introduction
The Docker website describes a swarm as follows:
A swarm consists of multiple Docker hosts, which run in swarm mode and act as managers (to manage membership and delegation) and workers (which run swarm services). A given Docker host can be a manager, a worker, or perform both roles. When you create a service, you define its optimal state (number of replicas, network, and storage resources available to it, ports the service exposes to the outside world, and more). Docker works to maintain that wanted state. For instance, if a worker node becomes unavailable, Docker schedules that node’s tasks on other nodes. A task is a running container that is part of a swarm service and managed by a swarm manager, as opposed to a stand-alone container.1
The Docker command-line interface (CLI) is used to create a swarm, deploy application services to a swarm, and manage swarm behavior. (A web interface is not used for swarm.)
The following key concepts are related to swarm:
Node
A node is an instance of the Docker Engine that is participating in the swarm. Multiple nodes can be run on one physical or virtual server. Typically, nodes are spread across multiple servers in a production environment. Nodes in a swarm can be manager or worker nodes:
 – Manager node
This node conducts the orchestration and management functions for the swarm. Interaction with the swarm happens through a manager node, which in turn dispatches units of work that are called tasks to the worker nodes.
Docker recommends the use of an odd number of manager nodes in a swarm for redundancy and to maintain a quorum, with seven being the recommended maximum. It is possible to run with only one manager node. However, if that node fails, the running services continue to operate, but a new cluster must be built to recover management of the swarm.
 
Note: Increasing the number of manager nodes does not increase performance of the swarm, and doing so might negatively affect the performance of the swarm.
 – Worker node
This node receives tasks from the manager nodes and runs them. Worker nodes inform the manager nodes of their status and the status of the assigned tasks. They exist to run containers; they have no input into the load-balancing decisions of the swarm and they do not participate in the Raft consensus. The availability status of worker nodes is controlled by the manager nodes.
 
Note: Manager nodes are worker nodes by default. Manager nodes can be demoted to worker node status by using the docker node demote command. Worker nodes can be promoted to manager node status by using the docker node promote command.
Tasks and services
A task carries a Docker container and the commands to be run inside the container. Manager nodes assign tasks to worker nodes according to the number of replicas that is set in the service scale. After a task is assigned to a node, it cannot move to another node; it can only run on the assigned node or fail.
A service is the definition of the tasks to run on the manager or worker nodes. It is the central structure of the swarm system and the primary root of user interaction with the swarm. The service definition specifies which container image to use and which commands to run inside the running containers.
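For illustration, the following minimal sketch creates a replicated service from the CLI. The service name, image, and replica count are arbitrary examples and are not used elsewhere in this chapter:

# Service definition: the image to run, the command to run inside it, and three replicas
docker service create --name pinger --replicas 3 alpine:latest ping docker.com

The swarm manager turns this definition into three tasks and schedules one container per task on the available nodes.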
Figure 11-1 shows the architecture of a swarm.
Figure 11-1 Swarm architecture diagram
Load balancing
The swarm manager uses ingress load balancing to expose the services that you want to make available externally to the swarm. The swarm manager can automatically assign the service a PublishedPort in the 30000 - 32767 range, or you can configure a PublishedPort for the service.
External components, such as cloud load balancers, can access the service on the PublishedPort of any node in the cluster, whether or not that node is currently running a task for the service. All nodes in the swarm route ingress connections to a running task instance.
Swarm mode has an internal DNS component that automatically assigns a DNS entry to each service in the swarm. The swarm manager uses internal load balancing to distribute requests among services within the cluster, based on the DNS name of the service.
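The following sketch illustrates both mechanisms; the network name, service name, and port values are examples only:

# Create an overlay network and attach a published service to it
docker network create --driver overlay appnet
docker service create --name web --network appnet --publish published=8080,target=80 nginx:latest

# The ingress routing mesh answers on port 8080 of every swarm node,
# even on nodes that run no web task
curl http://<any-node-ip>:8080

# From a container of another service that is attached to appnet,
# the service is reachable by its DNS name through internal load balancing
curl http://web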
11.2 zCX in swarm mode
In this section, we configure Docker Swarm mode on our zCX instances to demonstrate the application portability that can be achieved by zCX.
As discussed earlier in this chapter, Docker swarm mode is part of Docker Engine. Therefore, we do not have to install any additional components to use this functionality.
In this example, we complete the following steps:
1. Remove the node from the swarm.
2. Initialize swarm mode.
3. Add a manager node to the cluster.
4. Add worker nodes to the manager node.
5. Create a Docker service.
6. Scale up a container.
7. Change node availability to simulate a scheduled maintenance.
8. Promote a worker node.
9. Demote a manager node.
10. Scale down a service.
Our environment
The specific components of our swarm environment are listed in Table 11-1.
Table 11-1 Our zCX swarm environment
Server Name             Server IP      Swarm Node role   z/OS LPAR
sc74cn03.pbm.ihost.com  129.40.23.70   Manager           wtsc74.pbm.ihost.com
sc74cn04.pbm.ihost.com  129.40.23.71   Worker            wtsc75.pbm.ihost.com
sc74cn09.pbm.ihost.com  129.40.23.76   Manager           wtsc74.pbm.ihost.com
sc74cn10.pbm.ihost.com  129.40.23.77   Worker            wtsc75.pbm.ihost.com
Figure 11-2 depicts our zCX cluster environment.
Figure 11-2 Our zCX environment
Note: Swarm uses server names and IP addresses for cluster configuration. Ensure that you correctly set up DNS with the name and IP addresses of the zCX instances.
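A quick way to verify name resolution before you build the cluster is to resolve each peer's host name from the zCX CLI shell. This is a sketch and assumes that the getent utility is available in the zCX CLI container:

getent hosts sc74cn04.pbm.ihost.com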
11.2.1 Removing a node from the swarm
Use the docker swarm leave command if you need to remove a specific node from a swarm cluster. If your zCX instance has never joined a swarm cluster, you can skip this step. Otherwise, issue the command that is shown in Example 11-1.
Example 11-1 Wiping the swarm configuration
admin@570e9473805e:~$ docker swarm leave
Node left the swarm.
 
Note: For a manager node, you must append --force to force the node to leave the cluster.
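For example, on a manager node the clean-up command becomes the following (a sketch of the same step):

docker swarm leave --force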
11.2.2 Initializing swarm mode
The first node in the swarm becomes the manager leader. When you initialize the swarm, this node advertises its IP address to the rest of the cluster. In our case, we use the sc74cn03.pbm.ihost.com (129.40.23.70) instance as the leader. (See Example 11-2.)
Example 11-2 Initializing a swarm cluster
admin@570e9473805e:~$ docker swarm init --advertise-addr 129.40.23.70
Swarm initialized: current node (1xk53pdq2ujayiea2vbfu3bs1) is now a manager.
 
To add a worker to this swarm, run the following command:
 
docker swarm join --token SWMTKN-1-17aalrjaxjmlx43ytth47xb23enxaek0ris89826ane1d6o99z-1zzz4qvmc8bxhb2le3gulg3nx 129.40.23.70:2377
 
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
 
Note: After you issue the preceding docker swarm init command, the manager node is defined and the output displays the command that allows you to add new worker nodes to the cluster. Make a note of it because we use this command in the next steps.
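If you did not record it, you can display the worker join command (including its token) again at any time from a manager node, for example:

docker swarm join-token worker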
The output in Example 11-2 also tells you how to obtain the command that adds another manager node to the cluster. We ran docker swarm join-token manager as instructed, and Example 11-3 shows the output.
Example 11-3 Getting the command to join manager nodes into the cluster
admin@570e9473805e:~$ docker swarm join-token manager
To add a manager to this swarm, run the following command:
 
docker swarm join --token SWMTKN-1-17aalrjaxjmlx43ytth47xb23enxaek0ris89826ane1d6o99z-cafuexvwlvojqhuosk7lqe07q 129.40.23.70:2377
11.2.3 Adding node manager to the cluster
In this step, we add the second manager node to our swarm cluster.
To join sc74cn09.pbm.ihost.com as a manager node, follow these steps:
1. Connect to the sc74cn09.pbm.ihost.com zCX instance by using ssh as the admin user.
2. Issue docker swarm join, including the manager join token that you got in Example 11-3. The following example shows the output:
Example 11-4 Joining a manager node
admin@01f543067055:~$ docker swarm join --token SWMTKN-1-17aalrjaxjmlx43ytth47xb23enxaek0ris89826ane1d6o99z-cafuexvwlvojqhuosk7lqe07q 129.40.23.70:2377
This node joined a swarm as a manager.
11.2.4 Adding worker nodes to the manager node
The worker nodes run the Docker workloads. Therefore, you can add more worker nodes to process more workload (see Example 11-5).
To join our first worker node (sc74cn04.pbm.ihost.com), complete the following steps:
1. Connect to the sc74cn04.pbm.ihost.com zCX instance by using ssh as the admin user.
2. Issue docker swarm join, including the token for the worker node that you got in Example 11-2. The following example shows the output:
Example 11-5 Joining a worker node
admin@a29d70c73c55:~$ docker swarm join --token SWMTKN-1-17aalrjaxjmlx43ytth47xb23enxaek0ris89826ane1d6o99z-1zzz4qvmc8bxhb2le3gulg3nx 129.40.23.70:2377
This node joined a swarm as a worker.
We repeated the same join command for the sc74cn10.pbm.ihost.com server; Example 11-6 confirms that it also joined the cluster.
Now, check the status of the cluster by running the command that is shown in Example 11-6 from one of the manager nodes.
Example 11-6 Checking the status of our swarm cluster
admin@570e9473805e:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
1xk53pdq2ujayiea2vbfu3bs1 * sc74cn03.pbm.ihost.com Ready Active Reachable 18.09.2
cidllaskmus095rrb221ii9b9 sc74cn04.pbm.ihost.com Ready Active 18.09.2
4y9a3mej93xvq4denfphhz53u sc74cn09.pbm.ihost.com Ready Active Leader 18.09.2
j2b9596uktzhupuxv6td22mm1 sc74cn10.pbm.ihost.com Ready Active 18.09.2
By default, all nodes in the cluster can receive Docker workloads. In our setup, we want only the worker nodes to receive application workloads from swarm, so we prevent the manager nodes from receiving them. Run the following commands from any one of the manager nodes. In our example, we ran them from the sc74cn03.pbm.ihost.com server to drain both sc74cn03.pbm.ihost.com and sc74cn09.pbm.ihost.com, as shown in Example 11-7.
Example 11-7 Node update command
admin@570e9473805e:~$ docker node update --availability drain sc74cn03.pbm.ihost.com
sc74cn03.pbm.ihost.com
 
admin@570e9473805e:~$ docker node update --availability drain sc74cn09.pbm.ihost.com
sc74cn09.pbm.ihost.com
 
admin@570e9473805e:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
1xk53pdq2ujayiea2vbfu3bs1 * sc74cn03.pbm.ihost.com Ready   Drain Reachable 18.09.2
cidllaskmus095rrb221ii9b9 sc74cn04.pbm.ihost.com Ready   Active 18.09.2
4y9a3mej93xvq4denfphhz53u sc74cn09.pbm.ihost.com Ready   Drain Leader 18.09.2
j2b9596uktzhupuxv6td22mm1 sc74cn10.pbm.ihost.com Ready   Active 18.09.2
As you can see in Example 11-7, the manager nodes are now in Drain status. When you drain a node, the scheduler reassigns any tasks that are running on the node to other available worker nodes in the swarm. It also prevents the scheduler from assigning tasks to the node.
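To confirm that no application tasks remain on a drained node, you can list the tasks that are assigned to it from any manager node, for example:

docker node ps sc74cn03.pbm.ihost.com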
11.2.5 Creating a Docker service
In this step, we define a cluster service that is called webserver. This service provides the nginx application. Example 11-8 shows the service status before the command is run. Then, we deploy the new service and check the status again to confirm that the webserver service was created.
These commands must be run on a manager node that is part of the cluster. In our example, we ran them on the sc74cn09.pbm.ihost.com server.
Example 11-8 Creating a Docker service
admin@01f543067055:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
 
admin@01f543067055:~$ docker service create --name webserver -p 80:80 nginx:latest
pwu25hj3wpwwovo73d2is31gn
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
 
admin@01f543067055:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 1/1 nginx:latest *:80->80/tcp
You can get more information about the webserver service by running the command that is shown in Example 11-9. The nginx container was started on the sc74cn04.pbm.ihost.com server, which is a worker node; remember that we drained sc74cn03.pbm.ihost.com and sc74cn09.pbm.ihost.com in a previous step, so they do not receive application workloads. By default, only one replica is started.
Example 11-9 Checking web server service
admin@01f543067055:~$ docker service ps webserver
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE         ERROR   PORT
vkqgrr6aypi3 webserver.1 nginx:latest sc74cn04.pbm.ihost.com Running Running 2 minutes ago
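For more detail about the service definition itself, such as its mode, published ports, and update configuration, you can also inspect the service from a manager node (a small sketch):

docker service inspect --pretty webserver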
Now, we stop the container on the sc74cn04.pbm.ihost.com server to see the behavior when someone takes down a container that is managed by swarm (see Example 11-10).
Example 11-10 Stopping the nginx container on the sc74cn04.pbm.ihost.com server
admin@a29d70c73c55:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS             NAME
0f0fac618197 nginx:latest "nginx -g 'daemon of…" 5 minutes ago Up 5 minutes 80/tcp webserver.1.vkqgrr6aypi3e8ue88wuqedxw
a29d70c73c55 ibm_zcx_zos_cli_image "sudo /usr/sbin/sshd…" 2 hours ago Up 2 hours 8022/tcp, 0.0.0.0:8022->22/tcp ibm_zcx_zos_cli
 
admin@a29d70c73c55:~$ docker stop 0f0fac618197
0f0fac618197
 
admin@a29d70c73c55:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS     NAME
e5a065c3177b nginx:latest "nginx -g 'daemon of…" 6 seconds ago Up Less than a second 80/tcp webserver.1.2y909gfnju13ybtxwsknljuyr
a29d70c73c55 ibm_zcx_zos_cli_image "sudo /usr/sbin/sshd…" 2 hours ago Up 2 hours 8022/tcp, 0.0.0.0:8022->22/tcp ibm_zcx_zos_cli
Swarm identified that the nginx container was down on the sc74cn04.pbm.ihost.com worker node and automatically started a replacement container on the same node. Often the replacement is scheduled on another worker node, but the decision depends on the swarm scheduler.
To confirm that the container was restarted, we ran the command that is listed in Example 11-11 on sc74cn03.pbm.ihost.com (a manager node) to show the status of the webserver service.
Example 11-11 Checking status of web server service after container stops
admin@570e9473805e:~$ docker service ps webserver
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
2y909gfnju13 webserver.1 nginx:latest sc74cn04.pbm.ihost.com Running Running 5 minutes ago
vkqgrr6aypi3 \_ webserver.1 nginx:latest sc74cn04.pbm.ihost.com Shutdown Complete 5 minutes ago
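The restart behavior that swarm applied here can be tuned when a service is created. The following sketch shows the relevant docker service create flags with illustrative values; they are not the settings that are used in our environment:

docker service create --name webserver -p 80:80 \
  --restart-condition any \
  --restart-delay 5s \
  --restart-max-attempts 3 \
  nginx:latest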
11.2.6 Scaling up a container
Scaling a service allows you to increase the number of container instances. This activity is common when you need to increase capacity for your application and allow more requests to be processed by the cluster.
We scaled up the webserver service from 1 to 4 instances, as shown in Example 11-12. We ran this command on sc74cn03.pbm.ihost.com (a manager node).
Example 11-12 Scaling up your web server service
admin@570e9473805e:~$ docker service scale webserver=4
webserver scaled to 4
overall progress: 4 out of 4 tasks
1/4: running [==================================================>]
2/4: running [==================================================>]
3/4: running [==================================================>]
4/4: running [==================================================>]
verify: Service converged
 
admin@570e9473805e:~$ docker service ps webserver
ID NAME IMAGE NODE DESIRED STATE  CURRENT STATE      ERROR  PORT
2y909gfnju13 webserver.1 nginx:latest sc74cn04.pbm.ihost.com Running Running 7 minutes ago
vkqgrr6aypi3 \_ webserver.1 nginx:latest sc74cn04.pbm.ihost.com Shutdown Complete 7 minutes ago
omled936b5dt webserver.2 nginx:latest sc74cn10.pbm.ihost.com Running Running 16 seconds ago
4ztuxznjrc8t webserver.3 nginx:latest sc74cn10.pbm.ihost.com Running Running 16 seconds ago
hqb26nea8vmf webserver.4 nginx:latest sc74cn04.pbm.ihost.com Running Running 17 seconds ago
 
admin@570e9473805e:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 4/4 nginx:latest *:80->80/tcp
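The same result can be achieved with docker service update, which is useful when you want to change the replica count together with other service settings (an equivalent sketch):

docker service update --replicas 4 webserver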
For the next example, we scaled the webserver service from 4 to 30 instances, as shown in Example 11-13.
Example 11-13 Scaling web server service from 4 to 30
admin@570e9473805e:~$ docker service scale webserver=30
webserver scaled to 30
overall progress: 30 out of 30 tasks
verify: Service converged
 
admin@570e9473805e:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 30/30 nginx:latest *:80->80/tcp
11.2.7 Changing node availability to simulate a scheduled maintenance
Setting a node's availability from active to drain is often used when you need to perform maintenance on your Docker host. When a node's availability is drain, all service containers on that node are stopped and started on the other active nodes that are members of the cluster.
We changed the availability of the sc74cn04.pbm.ihost.com (worker node) server from active to drain (see Example 11-14).
Example 11-14 Putting server in DRAIN mode
admin@570e9473805e:~$ docker node update --availability drain sc74cn04.pbm.ihost.com
sc74cn04.pbm.ihost.com
 
admin@570e9473805e:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
1xk53pdq2ujayiea2vbfu3bs1 * sc74cn03.pbm.ihost.com Ready Drain Reachable 18.09.2
cidllaskmus095rrb221ii9b9 sc74cn04.pbm.ihost.com Ready Drain 18.09.2
4y9a3mej93xvq4denfphhz53u sc74cn09.pbm.ihost.com Ready Drain Leader 18.09.2
j2b9596uktzhupuxv6td22mm1 sc74cn10.pbm.ihost.com Ready Active 18.09.2
 
admin@570e9473805e:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 30/30 nginx:latest *:80->80/tcp
The output that is shown in Example 11-15 confirms that all containers for the webserver service were stopped on the drained node.
Example 11-15 Show containers that are available
admin@a29d70c73c55:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a29d70c73c55 ibm_zcx_zos_cli_image "sudo /usr/sbin/sshd…" 2 hours ago Up 2 hours 8022/tcp, 0.0.0.0:8022->22/tcp ibm_zcx_zos_cli
To return the sc74cn04.pbm.ihost.com (worker node) server to active, run the command that is shown in Example 11-16. We also scale the webserver service to 105 to see new instances start on the sc74cn04.pbm.ihost.com server, because swarm does not automatically rebalance existing tasks across the cluster.
Example 11-16 Returning the server back to accept workloads
# Making sc74cn04.pbm.ihost.com server available to the cluster
admin@570e9473805e:~$ docker node update --availability active sc74cn04.pbm.ihost.com
sc74cn04.pbm.ihost.com
 
# Scaling cluster from 30 to 105
admin@570e9473805e:~$ docker service scale webserver=105
webserver scaled to 105
overall progress: 105 out of 105 tasks
verify: Service converged
admin@570e9473805e:~$
 
# Connected to the sc74cn04.pbm.ihost.com worker node to confirm the containers that are running after scaling the service: 53 instances of nginx are running on sc74cn04.
admin@a29d70c73c55:~$ docker ps |grep nginx |wc -l
53
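Because swarm does not rebalance running tasks on its own, you can also force the existing tasks to be rescheduled across all active nodes without changing the replica count. Note that this sketch restarts the service's containers as they are rescheduled:

docker service update --force webserver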
11.2.8 Promoting a worker node
You might promote a worker node to be a manager node in scenarios like these:
You need to take a manager node down for maintenance.
You need to increase the number of manager nodes for your swarm cluster.
Example 11-17 shows the method for promoting a node.
Example 11-17 Promoting the sc74cn04.pbm.ihost.com node
admin@570e9473805e:~$ docker node promote sc74cn04.pbm.ihost.com
Node sc74cn04.pbm.ihost.com promoted to a manager in the swarm.
 
admin@570e9473805e:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
1xk53pdq2ujayiea2vbfu3bs1 * sc74cn03.pbm.ihost.com Ready    Drain Reachable 18.09.2
cidllaskmus095rrb221ii9b9 sc74cn04.pbm.ihost.com Ready    Active Reachable 18.09.2
4y9a3mej93xvq4denfphhz53u sc74cn09.pbm.ihost.com Ready    Drain Leader 18.09.2
j2b9596uktzhupuxv6td22mm1 sc74cn10.pbm.ihost.com Ready    Active 18.09.2
11.2.9 Demoting a manager node
We demoted the sc74cn04.pbm.ihost.com server, as shown in Example 11-18. Therefore, this server no longer performs manager node operations.
Example 11-18 Demoting a manager node
admin@570e9473805e:~$ docker node demote sc74cn04.pbm.ihost.com
Manager sc74cn04.pbm.ihost.com demoted in the swarm.
 
admin@570e9473805e:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
1xk53pdq2ujayiea2vbfu3bs1 * sc74cn03.pbm.ihost.com Ready    Drain Reachable 18.09.2
cidllaskmus095rrb221ii9b9 sc74cn04.pbm.ihost.com Ready    Active 18.09.2
4y9a3mej93xvq4denfphhz53u sc74cn09.pbm.ihost.com Ready    Drain Leader 18.09.2
j2b9596uktzhupuxv6td22mm1 sc74cn10.pbm.ihost.com Ready    Active 18.09.2
11.2.10 Scaling down a service
Use the command that is shown in Example 11-19 to scale down your service. In this example, we set the webserver service to have only four instances.
Example 11-19 Example of how to scale down service
admin@570e9473805e:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 105/105 nginx:latest *:80->80/tcp
 
admin@570e9473805e:~$ docker service scale webserver=4
webserver scaled to 4
overall progress: 4 out of 4 tasks
1/4: running [==================================================>]
2/4: running [==================================================>]
3/4: running [==================================================>]
4/4: running [==================================================>]
verify: Service converged
 
admin@570e9473805e:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
pwu25hj3wpww webserver replicated 4/4 nginx:latest *:80->80/tcp
11.2.11 Considerations regarding the number of manager nodes
It is vital to understand how swarm mode's fault-tolerance feature works to prevent problems with your cluster service. Although it is possible to run a cluster with only one manager node, this approach is not ideal for organizations that have high-availability policies. If the single manager node fails, the services continue to process user requests; however, you must create a new cluster to recover the manager node operations.
Docker uses a consensus algorithm that is named Raft to achieve internal consistency across the entire swarm cluster and all containers that are running on it. Docker recommends that you implement an odd number of manager nodes, based on your high-availability requirements, according to the following rules:
A three-manager swarm tolerates a maximum loss of one manager.
A five-manager swarm tolerates a maximum simultaneous loss of two manager nodes.
An N-manager cluster tolerates the loss of at most (N-1)/2 managers.
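For example, the recommended maximum of seven managers tolerates the simultaneous loss of (7 - 1) / 2 = 3 manager nodes while the remaining four still form a quorum.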
For more information about the cluster algorithm, see this web page.

1 https://docs.docker.com/engine/swarm/key-concepts/#what-is-a-swarm