Creating a scalable Docker Swarm cluster on bare metal with Packet

IaaS clouds have been popularized through heavy usage of virtual machines. Recent initiatives are targeting bare metal servers with an API, so we get the best of both worlds: on-demand servers through an API and incredible performance through direct access to the hardware. Packet (https://www.packet.net/) is a bare metal IaaS provider (https://www.scaleway.com/ is another) that is very well supported by Terraform and has an awesome global network. Within minutes, we have new hardware ready and connected to the network.

We'll build a fully automated and scalable Docker Swarm cluster, so we can operate highly scalable and performant workloads on bare metal: this setup can scale to thousands of containers in just a few minutes. The cluster is composed of Type 0 machines (4 cores and 8 GB of RAM), one manager and two nodes, totaling 12 cores and 24 GB of RAM, but we can use more powerful machines if we want: the same cluster with Type 2 machines would have 72 cores and 768 GB of RAM (though the price will adapt accordingly).

Getting ready

To step through this recipe, you will need the following:

  • A working Terraform installation
  • A Packet.net account with an API key
  • An Internet connection

How to do it…

Let's start by creating the packet provider, using the API key (an authentication token). Create the variable in variables.tf:

variable "auth_token" {
  default     = "1234"
  description = "API Key Auth Token"
}

Also, be sure to override the value in terraform.tfvars with the real token:

auth_token = "JnN7e6tPMpWNtGcyPGT93AkLuguKw2eN"
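
The provider declaration itself simply consumes this variable. A minimal version, placed in a file such as provider.tf (the filename is our choice), looks like this:

# file: provider.tf
provider "packet" {
  auth_token = "${var.auth_token}"
}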

Creating a Packet project using Terraform

Packet, like some other IaaS providers, uses the notion of a project to group machines. Let's create a project named Docker Swarm Bare Metal Infrastructure, since that's what we want to build, in a projects.tf file:

resource "packet_project" "swarm" {
  name = "Docker Swarm Bare Metal Infrastructure"
}

This way, if you happen to manage multiple projects or customers, you can split them all into their own projects.

Handling Packet SSH keys using Terraform

To connect to the machines using SSH, we need at least one public key uploaded to our Packet account. Let's create a variable to store it in variables.tf:

variable "ssh_key" {
  default     = "keys/admin_key"
  description = "Path to SSH key"
}

Don't forget to override the value in terraform.tfvars if you use another name for the key.
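
For example, if your key pair lives at a different path (the path below is purely illustrative), the override would be:

ssh_key = "keys/my_custom_key"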

Let's use the packet_ssh_key resource to create the SSH key on our Packet account:

resource "packet_ssh_key" "admin" {
  name       = "admin_key"
  public_key = "${file("${var.ssh_key}.pub")}"
}

Bootstrapping a Docker Swarm manager on Packet using Terraform

We'll create two types of servers for this Docker Swarm cluster: managers and nodes. Managers control what's executed on the nodes. We'll start by bootstrapping the Docker Swarm manager server, with the following choices (more alternatives are available through the Packet API):

  • We want the cheapest server (baremetal_0)
  • We want the servers in Amsterdam (ams1)
  • We want the servers to run Ubuntu 16.04 (ubuntu_16_04_image)
  • Default SSH user is root
  • Billing will be hourly, but that can be monthly as well

Let's put this generic information in variables.tf so we can manipulate it:

variable "facility" {
  default     = "ewr1"
  description = "Packet facility (us-east=ewr1, us-west=sjc1, eu-west=ams1)"
}

variable "plan" {
  default     = "baremetal_0"
  description = "Packet machine type"
}

variable "operating_system" {
  default     = "coreos_stable"
  description = "Packet operating_system"
}

variable "ssh_username" {
  default     = "root"
  description = "Default host username"
}

Also, override them in terraform.tfvars to match our values:

facility = "ams1"
operating_system = "ubuntu_16_04_image"

To create a server with Packet, let's use the packet_device resource, specifying the chosen plan, facility, operating system, billing, and the project in which it will run:

resource "packet_device" "swarm_master" {
  hostname         = "swarm-master"
  plan             = "${var.plan}"
  facility         = "${var.facility}"
  operating_system = "${var.operating_system}"
  billing_cycle    = "hourly"
  project_id       = "${packet_project.swarm.id}"
}

Now, let's create two scripts that will execute when the server is ready. The first one will update Ubuntu (update_os.sh) while the second will install Docker (install_docker.sh).

#!/usr/bin/env bash
# file: ./scripts/update_os.sh
sudo apt update -yqq
sudo apt upgrade -yqq

This script will install and start Docker:

#!/usr/bin/env bash
# file: ./scripts/install_docker.sh
curl -sSL https://get.docker.com/ | sh
sudo systemctl enable docker
sudo systemctl start docker

We can now call those scripts as a remote-exec provisioner inside the packet_device resource:

  provisioner "remote-exec" {
    connection {
      user        = "${var.ssh_username}"
      private_key = "${file("${var.ssh_key}")}"
    }

    scripts = [
      "scripts/update_os.sh",
      "scripts/install_docker.sh",
    ]
  }

At this point, the system is fully provisioned and functional, with Docker running.

To initialize a Docker Swarm cluster, starting with Docker 1.12, we can just issue the following command:

$ docker swarm init --advertise-addr docker.manager.local.ip

A server at Packet has one interface sharing both public and private IP addresses. The private IPv4 address is the third entry of the network list (the public IPv4 is at index 0), and is available through the following exported attribute: ${packet_device.swarm_master.network.2.address}. Let's create another remote-exec provisioner, so the Swarm manager is initialized automatically, right after bootstrap:

  provisioner "remote-exec" {
    connection {
      user        = "${var.ssh_username}"
      private_key = "${file("${var.ssh_key}")}"
    }

    inline = [
      "docker swarm init --advertise-addr ${packet_device.swarm_master.network.2.address}",
    ]
  }

At this point, we have a Docker cluster running, with only one node—the manager itself.

The last step is to store the Swarm token, so the nodes can join. The token can be obtained with the following command:

$ docker swarm join-token worker -q

We'll store this token in a simple file in our infrastructure repository (worker.token), so we can access it and version it. Let's create a variable to store our token in a file in variables.tf:

variable "worker_token_file" {
  default     = "worker.token"
  description = "Worker token file"
}

We will execute the previous docker swarm command through SSH when everything else is done, using a local-exec provisioner. As we can't interact with the process, let's skip the host key checking and other initial SSH checks:

  provisioner "local-exec" {
    command = "ssh -t -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ${var.ssh_key} ${var.ssh_username}@${packet_device.swarm_master.network.0.address} "docker swarm join-token worker -q" > ${var.worker_token_file}"
  }

We're now done with the Docker Swarm manager!

Bootstrapping Docker Swarm nodes on Packet using Terraform

We need nodes to join the swarm, so the workload can be spread across them. For convenience, the machine specs for the nodes will be the same as those of the master. Here's what will happen:

  • Two nodes are created
  • The token file is sent to each node
  • The operating system is updated, and Docker is installed
  • The node joins the swarm

Let's start by creating a variable for the number of nodes we want, in variables.tf:

variable "num_nodes" {
  default     = "1"
  description = "Number of Docker Swarm nodes"
}

Override that value in terraform.tfvars as the cluster grows:

num_nodes = "2"

Create the nodes using the same packet_device resource we used for the master:

resource "packet_device" "swarm_node" {
  count            = "${var.num_nodes}"
  hostname         = "swarm-node-${count.index+1}"
  plan             = "${var.plan}"
  facility         = "${var.facility}"
  operating_system = "${var.operating_system}"
  billing_cycle    = "hourly"
  project_id       = "${packet_project.swarm.id}"
}

Add a file provisioner to copy the token file:

  provisioner "file" {
    source      = "${var.worker_token_file}"
    destination = "${var.worker_token_file}"
  }

Using the same update and Docker installation scripts as the master, create the same remote-exec provisioner:

  provisioner "remote-exec" {
    connection {
      user        = "${var.ssh_username}"
      private_key = "${file("${var.ssh_key}")}"
    }

    scripts = [
      "scripts/update_os.sh",
      "scripts/install_docker.sh",
    ]
  }

The operating system is now fully updated and Docker is running.

Now we want to join the Docker Swarm cluster. To do this, we need two pieces of information: the token and the private IP of the master. The token file has already been copied to each node, and Terraform knows the private IP of the Swarm manager. So a trick is to create a simple script (I suggest you write a more robust one!) that reads the token from the copied file and takes the manager's private IP address as an argument. In a file named scripts/join_swarm.sh, enter the following lines:

#!/usr/bin/env bash
# file: scripts/join_swarm.sh
MASTER=$1
SWARM_TOKEN=$(cat worker.token)
docker swarm join --token ${SWARM_TOKEN} ${MASTER}:2377

Now we just have to send this file to the nodes using the file provisioner:

  provisioner "file" {
    source      = "scripts/join_swarm.sh"
    destination = "join_swarm.sh"
  }

Use it as the last step through a remote-exec provisioner, passing the Docker master's private IP (${packet_device.swarm_master.network.2.address}) as an argument to the script:

  provisioner "remote-exec" {
    connection {
      user        = "${var.ssh_username}"
      private_key = "${file("${var.ssh_key}")}"
    }

    inline = [
      "chmod +x join_swarm.sh",
      "./join_swarm.sh ${packet_device.swarm_master.network.2.address}",
    ]
  }
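
The IP addresses displayed after terraform apply below come from output definitions. Here's a minimal sketch of an outputs.tf that would produce output similar to what's shown (the output names and formatting are illustrative; names containing spaces are only valid with the pre-0.12 HCL syntax used throughout this recipe):

# file: outputs.tf (illustrative)
output "Swarm Master Private IP" {
  value = "${packet_device.swarm_master.network.2.address}"
}

output "Swarm Master Public IP" {
  value = "${packet_device.swarm_master.network.0.address}"
}

output "Swarm Nodes" {
  value = "Public: ${join(",", packet_device.swarm_node.*.network.0.address)}, Private: ${join(",", packet_device.swarm_node.*.network.2.address)}"
}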

Launch the whole infrastructure:

$ terraform apply
Outputs:

Swarm Master Private IP = 10.80.86.129
Swarm Master Public IP = 147.75.100.19
Swarm Nodes = Public: 147.75.100.23,147.75.100.3, Private: 10.80.86.135,10.80.86.133

Our cluster is running.

Using the Docker Swarm cluster

Using our Docker Swarm cluster is out of the scope of this book, but now that we have it, let's take a quick look at scaling a service to thousands of containers!

Verify we have our 3 nodes:

# docker node ls
ID                           HOSTNAME                STATUS  AVAILABILITY  MANAGER STATUS
9sxqi2f1pywmofgf63l84n7ps *  swarm-master.local.lan  Ready   Active        Leader
ag07nh1wzsbsvnef98sqf5agy    swarm-node-1.local.lan  Ready   Active
cppk5ja4spysu6opdov9f3x8h    swarm-node-2.local.lan  Ready   Active

We want a common network for our containers, and we want to scale to the thousands, so a typical /24 network, which is what Docker allocates by default and only provides around 254 usable addresses, won't be enough. Let's create a /16 overlay network, so we have room for scale!

# docker network create -d overlay --subnet 172.16.0.0/16 nginx-network

Create a Docker service that will simply launch an nginx container on this new overlay network, with 3 replicas (3 instances of the container running at the same time):

# docker service create --name nginx --network nginx-network --replicas 3 -p 80:80/tcp nginx

Verify that it's working:

# docker service ls
ID            NAME   REPLICAS  IMAGE  COMMAND
aeq9lspl0mpg  nginx  3/3       nginx

Now, when we access any of the cluster's public IPs over HTTP, any container on any node can answer: we can send an HTTP request to node-1, and it can be a container on node-2 responding. Nice!

Let's scale our service now, from 3 replicas to 100:

# docker service scale nginx=100
nginx scaled to 100
# docker service ls
ID            NAME   REPLICAS  IMAGE  COMMAND
aeq9lspl0mpg  nginx  100/100   nginx

We just scaled to a hundred containers in a few seconds, spread across all three bare metal machines.

Now, you know you can scale, and with such a configuration you can push the nginx service to 500, 1000, or maybe more!
