Join us as we explore the many perilous paths through a pod and into Kubernetes. See the system from an adversary’s perspective: get to know the multitudinous defensive approaches and their weaknesses, and revisit historical attacks on cloud native systems through the piratical lens of your nemesis: Dread Pirate Captain Hashjack.
Kubernetes has grown rapidly and has historically not been considered “secure by default”, mainly because security controls such as network policies are not enabled by default on vanilla clusters.
As authors we are infinitely grateful that our arc saw the cloud native enlightenment, and we extend our heartfelt thanks to the volunteers, core contributors, and Cloud Native Computing Foundation (CNCF) members involved in Kubernetes’ vision and delivery. Documentation and bug fixes don’t write themselves, and the incredible selfless contributions that drive open source communities have never been more freely given or more gratefully received.
Security controls are generally more difficult to get right than the complex orchestration and distributed system functionality that Kubernetes is known for.
To the security teams especially, we thank you for your hard work! This book is a reflection on the pioneering voyage of the good ship Kubernetes, out on the choppy and dangerous free seas of the internet.
For the purposes of imaginative immersion: you have just become Chief Information Security Officer (CISO) of the start-up haulier Boats, Cranes & Trains Logistics (BCTL), who have just completed their Kubernetes migration.
They’ve been hacked before and are “taking security seriously”. You have the authority to do what needs to be done to keep the company afloat, figuratively and literally.
Historical examples of marine control system instability can be seen in the film Hackers (1995), where Ellingson Mineral Company’s oil tankers fall victim to an internal attack by the company’s CISO, Eugene “The Plague” Belford.
Welcome to the job! It’s your first day, and you have been alerted to a credible threat against your cloud systems. Container-hungry pirate and generally bad egg Captain Hashjack, and their clandestine hacker crew, are lining up for a raid on BCTL’s Kubernetes clusters.
If they gain access to your clusters they’ll mine bitcoin or crypto lock any valuable data they can find. You have not yet threat modelled your clusters and applications, or hardened them against this kind of adversary, and so we will guide you on your journey to defend them from the salty Captain’s voyage to encode, exfiltrate, or plunder whatever valuables they can find.
The BCTL cluster is a vanilla Kubernetes installation using kubeadm on a public cloud provider. Initially, all settings are as default.
To demonstrate hardening a cluster, we’ll use an example insecure system.
It’s managed by the BCTL SRE team, which means the team are responsible for securing the Kubernetes master nodes. This increases the potential attack surface of the cluster: a managed service hosts the control plane (master nodes and […]) and their hardened configuration prevents some attacks (like a direct […]), but both approaches depend on the secure configuration of the cluster to protect your workloads.
Let’s talk about your cluster. The nodes run in a private network segment, so public (internet) traffic cannot reach them directly. Public traffic to your cluster is proxied through an internet-facing load balancer: the ports on your nodes are not directly accessible to the world.
GitOps is declarative configuration deployment for applications: think of it like traditional configuration management for Kubernetes clusters. You can read more at gitops.tech and learn more on how to harden Git for GitOps in this whitepaper.
Running on the cluster there is a SQL datastore, as well as a front end, API, and batch processor.
The hosted application—a booking service for your company’s clients—is deployed in a single namespace using GitOps, but without a Network Policy or Pod Security Policy as discussed in the policy chapter.
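As a rough illustration of the kind of control this namespace is missing, here is a minimal sketch that builds a default-deny NetworkPolicy manifest. The namespace name `booking` and the policy name `default-deny-all` are hypothetical, and this is only one possible starting point rather than BCTL's actual configuration.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all pod ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is an illustrative choice, not a Kubernetes default.
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty selector matches every pod in the namespace.
            "podSelector": {},
            # Both directions are listed with no allow rules, so all traffic is denied.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "booking" is a hypothetical namespace for the booking service.
    print(json.dumps(default_deny_policy("booking"), indent=2))
```

The printed JSON can be applied like any other manifest, for example by piping it to `kubectl apply -f -`. It only takes effect if the cluster's network plugin enforces network policies, and you would then add narrower policies to allow the traffic the application genuinely needs.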
Here’s a network diagram of the system in Figure 1-1:
The cluster’s role-based access control (RBAC) was configured by engineers who have moved on. The inherited security support services have intrusion detection and hardening, but the team has been disabling them from time to time as they were making too much noise. We will discuss this configuration in-depth as we press on with the voyage.
But first, let’s explore how to predict security threats to your clusters.
Understanding how a system is attacked is fundamental to defending it. A threat model gives you a more complete understanding of a complex system and provides a framework for rationalising security and risk. Threat actors are the categories of potential adversary that the system is configured to defend against.
A threat model is like a fingerprint: every one is different. A threat model is based upon the impact of a system’s compromise. A Raspberry Pi hobby cluster and your bank’s clusters hold different data, have different potential attackers, and very different potential problems if broken into.
Threat modelling can reveal insights into your security program and configuration, but it doesn’t solve everything. You should make sure you are following basic security hygiene (like patching and testing) before considering the more advanced and technical attacks that a threat model may reveal.
If your systems can be compromised by published CVEs and a copy of Kali Linux, a threat model will not help you!
Your threat actors can be considered casual or motivated. Casual adversaries include:
Vandals (the graffiti kids of the internet generation).
Accidental trespassers looking for treasure (which is usually your data).
Drive-by “script kiddies”, who will run any code they find on the internet if it claims to help them hack.
Casual attackers shouldn’t be a concern to most systems that are patched and well configured.
Motivated individuals are the ones you should worry about. They include insiders like trusted employees, organised crime syndicates operating out of less-well-policed states, and state-sponsored actors, who may overlap with organised crime or sponsor it directly. “Internet crimes” are not well-covered by international laws and can be hard to police.
Table 1-1 lists threat actors and can be used to guide threat modelling:
Actor: Vandal (script kiddie, trespasser)
Motivation: Curiosity, personal fame; fame from bringing down service or compromising confidential dataset of a high-profile company
Capability: Uses publicly available tools and applications (Nmap, Metasploit, CVE PoCs). Some experimentation. Attacks are poorly concealed. Low level of targeting
Sample attacks: Small scale DOS; launches prepackaged exploits for access, crypto mining

Actor: Motivated individual (political activist, thief, terrorist)
Motivation: Personal gain, political or ideological; personal gain to be had from exfiltrating and selling large amounts of personal data for fraud, perhaps achieved through manipulating code in version control or artefact storage, or exploiting vulnerable applications from knowledge gained in ticketing and wiki systems, OSINT, or other parts of the system; personal kudos from DDOS of large public-facing web service; defacement of the public-facing services through manipulation of code in version control or public servers can spread political messages amongst a large audience
Capability: May combine publicly available exploits in a targeted fashion. Modify open source supply chains. Concealing attacks of minimal concern
Sample attacks: Exploit known vulnerabilities to obtain sensitive data from systems for profit and intelligence or to deface websites; compromise open source projects to embed code to exfiltrate environment variables and secrets when code is run by users, with exported values used to gain system access and perform crypto mining

Actor: Insider (employee, external contractor, temporary worker)
Motivation: Personal gain to be had from exfiltrating and selling large amounts of personal data for fraud, or making small alterations to the integrity of data in order to bypass authentication for fraud; encrypt data volumes for ransom
Capability: Detailed knowledge of the system, understands how to exploit it, conceals actions
Sample attacks: Uses privileges to exfiltrate data (to sell on); misconfiguration/“codebombs” to take service down as retribution

Actor: Organised crime (syndicates, state-affiliated groups)
Motivation: Ransom, mass extraction of PII/credentials/PCI data, manipulation of transactions for financial gain; high level of motivation to access data sets or modify applications to facilitate large scale fraud; crypto-ransomware, e.g. encrypt data volumes and demand cash
Capability: Ability to devote considerable resources, hire “authors” to write tools and exploits required for their means. Some ability to bribe/coerce/intimidate individuals. Level of targeting varies. Conceals until goals are met
Sample attacks: Ransomware (becoming more targeted); RATs (in decline); coordinated attacks using multiple exploits, possibly using a single zero-day or assisted by a rogue individual to pivot through infrastructure (e.g. Carbanak)

Actor: Cloud service insider (employee, external contractor, temporary worker)
Motivation: Personal gain, curiosity; unknown level of motivation, access to data should be restricted by cloud provider’s segregation of duties and technical controls
Capability: Depends on segregation of duties and technical controls within cloud provider
Sample attacks: Access to or manipulation of datastores

Actor: Foreign Intelligence Services (FIS) (nation states)
Motivation: Intelligence gathering, disrupt Critical National Infrastructure, unknown; may steal intellectual property, access sensitive systems, mine personal data en masse, or track down specific individuals through location data held by the system
Capability: Disrupt or modify hardware/software supply chains. Ability to infiltrate organisations/suppliers, call upon research programs, develop multiple zero-days. Highly targeted. High levels of concealment
Sample attacks: Stuxnet (multiple zero days, infiltration of 3 organisations including 2 PKI infrastructures with offline root CAs); SUNBURST (targeted supply chain attack, infiltration of hundreds of organisations)
Threat actors can be a hybrid of different categories. Eugene Belford, for example, was an insider who used advanced organised crime methods.
Captain Hashjack is a motivated criminal adversary with extortion or robbery in mind. We don’t approve of their tactics - they don’t play fair, and are a cad and a bounder - so we shall do our utmost to thwart their unwelcome interventions.
The pirate crew have been scouting for any advantageous information they can find online, and have already performed reconnaissance against BCTL. Using OSINT techniques (Open Source Intelligence) like searching job postings and LinkedIn skills of current staff, they have identified technologies in use at the organisation. They know you use Kubernetes, and they can guess which version you started on.
To threat model a Kubernetes cluster, you start with an architecture view of the system as shown in Figure 1-3. Gathering as much information as possible keeps everybody aligned, but there’s a balance: ensure you don’t overwhelm people with too much information.
You can learn to threat model Kubernetes with ControlPlane’s O’Reilly course: Threat Modelling Kubernetes.
This initial diagram might show the entire system, or you may choose to scope only one small or important area such as a particular pod, nodes, or the control plane.
A threat model’s “scope” is its target: the parts of the system we’re currently most interested in.
Next, you zoom in on your scoped area. Model the data flows and trust boundaries between components in a data flow diagram like Figure 1-3. When deciding on trust boundaries, think about how Captain Hashjack might try to attack components.
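If you prefer to keep the data flow model in code alongside the diagram, the sketch below shows one way to record components, trust zones, and flows, and then list the flows that cross a trust boundary. The component names come from the BCTL application described earlier; the zone names and the individual flows are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    name: str
    trust_zone: str  # zone labels here are hypothetical


@dataclass(frozen=True)
class Flow:
    source: Component
    destination: Component
    data: str


def boundary_crossings(flows: List[Flow]) -> List[Flow]:
    """Return the flows whose endpoints sit in different trust zones."""
    return [f for f in flows if f.source.trust_zone != f.destination.trust_zone]


# Components of the booking service, placed in assumed trust zones.
client = Component("client browser", "internet")
load_balancer = Component("load balancer", "edge")
front_end = Component("front end", "cluster")
api = Component("API", "cluster")
batch = Component("batch processor", "cluster")
datastore = Component("SQL datastore", "data")

# Assumed data flows between the components.
flows = [
    Flow(client, load_balancer, "booking request"),
    Flow(load_balancer, front_end, "proxied HTTP"),
    Flow(front_end, api, "API call"),
    Flow(api, datastore, "SQL query"),
    Flow(batch, datastore, "bulk read/write"),
]

if __name__ == "__main__":
    for f in boundary_crossings(flows):
        print(f"{f.source.name} -> {f.destination.name}: {f.data}")
```

Each flow that crosses a boundary is a natural place to ask what Captain Hashjack could intercept, forge, or replay, and which control sits on that boundary.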
An exhaustive list of possibilities is better than a partial list of feasibilities
Adam Shostack, Threat Modelling
To generate possible threats you must internalise the attacker mindset: emulate their instincts and preempt their tactics. The humble data flow diagram (cf. Figure 1-4) is the defensive map of your silicon fortress, and it must be able to withstand Hashjack and their murky ilk.
Now that you have all the information, you brainstorm. Think of simplicity, deviousness, and cunning. Any conceivable attack is in scope, and you will judge the likelihood of the attack separately. Some people like to use scores and weighted numbers for this, others prefer to rationalise the attack paths instead.
Capture your thoughts in a spreadsheet, mindmap, a list, or however makes sense to you. There are no rules, only trying, learning, and iterating on your own version of the process. Try to categorise and make sure you can review your captured data easily. Once you’ve done the first pass, consider what you’ve missed and have a quick second pass.
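For teams that like scores, a captured list of threats can be ranked with a simple likelihood-times-impact calculation. This is a minimal sketch under stated assumptions: the 1 to 5 scales, the multiplication, and the three example threats with their scores are all illustrative, and your group may well prefer a different weighting or no numbers at all.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Threat:
    description: str
    category: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def risk(self) -> int:
        # A simple product; other weightings are equally valid.
        return self.likelihood * self.impact


def prioritise(threats: List[Threat]) -> List[Threat]:
    """Return the threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# Hypothetical threats and scores from a first brainstorming pass.
threats = [
    Threat("API server exposed to the internet", "Attacker on the Network", 3, 5),
    Threat("Remote code execution in the booking API", "Compromised application", 4, 4),
    Threat("Crypto mining on cluster nodes", "Denial of Wallet", 4, 2),
]

if __name__ == "__main__":
    for t in prioritise(threats):
        print(f"{t.risk:>2}  {t.category}: {t.description}")
```

Keeping the category next to each threat makes the second pass easier: you can see at a glance which categories have no entries yet.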
You’ve now generated your initial threats - good job! Now it’s time to plot them on a graph so they’re easier to understand. This is the job of an Attack Tree: the pirate’s treasure map.
An attack tree, such as the one in Figure 1-5, shows potential infiltration vectors. Here we model how to take down the Kubernetes control plane.
Attack trees can be complex and span multiple pages, so you can start small like this branch of reduced scope.
This attack tree focuses on Denial of Service (DOS), which prevents (“denies”) access to the system (“service”). The attacker’s goal is at the top of the diagram, and the routes available to them start at the root (bottom) of the tree. The key on the left shows the shapes used for logical “OR” and “AND” nodes, which must be fulfilled to build up to the top of the tree: the negative outcome. Confusingly, attack trees can be bottom-up or top-down: in this book we exclusively use bottom-up. We walk through attack trees later in this chapter.
A YAML deserialisation “billion laughs” attack, CVE-2019-11253, affected Kubernetes up to v1.16.1 by attacking the API server. It’s not covered on this attack tree as it’s patched, but adding historical attacks to your attack trees is a useful way to acknowledge their threat if you think there’s a high chance they’ll recur in your system.
As we progress through the book, we’ll use these techniques to identify high-risk areas of Kubernetes and consider the impact of successful attacks.
Now you know who you are defending against, you can enumerate some high-level threats against the system and start to check if your security configuration is suitable to defend against them.
We define a scope for each threat model. Here, you are threat modelling a pod.
Threat modelling should be performed with as many stakeholders as possible (development, operations, QA, product, business stakeholders, security) to ensure diversity of thought.
You should try to build the first version of a threat model without outside influence to allow fluid discussion and organic idea generation. Then you can pull in external sources to cross-check the group’s thinking.
Let’s consider a simple group of Kubernetes threats to begin with:
Attacker on the Network: sensitive endpoints (such as the API server) can be attacked easily if public.
Compromised application leads to foothold in container: a compromised application (remote code execution, supply chain compromise) is the start of an attack.
Establish Persistence: stealing credentials or gaining persistence resilient to pod, node, and/or container restarts.
Malicious Code Execution: running exploits to pivot or escalate and enumerating endpoints.
Access Sensitive Data: reading secret data from the API server, attached storage, and network-accessible datastores.
Denial Of Service: rarely a good use of an attacker’s time. Denial of Wallet and Crypto Locking are common variants.
It’s also useful to draw Attack Trees to conceptualise how the system may be attacked and make the controls easier to reason about. Fortunately, our initial threat model contains some useful examples.
These diagrams use a simple legend, described in Figure 1-6.
The “Goal” is the attacker’s objective: the outcome that the attack tree helps us understand how to prevent.
The logical “AND” and “OR” gates define which of the child nodes need completing to progress through them.
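The gate logic can be sketched in a few lines of code. In this minimal example an “AND” node needs every child to be completed and an “OR” node needs at least one; the tree itself, with its goal and leaf steps, is a hypothetical branch invented for illustration and is not a transcription of any figure in this chapter.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    label: str
    gate: str = "LEAF"  # one of "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)


def achievable(node: Node, capabilities: Set[str]) -> bool:
    """Return True if the attacker can complete this node with the given leaf steps."""
    if node.gate == "LEAF":
        return node.label in capabilities
    results = [achievable(child, capabilities) for child in node.children]
    if node.gate == "AND":
        return all(results)  # every child must be completed
    if node.gate == "OR":
        return any(results)  # a single completed child is enough
    raise ValueError(f"unknown gate: {node.gate}")


# Hypothetical attack tree: the goal sits at the top, leaf steps at the bottom.
goal = Node("Deny access to the control plane", "OR", [
    Node("Exhaust API server resources", "AND", [
        Node("Reach API server endpoint"),
        Node("Send resource-exhausting requests"),
    ]),
    Node("Delete control plane nodes", "AND", [
        Node("Steal cloud credentials"),
        Node("Call cloud provider API"),
    ]),
])

if __name__ == "__main__":
    print(achievable(goal, {"Reach API server endpoint"}))
    print(achievable(goal, {"Steal cloud credentials", "Call cloud provider API"}))
```

Evaluating the tree against different sets of attacker capabilities shows which controls break the most paths: removing one leaf under an “AND” gate blocks that whole branch, while an “OR” gate needs every child blocked.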
In Figure 1-7 you see an attack tree starting with a threat actor’s remote code execution in a container.
You now know what you want to protect against and have some simple attack trees, so you can quantify the controls you want to use.
At this point, your team has generated a list of threats. We can now cross-reference them against some commonly-used threat modelling techniques and attack data:
STRIDE (framework to enumerate possible threats)
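One way to cross-reference is to tag each of your threat groups with the STRIDE categories it touches and then look for categories nobody mentioned. The six STRIDE categories are standard; the mapping of this chapter's threat groups onto them below is an assumption for illustration, and your stakeholders may reasonably tag them differently.

```python
from enum import Enum
from typing import Dict, List, Set


class Stride(Enum):
    SPOOFING = "Spoofing"
    TAMPERING = "Tampering"
    REPUDIATION = "Repudiation"
    INFORMATION_DISCLOSURE = "Information disclosure"
    DENIAL_OF_SERVICE = "Denial of service"
    ELEVATION_OF_PRIVILEGE = "Elevation of privilege"


# Assumed mapping from the chapter's threat groups to STRIDE categories.
threat_groups: Dict[str, Set[Stride]] = {
    "Attacker on the Network": {Stride.SPOOFING, Stride.INFORMATION_DISCLOSURE},
    "Compromised application leads to foothold in container": {
        Stride.TAMPERING,
        Stride.ELEVATION_OF_PRIVILEGE,
    },
    "Establish Persistence": {Stride.SPOOFING, Stride.TAMPERING},
    "Malicious Code Execution": {Stride.ELEVATION_OF_PRIVILEGE},
    "Access Sensitive Data": {Stride.INFORMATION_DISCLOSURE},
    "Denial Of Service": {Stride.DENIAL_OF_SERVICE},
}


def uncovered(mapping: Dict[str, Set[Stride]]) -> List[Stride]:
    """Return the STRIDE categories that no threat group has been tagged with."""
    covered = set().union(*mapping.values())
    return [category for category in Stride if category not in covered]


if __name__ == "__main__":
    for category in uncovered(threat_groups):
        print(f"No threats recorded for: {category.value}")
```

With this particular mapping the only gap is Repudiation, which would prompt the group to ask whether an attacker's actions in the cluster would be logged and attributable.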
This is also a good time to draw on pre-existing, generalised threat models that may exist:
Trail of Bits and Atredis Partners Kubernetes Threat Model for the Kubernetes Security Audit Working Group (now SIG-security) and associated security findings, examining the Kubernetes codebase and how to attack the orchestrator
ControlPlane’s Kubernetes Threat Model and Attack Trees for the CNCF Financial Services User Group, considering a user’s usage and hardened configuration of Kubernetes
NCC’s Threat Model and Controls looking at system configuration
No threat model is ever complete. It is a point-in-time best effort from your stakeholders and should be regularly revised and updated, as the architecture, software, and external threats will continually change.
Software is never finished. You can’t just stop working on it. It is part of an ecosystem that is moving.
Now you are equipped with the basics: you know Captain Hashjack, your adversary. You understand what a threat model is, why it’s essential, and how to build a 360-degree view of your system. In this chapter we also discussed threat actors and attack trees and walked through a concrete example. With a model in mind, we’ll explore each of the main Kubernetes areas of interest. Let’s jump into the deep end: we start with the pod.