Chapter 6

Security in embedded systems*

J. Rosenberg    Draper Laboratory, Cambridge, MA, United States

Abstract

This chapter is about the security of embedded devices. Throughout this book, the effects of the harsh environments in which we require many of our embedded devices to operate have been discussed in some detail. But even an embedded processor in a clean, warm, dry, stationary, and even physically safe situation is actually in an extremely harsh environment. That is because every processor everywhere can be subject to the harshness of cyber attacks. We know that even “air-gapped” systems can be attacked by a determined attacker, as the famous Stuxnet attack demonstrated. This chapter discusses the nature of this type of harsh environment, what enables the cyber attacks we hear about every day, and the principles we need to understand to work toward a much higher level of security, and it presents new developments that may change the game in our favor.

Keywords

Embedded systems; Security; Cyber attacks; Harsh environment; Internet of things

1 Not Covered in This Chapter

This chapter is about the logical and network security (or lack thereof) of embedded devices. The physical security of the environs of the embedded device is not touched on here. In many situations, such as when embedded devices are part of an enterprise—used in manufacturing, as part of some sort of infrastructure, or as part of financial operations (e.g., point of sale (POS) systems)—or when used in a contained military vehicle like a submarine or bomber, the physical security of the environment takes care of the embedded device. Where an embedded device is just one of many in the Internet of things (IoT), physical security of any one device may not be vital, but the security of the whole array of devices is tied to the network that connects them; network security for embedded devices is discussed in this chapter. While the concept of denial of service (DoS) will be covered here, it is difficult if not impossible for an embedded device to locally defend itself against DoS attacks. Finally, where physical security of the embedded device is of critical importance but the device cannot be part of any larger organizational security infrastructure, an antitamper approach must be taken. Antitamper technology equips the device with special hardware and software that ensures, first, that the device is nonoperational when not in the hands of those authorized to possess it, and second, that it refuses to give up any secrets it contains if unauthorized parties in possession try to tamper with it in an attempt to steal those secrets. Perhaps even more important, if a device is considered inherently secure, antitamper is necessary to maintain that designation. We only briefly cover antitamper, as it is a large topic unto itself, but we do discuss in this chapter how an embedded device can become inherently secure.

2 Motivation

2.1 What Is Security?

In the English language, security is defined as the state of being free from danger or threat.

Consequently, we lock our cars. We like to walk along well-lit streets. Many of us have alarm systems on our houses. Thanks to important banking laws enacted after the Depression, we trust the government to back up losses at our bank. We use credit cards knowing that if they are stolen we are not responsible for expenditures on them. We are (mostly) free from danger or threat in our physical being and in our financial dealings.

But what happens when your house or car is broken into and your things are stolen, or you hear footsteps behind you on a dark street, or your bank is hacked and your identity is stolen? Understandably you have a visceral reaction. It feels like you have been personally violated.

In the early days, computer hacking was simply vandalism, usually designed to disable a computer. The first significant computer worm was the Morris worm of Nov. 2, 1988. It was the first worm to be distributed via the Internet and the first to gain significant mainstream media attention. It also resulted in the first conviction in the United States under the 1986 Computer Fraud and Abuse Act. It was written by a graduate student at Cornell University, Robert Morris, and launched from MIT (where he is now a full professor).

But as the stakes got higher and higher and people who wanted things that were not theirs got more sophisticated in their programming skills, much bigger targets were taken on including big corporations, public infrastructure, and the military.

We are talking about security of embedded systems in this chapter so let’s begin with the fundamental principles on which computer security is based.

2.2 Fundamental Principles

Computer security is defined in terms of what is called the CIA triad: Confidentiality, Integrity, and Availability [1]. Let’s take each concept in turn and list out the traits of that principle.

2.2.1 Confidentiality

 Assurance that information is not disclosed to unauthorized individuals, programs, or processes.

 Ensures that the necessary level of secrecy is enforced at each junction of data processing and prevents unauthorized disclosure.

 Some information is more sensitive than other information and requires a higher level of confidentiality.

 The military and intelligence organizations classify information according to multilevel security designations such as Secret, Top Secret, and so forth that require protection from disclosure.

 Individuals have personally identifiable information such as social security numbers that also require protection from disclosure to unauthorized individuals.

 Attacks against confidentiality perpetrated by individuals tend to be about personal gain, so they go after credit card numbers that they use themselves or sell on the black market.

 On the other hand, Nation States may be after corporate confidential information like plans for a new fighter jet or national or corporate secrets.

 Individual, corporate, or government information may be the target of insiders, who have a variety of motives for accessing information they are not authorized to see.

2.2.2 Integrity

 Assures that the accuracy and reliability of information and systems are maintained and any unauthorized modification is prevented.

 Requires that data and resources are not altered in an unauthorized fashion.

 Modification of data on disk, or in transit would violate integrity.

 Modification of any programs, computer systems, and network connections would also violate integrity.

 Vandals may try to change or destroy data just to create confusion or fear.

 Nation-states may try to change data to effect outcomes beneficial to them. The military would fear an attack that changes battlefield reconnaissance information or command directives. These would be integrity attacks.

2.2.3 Availability

 Ensures reliability and timely access to data and resources to authorized individuals.

 Information systems to be useful at all must be available for use in a timely manner so productivity is not affected.

 A system that loses connectivity to its database would become useless to most users.

 A system that runs so slowly that users cannot get their work done becomes useless as well.

 Denial of service attacks have a wide variety of forms but all tend to keep some aspect of a system so busy responding to the DoS traffic that it no longer functions for its legitimate users.

Computer security is necessary because there are threats. The best way to think about what types of security are needed is to establish a model for those threats. This will get us into some basic vocabulary that is used to describe threats.

2.3 Threat Model

Key vocabulary used in security discussions includes “vulnerability,” “threat,” “risk,” and “exposure.” These terms are often used interchangeably, which would create a lot of confusion if they really were the same thing, which they are not. We will define them here and, more importantly, we will define a threat model in terms of how these concepts interact with each other.

2.3.1 Vulnerability

A vulnerability is a software, hardware, procedural, or human weakness that provides a hacker an open door to a network or computer system to gain unauthorized access to resources in the environment. Weak passwords, lax physical security, unpatched applications, an open port on a firewall, or bugs in software enable an attacker to leverage that flaw to gain access to the computer the software is running on. The Heartbleed bug was a vulnerability present on 66% of servers on the Internet [2] (not counting email, chat, or VPN servers, or any embedded devices), created by a programmer’s failure to verify the bounds of a buffer, which allowed nearly unlimited external reads of a server’s internal memory. For years, the vast majority of home wireless access points were shipped with the administration password of “password,” and most users never changed it, creating a huge vulnerability.
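To make the pattern concrete, here is a minimal, hypothetical C sketch of the kind of missing bounds check behind Heartbleed; the structure and function names are invented for illustration and are not OpenSSL’s actual code. The peer claims a payload length, and the vulnerable path trusts that claim.

#include <stdint.h>
#include <string.h>

/* Hypothetical heartbeat-style request: the peer sends some payload bytes
 * and separately claims how long that payload is. */
struct echo_req {
    uint16_t      claimed_len;   /* length the peer *says* it sent        */
    const uint8_t *payload;      /* the bytes actually received           */
    size_t        received_len;  /* how many payload bytes really arrived */
};

/* VULNERABLE: trusts claimed_len. If the peer claims more than it sent,
 * memcpy reads past the received data and leaks adjacent memory back in
 * the reply. (out is assumed large enough in both versions.) */
size_t build_reply_unsafe(uint8_t *out, const struct echo_req *req)
{
    memcpy(out, req->payload, req->claimed_len);
    return req->claimed_len;
}

/* SAFER: the bounds check that was missing; never echo more bytes than
 * were actually received. */
size_t build_reply_safe(uint8_t *out, const struct echo_req *req)
{
    size_t n = req->claimed_len;
    if (n > req->received_len)
        n = req->received_len;   /* clamp to the data we really have */
    memcpy(out, req->payload, n);
    return n;
}

The single comparison in the safe version is the entire difference between a harmless echo service and a vulnerability that leaks server memory to anyone who asks.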

2.3.2 Threat

A threat is a potential danger to information or systems. The danger is that someone or something will identify a specific vulnerability and use it against the company or individual. The entity that takes advantage of a vulnerability is a threat agent. A threat agent might be a hacker coming into a network from across the Internet through an open port on the firewall, or a process accessing data in a way that violates security policy. An employee making a mistake that exposes confidential information is an (inadvertent) threat agent. So is the insider who, for their own reasons, wants to cause harm or steal information and takes advantage of their insider access privileges.

2.3.3 Risk

Risk is the likelihood that a threat agent will be able to take advantage of a vulnerability, together with the operational impact if it does. Risk ties the vulnerability, the threat, and the likelihood of exploitation to the resulting operational loss. If firewall ports are left open, if users are not educated on proper procedures, or if an intrusion detection system is not installed or configured correctly, risk goes up. If a known vulnerability is not addressed, risk goes up because attackers pay attention to discovered vulnerabilities, including those they did not discover themselves.

2.3.4 Asset

Assets can be physical, informational, monetary, or reputational. Physical assets include computers, network equipment, and attached peripherals. Information assets are things such as customer data, proprietary information, and secret information. Monetary assets include the fines or direct expenses from a breach, or the loss of stock value in the stock market. And reputation is what is lost through the poor perception of a company that has suffered a high-visibility breach.

2.3.5 Exposure

Exposure is an instance of being subjected to losses or asset damage from a threat agent. A vulnerability exposes an organization to possible damages. Just as failure to have working fire sprinklers exposes an organization not only to fire danger but also losses of physical assets and potential liabilities, poor password management exposes an organization to password capture by threat agents who would then gain unauthorized access to systems and information.

2.3.6 Safeguard

A safeguard (or countermeasure) mitigates potential risk. It could be software, hardware, a configuration, or a procedure that eliminates a vulnerability or reduces the likelihood that a threat agent will be able to exploit a vulnerability.

The relationships between all of these concepts form a threat model: A threat agent creates a threat, which exploits a vulnerability that leads to a risk that can damage an asset and causes an exposure, which can be remedied through a safeguard or countermeasure.

If no one and nothing had access to our computers and networks we would be done with this discussion and this would be a very short chapter. Threats exist because there is access (usually through a network the system is connected to, but even when a system is air-gapped), so the next thing we need to include when thinking about what security is has to be access control. Readers who know of embedded systems that are not connected to any network, or that sit on a network air-gapped from the networks where threat agents are active, and who therefore believe those devices are safe from threat, need only remember how Stuxnet was able to destroy 2000 uranium-processing centrifuges that were air-gapped from the Internet.

2.4 Access Control

Fundamental to security is controlling how resources are accessed. A subject is an active entity that requests interaction with an object. An object is a passive entity that contains information. Access is the flow of information between a subject and an object. Access controls are security features that control how subjects and objects communicate and interact with each other.

2.4.1 Identification

All access control, and all of security for that matter, hinges on making sure with high confidence we know who is requesting access to a resource. If everyone could claim to be Barack Obama and gain access as such, there would be little hope of much security in our systems. Identification is the method for ensuring that a subject is the entity it claims to be. Once an authoritative body has accepted that the entity is who they say they are, that information is mapped to something that can be used to claim that identity, such as a login user name or account number.

2.4.2 Authentication

A weakness in most systems is that identification is quite easy to spoof. Authentication provides a stronger defense against what are called social engineering attacks by requiring a second piece of the credential beyond identification, such as a password, passphrase, cryptographic key, personal identification number, anatomical attribute (biometric), token, or answers to a set of shared secrets.

2.4.3 Authorization

Authorization asks the question: Does the subject have the necessary rights and privileges to carry out the requested actions? If so, the subject is authorized to proceed.

2.4.4 Accountability

Keeping track by identity of what each subject did in the system is called accountability. It is used for forensics, to detect attacks, and to support auditing.
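The following compressed sketch shows how these four steps compose in code. The tiny user table, the plaintext “secrets,” and the permission bits are hypothetical placeholders for illustration only; a real system would verify hashed credentials or tokens, not plaintext strings.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical subject record: an identity plus the permissions granted to
 * it. A real system would store a salted password hash or verify a token,
 * never a plaintext secret as shown here. */
struct subject { const char *name; const char *secret; unsigned perms; };

#define PERM_READ  0x1u
#define PERM_WRITE 0x2u

static const struct subject users[] = {        /* identification database */
    { "operator", "correct horse",  PERM_READ },
    { "admin",    "battery staple", PERM_READ | PERM_WRITE },
};

/* Accountability: every access decision is written to an audit log. */
static void audit(const char *who, const char *object, bool allowed)
{
    printf("%ld %s %s %s\n", (long)time(NULL), who, object,
           allowed ? "ALLOW" : "DENY");
}

/* One gatekeeper performing identification, authentication, and
 * authorization, with accountability on every path. */
bool access_object(const char *claimed_id, const char *proof,
                   unsigned needed_perm, const char *object)
{
    for (size_t i = 0; i < sizeof users / sizeof users[0]; i++) {
        if (strcmp(users[i].name, claimed_id) != 0)
            continue;                                        /* identification  */
        bool allowed = strcmp(users[i].secret, proof) == 0   /* authentication */
                    && (users[i].perms & needed_perm) != 0;  /* authorization  */
        audit(claimed_id, object, allowed);                  /* accountability */
        return allowed;
    }
    audit(claimed_id, object, false);    /* unknown subject: deny and record */
    return false;
}

int main(void)
{
    access_object("operator", "correct horse",  PERM_WRITE, "pump_setpoint"); /* denied  */
    access_object("admin",    "battery staple", PERM_WRITE, "pump_setpoint"); /* allowed */
    return 0;
}

The point of the sketch is only the ordering: a subject is identified, authenticated, authorized against the specific object and operation, and every decision, allowed or denied, is recorded.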

Clearly, different organizations have different needs when it comes to security. The national intelligence and defense organizations need to protect confidential, secret, and top secret information in separate buckets and people without sufficient clearance must never be allowed to see higher level information. A bank must not allow customer social security numbers to get to unauthorized individuals. These are examples of security policies.

2.5 Security Policy

A security policy is an overall statement of intent that dictates what role security plays within the organization. Security policies can be organizational policies, issue-specific policies, or system-specific policies, or a combination of all of these.

Security policies might:

 identify assets the organization considers valuable;

 state company goals and objectives pertaining to security;

 outline personal responsibility of individuals within the organization;

 define the scope and function of a security team;

 outline the organization’s planned response to an incident including public relations, customer relations, or government relations; and

 outline the company’s response to legal, regulatory, and standards of due care.

2.6 Why Cyber?

Cyber is short for cybernetics. It is used as an adjective meaning: of, relating to, or characteristic of the culture of computers, information technology, and virtual reality. That doesn’t really explain why it came to be used so heavily as a simple substitution for “information technology” or “computers” or even “Internet.” The best answer might just be that it sounds cool, and the media tend to drive popular terminology. Whatever the reason, our subject is now described by the terms cyber-security, cyber-attacks, and cyber-threats.

2.7 Why is Security Important?

Security has become vitally important because of the critical nature of what our computers and networks do now compared to two or even one generation ago. For instance, they run the electric grid. On Aug. 14, 2003, shortly after 2 p.m. Eastern Daylight Time, a high-voltage power line in northern Ohio brushed against some overgrown trees and shut down—a fault, as it’s known in the power industry. The line had softened under the heat of the high current coursing through it. Normally, the problem would have tripped an alarm in the control room of FirstEnergy Corporation, an Ohio-based utility company, but the alarm system failed due to a software bug—a vulnerability. (Here the threat agent was Mother Nature, not a hacker.) All told, 50 million people lost power for up to 2 days in the biggest blackout in North American history. The event contributed to at least 11 deaths and cost an estimated $6 billion. The entire power grid is vulnerable to determined hackers, and with it the country to a nationwide blackout; according to the Wall Street Journal, quoting US intelligence sources, nation-states have already penetrated the US power grid [3].

Similarly, computers control our water supply. EPA advisories have outlined the risk of cyber attack and its consequences. Our drinking water and wastewater utilities depend heavily on computer networks and automated control systems to operate and monitor processes such as treatment, testing, and movement of water. These industrial control systems (ICSs) have improved drinking water and wastewater services and increased their reliability. However, this reliance on ICSs, such as supervisory control and data acquisition (SCADA) embedded devices, has left the Water Sector, like other interdependent critical infrastructures including energy, transportation, and food and agriculture, vulnerable to targeted cyber attacks or accidental cyber events. A cyber attack causing an interruption to drinking water and wastewater services could erode public confidence, or worse, produce significant panic or public health and economic consequences. In factories producing chemicals, processing food, and manufacturing products, every valve, pump, and motor has the same sort of ICS processor control and monitoring embedded system, each one of which is highly susceptible to cyber attack.

Transportation systems, including air traffic control (ATC), airplanes themselves, trains, ships, and even our cars, contain dozens of processors, many of them in embedded systems with little or no physical protection from cyber attack. Disruption of our air, rail, sea, or road transportation would have a significant impact on our economy and society. Transportation represents 12% of GDP, it is highly visible, and most citizens are dependent on it. If GPS alone went down, it would shut down most of the systems above. Just observe what happens when one stoplight is out and multiply that by tens of thousands for road transportation alone.

The stock market and commodity trading systems use thousands of embedded systems to function. A major disruption to the stock market stops the flow of investment capital and thus could shut down the economy.

Besides these major systems, other segments of the economy that are heavily dependent on highly vulnerable embedded systems include financial systems, with POS, e-commerce, and inventory. Twenty percent of the economy is health-related, including hospitals, medical devices, and even insurance companies. While not life-critical, the entertainment segment, spanning movies, TV, cable systems, satellite TV, video games, music, audio, and e-books, is big business and an important aspect of life.

Increasingly, our homes and all the devices in them such as thermostats, washers, dryers, stoves, microwaves, and even newer toasters are all controlled by embedded systems; they are all part of the trend toward an IoT.

Perhaps the heaviest user of embedded systems to date is our military. Military systems, from satellites to weapons systems to command & control (C&C), are heavily dependent on embedded processors. Recent hacks against Central Command, the Office of Personnel Management (not an embedded device attack), and hundreds that went unreported point to the danger.

And of course, the Internet and World Wide Web themselves are run by millions of embedded devices. Designed to withstand a direct nuclear hit, the Internet is indeed very resilient, but cyber attacks were never envisioned or planned for, and no mechanisms to help defend against them were ever designed in. In fact, the entire Web is designed in ways that help attackers hide their tracks, dynamically change IP addresses, and anonymize their location and attack paths.

Let’s talk some more about the IoT because this is the future of embedded devices and represents a huge potential source of vulnerabilities. What is the IoT and what is the vision? Currently there are 9 billion interconnected devices, which is expected to reach 24 billion devices by 2020. This next major expansion of our computing base will be outside the realm of the traditional desktop and will sit squarely in the embedded device space. In the IoT paradigm, many of the objects that surround us will be on the network in one form or another. Radio frequency identification (RFID) and sensor network technologies will represent a major portion of the sensors that will drive the IoT expansion. Many new types of sensors will be part of the IoT including smart healthcare sensors, smart antennas, enhanced RFID sensors, RFID applied to retail, smart grid, and household metering sensors, smart traffic sensors, and many others. IoT will have major implications for privacy concerns and impressive new levels of big data storage, management, analytics and, yes, security. IoT will be such a large market opportunity that it is the latest “gold rush” and companies are rushing new products to market as fast as possible with very little attention being paid to innate security. The hackers must be rubbing their hands in anticipation of their coming opportunities.

2.8 Why Are Cyber Attacks so Prevalent and Growing?

In 2014, companies reported 42.8 million detected attacks worldwide (the number unreported is probably at least as large), a 48% year-over-year increase [4]. Some nation-states that are sometimes considered “adversaries” of the US have acknowledged that they have large teams of cyber warriors [5]. But many factors have contributed to the exponential growth of cyber attacks, and we identify a few of them here. The rapid growth and severity of incidents is stunning: over 220,000 new malware variants appear every day according to AV-TEST, and the number of new malware programs has grown over the last 10 years to exceed 140 million per year.

Figure: New malware programs per year. Source: AV-TEST, https://www.av-test.org.

2.8.1 Mistakes in software

Mistakes are a big part of how cyber attackers ply their trade. A programmer makes a mistake, that mistake becomes a bug in an application, and cyber attackers exploit the bug as a cyber vulnerability. Many mistakes in the software in SCADA controllers, and in the Windows-based systems commonly used to program them, were leveraged to destroy 2000 centrifuges in the Stuxnet attack. Similarly, it was mistakes in various aspects of the systems at Sony, Target, Home Depot, and hundreds of others that allowed those famous cyber attacks to proceed. Stuxnet, however, is in a new class of dangerous attacks designed to destroy physical assets from across the Internet. This category also includes the Shamoon attack on Saudi Aramco, which destroyed 30,000 desktop computers, and a late-2014 attack in which hackers struck an unnamed steel mill in Europe, manipulating and disrupting control systems to such a degree that a blast furnace could not be properly shut down, resulting in “massive,” though unspecified, damage.

According to Lloyd’s of London, businesses lose $400B per year to cyber attacks, and because of potential lawsuits and liability issues much cybercrime goes unreported, so the true figure is presumably much higher. According to the US Department of Homeland Security, about 90% of security breaches originate from defects in software. The embedded processors in mobile devices, and the applications built for them, are becoming the most common vector of attack on companies. These mobile apps typically have poor security because developers rush them to market before properly implementing security protocols. More is said about software bugs and vulnerability to cyber attack later in this chapter.

2.8.2 Opportunity scale created by the Internet

The sheer size of the Internet, the existence of malcontents, determined nation-state adversaries, and the financial gains now possible from cyber attacks, are all reasons for growth of cyber crime.

The size of the Internet as measured by domains is 271 million [6], and as measured by users is 3.17 billion.1 There has been an exponential explosion of Internet-connected smart phones: in 2015 there were 2 billion smart phone users worldwide, and that is expected to grow to over 6 billion by 2020.1 GE estimates the “Industrial Internet” has the potential to add $10 to $15 trillion2 to global GDP over the next 20 years. Cisco states that its forecast for the economic value of the IoT is $19 trillion3 in the year 2020.

2.8.3 Changing nature of the adversaries

The first cyber-criminals were purely vandals; hacking into a site and scribbling on its home page is this generation’s graffiti. There is also an aspect of haves vs. have-nots on the Internet, where the have-nots can be across borders and oceans and still attack the rich people in the rich countries. Thus phase two of cyber-crime became identity theft and the wholesale theft of credit cards, which are resold, as in the Target POS credit card theft. This phase also included a rash of cyber-ransom attacks against individuals and small businesses. Meanwhile, the actions of nation-states are increasing steadily, where the aim is to steal secrets and to obtain an advantageous posture for a future action in which cyber-war is part of an overall strategy. These nation-states steal the identities of government employees and contractors in order to blackmail individuals into turning over secrets, and they are behind Flame, an attack against the embedded devices in personal computers (e.g., keyboard, microphone, camera) whose mission is espionage and which is considered by many to be the most complex malware ever found. The US’s main adversaries have penetrated critical infrastructure and, it is believed, have left behind mechanisms allowing them to take action against that infrastructure when it suits their overall aims.

2.8.4 Financial gain opportunities

The Target attack, enabled by the HVAC-contractor vulnerability, ultimately compromised the POS systems where consumers' credit cards are swiped. The foreign organized-crime hackers who perpetrated the attack were highly enriched by their ability to resell those credit cards on the black market [7]. The stolen cards were offered at “card shops” starting at $20 each and ranging up to more than $100. Between 1 and 3 million of those credit cards were ultimately sold on the black market, raising an estimated $53.7 million for the hackers.4 The attack cost Target $148 million and cost credit card issuing institutions $200 million. The CEO lost his job, and company profits fell 46% in the quarter after the breach.

2.8.5 Ransomware

Ransomware is malware that locks your keyboard or computer to prevent you from accessing your data until you pay a ransom, usually demanded in Bitcoin. The digital extortion racket is not new; it has been around since about 2005. But attackers have greatly improved on the scheme with the development of ransom cryptoware, which encrypts your files so that only the attacker holds the key needed to decrypt them, instead of simply locking your keyboard or computer.

It is not the case that ransomware just affects desktop machines or laptops; it also targets mobile phones and if it became more lucrative, the IoT would be next (think: control of your Nest thermostat during very cold weather).

Symantec gained access to a C&C server used by the CryptoDefense malware and got a glimpse of the hackers’ haul based on transactions for two Bitcoin addresses the attackers used to receive ransoms. Out of 5700 computers infected with the malware in a single day, about 3% of victims appeared to shell out for the ransom. At an average of $200 per victim, Symantec estimated that the attackers hauled in at least $34,000 that day. Extrapolating from this, they would have earned more than $394,000 in a month. And this was based on data from just one command server and two Bitcoin addresses; the attackers were likely using multiple servers and Bitcoin addresses for their operation.

Conservatively, at least $5 million is extorted from ransomware victims each year. But forking over funds to pay the ransom doesn’t guarantee attackers will be true to their word and victims will be able to access their data again. In many cases, Symantec reports, this doesn’t occur.

2.8.6 Industrial espionage

Worldwide, around 50,000 companies a day are thought to come under cyber attack, with the rate estimated to be doubling each year. One of the means perpetrators use to conduct industrial espionage is exploiting vulnerabilities in computer software. Malware and spyware used as tools for industrial espionage are designed to transmit digital copies of trade secrets, customer data, future plans, and contacts. Newer forms of malware surreptitiously switch on mobile phones’ cameras and microphones and, in some cases, monitor every keystroke at the keyboard.

Operation Aurora was a series of cyber attacks conducted by advanced persistent threats such as the Elderwood Group based in Asia [8]. The attack was aimed at dozens of organizations, including Adobe Systems, Juniper Networks, Rackspace, Yahoo, Symantec, Northrop Grumman, Morgan Stanley, and Dow Chemical. The primary goal of the attack5 was to gain access to, and potentially modify, source code repositories at these high-tech, security, and defense contractor companies. The source code management systems were found to be wide open, even though they held the crown jewels of these companies, in many ways more valuable than any financial or personally identifiable data that these companies may have and spend so much time and effort protecting.

2.8.7 Transformation into cyber warfare

Since 2010, when the cyberweapon Stuxnet was finally understood, and the damage a cyber attack could inflict was fully absorbed, it has become clear that cyberwarfare is possible. At the very least, even if not used as a weapon directly, cyber attacks are now part of any adversarial nation’s foreign policy. The target most frequently cited is our critical infrastructure, which in general is heavily dependent on embedded systems such as programmable logic controllers (PLCs) and SCADA controllers. This is exactly what Stuxnet targeted, and this is why this chapter is so important.

2.9 Why Isn’t Our Security Approach Working?

Why hasn’t what we have been doing worked to stop, or even slow, the relentless cyber attacks? We have really smart people; the inventiveness of our cyber security people is legend. We have created firewalls, intrusion detection systems, anomaly sleuthing, virus scanners, schemes to make an application morph itself into a different application to fool attackers, and every possible piece of software protection these brilliant people can invent, and still the frequency and seriousness of attacks go up year after year.

We are living in a time of inherent insecurity, and nothing seems to be working to slow things down, much less fix the problem. One reason even experts seem to be throwing up their hands is the assumption that our computer systems are so complex that it is impossible to design them without vulnerabilities to cyber attacks.

Therefore, the best we can do, it seems, is to: build virtual walls around our networks using firewalls and intrusion detection systems, then constantly run virus scanning tools on every computer on those networks to look for known attack signatures, patch our operating systems, applications, firmware, and even those security systems (they have bugs too) once a week or so, and then what, pray?

Perimeters are known to be very porous. In fact, hackers like to joke that enterprise networks are like some kind of candy: crunchy on the outside but soft and chewy on the inside. If attackers penetrate the perimeter, they are in, and once in they can pretty much move through the network and operate at will. In fact, one of the problems has become the perimeter systems themselves. They tend to run at a very high privilege level, and because they are very large they have a corresponding number of bugs, and those bugs are entry points for hackers. Patching is a particularly ineffective strategy for cyber-physical and other embedded systems because these devices can be much harder to reach, they tend to be long-lived and specialized, some are not on a network at all, and patching substantially increases the attack surface of the environment, with more patches to deliver and more opportunities for failure to protect.

It is important to remember that signature scanning and patching are only done after an attack has been identified and isolated. This is like waiting to lock your doors until after your neighbors have had a theft. What is most damning about patching’s effectiveness is that common vulnerabilities and exposures (CVEs) [9] found between 1999 and 2011 represented 75% of the exploits reported in 2014. That is as much as 15 years between a vulnerability being reported and its still not being patched [10].

2.9.1 Asymmetrical

A DARPA review of 9000 distinct pieces of malware in 2010 found that the average size of each independent attack component was only 125 lines of code. Meanwhile, the defensive systems enterprises use to protect their systems, including intrusion detection, continue to grow, and some of these systems have reached 10 million lines of code. That is so complex (and these systems by nature must have privileged access to the systems they protect) that they have become a desirable target for attack. Like all software, the larger a system is, the more bugs it has. Steve Maguire, in his book Writing Solid Code, found that across all deployed software, regardless of application domain or programming language, one finds between 15 and 50 bugs per thousand lines of code (KLOC). Rough estimates suggest that 10% of all bugs are potential cyber vulnerabilities. Given this, a 10-million-line defensive system has 150,000 bugs at best and potentially 15,000 security vulnerabilities that could give an attacker entry into the network and all the systems the defensive system was designed to protect.


Size of malware compared to the increasing complexity of defensive software.

Malware over almost a 25-year span has remained at about 125 lines of code, measured over 9000 samples. But in that same period, defensive systems have grown increasingly complex, to the point where they exceed 10 million lines of code. Since studies have shown that there are consistently about 15 bugs per KLOC, this means the threat and our defenses are diverging. Figure credit: DARPA.

2.9.2 Architectural flaws

The crux of the matter is that we are losing the cyber war mostly because the architecture of our computer processors has no support for cybersecurity, so it is practically child’s play to attack them and have them do the bad guys’ bidding. No matter what virus protection, firewalls, or number of patches we apply, software is written by people, and people cannot write perfect software, so vulnerabilities will always exist in any reasonably complex software. Bad guys are smart, they are patient, and they can win with very simple attack software.

Our legacy processor architectures are built around what many in the cyber security community call “raw seething bits.” That is their description of what it means to have a single undifferentiated memory, where there is no indication of what each word in that memory is. There is no way to tell, and most importantly no way for the processor to tell, whether a word is an instruction, an integer, or a pointer to a block of memory storing data. There is an important reason processors started out having a single undifferentiated memory: when logic was performed by vacuum tubes or, a bit later, by just a few really expensive transistors, a single memory was simpler and cheaper to build.

That architecture, still in use in all our computing devices, whether servers, laptops, or embedded devices, is Von Neumann’s 1945 stored-program computer (see Fig. 1). Having worked out the simplest architecture that worked, the industry set about perfecting it, making transistors smaller and smaller, until a trend that became known as Moore’s Law took hold. Moore’s Law stated that every 18 months the number of transistors occupying the same area would double. In fact, the explosion of the Internet, mobile devices, the coming IoT, and information processing in general is due to Moore’s Law leveraging this simple architecture, driven by a mantra of smaller, cheaper, faster. Moore’s Law has given us an iPhone more powerful than all the computers NASA owned when it landed men on the moon. Transistors are now incredibly cheap, and while Moore’s Law may have reached its limit, the roughly 2000 transistors of the first microprocessor in 1971 have become 5.5 billion transistors in current processors. Ironically, it is the connectedness of everything enabled by Moore’s Law that has made cyber threats so prevalent and so serious.

Fig. 1 The Von Neumann processor architecture.

In 1945, Von Neumann and others described a simple but powerful processor architecture with a single internal memory. This architecture continues to dominate the processors in billions of devices today. In this single memory, instructions, data, pointers, and all the data structures needed by an application are stored with no way to tell what is what. It is this memory, sometimes called “raw, seething bits,” that prevents the processor from cooperating with the program to enforce security. Figure under the Creative Commons Attribution-Share Alike 3.0 by Kapooht.

As our processors became much more powerful and sophisticated, people began to trust them to protect increasingly valuable things. As processor power made more things possible, people’s expectations also grew, and software size has grown exponentially to keep up. All the while, programming languages like C and C++ continued to provide direct access to the “raw, seething bits” of memory without manifest identity, types, boundaries, or permissions to help explain what each word in memory was for. Even Java, 12 years newer than C++ and 23 years newer than C, which was designed to be safer and more helpful to the programmer, retains the C/C++ risks because it builds on libraries written in C and C++, and that unsound foundation leaves it vulnerable to bugs and attacks. This has left everything up to the programmer. When programming in C/C++/Java, the default is unsafe. As we have seen, a buffer, when allocated, does not protect itself from being overwritten; the programmer always has to do extra work (write more code) to make it safe. This is not so much the fault of the programming language as of things below the level of the language the programmer is writing in. It is in this undefined behavior of the language that most of the problems (which become cyber vulnerabilities) arise.
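A minimal illustration of that point: in C the “safe” version always costs the programmer extra code and an extra size parameter, while the default silently corrupts memory. The function names and buffer size below are arbitrary.

#include <stdio.h>
#include <string.h>

void copy_name_unsafe(char *dst, const char *src)
{
    /* Default C behavior: strcpy keeps writing past the end of dst if src
     * is longer, silently overwriting whatever lives next in memory. */
    strcpy(dst, src);
}

void copy_name_safe(char *dst, size_t dst_size, const char *src)
{
    /* The "extra work": the programmer must carry the destination size
     * around and truncate (or reject) input that does not fit. */
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';        /* guarantee termination */
}

int main(void)
{
    char name[8];
    copy_name_safe(name, sizeof name, "a string longer than eight bytes");
    printf("%s\n", name);            /* prints the truncated, in-bounds copy */
    return 0;
}

Nothing in the language or the hardware forces the second form; it exists only if the programmer remembers, on every single copy, to write it.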

As cybersecurity started to become a concern, there were important constraints that had to be enforced in every program by every programmer to avoid cyber attack. But those vital constraints would only be enforced when every programmer got everything right on every single line of code. Any single mistake could become a vulnerability and sink the ship.

2.9.3 Software complexity—many vulnerabilities

NIST maintains a list of the unique software vulnerabilities (see https://nvd.nist.gov). Across all the world’s software, whenever a vulnerability is found that has not been identified anywhere before, it is added to this list. As of this writing, that list was approaching 76,000 unique vulnerabilities. This means attackers have 76,000 things they can leverage to compromise the systems they attack. Vulnerabilities are how attackers get in. Bugs (or weaknesses) are how the software is written so that vulnerabilities are created.

The Common Weakness Enumeration (CWE) list contains a rank ordering of software errors (bugs) that can lead to a cyber vulnerability. The Top 25 Most Dangerous Software Errors, shown in the table below, is a list of the most widespread and critical errors that can lead to serious vulnerabilities in software. They are often easy to find and easy to exploit. They are dangerous because they frequently allow attackers to completely take over the software, steal data, or prevent the software from working at all. A short illustrative sketch of the top-ranked entry follows the table.

Rank  Score  ID       Name
1     93.8   CWE-89   Improper neutralization of special elements used in an SQL command (“SQL injection”)
2     83.3   CWE-78   Improper neutralization of special elements used in an OS command (“OS command injection”)
3     79.0   CWE-120  Buffer copy without checking size of input (“Classic buffer overflow”)
4     77.7   CWE-79   Improper neutralization of input during web page generation (“Cross-site scripting”)
5     76.9   CWE-306  Missing authentication for critical function
6     76.8   CWE-862  Missing authorization
7     75.0   CWE-798  Use of hard-coded credentials
8     75.0   CWE-311  Missing encryption of sensitive data
9     74.0   CWE-434  Unrestricted upload of file with dangerous type
10    73.8   CWE-807  Reliance on untrusted inputs in a security decision
11    73.1   CWE-250  Execution with unnecessary privileges
12    70.1   CWE-352  Cross-site request forgery
13    69.3   CWE-22   Improper limitation of a pathname to a restricted directory (“Path traversal”)
14    68.5   CWE-494  Download of code without integrity check
15    67.8   CWE-863  Incorrect authorization
16    66.0   CWE-829  Inclusion of functionality from untrusted control sphere
17    65.5   CWE-732  Incorrect permission assignment for critical resource
18    64.6   CWE-676  Use of potentially dangerous function
19    64.1   CWE-327  Use of broken or risky cryptographic algorithm
20    62.4   CWE-131  Incorrect calculation of buffer size
21    61.5   CWE-307  Improper restriction of excessive authentication attempts
22    61.1   CWE-601  URL redirection to untrusted site (“Open redirect”)
23    61.0   CWE-134  Uncontrolled format string
24    60.3   CWE-190  Integer overflow or wraparound
25    59.9   CWE-759  Use of a one-way hash without a salt


Rank ordered by a score based on factors including technical impact, attack surface, and environmental factors as determined by NIST and MITRE who maintain the list. Each item in the list has its ID listed with it for cross-reference to the list maintained at http://cwe.mitre.org.
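To make the top-ranked weakness (CWE-89, SQL injection) concrete, the sketch below contrasts an injectable query with a parameterized one, using SQLite’s C API purely for illustration; the users table and its column names are hypothetical.

#include <sqlite3.h>
#include <stdio.h>

/* VULNERABLE (CWE-89): user_input is pasted directly into the SQL text.
 * Input such as  ' OR '1'='1  changes the meaning of the statement. */
int query_user_unsafe(sqlite3 *db, const char *user_input)
{
    char sql[256];
    snprintf(sql, sizeof sql,
             "SELECT COUNT(*) FROM users WHERE name = '%s';", user_input);
    return sqlite3_exec(db, sql, NULL, NULL, NULL);  /* runs whatever SQL resulted */
}

/* SAFER: a prepared statement with a bound parameter treats the input
 * strictly as data, never as SQL. */
int query_user_safe(sqlite3 *db, const char *user_input)
{
    sqlite3_stmt *stmt = NULL;
    int rc = sqlite3_prepare_v2(db,
             "SELECT COUNT(*) FROM users WHERE name = ?;", -1, &stmt, NULL);
    if (rc != SQLITE_OK)
        return rc;
    sqlite3_bind_text(stmt, 1, user_input, -1, SQLITE_TRANSIENT);
    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("matching rows: %d\n", sqlite3_column_int(stmt, 0));
    return sqlite3_finalize(stmt);
}

The “improper neutralization” in the CWE name is exactly the missing separation between code and data in the first function; the prepared statement restores it.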

Layer upon layer of software increases the “attack surfaces” that attackers can probe for weaknesses. Every layer is subject to the same 15–50 bugs per KLOC and the corresponding exploitable vulnerabilities. This has led many to conclude that while perimeters are absolutely necessary to keep the vandals and script-kiddies out, they are far from sufficient to keep out the determined and sophisticated hackers that nation-states now deploy.

2.9.4 Complacence, fear, no regulatory pressure to act

Another factor making today’s security strategies ineffective is the simple fact that it is too easy to do nothing. There is the erroneous thinking that air-gapped systems are safe (the Stuxnet SCADA controllers were air-gapped). There is fear, and there is hope that someone else gets hacked instead of you. In almost all situations involving embedded systems there are few if any regulations that force anyone to act, combined with a competitive environment in which no one wants to go first because it cuts into profit margins. The regulations we do have are mostly advisory in nature and have no teeth to require industry to act. It is a situation that is heading for a major catastrophe before serious action is taken.

2.9.5 Lack of expertise

Finally, security is not working for a huge number of organizations—especially smaller ones—simply because these organizations are not experts at IT, and expecting them to be on the cutting edge of IT security is totally unreasonable. Even if they outsource their IT function to a supposed expert, the IT outsourcing companies that service small organizations are themselves small businesses that cannot keep pace with the rapid change in the cybersecurity domain.

2.10 What Does This Mean for IoT Security?

The IoT is an extremely fast-growing area with many established companies building products, but many of the participants are small startups trying to turn this brand-new market into a business. Projections by Cisco and others predict 50 billion connected IoT devices by 2020. Plenty of old products, such as routers and home devices, are getting this “new” IoT name. New or old, companies participating in the IoT have very little time to get a product to market, so they do whatever they have to do to get their first products out in the least amount of time. Products in the IoT space have to be inexpensive, so margins are thin and all participants, large and small, cut corners and build products powered by specialized computer chips made by companies such as Broadcom, Qualcomm, and Marvell. These chips are cheap and the profit margins slim. Aside from price, the way processor manufacturers differentiate themselves from each other is by features and bandwidth. They typically put a version of the Linux operating system onto the chips, along with a bunch of other open-source and proprietary components and drivers. They do as little engineering as possible before shipping, and there is little incentive to update their systems until absolutely necessary.

The system manufacturers, usually original device manufacturers (ODMs) who often don’t get their brand name on the finished product, choose a chip based on price and features and then build their IoT device: a router, a server, or something else. They tend not to do a lot of engineering either. The brand-name company on the box adds a user interface and perhaps a few new features, runs batteries of tests to make sure everything works, and then it is done, too.

The problem with this process is that no one entity has any time, margin room (or maybe any cash at all), expertise, or even ability to patch the software once it is shipped. The chip manufacturer is busy shipping the next version of the chip. The ODM is busy upgrading its product to work with this next chip. Maintaining the older chips and products is just not a priority. But the situation with the software is worse.

Much of the software is old, even when the device is new. It is common in home routers that the software components are 4–5 years older than the device. The minimum age of the Linux operating system is around 4 years. The minimum age of the Samba file system software is 6 years. They may have had all the security patches applied, but most likely not. No one has that job or has made it a priority. Some IoT components are so old that they’re no longer being patched. This patching is especially important because security vulnerabilities are found more easily as systems age.

To make matters worse, it is often impossible to patch the software or upgrade the components to the latest version. Often, the complete source code isn’t available. The vendors have the source code to Linux and any other open-source components, but many of the device drivers and other components are just “binary blobs” with no source code at all. That is the most pernicious part of the problem: one cannot patch code that exists only as a binary.

Even when a patch is possible, it’s rarely applied. Users usually have to manually download and install relevant patches. But since users never get alerted about security updates, and don’t have the expertise to manually administer these devices, it doesn’t happen. Sometimes the ISPs have the ability to remotely patch routers and modems, but this is also rare.

The result is hundreds of millions of devices that have been sitting on the Internet, unpatched and insecure, for the last 5–10 years. Devices that have never been on the Internet are being put on it, and hackers are taking notice. When TrackingPoint enabled their self-aiming rifles for Wi-Fi, hackers wasted no time in hacking in, re-aiming the rifle, and firing it remotely [11]. Similarly, when the skateboard company Boosted brought a remote-controlled powered skateboard to market and failed to encrypt the Bluetooth communications, hackers simply took control and could make a board going 20 miles an hour suddenly stop, ejecting its rider [12]. The DNSChanger malware attacks home routers as well as computers. In one South American country, 4.5 million DSL routers were compromised for purposes of financial fraud. Recently, Symantec reported on a Linux worm that targets routers, cameras, and other embedded devices.

This is only the beginning. What we will see soon are easy-to-use hacker tools, and once we do, the script kiddies and vandals will get into the game.

All the new IoT devices will only make this problem worse, as the Internet, as well as our homes and bodies, becomes flooded with new embedded devices that are equally poorly maintained and unpatchable. Still, routers and modems pose the biggest problem because they sit between users and the Internet, so turning them off is usually not an option; they are more powerful and more general in function than other embedded devices; and they are the one 24/7 computing device in the house, and therefore a natural place for lots of new features.

We were here before with personal computers, and we fixed the problem by disclosing vulnerabilities, which forced vendors to fix them. But that approach won’t work the same way with embedded systems. The scale is different today: more devices, more vulnerabilities, viruses spreading faster on the Internet, and less technical expertise on both the vendor and the user sides. Plus, as we have shown, we now have vulnerabilities that are impossible to patch. We have a formula for disaster: huge numbers of devices all connected to the Internet, more functionality in devices combined with a lack of updates, a pernicious market dynamic that has inhibited updates and prevented anyone else from providing them, and a sophisticated hacker community champing at the bit for this green-field opportunity.

Fixing this has to become a priority before a disaster. We need better designs that start with security rather than adding it in the third or fourth revision, or never. Automatic and very secure update mechanisms, now that all these devices are going to be connected to the Internet, are essential as well.

2.11 Attacks Against Embedded Systems

Embedded systems face distinct challenges separate from networked IT systems. Conventional protective strategies are insufficient to mitigate current cyber vulnerabilities, and most organizations do not currently have sufficient embedded system expertise to provide long-term vulnerability mitigation against the adaptive threat we are seeing. While there is no silver-bullet solution, there is a broad set of immediate actions that can significantly mitigate embedded system cyber risk above and beyond basic hygiene:

1. Employ digital signatures and code signing to ensure the software integrity of new applications and all updates. Require future systems to cryptographically verify all software and firmware as it is loaded onto embedded devices (a minimal verification sketch follows this list).

2. Mandate inclusion of software assurance tools/processes and independent verification and validation using appropriate standards as part of future system development. Use best commercial code tools and languages available.

3. Employ hardware/software isolation and randomization to reduce embedded cyber risk and improve software agility even for highly integrated systems.

4. Improve and build organizational cyber skills and capabilities for embedded systems.

5. Protect design/development information (e.g., Source code repositories and revision control systems). Implement security procedures sufficiently early that protection against exfiltration and exploitation is consistent with the eventual criticality of the fielded system.

6. Develop situational awareness hardware and analysis tools to establish a baseline for your embedded operational patterns that will inform the best mitigation strategies.

7. Develop and deploy continuously verifiable software techniques.

8. Develop and deploy formal-method software assurance tools and processes.
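As a sketch of recommendation 1, the fragment below verifies a detached Ed25519 signature over a firmware image before allowing it to be flashed. It assumes libsodium is available on the device and that the vendor’s public key was provisioned into the boot code at manufacture; both are illustrative assumptions, not requirements of any particular product.

#include <sodium.h>
#include <stdbool.h>
#include <stdio.h>

/* Vendor public key provisioned into ROM or OTP fuses at manufacture
 * (placeholder bytes here; a real device would carry the vendor's key). */
static const unsigned char vendor_pk[crypto_sign_PUBLICKEYBYTES] = { 0 };

/* Accept an update image only if its detached Ed25519 signature verifies
 * against the trusted public key. Returns true when flashing may proceed. */
bool firmware_update_allowed(const unsigned char *image, size_t image_len,
                             const unsigned char sig[crypto_sign_BYTES])
{
    if (sodium_init() < 0)
        return false;                 /* crypto library unusable: fail closed */

    if (crypto_sign_verify_detached(sig, image,
                                    (unsigned long long)image_len,
                                    vendor_pk) != 0) {
        fprintf(stderr, "firmware rejected: signature did not verify\n");
        return false;
    }
    return true;                      /* origin and integrity verified */
}

The design point is that the device holds only a public key, so compromising the device does not let an attacker sign new firmware; the signing key stays with the vendor, ideally in an offline or hardware-protected signing process.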

In the following sections, we will examine 11 specific types of attacks against embedded systems which could be mitigated using the above list of recommendations:

1. Stuxnet, the first true cyber-weapon, created by nation-states to attack and destroy physical equipment half a world away. This was a sophisticated cyber attack against SCADA controllers driving uranium enrichment for a nuclear weapons program. Examining how this attack worked is extremely helpful in thinking about how other embedded devices might be attacked and how to prevent it. Stuxnet destroyed the misconception held by the myriad people who had been saying “we are air-gapped and so not susceptible like everyone else is.”

2. Flame, Gauss, and Duqu are all derivatives of Stuxnet, possibly made by different nation-states, and while not as focused on embedded devices per se, they are worth understanding in terms of their scale and sophistication. Again, knowing how this malware works helps prevent new embedded devices from being susceptible to the next attack.

3. Routers are the backbone of any network including the Internet itself. A router is a fairly simple embedded device, yet the entire network is lost if a router is compromised. It was thought that routers from major market-leading vendors were safe but once again, this overconfidence has proven to be unwarranted.

4. Aviation has embedded devices both on board the plane and as part of the ground-based control infrastructure, any part of which, if compromised, could represent catastrophe. Some very simple hacks have been discovered in spite of high degrees of concern and diligence.

5. Automotive systems not only have a multitude of embedded devices, but they also represent one of the most complex systems of systems, in some vehicles reaching 100 million lines of source code. Given that Steve Maguire reports [13] that there are at least 15 bugs per KLOC in deployed systems, and approximately 10% of those can be turned into cyber vulnerabilities, these hugely complex vehicles could have 150,000 potential vulnerabilities.

6. Medical devices are embedded systems, frequently with a life-critical mission. Some are themselves embedded into the human body. Security of these devices is critical for life safety, of course, but also to prevent the panic that would inevitably ensue should cyber hacks occur. Well, they have occurred, and quite publicly.

7. ATM jackpotting is a hack demonstrated at a Black Hat conference. ATMs, like a lot of today’s devices and machines, run fairly standard computers and operating systems internally. ATMs, like these other devices, allow updates over a network. Software unfortunately has flaws, and as security vulnerabilities are found in ATM software they must be patched, or the machines risk being hacked in increasingly creative ways, such as treating an ATM like a jackpot machine that spews out money like a slot machine that hit the jackpot.

8. The military is full of embedded systems, from weapons systems to major platforms such as planes, tanks, ships, and submarines. Attacks on these systems have not only loss-of-life implications but national security ones as well.

9. Infrastructure includes the electric grid, water supply, communications, and transportation, to name just a few. All of these types of infrastructure depend heavily on embedded systems to operate, yet many of them are quite old and made up of legacy (and highly insecure) embedded systems. An attack on any of these infrastructures could cripple the country; thus they are national security issues themselves.

10. Point-of-sale (POS) systems are included because, when it comes to purely making money from hacking, they are a favorite target of cyber attackers, as shown by the breaches at Target, Home Depot, TJX, BJ’s, and many others. POS terminals are among the most prevalent consumer-facing embedded devices, trusted to handle highly sensitive personally identifiable data beginning with credit card numbers, and they are highly vulnerable to attack.

11. Social engineering and password guessing. Password guessing must be included because it remains a common way attackers gain control; it is easy to prevent, yet it remains a prevalent weakness of embedded devices just as it is of large computer systems.
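
As a rough illustration of the arithmetic behind the automotive estimate in item 5 above, here is a minimal sketch in Python; the bug-density figures are simply the ones quoted, not measurements:

    # Illustrative only: bug-density figures are the ones quoted in the text above.
    lines_of_code = 100_000_000          # ~100 million lines in a complex vehicle
    bugs_per_kloc = 15                   # reported defects per thousand lines
    vuln_fraction = 0.10                 # share of bugs that can become vulnerabilities

    total_bugs = (lines_of_code / 1000) * bugs_per_kloc
    potential_vulns = total_bugs * vuln_fraction
    print(f"{total_bugs:,.0f} bugs -> {potential_vulns:,.0f} potential vulnerabilities")
    # Prints: 1,500,000 bugs -> 150,000 potential vulnerabilities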

2.11.1 Stuxnet

Stuxnet was a 500-kilobyte computer worm that infected the software of at least 14 industrial sites in the country it was targeted at, including a uranium-enrichment plant. While a computer virus relies on an unwitting victim to install it, a worm spreads on its own, often over a computer network [14].

This worm was an unprecedentedly masterful and malicious piece of code that attacked in three phases. First, it targeted Microsoft Windows machines and networks, repeatedly replicating itself as it infected system after system. From there it sought out Siemens Step7 software, which is a Windows-based system used to program the ICSs that operate equipment such as centrifuges. Finally, it compromised the PLCs that are directly connected to and in control of the centrifuge motors. The worm’s authors could thus spy on the industrial systems and even cause the fast-spinning centrifuges to tear themselves apart, unbeknownst to the human operators at the plant. The Stuxnet victim never confirmed that the attack destroyed some of its centrifuges, but satellite images show 2000 centrifuges being rolled out of the facility and into the trash heap.

Stuxnet could spread stealthily between computers running Windows—even those not connected to the Internet. If a worker stuck a USB thumb drive into an infected machine, Stuxnet could jump onto it, then spread onto the next machine that read that USB drive. Because someone could unsuspectingly infect a machine this way, letting the worm proliferate over local area networks, experts feared that the malware had perhaps gone wild across the world. Basically it did, but Stuxnet was highly selective about the location, host, and SCADA controller it had in its sights, so its spread was harmless except to the one facility it was designed to target.

For several years the authors of Stuxnet were not officially acknowledged, but the size and sophistication of the worm led experts to believe that it could have been created only with the sponsorship of a nation-state, the leading candidates being the United States and one of its allies [15]. Then in Jun. 2012, it was widely reported that the United States had been involved. Since the discovery of Stuxnet, many computer-security engineers have been fighting off other weaponized viruses based on Stuxnet design principles, such as Duqu, Flame, and Gauss, an onslaught that shows no signs of abating. The callout outlines the 12 steps that the Stuxnet malware went through. The most detailed analysis of how Stuxnet worked was done by Ralph Langner [16].

Put the following in a callout:

1. Stuxnet was initially launched into the wild as early as Jun. 2009, and its creator updated and refined it over time, releasing three different versions. An indication of how determined they were to remain anonymous was that one of the virus’s driver files used a valid signed certificate stolen from RealTek Semiconductor, a hardware maker in Taiwan, in order to fool systems into thinking the malware was a trusted program from RealTek.

2. Several layers of masking obscured the zero-day exploit inside, requiring work to reach it, and the malware was huge—500 k bytes, as opposed to the usual 10–15 k. Generally malware this large contained a space-hogging image file, such as a fake online banking page that popped up on infected computers to trick users into revealing their banking login credentials. But there was no image in Stuxnet, and no extraneous fat either. The code appeared to be a dense and efficient orchestra of data and commands.

3. Normally, Windows functions are loaded as needed from a DLL file stored on the hard drive. Doing the same with malicious files, however, would be a giveaway to antivirus software. Instead, Stuxnet stored its decrypted malicious DLL file only in memory as a kind of virtual file with a specially crafted name. It then reprogrammed the Windows API—the interface between the operating system and the programs that run on top of it—so that every time a program tried to load a function from a library with that specially crafted name, it would pull it from memory instead of the hard drive. Stuxnet was essentially creating an entirely new breed of ghost file that would not be stored on the hard drive at all, and hence would be almost impossible to find.

4. Each time Stuxnet infected a system, it “phoned home” to one of two domains—http://www.mypremierfutbol.com and http://www.todaysfutbol.com hosted on servers in Malaysia and Denmark—to report information about the infected machines. This included the machine’s internal and external IP addresses, the computer name, its operating system and version, and whether Siemens Simatic WinCC Step7 software, also known simply as Step7, was installed on the machine. This approach let the attackers update Stuxnet on infected machines with new functionality or even install more malicious files on systems.

5. Out of the initial 38,000 infections, about 22,000 were in the targeted country. The next most infected country was a distant second, with about 6700 infections, and the third-most had about 3700 infections. The United States had fewer than 400.

6. All told, Stuxnet was found to have at least four zero-day attacks in it. That had never been seen before. One was a Windows print spooler vulnerability that allowed the worm to spread across networks using a shared printer. Another attacked vulnerabilities in a Windows keyboard driver and a Task Scheduler process to perform a privilege escalation to root. Finally, it was exploiting a static password Siemens had hard-coded into its Step7 software, which it used to gain access to and infect a server hosting a database used with the Step7 programming system.

7. Stuxnet did not spread via the Internet as every other worm before it had. It targeted systems that were not normally connected to the Internet at all (air-gapped) and instead spread via USB thumb drives physically inserted into new machines.

8. The payload Stuxnet carried consisted of three main parts and 15 components, all wrapped together in layers of encryption that would only be decrypted and extracted when the conditions on the newly infected machine were just right. When it found its target—a Windows machine running Siemens Step7—it would install a new DLL that impersonated a legitimate one Step7 used.

9. Step7 was a Windows-based programming environment for the Siemens PLC used in many industrial control automation applications. The malicious DLL would intercept commands going from Step7 to the PLC and replace them with its own commands. Another portion of Stuxnet disabled alarms as a result of the malicious commands. Finally, it intercepted status messages sent from the PLC to the Step7 machine stripping out any signs of the malicious commands. This meant workers monitoring the PLC from the Step7 machine would see only legitimate commands and have no clue the PLC was being told to do very different things from what they thought Step7 had told it to do. At this moment, the first ever worm designed to do physical sabotage was born.

10. Stuxnet, it turns out, was a precision weapon sabotaging a specific facility. It carried a very specific dossier with details of the target configuration at the facility it sought. Any system not matching would be unharmed. This made it clear the attackers were a government with detailed inside knowledge of its target.

11. In the final stage of Stuxnet’s attack, it searched for the unique part number for a Profibus and then for one of two frequency converters that control motors. The malware would sit quietly on the system doing reconnaissance for about 2 weeks, then launch its attack swiftly and quietly, increasing the frequency of the converters to 1410 Hz for 15 min, before restoring them to a normal frequency of 1064 Hz. The frequency would remain at this level for 27 days, before Stuxnet would kick in again and drop the frequency down to 2 Hz for 50 min. The drives would remain untouched for another 27 days, before Stuxnet would attack again with the same sequence. The extreme range of frequencies suggested Stuxnet was trying to destroy whatever was on the other end of the converters. Satellite images show that approximately 2000 centrifuges—one-fifth of their total—were removed from the targeted country’s nuclear facility suggesting they were indeed destroyed.

12. Recently it was discovered there were even older versions of Stuxnet that went undiscovered for many years. The first targeted gas valves in nuclear reactors and the second targeted the reactors’ cores.

The 2010 Stuxnet worm used at least three separate zero-day exploits—an unprecedented feat—to damage industrial controllers and disrupt the uranium enrichment facility.

The zero-day vulnerabilities included the following from the list of CVEs [9]:

 CVE-2010-2568—Executes arbitrary code when a user opens a folder with a maliciously crafted .LNK or .PIF file.

 CVE-2010-2729—Executes arbitrary code when attacker sends a specially crafted remote procedure call message.

 CVE-2010-2772—Allows local users to access a back-end database and gain privileges in Siemens Simatic WinCC and PCS 7 SCADA system.

The attack also exploited already patched vulnerabilities, suggesting that the attackers knew the systems likely would not have been updated.

2.11.2 Flame, Gauss, and Duqu

Flame, Gauss, and Duqu are cousins of Stuxnet but did not focus on embedded systems. Still, it is useful to understand what serious attackers are interested in doing, especially because the distinction between embedded and nonembedded systems will lessen over time.

Flame

Flame, also known as Flamer and Skywiper, is computer malware discovered in 2012 that attacks computers running the Windows operating system. The program was used for targeted cyber espionage in Middle Eastern countries. Its internal code has few similarities with other malware, but it exploits two of the same security vulnerabilities previously used by Stuxnet to infect systems.

Its discovery was announced by Kaspersky Lab and CrySyS Lab of the Budapest University of Technology and Economics, which stated that Flame “is certainly the most sophisticated malware we encountered during our practice; arguably, it is the most complex malware ever found” [17].

Flame can spread to other systems over a local network or via USB stick. It can record audio, screenshots, keyboard activity, and network traffic. The program also records Skype conversations and can turn infected computers into Bluetooth beacons, which attempt to download contact information from nearby Bluetooth-enabled devices. This data, along with locally stored documents, is sent on to one of several C&C servers that are scattered around the world. The program then awaits further instructions from these servers.

According to estimates by Kaspersky in May 2012, Flame had initially infected approximately 1000 machines, with victims including governmental organizations, educational institutions, and private individuals [18]. At that time, 65% of the infections happened in the Middle East and North Africa. Flame has also been reported in Europe and North America. Flame supports a “kill” command, which wipes all traces of the malware from the computer. The initial infections of Flame stopped operating after its public exposure, and the “kill” command was sent [19].

Gauss

Kaspersky Lab researchers also discovered a complex cyber-espionage toolkit called Gauss, a nation-state-sponsored malware attack closely related to Flame and Stuxnet that blends nation-state cyber-surveillance with an online banking Trojan [20]. It can steal access credentials for various online banking systems and payment methods, along with various kinds of data from infected Windows machines, such as specifics of network interfaces, the computer’s drives, and even information about the BIOS. It can steal browser history, social network and instant messaging information and passwords, and it searches for and intercepts cookies from PayPal, Citibank, MasterCard, American Express, Visa, eBay, Gmail, Hotmail, Yahoo, Facebook, Amazon, and a number of Middle Eastern banks. Additionally, Gauss includes an unknown, encrypted payload, which is activated on certain specific system configurations.

Gauss is a nation-state-sponsored banking Trojan that, according to Kaspersky, carries a warhead of unknown designation. The payload is delivered via infected USB sticks and is designed to surgically target a certain system (or systems) that has a specific program installed. One can only speculate on the purpose of this mysterious payload. The malware copies itself onto any clean USB stick inserted into an infected personal computer, collects data if that stick is inserted into another machine, and then uploads the stolen data when the stick is reinserted into a Gauss-infected machine.

The main Gauss module is only about 200 KB, one-third the size of the main Flame module, but it can load other plugins, which altogether account for about 2 MB of code. Like Flame and Duqu, Gauss is programmed with a built-in time-to-live (TTL). When Gauss infects a USB memory stick, it sets a flag to 30. This TTL flag is decremented every time the payload is executed from the stick. Once it reaches 0, the data-stealing payload cleans itself from the USB stick. This probably means it was built with an air-gapped network in mind.

There were seven domains being used to gather data, but the five C&C servers went offline before Kaspersky could investigate them. Kaspersky says they do not know if the people behind Duqu switched to Gauss at that time but they are quite sure they are related: Gauss is related to Flame, Flame is related to Stuxnet, and Stuxnet is related to Duqu. Hence, Gauss is related to Duqu. On the other hand, it is hard to believe that a nation state would rely on these banking Trojan techniques to finance a cyber-war or cyber-espionage operation.

So far, Gauss has infected more than 2500 systems in 25 countries with the majority—1660 infected machines—being located in a Middle Eastern country. Gauss started operating around Aug.–Sep. 2011 and it is almost certainly still in operation. Deep analysis of Stuxnet, Duqu, and Flame, leads these experts to believe with a high degree of certainty that Gauss comes from the same “factory” or “factories.” All these attack toolkits represent the high end of nation-state sponsored cyber-espionage and cyberwar operations, pretty much defining the meaning of “sophisticated malware” [21].

Duqu

Duqu looks for information that could be useful in attacking ICSs [22]. Its purpose seems not to be destructive but rather to gather information. However, based on the modular structure of Duqu, a special payload could be used to attack any type of computer system by any means, and thus cyber-physical attacks based on Duqu might be possible. On some personal computer systems, Duqu has been found to delete all recently entered information, and in some cases it has wiped the computer’s hard drive entirely. One of Duqu’s actions is to steal digital certificates (and corresponding private keys, as used in public-key cryptography) from attacked computers, apparently to help future viruses appear as secure software. Duqu uses a 54 × 54 pixel JPEG file and encrypted dummy files as containers to smuggle data to its C&C center. Security experts are still analyzing the code to determine what information the communications contain. Initial research indicates that the original malware sample automatically removes itself after 36 days, which limits its detection.

Duqu has been called “nearly identical to Stuxnet, but with a completely different purpose.” Thus, Symantec believes that Duqu was created by the same authors as Stuxnet, or that the authors had access to the source code of Stuxnet [23]. The worm, like Stuxnet, has a valid, but abused digital signature, and collects information to prepare for future attacks. Duqu’s kernel driver, JMINET7.SYS, is so similar to Stuxnet’s MRXCLS.SYS that some systems thought it was Stuxnet.

2.11.3 Routers

Routers are attractive to hackers because they operate outside the perimeter of firewalls, antivirus, behavioral detection software, and other security tools that organizations use to safeguard data traffic. Until now, they were considered vulnerable only to sustained denial-of-service attacks using barrages of millions of packets of data. It has been “common knowledge” that they were not very vulnerable to outright takeover. However, a recent very dangerous attack against routers called SYNful has proven this commonly held belief is wrong. SYNful is a reference to SYN, the signal a router sends when it starts to communicate with another router, a process which the implant exploited.

If one seizes control of the router, that attacker owns the data of all the companies and government organizations that sit behind that router. Takeover of the router is considered the ultimate spying tool, the ultimate corporate espionage tool, the ultimate cybercrime tool. Router attacks have hit multiple industries and government agencies.

Cisco has confirmed the SYNful attacks but said they were not due to any vulnerability in its own software. Instead, the attackers stole valid network administration credentials from targeted organizations or managed to gain physical access to the routers. Mandiant has found at least 14 instances of the router implants [24]. Because the attacks actually replace the basic software controlling the routers, infections persist when devices are shut off and restarted. If found to be infected, basic software used to control those routers has to be re-imaged, a time-consuming task for IT engineers.

SYNful Knock is a stealthy modification of the router’s firmware image that can be used to maintain persistence within a victim’s network. It is customizable and modular in nature and thus can be updated once implanted. Even the presence of the backdoor can be difficult to detect as it uses nonstandard packets as a form of pseudo-authentication.

The initial infection vector does not appear to leverage a zero-day vulnerability. It is believed that the credentials are either default or discovered by the attacker in order to install the backdoor. However, the router’s position in the network makes it an ideal target for re-entry or further infection. It is a significant challenge to find backdoors within one’s network but finding a router implant is even more so. The impact of finding this implant on one’s network is severe and most likely indicates the presence of other footholds or compromised systems. This backdoor provides ample capability for the attacker to propagate and compromise other hosts and critical data using this as a very stealthy beachhead.

The implant consists of a modified Cisco IOS image that allows the attacker to load different functional modules from the anonymity of the Internet [25]. The implant also provides unrestricted access using a secret backdoor password. Each of the modules is enabled via HTTP (not HTTPS), using specially crafted Transmission Control Protocol (TCP) packets sent to the router’s interface. The packets have nonstandard sequence and corresponding acknowledgment numbers. The modules can manifest themselves as independent executable code or as hooks within the router’s IOS that provide functionality similar to the backdoor password. The backdoor password provides access to the router through the console and Telnet.
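
To make the idea of “nonstandard” packets concrete, the following is a minimal sketch (not the published detection rule) that flags one generic anomaly: a TCP SYN packet carrying a nonzero acknowledgment number, something a normal initial SYN never does. It assumes Linux, sufficient privileges to capture packets, the scapy package, and an example interface name.

    # Hedged sketch: flags generically anomalous SYN packets, not a SYNful Knock signature.
    from scapy.all import sniff, IP, TCP

    def flag_odd_syn(pkt):
        if IP in pkt and TCP in pkt:
            tcp = pkt[TCP]
            # A pure SYN (no ACK flag set) should have an acknowledgment field of zero.
            if tcp.flags == "S" and tcp.ack != 0:
                print(f"Suspicious SYN from {pkt[IP].src}: seq={tcp.seq} ack={tcp.ack}")

    # "eth0" is an example interface name.
    sniff(filter="tcp", prn=flag_odd_syn, store=False, iface="eth0")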

This very sophisticated, powerful, and dangerous attack on such an important embedded device is emblematic of what any embedded device developer has to be prepared for.

2.11.4 Aviation

There have been a number of incidents in recent years demonstrating that worrisome cyber security vulnerabilities exist in the civil aviation system, some of them in embedded devices within the airplane and others in attacks on ground systems that affect the embedded devices on board. Some incidents of note are:

 an attack on the Internet in 2006 that forced the US Federal Aviation Administration (FAA) to shut down some of its ATC systems in Alaska;

 a cyber-attack that led to the shutdown of the passport control systems at the departure terminals at Istanbul Atatürk and Sabiha Gökçen airports in Jul. 2013, causing many flights to be delayed; and

 an apparent cyber-attack that possibly involved malicious hacking and phishing targeted at 75 airports in the United States in 2013.

More directly related to the vulnerability of on-board embedded systems is the loss of Malaysia Airlines Flight MH370, which is fueling a discussion of whether it is possible to hack into an airplane and gain complete control of on-board systems. In the case of MH370, speculation persists that any such intrusion came through the entertainment system. This is not far-fetched. The former scientific adviser to the UK’s Home Office, Sally Leivesley, claimed Boeing 777 controls could be coopted with a radio signal sent from a small device [26]. A major vulnerability on many aircraft is the lack of any separation between the in-flight entertainment systems and critical control systems. This “open door” for hackers has USB ports and comes with Ethernet. Recent modifications Boeing asked the FAA to approve added a “network extension device” to separate the various systems from each other.

It was disclosed at a Hack In The Box security conference that it is possible to hack the navigation system within an airplane with an Android Smartphone. This is particularly alarming because just by using such a common and simple device, a hacker is able to take control of the entire on-board control system including plane navigation and cockpit systems. The researcher demonstrated that just using an exploit framework, called Simon, and an Android app, it is possible to gain remote control of the airplane’s on-board control system [27].

On the ground, hackers have been able to target ground systems and have successfully grounded 1400 passengers on LOT Polish Airways. Hackers breached the airline’s ground computers used to issue flight plans blocking the airline’s ability to create flight plans for outbound flights from its Warsaw hub. The airline canceled 20 flights and several others were delayed. The airline’s CEO warned all airlines are vulnerable.

The FAA’s highly vaunted NextGen system has already been shown to have serious vulnerabilities. The cornerstone of this new system is automatic-dependent surveillance-broadcast or ADS-B, where planes will be equipped with GPS and will constantly send out radio broadcasts announcing to the world who they are and where they are. It turns out that ADS-B signals look a lot like little bits of computer code that are unencrypted and unauthenticated. For computer security experts, these are red flags. Concerned white hat hackers have learned they can spoof these signals and create fake “ghost planes” in the sky. A fake plane could cause a real pilot to swerve—or a series of ghost planes could shut down an airport.

2.11.5 Automotive

Vulnerabilities in car automation systems have been exposed by several groups of security researchers to demonstrate how a networked car’s acceleration, braking, and other vital systems could be sabotaged. Researchers have also studied the risk of remote attacks against networked vehicles. A very highly publicized event was a hack on a Jeep, after which Chrysler announced that it was issuing a formal recall for 1.4 million vehicles that may be affected by a hackable software vulnerability in Chrysler’s Uconnect dashboard computers [28]. Hacks have demonstrated getting into the controls of some popular vehicles, causing them to suddenly accelerate, turn, kill the brakes, activate the horn, control the headlights, and modify the speedometer and gas gauge readings.

A team from University of California demonstrated that computer systems embedded in automobiles can be hacked to compromise safety features. They have even found ways to remotely hack into these systems using Bluetooth and MP3 files. These and other researchers have been exploring security holes in electronic vehicle controls as automakers develop increasingly complicated in-car computers and Internet-connected entertainment systems with dozens of processors and millions of lines of code. These results have been presented to the National Academy of Sciences Committee on Electronic Vehicle Controls and Unintended Acceleration [29].

Most new cars have some kind of a computer system that controls basic functions, such as brakes and engine performance, as well as advanced features such as Bluetooth wireless and built-in connectors for cell phones, MP3 players, and other devices. All new American-made cars are federally mandated to have a controller area network bus for diagnostics, and several automakers have rolled out cellular technology such as General Motors’ OnStar and Ford’s Sync services.

The cellular and Bluetooth connectivity is proving to be a particularly fruitful area for hackers. They have demonstrated ability to control the car’s brakes, locks, and computerized dashboard display by accessing the on-board computer using the Bluetooth wireless and OnStar and Sync’s cellular networks. The hacker team (friendly in this case) was also able to access GPS data and vehicle identification numbers. They broke through the cellular network’s authentication system to upload an audio file containing malware. Then they played an MP3 file containing some malicious code over the car’s stereo to alter the firmware. They found a vulnerability in the way Bluetooth was implemented that allowed them to execute malicious code by using an app installed on a smart phone that had been “paired” with the car’s Bluetooth system. Their work showed that attackers could search for desired models of cars, identify their locations using GPS tracking, and unlock them without laying a hand on the car. They could also sabotage such a car by disabling its brakes.
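
A common thread in these demonstrations is that once an attacker reaches the in-vehicle controller area network, the bus itself does nothing to verify who sent a frame. A hedged sketch of that property, using the python-can library against a Linux virtual test bus (the interface name, arbitration ID, and data bytes are made-up examples, not any vehicle’s real message set):

    # Hedged sketch: any node on a CAN bus can transmit any frame; nothing authenticates the sender.
    import can

    # vcan0 is a Linux virtual CAN interface used for testing, not a real vehicle.
    bus = can.interface.Bus(channel="vcan0", bustype="socketcan")

    frame = can.Message(
        arbitration_id=0x244,           # example ID; real vehicles map IDs to functions differently
        data=[0x00, 0x01, 0x02, 0x03],  # example payload bytes
        is_extended_id=False,
    )
    bus.send(frame)                     # the bus accepts the frame with no notion of "who" sent it
    bus.shutdown()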

Another group decided to go after the tire pressure monitors built into modern cars and showed them to be insecure. The wireless sensors, compulsory in new automobiles in the United States since 2008, contain unique IDs, so merely eavesdropping enabled the researchers to identify and track vehicles remotely. Beyond this, they could alter and forge the readings to cause warning lights on the dashboard to turn on, or even crash the engine control unit completely. The tire pressure monitors are notable because they’re wireless, allowing attacks to be made from adjacent vehicles while driving down the road. The researchers used equipment costing $1500, including radio sensors and special software, to eavesdrop on, and interfere with, two different tire pressure monitoring systems. While these attacks are more of a nuisance than any real danger—the tire sensors only send a message every 60–90 s, giving attackers little opportunity to compromise systems or cause any real damage—the point is, several teams have now demonstrated that in-car computers have been designed with ineffective security measures.

2.11.6 Medical

More than 2.5 million people rely upon implantable medical devices to treat conditions ranging from cardiac arrhythmias to diabetes to Parkinson’s disease. Increasingly, these electronic devices, each with a processor on board, are connected to some network and are becoming part of the IoT. Inevitably, organized crime will turn its attention to the computers inside of us, whether for financial gain, for attention, or simply to cause fear.

Pacemakers

Pacemakers from several manufacturers can be commanded to deliver a deadly, 830-V shock from someone on a laptop up to 50 ft away, the result of poor software programming by medical device companies. This was first reported by the well-known white hat hacker Barnaby Jack of security vendor IOActive, known for his analysis of other medical equipment such as insulin-delivering devices. The flaw lies with the programming of the wireless transmitters used to give instructions to pacemakers and implantable cardioverter-defibrillators (ICDs), which detect irregular heart contractions and deliver an electric shock to avert a heart attack. A successful attack using the flaw could definitely result in fatalities. Jack made a video demonstration showing how he could remotely cause a pacemaker to deliver an 830-V shock, which could be heard with a crisp audible pop.

As many as 4.6 million pacemakers and ICDs were sold between 2006 and 2011 in the United States alone. In the past, pacemakers and ICDs were reprogrammed by medical staff using a wand that had to pass within a couple of meters of a patient who had one of the devices installed. The wand flips a software switch that would allow it to accept new instructions. But the trend is now to go wireless. Several medical manufacturers are now selling bedside transmitters that replace the wand and have a wireless range of up to 30–50 ft. In 2006, the US Food and Drug Administration approved full radio-frequency-based implantable devices operating in the 400 MHz range.

With that wide transmitting range, remote attacks against the software become more feasible. Jack found the devices would give up their model and serial number after he wirelessly contacted one with a special command. With the serial and model numbers, Jack could then reprogram the firmware of a transmitter, which in turn allowed reprogramming of a pacemaker or ICD in a person’s body.

Other problems found with the devices included the fact that they often contain personal data about patients, such as their name and their doctor, as well as access to remote servers used to develop the software. It is possible to upload specially crafted firmware to a company’s servers that would infect multiple pacemakers and ICDs, spreading through their systems like a real virus. Jack painted a doomsday scenario saying, “We are potentially looking at a worm with the ability to commit mass murder.” Ironically, both the implants and the wireless transmitters are capable of using AES (Advanced Encryption Standard) encryption, but it was not enabled. The devices also were built with “backdoors,” or ways that programmers can get access to them without the standard authentication using a serial and model number.
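
Enabling the encryption that the hardware already supports is largely a software exercise. As a hedged sketch of what authenticated encryption of a telemetry message could look like, here is a minimal example using Python’s cryptography package; the key handling, nonce strategy, and message contents are illustrative assumptions, not any vendor’s actual protocol:

    # Minimal AES-GCM example: confidentiality plus integrity for a small telemetry payload.
    from os import urandom
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)    # in a real device this would be provisioned securely
    aead = AESGCM(key)

    nonce = urandom(12)                          # must never repeat for the same key
    telemetry = b"HR=62;BATTERY=87"              # made-up payload
    device_id = b"implant-serial-0001"           # bound to the ciphertext as associated data

    ciphertext = aead.encrypt(nonce, telemetry, device_id)
    recovered = aead.decrypt(nonce, ciphertext, device_id)   # raises if anything was tampered with
    assert recovered == telemetry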

There is a legitimate medical need for a backdoor since without one, you might have to perform otherwise unnecessary surgery. It is vital when designing a backdoor that extreme care be taken. In this case, it has to be embedded deep inside the ICD core. Ultimately the flaws in the pacemaker could mean an attacker could perform a fairly anonymous assassination from 50 ft away turning a simple laptop into a murder weapon.

And it doesn’t take a professional, highly advanced hacker like Barnaby Jack to construct these attacks. A University of Alabama group showed that a pacemaker or insulin pump attack can be carried out by a student with a basic information technology and computer science background. The student attackers had no penetration-testing skills, but successfully launched brute force and DoS attacks as well as attacks on the security controls of a pacemaker. Their attacks included a DoS attack using HPING3 and a brute force attack using Reaver against Wi-Fi Protected Setup (WPS) PINs. In the students’ second attempt, they were able to crack the pacemaker’s passphrase in 9528 s, or 2 h 38 min and 48 s.

Diabetes glucose monitors and insulin pumps

Insulin pumps are used to treat patients with diabetes by infusing their bodies with insulin, which, in healthy individuals, is secreted by the pancreas. When insulin levels are too low, people suffer from excessive blood sugar levels, a condition known as hyperglycemia. When insulin levels are too high, they suffer from hypoglycemia, a condition that can result in death if left unchecked. A glucose monitor, until recently just an external device a patient or his/her doctor uses to test blood glucose levels, can now be surgically inserted under the skin to continuously monitor levels without a skin prick. Continuous monitoring that provides fine-grained control over the insulin pump is much healthier because spikes and valleys in blood glucose levels are not good for the patient.

In addition to his pacemaker attack, Barnaby Jack succeeded in taking control of both an insulin pump’s radio control and its vibrating alert safety mode. Jack’s hacking kit included a special piece of software and a custom-built antenna with a scan range of 300 ft, for which the operator does not need to know the serial number. This meant the latest models of insulin pumps, equipped with small radio transmitters allowing medics to adjust their function, could become easy prey to this hacking invention that scans around for insulin pumps. Once the hacker’s code takes control of the targeted machine, he can then disable the warning function and/or make it dispense 45 days’ worth of insulin all at once—a dose that will most likely kill the patient.

Since these attacks were brought to light, manufacturers have been working to improve the security of medical devices by evaluating encryption and other protections that can be added to their designs. Medical device manufacturers have promised to set up an industry working group to establish a set of standard security practices.

2.12 ATM “Jackpotting”

Barnaby Jack has also demonstrated attacks against two unpatched models from two of the world’s biggest ATM makers. One exploited software that uses the internet or phone lines to remotely administer a machine made by Tranax Technologies. Once Jack was in, he was able to install a rootkit that allowed him to view administrative passwords and account PINs and to force the machine to spit out a steady stream of dollar bills, something Jack called “jackpotting.”

In a second attack against a machine from Triton Systems, Jack used a key available for sale over the internet to access the model’s internal components. He was then able to install his rootkit by inserting a USB drive that was preloaded with the malicious program.

Both Triton and Tranax have patched the vulnerabilities that were exploited in the demos. But in a press conference after his revelations, Jack said he was confident he could find similarly devastating flaws—including in machines made by other manufacturers as well. Jack said he wasn't aware of real-world attacks that used his exploits.

To streamline his work, Jack developed an exploit kit he calls Dillinger, named after the 1930s bank robber. It can be used to access ATMs that are connected to the internet or the telephone system, which Jack said is true of most machines. The researcher has developed a rootkit dubbed Scrooge, which is installed once Dillinger has successfully penetrated a machine.

Jack said vulnerable ATMs can be located by war-dialing large numbers of phone numbers or sending specific queries to IP addresses. Those connected to ATMs will send responses that hackers can easily recognize. Jack called on manufacturers to do a better job securing their machines. Upgrades for physical locks, executable signing at the operating system kernel level and more rigorous code reviews should all be implemented, he said.

2.13 Military

Despite a 2013 Pentagon report warning of major vulnerabilities in the cyber security of weapons systems [30], the US House of Representatives believes the military is lagging behind in closing those software holes. Meanwhile the US Senate proposes a 3-year cyber vulnerability assessment of all major weapons systems focusing on whether they can be hacked. The chief Pentagon weapons tester, James Michael Gilmore, found after a year of running dozens of tests and simulations on over 40 military weapons systems that almost all of them have some kind of major cybersecurity weakness [31].

Hackers have accessed designs for more than two dozen major US weapons systems according to a classified Defense Science Board report. A partial list of compromised designs includes the F-35 fifth-generation fighter jet, the V-22 Osprey, THAAD missile defense, Patriot missile defense, AIM-120 Advanced Medium-Range Air-to-Air Missile, and the Global Hawk high-altitude surveillance drone.

The benefits to an attacker using cyber exploits are potentially spectacular. Should the United States find itself in a full-scale conflict with a peer adversary, attacks would be expected to include DoS, data corruption, supply chain corruption, traitorous insiders, and kinetic and related nonkinetic attacks at all altitudes from underwater to space. US guns, missiles, and bombs may not fire, or may be directed against our own troops. Hackers could theoretically infiltrate the cryptographic intranet communications systems of an F-35 and render it essentially inoperable. Resupply, including food, water, ammunition, and fuel may not arrive when or where needed. Military Commanders may rapidly lose trust in the information and ability to control US systems and forces. Once lost, that trust is very difficult to regain.

Analysts at IOActive have shown that communications devices from Harris, Hughes, Cobham, Thuraya, JRC, and Iridium are all highly vulnerable to attack. The security flaws are numerous, but the most important one—the one most consistent across the systems—is back doors, special access points that engineers design into the systems to allow fast access. Cobham defends the practice of providing a back door, claiming, as many manufacturers do, that such a back door is a “feature” that helps ensure ease of maintenance. They believe that you have to be physically present at the terminal to use the maintenance port. But experienced hackers dispute that, and while you do need physical access to pull off certain attacks, other vulnerabilities within the swift broadband unit (SBU) can be attacked through the Wi-Fi. This author insists that back doors should never be allowed into critical systems. Another common security flaw in these communication devices used in many weapons systems is hardcoded credentials, which allow multiple users access to a system via a single login identity. That is another serious no-no that no customer should allow in devices they purchase.

The most serious vulnerability found in Cobham’s equipment allowed a hacker access to the system’s SBU and the satellite data unit (SDU). This means any of the systems connected to these elements, such as the multifunction control display unit (MCDU), could be affected by a successful attack. The MCDU provides information on such vital areas as the amount of fuel left in the plane. A hacker could give the pilot a lot of bad information that could imperil the aircraft, as happened in 2001 aboard Air Transat Flight 236, when a mechanical fault diverted fuel to a leaky tank without the pilots being informed. The pilots didn’t know the severity of the mechanical problem until there was a massive power failure in mid-air. This example shows the similarity between “naturally occurring” events and cyber attacks, a similarity that is not lost on sophisticated attackers, who try to disguise some or all of their attack as a naturally occurring event, or to launch it in association with such an event, such as a storm in the case of a power grid attack.

The military sea-borne systems aren’t any safer. Friendly hackers have shown that they can access the SAILOR 6000 satellite communications device, also manufactured by Cobham, which is used in naval settings by countries, including the United States, participating in the Global Maritime Distress and Safety System, an international framework to allow for better communication among maritime actors. This system, which the world’s maritime nations—including the United States—have implemented, is based upon a combination of satellite and terrestrial radio services and has changed international distress communications from being primarily ship-to-ship-based to primarily ship-to-shore-based according to the Department of Homeland Security.

Also closely examined were SATCOM systems used by ground forces [32]. SATCOMs are in wide use among the US and NATO militaries, where troops use them to communicate with units beyond the line of sight. Traffic over these systems is encrypted, so there is no threat of enemies listening in on restricted calls. But IOActive showed that it is possible not only to disrupt the communication but also to discover troop locations. This means that a would-be enemy could exploit design flaws in some very common pieces of military communications equipment used by soldiers on the front lines to block calls for help, or potentially reveal troop positions.

2.14 Infrastructure

The Director of National Intelligence James Clapper has identified cyberattacks as the greatest threat to US national security [33]. Critical infrastructure—the physical and virtual assets, systems, and networks vital to national and economic security, health, and safety such as utilities, refineries, military defense systems, water treatment plants and other facilities on which we depend every day—has been shown to be vulnerable to cyberattacks by foreign governments, criminal entities, and lone actors. Due to the increasingly sophisticated, frequent, and disruptive nature of cyberattacks, such an attack on critical infrastructure could be significantly disruptive or potentially devastating. Policymakers and cybersecurity experts contend that energy is the most vulnerable industry; a large-scale attack could temporarily halt the supply of water, electricity, and gas, hinder transportation and communication, and cripple financial institutions. America’s critical infrastructure has become its soft underbelly, the place where we are now most vulnerable to attack.

Over the past 25 years, hundreds of thousands of analog controls in these facilities have been replaced with digital systems. Digital controls provide facility operators and managers with remote visibility and control over every aspect of their operations, including the flows and pressures in refineries, the generation and transmission of power in the electrical grid, and the temperatures in nuclear cooling towers. In doing so, they have made industrial facilities more efficient and more productive. But the same connectivity that managers use to collect data and control devices allows cyber attackers to get into control system networks to steal sensitive information, disrupt processes, and cause damage to equipment. Hackers have taken notice. While early control system breaches were random accidental infections, ICSs today have become the object of targeted attacks by skilled and persistent adversaries.

ICS is a general term that encompasses several types of control systems used in industrial production, including SCADA systems, distributed control systems, and other smaller control system configurations such as PLCs often found in the industrial sectors and critical infrastructures. ICSs are typically used in industries such as electrical, water, oil, gas, and data. Based on data received from remote stations, automated or operator-driven supervisory commands can be pushed to remote station control devices, which are often referred to as field devices. Field devices control local operations such as opening and closing valves and breakers, collecting data from sensor systems, and monitoring the local environment for alarm conditions.
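
Part of what makes these field devices so exposed is that the classic ICS protocols they speak, such as Modbus/TCP, have no built-in authentication: any host that can reach the device on the network can issue supervisory commands. A hedged sketch of how little that takes, using the pymodbus library against a documentation-only address (the IP, port, and coil number are made-up examples, and the import path varies between pymodbus versions):

    # Hedged sketch: Modbus/TCP carries no credentials, so a write request is just a packet away.
    from pymodbus.client import ModbusTcpClient   # pymodbus 3.x; older releases use pymodbus.client.sync

    client = ModbusTcpClient("192.0.2.10", port=502)   # TEST-NET address used purely as a placeholder
    if client.connect():
        # Writing a single coil is how a supervisory host might open or close a valve or breaker.
        result = client.write_coil(1, True)
        print("write accepted" if not result.isError() else "write rejected")
        client.close()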

The 2010 discovery of the Stuxnet worm demonstrated the vulnerability of these systems to cyber attack.

Attackers target ICSs because they might want to disrupt critical operations, create public panic, harm a company or the nation economically, or steal information from these devices.

The recently discovered ICS modules of the HAVEX trojan are one example. This malware infiltrated an indeterminate number of critical facilities by attaching itself to software updates distributed by control system manufacturers. When facilities downloaded the updates to their network, HAVEX used open communication standards to collect information from control devices and sent that information to the attackers for analysis. This type of attack represents a significant threat to confidential production data and corporate intellectual property and may also be an early indicator of an advanced targeted attack on an organization’s production control systems.

Other hacks represent a direct threat to the safety of US citizens. Recently the FBI released information on Ugly Gorilla, an attacker who invaded the control systems of utilities in the United States. While the FBI suspects this was a scouting mission, Ugly Gorilla gained the cyber keys necessary for access to systems that regulate the flow of natural gas [34].

Considering that cyber attackers are numerous and persistent—for every one you see there are a hundred you don’t—those developments should sound alarms among executives at companies using industrial controls and with the people responsible for protecting American citizens from attacks. To a limited extent both businesses and the US government have begun to take action; however, no one is adequately addressing the core of the issue. That issue is that every computer system we have deployed for the last 70 years is highly vulnerable to attack no matter how many patches we employ or how many layers of perimeter we lay down.

2.14.1 Electric grid

The people who carried out the Dec. 2015 attack, the first known hacker-caused power outage, used highly destructive malware to gain a foothold in multiple regional power distribution companies in an Eastern European country and to delay restoration efforts once electricity had been shut off. The malware, known as BlackEnergy, allowed the attackers to gain a foothold on the power company systems, said the report, which was published by a member of the SANS ICS team. The still-unknown attackers then used that access to open circuit breakers that cut power. After that, they likely used a wiper utility called KillDisk to thwart recovery efforts and then waged denial-of-service attacks to prevent power company personnel from receiving customer reports of outages.

The attackers demonstrated planning, coordination, and the ability to use malware and possibly direct remote access to blind system dispatchers, cause undesirable state changes to the distribution electricity infrastructure, and attempt to delay the restoration by wiping SCADA servers after they caused the outage. The attack consisted of at least three components: the malware, a DoS against the phone systems, and a final component, the direct cause of the impact, for which hard evidence is missing. Analysis indicates that this missing component was direct interaction by the adversary rather than the work of malware. In other words, the attack was enabled by malware but consisted of at least three distinct efforts.

2.15 Point-Of-Sale Systems

POS systems are embedded devices at checkout points in retail stores. Since this is where consumer’s credit cards are processed, this has become a favorite target for hackers. The most famous POS attack was against Target during the Christmas holiday shopping season in 2013.

Target disclosed that the initial intrusion into its systems, on Nov. 15, 2013, was traced back to network credentials that were stolen from a third-party HVAC contractor. This was the same HVAC contractor used by Trader Joe’s, Whole Foods, and BJ’s Wholesale Club locations in Pennsylvania, Maryland, Ohio, Virginia, and West Virginia. These large retailers, with very large buildings, commonly have an HVAC team that routinely monitors energy consumption and temperatures in stores to save on costs (particularly at night) and to alert store managers if temperatures in the stores fluctuate outside of an acceptable range that could prevent customers from shopping at the store. To support this temperature monitoring, the HVAC vendors need to be able to remote into the system to do maintenance (updates, patches, etc.) or to troubleshoot glitches and connectivity issues with the software.

Between Nov. 15 and Nov. 28 (Thanksgiving, the day before Black Friday—the biggest shopping day of the year), the attackers succeeded in uploading their card-stealing malicious software to a small number of cash registers within Target stores. Two days later, the intruders had pushed their malware to a majority of Target’s POS devices, and were actively collecting card records from live customer transactions. Target acknowledged that the breach exposed approximately 40 million debit and credit card accounts between Nov. 27 and Dec. 15, 2013.

One issue here is that the HVAC contractor was seemingly an easy target for the hackers. This is not too surprising, given that most small businesses cannot afford the highest standards of network security, nor do they have sufficient IT expertise to even know what they do not know. More distressing is the fact that Target (and many retailers) do not maintain, and are not required even by published industry standards to maintain, separate networks for payment and nonpayment operations. Those industry standards do require merchants to incorporate two-factor authentication for remote network access originating from outside the network by personnel and all third parties—including vendor access for support or maintenance. Whether the HVAC contractor was required to use two-factor authentication for access to Target’s system has not been disclosed.

Callout box with the title: The Target POS attack by the numbers [35].

40 million—The number of credit and debit card accounts thieves stole from Target between Nov. 27 and Dec. 15, 2013.

70 million—The number of records stolen that included the name, address, email address, and phone number of Target shoppers.

46—The percentage drop in profits at Target in the fourth quarter of 2013, compared with the year before.

200 million—Estimated dollar cost to credit unions and community banks for reissuing 21.8 million cards—about half of the total stolen in the Target breach.

100 million—The number of dollars Target says it will spend upgrading their payment terminals to support Chip-and-PIN-enabled cards.

18.00–35.70—The median price range (in dollars) per card stolen from Target and resold on the black market.

1 million–3 million—The estimated number of cards stolen from Target that were successfully sold on the black market and used for fraud before issuing banks got around to canceling the rest.

53.7 million—The income that hackers likely generated from the sale of 2 million cards stolen from Target and sold at the mid-range price of $26.85.

55 million—The number of dollars outgoing CEO Gregg Steinhafel got in executive compensation and other benefits on his departure as Target’s chief executive.

2.16 Social Engineering

Social engineering is when one person tricks another into sharing confidential information, for example, by posing as someone authorized to have access to that information. Social engineering can take many other forms. Indeed, any one-to-one communication medium can be used to perform social engineering attacks. Here we will briefly explore three forms of social engineering: spoofing email, phishing, and password guessing all of which can be a step toward an attack against embedded systems.

2.16.1 Spoofing email

An example of spoofing email might be that an attacker changes the name in the From field to be the name of the network administrator and sends the message to the CEO’s assistant. The message says the IT department is having trouble with the CEO’s account and asks the assistant to please help them out by supplying the CEO’s confidential password. Seeing the correct name of the network administrator, and not knowing how to look deeper at the inner workings of the email message, the assistant is likely to comply and provide the password. With the CEO’s credentials, the attacker can send messages that many people in the company will believe, which may allow the attacker to get at critical embedded systems within the company’s operations.
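
The reason this trick works so often is that the From header is just text chosen by the sender; nothing in the basic message format verifies it. A minimal sketch using Python’s standard email module (the addresses are placeholders, and the message is only constructed and printed, not sent):

    # Hedged sketch: the From header is whatever the sender writes into the message.
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "IT Department <netadmin@example.com>"   # arbitrary text, unverified here
    msg["To"] = "ceo.assistant@example.com"
    msg["Subject"] = "Urgent: CEO account problem"
    msg.set_content("We are having trouble with the CEO's account. Please reply with the password.")

    print(msg)   # defenses such as SPF, DKIM, and DMARC exist precisely because this field is untrusted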

2.16.2 Phishing

Phishing is closely related to spoofing email. Phishing has been used, for example, to install and execute CryptoLocker, a very malicious form of ransomware that encrypts files on the user’s machine and across their entire connected network and demands a ransom to provide the key to decrypt those files. One way it enters is via a spoofed email message that pretends to originate from a source the user trusts and has a relationship with. The message has an attached Word document with a name, and a message in the email body, that makes the user want to open the document. Once the document is opened, it starts running embedded Word macros. As those macros execute, CryptoLocker installs itself into the user’s Documents and Settings folder, using a randomly generated name, and adds itself to the list of programs in the registry that Windows loads automatically every time one logs on. It produces a lengthy list of random-looking server names in the domains .biz, .co.uk, .com, .info, .net, .org, and .ru. It tries to make a web connection to each of these server names in turn, trying one each second until it finds one that responds. Once it has found a server that it can reach, it uploads a small file that is essentially the “CryptoLocker ID.” The server then generates a public-private key pair unique to that ID, and sends the public key part back to the source computer. The private key part is tucked away where the victim cannot possibly access it. The malware on the computer uses this public key to encrypt all the files it can find that match an extensive list of extensions, covering file types such as images, documents, and spreadsheets. At this point, the bad guys have won and the victim has lost. CryptoLocker then pops up a “pay page,” giving the victim a limited time, typically 72 hours, to buy back the private key for their data, typically for 2 bitcoins or $300. Since the ransomers are using modern cryptography, once they have encrypted the victim’s disk, there are no back doors or shortcuts to help the victim out. What the ransomers have encrypted can only be decrypted using their private key. Therefore, if they do not provide that private key, the data is as good as gone. There are many situations where this could directly or indirectly affect embedded devices that are within the same network as the original victimized machine. Other types of phishing can more directly target embedded devices, such as those used widely in critical infrastructure.
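
The reason there is no shortcut is the asymmetric cryptography involved: anything encrypted with the public key the server hands out can be recovered only with the private key the attackers keep. A hedged sketch of that one property, using Python’s cryptography package (the “secret” stands in for a per-victim file-encryption key; this is not CryptoLocker’s actual protocol):

    # Hedged sketch of the asymmetric step: the public key encrypts, only the private key decrypts.
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # stays with the attacker
    public_key = private_key.public_key()                                         # sent to the victim machine

    secret = b"per-victim file-encryption key"       # illustrative stand-in
    ciphertext = public_key.encrypt(secret, oaep)     # all the victim's machine can compute

    # Without private_key there is no feasible way back from ciphertext to secret.
    assert private_key.decrypt(ciphertext, oaep) == secret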

2.16.3 Password guessing

It may not sound like an attack category but by far the easiest attack against embedded systems is still password guessing. Many devices like routers, PLCs, and even automotive diagnostic ports have factory-defined passwords that are supposed to be changed by the owner at the time of installation but frequently are not. Thus, passwords like “12345,” “password,” and “admin” are prevalent on millions of devices. As these devices increasingly are put on the Internet for legitimate purposes, those passwords leave a wide open pathway from the outside world into whatever the embedded device is attached to.

Many people use very simple easy to remember passwords such as “123456” or their pet’s name. These are extremely easy to guess. This type of behavior will affect consumer IoT devices going forward and with the enormous number of these devices projected into the future, weak passwords could create great problems if a coordinated attack were launched against a lot of these devices simultaneously.

Weak passwords keep getting weaker. That is because ordinary desktop computers can now test over a hundred million passwords per second using password cracking tools that run on a general-purpose CPU, and over 350 billion passwords per second using GPU-based password cracking tools. A user-selected eight-character password with numbers, mixed case, and symbols reaches an estimated 30-bit strength, according to NIST; 2^30 is only about 1 billion permutations and would take an average of 16 min to crack. If a hacker can access the encrypted passwords (Unix used to place them in /etc/passwd), then they can bypass any scheme that limits the number of failed attempts before a lockout. All they have to do is run a high-speed password cracker, hash (or encrypt) each guess, and compare the result to the entries in the encrypted password file.
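
A back-of-the-envelope calculation makes the point; the guess rates below are simply the figures quoted above, and real cracking speed depends heavily on the hash in use:

    # Rough crack-time arithmetic for a ~30-bit user-chosen password; illustrative only.
    keyspace = 2 ** 30                   # about 1.07 billion permutations

    for label, guesses_per_second in [("online, ~1 million guesses/s", 1e6),
                                      ("offline CPU, ~100 million guesses/s", 1e8),
                                      ("offline GPU rig, ~350 billion guesses/s", 350e9)]:
        seconds = keyspace / guesses_per_second      # worst case: search the whole keyspace
        print(f"{label:42s} {seconds:>14,.4f} s")
    # At roughly a million guesses per second the full keyspace falls in minutes,
    # in line with the 16 min figure quoted above; offline GPU attacks finish in milliseconds.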

Until two-factor authentication becomes ubiquitous, passwords remain most devices' sole protection, so the best plan of action is to use a password generator (such as SecureSafe) that stores the password in a reliable digital vault that does require two-factor authentication. That vault should have a very strong password that is not stored anywhere. The recommendation for generating solid passwords that have to be remembered is to use a passphrase approach, such as taking the first letter of each of 12–16 words from a poem you know well and turning a few letters into numbers and special characters. For example, you could turn an "l" into a 1, an "o" into a 0, an "a" into a &, and separate two phrases with an "=" or ";".
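As an illustration only, the following hypothetical sketch mechanizes the passphrase idea just described: it takes the first letter of each word and applies the example substitutions (l to 1, o to 0, a to &). A real password should, of course, be derived and handled more carefully than this.

```c
/* Illustration of the passphrase scheme described above: take the first
 * letter of each word of a memorable phrase, then apply a few personal
 * substitutions (here l->1, o->0, a->&, matching the examples in the text).
 * This is only a sketch of the idea, not a recommended generator. */
#include <stdio.h>
#include <ctype.h>

static void passphrase_to_password(const char *phrase, char *out, size_t outlen)
{
    size_t n = 0;
    int at_word_start = 1;
    for (const char *p = phrase; *p && n + 1 < outlen; p++) {
        if (isspace((unsigned char)*p)) { at_word_start = 1; continue; }
        if (!at_word_start) continue;        /* only the first letter of each word */
        at_word_start = 0;
        char c = (char)tolower((unsigned char)*p);
        if      (c == 'l') c = '1';
        else if (c == 'o') c = '0';
        else if (c == 'a') c = '&';
        out[n++] = c;
    }
    out[n] = '\0';
}

int main(void)
{
    char pw[64];
    passphrase_to_password("shall I compare thee to a summers day thou art more lovely",
                           pw, sizeof pw);
    printf("%s\n", pw);   /* prints "sictt&sdt&m1" for this 12-word phrase */
    return 0;
}
```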

Now that we have looked at what types of embedded devices have had high profile attacks against them and how those attacks have worked, in the next section we will put into context how bad things could get with certain types of attacks against key embedded devices.

2.17 How Bad Could This Be?

The network connectivity that the United States has used to tremendous advantage economically and militarily over the past 20 years has made the country more vulnerable than ever to cyber attacks. At the same time, our adversaries are far more capable of conducting such attacks.

Symantec reported blocking over 5.5 billion attacks with its software in 2011 alone, finding that the average breach exposed 1.1 million identities and that nearly 5000 new vulnerabilities were identified in the calendar year. Over 400 million unique variants of malware attempted to take advantage of those vulnerabilities, up over 40% from 2010. Attack toolkits are easy to find, available in web forums or on the underground black market, and cost only $40–$4000 to procure. Use of these widely available tools allows almost anyone to exploit any known and uncorrected vulnerability.

At the same time, the cyber world has moved from exploitation and disruption to destruction.

The impact of a destructive cyber attack on the civilian population would be frightening as it could mean no electricity, money, communications, TV, radio, or fuel (electrically pumped). In a short time, food and medicine distribution systems would be ineffective; transportation would fail or become so chaotic as to be useless. Law enforcement, medical staff, and emergency personnel capabilities could be expected to be barely functional in the short term and dysfunctional over sustained periods. If the attack’s effects were reversible, damage could be limited to an impact equivalent to a power outage lasting a few days. If an attack’s effects cause physical damage to control systems, pumps, engines, generators, controllers, etc., the unavailability of parts and manufacturing capacity could mean months to years are required to rebuild and reestablish basic infrastructure operation.

Let’s look at a few real instances of attacks and extrapolate that to the growth of the connected embedded devices projected over the next decade.

2.17.1 Heartbleed

Heartbleed is a security bug disclosed in Apr. 2014 in the OpenSSL cryptography library, which is a widely used implementation of the transport layer security (TLS) protocol. SSL (secure sockets layer) and TLS are frequently used interchangeably, but TLS is the newer version of the protocol. The "s" in https indicates the connection is protected by SSL/TLS; it is what displays the padlock symbol in one's browser and protects transmission of confidential or private data such as social security or credit card numbers. Heartbleed has been shown to have affected about 17% of the Internet's secure web servers and the clients that connect to them. Heartbleed may be exploited regardless of whether the party using a vulnerable OpenSSL instance for TLS is a server or a client. It results from improper input validation (due to a missing bounds check) in the implementation of the TLS heartbeat extension [36], which is designed to keep sessions alive; the bug's name derives from "heartbeat." The vulnerability is classified as a buffer overread, a situation in which more data can be read than should be allowed, meaning data can be exfiltrated.

The Heartbeat Extension keeps the session alive by having a computer at one end of a connection to send a “Heartbeat Request” message, consisting of a payload, typically a text string, along with the payload’s length as a 16-bit integer. The receiving computer then must send exactly the same payload back to the sender. The affected versions of OpenSSL allocate a memory buffer for the message to be returned based on the length field in the requesting message, without regard to the actual size of that message’s payload. Because of this failure to do proper bounds checking, the message returned consists of the payload, possibly followed by whatever else happened to be in memory after the allocated buffer.

Heartbleed is exploited by sending a malformed heartbeat request with a small payload and a large length field to the vulnerable party (usually a server) in order to elicit the victim's response, permitting attackers to read up to 64 kilobytes of the victim's memory that was likely to have been used previously by OpenSSL. Where a Heartbeat Request might ask a party to "send back the four-letter word 'bird'," resulting in a response of "bird," a "Heartbleed Request" (a malicious heartbeat request) of "send back the 500-letter word 'bird'" would cause the victim to return "bird" followed by whatever 496 characters the victim happened to have in active memory. Attackers in this way could receive sensitive data, compromising the confidentiality of the victim's communications. Although an attacker has some control over the size of the disclosed memory block, they have no control over its location and therefore cannot choose what content is revealed.

The problem can be fixed very simply by ignoring Heartbeat Request messages that ask for more data than their payload needs.
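The following simplified sketch, with illustrative names rather than OpenSSL's actual code, shows the vulnerable pattern and the one-line bounds check that fixes it.

```c
/* Simplified sketch of the Heartbleed flaw and its fix. The structure and
 * names here are illustrative only; they are not the actual OpenSSL code.
 * The bug: the response buffer is filled using the length field claimed by
 * the requester rather than the number of bytes actually received. */
#include <stdint.h>
#include <string.h>
#include <stdlib.h>

struct heartbeat_msg {
    uint16_t       claimed_len;   /* length field supplied by the peer        */
    const uint8_t *payload;       /* payload bytes actually received          */
    size_t         received_len;  /* how many payload bytes really arrived    */
};

/* Vulnerable pattern: trusts claimed_len, so memcpy reads past the payload. */
uint8_t *build_response_vulnerable(const struct heartbeat_msg *m, size_t *out_len)
{
    uint8_t *resp = malloc(m->claimed_len);
    if (!resp) return NULL;
    memcpy(resp, m->payload, m->claimed_len);   /* overread if claimed_len > received_len */
    *out_len = m->claimed_len;
    return resp;
}

/* Fixed pattern: discard requests whose claimed length exceeds what arrived. */
uint8_t *build_response_fixed(const struct heartbeat_msg *m, size_t *out_len)
{
    if (m->claimed_len > m->received_len)       /* the missing bounds check */
        return NULL;                            /* silently drop the request */
    uint8_t *resp = malloc(m->claimed_len);
    if (!resp) return NULL;
    memcpy(resp, m->payload, m->claimed_len);
    *out_len = m->claimed_len;
    return resp;
}
```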

Heartbleed is registered in the Common Vulnerabilities and Exposures (CVE) system as CVE-2014-0160. A fixed version of OpenSSL was released on Apr. 7, 2014, the same day Heartbleed was publicly disclosed.

At the time of disclosure, some 66% of the Internet’s secure web servers certified by trusted authorities were believed to be vulnerable to the attack, allowing theft of the servers’ private keys and users’ session cookies and passwords. The Electronic Frontier Foundation, Ars Technica, and Bruce Schneier [37] all deemed the Heartbleed bug “catastrophic.” Forbes cybersecurity columnist Joseph Steinberg wrote, “Some might argue that [Heartbleed] is the worst vulnerability found (at least in terms of its potential impact) since commercial traffic began to flow on the Internet” [38].

Although the bug received more attention due to the threat it represents for servers, TLS clients using affected OpenSSL instances are also vulnerable. Relevant to the embedded device world, a "reverse Heartbleed" has malicious servers exploiting Heartbleed to read data from a vulnerable client's memory. This shows that Heartbleed is not just a server-side vulnerability but also a client-side vulnerability, because the server, or whomever you connect to, is as able to ask you for a heartbeat back as you are to ask them. Those clients are in many cases embedded systems, so contrary to many people's initial assumptions, Heartbleed is not just an issue for Web servers and other purely computer systems. OpenSSL is very popular in client software and somewhat popular in networked appliances, which have the most inertia when it comes to getting updates.

To wit, Cisco Systems has identified 75 of its products as vulnerable, including IP phone systems and telepresence (video conferencing) systems [39]. Games including Steam, Minecraft, Wargaming.net, League of Legends, GOG.com, Origin, Sony Online Entertainment, Humble Bundle, and Path of Exile were affected and subsequently fixed as well [40].

2.17.2 Shellshock

Shellshock, also known as Bashdoor, is a family of security bugs in the widely used Unix Bash shell, the first of which was disclosed on Sep. 24, 2014. Many Internet-facing services, such as some web server deployments, use Bash to process certain requests, allowing an attacker to cause vulnerable versions of Bash to execute arbitrary commands. This can allow an attacker to gain unauthorized access to a computer system [41].

In Unix-based operating systems, and in other operating systems that Bash supports, each running program has its own list of name/value pairs called environment variables. When one program starts another program, it provides an initial list of environment variables for the new program. Separately from these, Bash also maintains an internal list of functions, which are named scripts that can be executed from within the program. Since Bash operates both as a command interpreter and as a command, it is possible to execute Bash from within itself. When this happens, the original instance can export environment variables and function definitions into the new instance. Function definitions are exported by encoding them within the environment variable list as variables whose values begin with parentheses (“()”) followed by a function definition. The new instance of Bash, upon starting, scans its environment variable list for values in this format and converts them back into internal functions. It performs this conversion by creating a fragment of code from the value and executing it, thereby creating the function “on-the-fly,” but affected versions do not verify that the fragment is a valid function definition. Therefore, given the opportunity to execute Bash with a chosen value in its environment variable list, an attacker can execute arbitrary commands or exploit other bugs that may exist in Bash’s command interpreter. Increasingly, embedded systems run Unix variants and some run the bash shell making this dangerous bug relevant to embedded systems.
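The mechanism can be demonstrated with a few lines of C modeled on the widely published Shellshock test string: a patched Bash runs only the intended command, while a vulnerable Bash also executes the command smuggled in after the function definition. The variable name PROBE is arbitrary.

```c
/* Minimal demonstration of the Shellshock mechanism, modeled on the widely
 * published test string. On a patched bash the child only prints "hello";
 * on a vulnerable bash the trailing command in the environment variable is
 * executed as well. Run only on systems you are entitled to test. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* A value that looks like an exported function definition, followed by
     * an extra command that vulnerable versions of bash execute while
     * importing the "function" from the environment. */
    setenv("PROBE", "() { :; }; echo SHELLSHOCK: extra command executed", 1);

    /* Start a child bash; it inherits PROBE and scans it on startup. */
    int rc = system("bash -c 'echo hello from the child shell'");
    if (rc == -1)
        perror("system");
    return 0;
}
```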

The discoverer of Shellshock contacted Bash's maintainer to report the original bug, called "Bashdoor." Together the discoverer and maintainer developed a patch. The bug was assigned the CVE identifier CVE-2014-6271. Once Bash updates with the fix were ready for distribution, the bug was announced to the public. Within days of that publication, intense scrutiny of the underlying design flaws uncovered a variety of related vulnerabilities, which the developers addressed with a series of further patches.

Attackers exploited Shellshock within hours of the initial disclosure by creating botnets of compromised computers to perform distributed denial-of-service attacks and vulnerability scanning. Security companies recorded millions of attacks and probes related to the bug in the days following the disclosure.

Shellshock could potentially compromise millions of unpatched servers and other systems. Accordingly, it has been compared to the Heartbleed bug in its severity [42].

2.17.3 Blackouts

One of the most critical applications of embedded systems has to be across the broad expanse of the US power grid. However, this system is quite old and vulnerable to attack. In Oct. 2012, US defense secretary Leon Panetta warned that the United States was vulnerable to a "cyber Pearl Harbor" that could derail trains, poison water supplies, and cripple power grids. The next month, Chevron became the first US corporation to admit that Stuxnet had spread across its machines, confirming speculation that the worm had escaped well beyond its intended target.

A doomsday blackout scenario is not far-fetched. Meanwhile, the asset owners across the power grid know that their SCADA controllers are vulnerable but will not move without regulations forcing them to. Their profit margins are thin, and no one wants to incur the costs of cyber-motivated upgrades that would cost them dearly while their competitors wait and don't incur those costs.

Let’s examine the potential impact of a massive blackout by examining the (noncyber-attack instigated) Aug. 2003 Northeast blackout. On that Aug. 14, an electrical disruption caused a loss of electrical service to consumers in New York, Michigan, Ohio, Pennsylvania, New Jersey, Connecticut, Vermont, Massachusetts, and substantial parts of the Canadian province of Ontario. In total, about 50 million US residents lost electrical power. Many also lost water service, and still more found it difficult or impossible to get gasoline for automobiles and trucks. The total impact on US workers, consumers, and taxpayers was a loss of approximately $6.4 billion.

If we extrapolate to a full US blackout that lasts for a week, the numbers are staggering. Instead of affecting 50M people it would affect all 319M people in the United States. Assume the worst case, that it hits on a Monday morning (which, if it were intentional, is what would happen), and further assume it lasts a full 7 days before we can recover. That makes the economic impact jump to $185.8B in 2003 dollars. Adjusting to 2015 (assuming a 27.6% cumulative rate of inflation) increases that number to $237B in 2015 dollars. To account, roughly, for how much more dependent every aspect of the economy has become on computers and the Internet, we will assume 3× what it was in 2003, putting the final damage estimate at no less than $711B. That represents 4% of 2014's GDP. To put that in perspective, the worst quarterly drop of the 2009 Great Recession was about 4%, so a blackout of that magnitude would send the country into a tailspin of unprecedented proportions, leading perhaps to something much worse than the Great Recession.
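To keep the assumptions visible, the arithmetic of that extrapolation can be written out explicitly; the starting figure, inflation factor, dependence multiplier, and approximate 2014 GDP used below are the assumptions stated above, not independent estimates.

```c
/* The chapter's damage extrapolation, reproduced step by step so the
 * assumptions are explicit. The $185.8B starting figure, the 27.6%
 * inflation adjustment, and the 3x "increased dependence" factor are the
 * text's own assumptions; 2014 US GDP of roughly $17,400B is approximate. */
#include <stdio.h>

int main(void)
{
    double loss_2003_dollars = 185.8;     /* $B, week-long nationwide outage (text)  */
    double inflation         = 1.276;     /* 2003 -> 2015 cumulative inflation       */
    double dependence_factor = 3.0;       /* assumed growth in cyber dependence      */
    double gdp_2014          = 17400.0;   /* $B, approximate 2014 US GDP             */

    double loss_2015 = loss_2003_dollars * inflation;     /* ~ $237B */
    double loss_adj  = loss_2015 * dependence_factor;     /* ~ $711B */
    printf("Estimated loss: $%.0fB (%.1f%% of 2014 GDP)\n",
           loss_adj, 100.0 * loss_adj / gdp_2014);
    return 0;
}
```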

What may be most frightening is that the United States could suffer a coast-to-coast blackout if saboteurs knocked out just nine of the country’s 55,000 electric-transmission substations on a scorching summer day, according to a study by the Federal Energy Regulatory Commission, which concluded that coordinated attacks in each of the nation’s three separate electric systems could cause the entire power network to collapse [43]. Frighteningly, the FERC analysis indicates that knocking out nine of those key substations could plunge the country into darkness for weeks, if not months. We know that nation-state adversaries have probed the electric grid and have demonstrated the ability to attack various parts of the system making many believe they could shut down parts or all of it if they were prepared for the consequences of doing so.

About once every 4 days, part of the power grid is struck by a cyber or physical attack. Although the repeated security breaches have never resulted in the type of cascading outage that swept across the Northeast in 2003, they have sharpened concerns about vulnerabilities in the electric system. A widespread outage lasting even a few days could disable devices ranging from ATMs to cellphones to traffic lights, and could threaten lives if heating, air conditioning, and health care systems exhaust their backup power supplies.

3 Security & Computer Architecture

In this section, we will discuss a new class of inherently secure processor, ideally suited for embedded devices, that attempts to thwart a broad class of attacks typically used in the worst types of cyber incursions. The processor described evolved out of a DARPA program initiated in 2010 as a reaction to the Stuxnet attack against SCADA controllers and the centrifuges they controlled, which resulted in an estimated 2000 of these uranium enrichment centrifuges being irreparably destroyed.

3.1 Processor Architectures and Security Flaws

The Von Neumann architecture, also known as the Princeton architecture, is a computer architecture based on that described in 1945 by the mathematician and physicist John Von Neumann. He described an architecture for an electronic digital computer with parts consisting of a processing unit containing an arithmetic logic unit (ALU) and processor registers, a control unit containing an instruction register and program counter (PC), a memory to store both data and instructions, external mass storage, and input and output mechanisms. The meaning has evolved to be any stored-program computer in which an instruction fetch and a data operation cannot occur at the same time because they share a common bus.

The design of a Von Neumann architecture is simpler than the more modern Harvard architecture which is also a stored-program system but has one dedicated set of address and data buses for reading data from and writing data to memory, and another set of address and data buses for fetching instructions. A stored-program digital computer is one that keeps its program instructions, as well as its data, in read-write, random-access memory.

A stored-program design also allows for self-modifying code. One early motivation for such a facility was the need for a program to increment or otherwise modify the address portion of instructions, which had to be done manually in early designs. This became less important when index registers and indirect addressing became usual features of machine architecture. Another use was to embed frequently used data in the instruction stream using immediate addressing. Self-modifying code has largely fallen out of favor, since it is usually hard to understand and debug, as well as being inefficient under modern processor pipelining and caching schemes.

On a large scale, the ability to treat instructions as data is what makes assemblers, compilers, linkers, loaders, and other automated programming tools possible. One can “write programs which write programs.” On a smaller scale, repetitive I/O-intensive operations such as the BITBLT image manipulation primitive or pixel & vertex shaders in modern 3D graphics were considered inefficient to run without custom hardware. These operations could be accelerated on general purpose processors with “on the fly compilation” (“just-in-time compilation”) technology, e.g., code-generating programs—one form of self-modifying code that has remained popular.

There are drawbacks to the Von Neumann design especially when it comes to security, which was not even conceived as a problem until the 1980s. Program modifications can be quite harmful, either by accident or design. Since the processor just executes the word the PC points to, there is effectively no distinction between instructions and data. This is precisely the design flaw that attackers use to perform code injection attacks and it leads to the theme of the inherently secure processor: the processor cooperates in security.

3.2 Solving the Processor Architecture Problem

Since vulnerabilities are inevitable using today’s—and the foreseeable future’s—programming methods, and they make our systems unsafe by default, how can we win? What if the underlying processor didn’t allow the vulnerabilities to become an attack vector? What if the processor was inherently secure even if the software running on it has vulnerabilities? Inherent security means safe by default even when programmers make mistakes.

This is achievable. It is possible to change the architecture of our processors, use transistors for a security purpose, and build a new computing foundation that is inherently secure without giving up backwards compatibility to existing processor instruction sets, operating systems, and programming languages.

We can add metadata tags to every word in memory, a way to process those tags in parallel with instruction processing, and a language of micro-policies that enforce security policies based on the information in the metadata tags. A good way to think about what these security/safety micro-policies do for our current inherently insecure processors is as adding interlocks. When you build a nuclear power plant, you include several levels of safety interlocks to prevent the worst things from happening: temperature sensors to shut things down if they get too hot, and shunt valves to dump reactants. Machines have physical interlock buttons that shut down dangerous cutters if they are opened (including lowly paper shredders). Radiation machines have physical interlocks to prevent removal of the shield when the radiation beam is set too high. Even highways have physical guard rails and barriers to prevent drivers from driving off ramps and bridges into oncoming traffic even if the driver falls asleep. So, our goal is to provide a set of interlocks in the processor that detect bad things that shouldn't happen and prevent the most egregious of them. Let's look in some detail now at the three key ideas that create these security interlocks: programmable metadata tags, a processor interlock for policy enforcement (PIPE), and a set of micro-policies.

3.3 Fully Exploiting Metadata Tags

The first architectural modification to apply to the inherently secure processor is programmable metadata tags. Here is where we spend the most silicon (transistors) to support security. Here the metadata tag is a word of memory attached to and permanently associated with a "normal" word of memory. Call that normal word the payload. A payload might be an instruction for the processor, a value to be processed, or a pointer to a variable stored in memory. A tag attached to a payload word of memory is not a new concept. Lisp machines used it for identifying types [44]. A machine made by the long-gone computer company Burroughs had a tag bit in each word to identify the word as a control word or a numeric word. This was partially a security mechanism to stop programs from being able to corrupt control words on the stack. Later it was realized that the 1-bit control/numeric distinction was a powerful idea, and it was extended to three bits outside the 48-bit native word, forming a more general tag used for things like "this is the top of the stack," "this is a return address," "this is code," "this is data," "this is the index value for a loop," and other similar identifications [45]. Tagged architectures have been the subject of several research papers [46,47].

Recently some major processor manufacturers have recognized the importance of using tagged memory to protect memory. Intel is introducing MPX (memory protection extensions), a set of new instructions extending the x86 instruction set architecture.

With compiler, runtime library, and operating system support, these instructions are designed to check pointer references whose normal compile-time intentions are maliciously exploited at runtime via buffer overflows. MPX adds new bounds registers and new instruction set extensions that operate on those registers. The result is x86 support for memory protection that, once compilers are modified to use it, will protect against buffer overflows. Also recently, Oracle/Sun announced the SPARC M7, which uses the highest four bits of a pointer to keep track of 2^4 (or 16) different regions of memory (buffers) and to protect them from out-of-bounds reads or writes. This is a step in the right direction, but only the simplest programs create as few as 16 distinct buffers, so this does not sufficiently address the buffer overflow vulnerability. While both these developments are encouraging, because they show the major vendors are cognizant of the huge problem, they have a long way to go in addressing all of the other already identified serious Common Vulnerabilities and Exposures.

For a real solution, we need to employ a comprehensive approach to metadata tagging as shown in Fig. 2. These tags can point to software-defined and interpreted metadata records. This metadata can express provenance, access control (who can read or write the payload word), executability (does the payload contain an instruction), and legal branch target (does the payload contain address that it is legal for the program to branch to and continue execution). Very important is the fact that these metadata tags are permanently bonded to the payload word, are uninterpreted by the hardware, can be a pointer to an arbitrary data structure, and are not accessible from the application. The hardware component that will process these tags is called a PIPE.
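Before looking at the hardware, it may help to picture the tagged-word model in software. The sketch below is illustrative only; the hardware does not literally store C structs, and the field names are invented for exposition.

```c
/* A software picture of the tagged-word model described above: every payload
 * word carries an inseparable metadata tag, and the tag is itself a pointer
 * to an arbitrary, policy-defined metadata record. Names are illustrative. */
#include <stdint.h>

struct metadata {                 /* policy-defined; the hardware never interprets it */
    uint32_t color;               /* e.g. memory-safety color of the enclosing block  */
    uint32_t flags;               /* e.g. executable, legal-branch-target, Monitor    */
    const struct metadata *next;  /* tags can chain records for composite policies    */
};

struct tagged_word {
    uint64_t               payload;  /* instruction, value, or pointer                */
    const struct metadata *tag;      /* permanently bonded to the payload; not        */
                                     /* addressable by application or OS code         */
};
```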

Fig. 2 Metadata tags are added to every word in memory including program counter (PC), instructions, registers, and main memory. Tags describe critical attributes about the Data payload used by the micro-policies. Metadata tags are highly protected.

3.4 Processor Interlocks for Policy Enforcement

To process these tags in parallel with the standard ALU—the heart of the processing pipeline of any standard processor—we create a PIPE, the second architectural modification to the processor, and add it into the instruction processing pipeline such that it operates in parallel with the ALU (see Fig. 3). Note that in the diagram, the PC, the instruction-store, the register file, and all the general-purpose memory have the extra word of metadata associated with them shown in green. As the ALU executes an instruction, the PIPE operates in parallel checking and updating appropriate tags. The hardware makes sure that payloads are sent to the ALU while tags are sent to the PIPE so that parallel processing of tags happens at the same speed as instruction processing.

Fig. 3 The standard processor pipeline continues to operate with no instructions added or subtracted. But every component of a computation carries a metadata tag; these tags are the inputs to the PIPE, which acts as an interlock on the processor, preventing it from violating any of the active policies.

3.5 Micro-Policies

The “secret sauce” to make the metadata maximally flexible and powerful is a set of micro-policies (abbreviated as μ-policies hereon in). These policies define what operations are allowed and specify how metadata is updated as a result of an instruction being processed. Examples of policies include:

Access control: Fine-grained control over who has what kind of access to a piece of data.
Type safety: Making sure types declared in the program are manipulated as those types, not just as raw data.
Instruction permission: Protects special sections of code from being executed by user-level applications; part of the ISP self-protection mechanism.
Memory safety: Protecting buffers in memory from overread or overwrite.
Control-flow integrity: Guaranteeing that only jumps to program-defined locations are made at run-time.
Taint tracking/information flow control: Tracking the influences of values through a computation to prevent untrusted values from influencing critical decisions and to limit the flow of sensitive data (e.g., guarantee encryption if data leaves the system).


As the PIPE processes tags associated with an instruction, it takes the relevant metadata and sends it to the installed set of software-defined policies where the metadata is checked against those policies and, if the result is allowed, determines the result tags. If the result is not allowed by policy, that instruction results in a security violation and the instruction is voided.

3.5.1 μ-Policies enforce security

The programming model for μ-policies involves abstracting the hardware. At the hardware level, as we have shown, there are metadata bits attached to each word. There is a hardware PIPE whose job it is to resolve this metadata.

The programmer should not have to worry about limits such as the number of bits of metadata or the complexity of the rule logic inside the PIPE. The metadata tag will be viewed as a pointer that can point to a data structure of arbitrary size.

A policy therefore is a collection of rules that take the form

(opcode, PCtag, INSTtag, OP1tag, OP2tag, MRtag) → (allow?, PCtag′, Resulttag)

which we call a transfer function. Abstractly, the PIPE is just this function: for each instruction it maps the opcode and the tags on the instruction's inputs to an allow/disallow decision and the tags to place on the results.

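To make the shape of a rule concrete, here is a minimal software model of the transfer function; the types and names are illustrative, since in hardware the mapping is resolved by dedicated logic rather than a C function call.

```c
/* A software model of the PIPE transfer function as described above.
 * Names and types are illustrative only. */
#include <stdbool.h>

typedef const void *tag_t;          /* opaque pointer into metadata memory */

enum opcode { OP_ADD, OP_LOAD, OP_STORE, OP_CALL, OP_JUMP /* ... */ };

struct pipe_inputs  { enum opcode op; tag_t pc, inst, op1, op2, mr; };
struct pipe_outputs { bool allow; tag_t pc_out, result; };

/* One installed micro-policy: given the input tag vector, decide whether the
 * instruction may proceed and what tags its results carry. */
typedef struct pipe_outputs (*transfer_fn)(const struct pipe_inputs *in);
```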

3.5.2 Memory safety μ-policy

The memory safety policy is designed to enforce spatial and temporal safety. Since memory misuse accounted for roughly 70% of all exploits in Verizon's 2014 ranking of the top vulnerability categories, this is the first and probably most important policy to consider.

This policy will work for both heap-allocated data as well as compiler stack-based allocations. We want this policy to protect against both spatial safety violations (e.g., accessing an array out of its bounds) and temporal safety violations (e.g., referencing through a pointer after the region has been freed). Such violations are a common source of serious security vulnerabilities such as heap-based buffer overflows, confidential data leaks, and exploitable use-after-free, and double-free bugs.

Many μ-policies will require assistance to perform their functions. That assistance is called a monitor service. For memory safety, the monitor services for heap memory are the allocation and freeing routines that are part of the set of operating system services. The allocation and freeing monitor services are parameterized by two functions, malloc() and free(), that are assumed to satisfy certain high-level properties: the malloc monitor service first searches the list of block descriptors for a free block of at least the required size, cuts off the excess if needed, generates a fresh color, initializes the new memory block with each word’s metadata having that color, and returns the address of the start of the block which is a pointer to the newly allocated memory (or buffer). The free monitor service reads the pointer color, deallocates the corresponding block, tags its cells with a special freed tag F, and updates the block descriptors. The F tags prevent any remaining pointers to the deallocated block from being used to access it after deallocation. If a later allocation reuses the same memory, it will be tagged with a different (larger) color, so these dangling pointers will still be unusable.

The method we will use for this μ-policy, shown in Fig. 4, is to give each pointer a unique "color" and then to color each memory slot in the allocated buffer with the same color. When the memory is later freed, all slots are recolored to the uncolored value (i.e., 0). A "color" is just a numeric value; a metadata tag of number-of-bits bits provides 2^number-of-bits distinct colors (typically 32 bits, which provides over 4 billion unique values).

Fig. 4 C language calls to malloc() and subsequent references to the buffers so created are shown on the left. On the right are shown the metadata tags and the payload data each tag describes. Each tag is assigned a unique color, as is the pointer returned from malloc(). This information enables the memory safety policy to determine whether a memory reference is legal and should be allowed to complete.
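A rough software sketch of the malloc() and free() monitor services described above might look as follows; the structures are illustrative, since real monitor services run inside the protected metadata subsystem rather than as ordinary application code.

```c
/* Sketch of the malloc()/free() monitor services for the memory-safety
 * micro-policy: each allocation gets a fresh color, every word in the block
 * is tagged with that color, and free() retags the block with a "freed"
 * marker so dangling pointers no longer match. Illustrative only. */
#include <stdint.h>
#include <stdlib.h>

#define TAG_FREED     0u            /* the uncolored / freed value           */
static uint32_t next_color = 1;     /* fresh colors are simply a counter     */

struct tagged_cell { uint64_t payload; uint32_t color; };

struct tagged_ptr {                 /* a pointer value and the color on it   */
    struct tagged_cell *block;
    size_t              nwords;
    uint32_t            color;
};

struct tagged_ptr monitor_malloc(size_t nwords)
{
    struct tagged_ptr p = { calloc(nwords, sizeof *p.block), nwords, TAG_FREED };
    if (!p.block) { p.nwords = 0; return p; }
    p.color = next_color++;                     /* fresh color for this block     */
    for (size_t i = 0; i < nwords; i++)
        p.block[i].color = p.color;             /* color every word in the block  */
    return p;
}

void monitor_free(struct tagged_ptr *p)
{
    for (size_t i = 0; i < p->nwords; i++)
        p->block[i].color = TAG_FREED;          /* any dangling pointer now mismatches */
    /* ...block descriptors updated and the block returned to the free list here... */
    p->color = TAG_FREED;
}
```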

The transfer function for this μ-policy is

(LOAD, PCtag, INSTtag, R1tag, -, MRtag) → (allow if MRtag == R1tag && PCtag == INSTtag)

which says that the LOAD instruction being executed will be allowed only if the tag on the memory reference (MR, the address in memory that is about to be read) is the same (represented as ==) as the tag on the pointer (R1). We additionally require that the PC tag match the color of the block to which the PC points (its INST tag). This ensures that the PC cannot be used to leak information about inaccessible frames by loading instructions from them.

All other slots in the formula are do-not-cares for this instruction. We do allow adding and subtracting integers from pointers. The result of such pointer arithmetic is a pointer with the same color. The new pointer is not necessarily in bounds, but the rules for the LOAD and STORE opcodes will prevent invalid accesses. (Computing an out-of-bounds pointer is not a violation per se; reading or writing through it is, and that will be caught by the rules for LOAD and STORE.)
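Written out as ordinary code, the LOAD check and the color-preserving pointer arithmetic look roughly like this (an illustration of the rule, not the hardware implementation):

```c
/* The LOAD rule from the transfer function above, written out as a check.
 * mr_tag is the color on the word being read, ptr_tag the color on the
 * pointer in R1, and pc_tag/inst_tag the colors tied to the executing code.
 * Purely illustrative; the real check is a rule-cache hit, not C code. */
#include <stdbool.h>
#include <stdint.h>

bool memory_safety_allows_load(uint32_t pc_tag, uint32_t inst_tag,
                               uint32_t ptr_tag, uint32_t mr_tag)
{
    return mr_tag == ptr_tag      /* pointer may only touch its own block   */
        && pc_tag == inst_tag;    /* PC stays within the block it points to */
}

/* Pointer arithmetic simply propagates the color; bounds are enforced only
 * when the pointer is dereferenced by a LOAD or STORE. */
uint32_t color_after_pointer_add(uint32_t ptr_tag /*, offset ignored */)
{
    return ptr_tag;
}
```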

This simple μ-policy, in conjunction with support from the operating system malloc() and free(), and from the compiler as it creates stack frames as functions are called, provides very powerful memory safety, which prevents buffer overflows when both reading and writing into memory.

3.5.3 Control flow integrity μ-policy

The memory safety policy just discussed leads people to apply the policy to make code nonwritable. However, that didn't stop attackers. They figured out that they could use your own code against you through return-oriented programming (ROP) attacks. These are a relatively new type of attack that has grown in popularity with attackers and is very dangerous. ROP is a computer security exploit technique that allows an attacker to execute code even when memory protection is in place. In this technique, an attacker gains control of the call stack to hijack program control flow and then executes carefully chosen machine instruction sequences, called "gadgets." Each gadget typically ends in a return instruction and is located in a subroutine within the existing program and/or shared library code. Chaining these gadgets together enables an attacker to perform arbitrary computation just with these return sequences. To understand the policy that prevents ROP attacks, it is important to understand that these attacks work by returning into the middle of a sequence of code, using returns (i.e., control-flow branches) that did not exist in the original program. Current processors will blindly run the code for the attacker.

The control flow integrity (CFI) policy dynamically enforces that all indirect control flows (computed jumps) adhere to a fixed control flow graph (CFG) most commonly obtained from a compiler. This policy is another interlock that prevents control-flow-hijacking attacks by locking down control transfers to only those intended by the program.

The sample code we will use to illustrate this is shown in Fig. 5. While in function foo(), function bar() is called. The address at which that call occurs is t1. The address where bar() begins is t2. And the address in bar() where it is about to return to its caller is t3. Somewhere else in the program, assume bar() is also called from address t42, as well as from a few other places. Thus the full list of legal addresses bar() can be called from includes t1, t42, and a few others, as shown at the bottom of the figure. As you will see, CFI uses tags to distinguish the memory locations containing instructions and the sources and targets of indirect jumps, while using the PC tag to track execution history (the sources of indirect jumps).

Fig. 5 The compiler knows the legal call graph and when the Control Flow Integrity policy is active, the control flow is tracked through the metadata tag on the PC. Here as bar() is called, the list of legal addresses where bar() can be called from is tracked such that if a return is corrupted by an attacker, the PIPE knows the processor is trying to return to an illegal address and halts that instruction.

On a call to the function bar() from the address t1, the μ-policy that governs what happens is described by the following transfer function

(CALL, none, t1, R1tag, -, -) → (true, t1, -)

which declares that the tag on the call instruction (t1) is copied to the PC tag as a result of the CALL instruction; the PC now points at the first instruction of bar(), whose tag is t2. The second half of CFI enforcement comes from the following transfer function

(non-CALL, t1, t2, -, -, -) → (t1 ∈ t2, none, -)

which says that whenever the processor is not executing a CALL instruction and the PC is tagged (here with t1, because we just executed a CALL whose instruction was tagged t1), the PC's tag must appear in the list of "legal caller tags" carried on the current instruction (t2 is that list, and t1 is in it); the current instruction must therefore be the target of a call. The PIPE also untags the PC, as shown by the none in the PC tag spot on the right-hand side of the transfer function. These two transfer functions direct the PIPE to strictly enforce the CFG and only allow control transfers to those locations specified in the program, thus creating another interlock that thwarts ROP attacks.
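A software model of these two CFI rules, with invented structure names, might look like the following sketch.

```c
/* A software model of the two CFI rules above. Each instruction that is a
 * legal target of an indirect call or return carries, as its tag, the set of
 * source addresses allowed to transfer control to it; the PC tag records
 * where the last transfer came from. Illustrative, not the hardware rules. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct target_tag {                    /* tag on an instruction that may be a target */
    const uintptr_t *legal_sources;    /* e.g. { t1, t42, ... }                       */
    size_t           count;
};

/* Rule 1: executing a CALL at source address `src` stamps the PC tag. */
uintptr_t cfi_on_call(uintptr_t src) { return src; }

/* Rule 2: on the next (non-CALL) instruction, a stamped PC tag must appear in
 * the target's legal-source set, and the PC tag is then cleared. */
bool cfi_check_target(uintptr_t pc_tag, const struct target_tag *inst_tag)
{
    for (size_t i = 0; i < inst_tag->count; i++)
        if (inst_tag->legal_sources[i] == pc_tag)
            return true;          /* transfer allowed; caller clears the PC tag */
    return false;                 /* control-flow violation: instruction voided */
}
```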

3.5.4 Taint tracking μ-policy

Here we will show how a taint tracking policy can be used quite generally to “taint” data such that certain rules can be enforced throughout the system. Taint tracking can enforce rules like “if the data is not public it may not leave the system unencrypted” and “at this critical decision juncture, if the data about to be used is untrusted do not proceed.” Taint tracking can also be used for a safety policy that might be used in a medical device like “if you are about to pump insulin and more than N milligrams have been pumped in 24 h do not pump any, alert the user, and shut down the processor.” The policy, as applied to the ADD instruction, would be represented by the following transfer function

(ADD, PCtag, INSTtag, OP1tag, OP2tag, -) → (true, PCtag, union(PCtag, INSTtag, OP1tag, OP2tag))

where the tag on the result is formed from the union of the tags on all the operands, including the PC, INST, OP1, and OP2.
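A minimal sketch of taint propagation, representing a taint set as a bit mask for simplicity (real tags point to richer metadata records), looks like this:

```c
/* Taint propagation for ADD as described above: the result's taint is the
 * union of the taints on everything that influenced it. The bit-mask
 * representation and the specific taint sources are illustrative only. */
#include <stdint.h>

typedef uint32_t taint_t;               /* one bit per taint source */

#define TAINT_NONE      0u
#define TAINT_NETWORK   (1u << 0)       /* e.g. data that arrived off the wire */
#define TAINT_SENSOR    (1u << 1)

taint_t taint_after_add(taint_t pc, taint_t inst, taint_t op1, taint_t op2)
{
    return pc | inst | op1 | op2;       /* union of all influences */
}

/* Example interlock: refuse to act on a safety-critical value that carries
 * network taint. */
int safe_to_actuate(taint_t value_taint)
{
    return (value_taint & TAINT_NETWORK) == 0;
}
```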

3.5.5 Composite policies

It is important that a given metadata tag not be limited to enforcing a single policy. An arbitrary number of policies may be required, and this is easily supported by our model by using the tag as a pointer to a tuple of μ-policies. There is no hardware limit on the number of μ-policies supported in this fashion.


To propagate tags efficiently, the processor is augmented with a rule cache that operates in parallel with instruction execution. On a rule cache miss, control is transferred to a trusted miss handler which, given the tags of the instruction’s arguments, decides whether the current operation should be allowed and, if so, computes appropriate tags for its results. It then adds this set of argument and result tags to the rule cache so that when the same situation is encountered in the future, the rule can be applied without slowing down the processor. In performance tests on accurate machine simulators, the overall performance overhead for security enforcement is less than 10%.
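The rule cache and miss handler can be modeled in software roughly as follows; the hash function, cache size, and structure names are illustrative, and the policy evaluation is stubbed out.

```c
/* Sketch of the rule cache and its miss handler as described above. On a
 * miss, the trusted handler consults the installed micro-policies; if the
 * operation is allowed, the resulting rule is inserted so the next
 * occurrence is a pure cache hit. Structures and names are illustrative. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct rule_key    { uint32_t op, pc, inst, op1, op2, mr; };   /* input tags  */
struct rule_result { bool allow; uint32_t pc_out, result; };   /* output tags */
struct rule_entry  { bool valid; struct rule_key key; struct rule_result res; };

#define CACHE_SLOTS 1024
static struct rule_entry cache[CACHE_SLOTS];

/* Placeholder for the trusted miss handler's policy evaluation: deny by
 * default (fail closed). A real handler dispatches to every installed
 * micro-policy and combines their verdicts. */
static struct rule_result evaluate_policies(const struct rule_key *k)
{
    (void)k;
    struct rule_result deny = { false, 0, 0 };
    return deny;
}

static unsigned slot_of(const struct rule_key *k)
{
    return (k->op ^ k->pc ^ k->inst ^ k->op1 ^ k->op2 ^ k->mr) % CACHE_SLOTS;
}

struct rule_result pipe_lookup(const struct rule_key *k)
{
    struct rule_entry *e = &cache[slot_of(k)];
    if (e->valid && memcmp(&e->key, k, sizeof *k) == 0)
        return e->res;                            /* hit: no slowdown                */

    struct rule_result r = evaluate_policies(k);  /* miss: trap to trusted handler   */
    if (r.allow) {                                /* cache only allowed combinations */
        e->valid = true;
        e->key   = *k;
        e->res   = r;
    }
    return r;
}
```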

A partial list of the types of safety and security policies that can be implemented using the mechanisms just described is listed in the following table.

 Type safety

 Mandatory access control

 Memory safety

 Classification levels

 Control-flow integrity

 Lightweight compartmentalization

 Stack safety

 Software fault isolation

 Unforgeable resource identifiers

 Sandboxing

 Abstract types

 Access control

 Immutability

 Capabilities

 Linearity

 Provenance

 Software architecture enforcement

 Full/Empty bits

 Units

 Concurrency: race detection

 Signing

 Debugging

 Sealing

 Data tracing

 Endorsement

 Introspection

 Taint

 Audit

 Confidentiality

 Reference monitors

 Integrity

 Garbage collection

 Bignums


Returning to the analogy of metadata + PIPE + μ-policies as interlocks, one of their advantages is that policies are small (a few hundred to a thousand lines of code) while protecting millions of lines of code. So, while a programmer has no chance of making millions of lines of code bug free, she does have a chance of making the interlocks bug free. Moreover, since these bits of software are small and highly important, they are amenable to, and worth, formal verification.

3.6 Self-Protection

For an inherently secure processor to maintain that designation, it is vital that it strongly protects itself against malfeasance of any sort. In this case, that means protecting the metadata, the PIPE, the mechanisms of μ-policies, having a secure boot, and providing protection from physical tampering.

3.6.1 Metadata protection

In our architecture, data and metadata do not mix. Metadata is not addressable; not by the user-level application, not by the operating system. In the hardware, data paths for data and metadata do not cross. No user-accessible instructions read or write metadata. Metadata is only transformed through the PIPE.

Separation of data and metadata is important for maintaining strong protection of the PIPE and policy mechanisms. Metadata rule misses could be processed on a separate processor. We could store the metadata structures that the metadata tags point to in separate memory. Then this separate metadata processor would only need to access metadata memory and would never access standard data (and vice versa for the standard processing structures). But to avoid a second processor we want to process a miss on the same processor albeit in an isolated subsystem. In this metadata processing subsystem metadata tags become “data,” that is, they are just pointers into metadata memory space.

The mechanism to accomplish this on RISC-V uses special PIPE control and status registers (CSRs) for rule inputs and outputs. On a PIPE miss trap, the PIPE tag inputs are stored in the PIPE CSRs. These tags have now become data to the metadata processing subsystem, where the PIPE CSRs are read and processed. If the result is allowed, the result tag is written to the PIPE CSRs, which triggers a rule insertion for this PIPE transfer function (same inputs mapped to same outputs), making it a simple lookup the next time this rule hits the PIPE.
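A sketch of that miss-handling flow is shown below. The CSR numbers and the csr_read()/csr_write() helpers are hypothetical placeholders for the PIPE CSRs described here, not standard RISC-V CSRs, and policy_decide() stands in for the installed micro-policy code.

```c
/* Sketch of PIPE miss handling in the isolated metadata subsystem, where the
 * trapped tags are just data (pointers into metadata memory). All CSR names,
 * numbers, and accessor functions below are hypothetical placeholders. */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PIPE CSR identifiers (illustrative only). */
enum { PIPE_CSR_OPTAG, PIPE_CSR_PCTAG, PIPE_CSR_INSTTAG,
       PIPE_CSR_OP1TAG, PIPE_CSR_OP2TAG, PIPE_CSR_MRTAG,
       PIPE_CSR_ALLOW,  PIPE_CSR_PCOUT, PIPE_CSR_RESTAG };

extern uint32_t csr_read(int csr);              /* assumed low-level accessors    */
extern void     csr_write(int csr, uint32_t v); /* (hypothetical)                 */

/* Stand-in for the installed micro-policies' decision logic. */
extern bool policy_decide(uint32_t op, uint32_t pc, uint32_t inst,
                          uint32_t op1, uint32_t op2, uint32_t mr,
                          uint32_t *pc_out, uint32_t *res_out);

void pipe_miss_trap_handler(void)
{
    uint32_t pc_out = 0, res_out = 0;
    bool ok = policy_decide(csr_read(PIPE_CSR_OPTAG),   csr_read(PIPE_CSR_PCTAG),
                            csr_read(PIPE_CSR_INSTTAG), csr_read(PIPE_CSR_OP1TAG),
                            csr_read(PIPE_CSR_OP2TAG),  csr_read(PIPE_CSR_MRTAG),
                            &pc_out, &res_out);
    if (ok) {
        csr_write(PIPE_CSR_PCOUT,  pc_out);
        csr_write(PIPE_CSR_RESTAG, res_out);
        csr_write(PIPE_CSR_ALLOW,  1);   /* writing the result inserts the rule      */
    } else {
        csr_write(PIPE_CSR_ALLOW,  0);   /* security violation: instruction voided   */
    }
}
```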

3.6.2 PIPE protection

We can use the PIPE itself both to implement the symbolic μ-policy and, at the same time, to enforce the restrictions above (which we call monitor self-protection). To achieve this, we use the special Monitor tag to mark all of the monitor’s code and data (remember, malloc() and free() are monitors), allowing the miss handler to detect when untrusted code is trying to tamper with it.

Every instruction causes a rule cache lookup, which results in a fault if no corresponding rule is present (i.e., a miss). Since the machine has no special “privileged mode,” this applies even to monitor code. To ensure that monitor code can do its job, we set up cache ground rules (one for each opcode) saying that the machine can step whenever the PC and INST tags in the input vector of a rule are tagged Monitor; in this case, the next PC and any result of the instruction are also tagged Monitor. Monitor code never changes or overrides these rules.

3.6.3 The Dover processor

Draper Laboratory in Cambridge, Massachusetts, is an independent not-for-profit research and development laboratory that is developing an inherently secure processor for embedded systems called Dover (short for do-over). In early 2016, the processor achieved its first stage of demonstrability by showing that it was immune to the Heartbleed attack and could automatically protect memory from a buffer overflow. The processor will be implemented as either an FPGA or an ASIC, depending on the requirements of the embedded system it is intended for. As embedded devices adopt inherently secure processors more and more broadly, infrastructure, the IoT, transportation systems, weapons systems, and all other embedded systems will become much more difficult for attackers to overwhelm.
