10
Using Machine Learning for Protecting the Security and Privacy of Internet of Things (IoT) Systems

Melody Moh and Robinson Raju

10.1 Introduction

Today, IoT devices are ubiquitous and have pervaded almost every sphere of our lives, ushering in an era of smart things:

  • Smart homes have appliances, lights, and thermostats connected to the Internet [1].
  • Smart medical appliances not only monitor patients remotely but also administer medicines on time [2].
  • Smart bridges have sensors to monitor loads [3].
  • Smart power grids detect disruptions and manage distribution of power [4].
  • Smart industrial machinery has embedded sensors that increase worker safety and improve automation [5].

To get a better understanding of the scale of IoT, here are some numbers for review:

  • In 2008, the number of devices connected to the Internet surpassed the world population of approximately 6.7 billion people.
  • In 2015, approximately 1.4 billion smartphones were shipped by manufacturers.
  • By 2020, the prediction is that there will be 6.1 billion smartphone users and an anticipated 50 billion things connected to the Internet [6].
  • By 2027, the expectation is that there will be 27 billion machine‐to‐machine connections in the industrial sector.

Now, if the focus shifts to the amount of data that gets generated, one gets a glimpse of the dawn of the zettabyte era [7]. To put a zettabyte into perspective, 36,000 years of high‐definition television video would be the equivalent of one zettabyte.

  • In 2013, devices connected to the Internet generated 3.1 zettabytes of data.
  • In 2014, that number jumped to 8.6 zettabytes.
  • In 2018, that number is expected to soar to 400 zettabytes [8].

10.1.1 Examples of Security and Privacy Issues in IoT

While previous chapters discussed the ubiquity of IoT, the amount of data generated, and the technologies used, this chapter focuses on the type of data that is transmitted and its security and privacy implications. Ubiquity is a double‐edged sword: the reach of IoT is wider than most people can comprehend, but so is its vulnerability. The security and privacy implications of a system that comprises myriad independently manufactured devices, communicates over different protocols, and generates zettabytes of data are therefore broad and deep. Cisco's whitepaper on the Global Cloud Index [6] describes the types of data in the cloud. A total of 7.6% of documents in file‐sharing services contain confidential data. Personally identifiable information (e.g., Social Security numbers, tax ID numbers, phone numbers, addresses, and so on) follows this at 4.3% of all documents. Next, 2.3% of documents contain payment data (e.g., credit card numbers, debit card numbers, bank account numbers, and so on). Finally, 1.6% of documents contain protected health information (e.g., patient diagnoses, medical treatments, medical record IDs, and so on).

As IoT usage grows, the amount of data uploaded to the cloud by IoT systems far exceeds that uploaded by users. Because IoT data is in the cloud and IoT devices are connected to the Internet, they become vulnerable to attacks of different types. In fact, reports of breaches appear almost daily:

  • Water treatment plant is hacked and chemical mix changed for tap supplies [9].
  • Nuclear power plant in Ukraine is breached [10].
  • Security researchers from Rapid7 security firm discover many security vulnerabilities affecting several video baby monitors [11].
  • Data from wearable devices are used to plan robberies [12].
  • There are reports on how hackers could target pacemakers [13].

According to a 2016 cybercrime report [14], cybercrime damages will cost the world $6 trillion annually by 2021, up from $3 trillion in 2015.

10.1.2 Security Concerns at Different Layers in IoT

A review of the 2015 IBM Point of View on IoT security [15] shows threats at multiple points in the IoT ecosystem and protections that are applicable at every layer (see Figure 10.1).

Schematic diagram of IoT system with threats and protections annotated for the Interface, Service, Network, and Sensing layers.

Figure 10.1 IoT system with threats and protections annotated.

10.1.2.1 Sensing Layer

In most of the scenarios just described, hackers were able to do the most damage when they gained access to sensors like baby monitors or pacemakers. So, it is critical to have sensors protected and monitored so that one can either prevent the intrusion or alert the user when there is one, in the fastest possible time. The possible threats at the sensing layer are the following:

  • Unauthorized access to data
  • Denial of service attack
  • Malware on the device to send wrong information
  • Malware on the device to send data to the wrong party
  • Information gathering or data leakage leading to planned attacks

10.1.2.2 Network Layer

The availability, manageability, and scalability of the network are crucial for the operation of IoT. If the monitoring applications are not able to get data in time, IoT devices are rendered useless. Hence, hackers target networks more often to cripple the effectiveness of smart systems. Attacking the network by sending a lot of data at once to congest the network and pave the way to denial of service attacks is very common.

10.1.2.3 Service Layer

The service layer acts as a bridge between the hardware layer at the bottom and the interface layer at the top. An attack on the service layer impacts critical functions such as device management and information management, leading to the end users not being serviced. Privacy protection, access control, user authentication, communication security, data integrity, and data confidentiality are vital aspects of service layer security.

10.1.2.4 Interface Layer

In many ways, the interface layer is the most vulnerable part of the IoT ecosystem because this layer is at the top and is a gateway to all the other layers below. If the authentication and authorization mechanisms of the interface are compromised, the ripple effects could permeate to the edge. The end user is a possible attack vector, since attackers could gain sensitive information via phishing or similar attacks. The web and app interfaces can be subject to common attacks and weaknesses such as SQL injection, cross‐site scripting, known default credentials, insecure password recovery mechanisms, and so forth.

OWASP (Open Web Application Security Project) provides a concise summary of the attack surface areas for IoT [16], which serves as a handy reference (see Table 10.1).

Table 10.1 OWASP IoT attack surface areas.

Attack surface Vulnerability
Ecosystem Access Control
  • Implicit trust between components
  • Enrollment security
  • Lost access procedures
Device Memory
  • Cleartext usernames and passwords
  • Third‐party credentials
  • Encryption keys
Device Web Interface
  • SQL injection
  • Cross‐site scripting
  • Cross‐site request forgery
  • Username enumeration
  • Weak passwords
  • Account lockout
  • Known default credentials
Device Firmware
  • Hardcoded credentials
  • Sensitive information disclosure
  • Sensitive URL disclosure
  • Encryption keys
  • Firmware version display and/or last update date
Device Network Services
  • Information disclosure
  • User CLI
  • Administrative CLI
  • Injection
  • Denial of service
  • Unencrypted services
  • Poorly implemented encryption
  • Vulnerable UDP services
  • DoS
Administrative Interface
  • SQL injection
  • Cross‐site scripting
  • Cross‐site request forgery
  • Username enumeration
  • Weak passwords
  • Account lockout
  • Known default credentials
  • Logging options
  • Two‐factor authentication
  • Inability to wipe device

10.1.3 Privacy Concerns in IoT Devices

A 2015 Internet of Things research study by Hewlett Packard [17] reported that 80% of devices raised privacy concerns. Many devices collect some form of personal data, such as name, address, date of birth, payment information, health data, light and sound information from the home, activities within the home, and so forth (Figure 10.2). Most of these devices transmit data within the home network unencrypted, and since data go out from the home into the cloud, most people are just one misconfiguration away from exposing the data to the outside world. The study found an average of 25 vulnerabilities per device, totaling 250 vulnerabilities across the devices tested.

Diagram listing Privacy vulnerabilities in IoT in five states: 70 percent (2); 90 percent; 80 percent; 6 out of 10.

Figure 10.2 Privacy vulnerabilities in IoT.

An article in FastCompany by Lauren Zanolli [18] describes IoT as a “Privacy Hell.” Another article in the Wall Street Journal [19] discusses IoT opening up new privacy litigation risks. Italian retailer Benetton was boycotted for having RFID tracking in clothes [20]. There was a sense of real urgency in the FTC report on IoT in January 2015 [21], which asked companies to adopt best practices to address consumer privacy and security risks. There has been much research into the security aspects of IoT, most of it a continuation of work on security challenges in networking and routing. In comparison, there has been decidedly less research into privacy issues.

10.1.3.1 Information Privacy

Privacy is a comprehensive term, and historically it has covered privacy of media, place, communication, and body. Today, the term is increasingly used to mean information privacy. Privacy was defined by Westin in 1968 as “the claim of individuals, to determine for themselves when, how, and to what extent information about them is communicated” [22].

Ziegeldorf et al., in their paper on privacy in IoT [23], made the definition concrete as follows. Privacy in the Internet of Things is the threefold guarantee that addresses these subjects:

  1. Awareness of privacy risks imposed by smart things and services surrounding the data subject.
  2. Individual control over the collection and processing of personal information by the surrounding smart things.
  3. Awareness and control of subsequent use and dissemination of personal information by those entities to any entity outside the subject's control sphere.

Ziegeldorf et al. [23] also defined a reference model to quickly understand and analyze the privacy concerns regarding anything that is interconnected anywhere via a network. The reference model contains four main types of entities: (i) smart things; (ii) subject; (iii) infrastructure; and (iv) services. It includes five types of information flows: (i) interaction; (ii) collection; (iii) processing; (iv) dissemination; and (v) presentation.

10.1.3.2 Categorization of IoT Privacy Issues

Ziegeldorf et al. [23] also categorized the privacy threats (see Figure 10.3) into the following: (i) identification; (ii) localization and tracking; (iii) profiling; (iv) privacy‐violating interaction and presentation; (v) lifecycle transitions; (vi) inventory attack; and (vii) linkage.

Flow diagram depicting Privacy threats with entities and information flows in IoT: Privacy violating Interactions & presentations; lifecycle transitions; identification tracking.

Figure 10.3 Privacy threats with entities and information flows in IoT.

Identification.

Identification is the threat of associating an identifier, e.g., a name and address, with an individual. It also enables and aggravates other threats, e.g., profiling and tracking of people.

Localization and Tracking.

Localization and tracking is the threat of determining and recording a person's location through time and space. Since localization is an essential functionality in many IoT systems, the data are fetched by most applications. However, this leads to disclosure of private information such as illness, vacation plans, work schedules, and so forth.

Profiling.

Profiling is the threat of categorizing individuals into groups by using data from IoT devices. Personalization in e‐commerce (e.g., recommender systems, newsletters, and advertisements) uses profiling methods to optimize and target content. Examples where profiling leads to a violation of privacy include price discrimination, unsolicited advertisements, social engineering, and erroneous automatic decisions, e.g., by Facebook's automatic detection of sexual offenders. Also, several data marketplaces collect and sell profile information.

Privacy‐Violating Interaction and Presentation.

Privacy‐violating interaction and presentation is the threat of communicating private information in such a manner that it gets disclosed to an unwanted audience. For example, someone wearing a smartwatch while traveling on public transit could inadvertently let strangers read their text messages, since the messages pop up on the watch screen as they come in.

Lifecycle Transitions.

When smart things undergo upgrades, configurations and data are backed up and restored. In the process, data can sometimes end up on the wrong device, leading to a privacy violation, e.g., photos and videos from one device becoming available on another.

Inventory Attack.

Since smart things are queryable on the Internet, hackers can query devices to compile an inventory of things at a specific location, such as whether a home contains a smart meter, smart thermostat, smart lighting, and so forth.

Linkage.

Linkage is a threat where one gathers insights about a subject by combining data from different sources, collected in different contexts. The revelation might be erroneous, and users may not have given permission to do this.

In summary, privacy is a critical issue in IoT devices and needs to be addressed at every layer of the IoT ecosystem, from manufacturing to deployment.

10.1.4 IoT Security Breach Deep‐Dive: Distributed Denial of Service (DDoS) Attacks on IoT Devices

10.1.4.1 Introduction to DDoS

A denial of service (DoS) attack is a cyberattack where an attacker makes a network resource unavailable by interrupting services of a machine connected to the Internet. It is typically accomplished by flooding the target machine with fake requests in order to overload the system. A distributed denial of service (DDoS) attack is one that uses multiple network resources as the source of the attack. A DDoS attack serves not only to multiply the capabilities of a single attacker but also to conceal the identity of the attacker and thwart mitigation efforts. The sources are typically organized into a botnet, and most botnets use compromised computer resources without their owners' knowledge. In the CIA (confidentiality, integrity, availability) triad of information security, a DDoS attack falls in the availability category. Figure 10.4 depicts how an attacker could initiate one attack and transform it into a multitude of attacks on a victim [24].

Schematic diagram depicting a DDoS attack from attacker; control server; botnet attack nodes; victim.

Figure 10.4 DDoS attack.
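The difference between a single‐source flood and a distributed one can be illustrated with a small sketch. It assumes a naive defense that counts requests per source address within one time window and flags any source above a threshold; the addresses, the traffic volumes, and the `LIMIT` value are made up for illustration and do not come from any real incident.

```python
from collections import Counter

def flagged_sources(requests, per_source_limit):
    """Return the set of source addresses that exceed the per-source limit.

    `requests` holds one source identifier per request, all assumed to fall
    inside the same time window.
    """
    counts = Counter(requests)
    return {src for src, n in counts.items() if n > per_source_limit}

# Hypothetical traffic in one time window (addresses are illustrative).
dos_traffic = ["10.0.0.1"] * 1000  # one attacker sending 1000 requests
ddos_traffic = [f"10.0.{i // 250}.{i % 250}" for i in range(1000)]  # 1000 bots, one request each

LIMIT = 100  # assumed per-source threshold

print(flagged_sources(dos_traffic, LIMIT))   # the single attacker is flagged
print(flagged_sources(ddos_traffic, LIMIT))  # empty: no individual source stands out
print(len(ddos_traffic))                     # the victim still receives the same total load
```

The victim sees the same number of requests in both cases, but in the distributed case a per‐source rule has nothing to flag, which is one reason DDoS attacks are harder to mitigate.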

Though the motivations for DDoS can be multiple – extortion, hacktivism, cyberterrorism, personal vendetta, business rivalry, etc. – the impact is very severe in many instances. It can cause damage to reputation, huge revenue loss, and tens of thousands of hours of lost productivity. The scale of DDoS attacks has continued to rise over recent years, by 2016 exceeding a terabit per second.

10.1.4.2 Timeline of Notable DoS Events [25]

  • 1988: Robert Tappan Morris launches a self‐replicating worm that spreads uncontrollably throughout the Internet and causes a massive unintentional DoS.
  • 1997: The “AS 7007 incident” is the first notable BGP hijacking and results in a massive DoS to significant portions of the Internet.
  • 1999: Creation of trin00, TFN, and Stacheldraht botnets. The first instance of a botnet DDoS attack was a trin00 attack on the University of Minnesota.
  • 2000: Michael Calce (aged 15) launches successful DoS attacks against Yahoo!, Fifa.com, Amazon.com, Dell, E*TRADE, eBay, and CNN.
  • 2004: Hackers on 4chan develop the Low Orbit Ion Cannon (LOIC), a DDoS tool that would be used extensively by Anonymous and other groups to launch DDoS attacks.
  • 2007: A series of DDoS attacks target various Estonian organizations. These attacks are notable as the first government‐sponsored DDoS attacks, since the Russian government was suspected to be behind them.
  • 2008: Hacktivist collective Anonymous launches its first significant DDoS attack, successfully targeting the Church of Scientology.
  • 2009: A coordinated DDoS attack on Facebook, Google Blogger, LiveJournal, and Twitter targets a Georgian blogger critical of Russia.
  • 2010: Hacktivist collective Anonymous launches “Operation Avenge Assange” targeting banks that froze donations to Wikileaks.
  • 2013: A massive DDoS attack targeting anti‐spam organization Spamhaus.org breaks records with traffic peaking at 300 Gbps.
  • 2014: The hacking group Lizard Squad initiates successful DDoS attacks against the Sony Playstation Network and Microsoft Xbox Live.
  • 2015: A network security hardware manufacturer reports a DDoS attack of more than 500 Gbps against an unnamed customer.
  • 2016: On October 21, a large‐scale DDoS attack on Dyn [26], a primary provider of DNS services to many companies, takes down many high‐profile websites such as Twitter, Pinterest, Reddit, GitHub, Amazon, Verizon, Comcast, and so forth.

10.1.4.3 Reason for the Recent Success of the DDoS Attacks

The most recent DDoS attack on Dyn [26] was made possible by the large number of unsecured IoT devices, such as home routers and surveillance cameras. The attackers employed thousands of such devices that had been infected with malicious code to form a botnet. The devices themselves were not powerful, but collectively they generated a massive amount of traffic to overwhelm targeted servers. The moment someone places a device on the Internet without changing the default password, it gets added to the army of vulnerable machines used for DDoS attacks. A report from welivesecurity.com [27] mentions that ESET tested more than 12,000 home routers and found 15% of them to be unsecured. In the article “10 things to know about October 21 IoT DDoS attack” [28], Stephen Cobb lists default passwords as the leading cause. A mashable.com report in 2014 [29] mentions that 73,000 webcams were discoverable on the Internet because people did not change default passwords.

To summarize, the success of recent DDoS attacks, despite decades of research and mitigation tools, can be attributed to the following:

  • The proliferation of IoT devices.
  • An increase in the number of devices on the Internet with default passwords, possibly because more users of smart devices are not technology‐savvy.

10.1.4.4 Directions for Prevention of Specific Attacks on IoT Devices

As noted in many instances above, the number of IoT devices is growing at an alarming pace, and it is imperative that the devices be made secure. The attacks increasingly have a crippling effect on the economy and have become the new currency of global warfare. With this in mind, the US Senate introduced legislation in August 2017 [30] to improve the cybersecurity of IoT devices.

Specifically, if enacted, the Internet of Things (IoT) Cybersecurity Improvement Act of 2017 [31] would:

  • Require vendors of Internet‐connected devices purchased by the federal government to ensure their devices are patchable, rely on industry standard protocols, do not use hard‐coded passwords, and do not contain any known security vulnerabilities.
  • Direct the Office of Management and Budget (OMB) to develop alternative network‐level security requirements for devices with limited data processing and software functionality.
  • Direct the Department of Homeland Security's National Protection and Programs Directorate to issue guidelines regarding cybersecurity coordinated vulnerability disclosure policies to be required by contractors providing connected devices to the US government.
  • Exempt cybersecurity researchers engaging in good‐faith research from liability under the Computer Fraud and Abuse Act and the Digital Millennium Copyright Act when engaged in research pursuant to adopted coordinated vulnerability disclosure guidelines.
  • Require each executive agency to inventory all Internet‐connected devices in use by the agency.

10.1.4.5 Steps to Prevent Attacks on IoT Devices

The overarching strategy to secure IoT devices should be twofold: reduce the number of devices that can be abused and convince would‐be attackers, such as hacktivists, of the gravity of the situation. Also, there needs to be a global strategy to punish the guilty. There have been multiple efforts to reduce the number of devices that can be abused. The Cybersecurity Improvement Act mentioned above, alerts sent out by the Department of Homeland Security, and WaterISAC's 10 Basic Cybersecurity Measures [32] are a few government initiatives toward this. Here are the top four actions recommended by US‐CERT [33] in the wake of the latest attacks:

  1. Ensure all default passwords are changed to strong passwords. (Default usernames and passwords for most devices can easily be found on the Internet, making devices with default passwords extremely vulnerable.)
  2. Update IoT devices with security patches as soon as patches become available.
  3. Disable Universal Plug and Play (UPnP) on routers unless absolutely necessary.
  4. Purchase IoT devices from companies with a reputation for providing secure devices.
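As an illustration of the first action, the following sketch audits a device inventory for factory‐default credentials. It is a minimal example under stated assumptions: the `KNOWN_DEFAULTS` pairs, the inventory records, and the device names are all hypothetical, and a real audit would draw on a vendor‐specific list and would not keep passwords in plain text.

```python
# Hypothetical factory-default credential pairs (illustrative only).
KNOWN_DEFAULTS = {("admin", "admin"), ("root", "root"), ("admin", "1234")}

def devices_with_default_credentials(inventory):
    """Return the names of devices still configured with a known default pair."""
    return [
        device["name"]
        for device in inventory
        if (device["username"], device["password"]) in KNOWN_DEFAULTS
    ]

# Hypothetical inventory of devices on a home network.
inventory = [
    {"name": "hallway-camera", "username": "admin", "password": "admin"},
    {"name": "home-router", "username": "admin", "password": "Zq7!f2#Lm9"},
    {"name": "thermostat", "username": "owner", "password": "correct horse battery"},
]

# Lists every device that should have its password changed.
print(devices_with_default_credentials(inventory))
```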

10.2 Background

10.2.1 Brief Overview of Machine Learning

Machine learning, a term coined by Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence [34], is the science of getting computers to learn and act without being explicitly programmed. The idea behind machine learning is to have an algorithm that can analyze data, identify patterns, and create a model that the machine could use to analyze data that it has not seen before. As systems provide more data to it, the algorithm learns continuously and will be able to produce reliable decisions repeatedly. In the past decade or so, with the increase in computing power and the development of systems like Hadoop that process massive amounts of data in a short period, machine learning has pervaded many things that people use. From speech recognition, image recognition, and fingerprint scanning to self‐driving cars, machine learning is used almost everywhere and is arguably the most impactful invention in recent times.

There are many machine‐learning algorithms used in a variety of scenarios. Broadly, they could be categorized either by the nature of learning available to the system or by the desired output.

Depending on the nature of the learning, machine‐learning algorithms can be categorized as follows [35]:

  • Supervised learning. In this, one gives the computer a training set that contains data with corresponding labels. The algorithm then creates a model that maps future unknown inputs to known outputs.
  • Unsupervised learning. In this type of learning, the training set does not contain output labels. The algorithm discovers hidden patterns in the data and then uses this to map future unknown inputs to the pattern.
  • Reinforcement learning. The program operates in a dynamic environment where it receives inputs continuously and gets feedback on whether its outputs are right or wrong.

Depending on the desired output, machine‐learning algorithms can be categorized as follows:

  • Classification. The output is a finite number of discrete categories/classes. The algorithm should produce a model from the training data that can assign one of these classes to the new inputs. Spam filtering and credit card companies determining if a person is creditworthy or not are examples of classification problems.
  • Regression. The output is not discrete but is one or more continuous variables. Examples include predicting output sales given the budget for TV and radio ads, predicting house prices given a set of variables, and so forth.
  • Clustering. The objective is to group input data into clusters that contain similar data points. Examples include segmenting users based on purchase patterns, detecting activity types using motion sensors, and so forth.
  • Dimensionality reduction. The objective is to reduce the number of dimensions with the intent of focusing on dimensions (features) that are important to the problem. It also helps in reducing complexity, space, and time to compute.

10.2.2 Frequently Used Machine‐Learning Algorithms

In this section, we briefly touch on the most commonly used machine‐learning (ML) algorithms [36] to provide context for the review of machine‐learning algorithms utilized for IoT.

10.2.2.1 Classification

  • Logistic regression. Predictions are mapped to be between 0 and 1 through the logistic function.
  • Classification tree. The data are repeatedly split into separate branches to arrive at the output label.
  • Support vector machine (SVM). In SVM, the program views the data elements as points in an n‐dimensional space. The algorithm finds a hyperplane (decision boundary) that maximizes the distance between closest points of separate classes.
  • Naïve Bayes. In Naïve Bayes, the model is a probability table that gets created using the probability of occurrences of training data. The algorithm predicts the new output by looking up the probabilities of the input variables and using conditional probability.
  • K‐nearest neighbors (KNN). In KNN, the algorithm predicts the class by searching through the training set for K most similar neighbors of the new input.
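Two of the classifiers above are simple enough to sketch in a few lines. The example below shows the logistic function that squashes a score into the range between 0 and 1, and a K‐nearest neighbors classifier that takes a majority vote among the closest training points. The two‐feature traffic samples and their labels are invented for illustration, and Euclidean distance is an assumed choice of similarity measure.

```python
import math
from collections import Counter

def logistic(z):
    """Logistic function: maps any real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def knn_predict(train, query, k=3):
    """Predict the majority label among the k training points closest to `query`.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples (values are illustrative, not measured).
train = [
    ((1.0, 1.0), "benign"), ((1.2, 0.8), "benign"), ((0.9, 1.1), "benign"),
    ((5.0, 5.2), "malicious"), ((5.5, 4.8), "malicious"), ((4.9, 5.1), "malicious"),
]

print(logistic(0.0))                    # a score of 0 sits exactly on the decision boundary
print(knn_predict(train, (5.1, 5.0)))   # neighbors are all from the second group
print(knn_predict(train, (1.1, 0.9)))   # neighbors are all from the first group
```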

10.2.2.2 Regression

  • Linear regression. In linear regression, the algorithm creates a model by fitting a straight line (or a hyperplane for n‐dimensions) through the data.
  • Regression tree / decision tree. In regression tree, the data are repeatedly split into separate branches to arrive at the output.
  • K‐nearest neighbors. In KNN, the algorithm predicts the value by searching through the training set for K most similar neighbors of the new input and summarizing the outputs.
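For the single‐variable case, fitting a straight line has a closed‐form least‐squares solution, sketched below. The function name `fit_line` and the sample data are ours; the data were chosen to lie exactly on the line y = 2x + 1 so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # recovers the slope and intercept of the line
print(slope * 5.0 + intercept)   # prediction for an unseen input x = 5
```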

10.2.2.3 Clustering

  • K‐means. In K‐means, the algorithm creates clusters based on geometric distances between points. At the outset, the algorithm randomly assigns the data points to k clusters, computes centroids for each cluster, computes points closest to each centroid and then re‐computes the centroids. The algorithm repeats the process till there are no more improvements possible. The clusters tend to be globular for K‐means.
  • DBSCAN. In DBSCAN (density‐based spatial clustering of applications with noise), clusters are created based on density. The algorithm considers an n‐dimensional sphere of radius epsilon around each data point and counts the number of points inside the sphere. If the number is less than min_points, the point is not a core point, and it is treated as noise unless it lies within epsilon of a core point. Otherwise, the point is a core point, and the algorithm grows the cluster by repeating the same process for every point inside the sphere.
  • Hierarchical clustering. The algorithm starts with n clusters for n data points. It combines two nearest clusters to create a new cluster. The algorithm repeats the process until only one cluster remains. One can view the result as a dendrogram with the height representing the distance between the clusters. If we can imagine a horizontal line that traverses the dendrogram vertically, the maximum distance it covers without intersecting another cluster gives the minimum distance between clusters. The number of vertical lines cut gives the number of clusters.
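The K‐means loop described above can be sketched as follows. This is a minimal version under stated assumptions: it initializes the centroids from sampled data points (or from `initial_centroids`, a parameter we add so the example is reproducible) rather than from a random assignment of points to clusters, and it uses Euclidean distance. The sample points are invented.

```python
import math
import random

def kmeans(points, k, initial_centroids=None, max_iterations=100, seed=0):
    """Cluster `points` (tuples of floats) into k groups.

    Returns (assignment, centroids), where assignment[i] is the cluster
    index of points[i].
    """
    rng = random.Random(seed)
    centroids = list(initial_centroids) if initial_centroids else rng.sample(points, k)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so no further improvement
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids

# Two well-separated illustrative groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
assignment, centroids = kmeans(points, k=2, initial_centroids=[points[0], points[3]])
print(assignment)
print(centroids)
```

With random initialization the result can depend on the starting centroids, which is why practical implementations usually run the algorithm several times and keep the best clustering.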

10.2.2.4 Dimensionality Reduction

  • PCA. A principal component is a normalized linear combination of the variables in a dataset. In PCA (principal component analysis), the objective is to orthogonally project data points onto an L dimensional linear subspace that has the maximal projected variance. For PCA, the variable values need to be numerical. Hence, categorical variables are converted to numerical.
  • CCA. Canonical correlation analysis (CCA) deals with two or more variables, and its objective is to find a corresponding pair of highly cross‐correlated linear subspaces so that within one of the subspaces there is a correlation between each component and a single component from the other subspace.
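To make the idea of maximal projected variance concrete, the sketch below finds the first principal component of a small dataset. It centers the data, builds the covariance matrix, and uses power iteration to approximate the eigenvector with the largest eigenvalue. Power iteration is our choice for a dependency‐free example; libraries typically use an eigendecomposition or SVD instead. The data values are invented.

```python
import math

def first_principal_component(rows, iterations=200):
    """Approximate the unit vector along which the centered data varies most."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration; the start vector is an arbitrary non-symmetric choice.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: start vector has no component along any variance
        v = [x / norm for x in w]
    return v

# Illustrative points scattered close to the line y = x.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
component = first_principal_component(rows)
print(component)  # both entries are similar in size, reflecting the y = x trend
```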

10.2.2.5 Combining Models (Ensemble ML)

In many instances, a single type of algorithm may not be able to give optimal results due to the variety of the types of data or other reasons. In these cases, different algorithms are combined to give more accurate predictions than individual models.

  • CART. In classification and regression trees (CART), the data are repeatedly split into separate branches to arrive at the output label or value. Though the trees used for regression and those used for classification have some similarities, they differ in some respects, e.g., the algorithm to determine where to split.
  • Random forests. In random forests, instead of training a single tree, a multitude of trees are trained. The algorithm outputs the class that is the mode of the classes predicted by the individual trees (for classification) or the mean of their predicted values (for regression).
  • Bagging. Bootstrap aggregation, also called bagging, is a general procedure that can be used to reduce the variance for an algorithm that has high variance. CART/decision tree is an algorithm that has a high variance and is sensitive to training data.
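Bagging can be sketched with a very simple base learner. In the example below, each model is a decision stump (a tree with a single split) trained on a bootstrap sample, that is, a sample drawn with replacement from the training data, and the ensemble predicts by majority vote. The one‐feature dataset, the function names, and the choice of 25 models are assumptions made for illustration.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a single-split classifier on (x, label) pairs with labels 0 or 1."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for high_label in (0, 1):
            # Count mistakes if points with x >= t get `high_label`.
            errors = sum(
                (high_label if x >= t else 1 - high_label) != y for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, t, high_label)
    _, t, high_label = best
    return lambda x: high_label if x >= t else 1 - high_label

def bagging_fit(data, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]
        models.append(fit_stump(bootstrap))
    return models

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Illustrative data: small feature values are class 0, large ones class 1.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 2))
print(bagging_predict(models, 9))
```

Individual stumps place their split differently depending on which points their bootstrap sample happened to contain; averaging over many of them is what reduces the variance.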

10.2.2.6 Artificial Neural Networks

Artificial neural networks (ANNs) are computing systems modeled on the neural networks of the human brain. An ANN contains units called neurons. Neurons are connected to each other via synapses and communicate signals to each other. Each neuron receives inputs from other neurons connected to it and computes an output to be transmitted onward. Each input signal has a corresponding weight, and the neuron applies a function to the weighted sum of the inputs it gets. Feed‐forward neural networks (FFNNs), also known as multilayer perceptrons (MLPs), are the most common type of neural network in practical applications. There are other types of ANNs such as CNN (convolutional neural network), RNN (recurrent neural network), DBN (deep belief network), TDNN (time delay neural network), DSN (deep stacking network), and so forth.
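The computation performed by a single neuron, and by a small feed‐forward network built from such neurons, can be sketched as follows. The sigmoid activation, the bias term, and all weight values are assumptions chosen for illustration; in practice the weights are learned from training data.

```python
import math

def sigmoid(z):
    """A common activation function that maps the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Apply the activation function to the weighted sum of the inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def feed_forward(inputs, layers):
    """Pass `inputs` through the network one layer at a time.

    `layers` is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per neuron.
    """
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, weights, bias) for weights, bias in layer]
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    [([0.5, -0.4], 0.1), ([0.3, 0.8], -0.2)],  # hidden layer
    [([1.0, -1.0], 0.0)],                      # output layer
]

print(neuron([0.0, 0.0], [1.0, 1.0], 0.0))  # zero weighted sum gives the midpoint of the sigmoid
print(feed_forward([1.0, 2.0], network))    # a single value between 0 and 1
```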

10.2.3 Examples of Machine‐Learning Algorithms in IoT

10.2.3.1 Overview

The main ingredient in an ML system is data. With the spread of IoT, there is a massive amount of data that gets generated on a daily basis, and this is a goldmine for machine learning. The adoption of supervised and unsupervised machine‐learning techniques in IoT smart data analysis is broad. All of the smart things discussed in Section 10.1 – smart homes where appliances, lights, and thermostats connect to the Internet [1], smart medical appliances that not only monitor remotely but also administer medicines [2], smart bridges that have sensors to monitor load [3], smart power grids that detect disruptions and manage distribution of power [4], and smart industrial machinery with embedded sensors that increase worker safety [5] – would be using or have the potential to use machine learning in some form or other.

10.2.3.2 Examples

There are many concrete examples where machine learning saved millions of dollars for corporations:

  • Google DeepMind AI. Google applied machine learning to 120+ variables from sensors in its data centers to optimize cooling, and that cut its overall energy consumption by 15% [37].
  • Roomba 980. This Roomba is connected to the Internet and comes with a camera that captures the images of a room and software that compares these images to gradually build up a map of the robot's surroundings to determine its location [38]. It is able to “remember” a home layout, adapt to different surfaces or new items, clean a room with the most efficient movement pattern, and dock itself to recharge its batteries.
  • NEST Thermostat. NEST “learns” the regular temperature preferences of its users and also adapts to their work schedule by turning down energy use [39]. The inputs are the temperature preference of the user, the time and day, the presence of the user at home, etc., and the output is a discrete set of temperatures, making this a classification problem.
  • Tesla cars. Tesla enabled an autopilot service in its cars that helps in hands‐free driving, including complex tasks like lane changes. Tesla cars built since 2014 have 12 sensors on the bottom of the vehicle, a front‐facing camera next to the rear‐view mirror, and a radar system under the nose [40]. These sensing systems constantly collect data not only to help the autopilot work on the road but also to make Teslas operate better in the future. Because all Tesla cars have an always‐on wireless connection, data from driving and using autopilot is collected, sent to the cloud, and analyzed with software.

10.2.4 Machine‐Learning Algorithms by IoT Domains

In this section, we summarize the machine‐learning algorithms that could be used for various use cases in different domains. The summary draws on the examples above and on two papers: Machine Learning for Internet of Things Data Analysis: A Survey by Mahdavinejad et al. [41] and Unlocking the Value of the Internet of Things (IoT) – A Platform Approach by Misra et al. [42].

10.2.4.1 Healthcare

Metrics to Optimize.

Healthcare systems in hospitals and at home have sensors to monitor patients or their surroundings. Areas where machine learning could help include remote monitoring and medication, disease management, and health prediction.

Machine‐Learning Algorithms
  • Classification algorithms could be used to classify patients into different groups based on their health condition.
  • Anomaly detection could be used to identify if someone has a problem that needs to be looked into.
  • Clustering algorithms like K‐means could be used to group people with similar health conditions to create profiles.
  • Feed forward neural network could be used to make fast decisions based on a patient's continuously changing condition during illness.
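As an illustration of the clustering item above, the following is a minimal K‐means sketch that groups patients into profiles. The patient readings, the choice of two clusters, and the initial centroids are all made up for the example; a real deployment would normalize the features and pick the number of clusters more carefully.

```python
def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical readings: (resting heart rate in bpm, systolic blood pressure in mmHg).
patients = [(62, 115), (65, 118), (60, 112), (88, 150), (92, 155), (90, 148)]
centroids, clusters = kmeans(patients, centroids=[patients[0], patients[3]])
print(centroids)
```

The two resulting groups can then be treated as patient profiles, for example "normal" and "elevated" readings.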

10.2.4.2 Utilities – Energy/Water/Gas

Metrics to Optimize.

Readings from smart meters for electricity, water, or gas could be used for usage prediction, demand supply prediction, load balancing, and other scenarios.

Machine‐Learning Algorithms
  • Linear regression could be used to predict usage for a particular day or time.
  • Classification algorithms could be used to classify consumers as high‐, medium‐, or low‐usage consumers.
  • Clustering algorithms could be used to group consumers of similar profile together and analyze their usage patterns.
  • Artificial neural networks could be used to dynamically balance loads if there is a surge in usage in certain areas.
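The usage‐prediction item above can be sketched with ordinary least squares on a single feature. The temperature and usage figures below are invented for illustration, and a single predictor is an assumption; real smart‐meter models would typically use several features.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical smart-meter data: daily mean temperature (degrees C) vs. electricity use (kWh).
temps = [20, 24, 28, 32, 36]
usage = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(temps, usage)
predicted = slope * 30 + intercept  # forecast for a 30-degree day
print(slope, intercept, predicted)
```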

10.2.4.3 Manufacturing

Metrics to Optimize.

Many industries have sensors on equipment for continuous monitoring, mechanisms to track production volumes, and security systems that monitor facilities continuously. The metrics to optimize would therefore be diagnosing problems quickly when they occur, predicting failures so that evasive action can be taken, and detecting security breaches into the facility or theft of goods.

Machine‐Learning Algorithms
  • CART/decision tree could be used to diagnose problems with machines.
  • Linear regression could be used to predict failure.
  • Anomaly detection could be used to detect security breaches or anything that occurs out of the ordinary.
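To show how a CART‐style tree could diagnose a machine problem, the sketch below finds the single best split on one sensor feature using Gini impurity, which is the criterion CART uses for classification. The temperature readings and diagnoses are hypothetical, and a full tree would apply this split search recursively.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return the threshold on one feature that minimizes the weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical bearing temperatures (degrees C) with the diagnosis recorded for each reading.
temperatures = [61, 64, 66, 70, 85, 88, 91, 95]
diagnoses = ["ok", "ok", "ok", "ok", "fault", "fault", "fault", "fault"]
threshold, impurity = best_split(temperatures, diagnoses)
print(threshold, impurity)
```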

10.2.4.4 Insurance

Metrics to Optimize.

Insurance companies would be interested in knowing what kind of cars or profiles of people are more likely to be connected with accidents. The usage pattern could be obtained by sensors in the cars. They could use that information to charge appropriate insurance premiums. Machine learning could be applied to obtain home or car usage pattern, prediction of property damage, remote assessment of damage, and so forth.

Machine‐Learning Algorithms
  • Clustering algorithms like K‐means or DBSCAN could be used to create profiles of users who drive similarly.
  • Classification algorithms like Naïve Bayes could be used to classify a customer as risky or not and also to predict whether he/she should be given insurance.
  • Decision trees could be used to classify users or to arrive at the premium to be charged or discounts to be given.
  • Anomaly detection could be used to determine theft or destruction of property.
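The Naïve Bayes item above can be sketched as follows for categorical driving features. The features ("hard braking" and "night driving"), the training rows, and the use of Laplace smoothing are assumptions made for this example only.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts

def predict(model, row, n_values=2):
    """Return the class with the highest log-posterior for the given row."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)
        for i, value in enumerate(row):
            # Laplace smoothing; n_values is the number of values each feature can take.
            seen = feature_counts[(label, i)][value]
            log_prob += math.log((seen + 1) / (count + n_values))
        scores[label] = log_prob
    return max(scores, key=scores.get)

# Hypothetical driver records: (hard braking, night driving), each "often" or "rare".
rows = [("often", "often"), ("often", "rare"), ("often", "often"),
        ("rare", "rare"), ("rare", "rare"), ("rare", "often")]
labels = ["risky", "risky", "risky", "safe", "safe", "safe"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("often", "often")), predict(model, ("rare", "rare")))
```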

10.2.4.5 Traffic

Metrics to Optimize.

Traffic is a very important metric to be monitored, especially in big cities. Traffic data could be obtained via sensors in cars, data from mobile phones, tracking devices on people, and so forth. Machine‐learning algorithms could be used to predict traffic, identify traffic bottlenecks, detect accidents, or even predict accidents.

Machine‐Learning Algorithms
  • DBSCAN could be used to identify roads and intersections that have high traffic.
  • Naïve Bayes could be used to identify if a road needs maintenance or whether it is susceptible to accidents.
  • Decision trees could be used to divert users onto a less‐trafficked road.
  • Anomaly detection could be used to determine if there is an accident on the road.
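As a sketch of the DBSCAN item above, the following compact implementation groups vehicle positions into dense regions and marks isolated positions as noise. The coordinates and the eps and min_pts values are hypothetical, and the quadratic neighbor search is only suitable for small examples.

```python
def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, so keep expanding
    return labels

# Hypothetical vehicle positions (km): two busy intersections and one isolated car.
positions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
             (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (10.0, 0.0)]
print(dbscan(positions, eps=0.2, min_pts=3))
```

Unlike K‐means, the number of clusters does not have to be chosen in advance, which suits a road network where the number of congested spots is unknown.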

10.2.4.6 Smart City – Citizens and Public Places

Metrics to Optimize.

In a smart city, it is essential to optimize facilities for citizens. Based on data from smartphones, ATMs, vending machines, traffic cameras, bus/train terminals, or other tracking devices, machine‐learning algorithms can predict the travel patterns of people and the density of population at certain places, predict abnormal behaviors, and forecast energy consumption and the need for public infrastructure such as housing, transportation, shopping, and more.

Machine‐Learning Algorithms
  • DBSCAN could be used to identify places in the city that have high concentrations of people during different times of the day.
  • Linear regression or Naïve Bayes could be used to forecast energy consumption or the need for improvement of public infrastructure.
  • CART could be used for real‐time passenger travel prediction as well as to identify travel patterns.
  • Anomaly detection could be used to determine unusual behavior like terrorism or financial fraud.
  • PCA could be used to reduce the number of dimensions to simplify analysis since the sheer volume of data generated by multiple devices in a city is huge.
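The PCA item above can be illustrated in two dimensions, where the principal axis has a closed form. This sketch is restricted to two features as a simplifying assumption; the sensor counts are invented, and real smart‐city data would need a general eigenvalue solver.

```python
import math

def pca_2d(points):
    """Return the first principal axis and the share of variance it explains."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        axis = (b, largest - a)
    else:
        axis = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(axis[0], axis[1])
    return (axis[0] / norm, axis[1] / norm), largest / (a + c)

# Hypothetical hourly counts from two strongly correlated sensors at one location.
counts = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
axis, explained = pca_2d(counts)
print(axis, explained)
```

When one axis explains almost all of the variance, as it does here, the two readings can be replaced by a single projected value with little loss of information.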

10.2.4.7 Smart Homes

Metrics to Optimize.

Smart homes are one area where IoT devices have increased multifold in the past decade. They are equipped with smart meters to monitor energy, devices like Nest and Ecobee to control temperature automatically and remotely, smart bulbs like Philips Hue that could be automated and controlled remotely, smart switches, fitness bands, smart locks, security cameras, and so forth. Multiple sensors and the amount and quality of generated data can be harnessed by machine‐learning algorithms to provide valuable insights like occupancy awareness, intrusion detection, gas leakage, energy consumption prediction, television viewing preferences and prediction, and so forth.

Machine‐Learning Algorithms
  • K‐means could be used to analyze load and consumption frequency of energy.
  • Linear regression or Naïve Bayes could be used to forecast energy consumption or predict occupancy.
  • Anomaly detection could be used to detect intrusion, tampering with devices, burglary, device malfunction, and so forth.

10.2.4.8 Agriculture

Metrics to Optimize.

As the demand for food increases with rise in population, large‐scale farms are beginning to use sensors in the fields, drones to take pictures, and other IoT devices to be able to optimize resource usage, detect crop diseases faster, and predict production. AgTech (agriculture technology) is a growing field of active research.

Machine‐Learning Algorithms
  • Naïve Bayes could be used to determine if a crop is healthy or not.
  • Anomaly detection could be used to determine if there is a water leak or an uneven supply of water.
  • Neural networks could be used to analyze pictures taken by drones to identify weed growth or if patches in the field are growing slower than others.

In many ways, machine learning and IoT have a symbiotic relationship. IoT provides machine learning with large amounts of data, and machine learning is revolutionizing IoT by making simple devices much smarter. In an article about machine learning revolutionizing IoT, Ahmed [43] mentions three ways in which ML is changing IoT:

  1. Making IoT data useful
  2. Making IoT more secure
  3. Expanding the scope of IoT

In the next section, we review how machine learning is making IoT more secure.

10.3 Survey of ML Techniques for Defending IoT Devices

10.3.1 Systematic Categorization of ML Solutions for IoT Security

In the previous section, we reviewed many use cases where machine‐learning algorithms were used for IoT. Some of the key tasks, such as discovering a pattern in existing data, detecting outliers, predicting values, and extracting features, are critical to IoT security. Some of the machine‐learning algorithms used for these tasks are tabulated in Table 10.2.

Table 10.2 Categorization of ML solutions for IoT security.

Use case ML algorithm
Pattern discovery
Discovery of unusual data points
Prediction of values and categories
Feature extraction

In most papers studied in this research, the main objective has been to detect a security breach. Hence, the second point in Table 10.2 becomes very critical from a security perspective. From the point of detecting outliers, the use cases can be further divided into the following:

  • Malware detection
  • Intrusion detection
  • Data anomaly detection

Since anomaly detection is essentially a classification problem, the most used machine‐learning techniques are those commonly used in classification. These include decision trees, Bayesian networks, Naïve Bayes, random forests, and support vector machines (SVM). In many newer instances, artificial neural networks (ANNs) have been used. ANNs are generally not used for malware detection since they take longer to train. Machine‐learning algorithms for these use cases are tabulated in Table 10.3.

Table 10.3 Categorization of ML solutions for outlier detection.

Use case ML algorithm
Malware detection
Intrusion detection
Anomaly detection

The next section reviews examples of machine‐learning algorithms used for the use cases in Table 10.3 by summarizing results from a research paper on each of the machine‐learning algorithms.

10.3.2 Examples of ML Algorithms for IoT Security

10.3.2.1 Malware Detection Using SVM

In their paper on Android malware detection using linear SVM, Ham et al. [46] review various approaches for detecting malware, such as signature‐based, behavior‐based, and taint‐analysis‐based detection, and show that linear SVM performed well among the ML algorithms used to detect malware. In a behavior‐based detection system, event information on the device, such as memory usage, data content, and energy consumption, is monitored in order to detect abnormal patterns. ML techniques are used to analyze the data, and hence the choice of features is very important.

10.3.2.2 Malware Detection Using a Random Forest

In their paper on Android malware detection using a random forest, Alam et al. [47] apply the random forest (RF) ensemble learning algorithm to an Android feature dataset of 48,919 points with 42 features each. Their goal was to measure the accuracy of random forests in classifying Android applications as malicious or benign based on their behavior. They also analyzed how the detection accuracy changed as the parameters of the RF algorithm, such as the number of trees, the depth of each tree, and the number of random features, were varied. The results, based on fivefold cross‐validation, showed that RF performed very well, with an accuracy of over 99% in general, an optimal out‐of‐bag (OOB) error rate of 0.0002 for forests with 40 trees or more, and a root mean squared error of 0.0171 for 160 trees.
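The fivefold cross‐validation procedure mentioned above can be sketched as follows. This is not the authors' code: the majority‐class classifier is only a stand‐in for the random forest, the ten labels are invented, and the folds are contiguous for readability, whereas in practice the data would normally be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k roughly equal, contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validate(samples, labels, train_fn, predict_fn, k=5):
    """Average accuracy of a classifier over k held-out folds."""
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Stand-in classifier (hypothetical): always predicts the most common training label.
def train_majority(samples, labels):
    return max(set(labels), key=labels.count)

def predict_majority(model, sample):
    return model

samples = list(range(10))
labels = ["benign"] * 8 + ["malicious"] * 2
accuracy = cross_validate(samples, labels, train_majority, predict_majority)
print(accuracy)
```

Every sample is used for testing exactly once, so the averaged accuracy is a less optimistic estimate than accuracy measured on the training data.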

10.3.2.3 Intrusion Detection Using PCA, Naïve Bayes, and KNN

In their paper on anomaly‐based intrusion detection, Pajouh et al. [48] present a novel model for intrusion detection based on a two‐layer dimension reduction and a two‐tier classification module, designed to detect malicious activities such as user‐to‐root (U2R) and remote‐to‐local (R2L) attacks. Their proposed model used PCA and linear discriminant analysis (LDA) to reduce the high‐dimensional dataset to a lower‐dimensional one with fewer features. They then applied a two‐tier classification module utilizing Naïve Bayes and a certainty‐factor version of K‐nearest neighbor to identify suspicious behaviors.
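As a rough illustration of the nearest‐neighbor tier, the sketch below implements plain K‐nearest neighbor with majority voting. It is not the certainty‐factor variant used in the paper, and the two‐dimensional feature vectors stand in for hypothetical outputs of the dimension‐reduction step.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical reduced feature vectors for network connections.
train_points = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
                (0.90, 0.80), (0.80, 0.90), (0.85, 0.95)]
train_labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(knn_predict(train_points, train_labels, (0.82, 0.88)))
```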

10.3.2.4 Anomaly Detection Using Classification

In their paper for designing an IoT device for the safety of women, Jatti et al. [49] describe the design of a device that determines whether the wearer is in danger. The device transmits data related to physiology and body position of the person. The physiological signals that are transmitted are galvanic skin response (GSR) and body temperature. Body position is determined by acquiring raw accelerometer data from a triple axis accelerometer. The premise is that when a person is faced with a dangerous situation, secretion of adrenalin affects different systems in the body, resulting in increased blood pressure and heart rate and also sweating. This increases skin conductance, measured by GSR. The data are analyzed by an ML classifier that determines if the individual is in a dangerous situation, such as threat of rape.

10.3.3 Use of Artificial Neural Networks (ANN) to Forecast and Secure IoT Systems

Before the data get to the Internet and into the cloud, they could come from two kinds of IoT devices – edge devices or gateway devices. In general terms, when we refer to the billions of IoT devices that are gathering information, we mean edge devices, which in themselves are dumb devices programmed to do a specific simple task, say measuring temperature. In comparison to edge devices, gateway devices have more resources and computing power. Hence, instead of focusing on security configurations at every edge device, one could focus on gateway devices to have a larger impact. In fact, in Neural Network Approach to Forecast the State of the Internet of Things Elements [50], Kotenko et al. describe the use of artificial neural networks to predict the state of an IoT element and argue that this could reduce the labor costs of IoT administration. Here there is an implicit acknowledgment that security configuration at the edge is labor‐intensive. The approach in the paper combined a multilayer perceptron network with a probabilistic neural network. The experiments revealed that by using the multilayer perceptron network to explore similar values in the past, one could use a probabilistic neural network to determine the state of the device.

Canedo et al. [51] propose using machine learning within an IoT gateway to help secure the system. The proposal was to use an ML technique, specifically an ANN, in the gateway and application layers: in the gateway to monitor subsystem components, and in the application layer to monitor the state of the entire system. After setting up the system with training data and warming it up, the researchers manipulated the sensors to add invalid data for a 10‐minute period. When the invalid data were run against the system, the neural network was able to detect the differences between the valid and invalid data. They then added the delay between transmissions as a third input to simulate man‐in‐the‐middle attacks, and the network was able to predict whether the data were valid or invalid for the approximately 360 samples in the testing set. They concluded that the use of ANNs is very beneficial for making an IoT system more secure.

10.3.4 New Flavors of Attacks on IoT Devices

Although in the past hacking into a device to steal data, snooping to determine the information at the remote end, and so forth, were common types of attacks, the attacks in recent times have changed the landscape for IoT and put IoT devices as the leading potential cause for bringing the Internet down. In the article “Someone Is Learning How to Take Down the Internet” [52], Bruce Schneier says that based on the analysis of recent attacks, the attacker may not be the traditionally assumed types like activists, researchers, or criminals. The attack could be state‐sponsored, and the world might be embarking on an era of cyber warfare. Here are some recent examples of IoT malware attacks from Perry [53].

10.3.4.1 Mirai

This DDoS attack is covered in Section 12.3.4. It took down half the Internet in the United States and Europe for hours. Mirai scans the Internet for hosts with an open telnet port and gains access if the password is weak. After it gets inside, it installs the malware and monitors the CNC (command and control) center. During the attack, the CNC instructs all the bots to create a flood of traffic and overwhelm the target. Perry [53] suggests that to protect the devices, one should take the following measures:

  • Always change the default password.
  • Remove devices with telnet backdoors.
  • Limit exposing a device directly to the Internet.
  • Run port scans of all the devices.
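The measures above lend themselves to a simple automated audit of a device inventory. The sketch below is one possible shape for such a check; the inventory format, the device names, and the short list of default credentials are all hypothetical, and it inspects recorded configuration rather than scanning the network.

```python
# Illustrative only: a real audit would use a maintained list of vendor defaults.
DEFAULT_CREDENTIALS = {("admin", "admin"), ("root", "root"), ("admin", "1234")}
TELNET_PORT = 23

def audit_device(device):
    """Return a list of findings for one inventory record."""
    findings = []
    if (device["username"], device["password"]) in DEFAULT_CREDENTIALS:
        findings.append("default password in use")
    if TELNET_PORT in device["open_ports"]:
        findings.append("telnet port 23 is open")
    if device["internet_facing"]:
        findings.append("directly exposed to the Internet")
    return findings

# Hypothetical inventory records.
inventory = [
    {"name": "camera-01", "username": "admin", "password": "admin",
     "open_ports": [23, 80], "internet_facing": True},
    {"name": "thermostat-02", "username": "home", "password": "x9!kQ2#v",
     "open_ports": [443], "internet_facing": False},
]
for device in inventory:
    print(device["name"], audit_device(device))
```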

10.3.4.2 Brickerbot

This bot makes the device under attack unusable, i.e. turns it into a brick. Once the malware obtains access to the device, it runs a series of commands to wipe data from the device's storage. This renders the device useless.

10.3.4.3 FLocker

FLocker (short for Frantic Locker) is a bot that locks the target device and prevents valid users from accessing it. Users could be asked to pay ransom or might lose access to the device and may have to hard‐delete all data. Norton Security [54] has noted its use for targeting Android Smart TVs.

10.3.4.4 Summary

In summary, IoT attacks are increasing, and new variants of the attacks are created often. A report from F5 Labs [55] shows that IoT attacks exploded by 280% in the first half of 2017, with a large chunk of this growth stemming from Mirai. Moreover, the report claims that 83% of attacks came from a single hosting provider in Spain called SoloGigabit that had a “bulletproof” reputation.

10.3.5 Proposal for Effective ML Techniques to Achieve IoT Security

10.3.5.1 Insights from the Research

Based on the research done on ML techniques used for IoT security, it is evident that different techniques need to be used for different scenarios. There is no one‐size‐fits‐all solution because of the complexity of the problem statement. Also, anomalies in data can occur at different layers in the IoT ecosystem. Multiple devices could be hacked, resulting in wrong access patterns or data dispatch, or a gateway could be hacked, affecting how data are routed. This would mean that the training system could get incomplete data or different types of data. In these cases, classic ML algorithms might fail to operate: SVM needs standardized numerical data as input, and a decision tree cannot traverse a branch when values are missing. In these cases, the best option is ensemble machine learning.

The other insight that came out of the research is that there are increasing use cases where IoT data must be analyzed as they are streamed, and decisions must be taken quickly. This means that the data cannot wait to be sent to the cloud and processed. Hence, new paradigms like fog computing and edge computing are more relevant for IoT security than others. Table 10.4 shows the characteristics of data in the smart city use cases mentioned in Mahdavinejad et al. [41]; it is clear that many use cases need data to be processed near the device for a quicker turnaround.

Table 10.4 Where data should be processed.

Use case                        Type of data               Where it is best to be processed
Smart Traffic                   Stream/massive data        Edge
Smart Health                    Stream/massive data        Edge/cloud
Smart Environment               Stream/massive data        Cloud
Smart Weather Prediction        Stream data                Edge
Smart Citizen                   Stream data                Cloud
Smart Agriculture               Stream data                Edge/cloud
Smart Home                      Massive/historical data    Cloud
Smart Air Controlling           Massive/historical data    Cloud
Smart Public Place Monitoring   Historical data            Cloud
Smart Human Activity Control    Stream/historical data     Edge/cloud

To summarize the insights:

  1. IoT devices and data are diverse and need different machine‐learning algorithms to analyze different aspects of the system.
  2. IoT data need to be analyzed closer to the device than in the cloud.

10.3.5.2 Proposals

Proposal #1.

Use an ensemble machine‐learning method for IoT data analysis in the cloud. An ensemble method uses multiple machine‐learning algorithms to obtain better predictive performance than could be obtained from any single algorithm alone. It would also perform much better with different types of data and with missing data. Figure 10.5 depicts the general idea behind ensemble machine learning.

Flow diagram depicting Ensemble machine learning from Training examples to combined classifiers and prediction.

Figure 10.5 Ensemble machine learning.
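A minimal sketch of the ensemble idea in Proposal #1 is a majority vote over several classifiers, where a classifier abstains when the feature it needs is missing. The three rule‐based classifiers, their thresholds, and the field names below are hypothetical stand‐ins for trained models.

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Combine classifier outputs, ignoring any classifier that abstains (returns None)."""
    votes = [clf(sample) for clf in classifiers]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

# Hypothetical classifiers; each one abstains when its feature is missing.
def by_packet_rate(sample):
    rate = sample.get("packets_per_s")
    if rate is None:
        return None
    return "attack" if rate > 1000 else "normal"

def by_failed_logins(sample):
    failures = sample.get("failed_logins")
    if failures is None:
        return None
    return "attack" if failures > 5 else "normal"

def by_payload_size(sample):
    size = sample.get("payload_bytes")
    if size is None:
        return None
    return "attack" if size > 1500 else "normal"

ensemble = [by_packet_rate, by_failed_logins, by_payload_size]
complete = {"packets_per_s": 5000, "failed_logins": 20, "payload_bytes": 64}
partial = {"packets_per_s": 5000}  # two of the three features are missing
print(majority_vote(ensemble, complete), majority_vote(ensemble, partial))
```

Because each member only votes when it has the data it needs, the ensemble still produces a prediction for the partial record, which is the robustness to missing data that the proposal relies on.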

Proposal #2.

Use fog computing for data analysis closer to the edge. This would mean that decisions could be taken faster. Also, it would be more relevant to the device or groups of devices serviced by the fog computing node.

It is with this intent that the next two sections are entirely focused on fog computing and machine‐learning algorithms used in fog computing use cases.

10.4 Machine Learning in Fog Computing

10.4.1 Introduction

As noted earlier, the amount of data generated by IoT devices is expected to soar to 400 zettabytes by 2018 and grow exponentially every year. There are multiple issues with a cloud‐only architecture where data from IoT devices make it to the cloud to be processed and analyzed:

  • Network traffic congestion. By 2020, there will be over 50 billion things connected to the Internet, and if processing of the data happens in the cloud, there would be network congestion and data may not get to the server and back fast enough.
  • Data bottleneck. If data storage and analysis is done only in the cloud, there could be a bottleneck if the server is slower in analyzing due to the volume of the data or for other reasons.
  • Security issues. Since data must travel through multiple layers from sensors to gateways to services to the cloud, there are numerous points of a breach. Also, a security solution in the cloud may address issues that are common to most devices and may not be able to take care of specific sensors or nodes on the edge.
  • Data staleness. In many instances, data loses its value when it cannot be analyzed fast enough. Security cameras, phones, cars, ATMs, and so forth, could generate data that need immediate analysis if there is a security or a privacy issue.

Fog computing solves this by selectively moving compute, storage, and decision‐making closer to the network edge where data are being generated. The OpenFog Reference Architecture for fog computing defines fog computing as “A horizontal, system‐level architecture that distributes computing, storage, control and networking functions closer to the users along a cloud‐to‐thing continuum” [56]. Essential characteristics of fog computing platforms include low latency, location awareness, and wired or wireless access. There are numerous benefits to this:

  • Real‐time analytics. As IoT usage grows, so does the number of scenarios where real‐time analytics is needed (e.g. a security camera capturing a potential intruder lurking in front of a home, or a fraudster gaining access to someone's account). By the time data get uploaded to the cloud and analyzed, it may be too late. These scenarios need the near‐instant intelligence that fog computing provides.
  • Improved security. Since fog is nearer to the edge, it has the capability to configure security that is tailored to the devices and their functions. Also, security decisions regarding whether to block access during a breach can be taken almost instantaneously.
  • Data thinning at the edge. Fog consumes the raw data and makes decisions or provides insights. It sends only relevant, consolidated information upward in the hierarchy. This dramatically reduces the amount of data that gets transmitted to a central data center.
  • Cost savings. Fog may have higher setup costs due to distributed nature of deployments, but operational costs and long‐term benefits of the overall system would outweigh this.
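The data‐thinning benefit listed above can be sketched as a fog node that summarizes a window of raw readings and forwards only the summary plus any out‐of‐range values. The window contents, the field names, and the alert thresholds are assumptions for illustration.

```python
def thin_readings(readings, low, high):
    """Summarize a window of raw readings and keep only the out-of-range values."""
    return {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "min": min(readings),
        "max": max(readings),
        "alerts": [r for r in readings if r < low or r > high],
    }

# Hypothetical temperature window (degrees C) collected by one fog node.
window = [21.0, 21.5, 22.0, 35.0, 21.5]
summary = thin_readings(window, low=15.0, high=30.0)
print(summary)
```

Only the small summary dictionary travels upward in the hierarchy, while the raw window can be discarded or archived locally.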

10.4.2 Machine Learning for Fog Computing and Security

One of the main advantages of fog computing is the ability to do near real‐time analytics, and in many cases, this means utilizing machine learning at the fog nodes.

We could find many examples from the case studies reviewed in Section 10.4.3 where machine learning could be used. One example could be in industries where machine learning could help in fault isolation and fault detection of machines and thus improve the MTTR (mean time to repair) of a failed system to achieve higher availability. Another example could be a train station in a smart city, where machine learning could be used to optimize operations by monitoring occupancy, movement, and overall system usage over time. More examples are reviewed in the next section.

At the fog nodes, analytics can be both reactive and predictive. The fog nodes closer to the edge will most likely perform reactive analytics, and the nodes farther from the edge will perform more predictive analytics, since prediction needs more computation power. The basic premise is that computing power is highest in the cloud and decreases down the hierarchy (see Section 10.4.4 on the n‐tier architecture). Machine‐learning algorithms can be run at fog nodes that have the processing power required for the task at that layer (see Table 10.5). Machine‐learning models are created at the nodes near the cloud or in the cloud itself. The models could then be downloaded to middle‐tier nodes for execution.

Table 10.5 ML Use cases for fog computing.

Use case                                                                      ML algorithm
Fog computing in industry – Remote monitoring for oil & gas operations [57]   Anomaly detection models, predictive models, optimization methods
Fog computing in retail – Retail customer behavior analysis [57]              Statistical methods, time series clustering
Fog computing in self‐driving cars [57]                                       Image processing, anomaly detection, reinforcement learning

10.4.3 Examples of Machine Learning in Fog Computing

10.4.3.1 ML in Fog Computing in Industry

Traditional cloud‐based or noncloud centralized analytics infrastructures rely on training a machine learning algorithm by using data from past failures. The algorithm would create a model that could be used to predict failure. But in many instances, failure prediction is too late to prevent the breakdown and is used to minimize the effect of damage. In comparison, if near‐instant analytics is done locally using fog computing, the system would be able to take steps to prevent the occurrence of the issue. That is because the analytics system is nearer to the edge and has more context.

10.4.3.2 ML in Fog Computing in Retail

Retail stores, in general, do product placement based on analytics derived from customer purchases and also seasonal preferences. So, we see product placements change during Halloween, Thanksgiving, Christmas, and so forth. If fog computing is used with analytics being done for a store or a group of stores in an area, the system would be able to analyze buying patterns of the users in the locality and help the store to target merchandise better and improve customer experience.

10.4.3.3 Fog Computing for Self‐Driving Cars

With Google, Tesla, Uber, GM, and other mainstream companies testing self‐driving cars, the reality of having these vehicles for mainstream use cases is very near. Self‐driving automobiles are excellent examples of fog computing, since a lot of computing and decision‐making happens on the edge. Nevertheless, each car transmits a lot of data for processing in the cloud. An N‐tier model would make the system considerably more efficient. The machine‐learning algorithms used include ANN for image processing, Naïve Bayes or similar algorithms for anomaly detection, reinforcement learning, and so forth.

10.4.4 Machine Learning in Fog Computing Security

Tang et al. [58] present a hierarchical structure for fog computing architecture to support the integration of a massive number of infrastructure components and services in future smart cities. The architecture laid out in the paper is a four‐layer model, with the first layer being the cloud and the last being the sensors. The layers in between are the fog layers. Figure 10.6 shows the different layers and the primary security handling at each layer.

Schematic diagram depicting a fog computing security at four layers: very large scale, city; very small scale, sensor; very high latency, days/years; very low latency, milliseconds.

Figure 10.6 Fog computing security at multiple layers.

Table 10.6 Machine‐learning algorithms at different fog layers.

Layer                                      Disaster response                                        ML algorithm
Layer 4 – Sensors                          None                                                     None
Layer 3 – Fog nodes for the neighborhood   Response for anomaly                                     KNN, Naïve Bayes, random forest, DBSCAN
Layer 2 – Fog nodes for the community      Response for hazardous event                             HMM, MAP [58]; regression, ANN, decision trees
Layer 1 – Cloud                            Response for city‐wide disaster, long‐term forecasting   ANN, deep learning, decision trees, reinforcement learning, Bayesian networks

Layer 3 contains fog nodes that get raw data from the sensors. The nodes at this layer perform two functions. One identifies potential threat patterns on the incoming data streams from sensors using machine‐learning algorithms, and the other performs feature extraction for reducing the amount of data to be sent upstream. The paper [58] does not specify how anomaly detection is done. Algorithms like KNN, Naïve Bayes, random forests, or DBSCAN could be used to do anomaly detection.

Layer 2 contains fog nodes that get data from nodes below them, and the data represent information from hundreds of sensors across locations. In the paper, HMM (hidden Markov model) and MAP (maximum a posteriori) algorithms are used for classification and to raise an alert if there is a hazardous event. Table 10.6 summarizes the machine‐learning algorithms at each fog layer.
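To illustrate the MAP decision rule in isolation, the sketch below picks the class with the highest posterior probability given one discretized observation. The priors, likelihoods, and class names are made‐up numbers for illustration and are not taken from the paper [58], which combines MAP with an HMM.

```python
def map_classify(priors, likelihoods, observation):
    """Return the class maximizing prior * likelihood, plus the normalized posteriors."""
    unnormalized = {c: priors[c] * likelihoods[c][observation] for c in priors}
    total = sum(unnormalized.values())
    posteriors = {c: p / total for c, p in unnormalized.items()}
    return max(posteriors, key=posteriors.get), posteriors

# Hypothetical model: P(class) and P(sensor level | class).
priors = {"normal": 0.8, "hazard": 0.2}
likelihoods = {
    "normal": {"low": 0.9, "high": 0.1},
    "hazard": {"low": 0.2, "high": 0.8},
}
decision, posteriors = map_classify(priors, likelihoods, "high")
print(decision, posteriors)
```

With these numbers, a "high" reading outweighs the low prior of a hazardous event, so the node would raise an alert.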

10.4.5 Other Machine‐Learning Algorithms for Fog Computing

Section 10.3.1 categorized ML solutions for IoT security into pattern discovery, anomaly detection, value/label prediction, and feature extraction. We reviewed essential ML algorithms like K‐means, DBSCAN, Naïve Bayes, random forest, CART, PCA, and so forth. We also took a deep dive into anomaly detection use cases, focusing specifically on malware and intrusion detection. All these use cases and examples apply to fog computing: malware detection using SVM [46], malware detection using random forest [47], and intrusion detection [48] can all be done in the fog nodes instead of in the cloud. In fact, the work on anomaly detection using ANN by Kotenko et al. [50] specifically discusses doing machine learning at the gateway layer, which is equivalent to doing it at a mid‐tier fog node.

In conclusion, fog computing can make the IoT ecosystem more secure by being more contextual, detecting issues faster, and reacting more quickly to events.

10.5 Future Research Directions

As discussed, the application of machine learning is critical to IoT security due to the volume and variety of data. AI and ML are fast‐growing fields, and IoT data analysis needs to be on par with the latest trends in these areas. Our review of numerous machine‐learning techniques and several examples in IoT points to the fact that analyzing data in near real‐time in proximity to the node is important. Hence, research is needed on machine‐learning algorithms that need less memory and can process large amounts of time series data quickly.

We could categorize future research directions as follows:

  • Use of the latest trends in AI and ML for IoT security
  • ML algorithms for fog computing security, focused on techniques that use less memory and can process large amounts of data quickly
  • ML algorithms in new areas of IoT sensor development across multiple industries
  • ML algorithms to analyze healthcare data, with a specific focus on WIBSNs (wireless and implantable body sensor networks)

10.6 Conclusions

In this chapter, we covered a range of topics: an introduction to IoT, IoT architecture, IoT security and privacy concerns, fog computing, machine learning for IoT security, and machine learning in IoT security through fog computing. In each section, we defined the concept and then expanded on the topic with references and examples.

First, we introduced the concept of the Internet of Things (IoT), common IoT devices, IoT architecture with a focus on the four‐layer architecture, and IoT applications, especially in the healthcare domain. With various examples, we showed how IoT devices have become ubiquitous and have pervaded almost every sphere of our lives, ushering in an era of smart things. Then we reviewed critical security and privacy issues with IoT devices and the ecosystem. With examples such as hacks of water treatment plants, a nuclear power plant, baby monitor videos, and wearable devices, we showed the seriousness of the security issue. We used DDoS (distributed denial of service) as an example to show how IoT devices have been used to cripple the Internet and bring down essential services for people in different parts of the world. Then we briefly surveyed machine learning and commonly used machine‐learning algorithms, and delved into examples of machine learning used in IoT.

We looked at examples such as smart homes, smart medical appliances, smart power grids, the Roomba vacuum, and Tesla. We then reviewed use cases by domain, such as manufacturing, healthcare, and utilities, and gave examples of ML algorithms in each. Next, we focused on machine‐learning techniques for IoT security. By reviewing several papers and websites, we categorized the fundamental ML tasks used in defending IoT systems and then summarized a few papers on machine learning for IoT security, with a focus on malware detection, intrusion detection, and anomaly detection. Finally, we concluded that bringing computing closer to the edge and using ensemble learning techniques could provide a reliable defense against attacks on IoT devices. We also concluded that fog computing is a critical emerging field within the IoT domain and that the machine‐learning algorithms used in fog nodes are critical to the success and scalability of IoT.

References

  1. IBM Electronics. The IBM vision of a smart home enabled by cloud technology, December 2010. https://www.slideshare.net/IBMElectronics/15‐6212631. Accessed September 2017.
  2. M.Cousin, T.Castillo‐Hi, G.H.Snyder. Devices and diseases: How the IoT is transforming MedTech. Deloitte Insights (2015, September). https://dupress.deloitte.com/dup‐us‐en/focus/internet‐of‐things/iot‐in‐medical‐devices‐industry.html. Accessed September 2017.
  3. S.Wende and C.Smyth. The new Minnesota smart bridge. http://www.mnme.com/pdf/smartbridge.pdf. Accessed September 2017.
  4. D.Cardwell. Grid sensors could ease disruption of power. The New York Times (2015, February). https://www.nytimes.com/2015/02/04/business/energy‐environment/smart‐sensors‐for‐power‐grid‐could‐ease‐disruptions.html. Accessed September 2017.
  5. K.J.Wakefield. How the Internet of Things is transforming manufacturing. Forbes (2014, July). https://www.forbes.com/sites/ptc/2014/07/01/how‐the‐internet‐of‐things‐is‐transforming‐manufacturing. Accessed September 2017.
  6. Cisco. Cisco global cloud index: forecast and methodology, 2015–2020, 2016. https://www.cisco.com/c/dam/m/en_us/service‐provider/ciscoknowledgenetwork/files/622_11_15‐16‐Cisco_GCI_CKN_2015‐2020_AMER_EMEAR_NOV2016.pdf. Accessed September 2017.
  7. T.Barnett Jr. The dawn of the zettabyte era [infographic], 2011. http://blogs.cisco.com/news/the‐dawn‐of‐the‐zettabyte‐era‐infographic. Accessed September 2017.
  8. D.Worth. Internet of things to generate 400 zettabytes of data by 2018, November 2014. http://www.v3.co.uk/v3‐uk/news/2379626/internet‐of‐things‐to‐generate‐400‐zettabytes‐ofdata‐by‐2018. Accessed September 2017.
  9. J.Leyden. Water treatment plant hacked, chemical mix changed for tap supplies. The Register (2016, March). http://www.theregister.co.uk/2016/03/24/water_utility_hacked. Accessed September 2017.
  10. K.Zetter. Everything we know about Ukraine's power plant hack. Wired (2016, January). https://www.wired.com/2016/01/everything‐we‐know‐about‐ukraines‐power‐plant‐hack. Accessed September 2017.
  11. P.Paganini. Hacking baby monitors is dramatically easy, September 2015. http://securityaffairs.co/wordpress/39811/hacking/hacking‐baby‐monitors.html. Accessed September 2017.
  12. A.Tillin. The surprising way your fitness data is really being used. Outside (2016, August). https://www.outsideonline.com/2101566/surprising‐ways‐your‐fitness‐data‐really‐being‐used. Accessed September 2017.
  13. L.Cox. Security experts: hackers could target pacemakers. ABC News (2010, April). http://abcnews.go.com/Health/HeartFailureNews/security‐experts‐hackers‐pacemakers/story?id=10255194. Accessed September 2017.
  14. S.Morgan. Hackerpocalypse: a cybercrime revelation, 2016. https://cybersecurityventures.com/hackerpocalypse‐cybercrime‐report‐2016/. Accessed September 2017.
  15. IBM Analytics. The IBM Point of View: Internet of Things security. (2015, April). https://www‐01.ibm.com/common/ssi/cgi‐bin/ssialias?htmlfid=RAW14382USEN. Accessed October 2017.
  16. OWASP. IoT attack surface areas. (2015, November). https://www.owasp.org/index.php/IoT_Attack_Surface_Areas. Accessed November 2017.
  17. Hewlett Packard. Internet of things research study, 2015. http://www8.hp.com/h20195/V2/GetPDF.aspx/4AA5‐4759ENW.pdf. Accessed March 10, 2016.
  18. L.Zanolli. Welcome to privacy hell, also known as the Internet of Things. Fast Company (2015, March 23). http://www.fastcompany.com/3044046/tech‐forecast/welcome‐to‐privacy‐hell‐otherwise‐known‐as‐the‐internet‐of‐things. Accessed March 24, 2016.
  19. J.Schectman. Internet of Things opens new privacy litigation risks. The Wall Street Journal (2015, January 28). http://blogs.wsj.com/riskandcompliance/2015/01/28/internet‐of‐things‐opens‐new‐privacy‐litigation‐risks. Accessed March 24, 2016.
  20. B.Violino. Benetton to Tag 15 Million Items. RFiD Journal (2003, March). http://www.rfidjournal.com/articles/view?344. Accessed March 23, 2016.
  21. FTC. FTC Report on Internet of Things urges companies to adopt best practices to address consumer privacy and security risks (2015, January 27). https://www.ftc.gov/news‐events/press‐releases/2015/01/ftc‐report‐internet‐things‐urges‐companies‐adopt‐best‐practices. Accessed March 24, 2016.
  22. A. F.Westin. Privacy and freedom. Washington and Lee Law Review, 25(1): 166, 1968.
  23. J. H.Ziegeldorf, O. G.Morchon, K.Wehrle. Privacy in the internet of things: Threats and challenges. Security and Communication Networks, 7(12): 2728–2742, 2014.
  24. Keycdn. DDoS Attack. (2016, July). https://www.keycdn.com/support/ddos‐attack/. Accessed October 2017.
  25. Ddosbootcamp. Timeline of notable DDOS events. https://www.ddosbootcamp.com/course/ddos‐trends. Accessed October 2017.
  26. J.Hamilton. Dyn DDOS Timeline. (2016, October). https://cloudtweaks.com/2016/10/timeline‐massive‐ddos‐dyn‐attacks. Accessed October 2017.
  27. P.Stancik. At least 15% of home routers are unsecured. (2016, October). https://www.welivesecurity.com/2016/10/19/least‐15‐home‐routers‐unsecure/. Accessed October 2017.
  28. S.Cobb. 10 things to know about the October 21 IoT DDoS attacks. (2016, October). https://www.welivesecurity.com/2016/10/24/10‐things‐know‐october‐21‐iot‐ddos‐attacks/. Accessed October 2017.
  29. L.Ulanoff. 73,000 webcams left vulnerable because people don't change default passwords. (2014, November). http://mashable.com/2014/11/10/naked‐security‐webcams. Accessed October 2017.
  30. M.Warner. Senators Introduce Bipartisan Legislation to Improve Cybersecurity of “Internet‐of‐Things” (IoT) Devices. (2017, August). https://www.warner.senate.gov/public/index.cfm/2017/8/enators‐introduce‐bipartisan‐legislation‐to‐improve‐cybersecurity‐of‐internet‐of‐things‐iot‐devices. Accessed November 2017.
  31. M.Warner. Internet of Things Cybersecurity Improvement Act of 2017 (2017, August). https://www.scribd.com/document/355269230/Internet‐of‐Things‐Cybersecurity‐Improvement‐Act‐of‐2017. Accessed November 2017.
  32. WaterISAC. 10 Basic Cybersecurity Measures. (2015, June). https://ics‐cert.us‐cert.gov/sites/default/files/documents/10_Basic_Cybersecurity_Measures‐WaterISAC_June2015_S508C.pdf. Accessed November 2017.
  33. US‐CERT. Heightened DDoS threat posed by Mirai and other botnets. (2016, October). https://www.us‐cert.gov/ncas/alerts/TA16‐288A. Accessed November 2017.
  34. A.L.Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 44(1–2): 206–226, 2000.
  35. SAS. Machine Learning: What it is and why it matters. https://www.sas.com/en_us/insights/analytics/machine‐learning.html. Accessed November 2017.
  36. P.N.Tan, M.Steinbach, and V.Kumar (2013). Introduction to Data Mining.
  37. J.Vincent. Google uses DeepMind AI to cut data center energy bills. (2016, July). https://www.theverge.com/2016/7/21/12246258/google‐deepmind‐ai‐data‐center‐cooling. Accessed November 2017.
  38. W.Knight. The Roomba now sees and maps a home. MIT Technology Review (2015, September 16). https://www.technologyreview.com/s/541326/the‐roomba‐now‐sees‐and‐maps‐a‐home/. Accessed October 2017.
  39. Nest Labs. Nest Labs introduces world's first learning thermostat. (2011, October). https://nest.com/press/nest‐labs‐introduces‐worlds‐first‐learning‐thermostat/. Accessed October 2017.
  40. K.Fehrenbacher. How Tesla is ushering in the age of the learning car (2015, October). http://fortune.com/2015/10/16/how‐tesla‐autopilot‐learns/. Accessed October 2017.
  41. M. S.Mahdavinejad, M.Rezvan, M.Barekatain, P.Adibi, P.Barnaghi, and A.P.Sheth. Machine learning for Internet of Things data analysis: A survey. Digital Communications and Networks, 4(3) (August): 161–175, 2018.
  42. P.Misra, A.Pal, P.Balamuralidhar, S.Saxena, and R.Sripriya. Unlocking the value of the Internet of Things (IoT) – A platform approach. White Paper, 2014.
  43. M.Ahmed. Three ways machine learning is revolutionizing IoT. (2017, October). https://www.networkworld.com/article/3230969/internet‐of‐things/3‐ways‐machine‐learning‐is‐revolutionizing‐iot.html. Accessed November 2017.
  44. A.M.Souza and J.R.Amazonas. An outlier detect algorithm using big data processing and Internet of Things architecture. Procedia Computer Science 52 (2015): 1010–1015.
  45. M.A.Khan, A.Khan, M.N.Khan, and S.Anwar. A novel learning method to classify data streams in the Internet of Things. In National Software Engineering Conference (NSEC), November 2014: 61–66.
  46. H.S.Ham, H.H.Kim, M.S.Kim, and M.J.Choi. Linear SVM‐based android malware detection for reliable IoT services. Journal of Applied Mathematics (2014).
  47. M.S.Alam and S.T.Vuong. Random forest classification for detecting android malware. In Green Computing and Communications (GreenCom), 2013 IEEE and Internet of Things (iThings/CPSCom), IEEE International Conference on and IEEE Cyber, Physical and Social Computing. (2013, August): 663–669.
  48. H. H.Pajouh, R.Javidan, R.Khayami, D.Ali, and K.K.R.Choo. A two‐layer dimension reduction and two‐tier classification model for anomaly‐based intrusion detection in IoT backbone networks. IEEE Transactions on Emerging Topics in Computing, 2016.
  49. A.Jatti, M.Kannan, R.M.Alisha, P.Vijayalakshmi, and S.Sinha. Design and development of an IOT‐based wearable device for the safety and security of women and girl children. In Recent Trends in Electronics, Information & Communication Technology (RTEICT), IEEE International Conference on (pp. 1108–1112), 2016, May. IEEE.
  50. I.Kotenko, I.Saenko, F.Skorik, S.Bushuev. Neural network approach to forecast the state of the Internet of Things elements. 2015 XVIII International Conference on Soft Computing and Measurements (SCM), 2015. doi:10.1109/scm.2015.7190434.
  51. J.Canedo and A.Skjellum. Using machine learning to secure IoT systems. 2016 14th Annual Conference on Privacy, Security and Trust (PST), 2016. doi:10.1109/pst.2016.7906930.
  52. B.Schneier. Someone is learning how to take down the Internet. (2016, September). https://www.lawfareblog.com/someone‐learning‐how‐take‐down‐internet. Accessed November 2017.
  53. J.S.Perry. Anatomy of an IoT malware attack. (2017, October). https://www.ibm.com/developerworks/library/iot‐anatomy‐iot‐malware‐attack/. Accessed November 2017.
  54. N.Kovacs. FLocker ransomware now targeting the big screen on Android smart TVs. (2016, June). https://community.norton.com/en/blogs/security‐covered‐norton/flocker‐ransomware‐now‐targeting‐big‐screen‐android‐smart‐tvs. Accessed November 2017.
  55. S.Boddy, K.Shattuck. The hunt for IoT: The Rise of Thingbots. (2017, August). https://f5.com/labs/articles/threat‐intelligence/ddos/the‐hunt‐for‐iot‐the‐rise‐of‐thingbots. Accessed November 2017.
  56. OpenFog Consortium Architecture Working Group. OpenFog Reference Architecture for Fog Computing. OPFRA001, 20817 (2017, February). 162.
  57. H.Vadada. Fog computing: Outcomes at the edge with machine learning. (2017, May). https://towardsdatascience.com/fog‐computing‐outcomes‐at‐the‐edge‐using‐machine‐learning‐7c1380ee5a5e. Accessed November 2017.
  58. B.Tang, Z.Chen, G.Hefferman, T.Wei, H.He, and Q.Yang. A hierarchical distributed fog computing architecture for big data analysis in smart cities. In Proceedings of the ASE BigData & SocialInformatics 2015 (p. 28). ACM.