IoT data analytics and machine learning comparison and assessment

Machine learning algorithms have their place in IoT. The typical case is a plethora of streaming data from which some meaningful conclusion must be drawn. A small collection of sensors in a latency-sensitive application may need only a simple rule engine on the edge, as sketched below. Systems with less-aggressive latency demands may instead stream data to a cloud service and apply rules there. When large amounts of data, unstructured data, and real-time analytics come into play, we need to consider machine learning to solve some of the hardest problems.
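
The following is a minimal sketch of such a rule engine running at the edge; the sensor fields and threshold values are hypothetical placeholders, not a prescribed schema.

```python
# A minimal sketch of a threshold-based rule engine that could run on an
# edge gateway. Field names and limits below are hypothetical.

def evaluate_rules(reading: dict) -> list:
    """Return the alerts triggered by a single sensor reading."""
    alerts = []
    if reading.get("temperature_c", 0.0) > 80.0:   # hypothetical limit
        alerts.append("over-temperature")
    if reading.get("vibration_g", 0.0) > 2.5:      # hypothetical limit
        alerts.append("excess-vibration")
    return alerts

# One reading streamed from a local sensor bus.
print(evaluate_rules({"temperature_c": 85.2, "vibration_g": 0.4}))
# -> ['over-temperature']
```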

In this section, we detail some tips and reminders for deploying machine learning analytics, and which use cases may warrant such tools.

Training phase: 

  • For random forests, use bagging techniques to create ensembles (see the sketch after this list).
  • When using a random forest, ensure you maximize the number of decision trees.
  • Watch for overfitting. Overfitting will lead to inaccurate field models. Techniques such as regularization and even injecting noise into the system will help the model generalize.
  • Don't train on the edge.
  • Gradient descent can lead to error: vanishing and exploding gradients make RNNs naturally difficult to train.
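
As a concrete illustration of the first three points, the sketch below trains a random forest offline with scikit-learn: bagging of bootstrap samples is built into the estimator, n_estimators sets the number of decision trees, and max_depth limits overfitting. The data set is synthetic and the parameter values are illustrative assumptions, not recommendations.

```python
# Offline (not on the edge) training of a random forest ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labeled IoT telemetry.
X, y = make_classification(n_samples=5000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(
    n_estimators=500,   # a large ensemble of decision trees
    max_depth=12,       # cap tree depth to limit overfitting
    bootstrap=True,     # bagging: each tree trains on a bootstrap sample
    n_jobs=-1,
    random_state=0,
)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```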

Model in field:

  • Update the model with new data sets as they become available. Keep the training set current.
  • Models running on the edge can be reinforced with larger and more comprehensive models in the cloud.
  • Neural network execution can be optimized in the cloud and at the edge with minimal loss of accuracy by considering techniques such as pruning nodes and reducing precision (a sketch follows this list).
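
To illustrate the last point, the sketch below applies two common compression steps to a raw weight matrix: magnitude-based pruning and precision reduction to 16-bit floats. It uses plain NumPy on a random matrix, so the layer size and pruning threshold are illustrative assumptions rather than values tied to any particular network.

```python
# Illustrative model compression for edge deployment: magnitude pruning
# followed by precision reduction, applied to a NumPy weight matrix.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(1024, 1024)).astype(np.float32)  # stand-in layer

# Prune: zero out the 90% of weights with the smallest magnitude.
threshold = np.quantile(np.abs(weights), 0.90)
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

# Reduce precision: keep the surviving weights as 16-bit floats.
compact = pruned.astype(np.float16)

density = np.count_nonzero(compact) / compact.size
print(f"non-zero weights remaining: {density:.1%}")
print(f"bytes: {weights.nbytes} -> {compact.nbytes} (before sparse encoding)")
```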

The following comparison summarizes each model family in terms of its best applications, worse fits and side effects, resource demands, and training characteristics.

Random forests (statistical models)

  Best application:
  • Anomaly detection
  • Systems with thousands of choice points and hundreds of inputs
  • Regression and classification
  • Handles mixed data types
  • Ignores missing values
  • Scales linearly with input

  Worse fit and side effects:
  • Feature extraction
  • Time and sequence analysis

  Resource demands: Low

  Training:
  • Training based on bagging techniques for maximum effectiveness
  • Training is fairly resource-light
  • Mainly supervised

RNN (temporal and sequence-based neural networks)

  Best application:
  • Prediction of an event based on a sequence
  • Streaming data patterns
  • Time-correlated series data
  • Maintains knowledge of past states to predict new states (electrical signals, audio, speech recognition)
  • Unstructured data
  • Input variables may or may not be dependent

  Worse fit and side effects:
  • Image and video analysis
  • Systems requiring thousands of features

  Resource demands:
  • Very high for training
  • High for inference execution

  Training:
  • Training is more cumbersome than CNN backpropagation
  • Very hard to train
  • Supervised

CNN (deep learning)

  Best application:
  • Prediction of an object based on surrounding values
  • Pattern and feature identification
  • 2D image recognition
  • Unstructured data
  • Input variables may or may not be dependent

  Worse fit and side effects:
  • Time-based and sequential predictions
  • Systems requiring thousands of features

  Resource demands:
  • Very high for training (floating-point precision, large training sets, large memory demands)
  • High for inference execution

  Training: Supervised and unsupervised

Bayesian networks (probabilistic models)

  Best application:
  • Noisy and incomplete data sets
  • Streaming data patterns
  • Time-correlated series data
  • Structured data
  • Signal analysis
  • Models developed quickly

  Worse fit and side effects:
  • Assumes all input variables are independent
  • Performs poorly with high orders of data dimensions

  Resource demands: Low

  Training:
  • Little training data needed compared with artificial neural networks
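
To illustrate the last row, the following sketch fits a Gaussian naive Bayes model (the simplest form of Bayesian network, which assumes independent inputs as noted above) on a small synthetic data set with missing values. The imputation strategy and data set are assumptions for illustration; the point is how little data and compute such a probabilistic model needs.

```python
# Illustrative probabilistic model on a noisy, incomplete data set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Small synthetic data set with roughly 10% of the readings knocked out.
X, y = make_classification(n_samples=500, n_features=10, random_state=1)
rng = np.random.default_rng(1)
X[rng.random(X.shape) < 0.10] = np.nan

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Impute missing readings with the column mean, then fit the model.
model = make_pipeline(SimpleImputer(strategy="mean"), GaussianNB())
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

Training completes in a fraction of a second on commodity hardware, which reflects the low resource demands noted in the table.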