In our little universe, we have the data for everyone, so we could easily create a rule to predict the next diabetic patient. In real-world applications, however, we do not have the complete dataset of all patients, and this is where the real power of ML comes in: it provides predictions even when our dataset does not contain all possible samples. For instance, suppose we delete the last two records from the example above. An ML algorithm would then process all the attributes of an incoming person's record and try to predict whether or not they will contract diabetes. This set of predictions is referred to as a model.
ML problems are often categorized under regression and classification.
Regression
Regression is used for predicting "continuous" outcomes. In regression, the answer to a question is a numeric value computed by the model, rather than one of a finite set of labels. If you search for "regression", you will find many statistics-oriented links, because regression is one of the fundamental branches of statistics, used for calculating the relationship between variables. In ML, it helps to calculate predictions for events by determining the relationship between the input variables in the dataset. Typically, a regression model adheres to the following form:
Prediction Outcome = Coefficient 1 + Coefficient 2 * Input
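As a minimal sketch, the prediction rule above can be written directly in Python. The coefficient values here are made up purely for illustration, not fitted to any real data:

```python
# Simple regression prediction: outcome = coefficient1 + coefficient2 * input.
# The coefficient values below are illustrative placeholders.
def predict(x, coef1=2.0, coef2=0.5):
    """Prediction Outcome = Coefficient 1 + Coefficient 2 * Input."""
    return coef1 + coef2 * x

print(predict(10.0))  # 2.0 + 0.5 * 10.0 = 7.0
```

In practice the two coefficients would be learned from training data rather than chosen by hand.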
Logistic Regression
The term "logistic" regression refers to the algorithm's primary function, the logistic function, also known as the sigmoid function. It comes from statistics, where it was used to describe population growth in ecology: growth that rises quickly and then levels off at the environment's carrying capacity. The function traces an S-shaped curve; it takes any real number as input and maps it to a value between 0 and 1.
Logistic regression is represented by an equation. The input values, represented by x, are combined with coefficient values (or "weights") to estimate the outcome, which is represented by y. Consider the following logistic regression equation:
y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
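The sigmoid function and the logistic regression equation built on it can be sketched in a few lines of Python (the function names here are our own, not from any library):

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return math.exp(z) / (1.0 + math.exp(z))  # equivalent to 1 / (1 + e^(-z))

def logistic_predict(x, b0, b1):
    """y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))"""
    return sigmoid(b0 + b1 * x)

print(logistic_predict(0.0, b0=0.0, b1=1.0))  # 0.5, the midpoint of the S-curve
```

Large positive inputs push the output toward 1, large negative inputs toward 0, which is what gives the curve its S shape.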
[Figure: A logistic regression model with inputs X1, X2, X3, weights θ1, θ2, θ3, and outputs "Happy" or "Sad".]
Chapter 10 Data Analytics and Machine Learning for IoT 255
Here y is the prediction result, b0 is the intercept (or bias), and b1 is the coefficient of the input x.
Logistic regression models the probability of the first class, also called the default class. For instance, if we are trying to predict a person's gender from height data, we may take male as the default class. In that case, we can write the probability formally as follows, where s = sex, m = male, and h = height:
P(s = m | h)
Note that what we are modeling is the probability that the input x belongs to the first class (y = 1):
P(X) = P(Y = 1 | X)
It is important to note that although logistic regression is a linear method, its predictions are transformed by the logistic function. As a result, it differs from linear regression in that the output can no longer be interpreted as a simple linear combination of the inputs.
The coefficient values must be estimated from the training data; for this purpose, maximum likelihood estimation is used. Once the coefficients are known, predictions are easy with logistic regression—you just plug the right values from the data into the equation. For instance, suppose we have a model that predicts whether a student is good at studying or not, based on their marks, and a student in the data has 40 marks. Given the coefficient values b0 = -18 and b1 = 0.4 (these are the values consistent with the computation below), we can estimate the probability of the student being deficient in academics, P(bad | marks = 40):
y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
y = e^(-18 + 0.4*40) / (1 + e^(-18 + 0.4*40))
y = 0.1192
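The worked example can be checked in Python. The coefficients b0 = -18 and b1 = 0.4 are the pair that reproduces the quoted probability of roughly 0.1192 for 40 marks:

```python
import math

def logistic(x, b0, b1):
    """y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))"""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

# b0 = -18 and b1 = 0.4 give z = -18 + 0.4*40 = -2, and sigmoid(-2) ~ 0.1192.
p = logistic(40, b0=-18, b1=0.4)
print(round(p, 4))  # 0.1192
```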
Now we have a probability, but how do we turn it into an assessment? For that we need a benchmark, i.e., a decision threshold. For instance, going by our benchmark, a student with a probability of more than 0.20 is classified as intelligent, and one with a probability of less than 0.20 as less intelligent.
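Applying the 0.20 benchmark from the text is a one-line thresholding step; the function and label strings below are our own illustration of that rule:

```python
def classify(probability, threshold=0.20):
    """Apply the 0.20 probability benchmark described in the text."""
    return "intelligent" if probability > threshold else "less intelligent"

print(classify(0.1192))  # the worked example falls below the benchmark
```

Changing the threshold trades one kind of misclassification for the other, which is why the benchmark is a modeling choice rather than part of the algorithm itself.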
Linear Regression
Like logistic regression, linear regression also comes from statistics, where it is used for determining the relationship between numerical input and output variables.
Linear regression follows a linear model; that is, there is a linear relationship between its input and output variables. The input is represented by x and the output by y. More precisely, the value of y is computed from the x values by a combination that is linear in nature.
When there is only a single input variable, the method is called simple linear regression. With multiple input variables, it is called multiple linear regression.
The linear regression equation assigns a scale factor to each input value. This factor is referred to as a coefficient and is represented by B (Beta). There is one additional coefficient, known as the bias coefficient.
For instance, a regression problem with a single input variable x (simple linear regression) takes the following equation:
y = B0 + B1*x
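For simple linear regression, the coefficients B0 and B1 can be estimated in closed form with ordinary least squares. The sketch below uses a tiny made-up dataset just to show the arithmetic:

```python
# Fit y = B0 + B1*x by ordinary least squares on illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# B1 = covariance(x, y) / variance(x);  B0 = mean_y - B1 * mean_x
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

print(f"y = {b0:.2f} + {b1:.2f}*x")
```

The same estimates are what a library routine (e.g., a least-squares fit in NumPy) would produce for this data.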