Linear regression explained

Let's discuss the basic premise behind what Machine Learning is and what it attempts to accomplish. Take a look at the following chart that shows some fictional sales data for your next app:

Chart of fictional sales data

Now, just looking at the chart, you can see that as the x values increase (perhaps days on sale), the y values (our sales) appear to increase as well. By just eyeing the chart, we ourselves can make predictions by following the trend of the points. Try it: how many sales would you expect at an x value (bottom axis) of 25? Give it a guess, and write it down. With your guess secured, we will use a technique called linear regression to find a good answer.

Linear regression has been around for well over a century and is considered the basis for many statistical data analysis methods, as well as for many other Machine Learning algorithms used in data science and predictive analysis today. This technique works by finding a solution (a line, curve, or whatever) that best fits the points. From that solution, we can estimate future or previous events or occurrences. Since this method is so well established, you can just open up Excel and let it draw the linear regression solution right on the graph. The following is an example of the linear regression with a trend line and equation added to the chart:

Chart with linear regression trend line
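To make the trend line less of a black box, here is a minimal Python sketch of how such a line can be computed with ordinary least squares. The `days` and `sales` values are made up for illustration and are not the data behind the chart, and `fit_line` is a hypothetical helper name, not something taken from Excel or any library.

```python
# Hypothetical data (NOT the values from the chart): days on sale and units sold.
days = [1, 5, 10, 15, 20]
sales = [12, 30, 48, 80, 95]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(days, sales)

# Use the fitted line to predict sales at x = 25, the guess asked for earlier.
predicted_sales_at_25 = slope * 25 + intercept
print(f"trend line: y = {slope:.3f}x + {intercept:.3f}")
print(f"predicted sales at x = 25: {predicted_sales_at_25:.1f}")
```

With your own data in place of the hypothetical lists, the printed equation should play the same role as the equation Excel adds next to its trend line.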

Keep in mind that this example uses 2D points, but the same concepts apply equally to 3D. You just need to account for the extra dimension, which is not always trivial but is doable nonetheless.
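As a rough illustration of accounting for the extra dimension, the following sketch fits a plane z = a*x + b*y + c to 3D points instead of a line to 2D points. Everything here is an assumption made for the example: the second input (ad spend), the data values, and the helper names `solve_linear_system` and `fit_plane`. It solves the least-squares normal equations directly, which is fine for a toy case but is not how a production library would necessarily do it.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations touch both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column (partial pivoting).
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution, from the last unknown up to the first.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to a list of (x, y, z) points."""
    rows = [[x, y, 1.0] for x, y, _ in points]
    zs = [z for _, _, z in points]
    # Normal equations: (A^T A) w = A^T z, where each row of A is [x, y, 1].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve_linear_system(ata, atz)


# Hypothetical data: (days on sale, ad spend, sales).
points = [(1, 2, 14), (5, 1, 29), (10, 4, 55), (15, 3, 78), (20, 6, 101)]
a, b, c = fit_plane(points)
print(f"fitted plane: z = {a:.3f}x + {b:.3f}y + {c:.3f}")
```

The idea is the same as in 2D: pick the coefficients that minimize the squared error between the fitted surface and the points. Only the bookkeeping grows.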

Without getting into the nitty-gritty details of the math, just understand that the line is drawn in order to minimize the error between the line and the points, which is why it is often referred to as the line of best fit. How well the line fits is expressed, in this case, as an R squared value (R²). For a fit like this one, R² ranges in value from 1.0, the best possible fit, down to 0.0, which is no better than always guessing the average, or shooting blanks in the dark. Our R² is 0.9125 out of 1, meaning the line accounts for roughly 91.25% of the variation in the sales values; it's not perfect but perhaps good enough.
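For readers who do want a peek at the math, R² can be computed as 1 minus the ratio of the squared error left over by the line to the squared error of simply guessing the average. The sketch below assumes that standard definition. The data and the trend line coefficients are invented for illustration, and `r_squared` is a hypothetical helper name.

```python
def r_squared(ys, predictions):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    # Error remaining after using the line's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # Error from always guessing the average of y.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data and a hypothetical trend line y = 4.5x + 8.
days = [1, 5, 10, 15, 20]
sales = [12, 30, 48, 80, 95]
predictions = [4.5 * x + 8 for x in days]

print(f"R squared: {r_squared(sales, predictions):.4f}")
```

A line that passes through every point gives exactly 1.0, while a flat line at the average gives exactly 0.0, which matches the two ends of the range described above.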

Probability and statistics play heavily into Machine Learning of all forms. If you don't have a good statistics background, you can still get the benefit of the statistics by relying on a third-party provider. The only exception is when you run into issues with that technology; then, it helps to have some background of your own, which is probably not something you wanted to hear if you're already trying to catch up on your 3D math skills.

Take the example we just looked at and now think about the problem in 3D, where it's not a line but a 3D object we want to recognize or predict. Obviously, things can get complicated and computationally expensive quite fast using statistical models. Fortunately, there is a better way to do this: a supervised learning technique loosely modeled on the human brain, called neural networks (NN).

In the next section, we will go under the covers into supervised learning and explore some techniques that we can use to analyze data using deep learning (DL) with neural networks.
