A control chart shows how a system or process behaves over time. It is a graph that plots one or more variables of the system against time, and the information it provides can be used for quality control in manufacturing and business processes. When only one variable is plotted against time, the chart is called a univariate control chart; when more than one variable is plotted, it is called a multivariate control chart.
In this chapter, we will work with the synthetic control chart time series dataset provided by the UCI Machine Learning Repository. Each control chart belongs to one of the following categories:
Normal, Cyclic, Increasing trend, Decreasing trend, Upward shift, and Downward shift.
Each control chart is a record of 60 columns, each holding a decimal value. There are 100 records for each category. Further details about the dataset can be found at http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data.html.
We will use 80 of the 100 records in each category to train a clustering model, and then use the remaining 20 records of each category to predict their categories. For this, we will use the K-means clustering algorithm provided by Trident-ML.
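To make the 80/20 idea concrete before any Storm code is involved, here is a minimal standalone sketch in plain Python. It is not Trident-ML's API: the function names (`split_per_category`, `kmeans`, `nearest_centroid`) are hypothetical, the K-means shown is a plain Lloyd's iteration with random initial centroids, and the split assumes the records are ordered in consecutive blocks of 100 per category.

```python
import random


def split_per_category(records, per_category=100, train_count=80):
    """Split records into training and test lists.

    Assumes the records are ordered in consecutive blocks of
    `per_category` rows, one block per category.
    """
    train, test = [], []
    for start in range(0, len(records), per_category):
        block = records[start:start + per_category]
        train.extend(block[:train_count])
        test.extend(block[train_count:])
    return train, test


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random (an assumption of this sketch).
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids
```

With the real dataset, one would call `split_per_category` on the 600 parsed rows, run `kmeans(train, 6)`, and assign each test row with `nearest_centroid`. Note that K-means returns cluster indices, not category names, so the clusters still have to be matched to categories by inspecting which training records fall into each one.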
Before going ahead with the producer, we need to download the dataset from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data. Save this file so that it can be used later for training and testing.
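The download and a quick sanity check of the file can be sketched as follows. This is only an illustration under stated assumptions: the local file name `synthetic_control.data` and the helper names are hypothetical, and the parser assumes each line holds whitespace-separated decimal values.

```python
import urllib.request

DATA_URL = ("http://archive.ics.uci.edu/ml/databases/"
            "synthetic_control/synthetic_control.data")


def download_dataset(target_path="synthetic_control.data", url=DATA_URL):
    """Download the raw file to `target_path` (requires network access)."""
    urllib.request.urlretrieve(url, target_path)
    return target_path


def parse_records(text):
    """Turn whitespace-separated text into a list of rows of floats.

    Blank lines are skipped; the whitespace-separated layout is an assumption.
    """
    return [[float(value) for value in line.split()]
            for line in text.splitlines() if line.strip()]


def load_records(path):
    """Read a saved dataset file and parse it into rows of floats."""
    with open(path) as handle:
        return parse_records(handle.read())
```

After calling `download_dataset()`, `load_records("synthetic_control.data")` should yield rows of 60 values each if the file matches the description above; checking the row count and row lengths is a cheap way to confirm the download is intact.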