Chapter 7

Linear Discriminant Analysis

7.1 Introduction

In 1936, the statistical pioneer Ronald Fisher introduced the linear discriminant [1], which became a common method in statistics, pattern recognition, and machine learning. The idea is to find a linear combination of features that separates two or more classes. The resulting linear combination can also be used for dimensionality reduction. Linear discriminant analysis (LDA) is a generalization of Fisher's linear discriminant.

This method was used to explain the bankruptcy or survival of firms [2]. In face recognition problems, it is used to reduce dimensionality.

LDA seeks to maximize class discrimination and produces exactly as many linear functions as there are classes.

The predicted class for an instance will be the one that has the highest value for its linear function.
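This decision rule can be sketched as follows (a NumPy illustration with made-up weights, not the chapter's MATLAB code): each class gets one linear function of the features, and the predicted class is the one with the largest value.

```python
import numpy as np

# Hypothetical weight matrix: one row per class; column 0 is the
# intercept and the remaining columns multiply the features.
W = np.array([[-2.0, 1.0, 0.5],
              [-1.0, 0.3, 0.9]])

def lda_predict(x, W):
    """Return the index of the class whose linear function is largest."""
    scores = W[:, 0] + W[:, 1:] @ x  # one linear score per class
    return int(np.argmax(scores))

x = np.array([2.0, 3.0])
predicted = lda_predict(x, W)  # class 1 wins: -1 + 0.3*2 + 0.9*3 = 2.3 beats 1.5
```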

7.2 Example

Let us say we want to predict the type of smartphone a customer would be interested in. Different smartphones will be the classes and the known data related to customers will be represented by x.

To construct a two-class problem, we will define the two classes as “Apple” and “Samsung.”

C = {“Apple”, “Samsung”}

In our practical implementation, we will represent Apple as 0 and Samsung as 1, i.e., C = {0, 1}.

The two numeric variables that will be considered to predict the classes are age and income of customers. The variable x1 will represent the age of the customer and x2 will represent the income of the customer.

µi is the vector describing the mean age and mean income of the customers of smartphones of type i, and Σi is the covariance matrix of age and income for type i.

We will randomly generate the data for 25 customers in order to understand how the classification works using discriminant analysis.

X1_Apple_Age = round(30 + randn(10,1)*5);          % Average age of Apple buyers assumed 30, std 5
X1_Samsung_Age = round(45 + randn(15,1)*10);       % Average age of Samsung buyers assumed 45, std 10
X2_Apple_income = round(10000 + randn(10,1)*2000); % Average income of Apple buyers assumed $10000, std $2000
X2_Samsung_income = round(5000 + randn(15,1)*500); % Average income of Samsung buyers assumed $5000, std $500
X1 = [X1_Apple_Age; X1_Samsung_Age];
X2 = [X2_Apple_income; X2_Samsung_income];
X = [X1 X2];

To assign the class to the 25 records, we will simply use the following MATLAB® code:

Y = [zeros(10,1); ones(15,1)]; % First 10 rows get the value 0 (Apple), last 15 rows get 1 (Samsung)

To visualize the above data, the following MATLAB code can be used:

scatter(X(1:10,1), X(1:10,2), 'r+')   % red + marks the first group (Apple)
hold on;
scatter(X(11:25,1), X(11:25,2), 'b^') % blue ^ marks the second group (Samsung)

The above code produces Figure 7.1.

To perform discriminant analysis, we will first initialize a few variables that will be used in the discrimination process.

[rows, columns] = size(X); % Number of rows and columns of the input data

Labels = unique(Y);        % Labels contains the two unique values of Y
k = length(Labels);        % k is the number of classes

% Initialize
nClass = zeros(k,1);                 % Class counts
ClassMean = zeros(k, columns);       % Class sample means
PooledCov = zeros(columns, columns); % Pooled covariance
Weights = zeros(k, columns+1);       % Model coefficients

Figure 7.1 Result of the MATLAB code showing the two classes.

In order to calculate the weights used for classification, we first have to calculate the mean vectors and the covariance matrices. The covariance matrix of each group of data can be calculated simply with MATLAB's cov() command.

The following MATLAB code describes mean and covariance matrix calculation of the two groups of data.

Group1 = (Y == Labels(1)); % i.e., class equal to 0
Group2 = (Y == Labels(2)); % i.e., class equal to 1

% Group1 and Group2 are Boolean arrays of 1s and 0s.
% To count how many items are in each group, we
% convert them to numbers and then sum them.

numGroup1 = sum(double(Group1));
numGroup2 = sum(double(Group2));

MeanGroup(1,:) = mean(X(Group1,:)); % Mean vector for class 0
MeanGroup(2,:) = mean(X(Group2,:)); % Mean vector for class 1

Cov1 = cov(X(Group1,:)); % Covariance matrix for class 0
Cov2 = cov(X(Group2,:)); % Covariance matrix for class 1
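The same per-class means and covariances can be computed in NumPy; the sketch below uses a small made-up data set (not the chapter's random data) so the numbers are easy to check by hand. Note that np.cov with rowvar=False matches MATLAB's cov(), normalizing by n - 1.

```python
import numpy as np

# Toy stand-in for the chapter's [age, income] data
X = np.array([[30.0, 10000.0],
              [32.0,  9000.0],
              [45.0,  5000.0],
              [50.0,  4500.0]])
Y = np.array([0, 0, 1, 1])

group1 = (Y == 0)                       # Boolean mask for class 0
group2 = (Y == 1)                       # Boolean mask for class 1

mean1 = X[group1].mean(axis=0)          # [31.0, 9500.0]
mean2 = X[group2].mean(axis=0)          # [47.5, 4750.0]
Cov1 = np.cov(X[group1], rowvar=False)  # sample covariance of class 0
Cov2 = np.cov(X[group2], rowvar=False)  # sample covariance of class 1
```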

In order to illustrate the calculation, it is helpful to show the original data along with the associated mean vectors and covariance matrices:

The mean age and income for Class 0 are as follows:

30    9721

Similarly, mean age and income for Class 1 is given as follows:

47.066    4984.53

The variable MeanGroup holds these two mean vectors in the form of a matrix:

MeanGroup =

     30        9721
     47.066    4984.53

The two variables Cov1 and Cov2 hold the covariance matrices of the data belonging to the two classes:

Cov1 =

      20.8888888888889       4196.88888888889
    4196.88888888889      2530221.11111111

Cov2 =

     35.9238095238095        722.104761904762
    722.104761904762      214130.266666667

Rather than using two separate covariance matrices, we will pool the data and estimate a common covariance matrix for all classes (a standard technique in the machine learning literature). The following MATLAB code performs the calculation.

PooledCov = (numGroup1-1)/(rows-k) .* Cov1 + (numGroup2-1)/(rows-k) .* Cov2

The pooled covariance matrix combines 9/23 of Cov1 with 14/23 of Cov2:

PooledCov =

      30.0405797101449       1202.71884057971
    1202.71884057971      1120426.68405797
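As a quick check of the pooling formula (a NumPy sketch using the class covariance matrices shown earlier), each class covariance is weighted by (n_i - 1)/(n - k):

```python
import numpy as np

n1, n2, k = 10, 15, 2   # group sizes and number of classes
n = n1 + n2

Cov1 = np.array([[20.8888888888889, 4196.88888888889],
                 [4196.88888888889, 2530221.11111111]])
Cov2 = np.array([[35.9238095238095, 722.104761904762],
                 [722.104761904762, 214130.266666667]])

# Weighted average: (10-1)/23 of Cov1 plus (15-1)/23 of Cov2
PooledCov = (n1 - 1)/(n - k) * Cov1 + (n2 - 1)/(n - k) * Cov2
```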

We also have to calculate the prior probabilities of the two groups.

PriorProb1 = numGroup1/rows; % Prior probability of class 0
PriorProb2 = numGroup2/rows; % Prior probability of class 1

Variable PriorProb1 will be simply calculated as 10/25 or 0.4, whereas the second variable PriorProb2 will have a value of 15/25 or 0.6.

Now, with all of the above calculations in place, we want to calculate the weights that will be used for classification purposes. The following MATLAB code calculates the weights for us.

Weights(1,1) = -0.5*(MeanGroup(1,:)/PooledCov)*MeanGroup(1,:)' + log(PriorProb1);
Weights(1,2:end) = MeanGroup(1,:)/PooledCov;

Weights(2,1) = -0.5*(MeanGroup(2,:)/PooledCov)*MeanGroup(2,:)' + log(PriorProb2);
Weights(2,2:end) = MeanGroup(2,:)/PooledCov;
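The code above implements the standard LDA discriminant function: with pooled covariance Σ, class mean µk, and prior probability πk, the linear score of class k is

δk(x) = µk' Σ^(-1) x - (1/2) µk' Σ^(-1) µk + log(πk)

The first column of Weights stores the constant term -(1/2) µk' Σ^(-1) µk + log(πk), and the remaining columns store the coefficient vector µk' Σ^(-1); in MATLAB, MeanGroup(k,:)/PooledCov computes µk' Σ^(-1) by right division.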

The above code yields the following values for the weight matrix:

                 W0                   W1                   W2
Class 0   -71.5217983501704    1.40645733832605     0.0101859165812994
Class 1   -59.3830490712585    1.82324057744366     0.00640593376510738

These are the weights that LDA will use for classification.
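To illustrate how such weights classify a new record, the following sketch (in NumPy, using the weight values above and a hypothetical customer aged 30 with a $10,000 income) evaluates both linear functions and picks the larger:

```python
import numpy as np

# Weight matrix from the chapter: rows are classes (0 = Apple, 1 = Samsung),
# columns are [intercept, age coefficient, income coefficient]
W = np.array([[-71.5217983501704, 1.40645733832605, 0.0101859165812994],
              [-59.3830490712585, 1.82324057744366, 0.00640593376510738]])

x = np.array([30.0, 10000.0])       # hypothetical customer: age 30, income $10000
scores = W[:, 0] + W[:, 1:] @ x     # one linear score per class
predicted = int(np.argmax(scores))  # 0 predicts Apple, 1 predicts Samsung
```

Here the Apple score (about 72.5) exceeds the Samsung score (about 59.4), so the customer is assigned class 0, which is consistent with the averages assumed when the data were generated.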

References

1. Fisher, R. A. The use of multiple measurements in taxonomic problems, Annals of Eugenics, vol. 7, 179–188, 1936.

2. Altman, E. I. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, The Journal of Finance, vol. 23, issue 3, 589–609, 1968.
