Note: Page numbers followed by f and t refer to figures and tables, respectively.
Agents, 11
Alice’s Adventures in Wonderland (Novel), 9
Alignment problem, 145
Antecedent, rule, 53
Apple’s Siri, 32
Artificial neural network (ANN), 90
Attribute selection measure, 41–46
information gain of ID3, 41–44
problem with information gain, 44–46
Automatic recognition of handwritten postal codes, 15–17
Autonomous robots, 20
Axon, 89
Backpropagation algorithm, 99–102
stages, 99
weights updates in neural network, 101–102
Bayes’ theorem
equation, 73
posterior probability, 74
Belief, stability of (Plato), 12–13
Bias, perceptron, 92
Biometric verification, GMM, 138
Brant, Kenneth F., 28
Building stage, JRip algorithm, 57
Business intelligence system, 23
C4.5 (Successor of ID3), 38
attribute selection measure, 41
Cellan-Jones, Rory, 26
transcript of conversation with Chatbot Rose, 189–192, 190f
Chatbot
transcript of conversations with, 189–192, 190f
conversation with human and machine, 26–27
by Eugene Goostman, 26
neural conversational model, 28
Class conditional independence, 74
Classifiers
k-NN
algorithm, 83
shortcoming of, 84
naïve Bayesian, 73
prior probability, 75
rule-based, 53
IF-THEN rules, 53
overview, 53
sequential covering algorithm, 54, 56f
visualization, 55
Clustering
k-means
method for polymorphic worms, 179
support vector, 116
Common weighting scheme, 84
Compute Array of Frequencies function, 183–184
Compute Principal Component function, 183, 185
Computer-aided diagnosis, 17–19
assisting doctors/radiologists in health problems, 19
classifier, examples, 18
Computers, 3
face recognition and security, 22
Conditional probabilities
after Laplace correction, 77
of likelihood, 76
Consequent, rule, 53
Continuous space K, 85
Covariance matrix, 154–155, 157
benefits into SVD, 158
ecoli dataset, 155t
evaluation, 177
“cov()” command of MATLAB, 110
Cross-validation technique, 14
Dataset/data
2D reduced, 163f
classification in DT, 37
cross-validation, 14
ecoli
covariance matrix of, 155t
variance of principal component, 160, 160t
function “infogain” to calculate entropy of, 47–48
in optimization stage, 71
information gain, 65
labeled
semi-supervised learning, 11
supervised learning, 8
machine learning, 7f
mining, 4
MPCA contributions in polymorphic worms detection
normalization of data, 176
projection of data, adjusting, 178
significant data determination, 175–176
naïve Bayesian classifier, 74–75
nonlinearly separable, 126f
for pruning grown rules, 67, 69
in optimization stage, 71
reconstruction error, 160–161, 161t
unlabeled
semi-supervised learning, 10
Decision tree (DT), 37
attribute selection measure, 41–46
information gain of ID3, 41–44
problem with information gain, 44–46
data classification, 37
entropy
concept of number of bits, 40–41
Shannon, 38
MATLAB®, implementation in, 46–52
Deep Fritz (chess program), 31
Dendrites, 89
Dimensionality reduction, SVD and, 157–158
Discrete and finite space K, 85
Double-honeynet system, 167–168, 171
Driverless cars, Toyota, 20–21, 22f
Durant, Will, 1
D-variate Gaussian function, 137
Ecoli data
covariance matrix of, 155t
variance of principal component, 160, 160t
Eigenvalue evaluation, 177
Entropy
concept of number of bits, 40–41
Shannon, 38
Epoch, 94
Eugene Goostman (chatbot), 25
Expectation maximization (EM) algorithm, iterative, 138
Expected value of information, 38
Feature descriptor (FD), 178
Feed-forward stage, 99
Finite space K, 85
Fisher, Ronald, 107
Function “infogain,” 52
Gates, Bill, 14
Gaussian kernel, 126
Gaussian mixture model (GMM), 137–142
applications, 138
clustering using, 138
equation, 137
Gaussian distributions
clusters corresponding to, 142f
means and variances of, 139t, 140f
mixed, 141f
Google, 17
Google Now, 32
Gradient descent method, 101
Grow phase, JRip algorithm, 57
Handwriting detection, 24
Handwritten postal codes, automatic recognition of, 15–17
Hidden Markov model (HMM), 17
parameters, 148f
problems of, 145
Hold out testing/validation, 14
Huffman code, 40
ID3 (Iterative Dichotomiser 3), 38
Successor of ID3 (C4.5), 38
attribute selection measure, 41
IDS, see Intrusion detection system (IDS)
IF-THEN rules, rule-based classifiers, 53
Information gain
formula for, 66
measure
drawback, 46
formula of, 42
splitinfo, 41
rule growing process using, 60–65
Instance-based learning, 84
Internet worms, 167
Intrusion detection system (IDS), 169
Iris database, 127
Irwin, Terence, 12
Iterative expectation maximization (EM) algorithm, 138
JRip algorithm
building stage, 57
Kasparov, Garry, 30
Kernel
Gaussian, 126
trick, 115
k-fold cross-validation, 14
k-means clustering
method for polymorphic worms, 179
KMP, see Knuth–Morris–Pratt (KMP) algorithm
k-nearest neighbors (k-NN) classifiers, 131–132
algorithm, 83
shortcoming of, 84
Knowledge Discovery from Data (KDD), 4
Knuth–Morris–Pratt (KMP) algorithm, 168, 170–171, 173
Kramnik, Vladimir, 31
Kuhn–Tucker theorem, 123
Labeled data
semi-supervised learning, 11
supervised learning, 8
Lazy learning, 84
Learning, see specific learning
Likelihood probability, 73, 75–76, 78
code, 80
conditional probabilities, 76
Linear discriminant analysis (LDA), 107–113
overview, 107
Linear kernel function, 125, 127
Linearly separable data, 116, 117f
Loebner Prize, 189
Luhn, H.P., 23
Machine(s)
conversation with human and, 26–27
prediction (2014 and 2015), 30
strategic technologies, 29, 29f
support vector machines, see Support vector machines (SVM)
Machine learning algorithms, 4–7, 37, 115
automatic recognition of handwritten postal codes, 15–17
computer-aided diagnosis, 17–19
goal of, 6
Apple’s Siri, 32
Google Now, 32
techniques and required data, 7f
MATLAB®
code
covariance matrix calculation, 110–111
naïve Bayesian classification, 79–82
perceptron training and testing algorithms, 94–96
McCulloch, Warren, 90
Means of Gaussian distributions, 139t, 140f
Minimum description length (MDL), 57, 70, 72
Modified Knuth–Morris–Pratt (MKMP) algorithm, 170, 172–174
SEA and PCA, 168
testing quality of generated signature, 174
Modified PCA (MPCA), polymorphic worms detection, 174–179
clustering method for worms, 179
covariance matrix evaluation, 177
eigenvalue evaluation, 177
frequency counts determination, 175
normalization of data, 176
principal component evaluation, 177–178
projection of data, adjusting, 178
significant data determination, 175–176
Naïve Bayesian classification, 73
prior probability, 75
Nearest centroid classifier, see Rocchio algorithm
Neural conversational model, 28
Neural network
error histogram, 104f
validation performance, 104f
Nonlinear kernel function, 126–127
Nonlinearly separable data, 126f
Optical character recognition (OCR) technology, 15–17, 16f
Optimization stage
dataset for growing rule, 71
dataset for pruning rules, 71
Overall variability, PCA, 154
Pattern (string), 169
Pattern recognition, 4
computer-aided diagnosis, 17–18
HMM, 145
k-NN algorithm, 83
OCR technology, 17
PCA, see Principal component analysis (PCA)
Perceptron
training and testing algorithm, MATLAB implementation, 94–96
PerceptronTesting function, 95–96
Pitts, Walter, 90
Plato on stability of belief, 12–13
Plato’s Ethics (Book), 12
The Pleasures of Philosophy (Book), 1
Polymorphic worms detection using PCA, 167–187
clustering method for worms, 179
quality testing of generated signature, 178–179
proposed SEA, 171–172, 172f, 172t
SEA, MKMP, and PCA, 168
signature generation algorithms pseudo-codes, 179–187
MKMP algorithm pseudo-code, 181–183
quality testing of generated signature, 186–187
SEA pseudo-code, 180
testing quality of generated signature, 174
Posterior probability, 73, 78–79
formula, 78
requirements, 78
using Bayes’ theorem, 74
reinforcement learning, 11
semi-supervised learning, 10–11
labeled data, 11
unlabeled data, 10
categories, 8
labeled data, 8
validation and evaluation, 11–14
Principal component analysis (PCA)
2D reduced data, 163f
3D reduced space, 163f
dataset shape, 156f
SVD and dimensionality reduction, 157–158
data reconstruction error, 160–161, 161t
principal components selection, 159–160, 159t
polymorphic worms detection using, 167–187
proposed SEA, 171–172, 172f, 172t
SEA, MKMP, and PCA, 168
signature generation algorithms pseudo-codes, 179–187
testing quality of generated signature, 174
purpose of using, 157
Prior probability, 75
Laplace estimator, 77
Prune phase, JRip algorithm, 57
Pruning metric, 68
Pseudo-codes, signature generation algorithms, 179–187
MKMP algorithm pseudo-code, 181–183
quality testing of generated signature, 186–187
SEA pseudo-code, 180
Quadratic programming (QP) problem, 121–122
Quality testing of generated signature, 168
MKMP algorithm, 174
Radial basis function (RBF) kernels, 126, 127f
Reconstruction error, 160–161, 161t
Reduced error pruning (REP), 55
Reinforcement learning, 11
Replacement rule, 70
Revised rule, 70
Ripper (repeated incremental pruning to produce error reduction), 55–72
building stage, 57
RoboCup (Robot Soccer World Cup), 19–20, 20f
Robots, autonomous, 20
Rocchio algorithm, 132
Romeo and Juliet (Play), 3
Rosenblatt, Frank, 90
Rose (Chatbot), transcript of conversation with, 189–192, 190f
Rule antecedent, 53
Rule-based classifiers, 53
IF-THEN rules, 53
overview, 53
sequential covering algorithm, 54, 56f
visualization, 55
Rule consequent, 53
Rule growing process, 60–65
attributes and possibilities, 60
SAS Institute Inc., 4
Scoring problem, 145
SEA, see Substring extraction algorithm (SEA)
Semi-supervised learning, 10–11
labeled data, 11
unlabeled data, 10
Sequential covering algorithm, 54, 56f
Shannon entropy, 38
Signaturefile function, 181–183
Signature generation algorithms pseudo-codes, 179–187
MKMP algorithm pseudo-code, 181–183
quality testing of generated signature, 186–187
SEA pseudo-code, 180
Sign function, 91
Singular value decomposition (SVD), 157–158
benefits of covariance matrix into, 158
and dimensionality reduction, 157–158
Siri (speech interpretation and recognition interface), Apple’s, 32
Smart machines
prediction (2014 and 2015), 30
strategic technologies, 29, 29f
Social media, 23
Soma, 89
Speaker identification, GMM, 138
Splitinfo (measure of information gain), 41, 49–52
Statistical Analysis System (SAS), 4
Step function, 91
Substring extraction algorithm (SEA), 168, 170
pseudo-code, 180
Successor of ID3 (C4.5), 38
attribute selection measure, 41
Supervised learning, 7–8; see also Unsupervised learning
algorithms
naïve Bayesian classification, 73–82
categories, 8
labeled data, 8
Support vector clustering, 116
Support vector machines (SVM)
case of nonlinear kernel, 126
Support vectors, 119, 119f, 121
SVD, see Singular value decomposition (SVD)
Svmtrain function, 127
TestingData, 127
Text (string), 169
Three-dimensional (3D) hyperplane, 116, 117f
Three-dimensional reduced space, 163f
Top-down induction of decision trees (TDIDTs), 37
Toyota, driverless cars, 20–21, 22f
Training problem, HMM, 145
Transcript of conversation with Chatbot Rose, 189–192, 190f
Two-dimensional (2D) random data, 155, 156f
Two-dimensional reduced data, 163f
Unlabeled data
semi-supervised learning, 10
Unsupervised learning, 9–10; see also Supervised learning
algorithms
Variances of Gaussian distributions, 139t, 140f
Visualization, rule-based classifiers, 55
Watson, Thomas J., 31
Weka
principal component methods in, 163–167
Writing style detection, 24