
Note: Page numbers followed by f and t refer to figures and tables, respectively.


Activation functions, 92f, 99

Agents, 11

Alice’s Adventures in Wonderland (Novel), 9

Alignment problem, 145

Antecedent, rule, 53

Apple’s Siri, 32

Artificial neural network (ANN), 90

Attribute selection measure, 41–46

information gain of ID3, 41–44

problem with information gain, 44–46

Automatic recognition of handwritten postal codes, 15–17

Autonomous cars, 20–21, 22f

Autonomous robots, 20

Axon, 89


Backpropagation algorithm, 99–102

stages, 99

weights updates in neural network, 101–102

Bayes’ theorem

equation, 73

posterior probability, 74

Belief, stability of (Plato), 12–13

Bias, perceptron, 92

Biometric verification, GMM, 138

Bits, concept of, 40–41

Brant, Kenneth F., 28

Building stage, JRip algorithm, 57

Business intelligence system, 23


C4.5 (Successor of ID3), 38

attribute selection measure, 41

gain ratio of, 49–51

Cars, autonomous, 20–21, 22f

Cellan-Jones, Rory, 26

transcript of conversation with Chatbot Rose, 189–192, 190f

Chatbot

creative, 193–194

transcript of conversations with, 189–192, 190f

Turing test, 25–26

conversation with human and machine, 26–27

by Eugene Goostman, 26

neural conversational model, 28

Class conditional independence, 74

Classifiers, 24–25

k-NN, 131–132

algorithm, 83

classification, 83–84

example, 84–86, 85f

in MATLAB®, 86–88

regression, 83–84

shortcoming of, 84

naïve Bayesian, 73

example, 74–75

Laplace estimator, 77–78

likelihood, 75–76

MATLAB implementation, 79–82

posterior probability, 78–79

prior probability, 75

rule-based, 53

algorithm, 54–55

IF-THEN rules, 53

overview, 53

Ripper, 55–72

sequential covering algorithm, 54, 56f

visualization, 55

Clustering

algorithm, 116, 154

k-means

algorithm, 133–134

in MATLAB®, 134–136

method description, 132–133

overview, 131–132

method for polymorphic worms, 179

support vector, 116

using GMM, 138–142, 142f

Common weighing scheme, 84

Compute Array of Frequencies function, 183–184

Compute Principal Component function, 183, 185

Computer-aided diagnosis, 17–19

assisting doctors/radiologists in health problems, 19

classifier, examples, 18

pattern recognition, 17–18

Computers, 3

Computer vision, 19–22

driverless cars, 20–21, 22f

face recognition and security, 22

RoboCup, 19–20, 20f

Conditional probabilities

after Laplace correction, 77

of likelihood, 76

Consequent, rule, 53

Continuous space K, 85

Cortana, Microsoft’s, 32–33

Covariance matrix, 154–155, 157

benefits into SVD, 158

calculation, 110–111

ecoli dataset, 155t

evaluation, 177

“cov()” command of MATLAB, 110

Creative chatbot, 193–194

Cross-validation technique, 14


Data

2D random, 155, 156f

2D reduced, 163f

classification in DT, 37

cross-validation, 14

ecoli

covariance matrix of, 155t

k-means clustering, 134, 135f

k-NN classifier, 86–87

neural network model, 102–105

variance of principal component, 160, 160t

entropy of, 39, 39t, 43

function “infogain” to calculate entropy of, 47–48

for growing rule, 67, 69

in optimization stage, 71

information gain, 65

labeled

semi-supervised learning, 11

supervised learning, 8

linearly separable, 116, 117f

machine learning, 7f

mining, 4

MPCA contributions in polymorphic worms detection

mean adjusted data, 176–177

normalization of data, 176

projection of data, adjusting, 178

significant data determination, 175–176

naïve Bayesian classifier, 74–75

nonlinearly separable, 126f

for pruning grown rules, 67, 69

in optimization stage, 71

reconstruction error, 160–161, 161t

rule growing, 58–59, 64

text mining, 24–25

unlabeled

semi-supervised learning, 10

supervised learning, 8, 9t

Decision tree (DT), 37

attribute selection measure, 41–46

information gain of ID3, 41–44

problem with information gain, 44–46

data classification, 37

entropy, 38–41

concept of number of bits, 40–41

example, 38–39, 39t

function “infogain,” 47–48

Shannon, 38

MATLAB®, implementation in, 46–52

Deep Blue (IBM), 30–31

Deep Fritz (chess program), 31

Dendrites, 89

Dimensionality reduction, SVD and, 157–158

Discrete and finite space K, 85

Double-honeynet system, 167–168, 171

Driverless cars, Toyota, 20–21, 22f

Durant, Will, 1

D-variate Gaussian function, 137


Ecoli data

covariance matrix of, 155t

k-means clustering, 134, 135f

k-NN classifier, 86–87

neural network model, 102–105

variance of principal component, 160, 160t

Eigenvalue evaluation, 177

Entropy, 38–41

concept of number of bits, 40–41

example, 38–39, 39t

function “infogain,” 47–48

Shannon, 38

Epoch, 94

Eugene Goostman (chatbot), 25

Expectation maximization (EM) algorithm, iterative, 138

Expected value of information, 38


Feature descriptor (FD), 178

Feed-forward stage, 99

Finite space K, 85

Fisher, Ronald, 107

Function “gaininfo,” 51–52

Function “infogain,” 52

to calculate entropy of dataset, 47–48


Gain ratio, 46, 49

of C4.5, 49–51

Gartner, Inc., 28–29

Gates, Bill, 14

Gaussian kernel, 126

Gaussian mixture model (GMM), 137–142

applications, 138

clustering using, 138

concept by example, 138–142

equation, 137

Gaussian distributions

clusters corresponding to, 142f

means and variances of, 139t, 140f

mixed, 141f

Google, 17

Google Now, 32

Gradient descent method, 101

Grow phase, JRip algorithm, 57


Handwriting detection, 24

Handwritten postal codes, automatic recognition of, 15–17

Hidden Markov model (HMM), 17

example, 146–147

MATLAB code, 148–152

overview, 145–146

parameters, 148f

problems of, 145

Hold out testing/validation, 14

Huffman code, 40

Human and machine, conversation with, 26–27


IBM’s Deep Blue, 30–31

IBM’s Watson, 31–32

ID3 (Iterative Dichotomiser 3), 38

information gain of, 41–44

Successor of ID3 (C4.5), 38

attribute selection measure, 41

gain ratio of, 49–51

IDS, see Intrusion detection system (IDS)

IF-THEN rules, rule-based classifiers, 53

Information gain

formula for, 66


drawback, 46

formula of, 42

of ID3, 41–44

problem with, 44–46

splitinfo, 41

Ripper, 65–66

rule growing process using, 60–65

Instance-based learning, 84

Internet worms, 167

Intrusion detection system (IDS), 169

Iris database, 127

Irwin, Terence, 12

Iterative expectation maximization (EM) algorithm, 138


JRip algorithm, 56–58

building stage, 57

optimization stage, 57–58


Kasparov, Garry, 30

Kernel

case of nonlinear, 126–127

Gaussian, 126

linear function, 125, 127

nonlinear function, 126–127

trick, 115

k-fold cross-validation, 14

k-means clustering

algorithm, 133–134

in MATLAB®, 134–136

method description, 132–133

overview, 131–132

KMP, see Knuth–Morris–Pratt (KMP) algorithm

Kmpfound function, 181–182

k-nearest neighbors (k-NN) classifiers, 131–132

algorithm, 83

classification, 83–84

example, 84–86, 85f

in MATLAB®, 86–88

regression, 83–84

shortcoming of, 84

Knowledge Discovery from Data (KDD), 4

Knuth–Morris–Pratt (KMP) algorithm, 168, 170–171, 173

Kramnik, Vladimir, 31

Kuhn–Tucker theorem, 123


Labeled data

semi-supervised learning, 11

supervised learning, 8

Lagrangian function, 121–123

Laplace estimator, 77–78

Lazy learning, 84

Learning, see specific learning

Learning rate, 94, 101

Likelihood probability, 73, 75–76, 78

code, 80

conditional probabilities, 76

Linear discriminant analysis (LDA), 107–113

example, 108–113

overview, 107

Linear kernel function, 125, 127

Linearly separable data, 116, 117f

Loebner Prize, 189

Luhn, H.P., 23


Machine

conversation with human and, 26–27

smart, 3, 28–30

criteria for, 28–29

prediction (2014 and 2015), 30

strategic technologies, 29, 29f

support vector machines (SVM)

in MATLAB®, 127–128

overview, 115–116

problem definition, 116–119

Machine learning algorithms, 4–7, 37, 115

applications of, 14–25

automatic recognition of handwritten postal codes, 15–17

computer-aided diagnosis, 17–19

computer vision, 19–22

speech recognition, 22–23

text mining, 23–25

discipline of, 5–6, 5f

goal of, 6

present and future, 25–33

Apple’s Siri, 32

Deep Blue (IBM), 30–31

Google Now, 32

IBM’s Watson, 31–32

Microsoft’s Cortana, 32–33

smart machines, 28–30

thinking machines, 25, 27–28

techniques and required data, 7f

Margin, SVM, 116, 118f

MATLAB®

applies PCA, 161–162

covariance matrix calculation, 110–111

GMM, 138–142

hidden Markov model, 148–152

implementation, 46–52

gain ratio of C4.5, 49–51

naïve Bayesian classification, 79–82

perceptron training and testing algorithms, 94–96

prediction process in, 81–82

k-means clustering, 134–136

k-NN algorithm in, 86–88

neural networks in, 102–105

SVM in, 127–128

McCulloch, Warren, 90

Mean adjusted data, 176–177

Means of Gaussian distributions, 139t, 140f

Microsoft’s Cortana, 32–33

Minimum description length (MDL), 57, 70, 72

Mitchell, Tom, 5–6

Modified Knuth–Morris–Pratt (MKMP) algorithm, 170, 172–174

SEA and PCA, 168

testing quality of generated signature, 174

Modified PCA (MPCA), polymorphic worms detection, 174–179

clustering method for worms, 179

contributions in, 174–178

covariance matrix evaluation, 177

eigenvalue evaluation, 177

frequency counts determination, 175

mean adjusted data, 176–177

normalization of data, 176

principal component evaluation, 177–178

projection of data, adjusting, 178

significant data determination, 175–176

quality testing of generated signature, 178–179

Multilayer perceptron network, 96–99


Naïve Bayesian classification, 73

example, 74–75

Laplace estimator, 77–78

likelihood, 75–76

MATLAB implementation, 79–82

posterior probability, 78–79

prior probability, 75

Nearest centroid classifier, see Rocchio algorithm

Neural conversational model, 28

Neural network

error histogram, 104f

in MATLAB, 102–105

multilayer, 96–99

perceptron, 89–94

validation performance, 104f

weights updates in, 101–102

Neuron, 89, 90f

Nonlinear kernel function, 126–127

Nonlinearly separable data, 126f


Optical character recognition (OCR) technology, 15–17, 16f

Optimization, Ripper, 68–72

Optimization stage

dataset for growing rule, 71

dataset for pruning rules, 71

JRip algorithm, 57–58

Overall variability, PCA, 154

Overfitting phenomenon, 13–14


Pattern (string), 169

Pattern recognition, 4

computer-aided diagnosis, 17–18

HMM, 145

k-NN algorithm, 83

OCR technology, 17

PCA, see Principal component analysis (PCA)

Perceptron

neural network, 89–94

training and testing algorithm, MATLAB implementation, 94–96

PerceptronTesting function, 95–96

Pitts, Walter, 90

Plato on stability of belief, 12–13

Plato’s Ethics (Book), 12

The Pleasures of Philosophy (Book), 1

Polymorphic worms detection using PCA, 167–187

KMP algorithm, 170–171

MKMP algorithm, 173–174

modified PCA, 174–179

clustering method for worms, 179

contributions in, 174–178

quality testing of generated signature, 178–179

overview, 167–168

proposed SEA, 171–172, 172f, 172t

SEA, MKMP, and PCA, 168

signature generation algorithms pseudo-codes, 179–187

MKMP algorithm pseudo-code, 181–183

MPCA pseudo-code, 183–186

quality testing of generated signature, 186–187

SEA pseudo-code, 180

string matching, 169–170

testing quality of generated signature, 174

Posterior probability, 73, 78–79

formula, 78

requirements, 78

using Bayes’ theorem, 74

Preliminaries, 2–14

machine learning, 4–7

reinforcement learning, 11

semi-supervised learning, 10–11

labeled data, 11

unlabeled data, 10

supervised learning, 7–8

categories, 8

labeled data, 8

unlabeled data, 8, 9t

unsupervised learning, 9–10

validation and evaluation, 11–14

Principal component, 153–154

evaluation, 177–178

methods in Weka, 163–167

projection of data, adjusting, 178

Principal component analysis (PCA)

2D reduced data, 163f

3D reduced space, 163f

defined, 153–154

idea behind, 155–158

dataset shape, 156f

SVD and dimensionality reduction, 157–158

implementation, 158–161

data reconstruction error, 160–161, 161t

principal components selection, 159–160, 159t

steps, 158–159

MATLAB®, 161–162

methods in Weka, 163–167

polymorphic worms detection using, 167–187

KMP algorithm, 170–171

MKMP algorithm, 173–174

modified PCA, 174–179

overview, 167–168

proposed SEA, 171–172, 172f, 172t

SEA, MKMP, and PCA, 168

signature generation algorithms pseudo-codes, 179–187

string matching, 169–170

testing quality of generated signature, 174

problem description, 154–155

purpose of using, 157

Prior probability, 73, 75

of class variable, 79–80

Laplace estimator, 77

Prune phase, JRip algorithm, 57

Pruning metric, 68

Pruning operation, 56, 66–68

Pseudo-codes, signature generation algorithms, 179–187

MKMP algorithm pseudo-code, 181–183

MPCA pseudo-code, 183–186

quality testing of generated signature, 186–187

SEA pseudo-code, 180


Quadratic programming (QP) problem, 121–122

Quality testing of generated signature, 168

MKMP algorithm, 174

MPCA, 178–179

pseudo-codes for, 186–187


Radial basis functions (RBFs) kernels, 126, 127f

Reconstruction error, 160–161, 161t

Reduced error pruning (REP), 55

Regression, k-NN, 83–84

Reinforcement learning, 11

Replacement rule, 70

Revised rule, 70

Ripper (repeated incremental pruning to produce error reduction), 55–72

algorithm of JRip, 56–58

building stage, 57

optimization stage, 57–58

information gain, 65–66

optimization, 68–72

pruning, 66–68

rule growing process, 58–65

RoboCup (Robot Soccer World Cup), 19–20, 20f

Robots, autonomous, 20

Rocchio algorithm, 132

Romeo and Juliet (Play), 3

Rosenblatt, Frank, 90

Rose (Chatbot), transcript of conversation with, 189–192, 190f

Rule antecedent, 53

Rule-based classifiers, 53

algorithm, 54–55

IF-THEN rules, 53

overview, 53

Ripper, 55–72

algorithm of JRip, 56–58

information gain, 65–66

optimization, 68–72

pruning, 66–68

rule growing process, 58–65

sequential covering algorithm, 54, 56f

visualization, 55

Rule consequent, 53

Rule growing process, 58–65

attributes and possibilities, 60

benchmark dataset, 58–59

using information gain, 60–65


SAS Institute Inc., 4

Scoring problem, 145

SEA, see Substring extraction algorithm (SEA)

Semi-supervised learning, 10–11

labeled data, 11

unlabeled data, 10

Sequential covering algorithm, 54, 56f

Shannon entropy, 38

Sigmoid function, 92, 100

Signaturefile function, 181–183

Signature generation algorithms pseudo-codes, 179–187

MKMP algorithm pseudo-code, 181–183

MPCA pseudo-code, 183–186

quality testing of generated signature, 186–187

SEA pseudo-code, 180

Sign function, 91

Singular value decomposition (SVD), 157–158

benefits of covariance matrix into, 158

and dimensionality reduction, 157–158

Siri (speech interpretation and recognition interface), Apple’s, 32

Smart machines, 3, 28–30

criteria for, 28–29

prediction (2014 and 2015), 30

strategic technologies, 29, 29f

Social media, 23

Soma, 89

Speaker identification, GMM, 138

Speech recognition, 22–23

Splitinfo (measure of information gain), 41, 49–52

Statistical Analysis System (SAS), 4

Step function, 91

String matching, 169–170

Substring extraction algorithm (SEA), 168, 170

proposed, 171–173

pseudo-code, 180

Successor of ID3 (C4.5), 38

attribute selection measure, 41

gain ratio of, 49–51

Supervised learning, 7–8; see also Unsupervised learning


DT, 37–52

k-NN classifiers, 83–88

LDA, 107–113

naïve Bayesian classification, 73–82

neural networks, 88–105

rule-based classifiers, 53–72

SVM, 115–128

categories, 8

labeled data, 8

unlabeled data, 8, 9t

Support vector clustering, 116

Support vector machines (SVM)

in MATLAB®, 127–128

overview, 115–116

problem definition, 116–119

case of nonlinear kernel, 126

design, 120–126

Support vectors, 119, 119f, 121

SVD, see Singular value decomposition (SVD)

Svmtrain function, 127


TestingData, 127

Text (string), 169

Text mining, 23–25

applications, 23–24

text and image data, 24–25

Three-dimensional (3D) hyperplane, 116, 117f

Three-dimensional reduced space, 163f

Top-down induction of decision trees (TDIDTs), 37

Toyota, driverless cars, 20–21, 22f

Training problem, HMM, 145

Transcript of conversation with Chatbot Rose, 189–192, 190f

Turing, Alan, 3, 25

Two-dimensional (2D) random data, 155, 156f

Two-dimensional reduced data, 163f


Unlabeled data

semi-supervised learning, 10

supervised learning, 8, 9t

Unsupervised learning, 9–10; see also Supervised learning


GMM, 137–142

HMM, 145–152

k-means clustering, 131–136

PCA, 153–187

clustering, 10, 115–116

US Postal Service, OCR technology, 15–17, 16f


Variances of Gaussian distributions, 139t, 140f

Visualization, rule-based classifiers, 55

Voice-controlled programs, 22–23


Watson, Thomas J., 31

Weka

principal component methods in, 163–167

Ripper in, 56–58

Writing style detection, 24


XOR logical operation, 96–97, 97f, 97t

