Index

A

A/B tests
- URL / Importance of evaluation
Abstract-C / Abstract-C
activation function
- about / Perceptron
active learning
- about / Active learning
- representation / Representation and notation
- notation / Representation and notation
- scenarios / Active learning scenarios
- approaches / Active learning approaches
- uncertainty sampling / Uncertainty sampling
- version space sampling / Version space sampling
active learning, case study
- about / Case study in active learning
- tools / Tools and software
- software / Tools and software
- business problem / Business problem
- machine learning, mapping / Machine learning mapping
- data collection / Data Collection
- data sampling / Data sampling and transformation
- data transformation / Data sampling and transformation
- feature analysis / Feature analysis and dimensionality reduction
- dimensionality reduction / Feature analysis and dimensionality reduction
- models / Models, results, and evaluation
- results / Models, results, and evaluation
- evaluation / Models, results, and evaluation
- results, pool-based scenarios / Pool-based scenarios
- results, stream-based scenarios / Stream-based scenarios
- results, analysis / Analysis of active learning results
activity recognition
- about / Introducing activity recognition
- mobile phone sensors / Mobile phone sensors
- activity-recognition pipeline / Activity recognition pipeline
- plan / The plan
AdaBoost M1 method
- about / Choosing a classification algorithm
ADaptable sliding WINdow (ADWIN)
- about / Sliding windows
adaptation methods
- about / Adaptation methods
- explicit adaptation / Explicit adaptation
- implicit adaptation / Implicit adaptation
Advanced Message Queueing Protocol (AMQP) / Message queueing frameworks
advanced modelling
- with ensembles / Advanced modeling with ensembles
- ensembleLibrary package, using / Before we start
- data, pre-processing / Data pre-processing
- attribute selection / Attribute selection
- model selection / Model selection
- performance, evaluation / Performance evaluation
affinity analysis
- about / Affinity analysis
- cross-industry applications / Other applications in various areas
affinity propagation
- about / Affinity propagation
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
agglomerative clustering
- about / Clustering
algorithms, comparing
- McNemar's Test / McNemar's Test
- Wilcoxon signed-rank test / Wilcoxon signed-rank test
Amazon Elastic MapReduce (EMR) / Amazon Elastic MapReduce
Amazon Kinesis / Publish-subscribe frameworks
Amazon Machine Learning / Machine learning as a service
Amazon Redshift / Amazon Redshift
analysis types
- about / Analysis types
- pattern analysis / Pattern analysis
- transaction analysis / Transaction analysis
Android Device Monitor
- about / Collecting training data
Android Studio
- installing / Installing Android Studio
- URL / Installing Android Studio
Angle-based Outlier Degree (ABOD) / How does it work?
anomalous behaviour detection
- about / Suspicious and anomalous behavior detection
- unknown-unknowns / Unknown-unknowns
anomalous pattern detection
- about / Anomalous pattern detection
- analysis types / Analysis types
- plan recognition / Plan recognition
anomaly detection
- about / Outlier or anomaly detection
anomaly detection, in time series data
- about / Anomaly detection in time series data
- histogram-based anomaly detection / Histogram-based anomaly detection
- data, loading / Loading the data
- histograms, creating / Creating histograms
- density based k-nearest neighbours / Density based k-nearest neighbors
anomaly detection, in website traffic
- about / Anomaly detection in website traffic
- dataset, using / Dataset
ANOVA test / ANOVA test
Apache Kafka / Publish-subscribe frameworks
Apache Mahout
- about / Apache Mahout
- configuring / Getting Apache Mahout
- configuring, in Eclipse with Maven plugin / Configuring Mahout in Eclipse with the Maven plugin
Apache Spark
- about / Apache Spark
- URL / Apache Spark
Apache Storm / SAMOA as a real-time Big Data Machine Learning framework
Application Portfolio Management (APM)
- about / IT Operations Analytics
Applied Machine Learning
- workflow / Applied machine learning workflow
Approx Storm / Approx Storm
Apriori
- about / Weka
Apriori algorithm
- about / Apriori algorithm
- used, for discovering shopping patterns / Apriori
ArangoDB / Graph databases
artificial neural networks
- about / Artificial neural networks
association analysis / Machine learning – types and subtypes
association rule learning
- about / Association rule learning
- Apriori algorithm / Apriori algorithm
- FP-growth algorithm / FP-growth algorithm
association rule learning, basic concepts
- database, of transactions / Database of transactions
- itemset / Itemset and rule
- rule / Itemset and rule
- support / Support
- confidence / Confidence
autoencoder
- about / Autoencoder
Autoencoders
- about / Autoencoders
- mathematical notations / Definition and mathematical notations
- loss function / Loss function
- limitations / Limitations of Autoencoders
- denoising / Denoising Autoencoder
axioms of probability / Axioms of probability

B

bag-of-words (BoW)
- about / Working with text data
Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)
- about / Hierarchical based and micro clustering
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
basic modelling
- about / Basic modeling
- models, evaluating / Evaluating models
- naive Bayes baseline, implementing / Implementing naive Bayes baseline
basic naive Bayes classifier baseline
- about / Basic naive Bayes classifier baseline
- data, obtaining / Getting the data
- data, loading / Loading the data
Batch Big Data Machine Learning
- about / Batch Big Data Machine Learning
- used, as H2O / H2O as Big Data Machine Learning platform
Bayesian information score (BIC) / Measures to evaluate structures
Bayesian networks
- about / Bayesian networks
- representation / Bayesian networks, Representation
- inference / Bayesian networks, Inference
- learning / Bayesian networks, Learning
Bayes theorem
- about / Bayes' theorem
- density, estimation / Density estimation
- mean / Mean
- variance / Variance
- standard deviation / Standard deviation
- Gaussian standard deviation / Gaussian standard deviation
- covariance / Covariance
- correlation coefficient / Correlation coefficient
- binomial distribution / Binomial distribution
- Poisson distribution / Poisson distribution
- Gaussian distribution / Gaussian distribution
- central limit theorem / Central limit theorem
- error propagation / Error propagation
BBC dataset
- URL / BBC dataset
Bernoulli distribution / Random variables, joint, and marginal distributions
big data
- dealing with / Dealing with big data
- volume / Dealing with big data
- velocity / Dealing with big data
- variety / Dealing with big data
Big Data
- characteristics / What are the characteristics of Big Data?
- volume / What are the characteristics of Big Data?
- velocity / What are the characteristics of Big Data?
- variety / What are the characteristics of Big Data?
- veracity / What are the characteristics of Big Data?
big data application
- architecture / Big data application architecture
Big Data cluster deployment frameworks
- about / Big Data cluster deployment frameworks
- Hortonworks Data Platform (HDP) / Hortonworks Data Platform
- Cloudera CDH / Cloudera CDH
- Amazon Elastic MapReduce (EMR) / Amazon Elastic MapReduce
- Microsoft Azure HDInsight / Microsoft Azure HDInsight
Big Data framework
- about / General Big Data framework
- cluster deployment frameworks / Big Data cluster deployment frameworks
- data acquisition / Data acquisition
- data storage / Data storage
- data preparation / Data processing and preparation
- data processing / Data processing and preparation
- machine learning / Machine Learning
- visualization / Visualization and analysis
- analysis / Visualization and analysis
Big Data Machine Learning
- about / Big Data Machine Learning
- framework / General Big Data framework
- Big Data framework / General Big Data framework
- Spark MLlib / Spark MLlib as Big Data Machine Learning platform
BigML / Machine learning as a service
binomial distribution / Binomial distribution
Book-Crossing dataset
- URL / Book ratings dataset
- BX-Users file / Book ratings dataset
- BX-Books file / Book ratings dataset
- BX-Book-Ratings file / Book ratings dataset
book-recommendation engine
- building / Building a recommendation engine
- book ratings dataset, using / Book ratings dataset
- data, loading / Loading the data
- data, loading from file / Loading data from file
- data, loading from database / Loading data from database
- in-memory database, creating / In-memory database
- collaborative filtering, implementing / Collaborative filtering
- custom rules, adding / Adding custom rules to recommendations
- evaluation / Evaluation
- online learning engine / Online learning engine
- content-based filtering, implementing / Content-based filtering
boosting
- about / Boosting
- algorithm input / Algorithm inputs and outputs
- algorithm output / Algorithm inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitation / Advantages and limitations
bootstrap aggregating (bagging)
- about / Bootstrap aggregating or bagging
- algorithm inputs / Algorithm inputs and outputs
- algorithm outputs / Algorithm inputs and outputs
- working / How does it work?
- Random Forest / Random Forest
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Broyden-Fletcher-Goldfarb-Shanno (BFGS) / How does it work?
Business Intelligence (BI) / What is not machine learning?
business problem / Business problem, Business problem

C

Canova library
- URL / Loading the data
case study
- about / Case study
- business problem / Business problem
- machine learning mapping / Machine learning mapping
- data sampling and transformation / Data sampling and transformation
- feature analysis / Feature analysis
- Models, results, and evaluation / Models, results, and evaluation
- results, analysis / Analysis of results
case study, with CoverType dataset
- about / Case study
- business problem / Business problem
- machine learning, mapping / Machine Learning mapping
- data collection / Data collection
- data sampling / Data sampling and transformation
- data transformation / Data sampling and transformation
- Big Data Machine Learning, used as Spark MLlib / Spark MLlib as Big Data Machine Learning platform
Cassandra / Columnar databases
- about / Big data application architecture
- URL / Big data application architecture
cc.mallet.pipe package
- Input2CharSequence pipeline / Pre-processing text data
- CharSequenceRemoveHTML pipeline / Pre-processing text data
- MakeAmpersandXMLFriendly pipeline / Pre-processing text data
- TokenSequenceLowercase pipeline / Pre-processing text data
- TokenSequence2FeatureSequence pipeline / Pre-processing text data
- TokenSequenceNGrams pipeline / Pre-processing text data
central limit theorem / Central limit theorem
Chebyshev distance
- about / Java machine learning
Chi-Squared feature / Statistical approach
Chunk / H2O architecture
classification
- about / Classification, Classification, Formal description and notation
- decision trees learning / Decision tree learning
- probabilistic classifiers / Probabilistic classifiers
- kernel methods / Kernel methods
- artificial neural networks / Artificial neural networks
- ensemble learning / Ensemble learning
- evaluating / Evaluating classification
- precision / Precision and recall
- recall / Precision and recall
- Roc curves / Roc curves
- data, using / Data
- data, loading / Loading data
- feature selection / Feature selection
- learning algorithms, selecting / Learning algorithms
- data, classifying / Classify new data
- evaluation / Evaluation and prediction error metrics
- prediction error metrics / Evaluation and prediction error metrics
- confusion matrix, examining / Confusion matrix
- algorithm, selecting / Choosing a classification algorithm
classification algorithms
- weka.classifiers.rules.ZeroR / Choosing a classification algorithm
- weka.classifiers.trees.RandomTree / Choosing a classification algorithm
- weka.classifiers.trees.RandomForest / Choosing a classification algorithm
- weka.classifiers.lazy.IBk / Choosing a classification algorithm
- weka.classifiers.functions.MultilayerPerceptron / Choosing a classification algorithm
- weka.classifiers.bayes.NaiveBayes / Choosing a classification algorithm
- weka.classifiers.meta.AdaBoostM1 / Choosing a classification algorithm
- weka.classifiers.meta.Bagging / Choosing a classification algorithm
Classification and Regression Trees (CART) / Decision Trees
classifier
- building / Building a classifier
- spurious transitions, reducing / Reducing spurious transitions
- plugging, into mobile app / Plugging the classifier into a mobile app
class implementation
- reference link / Loading the data
class unbalance / Class unbalance
Clique tree or junction tree algorithm, Bayesian networks
- about / Clique tree or junction tree algorithm
- input and output / Input and output
- working / How does it work?
- advantages and limitations / Advantages and limitations
Cloudera CDH / Cloudera CDH
Cluster-based Local Outlier Factor (CBLOF)
- about / How does it work?
clustering
- about / Clustering, Clustering, Clustering
- algorithms / Clustering algorithms
- evaluation / Evaluation
- spectral clustering / Spectral clustering
- affinity propagation / Affinity propagation
- using, for incremental unsupervised learning / Incremental unsupervised learning using clustering
- evaluation techniques / Validation and evaluation techniques
- validation techniques / Validation and evaluation techniques
- stream cluster evaluation, key issues / Key issues in stream cluster evaluation
- evaluation measures / Evaluation measures
clustering-based methods
- about / Clustering-based methods
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
clustering algorithms
- implementing / Clustering algorithms
- about / Clustering algorithms
- k-means / k-Means
- DBSCAN / DBSCAN
- mean shift / Mean shift
- Gaussian mixture modeling (GMM) / Expectation maximization (EM) or Gaussian mixture modeling (GMM)
- expectation maximization (EM) / Expectation maximization (EM) or Gaussian mixture modeling (GMM)
- hierarchical clustering / Hierarchical clustering
- self-organizing maps (SOM) / Self-organizing maps (SOM)
clustering evaluation
- about / Clustering validation and evaluation
- internal evaluation measures / Clustering validation and evaluation, Internal evaluation measures
- external evaluation measures / Clustering validation and evaluation, External evaluation measures
Clustering Features (CF) / Hierarchical based and micro clustering
Clustering Feature Tree (CF Tree) / Hierarchical based and micro clustering
clustering techniques
- about / Clustering techniques
- generative probabilistic models / Generative probabilistic models
- distance-based text clustering / Distance-based text clustering
- non-negative Matrix factorization (NMF) / Non-negative matrix factorization (NMF)
Clustering Trees (CT) / Hierarchical based and micro clustering
clustering validation
- about / Clustering validation and evaluation
Cluster Mapping Measures (CMM)
- about / Cluster Mapping Measures (CMM)
- mapping phase / Cluster Mapping Measures (CMM)
- penalty phase / Cluster Mapping Measures (CMM)
cluster mode / Amazon Elastic MapReduce
cluster SSL
- about / Cluster and label SSL
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- limitations / Advantages and limitations
- advantages / Advantages and limitations
CluStream
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
co-training SSL
- about / Co-training SSL or multi-view SSL
- inputs / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
collaborative filtering
- about / Collaborative filtering
- implementing, with book-recommendation engine / Collaborative filtering
- user-based / User-based filtering
- item-based / Item-based filtering
columnar databases / Columnar databases
Comma Separated Value (CSV)
- about / Loading the data
competitions
- about / Competitions
concept drift
- about / Concept drift and drift detection
conditional probability distribution (CPD) / Factor types, Definition
Conditional random fields (CRFs) / Conditional random fields
confusion matrix / Confusion matrix and related metrics
conjugate gradient optimization algorithm
- building / Building a single-layer regression model
connectivity-based outliers (COF) / How does it work?
content-based filtering
- about / Content-based filtering
- implementing, with book-recommendation engine / Content-based filtering
contrastive divergence (CD) / Contrastive divergence
Contrastive Divergence algorithm
- about / Restricted Boltzmann machine
Convolutional Neural Network (CNN)
- about / Deep convolutional networks
Convolutional Neural Networks (CNN)
- about / Convolutional Neural Network
- local connectivity / Local connectivity
- parameter sharing / Parameter sharing
- discrete convolution / Discrete convolution
- Pooling or Subsampling / Pooling or subsampling
- ReLU / Normalization using ReLU
coreference resolution / Coreference resolution
Core Motion framework, iOS
- URL / Mobile phone sensors
correlation-based feature selection (CFS) / Correlation-based feature selection (CFS)
Correlation based Feature selection (CFS)
- about / Feature selection
correlation coefficient / Correlation coefficient
- about / Correlation coefficient
cosine distance
- about / Content-based filtering
Cosine distance / Cosine distance
cost function
- about / Supervised learning
Coursera
- URL / Online courses
covariance / Covariance
CoverType dataset
- reference link / Business problem
cross-industry applications, of affinity analysis
- about / Other applications in various areas
- medical diagnosis / Medical diagnosis
- protein sequences / Protein sequences
- census data / Census data
- customer relationship management (CRM) / Customer relationship management
- IT Operations Analytics / IT Operations Analytics
cross-validation
- about / Cross-validation
Cross Industry Standard Process (CRISP)
- about / Process
Cross Industry Standard Process for Data Mining (CRISP-DM)
- about / CRISP-DM
CrowdANALYTIX
- URL / Competitions
CSVLoader class
- URL / Loading the data
cumulative sum (CUSUM) / CUSUM and Page-Hinckley test
curse of dimensionality
- about / The curse of dimensionality
customer relationship database
- about / Customer relationship database
- challenge / Challenge
- dataset / Dataset
- evaluation / Evaluation
custom frameworks / Custom frameworks

D

Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- about / DBSCAN
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
D-Separation, Bayesian networks / D-Separation
data
- about / Data and problem definition
data acquisition
- about / Data acquisition
- publish-subscribe frameworks / Publish-subscribe frameworks
- source-sink frameworks / Source-sink frameworks
- SQL frameworks / SQL frameworks
- message queueing frameworks / Message queueing frameworks
- custom frameworks / Custom frameworks
data analysis
- about / Data analysis
- label analysis / Label analysis
- features analysis / Features analysis
data and problem definition
- about / Data and problem definition
Data and problem definition
- measurement scales / Measurement scales
data cleaning
- about / Data cleaning
data collection
- about / Data collection
- data, observing / Find or observe data
- data, searching / Find or observe data
- data, generating / Generate data
- traps, sampling / Sampling traps
- from mobile phone / Collecting data from a mobile phone
- Android Studio, installing / Installing Android Studio
- data collector, loading / Loading the data collector
- training data, collecting / Collecting training data
- mapping / Data collection
data collector
- loading / Loading the data collector
- URL / Loading the data collector
- feature extraction / Feature extraction
data distribution sampling
- about / Data distribution sampling
- working / How does it work?
- model change / Expected model change
- error reduction / Expected error reduction
- advantages / Advantages and limitations
- disadvantages / Advantages and limitations
Data Frame / H2O architecture
data management
- about / Data management
Data Mining
- URL / Websites and blogs
Data Mining Research
- URL / Websites and blogs
data pre-processing
- about / Data pre-processing
- data cleaning / Data cleaning
- missing values, filling / Fill missing values
- outliers, removing / Remove outliers
- data transformation / Data transformation
- data reduction / Data reduction
data preparation
- key tasks / Data processing and preparation
- HQL / Hive and HQL
- Hive / Hive and HQL
- Spark SQL / Spark SQL
- Amazon Redshift / Amazon Redshift
- real-time stream processing / Real-time stream processing
data preprocessing
- about / Data transformation and preprocessing
data processing
- key tasks / Data processing and preparation
- HQL / Hive and HQL
- Hive / Hive and HQL
- Spark SQL / Spark SQL
- Amazon Redshift / Amazon Redshift
- real-time stream processing / Real-time stream processing
data quality analysis / Data quality analysis
- about / Data quality analysis
data reduction
- about / Data reduction
data sampling
- about / Data sampling, Data sampling and transformation
- need for / Is sampling needed?
- undersampling / Undersampling and oversampling
- oversampling / Undersampling and oversampling
- stratified sampling / Stratified sampling
- techniques / Training, validation, and test set
- experiments / Experiments, results, and analysis
- results / Experiments, results, and analysis
- analysis / Experiments, results, and analysis, Feature relevance and analysis
- feature relevance / Feature relevance and analysis
- test data, evaluation / Evaluation on test data
- results, analysis / Analysis of results
data science
- about / Machine learning and data science
Data Science Central
- URL / Websites and blogs
Data Science CS109 (Harvard) by John A. Paulson
- URL / Online courses
data scientist
- about / Machine learning and data science
dataset rebalancing
- about / Dataset rebalancing
datasets
- about / Datasets
- used, in machine learning / Datasets used in machine learning
- structured data / Datasets used in machine learning
- transaction data / Datasets used in machine learning
- market data / Datasets used in machine learning
- unstructured data / Datasets used in machine learning
- sequential data / Datasets used in machine learning
- graph data / Datasets used in machine learning
datasets, machine learning
- about / Datasets
- UC Irvine (UCI) database / Datasets
- Tunedit / Datasets
- Mldata.org / Datasets
- KDD Challenge Datasets / Datasets
- Kaggle / Datasets
data storage
- about / Data storage
- HDFS / HDFS
- NoSQL / NoSQL
data transformation
- about / Data transformation, Data transformation and preprocessing, Data sampling and transformation
- feature, construction / Feature construction
- missing values, handling / Handling missing values
- outliers, handling / Outliers
- discretization / Discretization
- data sampling / Data sampling
- training / Training, validation, and test set
- validation / Training, validation, and test set
- test set / Training, validation, and test set
Davies-Bouldin index / Davies-Bouldin index
- Silhouette's index / Silhouette's index
Decision and Predictive Analytics (ADAPA) / Predictive Model Markup Language
decision trees
- about / Underfitting and overfitting
Decision Trees
- about / Decision Trees
- algorithm input / Algorithm inputs and outputs
- algorithm output / Algorithm inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
decision trees learning
- about / Decision tree learning
Deep Autoencoders
- about / Deep Autoencoders
deep belief network
- building / Building a deep belief network
deep belief networks
- about / Artificial neural networks
Deep Belief Networks (DBN)
- inputs and outputs / Inputs and outputs
- working / How does it work?
Deep Belief Networks (DBNs)
- about / Restricted Boltzmann machine
deep convolutional networks
- about / Deep convolutional networks, MNIST dataset
Deep feed-forward NN
- about / Deep feed-forward NN
- input and outputs / Input and outputs
- working / How does it work?
deep learning
- about / Deep learning
- building blocks / Building blocks for deep learning
- Rectified linear activation function / Rectified linear activation function
- Restricted Boltzmann Machines / Restricted Boltzmann Machines
- Autoencoders / Autoencoders
- Unsupervised pre-training and supervised fine-tuning / Unsupervised pre-training and supervised fine-tuning
- Deep feed-forward NN / Deep feed-forward NN, How does it work?
- Deep Autoencoders / Deep Autoencoders
- Deep Belief Networks (DBN) / Deep Belief Networks, Inputs and outputs
- Dropouts / Deep learning with dropouts, Definition and mathematical notation
- sparse coding / Sparse coding
- Convolutional Neural Network (CNN) / Convolutional Neural Network
- Convolutional Neural Network (CNN) layers / CNN Layers
- Recurrent Neural Networks (RNN) / Recurrent Neural Networks
Deep Learning
- about / Deep learning and NLP
Deep Learning (DL) / Feature relevance and analysis
deep learning, case study
- about / Case study
- tools and software / Tools and software
- business problem / Business problem
- machine learning mapping / Machine learning mapping
- feature analysis / Feature analysis
- models, results and evaluation / Models, results, and evaluation
- basic data handling / Basic data handling
- multi-layer Perceptron / Multi-layer perceptron
- MLP, parameters / Multi-layer perceptron
- MLP, code for / Code for MLP
- Convolutional Network / Convolutional Network
- Convolutional Network, code for / Code for CNN
- Variational Autoencoder / Variational Autoencoder, Code for Variational Autoencoder
- DBN / DBN
- parameter search, Arbiter used / Parameter search using Arbiter
- results and analysis / Results and analysis
Deeplearning4j
- about / Deeplearning4j
- URL / Deeplearning4j
- org.deeplearning4j.base / Deeplearning4j
- org.deeplearning4j.berkeley / Deeplearning4j
- org.deeplearning4j.clustering / Deeplearning4j
- org.deeplearning4j.datasets / Deeplearning4j
- org.deeplearning4j.distributions / Deeplearning4j
- org.deeplearning4j.eval / Deeplearning4j
- org.deeplearning4j.exceptions / Deeplearning4j
- org.deeplearning4j.models / Deeplearning4j
- org.deeplearning4j.nn / Deeplearning4j
- org.deeplearning4j.optimize / Deeplearning4j
- org.deeplearning4j.plot / Deeplearning4j
- org.deeplearning4j.rng / Deeplearning4j
- org.deeplearning4j.util / Deeplearning4j
DeepLearning4J
- URL / Machine learning – tools and datasets
- about / Machine learning – tools and datasets
deeplearning4java
- about / Deeplearning4j
- obtaining / Getting DL4J
delta rule
- about / Perceptron
density
- estimation / Density estimation
density-based methods / Outliers
- about / Density-based methods
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
density based algorithm
- about / Density based
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
descriptive quality analysis
- about / Descriptive data analysis
- basic label analysis / Basic label analysis
- basic feature analysis / Basic feature analysis
detection methods
- model evolution, monitoring / Monitoring model evolution
- distribution changes, monitoring / Monitoring distribution changes
Deviance-Threshold Measure / Measures to evaluate structures
Dice coefficient / Dice coefficient
dimensionality reduction / Dimensionality reduction
- about / Feature relevance analysis and dimensionality reduction, Feature analysis and dimensionality reduction
- notation / Notation
- linear models / Linear methods
- nonlinear methods / Nonlinear methods
- PCA / PCA
- random projections / Random projections
- ISOMAP / ISOMAP
- observation / Observations on feature analysis and dimensionality reduction
Directed Acyclic Graph (DAG) / Definition
directory
- text data, importing / Importing from directory
Direct Update of Events (DUE) / Direct Update of Events (DUE)
Dirichlet distribution / Prior and posterior using the Dirichlet distribution
Discrete Fourier Transform (DFT)
- about / Activity recognition pipeline
discretization
- about / Discretization
- by binning / Discretization
- by frequency / Discretization
- by entropy / Discretization
distance-based clustering
- for outlier detection / Distance-based clustering for outlier detection
distance-based methods / Outliers
- about / Distance-based methods
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
distance measures
- Euclidean distances / Euclidean distances
- non-Euclidean distances / Non-Euclidean distances
distribution changes, monitoring
- about / Monitoring distribution changes
- Welch's t test / Welch's t test
- Kolmogorov-Smirnov's test / Kolmogorov-Smirnov's test
- Page-Hinckley test / CUSUM and Page-Hinckley test
- cumulative sum (CUSUM) / CUSUM and Page-Hinckley test
divide-and-conquer strategy
- about / FP-growth algorithm
document collection
- about / Document collection and standardization
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
document databases / Document databases
document frequency (DF) / Frequency-based techniques
double evaluateLeftToRight method
- Instances heldOutDocuments component / Evaluating a model
- int numParticles component / Evaluating a model
- boolean useResampling component / Evaluating a model
- PrintStream docProbabilityStream component / Evaluating a model
drift detection
- about / Concept drift and drift detection
- data management / Data management
- partial memory / Partial memory
drift detection method (DDM) / Drift Detection Method or DDM
DrivenData / Competitions
DropConnect neural network
- about / MNIST dataset
Dropouts
- about / Deep learning with dropouts
- definition and mathematical notation / Definition and mathematical notation, How does it work?
- training with / Learning Training and testing with dropouts
- testing with / Learning Training and testing with dropouts
DSGuide
- URL / Websites and blogs
Dunn's Indices / Dunn's Indices
dynamic time warping (DTW)
- about / Java machine learning

E

early drift detection method (EDDM) / Early Drift Detection Method or EDDM
Eclipse
- Apache Mahout, configuring with Maven plugin / Configuring Mahout in Eclipse with the Maven plugin
Eclipse IDE
- using / Before you start
Edit distance
- about / Non-Euclidean distances
eigendecomposition / Eigendecomposition
elbow method
- about / Clustering
ELEC dataset / Data collection
elimination-based inference, Bayesian networks
- about / Elimination-based inference
- variable elimination algorithm / Variable elimination algorithm
- input and output / Input and output
- VE algorithm, advantages / Advantages and limitations
Elki
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
EM (Diagonal Gaussian Model Factory) / Clustering models, results, and evaluation
email spam dataset
- URL / E-mail spam dataset
email spam detection
- about / E-mail spam detection
- email spam dataset, collecting / E-mail spam dataset
- default pipeline, creating / Feature generation
- training / Training and testing
- testing / Training and testing
- model performance, evaluating / Model performance
embedded approach / Embedded approach
energy efficiency dataset
- URL / Loading the data
ensambleSel.setOptions () method
- -L </path/to/modelLibrary> option / Model selection
- -W </path/to/working/directory> option / Model selection
- -B <numModelBags> option / Model selection
- -E <modelRatio> option / Model selection
- -V <validationRatio> option / Model selection
- -H <hillClimbIterations> option / Model selection
- -I <sortInitialization> option / Model selection
- -X <numFolds> option / Model selection
- -P <hillclimbMettric> option / Model selection
- -A <algorithm> option / Model selection
- -R option / Model selection
- -G option / Model selection
- -O option / Model selection
- -S <num> option / Model selection
- -D option / Model selection
ensemble algorithms
- about / Ensemble algorithms
- weighted majority algorithm (WMA) / Weighted majority algorithm
- online bagging algorithm / Online Bagging algorithm
- online boosting algorithm / Online Boosting algorithm
ensemble learning
- about / Ensemble learning, Ensemble learning and meta learners
- types / Ensemble learning and meta learners
- bootstrap aggregating (bagging) / Bootstrap aggregating or bagging
- boosting / Boosting
ensembleLibrary package
- using / Before we start
- URL / Before we start
ensembles
- used, for advanced modelling / Advanced modeling with ensembles
Ensemble Selection algorithm
- about / Advanced modeling with ensembles
environmental sensors
- about / Mobile phone sensors
error propagation / Error propagation
error reduction
- variance reduction / Variance reduction
- density weighted methods / Density weighted methods
Euclidean distance / Euclidean distance
Euclidean distances
- about / Euclidean distances
EUPLv1.1
- URL / Machine learning – tools and datasets
evaluate() method, parameters
- RecommenderBuilder / Evaluation
- DataModelBuilder / Evaluation
- DataModel / Evaluation
- trainingPercentage / Evaluation
- evaluationPercentage / Evaluation
evaluation
- about / Generalization and evaluation
evaluation criteria
- accuracy / Evaluation criteria
- balanced accuracy / Evaluation criteria
- Area under ROC curve (AUC) / Evaluation criteria
- Kappa statistic (K) / Evaluation criteria
- Kappa Plus statistic / Evaluation criteria
evaluation measures, clustering
- Cluster Mapping Measures (CMM) / Cluster Mapping Measures (CMM)
- V-Measure / V-Measure
- other measures / Other external measures
- purity / Other external measures
- entropy / Other external measures
- Recall / Other external measures
- F-Measure / Other external measures
- Precision / Other external measures
Exact Storm / Exact Storm
Expectation Maximization (EM) / Advantages and limitations
- about / Expectation maximization (EM) or Gaussian mixture modeling (GMM), How does it work?, How does it work?
Expectation Maximization (EM) clustering
- about / Clustering
exploitation
- about / Exploitation versus exploration
exploration
- about / Exploitation versus exploration
extended Jaccard Coefficient / Extended Jaccard coefficient
external evaluation measures
- about / External evaluation measures
- Rand index / Rand index
- F-Measure / F-Measure
- normalized mutual information index (NMI) / Normalized mutual information index

F

F-Measure / F-Measure
False Positive Rate (FPR) / Confusion matrix and related metrics
feature analysis
- about / Feature analysis and dimensionality reduction
- notation / Notation
- observation / Observations on feature analysis and dimensionality reduction
feature evaluation techniques
- about / Feature evaluation techniques
- filter approach / Filter approach
- wrapper approach / Wrapper approach
- embedded approach / Embedded approach
Feature extraction
- about / Building a machine learning application
feature extraction/generation
- about / Feature extraction/generation
- lexical features / Lexical features
- syntactic features / Syntactic features
- semantic features / Semantic features
feature map
- about / Deep convolutional networks
feature relevance analysis
- about / Feature relevance analysis and dimensionality reduction
- feature search techniques / Feature search techniques
- feature evaluation techniques / Feature evaluation techniques
features
- construction / Feature construction
feature search techniques / Feature search techniques
feature selection
- about / Data reduction, Feature selection
- Information theoretic techniques / Information theoretic techniques
- statistical-based techniques / Statistical-based techniques
- frequency-based techniques / Frequency-based techniques
feedforward neural networks
- about / Feedforward neural networks
file
- text data, importing / Importing from file
filter approach
- about / Filter approach
- univariate feature selection / Univariate feature selection
- multivariate feature selection / Multivariate feature selection
Fine Needle Aspirate (FNA) / Datasets and analysis
flow of influence, Bayesian networks / Flow of influence
Fourier transform
- reference link / Activity recognition pipeline
FP-Growth
- about / Weka
FP-growth algorithm
- about / FP-growth algorithm
- used, for discovering shopping patterns / FP-growth
FP-tree structure
- about / FP-growth algorithm
fraud detection, of insurance claims
- about / Fraud detection of insurance claims
- dataset, using / Dataset
- suspicious patterns, modelling / Modeling suspicious patterns
frequent pattern (FP)
- about / FP-growth algorithm
Friedman's test / Friedman's test

G

Gain charts / Gain charts and lift curves
Gain Ratio (GR) / Information theoretic techniques
Gaussian distribution / Gaussian distribution
Gaussian Mixture Model (GMM) / Gaussian Mixture Model
Gaussian mixture modeling (GMM)
- about / Expectation maximization (EM) or Gaussian mixture modeling (GMM), How does it work?
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Gaussian Radial Basis Kernel / How does it work?
Gaussian standard deviation / Gaussian standard deviation
Geeking with Greg
- URL / Websites and blogs
generalization
- about / Generalization and evaluation
- underfitting / Underfitting and overfitting
- overfitting / Underfitting and overfitting
- test set / Train and test sets
- train set / Train and test sets
- cross-validation / Cross-validation
- leave-one-out validation / Leave-one-out validation
- stratification / Stratification
Generalized Linear Models (GLM) / Feature relevance and analysis
Generalized Sequential Patterns (GSP)
- about / Weka
generative probabilistic models
- about / Generative probabilistic models
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitation / Advantages and limitations
Generative Stochastic Networks (GSNs)
- about / Restricted Boltzmann machine
Gibbs parameterization / Gibbs parameterization
Gibbs sampling
- about / Restricted Boltzmann machine
Gini index / How does it work?
GNU General Public License (GNU GPL)
- about / Weka
Google Prediction API / Machine learning as a service
Gradient Boosting Machine (GBM) / Feature relevance and analysis
graph
- concepts / Graph concepts
- structure and properties / Graph structure and properties
- subgraphs and cliques / Subgraphs and cliques
- path / Path, trail, and cycles
- trail / Path, trail, and cycles
- cycles / Path, trail, and cycles
graph data / Datasets used in machine learning
graph databases / Graph databases
Graphics Processing Unit (GPU)
- reference link / Build a Multilayer Convolutional Network
- about / Build a Multilayer Convolutional Network
graph mining / Machine learning – types and subtypes
GraphX
- about / Apache Spark, Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
grid based algorithm
- about / Grid based
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations

H

H2O
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
- as Big Data Machine Learning platform / H2O as Big Data Machine Learning platform
- architecture / H2O architecture
- machine learning / Machine learning in H2O
- tools / Tools and usage
- usage / Tools and usage
Hadoop
- about / Big data application architecture
- URL / Big data application architecture
Hadoop Distributed File System (HDFS)
- about / Apache Spark
Hamming distance
- about / Non-Euclidean distances
HBase / Columnar databases
- about / Big data application architecture
- URL / Big data application architecture
HDFS
- about / HDFS
HDFS, components / HDFS
- NameNode / HDFS
- Secondary NameNode / HDFS
- DataNode / HDFS
Hidden layer
- about / Feedforward neural networks
Hidden layer, issues
- vanishing gradients problem / Feedforward neural networks
- overfitting / Feedforward neural networks
Hidden Markov models
- about / Hidden Markov models for NER
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitation / Advantages and limitations
hidden Markov models (HMM) / How does it work?
- about / Apache Mahout
Hidden Markov models (HMM) / Hidden Markov models
Hidden Markov Models (HMMs)
- about / Transaction analysis
hierarchical clustering
- about / Clustering, Hierarchical clustering
- input / Input and output
- output / Input and output
- working / How does it work?
- single linkage / How does it work?
- complete linkage / How does it work?
- average linkage / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
high-dimensional-based methods
- about / High-dimensional-based methods
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
histogram-based anomaly detection
- about / Histogram-based anomaly detection
Hive / Hive and HQL
Hoeffding Trees (HT) / Hoeffding trees or very fast decision trees (VFDT)
- input / Inputs and outputs
- output / Inputs and outputs
- limitations / Advantages and limitations
- advantages / Advantages and limitations
Horse Colic Classification, case study
- about / Case Study – Horse Colic Classification
- reference link / Case Study – Horse Colic Classification
- business problem / Business problem
- machine learning, mapping / Machine learning mapping
- data analysis / Data analysis
- supervised learning, experiments / Supervised learning experiments
- analysis / Results, observations, and analysis
Hortonworks Data Platform (HDP) / Hortonworks Data Platform
Hotspot
- about / Weka
HQL / Hive and HQL
hybrid approach
- about / Hybrid approach
Hyperbolic tangent ("tanh") function / Hyperbolic tangent ("tanh") function
hyperplane / How does it work?

I

I-Map, Bayesian networks / I-Map
IBM Research team
- about / Advanced modeling with ensembles
IBM Watson Analytics / Machine learning as a service
image classification
- about / Image classification
- Deeplearning4j / Deeplearning4j
- MNIST dataset / MNIST dataset
- data, loading / Loading the data
- models, building / Building models
ImageNet
- about / Introducing image recognition
- URL / Deep convolutional networks
image recognition
- about / Introducing image recognition
- neural networks / Neural networks
incremental learning / Machine learning – types and subtypes
incremental supervised learning
- about / Incremental supervised learning
- modeling techniques / Modeling techniques
- validation / Validation, evaluation, and comparisons in online setting
- evaluation / Validation, evaluation, and comparisons in online setting
- comparisons, in online setting / Validation, evaluation, and comparisons in online setting
- model validation techniques / Model validation techniques
incremental unsupervised learning
- clustering, using / Incremental unsupervised learning using clustering
- modeling techniques / Modeling techniques
independent, identical distributions (i.i.d.) / Monitoring model evolution
Independent Component Analysis (ICA) / Advantages and limitations
inference, Bayesian networks
- about / Inference
- elimination-based inference / Elimination-based inference
- propagation-based techniques / Propagation-based techniques
- sampling-based techniques / Sampling-based techniques
inferencing / Machine learning – types and subtypes, Semantic reasoning and inferencing
influence space (IS) / How does it work?
information extraction / Information extraction and named entity recognition
Information gain (IG) / Information theoretic techniques
Infrastructure as a Service (IaaS) / Machine learning in the cloud
Input layer
- about / Feedforward neural networks
insurance claims
- fraud detection / Fraud detection of insurance claims
internal evaluation measures
- about / Internal evaluation measures
- compactness / Internal evaluation measures
- separation / Internal evaluation measures
- notation / Notation
- R-Squared / R-Squared
- Dunn's Indices / Dunn's Indices
- Davies-Bouldin index / Davies-Bouldin index
Internet of things (IoT) / Machine learning applications
Interquartile Ranges (IQR) / Outliers
interval data
- about / Measurement scales
Intrusion Detection (ID)
- about / Transaction analysis
inverse document frequency (IDF) / Inverse document frequency (IDF)
Isomap / Advantages and limitations
item-based analysis
- about / User-based and item-based analysis
item-based collaborative filtering
- about / Item-based filtering
iterative reweighted least squares (IRLS) / How does it work?

J

Jaccard distance
- about / Non-Euclidean distances
Java
- need for / The need for Java
Java-ML packages
- net.sf.javaml.classification / Java machine learning
- net.sf.javaml.clustering / Java machine learning
- net.sf.javaml.core / Java machine learning
- net.sf.javaml.distance / Java machine learning
- net.sf.javaml.featureselection / Java machine learning
- net.sf.javaml.filter / Java machine learning
- net.sf.javaml.matrix / Java machine learning
- net.sf.javaml.sampling / Java machine learning
- net.sf.javaml.tools / Java machine learning
- net.sf.javaml.utils / Java machine learning
java -Xmx16g
- about / Performance evaluation
Java API packages, Weka
- weka.associations / Weka
- weka.classifiers / Weka
- weka.clusterers / Weka
- weka.core / Weka
- weka.datagenerators / Weka
- weka.estimators / Weka
- weka.experiment / Weka
- weka.filters / Weka
- weka.gui / Weka
Java Class Library for Active Learning (JCLAL)
- URL / Machine learning – tools and datasets
- about / Machine learning – tools and datasets
- reference link / Tools and software
Java machine learning (Java-ML)
- about / Java machine learning
- URL / Java machine learning
JavaScript Object Notation (JSON) / Document collection and standardization
JKernelMachines (Transductive SVM) / Tools and software
joint distribution / Random variables, joint, and marginal distributions, Factor types

K

k-means
- about / k-Means
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
k-means clustering
- about / Clustering
k-nearest neighbors
- about / Underfitting and overfitting
k-Nearest Neighbors (k-NN) / Outliers
K-Nearest Neighbors (KNN)
- about / K-Nearest Neighbors (KNN)
- algorithm input / Algorithm inputs and outputs
- algorithm output / Algorithm inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Kaggle / Competitions
- about / Datasets
- URL / Datasets
KDD Challenge Datasets
- URL / Datasets
- about / Datasets
KDD Cup
- URL / Getting the data
KDnuggets / Machine learning as a service
- URL / Websites and blogs
KEEL
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
KEEL (Knowledge Extraction based on Evolutionary Learning)
- about / Tools and software
- reference link / Tools and software
kernel density estimation (KDE) / How does it work?
kernel methods
- about / Kernel methods
Kernel Principal Component Analysis (KPCA)
- about / Kernel Principal Component Analysis (KPCA)
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
kernel trick / How does it work?
key-value databases / Key-value databases
Key Performance Indicators (KPIs) / What is not machine learning?
Knime
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
KNIME
- about / KNIME
- references / KNIME
known-knowns
- about / Unknown-unknowns
known-unknowns
- about / Unknown-unknowns
Kohonen networks / How does it work?
Kolmogorov-Smirnov's test / Kolmogorov-Smirnov's test
Kubat / Widmer and Kubat
Kullback-Leibler (KL) / How does it work?

L

label SSL
- about / Cluster and label SSL
- output / Inputs and outputs
- input / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Latent Dirichlet
- about / MALLET
Latent Dirichlet Allocation
- about / Topic modeling
Latent Dirichlet Allocation (LDA) / Advantages and limitations
- about / Modeling
latent semantic analysis (LSA) / Dimensionality reduction
learning
- techniques / Training, validation, and test set
learning, Bayesian networks
- goals / Learning
- parameters / Learning parameters
- Maximum likelihood estimation (MLE) / Maximum likelihood estimation for Bayesian networks
- Bayesian parameter, estimation / Bayesian parameter estimation for Bayesian network
- Dirichlet distribution / Prior and posterior using the Dirichlet distribution
- structures / Learning structures
- structures, evaluating / Measures to evaluate structures
- structures, learning / Methods for learning structures, Advantages and limitations
- constraint-based techniques / Constraint-based techniques
- advantages and limitations / Advantages and limitations
- search and score-based techniques / Search and score-based techniques, How does it work?, Advantages and limitations
leave-one-out validation
- about / Leave-one-out validation
lemmatization
- about / Stemming or lemmatization
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
Leveraging Bagging (LB) / Supervised learning experiments
lexical features
- about / Lexical features
- character-based features / Character-based features
- word-based features / Word-based features
- part-of-speech tagging features / Part-of-speech tagging features
- taxonomy features / Taxonomy features
lift curves / Gain charts and lift curves
linear algorithm
- online linear models, with loss functions / Online linear models with loss functions
- Online Naive Bayes / Online Naïve Bayes
Linear Discriminant Analysis (LDA)
- about / Modeling
- reference link / Evaluating a model
Locally Linear Embedding (LLE) / How does it work?
linear models / Linear models
- Linear Regression / Linear Regression
- Naive Bayes / Naïve Bayes
- Logistic Regression / Logistic Regression
- about / Linear methods
- principal component analysis (PCA) / Principal component analysis (PCA)
- random projections (RP) / Random projections (RP)
- Multidimensional Scaling (MDS) / Multidimensional Scaling (MDS)
Linear Regression
- about / Linear Regression
- algorithm input / Algorithm input and output
- algorithm output / Algorithm input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
linear regression
- about / Linear regression
Lloyd's algorithm
- working / How does it work?
Local Outlier Factor (LOF) / How does it work?
- about / Histogram-based anomaly detection
LOF algorithm
- URL / Density based k-nearest neighbors
logical datasets
- about / Training, validation, and test set
Logistic Regression
- about / Logistic Regression
- algorithm input / Algorithm input and output
- algorithm output / Algorithm input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitation / Advantages and limitations

M

machine learning / Machine Learning
- about / Machine learning and data science
- advantages / What kind of problems can machine learning solve?
- supervised learning / What kind of problems can machine learning solve?, Machine learning – types and subtypes
- unsupervised learning / What kind of problems can machine learning solve?
- reinforcement learning / What kind of problems can machine learning solve?, Machine learning – types and subtypes
- in real life / Machine learning in real life
- noisy data / Noisy data
- class unbalance / Class unbalance
- feature selection / Feature selection is hard
- model chaining / Model chaining
- evaluation / Importance of evaluation
- models, in production / Getting models into production
- models, maintaining / Model maintenance
- in cloud / Machine learning in the cloud
- as service / Machine learning as a service
- history / Machine learning – history and definition
- definition / Machine learning – history and definition
- relationship with / Machine learning – history and definition
- concepts / Machine learning – concepts and terminology
- terminology / Machine learning – concepts and terminology
- types / Machine learning – types and subtypes
- subtypes / Machine learning – types and subtypes
- semi-supervised learning / Machine learning – types and subtypes
- graph mining / Machine learning – types and subtypes
- probabilistic graph modeling / Machine learning – types and subtypes
- inferencing / Machine learning – types and subtypes
- time-series forecasting / Machine learning – types and subtypes
- association analysis / Machine learning – types and subtypes
- stream learning / Machine learning – types and subtypes
- incremental learning / Machine learning – types and subtypes
- datasets, used / Datasets used in machine learning
- practical issues / Practical issues in machine learning
- roles / Machine learning – roles and process
- process / Machine learning – roles and process
- tools / Machine learning – tools and datasets
- datasets / Machine learning – tools and datasets
- mapping / Machine learning mapping, Machine learning mapping
- in H2O / Machine learning in H2O
- future / The future of Machine Learning
machine learning application
- building / Building a machine learning application
- traditional machine learning / Traditional machine learning architecture
- big data, dealing with / Dealing with big data
machine learning applications / Machine learning applications
Machine Learning for Language Toolkit (MALLET)
- about / MALLET
- URL / MALLET
machine learning libraries
- about / Machine learning libraries
- Waikato Environment for Knowledge Analysis (Weka) / Weka
- Java machine learning (Java-ML) / Java machine learning
- Apache Mahout / Apache Mahout
- Apache Spark / Apache Spark
- Deeplearning4j / Deeplearning4j
- Machine Learning for Language Toolkit (MALLET) / MALLET
- comparing / Comparing libraries
Machine learning mastery
- URL / Websites and blogs
machine translation (MT) / Machine translation
Mahalanobis distance
- about / Non-Euclidean distances, Java machine learning
Mahout interfaces, abstractions
- DataModel / Collaborative filtering
- UserSimilarity / Collaborative filtering
- ItemSimilarity / Collaborative filtering
- UserNeighborhood / Collaborative filtering
- Recommender / Collaborative filtering
Mahout libraries
- org.apache.mahout.cf.taste / Apache Mahout
- org.apache.mahout.classifier / Apache Mahout
- org.apache.mahout.clustering / Apache Mahout
- org.apache.mahout.common / Apache Mahout
- org.apache.mahout.ep / Apache Mahout
- org.apache.mahout.math / Apache Mahout
- org.apache.mahout.vectorizer / Apache Mahout
Mallet / Mallet
- installing / Installing Mallet
- URL / Installing Mallet, Machine learning – tools and datasets
- reference link / Pre-processing text data
- about / Machine learning – tools and datasets
mallet
- topic modeling / Topic modeling with mallet
MALLET, packages
- cc.mallet.classify / MALLET
- cc.mallet.cluster / MALLET
- cc.mallet.extract / MALLET
- cc.mallet.fst / MALLET
- cc.mallet.grmm / MALLET
- cc.mallet.optimize / MALLET
- cc.mallet.pipe / MALLET
- cc.mallet.topics / MALLET
- cc.mallet.types / MALLET
- cc.mallet.util / MALLET
Manhattan distance
- about / Java machine learning
manifold learning
- about / Manifold learning
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
marginal distribution / Random variables, joint, and marginal distributions
market basket analysis (MBA)
- about / Market basket analysis
- item affinity / Market basket analysis
- identification, of driver items / Market basket analysis
- trip classification / Market basket analysis
- store-to-store comparison / Market basket analysis
- revenue optimization / Market basket analysis
- marketing / Market basket analysis
- operations optimization / Market basket analysis
- affinity analysis / Affinity analysis
market data / Datasets used in machine learning
Markov blanket / Markov blanket
Markov chain
- about / Restricted Boltzmann machine
Markov chains
- about / Markov chains
- Hidden Markov models (HMM) / Hidden Markov models
- Hidden Markov models (HMM), most probable path / Most probable path in HMM
- Hidden Markov models (HMM), posterior decoding / Posterior decoding in HMM
Markov networks (MN) / Markov networks and conditional random fields
- representation / Representation
- parameterization / Parameterization
- Gibbs parameterization / Gibbs parameterization
- factor graphs / Factor graphs
- log-linear models / Log-linear models
- independencies / Independencies
- global / Global
- Pairwise Markov / Pairwise Markov
- Markov blanket / Markov blanket
- inference / Inference
- learning / Learning
- Conditional random fields (CRFs) / Conditional random fields
Markov random field (MRF) / Markov networks and conditional random fields
massively parallel processing (MPP) / Amazon Redshift
Massive Online Analysis (MOA)
- about / Tools and software
- references / Tools and software
- reference link / Analysis of stream learning results
mathematical transformation
- of feature / Outliers
matrix
- about / Matrix
- transpose / Transpose of a matrix
- addition / Matrix addition
- scalar multiplication / Scalar multiplication
- multiplication / Matrix multiplication
matrix product, properties
- about / Properties of matrix product
- linear transformation / Linear transformation
- matrix inverse / Matrix inverse
- eigendecomposition / Eigendecomposition
- positive definite matrix / Positive definite matrix
Maven plugin
- Apache Mahout, configuring with / Configuring Mahout in Eclipse with the Maven plugin
maximum entropy Markov model (MEMM)
- about / Maximum entropy Markov models for NER
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitation / Advantages and limitations
Maximum Likelihood Estimates (MLE) / How does it work?
Maximum likelihood estimation (MLE) / Maximum likelihood estimation for Bayesian networks
McNemar's Test / McNemar's Test
McNemar test / Comparing algorithms and metrics
mean / Mean
mean absolute error
- about / Mean absolute error
mean shift
- about / Mean shift
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
mean squared error
- about / Mean squared error
measurement scales
- about / Measurement scales
- nominal data / Measurement scales
- ordinal data / Measurement scales
- interval data / Measurement scales
- ratio data / Measurement scales
meta learners
- about / Ensemble learning and meta learners
Micro Clustering based Algorithm (MCOD) / Micro Clustering based Algorithm (MCOD)
Microsoft Azure HDInsight / Microsoft Azure HDInsight
Microsoft Azure Machine Learning / Machine learning as a service
Min-Max Normalization / Outliers
minimal redundancy maximal relevance (mRMR) / Minimal redundancy maximal relevance (mRMR)
Minimum Covariance Determinant (MCD) / How does it work?
Minimum Description Length (MDL) / How does it work?
Minkowski distance
- about / Java machine learning
missing values
- filling / Fill missing values
- handling / Handling missing values, Results, observations, and analysis
Modified National Institute of Standards and Technology (MNIST) / Data quality analysis
Mldata.org
- about / Datasets
- URL / Datasets
MLlib API library
- org.apache.spark.mllib.classification / Apache Spark
- org.apache.spark.mllib.clustering / Apache Spark
- org.apache.spark.mllib.linalg / Apache Spark
- org.apache.spark.mllib.optimization / Apache Spark
- org.apache.spark.mllib.recommendation / Apache Spark
- org.apache.spark.mllib.regression / Apache Spark
- org.apache.spark.mllib.stat / Apache Spark
- org.apache.spark.mllib.tree / Apache Spark
- org.apache.spark.mllib.util / Apache Spark
MNIST database
- reference link / Data collection
MNIST dataset
- about / MNIST dataset
MOA
- about / Machine learning – tools and datasets
mobile app
- classifier, plugging into / Plugging the classifier into a mobile app
mobile phone
- data, collecting / Collecting data from a mobile phone
mobile phone sensors
- about / Mobile phone sensors
- motion sensors / Mobile phone sensors
- environmental sensors / Mobile phone sensors
- position sensors / Mobile phone sensors
- URL, for Android / Mobile phone sensors
- URL, for Windows Phone / Mobile phone sensors
model
- chaining / Model chaining
- in production / Getting models into production
- maintenance / Model maintenance
- building / Model building
- linear models / Linear models
model assessment
- about / Model assessment, evaluation, and comparisons, Model assessment
model comparison
- about / Model assessment, evaluation, and comparisons, Model comparisons
- algorithms, comparing / Comparing two algorithms
- multiple algorithms, comparing / Comparing multiple algorithms
model evaluation
- about / Model assessment, evaluation, and comparisons
model evaluation metrics
- about / Model evaluation metrics, Model evaluation metrics
- confusion matrix / Confusion matrix and related metrics
- PRC curve / ROC and PRC curves
- ROC curve / ROC and PRC curves
- Gain charts / Gain charts and lift curves
- lift curves / Gain charts and lift curves
- Confusion Metrics, evaluation / Evaluation on Confusion Metrics
- ROC curves / ROC Curves, Lift Curves, and Gain Charts
- Lift Curves / ROC Curves, Lift Curves, and Gain Charts
- Gain Charts / ROC Curves, Lift Curves, and Gain Charts
model evolution, monitoring
- about / Monitoring model evolution
- Widmer / Widmer and Kubat
- Kubat / Widmer and Kubat
- drift detection method (DDM) / Drift Detection Method or DDM
- early drift detection method (EDDM) / Early Drift Detection Method or EDDM
modeling techniques
- linear algorithm / Linear algorithms
- non-linear algorithms / Non-linear algorithms
- ensemble algorithms / Ensemble algorithms
- partition based algorithm / Partition based
- hierarchical clustering / Hierarchical based and micro clustering
- micro clustering / Hierarchical based and micro clustering
- density based algorithm / Density based
- grid based algorithm / Grid based
models
- building / Building models
- single layer regression model, building / Building a single-layer regression model
- deep belief network, building / Building a deep belief network
- Multilayer Convolutional Network, building / Build a Multilayer Convolutional Network
- non-linear models / Non-linear models
- ensemble learning / Ensemble learning and meta learners
- meta learners / Ensemble learning and meta learners
- clustering analysis / Observations and clustering analysis
- observations / Observations and clustering analysis
model validation techniques
- about / Model validation techniques
- prequential evaluation / Prequential evaluation
- holdout evaluation / Holdout evaluation
- controlled permutations / Controlled permutations
- evaluation criteria / Evaluation criteria
- algorithms, versus metrics / Comparing algorithms and metrics
MongoDB
- about / Big data application architecture
- URL / Big data application architecture
most probable explanation (MPE) / MAP queries and marginal MAP queries
motion sensors
- about / Mobile phone sensors
Mozilla Thunderbird
- about / E-mail spam detection
Multi-layered neural network
- inputs / Inputs, neurons, activation function, and mathematical notation
- neuron / Inputs, neurons, activation function, and mathematical notation
- activation function / Inputs, neurons, activation function, and mathematical notation
- mathematical notation / Inputs, neurons, activation function, and mathematical notation
- about / Multi-layered neural network
- structure and mathematical notations / Structure and mathematical notations
- activation functions / Activation functions in NN
- training / Training neural network
multi-layered perceptron (MLP) / Feature relevance and analysis
Multi-layer feed-forward neural network
- about / Multi-layer feed-forward neural network
multi-view SSL
- about / Co-training SSL or multi-view SSL
Multidimensional Scaling (MDS)
- about / Multidimensional Scaling (MDS)
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Multilayer Convolutional Network
- about / Building models
- building / Build a Multilayer Convolutional Network
multinomial distribution / Random variables, joint, and marginal distributions
multiple algorithms, comparing
- ANOVA test / ANOVA test
- Friedman's test / Friedman's test
multivariate feature analysis
- about / Multivariate feature analysis
- scatter plots / Multivariate feature analysis
- ScatterPlot Matrix / Multivariate feature analysis
- parallel plots / Multivariate feature analysis
multivariate feature selection
- about / Multivariate feature selection
- minimal redundancy maximal relevance (mRMR) / Minimal redundancy maximal relevance (mRMR)
- correlation-based feature selection (CFS) / Correlation-based feature selection (CFS)
myrunscollector package
- Globals.java class / Loading the data collector
- CollectorActivity.java class / Loading the data collector
- SensorsService.java class / Loading the data collector

N

Naive Bayes
- about / Underfitting and overfitting, Naïve Bayes
- algorithm input / Algorithm input and output
- algorithm output / Algorithm input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Naive Bayes (NB) / Feature relevance and analysis
naive Bayes baseline
- implementing / Implementing naive Bayes baseline
named entity recognition / Information extraction and named entity recognition
named entity recognition (NER)
- about / Named entity recognition
- Hidden Markov models / Hidden Markov models for NER
- Maximum entropy Markov model (MEMM) / Maximum entropy Markov models for NER
natural language processing (NLP)
- about / NLP, subfields, and tasks, Deep learning and NLP
- text categorization / Text categorization
- part-of-speech tagging (POS tagging) / Part-of-speech tagging (POS tagging)
- text clustering / Text clustering
- information extraction / Information extraction and named entity recognition
- named entity recognition / Information extraction and named entity recognition
- sentiment analysis / Sentiment analysis and opinion mining
- opinion mining / Sentiment analysis and opinion mining
- coreference resolution / Coreference resolution
- Word sense disambiguation (WSD) / Word sense disambiguation
- machine translation (MT) / Machine translation
- semantic reasoning / Semantic reasoning and inferencing
- inferencing / Semantic reasoning and inferencing
- text summarization / Text summarization
- question, automating / Automating question and answers
- answers, automating / Automating question and answers
Nemenyi test / Comparing algorithms and metrics
Neo4j
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
- URL, for licensing / Machine learning – tools and datasets
Neo4j / Graph databases
neural network
- about / Underfitting and overfitting
neural network, training
- about / Training neural network
- empirical risk minimization / Empirical risk minimization
- parameter initialization / Parameter initialization
- loss function / Loss function
- gradients / Gradients
- feed forward and backpropagation / Feed forward and backpropagation, How does it work?
neural networks
- about / Neural networks
- perceptron / Perceptron
- feedforward neural networks / Feedforward neural networks
- autoencoder / Autoencoder
- Restricted Boltzmann machine / Restricted Boltzmann machine
- deep convolutional networks / Deep convolutional networks
- limitations / Limitations of neural networks, Vanishing gradients, local optimum, and slow training
No Free Lunch Theorem (NFLT) / Model building
nominal data
- about / Measurement scales
non-Euclidean distance
- about / Non-Euclidean distances
non-linear algorithms
- about / Non-linear algorithms
- Hoeffding Trees (HT) / Hoeffding trees or very fast decision trees (VFDT)
- very fast decision trees (VFDT) / Hoeffding trees or very fast decision trees (VFDT)
non-linear models
- about / Non-linear models
- Decision Trees / Decision Trees
- K-Nearest Neighbors (KNN) / K-Nearest Neighbors (KNN)
- support vector machines (SVM) / Support vector machines (SVM)
non-negative Matrix factorization (NMF)
- about / Non-negative matrix factorization (NMF)
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Non-negative Matrix Factorization (NNMF) / Clustering techniques
nonlinear methods
- about / Nonlinear methods
- Kernel Principal Component Analysis (KPCA) / Kernel Principal Component Analysis (KPCA)
- manifold learning / Manifold learning
normalization
- about / Outliers
- Min-Max Normalization / Outliers
- Z-Score Normalization / Outliers
Normalized mutual information (NMI) / Normalized mutual information index
NoSQL
- about / NoSQL
- key-value databases / Key-value databases
- document databases / Document databases
- columnar databases / Columnar databases
- graph databases / Graph databases
notations, supervised learning
- about / Formal description and notation
- instance / Formal description and notation
- label / Formal description and notation
- binary classification / Formal description and notation
- regression / Formal description and notation
- dataset / Formal description and notation
Noun Phrase (NP)
- about / Syntactic features

O

one-class SVM
- about / One-class SVM
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
online bagging algorithm
- about / Online Bagging algorithm
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
online boosting algorithm
- about / Online Boosting algorithm
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
online courses
- about / Online courses
Online k-Means
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
online learning engine
- about / Online learning engine
online linear models, with loss functions
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Online Naive Bayes
- about / Online Naïve Bayes
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
OpenMarkov / OpenMarkov
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
opinion mining / Sentiment analysis and opinion mining
Oracle Database Online Documentation
- URL / Dataset
ordinal data
- about / Measurement scales
OrientDB / Graph databases
outlier algorithms
- about / Outlier algorithms
- statistical-based / Outlier algorithms, Statistical-based
- distance-based / Outlier algorithms
- density-based / Outlier algorithms
- clustering-based / Outlier algorithms
- high-dimension-based / Outlier algorithms
- distance-based methods / Distance-based methods
- density-based methods / Density-based methods
- clustering-based methods / Clustering-based methods
- high-dimensional-based methods / High-dimensional-based methods
- one-class SVM / One-class SVM
outlier detection
- about / Outlier or anomaly detection
- outlier algorithms / Outlier algorithms
- outlier evaluation techniques / Outlier evaluation techniques
- used, for unsupervised learning / Unsupervised learning using outlier detection
- partition-based clustering / Partition-based clustering for outlier detection
- input / Inputs and outputs, Inputs and outputs
- output / Inputs and outputs, Inputs and outputs
- working / How does it work?, How does it work?
- advantages / Advantages and limitations, Advantages and limitations
- limitations / Advantages and limitations, Advantages and limitations
- Exact Storm / Exact Storm
- Abstract-C / Abstract-C
- Direct Update of Events (DUE) / Direct Update of Events (DUE)
- Micro Clustering based Algorithm (MCOD) / Micro Clustering based Algorithm (MCOD)
- Approx Storm / Approx Storm
- validation techniques / Validation and evaluation techniques
- evaluation techniques / Validation and evaluation techniques
outlier evaluation techniques
- about / Outlier evaluation techniques
- supervised evaluation / Supervised evaluation
- unsupervised evaluation / Unsupervised evaluation
- technique / Unsupervised evaluation
outlier models
- observation / Observations and analysis
- analysis / Observations and analysis
outliers
- removing / Remove outliers
- handling / Outliers
- detecting, in data / Outliers
- IQR / Outliers
- distance-based methods / Outliers
- density-based methods / Outliers
- mathematical transformation, of feature / Outliers
- handling, robust statistical algorithms used / Outliers
Output layer
- about / Feedforward neural networks
overfits
- about / Applied machine learning workflow
overfitting
- about / Underfitting and overfitting
oversampling / Undersampling and oversampling

P

p-norm distance
- about / Euclidean distances
Page-Hinckley test / CUSUM and Page-Hinckley test
Paired-t test / Paired-t test
pairwise-adaptive similarity / Pairwise-adaptive similarity
PAPI
- URL / Machine learning as a service
parallel plots / Multivariate feature analysis
Parquet / Columnar databases
part-of-speech (POS)
- about / Working with text data
part-of-speech tagging (POS tagging) / Part-of-speech tagging (POS tagging)
partial memory
- about / Partial memory
- full memory / Full memory
- detection methods / Detection methods
- adaptation methods / Adaptation methods
partition based algorithm
- Online k-Means / Online k-Means
pattern analysis
- about / Pattern analysis
Pearson coefficient
- about / Content-based filtering
Pearson correlation coefficient
- about / Java machine learning
perceptron
- about / Artificial neural networks, Introducing image recognition, Perceptron
peristalsis / Visualization analysis
Phrase (VP)
- about / Syntactic features
Pipelines
- reference link / Random Forest
plan recognition
- about / Plan recognition
Poisson distribution / Poisson distribution
Polynomial Kernel / How does it work?
Portable Format for Analytics (PFA) / Predictive Model Markup Language
position sensors
- about / Mobile phone sensors
positive definite matrix / Positive definite matrix
positive semi-definite matrix / Positive definite matrix
PRC curve / ROC and PRC curves
Pre-processing phase
- about / Building a machine learning application
precision
- about / Precision and recall
Prediction.IO / Machine learning as a service
predictive apriori
- about / Weka
Predictive Model Markup Language (PMML)
- about / Predictive Model Markup Language
Prepositional Phrase (PP)
- about / Syntactic features
Principal Component Analysis (PCA) / Embedded approach
- about / Histogram-based anomaly detection
principal component analysis (PCA) / Dimensionality reduction
- about / Principal component analysis (PCA)
- inputs / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Principal component analysis (PCA)
- about / Data reduction
principal components / How does it work?
Principal Components Analysis (PCA)
- about / Kernel methods
probabilistic classifiers
- about / Probabilistic classifiers
probabilistic graphical models (PGM) / Machine learning – tools and datasets
probabilistic graph modeling / Machine learning – types and subtypes
probabilistic latent semantic analysis (PLSA)
- about / Probabilistic latent semantic analysis (PLSA)
- input / Input and output
- output / Input and output
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
probabilistic latent semantic index (PLSI) / Topic modeling
Probabilistic Principal Component Analysis (PPCA) / Advantages and limitations
probability
- about / Probability revisited
- concepts / Concepts in probability
- conditional probability / Conditional probability
- Chain rule and Bayes' theorem / Chain rule and Bayes' theorem
- random variables / Random variables, joint, and marginal distributions
- joint / Random variables, joint, and marginal distributions
- marginal distributions / Random variables, joint, and marginal distributions
- marginal independence / Marginal independence and conditional independence
- conditional independence / Marginal independence and conditional independence
- factors / Factors
- factors, types / Factor types
- distribution queries / Distribution queries
- probabilistic queries / Probabilistic queries
- MAP queries / MAP queries and marginal MAP queries
- marginal MAP queries / MAP queries and marginal MAP queries
process, machine learning
- about / Process
- business problem, identifying / Process
- mapping / Process
- data collection / Process
- data quality analysis / Process
- data sampling / Process
- transformation / Process
- feature analysis / Process
- feature selection / Process
- modeling / Process
- model evaluation / Process
- model selection / Process
- model deployment / Process
- model performance, monitoring / Process
processors / SAMOA architecture
Propagation-based techniques, Bayesian networks
- about / Propagation-based techniques
- belief propagation / Belief propagation
- factor graph / Factor graph
- factor graph, messaging in / Messaging in factor graph
- input and output / Input and output
- working / How does it work?
- advantages and limitations / Advantages and limitations
publish-subscribe frameworks / Publish-subscribe frameworks

Q

Query by Committee (QBC)
- about / Query by Committee (QBC)
Query by disagreement (QBD)
- about / Query by disagreement (QBD)

R

R-Squared / R-Squared
Radial Basis Function (RBF) / Inputs and outputs
Rand index / Rand index
Random Forest / Random Forest
Random Forest (RF) / Feature relevance and analysis, Random Forest
random projections (RP)
- about / Random projections (RP)
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
RapidMiner
- about / Machine learning – tools and datasets, Case Study – Horse Colic Classification
- URL / Machine learning – tools and datasets
- experiments / RapidMiner experiments
- visualization analysis / Visualization analysis
- feature selection / Feature selection
- model process flow / Model process flow
- model evaluation metrics / Model evaluation metrics
ratio data
- about / Measurement scales
real-time Big Data Machine Learning
- about / Real-time Big Data Machine Learning
- SAMOA / SAMOA as a real-time Big Data Machine Learning framework
- machine learning algorithms / Machine Learning algorithms
- tools / Tools and usage
- usage / Tools and usage
- experiments / Experiments, results, and analysis
- results / Experiments, results, and analysis
- analysis / Experiments, results, and analysis
- results, analysis / Analysis of results
real-time stream processing / Real-time stream processing
real-world case study
- about / Real-world case study
- tools / Tools and software
- software / Tools and software
- business problem / Business problem
- machine learning, mapping / Machine learning mapping
- data collection / Data collection
- data quality analysis / Data quality analysis
- data sampling / Data sampling and transformation
- data transformation / Data sampling and transformation
- feature analysis / Feature analysis and dimensionality reduction
- dimensionality reduction / Feature analysis and dimensionality reduction
- models, clustering / Clustering models, results, and evaluation
- results / Clustering models, results, and evaluation, Outlier models, results, and evaluation
- evaluation / Clustering models, results, and evaluation, Outlier models, results, and evaluation
- outlier models / Outlier models, results, and evaluation
reasoning, Bayesian networks
- patterns / Reasoning patterns
- causal or predictive reasoning / Causal or predictive reasoning
- evidential or diagnostic reasoning / Evidential or diagnostic reasoning
- intercausal reasoning / Intercausal reasoning
- combined reasoning / Combined reasoning
recall
- about / Precision and recall
receiver operating characteristics (ROC) / Machine learning – concepts and terminology
Receiver Operating Characteristics (ROC)
- about / Roc curves
recommendation engine
- basic concepts / Basic concepts
- key concepts / Key concepts
- user-based analysis / User-based and item-based analysis
- item-based analysis / User-based and item-based analysis
- similarity, calculating / Approaches to calculate similarity
- exploitation / Exploitation versus exploration
- exploration / Exploitation versus exploration
- book-recommendation engine, building / Building a recommendation engine
Recurrent neural networks (RNN)
- about / Recurrent Neural Networks
- structure / Structure of Recurrent Neural Networks
- learning / Learning and associated problems in RNNs
- issues / Learning and associated problems in RNNs
- Long short term memory (LSTM) / Long Short Term Memory
- Gated Recurrent Units (GRUs) / Gated Recurrent Units
regression
- about / Regression, Underfitting and overfitting, Regression, Formal description and notation
- linear regression / Linear regression
- evaluating / Evaluating regression
- mean squared error / Mean squared error
- mean absolute error / Mean absolute error
- correlation coefficient / Correlation coefficient
- data, loading / Loading the data
- attributes, analyzing / Analyzing attributes
- model, building / Building and evaluating regression model
- model, evaluating / Building and evaluating regression model
- tips / Tips to avoid common regression problems
regression model
- evaluating / Building and evaluating regression model
- building / Building and evaluating regression model
- linear regression / Linear regression
- regression trees / Regression trees
regression trees
- about / Regression trees
regularization
- about / Regularization
- L2 regularization / L2 regularization
- L1 regularization / L1 regularization
reinforcement learning / Machine learning – types and subtypes
- about / What kind of problems can machine learning solve?
representation, Bayesian networks
- about / Representation
- definition / Definition
resampling / Is sampling needed?
Resilient Distributed Dataset (RDD)
- about / Apache Spark
Resilient Distributed Datasets (RDD) / Spark architecture
Restricted Boltzmann machine
- about / Restricted Boltzmann machine
restricted Boltzmann machine
- about / Artificial neural networks
restricted Boltzmann machines (RBM)
- about / Deeplearning4j
Restricted Boltzmann Machines (RBM)
- about / Restricted Boltzmann Machines
- definition and mathematical notation / Definition and mathematical notation
- Conditional distribution / Conditional distribution
- free energy / Free energy in RBM
- training / Training the RBM
- sampling / Sampling in RBM
- contrastive divergence / Contrastive divergence, How does it work?
- persistent contrastive divergence / Persistent contrastive divergence
ROC curve / ROC and PRC curves
ROC curves
- about / Roc curves
roles, machine learning
- about / Roles
- business domain expert / Roles
- data engineer / Roles
- project manager / Roles
- data scientist / Roles
- machine learning expert / Roles
RuleSetModel / Predictive Model Markup Language

S

SAMOA
- about / Machine learning – tools and datasets, SAMOA as a real-time Big Data Machine Learning framework
- URL / Machine learning – tools and datasets
- architecture / SAMOA architecture
sampling
- about / Machine learning – concepts and terminology, Sampling
- uniform random sampling / Machine learning – concepts and terminology
- stratified random sampling / Machine learning – concepts and terminology
- cluster sampling / Machine learning – concepts and terminology
- systematic sampling / Machine learning – concepts and terminology
sampling-based techniques, Bayesian networks
- about / Sampling-based techniques
- forward sampling with rejection / Forward sampling with rejection, How does it work?
Samza / SAMOA as a real-time Big Data Machine Learning framework
scalar product
- of vectors / Scalar product of vectors
Scale Invariant Feature Transform (SIFT)
- about / Introducing image recognition
ScatterPlot Matrix / Multivariate feature analysis
scatter plots / Multivariate feature analysis
score function
- about / Supervised learning
self-organizing maps (SOM)
- about / Self-organizing maps (SOM)
- inputs / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
self-training SSL
- about / Self-training SSL
- inputs / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
- Query by Committee (QBC) / Query by Committee (QBC)
semantic features / Semantic features
semantic reasoning / Semantic reasoning and inferencing
semi-supervised learning / Machine learning – types and subtypes
Semi-Supervised Learning (SSL)
- about / Semi-supervised learning
- representation / Representation, notation, and assumptions
- notation / Representation, notation, and assumptions
- assumptions / Representation, notation, and assumptions
- assumptions, to be true / Representation, notation, and assumptions
- techniques / Semi-supervised learning techniques
- self-training SSL / Self-training SSL
- multi-view SSL / Co-training SSL or multi-view SSL
- co-training SSL / Co-training SSL or multi-view SSL
- label SSL / Cluster and label SSL
- cluster SSL / Cluster and label SSL
- transductive graph label propagation / Transductive graph label propagation
- transductive SVM (TSVM) / Transductive SVM (TSVM)
- advantages / Advantages and limitations
- disadvantages / Advantages and limitations
- data distribution sampling / Data distribution sampling
Semi-Supervised Learning (SSL), case study
- about / Case study in semi-supervised learning
- tools / Tools and software
- software / Tools and software
- business problem / Business problem
- machine learning, mapping / Machine learning mapping
- data collection / Data collection
- data quality, analysis / Data quality analysis
- data sampling / Data sampling and transformation
- data transformation / Data sampling and transformation
- datasets / Datasets and analysis
- datasets, analysis / Datasets and analysis
- feature analysis, results / Feature analysis results
- experiments / Experiments and results
- results / Experiments and results
- analysis / Analysis of semi-supervised learning
sentiment analysis / Sentiment analysis and opinion mining
sequential data / Datasets used in machine learning
shrinking methods
- embedded approach / Embedded approach
Sigmoid function / Sigmoid function
Sigmoid Kernel / How does it work?
Silhouette's index / Silhouette's index
similar items
- searching / Find similar items
similarity calculation
- about / Approaches to calculate similarity
- collaborative filtering / Collaborative filtering
- content-based filtering / Content-based filtering
- hybrid approach / Hybrid approach
similarity measures
- about / Similarity measures
- Euclidean distance / Euclidean distance
- Cosine distance / Cosine distance
- pairwise-adaptive similarity / Pairwise-adaptive similarity
- extended Jaccard Coefficient / Extended Jaccard coefficient
- Dice coefficient / Dice coefficient
SimRank
- about / Non-Euclidean distances
single layer regression model
- building / Building a single-layer regression model
Singular value decomposition (SVD)
- about / Data reduction
Singular Value Decomposition (SVD) / Advantages and limitations
singular value decomposition (SVD) / Dimensionality reduction, Singular value decomposition (SVD)
sliding windows
- about / Sliding windows
SMILE
- reference link / Tools and software
Smile
- URL / Machine learning – tools and datasets
- about / Machine learning – tools and datasets
software / Tools and software
source-sink frameworks / Source-sink frameworks
Spark-MLlib
- about / Machine learning – tools and datasets
- URL / Machine learning – tools and datasets
Spark core, components
- Resilient Distributed Datasets (RDD) / Spark architecture
- Lineage graph / Spark architecture
Spark MLlib
- used, as Big Data Machine Learning / Spark MLlib as Big Data Machine Learning platform
- architecture / Spark architecture
- machine learning / Machine Learning in MLlib
- tools / Tools and usage
- usage / Tools and usage
- experiments / Experiments, results, and analysis
- results / Experiments, results, and analysis
- analysis / Experiments, results, and analysis
- reference link / Experiments, results, and analysis
- k-Means / k-Means
- k-Means, with PCA / k-Means with PCA
- k-Means with PCA, bisecting / Bisecting k-Means (with PCA)
- Gaussian Mixture Model (GMM) / Gaussian Mixture Model
- Random Forest / Random Forest
- results, analysis / Analysis of results
Spark SQL / Spark SQL
Spark Streaming
- about / Apache Spark, Real-time Big Data Machine Learning
sparse coding
- about / Sparse coding
spatio-temporal patterns
- about / Transaction analysis
Spearman's footrule distance
- about / Java machine learning
spectral clustering
- about / Spectral clustering
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
SQL frameworks / SQL frameworks
stacked autoencoders
- about / Autoencoder
standard deviation / Standard deviation
standardization
- about / Document collection and standardization
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
standards and markup languages
- about / Standards and markup languages
Statistical-based
- about / Statistical-based
- input / Inputs and outputs
- outputs / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Statistics 110 (Harvard) by Joe Blitzstein
- URL / Online courses
stemming / Stemming or lemmatization
step execution mode / Amazon Elastic MapReduce
Stochastic Gradient Descent (SGD) / Supervised learning experiments
- about / How does it work?
stop words removal
- about / Stop words removal
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
stratification
- about / Stratification
stratified sampling / Stratified sampling
stream / SAMOA architecture
stream computational technique
- about / Basic stream processing and computational techniques, Stream computations
- frequency count / Stream computations
- point queries / Stream computations
- distinct count / Stream computations
- mean / Stream computations
- standard deviation / Stream computations
- correlation coefficient / Stream computations
- sliding windows / Sliding windows
- sampling / Sampling
stream learning / Machine learning – types and subtypes
stream learning, case study
- about / Case study in stream learning
- tools / Tools and software
- software / Tools and software
- business problem / Business problem
- machine learning, mapping / Machine learning mapping
- data collection / Data collection
- data sampling / Data sampling and transformation
- data transformation / Data sampling and transformation
- feature analysis / Feature analysis and dimensionality reduction
- dimensionality reduction / Feature analysis and dimensionality reduction
- models / Models, results, and evaluation
- results / Models, results, and evaluation
- evaluation / Models, results, and evaluation
- supervised learning experiments / Supervised learning experiments
- concept drift experiments / Concept drift experiments
- clustering experiments / Clustering experiments
- outlier detection experiments / Outlier detection experiments
- results, analysis / Analysis of stream learning results
Stream Processing Engines (SPE) / Real-time stream processing
stream processing technique
- about / Basic stream processing and computational techniques
structured data
- sequential data / Datasets used in machine learning
Structure Score Measure / Measures to evaluate structures
subfields
- about / NLP, subfields, and tasks
Subspace Outlier Detection (SOD) / How does it work?
Sum of Squared Errors (SSE) / Clustering models, results, and evaluation, Experiments, results, and analysis
sum transfer function
- about / Perceptron
supermarket dataset
- about / The supermarket dataset
- shopping patterns, discovering / Discover patterns
- shopping patterns, discovering with Apriori algorithm / Apriori
- shopping patterns, discovering with FP-growth algorithm / FP-growth
supervised learning / Machine learning – types and subtypes
- about / What kind of problems can machine learning solve?, Supervised learning
- classification / Classification
- regression / Regression
- experiments / Supervised learning experiments
- Weka, experiments / Weka experiments
- RapidMiner, experiments / RapidMiner experiments
- reference link / Results, observations, and analysis
- and unsupervised learning, common issues / Issues in common with supervised learning
- assumptions / Assumptions and mathematical notations
- mathematical notations / Assumptions and mathematical notations
Support Vector Machine (SVM) model / Predictive Model Markup Language
Support Vector Machines (SVM)
- about / Kernel methods, How does it work?
support vector machines (SVM)
- about / Support vector machines (SVM)
- algorithm input / Algorithm inputs and outputs
- algorithm output / Algorithm inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
survivorship bias
- about / Sampling traps
suspicious behaviour detection
- about / Suspicious and anomalous behavior detection
suspicious pattern detection
- about / Suspicious pattern detection
suspicious patterns, modelling
- about / Modeling suspicious patterns
- vanilla approach / Vanilla approach
- dataset rebalancing / Dataset rebalancing
SVM
- about / Underfitting and overfitting
Syntactic features
- about / Syntactic features
Syntactic Language Models (SLM)
- about / Syntactic features
Synthetic Minority Oversampling Technique (SMOTE) / Undersampling and oversampling
Sample, Explore, Modify, Model, and Assess (SEMMA)
- about / SEMMA methodology

T

target variables
- churn probability / Challenge
- appetency probability / Challenge
- upselling probability / Challenge
tasks
- about / NLP, subfields, and tasks
term frequency (TF) / Frequency-based techniques
Term Frequency (TF) / Term frequency (TF)
term frequency-inverse document frequency (TF-IDF) / Term frequency-inverse document frequency (TF-IDF)
Tertius
- about / Weka
test set
- about / Train and test sets
text categorization
- about / Text categorization
text classification
- about / Text classification
- examples / Text classification
text clustering / Text clustering
- about / Text clustering
- feature transformation / Feature transformation, selection, and reduction
- selection / Feature transformation, selection, and reduction
- reduction / Feature transformation, selection, and reduction
- techniques / Clustering techniques
- evaluation / Evaluation of text clustering
text data
- extracting / Working with text data
- importing / Importing data
- importing, from directory / Importing from directory
- importing, from file / Importing from file
- pre-processing / Pre-processing text data
text mining
- about / Introducing text mining
- topic modeling / Topic modeling, Topic modeling
- text classification / Text classification
- topics / Topics in text mining
- categorization/classification / Text categorization/classification
- clustering / Text clustering
- named entity recognition (NER) / Named entity recognition
- Deep Learning / Deep learning and NLP
- NLP / Deep learning and NLP
text processing components
- about / Text processing components and transformations
- document collection / Document collection and standardization
- standardization / Document collection and standardization
- tokenization / Tokenization
- stop words removal / Stop words removal
- lemmatization / Stemming or lemmatization
- local-global dictionary / Local/global dictionary or vocabulary?
- vocabulary / Local/global dictionary or vocabulary?
- feature extraction/generation / Feature extraction/generation
- feature representation / Feature representation and similarity
- similarity / Feature representation and similarity
- feature selection / Feature selection and dimensionality reduction
- dimensionality reduction / Feature selection and dimensionality reduction
text summarization / Text summarization
time-series forecasting / Machine learning – types and subtypes
time series data
- anomaly detection / Anomaly detection in time series data
tokenization
- about / Tokenization
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
tools / Tools and software
- about / Tools and usage
- Mallet / Mallet
- KNIME / KNIME
tools, machine learning
- RapidMiner / Machine learning – tools and datasets
- Weka / Machine learning – tools and datasets
- KNIME / Machine learning – tools and datasets
- Mallet / Machine learning – tools and datasets
- ELKI / Machine learning – tools and datasets
- JCLAL / Machine learning – tools and datasets
- KEEL / Machine learning – tools and datasets
- DeepLearning4J / Machine learning – tools and datasets
- Spark-MLlib / Machine learning – tools and datasets
- H2O / Machine learning – tools and datasets
- MOA/SAMOA / Machine learning – tools and datasets
- Neo4j / Machine learning – tools and datasets
- GraphX / Machine learning – tools and datasets
- OpenMarkov / Machine learning – tools and datasets
- Smile / Machine learning – tools and datasets
topic modeling
- about / Topic modeling, Topic modeling
- probabilistic latent semantic analysis (PLSA) / Probabilistic latent semantic analysis (PLSA)
- with mallet / Topic modeling with mallet
- business problem / Business problem
- machine learning, mapping / Machine Learning mapping
- data collection / Data collection
- data sampling / Data sampling and transformation
- transformation / Data sampling and transformation
- feature analysis / Feature analysis and dimensionality reduction
- dimensionality reduction / Feature analysis and dimensionality reduction
- models / Models, results, and evaluation
- results / Models, results, and evaluation
- evaluation / Models, results, and evaluation
- text processing results, analysis / Analysis of text processing results
topic modelling, for BBC news
- about / Topic modeling for BBC news
- BBC dataset, collecting / BBC dataset
- modeling / Modeling
- model, evaluating / Evaluating a model
- model, reusing / Reusing a model
- model, saving / Saving a model
- model, restoring / Restoring a model
traditional machine learning
- architecture / Traditional machine learning architecture
training data
- collecting / Collecting training data
Training data
- about / Building a machine learning application
training phases
- competitive phase / How does it work?
- cooperation phase / How does it work?
- adaptive phase / How does it work?
train set
- about / Train and test sets
transaction analysis
- about / Transaction analysis
transaction data / Datasets used in machine learning
transductive graph label propagation
- about / Transductive graph label propagation
- input / Inputs and outputs
- output / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
transductive SVM (TSVM)
- about / Transductive SVM (TSVM)
- output / Inputs and outputs
- input / Inputs and outputs
- working / How does it work?
- advantages / Advantages and limitations
- limitations / Advantages and limitations
transformations
- about / Text processing components and transformations
Tree augmented network (TAN)
- about / Tree augmented network
- input and output / Input and output
- working / How does it work?
- advantages and limitations / Advantages and limitations
TreeModel / Predictive Model Markup Language
Tunedit
- about / Datasets
- URL / Datasets

U

UCI machine learning repository
- URL / Datasets
UCI repository
- reference link / Data Collection
UC Irvine (UCI) database
- about / Datasets
- URL / Datasets
Udemy
- URL / Online courses
uncertainty sampling
- about / Uncertainty sampling
- working / How does it work?
- least confident sampling / Least confident sampling
- smallest margin sampling / Smallest margin sampling
- label entropy sampling / Label entropy sampling
- advantages / Advantages and limitations
- limitations / Advantages and limitations
underfits
- about / Applied machine learning workflow
underfitting
- about / Underfitting and overfitting
undersampling / Undersampling and oversampling
univariate feature analysis
- about / Univariate feature analysis
- categorical features / Categorical features
- continuous features / Continuous features
univariate feature selection
- information theoretic approach / Information theoretic approach
- statistical approach / Statistical approach
Universal PMML Plug-in (UPPI) / Predictive Model Markup Language
unknown-unknowns
- about / Unknown-unknowns
unnormalized measure / Factor types
unstructured data / Datasets used in machine learning
- mining, issues / Issues with mining unstructured data
unsupervised learning / Machine learning – types and subtypes
- about / What kind of problems can machine learning solve?, Unsupervised learning
- similar items, searching / Find similar items
- clustering / Clustering
- specific issues / Issues specific to unsupervised learning
- assumptions / Assumptions and mathematical notations
- mathematical notations / Assumptions and mathematical notations
- outlier detection, used / Unsupervised learning using outlier detection
usage / Tools and usage
user-based analysis
- about / User-based and item-based analysis
user-based collaborative filtering
- about / User-based filtering
US Forest Service (USFS) / Data collection
US Geological Survey (USGS) / Data collection

V

V-Measure
- about / V-Measure
- Homogeneity / V-Measure
- Completeness / V-Measure
validation
- techniques / Training, validation, and test set
vanilla approach
- about / Vanilla approach
Variable elimination (VE) algorithm / Variable elimination algorithm
variance / Variance
vector
- about / Vector
- scalar product / Scalar product of vectors
vector space model (VSM)
- about / Vector space model
- binary / Binary
- Term Frequency (TF) / Term frequency (TF)
- inverse document frequency (IDF) / Inverse document frequency (IDF)
- term frequency-inverse document frequency (TF-IDF) / Term frequency-inverse document frequency (TF-IDF)
version space sampling
- about / Version space sampling
- Query by disagreement (QBD) / Query by disagreement (QBD)
very fast decision trees (VFDT) / Hoeffding trees or very fast decision trees (VFDT)
- output / Inputs and outputs
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Very Fast K-means Algorithm (VFKM) / Advantages and limitations
visualization analysis
- about / Visualization analysis
- univariate feature analysis / Univariate feature analysis
- multivariate feature analysis / Multivariate feature analysis
Vote Entropy
- disadvantages / How does it work?

W

Waikato Environment for Knowledge Analysis (Weka)
- about / Weka
- URL / Weka
web resources and competitions
- about / Web resources and competitions, Competitions
- datasets / Datasets
- online courses / Online courses
- websites and blogs / Websites and blogs
- venues and conferences / Venues and conferences
website traffic
- anomaly detection / Anomaly detection in website traffic
weighted linear sum (WLS) / How does it work?
weighted linear sum of squares (WSS) / How does it work?
weighted majority algorithm (WMA)
- about / Weighted majority algorithm
- input / Inputs and outputs
- output / Inputs and outputs
- working / Advantages and limitations
- advantages / Advantages and limitations
- limitations / Advantages and limitations
Weka
- URL / Machine learning – tools and datasets
- about / Machine learning – tools and datasets, Case Study – Horse Colic Classification
- experiments / Weka experiments
- Sample end-to-end process, in Java / Sample end-to-end process in Java
- experimenter / Weka experimenter and model selection
- model selection / Weka experimenter and model selection
weka.classifiers package
- weka.classifiers.bayes / Weka
- weka.classifiers.evaluation / Weka
- weka.classifiers.functions / Weka
- weka.classifiers.lazy / Weka
- weka.classifiers.meta / Weka
- weka.classifiers.mi / Weka
- weka.classifiers.rules / Weka
- weka.classifiers.trees / Weka
Weka 3.6
- URL / Before you start
- downloading / Before you start
Weka Bayesian Network GUI / Weka Bayesian Network GUI
WEKA Packages
- URL / Before we start
Welch's t test / Welch's t test
Widmer / Widmer and Kubat
Wilcoxon signed-rank test / Wilcoxon signed-rank test
word2vec
- about / Working with text data
- URL / Working with text data
Word sense disambiguation (WSD) / Word sense disambiguation
workflow, Applied Machine Learning
- data and problem definition / Applied machine learning workflow
- data collection / Applied machine learning workflow
- data preprocessing / Applied machine learning workflow
- data analysis and modeling / Applied machine learning workflow
- evaluation / Applied machine learning workflow
wrapper approach / Wrapper approach

X

Xiaming Chen
- URL / Datasets

Y

Yahoo traffic dataset
- URL / Dataset

Z

Z-Score Normalization / Outliers
ZeroMQ Message Transfer Protocol (ZMTP) / Message queueing frameworks
