Index
B
- back-off bin, What about rare categories?
- bag-of-n-grams, Bag-of-n-Grams
- bag-of-words (BoW) featurization, Bag-of-Words
- basis vectors, From Vectors to Subspaces
- bias, Overview of Linear Classification
- “Big Learning Made Easy—With Counts!” blog post, Bin Counting
- bigrams, Bag-of-n-Grams
- Bilenko, Misha, Bin Counting
- bin counting, Categorical Variables: Counting Eggs in the Age of Robotic Chickens, Bin Counting-Counts without bounds
- binarization (of counts), Binarization
- binning
- binomial distribution, Hypothesis testing for collocation extraction
- Box-Cox transforms, Power Transforms: Generalization of the Log Transform
C
- C-HOG blocks, How are neighborhoods defined? How should they cover the image?
- categorical variables, Categorical Variables: Counting Eggs in the Age of Robotic Chickens-Bibliography
- chunking, Chunking and part-of-speech tagging
- class-imbalanced dataset, Creating a Classification Dataset
- classification
- classification dataset, creating, Creating a Classification Dataset
- clustering algorithms, Nonlinear Featurization via K-Means Model Stacking
- collisions, Feature Hashing
- collocations, Collocation Extraction for Phrase Detection-Summary
- column space, Column space
- complex features, Fancy Tricks with Simple Numbers
- convolutional layers in neural networks, Convolutional Layers-Convolutional Layers
- convolutions, Image Gradients
- count-min sketch, What about rare categories?
- counts, Dealing with Counts-Log Transformation
- CountVectorizer transformer, Bag-of-n-Grams, Scaling Bag-of-Words with Tf-Idf Transformation
- covariance between random variables (in PCA), Variance and Empirical Variance
D
- data
- data leakage
- data matrix, Deep Dive: What Is Happening?-Deep Dive: What Is Happening?, Overview of Linear Classification
- data space, Scalars, Vectors, and Spaces
- data visualization, importance of, Log Transform in Action
- decision surface, Overview of Linear Classification
- decision tree models, Interaction Features, Feature Selection
- deep learning
- delimiters, Parsing and Tokenization
- dense featurization with k-means, Alternative Dense Featurization
- dimensionality of subspaces, From Vectors to Subspaces
- dimensionality reduction, Dimensionality Reduction: Squashing the Data Pancake with PCA
- distance, Intuition
- distribution, Fancy Tricks with Simple Numbers
- document frequency, Scaling Bag-of-Words with Tf-Idf Transformation
- document-term matrix, Deep Dive: What Is Happening?
- dummy coding, Dummy Coding-Effect Coding
F
- factor analysis, Use Cases
- feature engineering, Fancy Tricks with Simple Numbers
- feature extraction, The Simplest Image Features (and Why They Don’t Work)
- feature hashing, Feature Hashing-Feature Hashing
- feature normalization, Feature Scaling or Normalization
- (see also feature scaling)
- feature scaling, Feature Scaling or Normalization-ℓ2 Normalization, The Effects of Feature Scaling: From Bag-of-Words to Tf-Idf-Summary
- feature selection, Feature Selection-Feature Selection
- feature space, Scalars, Vectors, and Spaces
- FeatureHasher, scikit-learn, Feature Hashing
- features
- filtering, Feature Selection, Item-Based Collaborative Filtering
- filters
- financial modeling, use of PCA in, Use Cases
- fixed-width binning, Fixed-width binning
- frequency-based filtering (text data), Frequency-Based Filtering
- frequent words, filtering from text, Frequent words
- fully connected layers (in neural networks), Fully Connected Layers
G
- Gaussian distribution, Fancy Tricks with Simple Numbers
- Gaussian filter, applying to an image, Convolutional Layers-Convolutional Layers
- gradient boosting tree (GBT) classifiers, k-Means Featurization for Classification, Pros, Cons, and Gotchas
- gradient orientation histograms, Gradient Orientation Histograms
- grid search, Tuning Logistic Regression with Regularization
- GridSearchCV function, scikit-learn, Tuning Logistic Regression with Regularization
H
- hard clustering, k-Means Clustering
- hash functions, Feature Hashing
- (see also feature hashing)
- heatmap of paper recommendations, Academic Paper Recommender: Naive Approach
- heavy-tailed distribution, Log Transformation
- HOG (Histogram of Oriented Gradients), Manual Feature Extraction: SIFT and HOG-Learning Image Features with Deep Neural Networks
- horizontal image gradients, Image Gradients
- hyperparameter tuning, Tuning Logistic Regression with Regularization
- hyperparameters, Tuning Logistic Regression with Regularization
- hyperplanes, Overview of Linear Classification
- hypothesis testing, using for collocation extraction, Hypothesis testing for collocation extraction
I
- image descriptors, Manual Feature Extraction: SIFT and HOG
- image feature extraction, Automating the Featurizer: Image Feature Extraction and Deep Learning-Summary
- image gradients, Image Gradients-Image Gradients
- image neighborhoods, How are neighborhoods defined? How should they cover the image?
- ImageNet Large Scale Visual Recognition Challenge (ILSVRC), Learning Image Features with Deep Neural Networks
- indices, maintaining assignments during conversions, Academic Paper Recommender: Take 2
- inner product, From Vectors to Subspaces
- interaction features, Fancy Tricks with Simple Numbers, Interaction Features-Interaction Features
- intercept, Dummy Coding
- intercept term, Overview of Linear Classification
- intrinsic dimensionality, Intuition
- inverse document frequency, Tf-Idf : A Simple Twist on Bag-of-Words
- item-based collaborative filtering, Item-Based Collaborative Filtering
L
- left null space, Left null space
- likelihood ratio test analysis, Hypothesis testing for collocation extraction
- linear algebra
- linear classification
- linear combination, From Vectors to Subspaces
- linear correlation, Variance and Empirical Variance
- linear dependent features, One-Hot Encoding
- linear independence, From Vectors to Subspaces
- linear operators, The Anatomy of a Matrix
- linear projection (in PCA), Linear Projection
- linear regression
- log transforms, Fancy Tricks with Simple Numbers, Log Transformation-Feature Scaling or Normalization
- log-odds ratio for bin counting, Bin Counting
- logical functions, Fancy Tricks with Simple Numbers
- logistic regression
M
- machine learning
- magnitude of numeric data, Fancy Tricks with Simple Numbers
- manifold (nonlinear subspace), Nonlinear Featurization via K-Means Model Stacking
- manifold learning, Nonlinear Featurization via K-Means Model Stacking
- mask, Convolutional Layers
- mathematical formulas, Models
- mathematical modeling, Models
- matrices
- matrix-vector formulation, principal components, Principal Components: Matrix-Vector Formulation
- mean, Scaling Bag-of-Words with Tf-Idf Transformation
- metric (k-means), k-Means Clustering
- Microsoft Academic Graph dataset, Item-Based Collaborative Filtering
- min-max scaling, Min-Max Scaling
- missing data, Models
- model evaluation, Model Evaluation
- model stacking, Fancy Tricks with Simple Numbers
- models
N
- n-grams, Bag-of-n-Grams, Parsing and Tokenization
- natural language processing (NLP), Collocation Extraction for Phrase Detection
- neighborhoods (image), How are neighborhoods defined? How should they cover the image?
- neural networks (deep), learning image features with, Learning Image Features with Deep Neural Networks-Structure of AlexNet
- NLP (natural language processing), Collocation Extraction for Phrase Detection
- NLTK Python package, Stemming
- nonlinear dimensionality reduction, Nonlinear Featurization via K-Means Model Stacking
- nonlinear embedding, Nonlinear Featurization via K-Means Model Stacking
- nonlinear featurization, Nonlinear Featurization via K-Means Model Stacking
- nonlinear manifold feature extraction (k-means), Nonlinear Featurization via K-Means Model Stacking
- nonordinal values, Categorical Variables: Counting Eggs in the Age of Robotic Chickens
- normalization, Fancy Tricks with Simple Numbers
- normalization constant, ℓ2 Normalization
- null space, Null space, Solving a Linear System
- numeric data, Fancy Tricks with Simple Numbers-Summary
  - counts, Dealing with Counts-Log Transformation
  - feature scaling or normalization, Feature Scaling or Normalization-ℓ2 Normalization
  - feature selection, Feature Selection-Feature Selection
  - interaction features, Interaction Features-Interaction Features
  - log transformation, Log Transformation-Feature Scaling or Normalization
  - scalars, vectors, and spaces, Scalars, Vectors, and Spaces-Dealing with Counts
- NumPy sparse array, converting Pandas DataFrame to, Academic Paper Recommender: Take 2
P
- Pandas
- parsing, Parsing and Tokenization
- part-of-speech (PoS) tagging, Chunking and part-of-speech tagging
- PCA (principal component analysis), Dimensionality Reduction: Squashing the Data Pancake with PCA-Bibliography
  - considerations and limitations, Considerations and Limitations of PCA-Considerations and Limitations of PCA
  - derivation, Derivation-PCA in Action
  - implementing PCA, Implementing PCA
  - linear projection, Linear Projection
  - principal components, first formulation, Principal Components: First Formulation
  - principal components, general solution of, General Solution of the Principal Components
  - principal components, matrix-vector formulation, Principal Components: Matrix-Vector Formulation
  - transforming features using linear projection, Transforming Features
  - variance and empirical variance, Variance and Empirical Variance
  - use cases, Use Cases
  - using on scikit-learn digits dataset, PCA in Action-PCA in Action
  - whitening and ZCA, Whitening and ZCA
- phrase detection, Filtering for Cleaner Features
- Poisson distribution, Power Transforms: Generalization of the Log Transform
- pooling layers (in neural networks), Pooling Layers
- Porter stemmer, Stemming
- power transforms, Fancy Tricks with Simple Numbers, Power Transforms: Generalization of the Log Transform-Feature Scaling or Normalization
- principal component analysis (see PCA)
- probability plots (probplots), Power Transforms: Generalization of the Log Transform
- Pythagorean theorem, ℓ2 Normalization
- Python
R
- R-HOG blocks, How are neighborhoods defined? How should they cover the image?
- radial basis function support vector machine (RBF SVM), k-Means Featurization for Classification, Pros, Cons, and Gotchas
- random forest classifiers, k-Means Featurization for Classification
- rank or dimensionality (subspaces), From Vectors to Subspaces
- rare categories, What about rare categories?
- rare words, filtering from text, Rare words
- raw counts, PCA and, Considerations and Limitations of PCA
- receiver operating characteristic (ROC) curves, k-Means Featurization for Classification
- recommender for academic papers (example), Back to the Feature: Building an Academic Paper Recommender-Bibliography
  - first pass, data import, cleaning, and feature parsing, First Pass: Data Import, Cleaning, and Feature Parsing-Academic Paper Recommender: Naive Approach
  - second pass, more engineering and smarter model, Second Pass: More Engineering and a Smarter Model-Third Pass: More Features = More Information
  - third pass, more features and more information, Third Pass: More Features = More Information-Academic Paper Recommender: Take 3
- rectified linear unit, Rectified Linear Unit (ReLU) Transformation
- rectified linear unit (ReLU) transformation, Rectified Linear Unit (ReLU) Transformation-Rectified Linear Unit (ReLU) Transformation
- redundant data, Models
- reference category, Dummy Coding
- regularization constraints, adding to a model, Solving a Linear System
- regularization, tuning logistic regression with, Tuning Logistic Regression with Regularization-Tuning Logistic Regression with Regularization
- resampling, Tuning Logistic Regression with Regularization
- response normalization layers (in neural networks), Response Normalization Layers
- robustness, Binarization
- row space, Row space
S
- scalars, Scalars, Vectors, and Spaces, From Vectors to Subspaces
- scale, Fancy Tricks with Simple Numbers
- scikit-learn
- SciPy, stats package, Power Transforms: Generalization of the Log Transform
- sentences, analysis of, Parsing and Tokenization
- separators, Parsing and Tokenization
- SIFT (Scale Invariant Feature Transform), Manual Feature Extraction: SIFT and HOG-Learning Image Features with Deep Neural Networks
- sigmoid function, Classification with Logistic Regression, Rectified Linear Unit (ReLU) Transformation
- signed feature hashing, Feature Hashing
- singular value decomposition (SVD) of a matrix, Derivation, Principal Components: Matrix-Vector Formulation, Singular Value Decomposition (SVD)-Singular Value Decomposition (SVD), Solving a Linear System
- singular vectors, Singular Value Decomposition (SVD)
- space characters, Parsing and Tokenization
- spaces, Scalars, Vectors, and Spaces
- spaCy library, Chunking and part-of-speech tagging
- span of a set of vectors, From Vectors to Subspaces
- sparse data
- spectrum (of a matrix), Considerations and Limitations of PCA, Singular Value Decomposition (SVD)
- standardization, Standardization (Variance Scaling)
- statistical factor model, Use Cases
- statistical modeling, Models
- stats package, Power Transforms: Generalization of the Log Transform
- stemming, Stemming
- stocks, correlation patterns in, Use Cases
- stopwords, Stopwords
- string objects, Parsing and Tokenization
- subspaces, From Vectors to Subspaces
- Swiss roll, Nonlinear Featurization via K-Means Model Stacking
T
- tanh function, Rectified Linear Unit (ReLU) Transformation
- target engineering, Fancy Tricks with Simple Numbers, Binarization
- target hints, k-means featurization with/without, k-Means Featurization for Classification
- text data, Text Data: Flattening, Filtering, and Chunking-Summary
- TextBlob library, Chunking and part-of-speech tagging
- tf-idf (term frequency-inverse document frequency), Tf-Idf : A Simple Twist on Bag-of-Words-Putting It to the Test
- tokenization, Bag-of-n-Grams, Parsing and Tokenization
- training
- tree-based models, Fancy Tricks with Simple Numbers
V
- variance, Scaling Bag-of-Words with Tf-Idf Transformation
- variance scaling, Standardization (Variance Scaling)
- variance-stabilizing transformations, Power Transforms: Generalization of the Log Transform
- (see also log transforms; power transforms)
- vector quantization, Clustering as Surface Tiling
- vector spaces, Scalars, Vectors, and Spaces
- vectors, Scalars, Vectors, and Spaces, From Vectors to Subspaces
- vertical image gradients, Image Gradients
- volume, Intuition