Index
A
Accuracy
Activation functions
gradient descent
hyperbolic tangent (tanh)
key improvements
LeakyReLU
linear functions
nonlinear functions
nonlinearity/variability
rectified linear unit
scaled linear activation
SELU
sigmoid function
swish function
Activation Maximization
AdaBoostClassifier
Adaptive Boosting (AdaBoost)
Adaptive Moment Estimation (Adam)
Adaptive Relation Modeling Network (ARM-Net)
adaptive relation modeling module
benchmark models
definition
module
PyTorch
Adult Census dataset
Aggregating/ensembling models
ak.StructuredDataBlock
AlexNet
Alignment-score-computing network
Alpha value
Ames Housing dataset
applyKernel function
Artificial neural network (ANN)
Asymmetric convolutions
Attention-based tabular modeling
Attention mechanism
Bahdanau-style attention
BERT
context vector
deep learning models
definition
GELU
Keras
attention scores
bidirectional model
bidirectional recurrent model
implementation
multi-head attention model
synthetic dataset
unidirectional model
LSTM
multimodal
natural language models
sequence-to-sequence dataset
SHA-RNN model
text sequences
transformer architecture
Audio files
Audio model
Autoencoders
See also Vanilla
abstractions
architecture
decoding
denoising
encoding
encrypted data
image-to-text encoding
internal capabilities
language representation
natural language
nontrivial patterns
pretraining
reparative
sending/receiving
sparse
tabular data
AutoInt model
AutoKeras model
AutoKeras NAS algorithm
Auto-ML
AverageEnsemble class
Average weighting
Averaging ensemble
B
Backpropagation process
Backpropagation Through Time (BPTT)
Bag of Words (BoW) model
Bahdanau-style attention layer
Batches
Batch normalization
Bayesian optimization
BERT architecture
Bidirectionality
Binary classification problem
Binary cross-entropy (BCE)
Black box
Blank canvas
Boosting, gradient
AdaBoost
LightGBM
regression/classification
sigmoid function
theory/intuition
visualization
XGBoost
buildAutoencoder function
Builder
C
CatNN
Cell state
Classical machine learning model
Classification and Regression Trees (CART)
Click-through rate (CTR) prediction problems
Cluster
Code snippet
Colon-bracket syntax
Compartmentalized form
Compressed Sparse Row (CSR)
Conditional Tabular GAN (CTGAN)
anonymize_field option
approaches
conditional generation
official documentation
official implementation
performance
simple CTGAN demo
synthetic data generated
tabular data generation
Confusion Matrix
Constructor
Contiguous semantics
Continuous vote
Conveyor belt
Convolutional component
Convolutional neural networks (CNNs)
ANN
architectures
build architecture
EfficientNet
flattened MNIST dataset
heatmap
Inception v3
definition
expanded filter bank module
ImageNet
Keras
module A/B
modules
MNIST digits dataset
parameters
pooling operation
reshaping MNIST dataset
ResNet
submodels
sweep
validation performance modeling
Convolution operation
8-by-8 uniform blurring kernel
first/second layer kernel
flattening layer
fourth layer kernel
grayscale images
identity kernel
implementation
input shape
kernel
loss/accuracy history
low dimensional image
MNIST dataset
operation, kernel
parameters
peek.predict
pixel-by-pixel image
stacking
3-by-3 blurring kernel
3-by-3 sharpening kernel
training/validation performance
2-by-2 uniform blurring kernel
Convolutions
Convolution windows
Corrupted image
Cross-validation
Custom loss
cv2 function
D
Data generation algorithms
Data leakage
Data pipeline
Data points
Data preparation/preprocessing
bigrams
BoW model
categorical encoding methods
components
continuous quantitative data
deep sentiment analysis extraction
elementary operations
encoding process
engineering
See Engineering techniques
entropy
extraction/engineering methods
feature extraction
feature selection
few-shot learning scheme
geographical data
high-correlation method
Information Gain
key components
keyword search function
LASSO coefficient selection
linear discriminant analysis
min-max scaling
natural language processing framework
n-gram data
non-invertible function
normalized distribution
one-shot learning scheme
original vs. robust-scaled distribution
partial data destruction
PCA
See Principal Component Analysis (PCA)
permutation importance
quantitative representation
raw vectorization
recursive feature elimination
robust scaling
selection methods
sentiment extraction implementation
single/multi-feature transformations
sparsity
standardization
storage/manipulation
See TensorFlow datasets
supervised learning scheme
TARS model
text data
TF-IDF encoding
time/temporal data
train/validation splitting
transformations
t-SNE
vaderSentiment library
variances
Word2Vec
zero-shot learning scheme
Decision trees
advantages
CART
features
Gini Impurity/Entropy
high-variance intuition
implementation
leaf nodes
mitigate variance intuition
node class
prediction functions
random forest
scikit-learn implementation
decoder.predict(…)
Decoding
Deep Double Descent theory
DeepGBM
DeepInsight
Deep Jointly Informed Neural Network (DJINN)
Deep learning (DL)
Keras
wide model
Deep neural decision trees (DNDTs)
Default method vs. half-filling method
Denoising autoencoders
application
blanket fashion
MNIST data
MNIST images
noise level
noisy images
noisy input
noisy/perturbed input
performance
predicted output
random noise
reflective standard deviation
reparative models
sample of images
tabular data
triviality
unperturbed desired output
validation
Denoising loss
Dense-convolution-recurrent model
DenseNet
Digit recognition
Direct tabular attention modeling
Direct tabular recurrent modeling
Discriminator
Disjunctive normal form (DNF)
Disjunctive normal neural form (DNNF) block
Distillation
DeepGBM
components
definition
embeddings
GBDT
notations
training
definition
Dow Jones Industrial Average (DJIA)
.drop() command
E
EfficientNetB1 model
EfficientNet models
Element-wise function
Embedding mechanisms
Embeddings
Embedding vector
encoder.predict(…)
Encoding process
Ames Housing dataset
binary representation
discrete data
enumerate() function
frequency
James-Stein encoding
label encoding models
leave-one-out encoding
multicollinearity
numerical representation
one-hot representation
sparsity/multicollinearity
target representation
TF-IDF representation
WoE technique
Engineering techniques
distribution
extraction methods
feature extraction
homogeneous
PCA
research
statistics
Entire GAN model
Epidemic forecasting
Epochs
Exclusive Feature Bundling (EFB)
explain_instance method
Exploding gradients
Extreme Gradient Boosting (XGBoost)
F
False Positive Rate (FPR)
False validation scores
Fast Gradient Sign Method (FGSM)
Feature matching
Feed-forward (FF) layers
Feed-forward operation
activation
See Activation functions
feature feeds
hidden layer
notations
First-order optimization approaches
Flexibility
Flow-based malware detection
Force plot visualization
Forget gate
Fully connected component
Function vectorization
G
Gated linear unit activation (GLU)
Gated recurrent unit (GRU) network
components
forget and input gates
hidden state
hyperbolic tangent activation
LSTMs
representation
structure/complexity
update gate
Gauging model
Gaussian Error Linear Unit (GELU) activation
Generative Adversarial Network (GAN)
conditional
formal training algorithm
generator model
gradients
image generation
optimization procedure
schematic
simple GAN, TensorFlow
techniques
theory
Generator
Geographical data
get_alpha() method
Global pooling
Gradient-based methods
Gradient-Based One-Side Sampling (GOSS)
Gradient Boosting
GradientBoostingClassifier
Gradient Boosting Decision Tree (GBDT)
Gradient Boosting Machines (GBMs)
Gradient Boosting neural networks (GrowNet)
ANNs
architecture
definition
GBMs
package
training procedures
train/predict methods
XGBoost
Gradient explanations
Gradients
exploding
vanishing
Grayscale representation
Grid of values
Grid search
GrowNet
H
Hadamard product
Halving
Helper function
Hidden states
Hierarchical Data Format (h5)
Higgs Boson dataset
High-correlation method
High precision
Historical averaging
Horizontal rule
Human-designed compression
Human voice audio
Hyperbolic tangent (tanh)
Hyperopt
Hyperoptimization
Hyperparameter optimization tools
Hypothetical objective function
I, J
Identity kernel
Image Generation for Tabular Data (IGTD)
Inception v3 model
Indexing commands
Information compression factor
Information gain
Information-rich signals
Initial memory states
architecture
dual linkage
feed-forward layer
hidden state
hidden-state-learned tabular recurrent model
LSTMs
optimal transformation
recurrent layer
schematic diagram
semantics
sequence
sinusoidal position encodings
tabular data
tabular input
timesteps
Input-informed ensemble model architecture
Input-informed weighting
Interactive visualization
Internal thinking processes
Interpretation
Intertwined loops
K
Keras
batch sizes
bidirectional LSTM
hidden layer
input shape
multiple layers
preprocessing data
real-time forecasting
regularization learning networks
syntax
tabular data prediction
Keras application models
Keras library
customization
deeper dive
batch normalization
callbacks
concatenation
dropouts
early stopping
embedding layer
evaluation phase
functional API
hidden layers
inserting activation
model architecture
model checkpoint
multi-input/multi-output models
nonlinear topology models
parameters
plotted model diagram
summary() method
validation
weight sharing intuition
modeling
architecture
dense layers
Fashion MNIST dataset
imshow() function
model compilation
sequential model
softmax
source code
training process
2D array
workflow
ONEIROS project
TensorFlow installation
Keras neural network
keras.utils.plot_model
Kernel-PCA
K-Nearest Neighbors (KNN)
advantages/disadvantages
algorithm concept
boosting, gradient
decision trees
Elbow method
high-correlation method
implementation
linear regression
logistic regression
Minkowski distance
NumPy array
one-hot encoding
Scikit-learn implementation
single test data
standardization
theory/intuition
Chebyshev distance
Euclidean distance
features
Manhattan distance
Minkowski distance
normalization
visualization
L
Labels
Label smoothing
Layer freezing
Leaky gates
Least Absolute Shrinkage and Selection Operator (LASSO)
Least Square Theory
__len__ and __getitem__ methods
Light Gradient Boosting Machines (LightGBM)
Linear Discriminant Analysis (LDA)
Linear mappings
Linear output activation
Linear regression models
ElasticNet regression
gradient descent
implementation
deterministic results
gradient descent
Scikit-learn’s implementation
source code
LASSO regression
regularization
ridge regression
theory/intuition
chain rule/derivatives
composite function
equation
estimation line/data point
explanatory variable
gradient descent
slope-intercept form
variables
weight values
variations
Local Interpretable Model-Agnostic Explanations (LIME)
initialize
matplotlib
validation dataset
Logistic regression models
binary classification
binary cross-entropy (BCE)
classification
derivatives
gradient descent
implementation
scikit-learn
sigmoid function
softmax function
theory/intuition
variations
Long short-term memory (LSTM)
cell
cell state
diagram
gate handles
memories
timestamp
vanishing gradient
Long-term memories
Loss functions
M
Machine learning (ML)
accuracy
algorithms
applications
confusion matrix
Deep Double Descent phenomenon
F1 score/F-beta implementation
KNN
See K-Nearest Neighbors (KNN)
MAE
MSE
metrics/evaluation
modeling
See Modeling
neural networks
overfitting-underfitting paradigm
precision
recall implementation
ROC-AUC implementation
Scikit-learn implementation
Manual method
Many-to-many prediction
Mapping
Marginal contributions
Masked language modeling (MLM)
Mean Absolute Error (MAE)
Mean-based grayscale representation
Mean squared error (MSE)
Medical diagnosis datasets
Melting
Memory cells
Memory-equipped recurrent models
Memory neuron
Meta-evaluation model
Meta-learning
Meta-model architecture
Meta-model error estimator
Meta-nonlinearities
Meta-optimization
components
controller model/controlled model
meta-parameters
meta-parameter space
objective function
optimization procedure
Meta-overfitting
Meta-parameters
Mice Protein Expression dataset
min()
Mini-batch stochastic gradient descent (SGD)
Min-max scaling
MNIST dataset
Model accuracy
Model agnostic
Modeling
approximations/representations
automated modeling
bias-variance trade-off
approaches
bias/variance errors
data points
decomposition
high/low-variance
representation
training data
underfitting vs. overfitting
clustering algorithm
dimensionality reduction algorithms
domains and levels
equivalent binary representations
facial recognition
fashion models
feature space representation
Chebyshev distance
count proportion
distance
Euclidean distance
geometric shapes
hypercubes
hyperplane separation
one-dimensional code
plotting code
proportion
ratio calculation
three-dimensional code
two-dimensional code
fundamental principles
learning
ML data cycle
alignment/misalignment
DataFrame
data leakage
dataset process
data type hierarchy
dominant definitions
dummy dataset construction
ecommerce platform
feature set and a label set
k-fold evaluation
logical structures
manual implementation
memorization/genuine learning
phenomena/concepts
random seeding
train/validation dataset
modes
optimization/gradient descent
beta parameter
global minimum
gradient descent
landscape model
linear regression
loss functions
mean squared error
optimization process
optimizers
parameters
prediction and gradient functions
sequences
phenomena
quantitative conversion/representation
regression/classification problems
scientific model
steady-state approximation
supervised learning
unsupervised learning
Modern language models
Multi-head attention mechanism
Multi-head attention model
Multi-head intersample self-attention mechanism
Multi-head self-attention (MSA)
Multi-input function
Multimodal
applications
components
definition
EfficientNetB1
embedding mechanism
improvements
Keras visualization
requests data
structure
tabular dataset
TensorFlow Sequence dataset
training performance
Multimodal recurrent modeling
compiling and fitting the model
convolutional model
convolutions
embeddings
high precision/nonlinearity
multimodal stock data
recurrent layers
relevant components
sequential input
software reviews dataset
standard feature vector
standard feed-forward fashion
tabular inputs
temporal relationship
time series
training and validation
usage
vectorizing
Multimodal text
Multi-model arrangement
Multi-modeling ensemble
Multipart function
Multiple aggregations
Multi-stack recurrent model
Multitask autoencoders
absolute error
α-adjusting curve
autoencoding task
compiling stage
decoders
dimensions of performance
dynamic fashion
encoder
epochs
learning
MNIST dataset
original task model
outputs
recompiling and fitting
reconstruction loss
reconstruction task
refitting
sigmoid equation
sigmoid function
training regime
visualization
N
“Naïve” meta-optimization algorithms/procedures
NAS-discovered architecture
Natural language models
architecture
double-LSTM
feature maps
multi-head attention approach
multimodal dataset
recurrent language models
score matrix
submodel
TripAdvisor dataset
Natural language processing
Nested search space
Nesterov accelerated gradient (NAG)
Neural Architecture Search (NAS)
Neural network model
backpropagation
backpropagation algorithm
backpropagation process
feed-forward
See Feed-forward operation
gradient descent
hard-coded algorithms
Keras
See Keras library
loss functions
multilayer perceptron model
optimizers
Adam optimizer
adaptive methods
backpropagation
gradient descent
mini-batch SGD
Nesterov accelerated gradient
perceptron model
regularization
research papers
self-normalizing model
SELU activation function
simple illustration
single perceptron
tabular data
See Tabular neural networks
tabular deep learning
Universal Approximation Theorem
variations
Neural network search structure
Next sentence prediction (NSP)
No-gradient optimization
Noisy image
Noisy normal distribution
Nonlinearity
Nonlinear topology models
Nontemporal relationship
Nontraditional usage
Nontrivial task
Normal distribution
Novel modeling paradigm
NumPy arrays
advanced indexing
construction
data types
functions
image manipulation
indexing
manipulate
reassignment
reshape
NumPy functions
O
Oblivious decision tree
One-dimensional array
One-dimensional convolutions, tabular data
architectural components
contiguous semantics
custom function
cybersecurity
encoding component
identification synthetic dataset
kernel
loss/accuracy
measurements
meta-parameters
noise standard deviation
numElements elements
performance history
powerful/sophisticated soft ordering
training/validation datasets
Open-ended Neuroelectronic Intelligent Robot Operating System (ONEIROS)
Optimal neural network architecture
Ordinary Least Squares (OLS)
OR operation
P, Q
Pandas DataFrame
advanced mechanics
dummy
indexing single column
mechanics
indices
multiplication
random renaming
resetting index
slicing
storage size
NumPy array
Pattern-based arrays
Pearson’s Correlation Coefficient
Peek model
Performance distributions
Performance value
Physics dynamics
Piecewise function
Pivot operation
Pixel values
Pixel-wise reconstruction
Plain linear activation
Political sentiment
Pooling operation
AlexNet architecture
continual convolution stacking
global pooling
information compression factor
neural network
parameter count scales
parameter scaling
pooled matrix
scaling capability
self-extending
two-dimensional max
Predicted class
Prediction task
Preprocessing pipeline
See Data preparation/preprocessing
Pretraining
autoencoder logic
computer vision
datasets
feature extractor
fine-tuning
frozen encoder/feature extractor
large-scale image classification
latent space size
layer freezing
MNIST dataset
multistage
multitask
natural language processing
semi-supervised method
supervised model training
supervised target
supervised task
task model
validation and training loss curves
Principal Component Analysis (PCA)
advantages
approaches
components
diagonal axis
disadvantages
dummy dataset
engineering/extraction
features extraction
real-world representation
rotated data relative
scree plot
source code
visualization
Probabilistic surrogate functions
pyplot method
Python code
PyTorch
R
Random Forests
classifier
regressor
style training
Random Gaussian noise
Random noise
Random search
Raw vectorization
Realistic synthetic data
Reassignments
Receiver Operating Characteristic (ROC)
Reconstruction loss
Rectified Linear Unit (ReLU)
Recurrent layers
Recurrent layers tabular data
Recurrent models
customer satisfaction
dataset
natural language
probability prediction
vocabulary size
Recurrent models theory
bidirectionality
BPTT
exploding gradients
GRUs
LSTMs
memory cells
recurrent neurons
RNNs
See Recurrent neural networks (RNNs)
vanishing gradient
Recurrent neural networks (RNNs)
ANNs
and BPTT
GRUs
LSTMs
memory cells
tabular data
vanishing gradients
Recurrent neurons
Recursive Feature Elimination (RFE)
Red, green, and blue (RGB)
Regression
Regression problem
Regularization learning networks (RLN)
.rename() method
Reparative autoencoders
Replaced token detection (RTD)
Reshape functions
ResNet
architecture
branching operations
definition
DenseNet-style residual connections
hybrid
Keras
residual connection
style
vanishing gradient problem
Return sequences
Return state
Robust scaling
Root Mean Squared Error (RMSE)
S
Sampled latent space vector
Sample sequence
Scaled exponential linear unit (SELU)
Second element
Second-order optimization techniques
Self-Attention and Intersample Attention Transformer (SAINT)
attention maps
authors’ notation
cloning repository
definition
intersample attention density
mean AUROC
OpenML
pretraining mask
training pipeline
transformer block
Self-consciously
Semantic system
Sentiment extraction implementation
Sequence data
Sequence length
Sequence-to-sequence modeling
Sequential autoencoder
Sequential Model–Based Optimization (SMBO)
SHapley Additive exPlanations (SHAP)
gradient explainer
mean
standard deviation
subnetwork
Short-term memory
Shuffles
Sigmoid equation
Sigmoid function
SimpleRNN
Single Headed Attention RNN (SHA-RNN) architecture
Sinusoidal curves
Skip connections
Sklearn model object
Sneaking
Soft decision tree (SDT)
Soft Decision Tree Regressor for Tabular Data (SDTR)
binary tree
custom training function
DataLoader
leaf node
node probabilities
nodes/connections
PyTorch
training SDT
Softmax function
Softmax Gumbel function
Softmax layer
Softmax output
Soft ordering techniques
Sparse autoencoders
alpha value
interpretability
latent size
latent space
layer’s activity
MNIST dataset
parameters
quasi-active
quasi-continuous latent space
reconstruction
regularizations
sampled original inputs
size representation
Sparsity
Speech Accent Archive dataset
SP-LIME
Stack additional layers
Stack operation
Standard deviation
Standard mean squared error
Standard RNN neuron
Stock data
Stock forecasting
Stock prediction
“Stringent” architecture
Supervised learning
Surrogate function
Symmetric factorization module
T
TabNet architecture
attentive tabular learning
dataset pack
decision-making procedure
definition
feature transformer model
implementation
prior scale term
selection masks
TabNetClassifier model
tabnet library
TabNet model
TabTransformer model
architecture
categorical features
compartmentalized design
configuration constants
configuration parameters
definition
features
input heads
Keras model
keras.models.Model
MLP
replaced token detection
submodel
train-validation split
transformer block
t-SNE reduction
Tabular data
generation methods
models
Tabular neural networks
batch normalization
distribution calculation
element-wise linear transformation
ghost batch normalization
interaction modeling methods
leaky gates
modifications
synthetic task
wide/deep learning
Tanh activation function
Task model
taskOut
t-distributed Stochastic Neighbor Embedding (t-SNE)
TensorFlow
TensorFlow database
Keras library
neural networks
TensorFlow datasets
biomedical datasets
custom dataset
data formats
features
filler code
handling datasets
memory
NumPy memory maps
Pandas chunking
pickle files
Python library h5py
SciPy/TensorFlow sparse matrix
key components
multidimensional array
NumPy array
sequence dataset
TensorSliceDataset
tf.keras.utils.Sequence class
Term Frequency–Inverse Document Frequency (TF-IDF)
TextVectorization layer
Three-dimensional array
Time-independent target prediction
Time series
architecture
audio files
feature map representation
high-frequency and high-complexity
high-frequency data
next-timestep forecasting
one-dimensional convolutional layers
one mode
prediction
prices of stocks
recurrent layers
sequential data
sequentially ordered features
target prediction
time-dependent prediction
time-series problems
Time-series data
Time-series dataset
TimeseriesGenerator class
Timestamps
Timestep
Timestep-by-timestep
Training
Training curves
Training GAN
train_step method
Train-validation split
Transformer architecture
Transformer model
Transformer-style positional encoding
Transformer-style sinusoidal position encodings
Transporters
Tree-structured deep learning approaches
Tree-structured neural networks
branch node
configurations
decision trees
DJINN
DNDT
datasets
GPUs
leaf nodes
non-binary decision tree
parameters
PyTorch
single binning network
single-layer neural network
training
DNNF
GrowNet
NAS
net-DNF
literals
logical expressions
neural AND function
node
OR function
SDT
weight initialization
XGBoost
Tree-Structured Parzen Estimator
True Positive Rate (TPR)
t-SNE method
latent space sizes
nodes
Two-dimensional array
Two-dimensional convolutions, tabular data
contiguous semantics
DeepInsight
IGTD
soft ordering
U
Unfreezing
Universal Approximation Theorem
Unrolled RNN neuron
Unstacking operation
Unsupervised learning
Untampered clean images
Update gate
Upvotes
US Amazon Reviews dataset
V
Valence Aware Dictionary and sEntiment Reasoner (VADER)
Validation
Vanilla
adapting
autoencoder
compartmentalized model
encoder/decoder
hypotheses
images
image and text-based datasets
image processing library
information flow
larger-scale autoencoder experiments
latent space
lines and reconstructions
logarithmic expression
Mice Protein Expression dataset
MNIST dataset
outActivation parameter
overcomplete autoencoder
parameters
pixel values
reconstruction
regression loss
sample latent shape
samples
sequentially
sigmoid function
standard neural networks
synthetic toy line dataset
tabular autoencoder
training/validation
training/visualizing
training history
t-SNE method
Vanilla soft ordering
Vanishing gradient problem
Variance threshold
Variational Autoencoders (VAEs)
advantage
applications/demonstrations
bivariate relationships
decoded linear interpolations
formal optimization problem
Higgs Boson dataset
implementation
latent space distribution
latent space vector
L2-style regularization
Mice Protein Expression dataset
MNIST dataset
n-dimensional normal distribution
reparameterization trick
standard autoencoder
theory
Vectorization methods
Vectorizer
Vectorizing
Voice accent
W
Weighted average
Weighted averaging
Weight of evidence (WoE) technique
Weight sharing model
Wide model
Window size
Word2Vec representations
X, Y
XBNet
architecture
classifier
definition
instantiate model
pseudocode
training
training procedures
XGBoost
XGBoost
Z
Zero vector