Index
A, B
ALS method
Alternating least squares (ALS)
C
Clustering
agglomerative
approaches
centroids
correlation coefficient
Databricks notebooks
definition
elbow method
Euclidean method
hierarchical
intra-cluster distance
K-means
3D visualization
ClusteringEvaluator method
Code, LR
Dataframe
dataset
Jupyter notebook
output variable
PySpark
RMSE
statistical measures
VectorAssembler
Collaborative filtering-based RS
decisions
explicit feedback
implicit feedback
latent factor
missing values
nearest neighbors
user item matrix
“columns” method
Confusion matrix
accuracy
precision
recall
Content-based RS
cosine similarity
Euclidean distance
Movie attributes
user profile
Continuous bag of words (CBOW)
Corpus
corr function
CountVectorizer method
D
Databricks notebook
dropDuplicates function
E, F
Elbow method
Euclidean method
G
Graph computation
groupBy function
H
Hierarchical clustering
I, J
IndexToString function
Information gain (IG)
IoT devices
K
K-means algorithm
K-means clustering
L
Linear regression (LR)
code
Dataframe
dataset
Jupyter notebook
output variable
PySpark
RMSE
statistical measures
VectorAssembler
Logistic regression
confusion matrix
accuracy
precision
recall
data
definition
evaluation
dummy variables
interpretation
model evaluation
probability
probability
logit
output
predictions
users
variable types
sample data
M
Machine learning
AI applications
algorithms
applications
finance
healthcare
manufacturing/automobile
media/marketing
retail
social media
travel/hospitality
categories
computational efficiency
data scientists
definition
supervised
unsupervised
main() function
MLlib library
N, O
Natural Language Processing (NLP)
corpus
CountVectorizer
dataframe
definition
indicator_cumulative
sequence embeddings
stopword removal
StringType
tokenization process
VectorAssembler
P, Q
PySpark
data filtering
dropDuplicates
load/read data
Pandas
RFs
writing data
CSV
Koalas
Parquet
Python’s matplotlib library
R
Random forests (RFs)
advantages
bestModel
code
dataframe
decision tree
attributes
dataset
definition
entropy
IG
hyperparameter tuning
individual decision trees
Spark’s VectorAssembler
test sets
training
Recommender systems (RSs)
categories
content
data info
active user
dataframe
RMSE value
SparkSession object
StringIndexer
train model
definition
features
hybrid
popularity-based RS
retail setting
Rule-based systems
S
Silhouette coefficient method
Spark
APIs
architecture
core
Dataframe
data generation
definition
evolution
internet/social media era
machine data
setting up environment
cluster
Databricks
Docker
downloading
installing
notebooks
T
TensorFrame
U
Unsupervised learning
clusterings
reinforcement
semi-supervised
types
V, X, Y, Z
Vocabulary