Index
A
- A-Priori algorithm
- AdaBoost algorithm
- affinity propagation (AP) clustering
- agglomerative clustering
- algorithm, for association rule generation
- Amazon Web Services (AWS.tools) / Why R?
- Anscombe's quartet / Visualization in R
- application programming interface (API) / Obtaining Twitter data
- arguments
- assignment operator
- association rules
- associations
- associative classification
- attribute
- Auto Regressive Integrated Moving Average (ARIMA) algorithm
B
- bagging algorithm
- basket
- Bayes classification
- Bayesian hierarchical clustering algorithm
- BBN algorithm
- bccmpls / Preliminary analyses
- benefits, R / Why R?
- Big Data
- big data
- binning
- BIRCH algorithm
- Bonferroni's Principle
- Bonferroni correction
- boosting algorithm
- BP algorithm
- brute-force algorithm
C
- C4.5 algorithm
- carrot (>) operator / The basics – assignment and arithmetic
- CART algorithm
- case studies, social media mining
- categorical attributes
- CF-Tree
- chameleon algorithm
- Charm algorithm
- CLARA algorithm
- CLARANS algorithm
- classification
- classification, with frequent patterns
- classification-based methods
- Classification Based on Association (CBA)
- Classification Based on Multiple Association Rules (CMAR)
- CLIQUE algorithm
- closed frequent itemsets
- clustering-based methods
- collective outliers
- Comprehensive R Archive Network (CRAN)
- conditional anomaly detection (CAD) algorithm
- conditional probability tables (CPT)
- constraint-based frequent pattern mining
- contextual outliers
- continuous, numeric attributes
- corpus / Preliminary analyses
- correlation rules
- credit card fraud detection
- credit card transaction flow
- CRISP-DM
- CRM (Customer Relationship Management)
- Cubic Clustering Criterion
- CUR decomposition
- customer purchase data analysis
D
- DASL
- data attributes
- data attributes, views
- data classification
- data cleaning
- data description
- data dimension reduction
- data discretization
- data frames
- data integration
- data measuring
- data mining
- Data Quality (DQ)
- dataset
- data smoothing
- data source
- data transformation
- DBSCAN algorithm
- decision tree
- decision tree induction
- decision tree induction, attribute selection measures
- DENCLUE algorithm
- dendrogram / Preliminary analyses
- density-based cluster
- density-based methods
- directed graphs
- discrete, numeric attributes
- discriminative frequent pattern-based classification
- disjunctive normal form (DNF)
- distance-based outlier detection algorithm
- Distributed Storage and List (dsl) / Why R?
- dist variable / Visualization in R
- divisive clustering
- document-term matrix
- document retrieval
- document text
- Dolphin algorithm
- Dropbox / Preliminary analyses
E
- e-commerce
- Eclat algorithm
- eigenvalues
- eigenvectors
- EM methods
- description length (DL)
- Expectation Maximization (EM) algorithm
F
- Facebook
- FAQs, R / Why R?
- FCA-based association rule mining algorithm
- feature extraction, examples
- files
- FindAllOutsD algorithm
- FindAllOutsM algorithm
- FP-growth algorithm
- frequent itemset
- Frequent Itemset Mining Dataset Repository
- frequent patterns
- frequent subgraph patterns mining algorithm
- frequent subsequence
- frequent substructures
- function
- future prices
G
H
- Hadoop InteractiVE (hive) / Why R?
- Hadoop Streaming (HadoopStreaming) / Why R?
- help function
- hError algorithm
- hierarchical agglomerative clustering / Preliminary analyses
- hierarchical clustering
- hierarchical clustering algorithm
- high-dimensional data
- high-performance algorithms
- high-value credit card customers
- High Contrast Subspace (HiCS) algorithm
- HilOut algorithm
- honest signals
- horizontal format
- human sensors
- hybrid association rules mining
I
K
- k-itemset
- k-means algorithm
- k-medoids algorithm
- kNN algorithm
L
M
- machine learning
- machine learning (ML)
- machine learning (ML), classes
- MAFIA algorithm
- MapReduce
- market basket analysis
- market basket model
- maximal frequent itemset (MFI)
- Maximal Marginal Relevance (MMR) algorithm
- Maximum Likelihood Estimation (MLE)
- missing values
- mobile fraud detection
- modifiable areal unit problem (MAUP) / Measurement and inferential challenges
- multidocument summarization algorithm
- multilevel and multidimensional association rules mining
N
- 1NN classifier algorithm
- N-gram-based text-categorization algorithm
- naive Bayes classifier
- Naive Bayes classifier case study / Case study 2 – Naive Bayes classifier
- natural language processing (NLP)
- Naïve Bayes classification
- news categorization
- NL algorithm
- nominal attributes
- nontraditional social data
- normalization methods, data transformation
- numeric attributes
- numeric attributes, types
O
P
- PAM algorithm
- partition-based clustering
- patterns
- PCA
- plot() function / Visualization in R
- PrefixSpan algorithm
- preliminary analyses
- probabilistic hierarchical clustering algorithm
- process, data mining
- ProjectTemplate
- proximity-based methods
Q
- qualitative approaches
- queries
- question answering (QA) system
R
- R
- Random forests algorithm
- R code
- recommendation systems
- references / Further reading
- registerTwitterOAuth function / Obtaining Twitter data
- Relative Closeness (RC), chameleon algorithm
- Relative Interconnectivity (RI), chameleon algorithm
- RHadoop
- RIPPER algorithm
- route outlier detection (ROD) algorithm
- RStudio
- rule-based classification
- rules
S
- Scherer's typology of emotions
- search engine
- SEMMA
- sentential frequent itemsets
- sentiment
- sentiment analysis / An expanding field
- sentiment polarity
- seq() function
- sequence dataset
- sequence patterns
- sequences
- sequential covering algorithm
- sequential patterns
- shingling algorithm
- single-pass-any-time clustering algorithm
- social media
- social media data
- social media mining / Final thoughts
- social network
- social networking service (SNS)
- social network mining
- SPADE algorithm
- spam e-mail
- spectral clustering algorithm
- squared error-based clustering algorithm
- Stack Overflow / Why R?
- state of communication section / The state of communication
- statistical method
- statistics
- STING algorithm
- stock market data
- stop words / Preliminary analyses
- STREAM algorithm
- stream data
- Structural Clustering Algorithm for Network (SCAN) algorithm
- summarization
- SURFING algorithm
- SVD
- SVM algorithm
- symbolic sequences
T
- Term Frequency-Inverse Document Frequency (TF-IDF)
- text classification
- text mining
- Text Mining Distributed Corpus Plug-In (tm.plug.dc) / Why R?
- Text Retrieval Conference (TREC)
- text summarization
- time-series data
- Time To Live (TTL)
- topic detection
- topic representation
- topic signature
- Tracking Evolving Clusters in NOisy Streams (TECNO-STREAMS) algorithm
- traditional social data
- traditional social science data
- tree pruning
- Trojan horse
- Trojan traffic identification
- tweet / Scherer's typology of emotions
- tweets
- Twitter
- Twitter data / Scherer's typology of emotions
- twitteR package / Obtaining Twitter data
- Twittersphere / Scherer's typology of emotions
U
- UCI Machine Learning Repository
- undirected graphs
- unsupervised image categorization
- user search intent
V
- vector-space model
- vectors
- vertical format
- visitor analysis, in browser cache
- visualization
- visualization, features
W
- WAVE clustering algorithm
- weak ties
- web attack
- web click streams
- web data mining
- web data mining, tasks
- web key resource page judgment
- web logs
- web page clustering
- web pages
- web sentiment analysis
- web server
- web spam
- web usage mining
- WordCloud package / Preliminary analyses
- WordNet
- World Wide Web (WWW)
Y