Index
A
B
- backpropagation
- backtesting
- bag-of-words / Step 2 – exploring and preparing the data
- bagged trees technique / Random forests
- bagging / Random forests
- bagging algorithm
- bandwidth / Probability distributions
- bank loans example, with C5.0 decision trees
- barchart() command / How it works…
- bar charts
- creating / Creating bar charts, How to do it..., Creating bar charts, How it works…, There's more…, See also
- working / How it works..., There's more...
- creating, with multiple factor variable / Creating bar charts with more than one factor variable, How it works...
- creating, with horizontal bar orientation / Adjusting the orientation of bars – horizontal and vertical, How it works..., There's more...
- creating, with vertical bar orientation / Adjusting the orientation of bars – horizontal and vertical, How it works..., There's more...
- widths, adjusting / Adjusting bar widths, spacing, colors, and borders, Getting ready, How it works...
- spacing, adjusting / Adjusting bar widths, spacing, colors, and borders, Getting ready, How it works...
- colors, adjusting / Adjusting bar widths, spacing, colors, and borders, Getting ready, How it works...
- borders, adjusting / Adjusting bar widths, spacing, colors, and borders, Getting ready, How it works...
- values, displaying / Displaying values on top of or next to the bars, How to do it..., How it works..., See also
- labels, placing inside bars / Placing labels inside bars, How to do it..., How it works...
- creating, with vertical error bars / Creating bar charts with vertical error bars, How to do it..., How it works..., There's more...
- stacked bar charts, creating / Creating stacked bar charts, How it works…, There's more…
- creating, to visualize cross-tabulation / Creating bar charts to visualize cross-tabulation, How it works…, There's more…
- creating, ggplot2 library used / Creating bar charts, Getting ready, How it works…, See also
- creating, with error bars / Creating a bar chart with error bars, Getting ready, How it works…, There's more…
- bar colors, histograms
- barplot() function / How to do it..., How it works...
- barplot function / Base graphics using the default package
- bars
- base graphics
- Basel Accords
- Basel I / Basel I
- Basel II
- Basel III / Basel III
- Basel Regulatory Framework
- base R / Visualization methods
- Basic Indicator Approach (BIA) / Minimum capital requirements
- basket
- batch mode / Navigating the basics
- Bayes classification
- Bayes factors / The Bayesian independent samples t-test
- Bayesian hierarchical clustering algorithm
- Bayesian independent samples t-test
- Bayesian linear regression / Advanced topics
- Bayesian methods
- Bayesian methods, basic concepts
- BBN algorithm
- bell curve / Central tendency
- Beowulf cluster
- beta level (β level) / When things go wrong
- betweenness centrality
- bias / The case of linearly separable data
- bias-variance trade-off
- bias-variance tradeoff / Choosing an appropriate k
- big data
- big data analysis, in R
- big data linear regression analysis
- biglm package
- big matrices
- bigmemory package
- bigrf package
- bimodal / Measuring the central tendency – the mode
- Binary options
- binning
- binomial distribution / Null Hypothesis Significance Testing
- bins
- Bioconductor
- bioinformatics
- bioinformatics data
- BIRCH algorithm
- Bitcoin prices
- bivariate relationship (two variable) / Multivariate data
- bivariate relationships
- Black-Scholes surface
- Black model
- blind tasting experience example / The k-NN algorithm
- blowby / Simple linear regression
- body mass index (BMI) / Step 1 – collecting data
- Bonferroni's Principle
- Bonferroni correction / Testing more than two means
- boosting
- boosting algorithm
- bootstrap aggregating / Random forests
- bootstrap sampling / Bootstrap sampling
- borders, histograms
- box-and-whisker plot / Relationships between a categorical and a continuous variable
- box-and-whiskers plot / Visualizing numeric variables – boxplots
- box plot
- boxplot() command / How to do it...
- boxplot() function / There's more...
- boxplot() method
- boxplot function / Base graphics using the default package
- box plots
- creating / Creating box plots
- working / How it works..., There's more...
- creating, with narrow boxes / Creating box plots with narrow boxes for a small number of variables, How to do it..., How it works..., There's more
- grouping, over variable / Grouping over a variable, How it works..., There's more
- creating, with notches / Creating box plots with notches, How it works..., There's more
- styling / Changing the box styling, How it works..., There's more
- whiskers, adjusting / Adjusting the extent of plot whiskers outside the box, How to do it..., How it works...
- number of observations, displaying / Showing the number of observations, Getting ready, How it works..., There's more
- variable, splitting at arbitrary intervals / Splitting a variable at arbitrary values into subsets, Getting ready, How it works..., There's more
- creating, ggplot2 library used / Creating a box plot, How it works…
- box styles
- box widths
- BP algorithm
- branches
- breast cancer
- breast cancer example
- brute-force algorithm
C
- C4.5 algorithm
- C5.0 algorithm
- calendar.plot() function / There's more
- calendar heat maps
- call quanto
- candle patterns, key reversal
- cap
- carb / Introduction
- caret package
- CART algorithm
- cash-flow
- cash-flow generator functions / Cash-flow generator functions
- categorical / Types of input data
- categorical attributes
- categorical data
- categorical variable
- categorical variables
- cell body / From biological to artificial neurons
- central tendency
- centroid / Using distance to assign and update clusters
- CF-Tree
- chameleon algorithm
- character data type / Logicals and characters
- characteristics, neural networks
- Charm algorithm
- charts, bitcoin
- chi-square distribution / Testing independence of proportions
- chi-squared statistic / Testing independence of proportions
- circular decision boundary / The circular decision boundary
- CLARA algorithm
- CLARANS algorithm
- classification / Types of machine learning algorithms
- classification, with frequent patterns
- classification-based methods
- classification and regression training (caret package) / Beyond accuracy – other measures of performance
- Classification and Regression Tree (CART) algorithm / Understanding regression trees and model trees
- Classification Based on Association (CBA)
- Classification Based on Multiple Association Rules (CMAR)
- classification performance
- classification prediction data
- classification rules
- classifier
- class imbalance problem / Measuring performance for classification
- CLIQUE algorithm
- closed frequent itemsets
- closely packed data points
- clustering / Types of machine learning algorithms
- clustering, k-means clustering algorithm
- clustering-based methods
- CMYK (Cyan Magenta Yellow Key) color model / There's more
- Cohen’s d / Don't be fooled!
- cointegration / Cointegration
- col argument / How it works...
- collective outliers
- colorBrewer
- color combinations
- colors
- of points, setting / Setting colors of points, lines, and bars, How to do it..., How it works...
- of lines, setting / How to do it..., How it works...
- of bars, setting / How to do it..., How it works...
- of axis annotations, setting / How to do it..., There's more...
- of axis labels, setting / How it works..., There's more...
- of plot titles, setting / How it works..., There's more...
- of legends, setting / There's more...
- column-major order / Matrixes and arrays
- combination function / Understanding ensembles
- comments / Arithmetic and assignment
- Compute Unified Device Architecture (CUDA)
- Comprehensive R Archive Network (CRAN) / Working with packages
- concrete strength, modeling with ANNs
- conditional anomaly detection (CAD) algorithm
- conditional histogram
- conditional probability
- conditional probability tables (CPT)
- conditional scatter plot
- conditional value at risk (CVaR) / Monte-Carlo simulation
- confidence intervals
- confusion matrix / Confusion matrices
- connections
- constraint-based frequent pattern mining
- contextual outliers
- continuous, numeric attributes
- continuous variable
- continuous variables
- contour function / Base graphics using the default package
- contour plots
- controlled experiment / Testing two means
- control object / Customizing the tuning process
- convex hull / The case of linearly separable data
- core-periphery decomposition
- corpus / Data preparation – cleaning and standardizing text data
- correlation
- correlation coefficients / Correlation coefficients
- correlation heat maps
- correlation matrix
- correlation rules
- cost complexity pruning / Decision trees
- countries
- covariance / Covariance
- covariance matrix / Comparing multiple correlations
- Cox-Ingersoll-Ross model
- CRAN
- CRAN task view
- CRAN Web Technologies
- credit card fraud detection
- credit card transaction flow
- credit default swap (CDS) / Credit risk
- credit risk / Credit risk
- crescent decision boundary / The crescent decision boundary
- CRISP-DM
- CRM (Customer Relationship Management)
- cross-tabulation / Relationships between two categorical variables
- cross-validation / Cross-validation
- cross tab / Relationships between two categorical variables
- CSV (Comma-Separated Values) file
- CSV files
- Cubic Clustering Criterion
- CUR decomposition
- curl utility
- currency options
- customer purchase data analysis
- customized legends
- cut points
- cyl / Introduction
D
- 3D plot
- DASL
- data
- data.table package
- data attributes
- data attributes, views
- databaseBasketball
- Database Management Systems (DBMSs)
- databases
- data classification
- data cleaning
- data description
- data dictionary
- data dimension reduction
- data discretization
- data exploration
- data formats
- data frame
- data integration
- data measuring
- data mining
- data munging
- data points
- data preparation
- data preparation, breast cancer example
- Data Quality (DQ)
- data selection / Data selection
- dataset
- data smoothing
- data source
- Data Source Name (DSN)
- data storage / Data storage
- data structures, R
- data table
- data transformation
- data visualization
- data warehouse (DWH) / Data preparation
- data wrangling
- date variable
- day / Introduction
- DBSCAN algorithm
- ddply() function / Getting ready
- decision nodes
- decision tree
- decision tree forests
- decision tree induction
- decision tree induction, attribute selection measures
- decision trees
- deep learning
- Deep Neural Network (DNN)
- default package
- degrees of freedom / Populations, samples, and estimation
- delimiter
- delta hedge performance
- DENCLUE algorithm
- dendrites
- dendrograms
- density() function / There's more...
- density, data points
- density-based cluster
- density-based methods
- density line
- density plot
- density plots
- dependent events / Understanding joint probability
- dependent variable
- derivatives
- descriptive model / Types of machine learning algorithms
- dev.off() command / How it works...
- diagonal decision boundary / The diagonal decision boundary
- dimensions
- directed graphs
- directional hypothesis / One and two-tailed tests
- discrete, numeric attributes
- discrete numeric variable / Univariate data
- discriminative frequent pattern-based classification
- disjunctive normal form (DNF)
- disk-based data frames
- disp / Introduction
- distance-based outlier detection algorithm
- distributions
- divide and conquer
- divisive clustering
- document retrieval
- document text
- Dolphin algorithm
- domain-specific data
- doParallel package
- dotchart() function / How to do it..., How it works...
- dotchart function / Base graphics using the default package
- dot charts
- dots per inch (dpi) / There's more...
- double-knock-in (DKI)
- double-knock-out (DKO)
- Double-no-touch (DNT)
- Double-no-touch option
- Double-one-touch (DOT) / The life of a Double-no-touch option – a simulation
- Dow Jones Industrial Average index (DJIA) / Neural networks
- dplyr package
- drat / Introduction
- dummy coding / Preparing data for use with k-NN, Step 3 – training a model on the data
- dummy variable / Examining relationships – two-way cross-tabulations, Step 3 – training a model on the data
- dynamic delta hedge / Dynamic delta hedge
- dynamic hedging
E
F
- 10-fold cross-validation (10-fold CV) / Cross-validation
- F-measure / The F-measure
- F-score / The F-measure
- F1 score / The F-measure
- factor
- Fama-French model
- Fama-French three-factor model / Fama-French three-factor model
- FCA-based association rule mining algorithm
- feature extraction, examples
- Federal Reserve Economic Data (FRED) / Getting data from open sources
- feed-forward neural networks (FFNN) / Neural networks
- feedforward networks
- ffbase project
- ff package
- filled.contour() function / How it works...
- filled contour plots
- filtering
- FindAllOutsD algorithm
- FindAllOutsM algorithm
- findInterval() function / How it works...
- five-number summary / Measuring spread – quartiles and the five-number summary
- flow of control constructs / Flow of control
- font families
- selecting, under Windows / Choosing font families and styles under Windows, Mac OS X, and Linux, How it works..., There's more
- selecting, under Mac OS X / Choosing font families and styles under Windows, Mac OS X, and Linux, How it works..., There's more
- selecting, under Linux / Choosing font families and styles under Windows, Mac OS X, and Linux, How it works..., There's more
- fonts
- setting, for annotations / Setting fonts for annotations and titles, There's more..., See also
- setting, for titles / Setting fonts for annotations and titles, There's more..., See also
- selecting, for PostScripts / Choosing fonts for PostScripts and PDFs, How it works..., There's more
- selecting, for PDFs / Choosing fonts for PostScripts and PDFs, How it works..., There's more
- foreach package
- FP-growth algorithm
- FRED (Federal Reserve Economic Data)
- frequency distributions
- frequent itemset
- Frequent Itemset Mining Dataset Repository
- frequently purchased groceries
- frequent patterns
- frequent subgraph patterns mining algorithm
- frequent subsequence
- frequent substructures
- functions / Functions
- fundamental analysis
- fundamental equity strategy
- future performance
- future performance estimation
- future prices
- FX
- FX rates
G
- GADM
- GADM data
- GARCH model
- GARCH modeling, with rugarch package
- Gaussian distribution / Central tendency
- Gaussian RBF kernel / Using kernels for non-linear spaces
- gear / Introduction
- generalization / Generalization
- Generalized Linear Model (GLM) / Logistic regression
- Generalized Linear Models (GLM) / Understanding regression
- general pricing approach
- GenMax algorithm
- genre categorization
- Geographical Information Systems (GIS) data formats / Introduction
- geometric Brownian motion (GBM) / Dynamic delta hedge
- geom_bar() function / There's more…
- get.hist.quote() function
- ggplot2
- ggplot2 library
- about / Introduction
- used, for creating bar charts / Creating bar charts, Getting ready, How it works…, See also
- used, for creating density plot / Visualizing the density of a numeric variable, How it works…
- used, for creating box plots / Creating a box plot, How it works…
- line charts, creating with / Creating a line chart, How it works…, There's more...
- graph, annotating with / Graph annotation with ggplot, How to do it..., How it works...
- ggplot2 package
- GLM (general linear model) / Estimation of the Fama-French model
- global data
- glyph / Step 1 – collecting data
- Google maps
- GPU
- gradient descent / Training neural networks with backpropagation
- Grammar of Graphics
- graph
- Graph-Based Sub-topic Partition Algorithm (GSPSummary) algorithm
- graph and network data
- graph margins
- graph mining
- Graph Modeling Language (GML)
- graphs
- inspired, by Grammar of Graphics / Graphs inspired by Grammar of Graphics
- creating, with maps / Getting ready, How it works..., There's more...
- saving / Saving and exporting graphs, How it works..., See also
- exporting / How to do it..., How it works..., See also
- creating, with regional maps / Creating graphs with regional maps, How it works..., There's more
- exporting, in high-resolution image formats / Exporting graphs in high-resolution image formats – PNG, JPEG, BMP, and TIFF, How to do it..., How it works..., There's more
- exporting, in vector formats / Exporting graphs in vector formats – SVG, PDF, and PS, How it works..., There's more
- text descriptions, adding / Adding text descriptions to graphs, How to do it..., How it works..., There's more
- graph templates
- greedy learners / What makes trees and rules greedy?
- Greeks
- grid
- grid() function / How it works...
- gross incomes (GI) / Minimum capital requirements
- group argument / How it works…
- grouped data points
- GSP algorithm
H
- Hadoop
- harmonic mean / The F-measure
- hash-tag / Arithmetic and assignment
- header line
- heatmap() function / How to do it..., How it works...
- heatmap function / Base graphics using the default package
- heat maps
- hedge optimization / Optimization of the hedge
- help.start() function / Getting help in R
- hError algorithm
- hierarchical clustering
- hierarchical clustering algorithm
- high-dimensional data
- high-level data visualization
- high-performance algorithms
- high-resolution image formats
- high-value credit card customers
- High Contrast Subspace (HiCS) algorithm
- high frequency trading (HFT) / The TA toolkit
- HilOut algorithm
- hist() function / How it works...
- hist function / Base graphics using the default package
- histogram
- histograms
- creating / Creating histograms and density plots, How to do it...
- working / How it works..., There's more..., See also
- distributions, visualizing as count frequencies / Visualizing distributions as count frequencies or probability densities, How to do it..., How it works..., There's more
- distributions, visualizing as probability densities / Visualizing distributions as count frequencies or probability densities, How to do it..., How it works..., There's more
- bin size, setting / Setting the bin size and the number of breaks, How it works..., There's more
- breaks, setting / Setting the bin size and the number of breaks, How it works..., There's more
- bar colors, adjusting / Adjusting histogram styles – bar colors, borders, and axes, How it works..., There's more
- borders, adjusting / Adjusting histogram styles – bar colors, borders, and axes, How it works..., There's more
- axes, adjusting / Adjusting histogram styles – bar colors, borders, and axes, How it works..., There's more
- density line, overlaying / Overlaying a density line over a histogram, How it works...
- drawing, in margins of line plots / Histograms in the margins of line and scatter plots, How it works...
- drawing, in margins of scatter plots / Histograms in the margins of line and scatter plots, How it works...
/ Visualizing numeric variables – histograms
- historical VaR / Historical VaR
- Hmisc package
- holdout method / The holdout method, Cross-validation
- Holm-Bonferroni correction / Testing more than two means
- horizontal box plots
- horizontal format
- horizontal grid lines
- hp / Introduction
- httr package
- hybrid association rules mining
- hyperplane / Understanding Support Vector Machines
- Hypertext Markup Language (HTML)
I
J
K
L
M
- MACD / MACD
- machine / Using a bigger and faster machine
- machine learning
- machine learning (ML)
- machine learning (ML), classes
- machine learning, in practice
- machine learning, process
- machine learning algorithms
- Mac OS X
- MAFIA algorithm
- magrittr package
- Mann-Whitney U test / What if my assumptions are unfounded?
- MapReduce
- maps
- marginal likelihood
- margin labels
- Margin of static replication
- Margrabe formula / The Margrabe formula
- marker lines
- market basket analysis
- market basket analysis example
- data, collecting / Step 1 – collecting data
- data, preparing / Step 2 – exploring and preparing the data
- data, exploring / Step 2 – exploring and preparing the data
- sparse matrix, creating for transaction data / Data preparation – creating a sparse matrix for transaction data
- item support, visualizing / Visualizing item support – item frequency plots
- transaction data, visualizing / Visualizing the transaction data – plotting the sparse matrix
- model, training on data / Step 3 – training a model on the data
- model performance, evaluating / Step 4 – evaluating model performance
- model performance, improving / Step 5 – improving model performance
- set of association rules, sorting / Sorting the set of association rules
- subsets of association rules, taking / Taking subsets of association rules
- association rules, saving to file / Saving association rules to a file or data frame
- association rules, saving to data frame / Saving association rules to a file or data frame
- market basket model
- market efficiency / Market efficiency
- market risk / Market risk
- market risk, of derivatives
- mathematical notations
- matrix / Matrixes and arrays
- matrix notation / Multiple linear regression
- maturity (M) / Minimum capital requirements
- maximal frequent itemset (MFI)
- Maximal Marginal Relevance (MMR) algorithm
- Maximum Likelihood Estimation (MLE) / Logistic regression
- maximum margin hyperplane (MMH) / Classification with hyperplanes
- mean / Measuring the central tendency – mean and median
- mean absolute error (MAE) / Measuring performance with the mean absolute error
- mean height
- Mean Squared Error (MSE) / Simple linear regression
- measures of spread
- medical expenses, predicting with linear regression
- about / Example – predicting medical expenses using linear regression
- data, collecting / Step 1 – collecting data
- data, preparing / Step 2 – exploring and preparing the data
- data, exploring / Step 2 – exploring and preparing the data
- correlation matrix / Exploring relationships among features – the correlation matrix
- relationships, visualizing among features / Visualizing relationships among features – the scatterplot matrix
- scatterplot matrix / Visualizing relationships among features – the scatterplot matrix
- model, training on data / Step 3 – training a model on the data
- model performance, evaluating / Step 4 – evaluating model performance
- model performance, improving / Step 5 – improving model performance, Model specification – adding non-linear relationships, Transformation – converting a numeric variable to a binary indicator, Model specification – adding interaction effects, Putting it all together – an improved regression model
- message-passing interface (MPI)
- meta-learners / Types of machine learning algorithms
- meta-learning methods
- mfcol argument / How it works...
- mfrow argument / How it works...
- min-max normalization / Preparing data for use with k-NN
- minimum capital requirements / Minimum capital requirements
- missing data
- analysis / Analysis with missing data
- visualizing / Visualizing missing data
- types / Types of missing data, So which one is it?
- methods, for dealing / Unsophisticated methods for dealing with missing data
- complete case analysis / Complete case analysis
- pairwise deletion / Pairwise deletion
- mean substitution / Mean substitution
- hot deck imputation / Hot deck imputation
- regression imputation / Regression imputation
- stochastic regression imputation / Stochastic regression imputation
- multiple imputation / Multiple imputation, So how does mice come up with the imputed values?
- out-of-bounds data, checking for / Checking for out-of-bounds data
- column data type, checking / Checking the data type of a column
- unexpected categories, checking / Checking for unexpected categories
- outliers, checking for / Checking for outliers, entry errors, or unlikely data points
- entry errors, checking / Checking for outliers, entry errors, or unlikely data points
- unlikely data points, checking / Checking for outliers, entry errors, or unlikely data points
- assertions, chaining / Chaining assertions
- missing values
- mixed data
- mobile fraud detection
- mobile phone spam
- mobile phone spam example
- data, collecting / Step 1 – collecting data
- data, collecting, URL / Step 1 – collecting data
- data, preparing / Step 2 – exploring and preparing the data
- data, exploring / Step 2 – exploring and preparing the data
- text data, cleaning / Data preparation – cleaning and standardizing text data
- text data, standardizing / Data preparation – cleaning and standardizing text data
- text documents, splitting into words / Data preparation – splitting text documents into words
- training datasets, creating / Data preparation – creating training and test datasets
- test datasets, creating / Data preparation – creating training and test datasets
- text data, visualizing / Visualizing text data – word clouds
- indicator features, creating for frequent words / Data preparation – creating indicator features for frequent words
- model, training on data / Step 3 – training a model on the data
- model performance, evaluating / Step 4 – evaluating model performance
- model performance, improving / Step 5 – improving model performance
- model, of deposit interest rate development / A Model of deposit interest rate development
- modeling, in R
- model performance
- model performance, breast cancer example
- model trees / Understanding regression trees and model trees
- money management
- Monte-Carlo simulation / Monte-Carlo simulation
- month / Introduction
- mpg / Introduction
- multi-layer perceptron (MLP) / Neural networks
- multicore package
- multidocument summarization algorithm
- multilayer network
- Multilayer Perceptron (MLP)
- multilevel and multidimensional association rules mining
- multimodal / Measuring the central tendency – the mode
- multinomial logistic regression / Understanding regression
- multiple-line graphs
- multiple bar charts
- multiple correlations
- multiple histograms
- multiple linear regression / Understanding regression
- multiple means
- multiple plot matrix layouts
- multiple R-squared value (coefficient of determination) / Step 4 – evaluating model performance
- multiple regression
- multiple variables
- multivariate continuous data visualization
- multivariate data
- multivariate relationships
- multivariate time series analysis
- multivariate visualization
- MusicBrainz
N
O
- OCR, performing with SVMs
- OCSVM (One Class SVM) algorithm
- ODIN algorithm
- one-class nearest neighbor algorithm
- one-tailed hypothesis test
- one-tailed test / One and two-tailed tests
- one-way table / Exploring categorical variables
- one sample t-test
- online data
- online repositories
- online services
- Open Database Connectivity (ODBC)
- OpenRefine / OpenRefine
- open sources
- operational risk
- opinion-orientation algorithm
- opinion mining
- OPTICS-OF algorithm
- OPTICS algorithm
- optimized learning algorithms
- optimized packages
- optimizing
- order() function / How it works...
- ordinal / Types of input data
- ordinal attributes
- ordinary least squares estimation
- Out-Of-Bag (OOB) / Random forests
- out-of-bag error rate / Training random forests
- outlier detection
- outliers
- overfitting / Evaluation
- ozone / Introduction
P
- p-value
- pair plots
- pairs() command / How it works...
- pairs() function / How it works...
- pairs plot
- pair trading
- pairwise t-tests / Testing more than two means
- palettes
- PAM algorithm
- par() command / How to do it..., How it works...
- parallel cloud computing
- parallel computing
- parallelization
- parameter
- parameter tuning
- parametric statistical tests / What if my assumptions are unfounded?
- partition-based clustering
- pattern discovery / Types of machine learning algorithms
- patterns
- PCA
- PDFs
- Pearson's correlation coefficient / Correlations
- Pearson’s correlation / Correlation coefficients
- performance
- performance measures
- performance tradeoffs
- persp function / Base graphics using the default package
- pie() function / How it works...
- pie charts
- pie function / Base graphics using the default package
- plot
- plot() command / How it works..., How it works...
- plot() function / There's more...
- plot background colors
- plotCI function
- plot function / Base graphics using the default package
- plotrix package
- plotting point symbol
- plot titles
- png() command / How to do it...
- points
- points() function / There's more...
- poisonous mushrooms
- poisonous mushrooms example, with rule learners
- Poisson regression
- polynomial kernel / Using kernels for non-linear spaces
- polynomial regression / The circular decision boundary
- population / Populations, samples, and estimation
- position
- positively skewed / Central tendency
- positive predictive value / Precision and recall
- POSIXlt class / How it works...
- posterior probability
- postpruning
- PostScripts
- power / When things go wrong
- pre-pruning
- precision / Precision and recall
- predict function / Random forests
- predictive model / Types of machine learning algorithms
- PrefixSpan algorithm
- Price/Cash flow (P/CF) / Separating investment targets
- pricing formula
- principal component analysis
- prior probability
- probabilistic hierarchical clustering algorithm
- probability
- probability density function (PDF) / Probability distributions
- probability distributions
- probability mass function (PMF) / Probability distributions
- probability of default (PD) / Minimum capital requirements
- process, data mining
- proprietary files
- about / Working with proprietary files and databases
- Microsoft Excel files, reading / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- Microsoft Excel files, writing / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- SAS files, writing / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- SAS files, reading / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- SPSS files, reading / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- SPSS files, writing / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- Stata files, writing / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- Stata files, reading / Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
- proprietary microarray
- proximity-based methods
- pruning / Decision trees
- pure / Choosing the best split
- purity / Choosing the best split
Q
R
- 1R algorithm / The 1R algorithm
- R
- about / Navigating the basics, Why R?, Machine learning with R
- arithmetic operators / Arithmetic and assignment
- assignment operators / Arithmetic and assignment
- logical data type / Logicals and characters
- character data type / Logicals and characters
- flow of control constructs / Flow of control
- help, obtaining / Getting help in R
- data, loading / Loading data into R, Working with packages
- k-NN, using / Using k-NN in R
- logistic regression / Using logistic regression in R
- overview / Introduction
- on Linux, URL / Introduction
- on Mac OS X, URL / Introduction
- on Windows, URL / Introduction
- inbuilt datasets / A note on R's built-in datasets
- recycling in / There's more...
- advantage / Why R?
- disadvantage / What are the disadvantages of R?
- statistics / Statistics and R
- visualization / Visualization with R
- packages, installing / Installing R packages
- packages, loading / Loading and unloading R packages
- packages, unloading / Loading and unloading R packages
- data structures / R data structures
- used, for managing data / Managing data with R
- working with classification prediction data / Working with classification prediction data in R
- R, performance improvement
- R-squared value / Step 4 – evaluating model performance
- Radial Basis Function (RBF) network
- random forests
- Random forests algorithm
- rank
- rbinom() / Introduction
- rbinom() function / How it works...
- R code
- RColorBrewer
- Rcpp
- RCurl
- Read-Evaluate-Print-Loop (REPL) / Navigating the basics
- area under the ROC curve (AUC) / ROC curves
- Receiver Operating Characteristic (ROC) curve
- recommendation systems
- recovery rate (RR) / Credit risk
- recurrent network
- recursive partitioning
- recursive splitting / Decision trees
- recycling
- regional maps
- regression / Correlation coefficients
- regression analysis
- regression equations
- regression models
- regression plane
- regression trees
- regular expressions / Regular expressions
- regularization / Advanced topics
- relational database
- relationships
- Relative Closeness (RC), chameleon algorithm
- Relative Interconnectivity (RI), chameleon algorithm
- relative strength indicator (RSI) / Built-in indicators
- relative transaction costs
- Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithm / The RIPPER algorithm
- residuals / Ordinary least squares estimation
- Residual Sum of Squares (RSS) / Simple linear regression
- resubstitution error / Estimating future performance
- results
- Revolution Analytics
- RGB (Red Green Blue) model / There's more
- rggobi package
- RGL
- rgl.surface() function
- RHadoop
- RHIPE package
- right-tailed / Central tendency
- R implementation
- rio package
- RIPPER algorithm
- risk-weighted assets (RWA) / Basel I
- risk categories
- risk measures
- risky bank loans
- rnorm() / Introduction
- rnorm(1000) function / How it works...
- rnorm function / Estimating means
- Root Mean Squared Error (RMSE) / Simple linear regression
- rote learning
- route outlier detection (ROD) algorithm
- rpart.plot
- R projects
- R Scripting
- R scripts
- RSI / RSI
- RStudio
- rudimentary ANNs / Understanding neural networks
- rug() function / How to do it...
- rule-based classification
- rules
- runif() / Introduction
- rvest package
S
- samples / Populations, samples, and estimation
- sampling distribution / The sampling distribution
- scale() function / How it works...
- scale argument / There's more
- scatter plot / Creating a conditional scatter plot
- scatterplot / The relationship between two continuous variables
- scatterplot3d() function / How it works..., There's more...
- scatterplot3d function / How it works…, How it works…
- scatterplot matrix (SPLOM) / Visualizing relationships among features – the scatterplot matrix
- scatter plots
- scientific notations
- Scoville scale / Preparing data for use with k-NN
- scripting
- search engine
- seasonal component / The seasonal component
- segmentation analysis / Types of machine learning algorithms
- select argument / There's more…
- semi-supervised learning / Clustering as a machine learning task
- SEMMA
- sentential frequent itemsets
- separate and conquer
- sequence dataset
- sequence patterns
- sequential covering algorithm
- sequential patterns
- SETAR model
- shapefiles / There's more
- Shapiro-Wilk test / What if my assumptions are unfounded?
- shingling algorithm
- sigmoid kernel / Using kernels for non-linear spaces
- signals
- simple linear regression / Understanding regression
- simple moving average (SMA) / Built-in indicators
- simple tuned model
- Simpson’s Paradox / Relationships between two categorical variables
- simulation method
- simulations, in R
- single-pass-any-time clustering algorithm
- Site4 / How it works...
- skewness degree / Central tendency
- slack variable / The case of nonlinearly separable data
- slope
- slope-intercept form
- SMA / SMA and EMA
- smaller samples / Smaller samples
- SMFI5 package
- smoothed density scatter plots
- smoothScatter() function / Getting ready, How it works..., There's more...
- SMS Spam Collection
- snowball
- snow package
- social network
- social networking service (SNS) / Example – finding teen market segments using k-means clustering
- social network mining
- Solar.R / Introduction
- sortCol argument / There's more…
- SPADE algorithm
- spam e-mail
- sparklines
- sparse matrix / Data preparation – splitting text documents into words, Data preparation – creating a sparse matrix for transaction data
- Spearman’s rank coefficient (rho) / Correlation coefficients
- spectral clustering algorithm
- split() function / How it works...
- split point / Decision trees
- spread
- sprintf() function / How it works...
- SQL databases
- SQL query / Why didn't we just do that in SQL?
- squared error-based clustering algorithm
- squashing functions / Activation functions
- stacked bar charts
- stacking
- standard deviation / Spread
- standard deviation reduction (SDR) / Adding regression to trees
- standard error / The sampling distribution
- Standard GARCH model / The standard GARCH model
- Standardized Approach (STA) / Minimum capital requirements
- static delta hedge / Static delta hedge
- static replication, of non-maturity deposits / Static replication of non-maturity deposits
- statistical arbitrage
- statistical hypothesis testing / Understanding regression
- statistical method
- statistics
- STING algorithm
- stochastic volatility (SV) models / Volatility modeling
- stock charts
- stock market data
- stock models
- stocks
- STREAM algorithm
- stream data
- strptime() function / How it works...
- Structural Clustering Algorithm for Network (SCAN) algorithm
- structured products
- Structured Query Language (SQL)
- subsetting / Subsetting
- subtree raising / Pruning the decision tree
- subtree replacement / Pruning the decision tree
- summarization
- summary statistics / Exploring numeric variables
- supervised learning / Types of machine learning algorithms
- supervisory review / Supervisory review
- Supervisory Review Evaluation Process (SREP) / Supervisory review
- Support Vector Machine (SVM) / Bagging
- support vectors / Classification with hyperplanes
- SURFING algorithm
- SVD
- SVM algorithm
- SVMlight
- symbolic sequences
- synapse
- systemic risk, in nutshell
T
U
V
W
- WAVE clustering algorithm
- web attack
- web click streams
- web data mining
- web data mining, tasks
- web key resource page judgment
- web logs
- web page clustering
- web pages
- web scraping
- web sentiment analysis
- web server
- web spam
- web usage mining
- whiskers, box plots
- width
- Wilcoxon rank-sum test / What if my assumptions are unfounded?
- wind / Introduction
- Windows
- wine quality estimation, with regression trees
- within cluster sum of squares (WCSS) / Big data K-means clustering analysis
- word cloud
- wordcloud package
- WordNet
- world map
- wt / Introduction
X
Y
Z