Index


a

  • acast 81
  • ACC (accuracy) 170, 172–173, 179
  • ACF (autocorrelation function) 178
  • ACID (Atomicity, Consistency, Isolation, Durability) 86
  • active state 7
  • ADD 99–101, 107, 112–113, 117
  • additive model 178
  • advanced analytics 1
  • Agglomerative Clustering 175
  • aggregate functions 97, 104–105
  • airquality dataset 21–22, 27–29, 102, 153
  • algebra 4
  • algorithms 22, 166, 168, 169, 174–176
  • aligned 9
  • Allaire, J. J. 183
  • allocation 181
  • alpha value 163
  • alphabetical order 89
  • ALTER TABLE 94, 99–101, 107, 112–113
  • alternative hypothesis 163
  • analysis 1–5, 7–8, 14, 19–20, 22, 29–30, 51–58, 60, 75, 151, 160, 167, 177, 179–182
  • AND 90–93
  • angle 168, 181
  • annual 2
  • anscombe 129, 130
  • Apache Hive 85
  • application 1, 7, 179
  • AR (autoregressive model) 179
  • area under the curve (AUC) 165, 174
  • ARIMA 178–179
  • ARIMAX 178
  • arithmetic 36, 159, 179
  • arrays 3, 23, 119, 121–123, 125
  • AS 98–99
  • ascending order 89, 105
  • as.Date function 39
  • assigning values 12, 33, 42, 46
  • assignment operators 47
  • association 180
  • asymmetry 162
  • Atkins, A. 183
  • Atomicity, Consistency, Isolation, Durability (ACID) 86
  • AUC (area under the curve) 165, 174
  • autocorrelation function (ACF) 178
  • autoregressive model 178–179
  • availability 85, 86
  • avg() 97
  • axes 130

b

  • Balanced Iterative Reducing and Clustering using Hierarchies (Birch) 176
  • bar chart 130
  • Bar‐Line Plot 130, 133, 134, 143
  • Bar Plot 130–132, 143
  • Barr, Anthony James 1
  • BASE (Basically Available, Soft state, Eventual consistency) 86
  • Base SAS 2, 3, 86
  • Basically Available, Soft state, Eventual consistency (BASE) 86
  • Bayes theorem 166, 169
  • Bayesian network 179
  • Bell Laboratories 2
  • bell curve 161
  • Bernoulli 161
  • BETWEEN 103–104
  • Big Data xiii
  • binary classifier system 169, 171–173
  • BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) 176
  • Book table 110
  • Box Plot 130, 135, 144
  • brackets 22
  • breadth 19
  • Brow, Dan 113, 116
  • Brown, Dan 107, 112
  • Brownlee, J. 183
  • browser 2, 154, 156
  • Bubble Plot 130, 136, 144
  • bundle 3
  • business analytics xiii, 1

c

  • Calculator 35
  • CALL 119
  • Call Symput 125–126
  • CAP theorem 85
  • carburettors 79
  • cards 9, 15, 45
  • categorical variables 33, 48, 57–58, 75, 159, 160
  • CDF (cumulative distribution function) 165
  • central limit theorem 162
  • central tendency 51, 160, 161
  • centroid 175, 176
  • CF Nodes (Characteristic Feature nodes) 176
  • CFT (Characteristic Feature Tree) 176
  • Chambers, John 2
  • Chang, W. 183
  • character variables 9, 15, 17, 22–23, 25, 33, 46
  • Characteristic Feature nodes (CF Nodes) 176
  • Characteristic Feature Tree (CFT) 176
  • Cheng, J. 183
  • Chi Square 161, 163
  • Cij 171
  • circle chart 130
  • CLASS 51, 53, 104
  • click 1–2
  • clustering 174–176, 180
  • Collins 107, 112
  • comma elimination 34
  • Comma Separated Value (CSV) Files 10, 11
  • completeness 174
  • complex matrix 180
  • Composite Method 179
  • comprehensive 3, 54
  • Comprehensive R Archive Network (CRAN) 3, 174
  • compress 45
  • concatenation 44
  • confusion matrix 171–173
  • consistency 85–86
  • constant 178
  • continuous variable 159–160
  • core 176
  • Corpus 180
  • Cosine similarity 181
  • count() 97
  • CRAN (Comprehensive R Archive Network) 3, 174
  • create table 98
  • cross tabulations 75–82
  • cross validation 167
  • CrossTable 80
  • Croston 179
  • CSV (Comma Separated Value) Files 10, 11
  • Cth column 22
  • cumulatives 75–76, 165
  • cumulative distribution function (CDF) 165
  • curse 180
  • curse of dimensionality 180
  • cyclicality 177
  • cylinders 79, 140

d

e

  • Econometrics and Time Series Analysis (ETS) 2–3
  • Eigendecomposition 180
  • Elastic Net regression 168
  • elbow 177
  • emails 42
  • encoded 22
  • end xiii, 20, 23, 119–120, 122–123, 133, 141
  • error 22, 29, 31, 54, 85–86, 122, 160, 163–164, 168, 171–172, 178–179
  • ETS (Econometrics and Time Series Analysis) 2, 4, 178–179
  • Euclidean distances 170, 180
  • Excel 10–12, 126, 151
  • exogenous variable 179
  • exploratory data analysis 19, 52, 160
  • Exponential Smoothing 178, 179
  • extraction 47
  • Extreme Gradient Boosting (XGBoost) 171

f

g

  • Garbage In Garbage Out (GIGO) 7
  • Gaussian mixture models 176
  • Gentleman, R. 2, 184
  • getnames 10, 22, 87, 95, 97
  • GIGO (Garbage In Garbage Out) 7
  • glm 169
  • GNU project 2
  • Goodnight, James 1
  • Gramfort, A. 183
  • GRAPH (Graphics and presentation) 2
  • group_by() 70
  • group by analysis 57–60, 104–106
  • gtables package 80
  • guarantee 85–86

h

  • handling dates 37–42
  • handling numeric data 33–36
  • handling strings data 42–48
  • HAVING 105–106
  • Hclust (hierarchical clustering) 175
  • HeatMap 130, 137, 145
  • Hidden Markov models (HMM) 179
  • hierarchical clustering (Hclust) 175
  • histogram 130, 138, 146
  • HMisc package 60
  • HMM (Hidden Markov models) 179
  • Hornik, K. 184
  • homogeneity 174
  • html 2–3, 126–127, 151–152, 154–157
  • Hypothesis Testing 160, 163–164

i

  • IDE RStudio 3
  • ‘if’ statement 23, 28–30
  • Ignore 22
  • Ihaka, Ross 2
  • IML (interactive matrix language) 2, 4
  • importing data 77–1
  • independence 165, 169
  • inferential statistics 160
  • INFORMAT 34, 37
  • inner join 110–112, 116, 181
  • INPUT statement 8, 14, 37, 43–44
  • input data 77–1
  • INR 122
  • INSERT INTO statement 94–96, 106–108, 112–113
  • install.packages 3
  • intck option 38–39
  • integers 33, 40, 106–107, 112–113
  • integral types 33
  • Internet 42, 181
  • INTO 94, 108
  • IS NULL or IS NOT NULL 102

j

  • Jack 106–107, 112–116
  • JOIN 110–112
  • json files 10
  • jsonlite package 10

k

  • keys 108
  • K‐Fold Cross Fitting 167
  • KMeans 174, 175
  • knit document 154–156
  • kNN (K Nearest Neighbors) 170
  • Kolmogorov‐Smirnov non‐parametric 164
  • Kullback–Leibler divergence 180
  • kurtosis 51, 54, 162

l

m

n

o

p

q

  • Quantile regression 168
  • quantitative variables 159
  • Quartiles 161
  • quicklt 62
  • quotation marks 42
  • quotes 12–13, 25, 46

r

s

t

  • table package 4–5, 10–11, 28, 51, 56, 60, 62, 75–83, 85, 94, 98–101, 104, 106–108, 110, 112–113, 130, 140, 147, 153–154, 156
  • t‐Distributed Stochastic Neighbor Embedding (t‐SNE) 180
  • TDM (term document matrix) 180
  • TF‐IDF 180
  • theorem 85, 162, 166, 169
  • theory 162, 165
  • Tikhonov regularization 168
  • time series analysis 177–179
  • TNR (true negative rate) 172
  • topic modeling 181
  • tp 172
  • TPR (true positive rate) 172–173
  • translate function 45
  • transmute() 69
  • Trimn function 43
  • trimws 47
  • triplet 130
  • t‐SNE (t‐Distributed Stochastic Neighbor Embedding) 180
  • tz option 40–41

u

v

w

x

y

  • ymd 41

z
