References

  1. BAESENS, BART. (2014). Analytics in a Big Data World, The Essential Guide to Data Science and Its Applications. Wiley India Pvt. Ltd.

  2. MAYER-SCHONBERGER, VIKTOR & CUKIER, KENNETH. (2013). Big Data, A Revolution That Will Transform How We Live, Work and Think. John Murray (Publishers), Great Britain.

  3. LINDSTROM, MARTIN. (2016). Small Data – The Tiny Clues That Uncover Huge Trends. Hodder & Stoughton, Great Britain.

  4. FREEDMAN, DAVID; PISANI, ROBERT & PURVES, ROGER. (2013). Statistics. Viva Books Private Limited, New Delhi.

  5. LEVINE, DAVID.M. (2011). Statistics for SIX SIGMA Green Belts. Dorling Kindersley (India) Pvt. Ltd., Noida, India.

  6. DONNELLY, JR. ROBERT.A. (2007). The Complete Idiot’s Guide to Statistics, 2/e. Penguin Group (USA) Inc., New York 10014, USA.

  7. TEETOR, PAUL. (2014). R Cookbook. Shroff Publishers and Distributors Pvt. Ltd., Navi Mumbai.

  8. WITTEN, IAN.H.; FRANK, EIBE & HALL, MARK.A. (2014). Data Mining, 3/e – Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers, Burlington, MA 01803, USA.

  9. HARRINGTON, PETER. (2015). Machine Learning in Action. Dreamtech Press, New Delhi.

  10. ZUMEL, NINA & MOUNT, JOHN. (2014). Practical Data Science with R. Dreamtech Press, New Delhi.

  11. KABACOFF, ROBERT.I. (2015). R In Action – Data analysis and graphics with R. Dreamtech Press, New Delhi.

  12. [Online] www.quora.com .

  13. [Online] www.r-bloggers.com .

  14. [Online] www.stackexchange.com .

  15. [Online] https://cran.r-project.org/ .

  16. [Online] www.r-project.org/ .

  17. COMPUTERWORLD FROM IDG. (2016). 8 big trends in big data analysis. [Online] Available from: http://www.computerworld.com/article/2690856/big-data/8-big-trends-in-big-data-analytics.html

  18. WELLESLEY INFORMATION SERVICES, MA 02026, USA. (2016). Big Data Analytics Predictions for 2016. Available from: http://data-informed.com/big-data-analytics-predictions-2016/

  19. COMPUTERWORLD FROM IDG. (2016). 11 Market Trends in Advanced Analytics. [Online] Available from: http://www.computerworld.com/article/2489750/it-management/11-market-trends-in-advanced-analytics.html#tk.drr_mlt

  20. WELLESLEY INFORMATION SERVICES, MA 02026, USA. (2016). 5 Big Trends to Watch in 2016. [Online] Available from: http://data-informed.com/5-big-data-trends-watch-2016/ .

  21. ZHANG, NANCY.R. Ridge Regression, LARS, Logistic Regression. [Online] Available from: http://statweb.stanford.edu/~nzhang/203_web/lecture12_2010.pdf

  22. QIAN, JUNYANG & HASTIE, TREVOR. (2014). Glmnet Vignette. [Online] Available from: http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html

  23. USUELLI, MICHELE. (2014). R Machine Learning Essentials. Packt Publishing.

  24. BALI, RAGHAV & SARKAR, DIPANJAN. (2016). R Machine Learning By Example. Packt Publishing.

  25. CHIU, YU-WEI (DAVID). (2015). Machine Learning with R Cookbook. Packt Publishing.

  26. LANTZ, BRETT. (2015). Machine Learning with R, 2/e. Packt Publishing.

  27. Data Mining - Concepts and Techniques By Jiawei Han, Micheline Kamber and Jian Pei, 3e, Morgan Kaufmann

  28. S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. VLDB’96

  29. D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data warehouses. SIGMOD’97

  30. R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. ICDE’97

  31. S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26:65-74, 1997

  32. E. F. Codd, S. B. Codd, and C. T. Salley. Beyond decision support. Computer World, 27, July 1993.

  33. J. Gray, et al. Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge Discovery, 1:29-54, 1997.

  34. Swift, Ronald S. (2001) Accelerating Customer Relationships Using CRM and Relationship Technologies, Prentice Hall

  35. Berry, M. J. A., Linoff, G. S. (2004) Data Mining Techniques. Wiley Publishing.

  36. Ertek, G. Visual Data Mining with Pareto Squares for Customer Relationship Management (CRM) (working paper, Sabancı University, Istanbul, Turkey)

  37. Ertek, G., Demiriz, A. A framework for visualizing association mining results (accepted for LNCS)

  38. Hughes, A. M. Quick profits with RFM analysis. http://www.dbmarketing.com/articles/Art149.htm

  39. Kumar, V., Reinartz, W. J. (2006) Customer Relationship Management, A Databased Approach. John Wiley & Sons Inc.

  40. Spence, R. (2001) Information Visualization. ACM Press.

  41. Dyche, Jill, The CRM Guide to Customer Relationship Management, Addison-Wesley, Boston, 2002.

  42. Gordon, Ian. “Best Practices: Customer Relationship Management” Ivey Business Journal Online, 2002, pp. 1-6.

  43. Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner [Hardcover] By Galit Shmueli (Author), Nitin R. Patel (Author), Peter C. Bruce (Author)

  44. A. Gupta and I. S. Mumick. Materialized Views: Techniques, Implementations, and Applications. MIT Press, 1999.

  45. J. Han. Towards on-line analytical mining in large databases. ACM SIGMOD Record, 27:97-107, 1998.

  46. V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. SIGMOD’96

  47. C. Imhoff, N. Galemmo, and J. G. Geiger. Mastering Data Warehouse Design: Relational and Dimensional Techniques. John Wiley, 2003

  48. W. H. Inmon. Building the Data Warehouse. John Wiley, 1996

  49. R. Kimball and M. Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. 2ed. John Wiley, 2002

  50. P. O’Neil and D. Quass. Improved query performance with variant indexes. SIGMOD'97

  51. Microsoft. OLEDB for OLAP programmer’s reference version 1.0. In http://www.microsoft.com/data/oledb/olap , 1998

  52. A. Shoshani. OLAP and statistical databases: Similarities and differences. PODS’00.

  53. S. Sarawagi and M. Stonebraker. Efficient organization of large multidimensional arrays. ICDE'94

  54. OLAP council. MDAPI specification version 2.0. In http://www.olapcouncil.org/research/apily.htm , 1998

  55. E. Thomsen. OLAP Solutions: Building Multidimensional Information Systems. John Wiley, 1997

  56. P. Valduriez. Join indices. ACM Trans. Database Systems, 12:218-246, 1987.

  57. J. Widom. Research problems in data warehousing. CIKM’95.

  58. Kurt Thearling. Data Mining. http://www.thearling.com

  59. “Building Data Mining Applications for CRM”, By Alex Berson, Stephen Smith and Kurt Thearling

  60. Building Data Mining Applications for CRM by Alex Berson, Stephen Smith, Kurt Thearling (McGraw Hill, 2000).

  61. Introduction to Data Mining, By Pang-Ning Tan, Michael Steinbach, Vipin Kumar, 2006 Pearson Addison-Wesley.

  62. Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber, 2000 (c) Morgan Kaufmann Publishers

  63. Data Mining In Excel, Galit Shmueli Nitin R. Patel Peter C. Bruce, 2005 Galit Shmueli, Nitin R. Patel, Peter C. Bruce

  64. Principles of Data Mining by David Hand, Heikki Mannila and Padhraic Smyth ISBN: 026208290x The MIT Press © 2001 (546 pages)

  65. http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html

  66. http://paginas.fe.up.pt/~ec/files_1011/week%2008%20-%20Decision%20Trees.pdf

  67. http://www.quora.com/Machine-Learning/Are-gini-index-entropy-or-classification-error-measures-causing-any-difference-on-Decision-Tree-classification

  69. https://rapid-i.com/rapidforum/index.php?topic=3060.0

  70. http://stats.stackexchange.com/questions/19639/which-is-a-better-cost-function-for-a-random-forest-tree-gini-index-or-entropy

  71. Creswell, J. W. (2013). Research design: Qualitative, quantitative, and mixed methods approaches. Sage Publications, Incorporated.

  72. http://www.physics.csbsju.edu/stats/box2.html

  73. Advanced Data Mining Techniques, Olson, D.L, Delen, D, 2008 Springer

  74. Phyu, Thair Nu, “Survey of Classification Techniques in Data Mining”, Proceedings of the International MultiConference of Engineers and Computer Scientists 2009 Vol I IMECS 2009, March 18 - 20, 2009, Hong Kong

  75. Myatt, Glenn J., “Making Sense of Data – A Practical Guide to Exploratory Data Analysis and Data Mining”, 2007, Wiley-Interscience, John Wiley & Sons, Inc.

  76. Fawcett, Tom, “An Introduction to ROC analysis”, Pattern Recognition Letters 27 (2006) 861–874

  77. Sayad, Saeed. “An Introduction to Data Mining”, Self-Help Publishers (January 5, 2011).

  78. Delmater, Rhonda, and Monte Hancock. "Data mining explained." (2001).

  79. Alper, Theodore M. "A classification of all order-preserving homeomorphism groups of the reals that satisfy finite uniqueness." Journal of mathematical psychology 31.2 (1987): 135-154.

  80. Narens, Louis. "Abstract measurement theory." (1985).

  81. Luce, R. Duncan, and John W. Tukey. "Simultaneous conjoint measurement: A new type of fundamental measurement." Journal of mathematical psychology 1.1 (1964): 1-27.

  82. Provost, Foster J., Tom Fawcett, and Ron Kohavi. "The case against accuracy estimation for comparing induction algorithms." ICML. Vol. 98. 1998.

  83. Hanley, James A., and Barbara J. McNeil. "The meaning and use of the area under a receiver operating characteristic (ROC) curve." Radiology 143.1 (1982): 29-36.

  84. Ducker, Sophie Charlotte, W. T. Williams, and G. N. Lance. "Numerical classification of the Pacific forms of Chlorodesmis (Chlorophyta)." Australian Journal of Botany 13.3 (1965): 489-499.

  85. Kaufman, Leonard, and Peter J. Rousseeuw. "Partitioning around medoids (program pam)." Finding groups in data: an introduction to cluster analysis (1990): 68-125.

Index

A

  1. Affinity analysis

  2. Aggregate() function

  3. Akaike information criterion (AIC) value

  4. Amazon

  5. Apache Hadoop ecosystem

  6. Apache Hadoop YARN

  7. Apache HBase

  8. Apache Hive

  9. Apache Mahout

  10. Apache Oozie

  11. Apache Pig

  12. Apache Spark

  13. Apache Storm

  14. Apply() function

  15. Arrays, R

  16. Artificial intelligence

  17. Association-rule analysis

    1. association rules

    2. if-then

    3. interpreting results

    4. market-basket analysis

    5. rules

    6. support

  18. Association rules/affinity analysis

B

  1. Bar plot

  2. Bayes theorem

  3. Bias-variance errors

  4. Big data

    1. analysis

    2. analytics, future trends

      1. addressing security and compliance

      2. artificial intelligence

      3. autonomous services for machine learning

      4. business users

      5. cloud

      6. data lakes

      7. growth of social media

      8. healthcare

      9. in-database analytics

      10. in-memory analytics

      11. Internet of Things

      12. migration of solutions

      13. prescriptive analytics

      14. real-time analytics

      15. vertical and horizontal applications

      16. visualization at business users

      17. whole data processing

    3. characteristics

    4. ecosystem

    5. use of

  5. Big data analytics

  6. Binomial distribution

  7. Bivariate data analysis

  8. Bootstrap aggregating/bagging

  9. Boxplots

  10. Business analytics

    1. applications of

      1. customer service and support areas

      2. human resources

      3. marketing and sales

      4. product design

      5. service design

    2. computer packages and applications

    3. consolidate data from various sources

    4. drivers for

    5. framework for

    6. infinite storage and computing capability

    7. life cycle of project

    8. programming tools and platforms

    9. required skills for business analyst

      1. data analysis techniques and algorithms

      2. data structures and storage/warehousing techniques

      3. programming knowledge

      4. statistical and mathematical concepts

  11. Business Analytics and Statistical Tools

  12. Business analytics process

    1. data collection and integration

      1. data warehouse

      2. HR and finance functions

      3. IT database

      4. manufacturing and production process

      5. metadata

      6. NoSQL databases

      7. operational database

      8. primary source

      9. sampling technique

      10. secondary source

      11. variable selection

    2. definition

    3. deployment

    4. functions

      1. collection and integration

      2. deployment

      3. evaluation

      4. exploration and visualization

      5. management and review report

      6. modeling techniques and algorithms

      7. preprocessing

      8. problem, objectives, and requirements

    5. historical data

    6. identifying and understanding problem

    7. life cycle

    8. management report and review

      1. data cleaning carried out

      2. data set use

      3. deployment and usage

      4. issues handling

      5. model creation

      6. prerequisites

      7. problem description

    9. model evaluation

      1. confusion matrix

      2. gain/lift charts

      3. holdout partition

      4. k-fold cross-validation

      5. ROC chart

      6. test data

      7. validation

    10. model evaluation

      1. training

    11. preprocessing. See Preprocessing data

    12. real-time data

    13. regression model

    14. root-mean-square error

    15. sequence of phases

    16. techniques and algorithms

      1. data types

      2. descriptive analytics

      3. machine learning

      4. predictive analytics

C

  1. Classification techniques

    1. decision tree. See Decision tree structure

    2. disadvantage

    3. k-nearest neighbor (K-NN)

    4. probabilistic models

      1. advantages and limitations

      2. bank credit-card approval process

      3. Naïve Bayes

    5. R

      1. cross-validation error

      2. CSV format

      3. functions

      4. misclassification error

      5. plotting deviance vs. size

      6. school data set

      7. testing model

      8. training set and test set

      9. tree() package

    6. random forests

    7. step process

    8. types

  2. Cloud

  3. Cloudera

  4. Clustering analysis

    1. average linkage (average distance)

    2. categorical variable

    3. centroid distance

    4. complete linkage (maximum distance)

    5. Euclidean distance

    6. finance

    7. hierarchical clustering

      1. algorithm

      2. dendrograms

      3. limitations

    8. hierarchical method

    9. HR department

    10. Manhattan distance

    11. market segmentation

    12. measures distance (between clusters)

    13. mixed data types

    14. n records

    15. nonhierarchical clustering. See K-means algorithm

    16. nonhierarchical method

    17. overview

    18. Pearson product correlation

    19. purpose of

    20. single linkage (minimum distance)

  5. Coefficient of determination

  6. Comma-Separated Values (CSV)

  7. Computations on data frames

    1. analyses

    2. EmpData data

    3. in R

    4. scatter plots

  8. Continuous data

  9. Control structures in R

    1. for loops

    2. if-else

    3. looping functions

      1. apply() function

      2. cut() function

      3. lapply() function

      4. sapply() function

      5. split() function

      6. tapply() function

    4. while loops

    5. writing functions

  10. Correlation

  11. Correlation coefficient

  12. Correlation graph

  13. Cross-Industry Standard Process for Data Mining (CRISP-DM)

  14. Cut() function

  15. Cutree() function

D

  1. Data

  2. Data aggregation

  3. Data analysis, R

    1. reading and writing data

      1. from Microsoft Excel file

      2. from text file

      3. from web

  4. Data analysis tools

  5. Data analytics

  6. Data exploration and visualization

    1. descriptive statistics

    2. goal of

    3. graphs

      1. box/whisker plot

      2. correlation

      3. density function

      4. histograms

      5. notched plots

      6. registered users vs. casual users

      7. scatter plot matrices

      8. scatter plots

      9. trellis plot

      10. types of

      11. univariate analysis

    4. normalization techniques

    5. phase

    6. tables

    7. transformation

    8. View() command

  7. Data frames, R

  8. Data lakes

  9. Data Mining Group (DMG)

  10. Data science

  11. Data structures

    1. in R

      1. arrays

      2. data frames

      3. factors

      4. lists

      5. matrices

  12. Decision tree structure

    1. bias and variance

    2. classification rules

    3. data tuples

    4. entropy/expected information

    5. generalization errors

    6. gini index

    7. impurity

    8. induction

    9. information gain

    10. overfitting and underfitting

    11. overfitting errors

      1. CART method

      2. pruning process

      3. regression trees

      4. tree growth

    12. recursive divide-and-conquer approach

    13. root node

  13. Deep learning

  14. Dendrograms

  15. Density function

  16. Descriptive analytics

    1. computations on dataframes. See Computations on data frames

    2. graphical. See Graphical description of data

    3. Maximum depth of river

    4. mean depth of the river

    5. median of the depth of river

    6. notice, sign board

    7. percentile

    8. population and sample

    9. probability

    10. quartile 3

    11. statistical parameters. See Statistical parameters

  17. Discrete data types

  18. Durbin-Watson test

E

  1. Economic globalization

  2. Ecosystem, big data

  3. Euclidean distance

  4. Extensible Markup Language (XML)

F

  1. Factors, R

  2. for loops

G

  1. Graphical description of data

    1. bar plot

    2. boxplot

    3. histogram

    4. plots in R

      1. code

      2. creation, simple plot

      3. plot()

      4. variants

  2. Gross domestic product (GDP)

H

  1. Hadoop Distributed File System (HDFS)

  2. Hadoop ecosystem

    1. advantages

  3. Hadoop framework

  4. Healthcare, big data

  5. Hierarchical clustering

    1. algorithm

    2. closeness

    3. dendrograms

    4. limitations

  6. Histograms

  7. Huge computing power

  8. Huge storage power

  9. Hybrid Transactional/Analytical Processing (HTAP)

  10. Hypothesis testing

I

  1. If-else structure

  2. In-database analytics

  3. In-memory analytics

  4. Integrated development environment (IDE)

  5. Internet of Things

  6. Interquartile Range (IQR) method

  7. Interval data types

J

  1. JavaScript Object Notation (JSON) files

  2. JobTracker

K

  1. k-fold cross-validation

  2. K-means algorithm

    1. case study

      1. outliers verification

      2. relevant variables

      3. scores() function

      4. standardized values

      5. test data set

    2. data points (observations)

      1. aggregate() function

      2. cutree() function

      3. data observations

      4. dendrogram

      5. dist() function

      6. hclust() function

      7. hierarchical partitioning approach

      8. library(NbClust) command

      9. NbClust() command

      10. observations

      11. plot() function

      12. rect.hclust() function

      13. rent and distances

      14. selected approaches

    3. goal

    4. k-means algorithm

    5. limitations

    6. objective of

    7. partition clustering methods

  3. kmeansruns() function

  4. k-nearest neighbor (K-NN)

L

  1. lapply() function

  2. Lasso Regression method

  3. Linear regression

    1. assumptions

    2. correlation

      1. attrition

      2. cause-and-effect relationship

      3. coefficient

      4. customer satisfaction

      5. employee satisfaction index

      6. sales quantum

      7. strong/weak association

    3. data frame creation

    4. degrees of freedom

    5. equal variance, variable

    6. equation

    7. F-statistic

    8. function

    9. independent and dependent variable

    10. innovativeness

    11. intercept

    12. least squares method

    13. linear relationship

    14. marketing efforts

    15. multiple R-squared

    16. predict() function

    17. profitability

    18. properties

    19. p-value

    20. quality-related statistics

    21. R command

    22. residuals

    23. residual standard error

    24. sales personnel

    25. standard error

    26. testing

      1. independence errors

      2. linearity

      3. normality

    27. validation

      1. crPlots(model name) function

      2. gvlma() function

      3. scale-location plot

    28. value of significance

    29. work environment

  4. Lists, R

  5. Logistic regression

    1. binomial distribution

    2. data creation

    3. glm() function

    4. lm() function

    5. logistic regression model

    6. model creation

      1. comparison

      2. conclusion

      3. deviance

      4. dispersion

      5. glm() function

      6. model fit verification

      7. multicollinearity

      8. residual deviance

      9. summary of

      10. variables

      11. warning message

    7. multinomial logistic regression

    8. read.csv() command

    9. regularization. See Regularization

    10. training and testing

      1. prediction() function

      2. response variable

      3. validation

  6. Looping functions

    1. apply() function

    2. cut() function

    3. lapply() function

    4. sapply() function

    5. split() function

    6. tapply() function

M

  1. Machine learning

  2. Manhattan distance

  3. MapReduce

  4. Market-basket analysis (MBA). See Affinity analysis

  5. Matrices, R

  6. Measurable data. See Quantitative data

  7. Microsoft Azure

  8. Microsoft Business Intelligence and Tableau

  9. Microsoft Excel file, reading data

  10. Microsoft SQL Server database

  11. Minkowski distance

  12. Min-max normalization

  13. Mtcars Data Set

  14. Multicollinearity

  15. Multinomial logistic regression

  16. Multiple linear regression

    1. assumptions

    2. components

    3. correlation

    4. data

    5. data-frame format

    6. discrete variables

    7. equation

    8. lm() function

    9. multicollinearity

    10. predictors

    11. response variable

    12. R function glm()

    13. stepwise

    14. subsets approach

    15. training and testing model

    16. validation

      1. crPlots

      2. Durbin-Watson test

      3. ncvTest(model name)

      4. normal Q-Q plot

      5. qqPlot

      6. residuals vs. fitted

      7. residuals vs. leverage plot

      8. scale-location plot

      9. Shapiro-Wilk normality test

  17. Multiple linear regression equation. See Multiple linear regression

  18. Multiple regression

  19. myFun() function

N

  1. Naïve Bayes

  2. Natural language processing (NLP)

  3. NbClust() function

  4. Nominal data types

  5. Nonhierarchical clustering. See K-means algorithm

  6. Non-linear regression

  7. Normal distribution

  8. Normalization techniques

  9. NoSQL

  10. Null hypothesis

O

  1. Online analytical processing (OLAP)

  2. Open Database Connectivity (ODBC)

  3. Ordinal data types

  4. Overdispersion

P

  1. Packages and libraries, R

  2. Partition clustering methods

  3. Poisson distribution

  4. Prediction

  5. Predictive analytics

    1. classification

    2. regression

  6. Predictive Model Markup Language (PMML)

  7. Preprocessing data

    1. preparation

      1. duplicate, junk, and null characters

      2. empty values

      3. handling missing values

    2. R

      1. as.numeric() function

      2. complete.cases() function

      3. data types

      4. factor levels

      5. factor() type

      6. head() command

      7. methods

      8. missing values

      9. names() and c() function

      10. table() function

      11. vector operations

    3. types

  8. Probabilistic classification

    1. advantages and limitations

    2. bank credit-card approval process

    3. Naïve Bayes

  9. Probability

    1. concepts

    2. distributions. See Probability distributions

    3. events

    4. mutually exclusive events

    5. mutually independent events

    6. mutually non-exclusive events

  10. Probability distributions

    1. binomial

    2. normal

    3. Poisson

  11. Probability sampling

  12. Property graphs (PG)

Q

  1. Qualitative data

  2. Quantitative data

R

  1. R

    1. advantages

    2. console

    3. control structures

      1. for loops

      2. if-else

      3. looping functions

      4. while loops

      5. writing functions

    4. data analysis

      1. reading and writing data

    5. data analysis tools

    6. data structures

      1. arrays

      2. data frames

      3. factors

      4. lists

      5. matrices

    7. glm() function

    8. installation

      1. RStudio interface

    9. interfaces

    10. library(NbClust) command

    11. lm() function

    12. Naïve Bayes

    13. objects types

    14. packages and libraries

    15. pairs() command

    16. programming, basics

      1. assigning values

      2. creating vector

    17. View() command

  2. Random forests

  3. Random sampling

  4. Ratio data types

  5. read.csv() function

  6. read.table() function

  7. Receiver operating characteristic (ROC)

  8. rect.hclust() function

  9. Regularization

    1. cv.fit() model

    2. cv.glmnet() function

    3. generic format

    4. glmnet() function

    5. glmnet_fit command

    6. methods

    7. plot() function

    8. plot(cv.fit)

    9. predict() function

    10. print() function

    11. shrinkage methods

    12. variable

  10. Ridge Regression method

  11. RODBC package

  12. Root-mean-square error (RMSE)

  13. RStudio

    1. installation error

    2. installing

    3. interface

    4. output

    5. window

S

  1. sapply() function

  2. Scatter plot matrices

  3. Scatter plots

    1. analysis of data

    2. changes, relationship

    3. Coding

    4. created in R

    5. EmpData1

  4. seq_along() function

  5. Shrinkage methods

  6. Simple regression

  7. split() function

  8. Standard deviations

  9. Statistical parameters

    1. mean

      1. data set

      2. downside of

      3. in R

      4. limitations

      5. profit and effective

      6. single parameter

      7. usage of

    2. median

    3. mode

    4. quantiles

    5. range

    6. standard deviation

    7. summary(dataset)

    8. variance

  10. Storm

  11. Stratified sampling

  12. Supervised machine learning

  13. Systematic sampling

T

  1. tapply() function

  2. Text file, reading data

  3. Transformation

  4. Trellis graphics

U

  1. Univariate analysis

  2. Unsupervised machine learning

    1. association-rule analysis

      1. association rules

      2. if-then

      3. interpreting results

      4. market-basket analysis

      5. rules

      6. support

    2. clustering. See Clustering analysis

V

  1. Variance errors

  2. Variance inflation factor (VIF)

  3. Variety

  4. Velocity

  5. Visualization

    1. Workflow

  6. Visualization. See Data exploration and visualization

W, X, Y

  1. Web, reading data

  2. while loops

  3. Whole data processing

Z

  1. Z-score normalization
