Index

abbreviations and acronyms for variables 21, 40

accuracy 39

algorithms xvi, 234, 235

clustering 105, 108, 109, 114

alpha level (or significance level) 74–5, 86

alternative hypothesis (H1) 71–2, 74, 75, 76, 83, 86

analytics anxiety 5–11

causes of 7–8

concept of and key components 6

key risks 11

remedies 8–9

analytics biases 12, 125–36

data analysis 128, 131–2, 136

data communication and usage 128, 132–4, 136

data gathering 128, 129–31

key concepts 126

analytics roles 231, 232, 233

analytics software presentations xxii, 189–96

demo demons 189, 191, 192–3

good practices 193–4

analytics software tools 8, 23, 60, 116, 233, 236

analytics trends 229, 233, 234–5

ANOVA (analysis of variance) 68–9, 84

application-related questions 142, 144, 145–6

artificial intelligence (AI), transparent 235

audience(s)

attitude about topic 174

building common ground with 171, 176, 183, 194

call to action 174, 176, 180, 183, 184, 185

involvement 156, 162, 166, 167

triggering emotions in 171, 174, 184

autonomy-supporting language 206, 207, 208

average dispersion see variance

averages 33, 40, 41

see also mean; median; mode

b values 62–3, 83

bad data-based news xxii, 199–214

acceptance stage 203

communication strategies 206–8

resistance as challenge of 206, 207

common types

goal failure 201

insufficient competitiveness 201, 203

negative trends 201

comprehension stage 203–4

communication strategies 204–6

confusion as challenge of 204, 205

dos and don’ts of communicating 212, 213

as game changer 203

motivation (action stage) 204

communication strategies 209, 211

frustration as challenge of 209, 211

bar charts 23, 159, 164, 180

beta values (b) 63, 83

biases see analytics biases

big data xvi, 9, 109, 115, 235

big knowledge 9

bimodal distribution 28

binary variables 19, 20, 29, 33, 55

biserial correlation coefficient 55

bivariate data 46, 48, 84

black box allergy 8

categorical variables 18–19, 20, 28, 29, 33, 65–8

binary 19, 20, 29, 33, 55

nominal 19, 20, 33, 55

ordinal 19, 20, 29, 33, 54

causation, correlation vs 55, 83, 133

causation bias 133

cause-and-effect relationships 76, 80

central location see central tendency

central tendency, measures of 27–33, 34

chart shock 8

charts

DESIGN principles xxi, 156–68

Decluttering 156, 158, 166, 167

Emphasising 156, 158–9, 166, 167

Storifying 156, 160, 161, 166, 167

Involving the audience 156, 162, 166, 167

Giving meaning 156, 163–4, 167

No distortion rule 156, 164, 166, 167

labelling 158, 163, 166

symbols 163, 166

titles or captions 163, 166

see also bar charts; dendrograms; graphs; histograms; line charts; maps; pie charts; scatter plots

cloud analytics 234

cluster analysis xxi, 105–21

hierarchical 109, 113–14, 115, 117, 120

how it works 108–10

k-means clustering 109, 120

key concepts 106

limitations of 117–18

outliers and 115, 117

risks 120

similarity measures 110–13

visualisations 113–16

cognitive computing 234

Cohen’s d 76

collaborative analytics 234

comparison

chart format to enable 159

of group means 66–8, 76

competitiveness, insufficient 201, 203

complex relationships xx–xxi, 91–102

conceptual models 100

confirmation bias 130–1

confounding variables 81–2, 83, 86, 131

confusion 204, 205

continuous variables 18, 19–20, 28, 33, 54, 55, 64

interval 19, 20, 29, 30, 33, 54, 64

ratio 19, 20, 29, 30, 33, 54, 64

sizes of relationships between 76

controlling language 206–8

convenience sampling 79

conversations xvi–xvii

correlation 49–50, 52–5, 84

direction and strength of see correlation coefficient

negative linear 50, 52, 53, 85

no correlation 50

positive linear 49–50, 52, 53, 84

scatter plots 48–50, 51

vs causation 55, 83, 133

correlation coefficient 52–5, 60, 83

biserial 55

Cramer’s V 55

Kendall’s tau 54

Pearson 52–4, 76

point-biserial 55

Spearman’s rho 54

counter-arguing 208

critical mindset 229, 231, 233

curse of knowledge bias 132–3, 134, 193, 194

curvilinear relationships 50–1

dashboards

making data meaningful in 163

storified 160, 161, 178–80, 182

data analysis 142

asking questions related to 142, 145

biases 128, 131–2, 136

rigour and relevance of 144

Data Central 233

data communication and usage biases 128, 132–4, 136

data disagreements xxii, 217–26

data distortions xxi

avoiding, and chart design 156, 164, 166, 167

see also analytics biases

data fatigue 8

data fluency xvi, xxii, xviii–xx

elements for sustained

critical attitude 229, 231, 233

monitoring trends 229, 233, 234–5

resources 229, 231, 233

understanding and skills 229, 230

five step action plan 236

data gathering biases 128, 129–31

data quality 142, 144–5

data quality paranoia 8

data scepticism 8

Data Science Central 231

data screening 163–4

data sources 142, 144–5

data storytelling xxi–xxii, 8, 9, 160, 171–86, 230

audience

attitude about topic 174

building common ground with 171, 176, 183

call to action and rewards 174, 176, 180, 183, 184, 185

triggering emotions in 171, 174, 184

canvas 172, 174–8, 181

dashboard example 178–80, 182

moral of the story 180, 184

personal style in 184–5

situation–complication–resolution format 176, 180, 183–4, 185

specific–explore–generalise format 176, 177

surprise story approach 176–7

use of voice in 171, 177–8, 184–5

see also storifying charts

data visualisation 153–68, 172, 230

cluster analysis 113–16

complex relationships 100–1

and data disagreement 220–1, 222

frequency distributions 23, 25

tools 23, 60, 116, 236

see also charts

decimals 39, 40, 41

decluttering 156, 158, 166, 167

demo demons 189, 191, 192–3

dendrograms 109, 114–16, 121

dependent variables 21, 22, 40, 81

deviation 35–8

chart format to emphasise 159

direction and strength of relationship see Pearson correlation coefficient

discrete variables 19–20

dispersion see spread

distance matrix 114

distance metrics see Euclidean distance

distortions see data distortions

distributed analytics 233, 234

distribution see frequency distributions; normal distributions

Drucker, Peter 147

Dunning–Kruger effect 133

edge analytics 233, 234

effect sizes 76–8, 86

emotion(s) 160, 185

triggering 171, 174, 184

emphasising, and chart design 156, 158–9, 166, 167

error of the model (residual) 57

eta squared 76

Euclidean distance 110, 111–13

evidence-based decisions 6, 7

Excel 236

experimental hypothesis see alternative hypothesis

experimental research designs 80, 83

exponential relationships 51

factor analysis 117–18

feedback 93, 147, 162

frequency distributions 15, 23–33

concept of and key components 24

measures of central tendency 27–33, 34

pointiness (kurtosis) 25–7, 40

presentation format 23, 24, 25

skewed 25–7, 30–1, 40

frustration 209, 211

full mediation 99

gain-framed perspective 206

Gartner.com 233

Gaussian distributions see normal distributions

generalisations 48, 69–76, 82, 83

goal failure 201

graphic representation see data visualisation

graphs, frequency distributions 23, 24, 25

group means 66–8, 76

group membership, and outcomes, comparing differences between 64–8, 76

grouping see cluster analysis

histograms 23

humour 193, 208

hybrid intelligence 235

hypothesis 71

alternative (experimental) (H1) 71–2, 74, 75, 76, 83, 86

null (H0) 71–2, 73, 74–5, 76, 83, 86

hypothesis testing 71–6

important effects, and significant effects, difference between 76–82

incompetence compensation competence 8

independent variables 21–2, 40, 81

Infoworld.com 233

interaction effect 93

interactive data 162

the intercept 57, 58, 59–60, 83, 85

interquartile range 34–5, 36

Intersection over Union see Jaccard coefficient or index

interval variables 19, 20, 29, 30, 33, 54, 64

involvement, audience 156, 162, 166, 167

Jaccard coefficient or index 110–11

jargon, avoidance of 63–4, 84–6, 176, 204

k-means clustering 109, 120

Kendall’s tau 54

kurtosis 25–7, 40

labelling charts 158, 163, 166

large numbers 39, 40, 41

line of best fit 56–8, 59, 85

line charts 164

linear regression model 56–69

with error term 60, 61

with multiple predictors 61–2

linear relationships 46–50

concept of and key components 47

direction and strength of 51–5

see also correlation; correlation coefficient

LinkedIn 231, 233

loss-framed perspective 206

maps 159

similarity 115, 116, 120

market segmentation 107

mean 27, 29–32, 33, 40

group 66–8, 76

spread of data around 34–8

meaning, giving, and chart design 156, 163–4, 166, 167

median 27, 28–9, 30, 31–2, 33, 40

mediation xxi, 91, 93, 94, 98–9, 100, 101

full 99

partial 99

mediator variable 98, 99, 100, 101, 102

meetups 233

Microsoft Excel 236

Microsoft Power BI 8, 116, 236

mobile analytics 235

mode 27–8, 33, 40

moderation xxi, 91, 93–8, 100, 101

moderator variable 93, 95, 96, 97, 100, 101, 102

multidimensional scaling plot 115

multimodal distribution 28

nasty questions 208–9

negative linear correlation 50, 52, 53, 85

negative trends 201

nominal variables 19, 20, 33, 55

non-probability sampling 79

normal distributions 25, 26

deviations from see kurtosis; skewed distributions

normality bias 132

null hypothesis (H0) 71–2, 73, 74–5, 76, 83, 86

odds ratio 76

ordinal variables 19, 20, 29, 33, 54

outcome variables 21–2, 40, 48, 57, 58–9, 76, 82

see also predictor variables, and outcomes

outliers 16, 30, 31–2, 34, 40, 163

cluster analysis 115, 117

and data analysis 131, 136

overfitting 132

p-values 74–5, 76–8, 86

partial mediation 99

Pearson correlation coefficient 52–4, 76

pie charts 164, 180

point-biserial correlation coefficient 55

population(s) 70–1, 85

positive linear correlation 49–50, 52, 53, 84

Power BI 8, 116, 236

precision 39

predictions 48, 55–69, 82, 83, 85

predictor variables 21–2, 40, 48, 76, 82

and outcomes 57, 58–9, 61–2

mediation analysis 91, 93, 94, 98–9, 100, 101

moderation analysis 91, 93–8, 100, 101

principal component analysis (PCA) 118

probability sampling 79

Python (programming software) xvi, 236

qualitative data 17

quality of data see data quality

quantitative data 17

quantum computing 233, 235

quasi-experimental research 80–1

questions about data (Q&A sessions) xxi, 141–50

analysis-related questions 142, 145

application-related questions 142, 144, 145–6

data source and quality questions 142, 144–5

nasty questions 208–9

risks 149

tenacity 146, 147

timing of 146–7

tone 146

R (programming software) xvi, 236

random assignment/randomisation 80, 81, 86

random sampling 79, 86

range 34–5

interquartile 34–5, 36

ranking, chart format to enable 159

RapidMiner software 8, 236

ratio variables 19, 20, 29, 30, 33, 54, 64

recency effect 211

regression analysis 56–69, 84, 85

regression line (line of best fit) 56–8, 59, 85

regression plane 61–2

regression slope (or gradient) 57–9, 85

representative sample 78–9, 83, 86

research design 78, 83

experimental 80, 83

quasi-experimental 80–1

residual (error of the model) 57

resistance to bad news 206, 207

resources, for sustained data fluency 229, 231, 233

reverse mentoring 8, 231

risk-based segmentation 107

S-shaped relationships 51

sample(s) 70–1, 86

representative 78–9, 83, 86

sampling methods 79

SAS software 8, 236

scatter plots (or scatter diagrams) 48–50, 51, 159

segmentation see cluster analysis

selection bias 130

self-service analytics (or business intelligence) 233, 235

signal-to-noise ratio 71–5

significance level (or alpha level) 74–5, 86

significant effects, and important effects, difference between 76–82

silhouette scores 117

similarity 105, 110–13

measures 120

Euclidean distance 111–13

Jaccard coefficient or index 110–11

similarity maps 115, 116, 120

similarity matrix 114

Simon, Herbert 105

skewed distributions 25–7, 30–1, 40

slope see regression slope

Smedley, Ralph C. xviii

snowball sampling 79

software see analytics software

Spearman’s rho 54

Spinoza, Baruch xviii

spread (dispersion) 34–8

SPSS 236

standard deviation 37–8

stat phobia 7

statistics 15–41, 230

avoiding jargon when talking about 63–4, 84–6

storifying charts 156, 160, 161, 166, 167

storytelling see data storytelling

straight-line relationships see linear relationships

sum of squared errors (SS) 37, 38

survivor bias 130

symbols 163, 166

systematic vs unsystematic variation 71–4

Tableau 8, 23, 60, 116, 236

tables, frequency distributions 23, 25

tails of a distribution 26

taxonomies 105, 107

test statistics 71–5

titles, charts 163, 166

transparent AI 235

trends

in analytics 229, 233, 234–5

chart format to show 159

negative 201

U-shaped relationships 50–1

univariate data 46, 84

use of data, biases in 132–4

variables 17–23

abbreviations and acronyms 21, 40

confounding 81–2, 83, 86, 131

dependent 21, 22, 40, 81

discrete 19–20

independent 21–2, 40, 81

and measures of central tendency 28, 29, 30, 33

mediator 98, 99, 100, 101, 102

moderator 93, 95, 96, 97, 100, 101, 102

see also categorical variables; continuous variables; outcome variables; predictor variables

variables, relationship between 21–2

linear 46–50

concept of and key components 47

direction and strength 51–5

non-linear

exponential 51

S-shaped 51

U-shaped (or curvilinear) 50–1

see also complex relationships; correlation; correlation coefficient

variance 37–8, 41

ANOVA (analysis of variance) 68–9, 84

variation, systematic vs unsystematic 71–4

visualisation see data visualisation

voice, and data storytelling 171, 177–8, 184–5

voluntary response sampling 79
