Index

A

alpha (α) 194–196

Alternative Hypothesis 175

alternative models, evaluating 315–319

analysis

See also specific types

capability 398–400

with Fit Y by X 100–101

with multivariate platform 98–99

trend 350–352

analysis of variance (ANOVA)

about 229

applications 245–248

assumptions about 229–230

conducting 287–291

interpreting regression results 257

one-way 230–237

satisfaction of conditions 237–238

two-way 238–245

analysis platform, using 11–14

analytics frameworks, applying 89–91

ANOVA

See analysis of variance (ANOVA)

applications

analysis of variance (ANOVA) 245–248

chi-square tests 213–215

data 37–38, 56–59

discrete distributions 118–121

experimental design 379–384

forecasting techniques 355–358

inference 184–187, 201–204, 226–228

linear regression analysis 261–265

multiple regression 319–322

normal model 138–141

probability 118–121

quality improvement 402–405

regression analysis 338–340

residuals analysis 280–283

residuals estimation 280–283

sampling and sampling distributions 159–162

variables 82–85

applying analytics frameworks 89–91

ARIMA (AutoRegressive Integrated Moving Average) models 352–355

assigning probability values 108

assumptions

about analysis of variance (ANOVA) 229–230

evaluating 240–241

asterisk (*) 178–179

autocorrelation 342–343

AutoRegressive Integrated Moving Average (ARIMA) models 352–355

autoregressive models 352–355

axes, customizing in histograms 50–52

B

bars, customizing in histograms 50–52

beta (β) 194–196

“Big Data” 28

binomial distribution 113–115

bivariate data 61–62

bivariate inference

about 285

life expectancy by GDP per capita 291–293

life expectancy by income group 287–291

research context 285

blocking 369–371

blocks 361–362

bootstrap confidence intervals 181–182

bootstrapping 197

Box, George 352, 360–361

box plot 46

box-and-whiskers plot 46

bubble plots 78–81

“by-hand” method 177

C

capability analysis 398–400

cases 24

casewise data 180

categorical 3

categorical regression models 323

See also regression analysis

categorical variables

distributions of 41–45

inference for 173–187

inference for two 208

one continuous variable and one 71–74

sample observations of 168–169

two 63–71

center of distributions 46, 47

Central Limit Theorem (CLT) 154–156, 218–224

central tendency, of distributions 46, 47

characterization 361

chart variability 395–397

checking data for suitability of normal model 133–137

chi-square distribution 177, 205

chi-square tests

about 205

applications 213–215

contingency tables 209–210

goodness-of-fit test 205–208

of independence 211–213

inference for two categorical variables 208

Classical method, of assigning probabilities 108

CLT (Central Limit Theorem) 154–156, 219, 231

clustering 157–159

coefficients 257

collinearity

about 309–315

dealing with 314–315

example 309–314

column properties 7

Column Switcher 97

columns, of data tables 2

comma (,), with Normal Distribution function 130

comparing

two means with JMP 217–224

two variances with JMP 224–226

complement of an event 107

complex sampling 157–159

conditional probability 107–108

conditional values 210

conducting

analysis of variance (ANOVA) 287–291

significance testing with JMP 174–179, 190–196

confidence band 278

confidence intervals

about 179

bootstrap 181–182

estimating 173

for parameters 276–277

for Y|X 278–279

confidence limits 55

constant variance 273

contingency tables

about 209–210

displaying covariation in categorical variables 68–71

probability and 109–111

continuous columns

continuous data

fitting lines to bivariate 249–253

probability and 123

using Distribution platform for 46–52

continuous variables

inference for single 189–204

one categorical variable and 71–74

sample observations of 169–171

two 74–81

two-sample inference for 217–228

control charts

about 386–395

for individual observations 387–389

for means 389–392

for proportions 392–395

control limits 387, 390

correlation 77–78

covariation

one continuous, one categorical variable 71–74

two categorical variables 63–71

two continuous variables 74–81

creating

data tables 5–8, 36

pseudo-random normal data 137–138

cross-section 24

cross-sectional data 90

cross-sectional sampling 28–29

crosstabulation 68–71

CTRL key 134

cumulative probabilities 115, 129–132

curvature 270

curvilinear regression models 323

See also regression analysis

curvilinear relationships 330–337

customizing histograms 50–52

cycle pattern 341

D

data

See also continuous data

applications 37–38, 56–59

bivariate 61–62

casewise 180

checking for suitability of normal model 133–137

cross-sectional 90

experimental 29–32

longitudinal 90

matched pairs of 199–200

observational 33, 90

panel 25

populations 23–25

processes 23–25

raw case data 36–37

representativeness 25–28

samples and sampling 23–29

study design 29–36

summary 36–37, 182–183

survey 33–36

time-series 90

types of 2–3, 90–91

data analysis

goals of 1–2

role of probability in 105–106

data dictionary 33

Data Filter tool 42–43

Data Grid area 7

data management

See data

data sources

See data

data tables

about 2

creating 5–8, 36

degrees of freedom (DF) 207

density functions 124–126, 163–164

description 1–2

descriptive statistics

about 87

analysis with Fit Y by X 100–101

analysis with multivariate platform 98–99

applying analytics frameworks 89–91

data source and structure 90

exploring relationship with Graph Builder 95–98

interpretation 101

observational units 90

preparation for analysis 92

questions for analysis 88–89

univariate descriptions 92–94

variables and data types 90–91

visualizing multiple relationships 101–103

World Development Indicators (WDI) 87–88

detecting patterns 341–344

DF (degrees of freedom) 207

dichotomous dependent variables 327–330

dichotomous independent variables 323–327

disclosure button 8

discrete distributions

about 105

applications 118–121

as models of real processes 117–118

discrete random variables

about 111

three common 112–115

dispersion, of distributions 46, 47

Distribution command 166

Distribution platform, for continuous data 46–52

“distribution-free” methods 213

distributions

See also discrete distributions

binomial 113–115

of categorical variables 41–45

center of 46, 47

central tendency of 46, 47

chi-square 177, 205

dispersion of 46, 47

Hypergeometric 171

integer 112–113

non-normal 222–224

normal 164–165, 218–224

Poisson 115

probability 111, 163–164

of quantitative variables 45–46

theoretical discrete 111

of variables 40–41

dummy variables 323–327

Dunnett’s method 235–236

E

effect likelihood ratio tests 330

equal variances, compared with unequal variances 221–222

error 183

estimating

confidence intervals 173

population means with JMP 197–199

population proportions with JMP 179–183

evaluating

alternative models 315–319

assumptions 240–241

events

probability of 107

rules for two 107–108

excluded rows 14

expected frequency 208

experimental data 29–32

experimental design

about 359

applications 379–384

blocks and blocking 361–362, 369–371

factorial designs 362–369

factors 361–362

fractional designs 371–375

goals of 360–361

multi-factor experiments 362–369

randomization 361–362

reasons for experimenting 360

response surface designs 375–379

experimental runs 361

exporting JMP results to word-processor documents 16–20

extraordinary sampling variability 167–171

F

factor profiles 242

factorial analysis 234–237

factorial designs 362–369

factors 361–362

Fit Model platform, residuals analysis in 304–306

Fit Y by X, analysis with 100–101

fitted line 77

fitting 11

five-number summary 54

fly ash 360

forecasting techniques

about 341

applications 355–358

autoregressive models 352–355

detecting patterns 341–344

smoothing methods 344–350

trend analysis 350–352

fractional designs 371–375

frequency of values 46

full factorial experimental design 364–369

G

Gaussian density function 126

generalization, simulation to 151–152

goals

of data analysis 1–2

of experimental design 360–361

golden mean 258

goodness-of-fit test 205–208

Gosset, William 199

Grabber 50–51

Graph Builder

about 8–11

exploring categorical data with 44–45

exploring data with 76

exploring relationships with 95–98

using 52–53

graphs, linked 50

H

Hand tool 50–51

Haydn, Franz Joseph 258

Help tool 240

heterogeneity of variance 273

heteroscedasticity 273, 304–306

hidden rows 14

histograms 46, 50–52

Holt, Charles 348

Holt’s Method 348–349

homogeneity 225

homogeneity of variance 273

homoscedasticity 273

Hypergeometric distribution 171

hypothesis testing 173

I

IIP (Index of Industrial Production) 342

independence

about 274–276

chi-square tests of 211–213

independent events 108

Index of Industrial Production (IIP) 342

indicator variables 323–327

individual observations, charts for 387–389

inference

See also bivariate inference; linear regression analysis; univariate inference

about 2, 189, 217

applications 184–187, 201–204, 226–228

comparing two means with JMP 217–224

comparing two variances with JMP 224–226

conditional status of statistical 174

conditions for 189–190, 217

conducting significance testing with 174–179

conducting significance testing with JMP 190–196

confidence interval estimation 173, 179

estimating population means with JMP 197–199

estimating population proportions with JMP 179–183

matched pairs 199–200

satisfying conditions 197

for single categorical variable 173–187

for single continuous variable 189–204

for two categorical variables 208

two-sample 217–228

influential observations 270–272

integer distribution 112–113

interaction effect 239, 241–245

interpretation 101

interpreting regression results 256–261

interquartile range (IQR) 55

inverse cumulative problems, solving 132–133

IQR (interquartile range) 55

irregular pattern 341

J

Jenkins, Gwilym 352

“jitters” 9

JMP

See also specific topics

comparing two means with 217–224

comparing two variances with 224–226

conducting significance testing with 174–179, 190–196

estimating population means with 197–199

estimating population proportions with 179–183

exporting results to word-processor documents 16–20

leaving 21

selecting simple random samples with 145–148

simulating random variation with 116–117

starting 3–4

JMP Scripting Language (JSL) 148

joint probability 107

joint relative frequency 210

joint-frequency table 68–71

JSL (JMP Scripting Language) 148

K

Kruskal-Wallis Test 224

L

label property 7

labeled rows 14

Lack of Fit 257

least squares estimation, conditions for 267–268

leaving JMP 21

linear exponential smoothing (Holt’s Method) 348–349

linear regression analysis

about 249

applications 261–265

assumptions of 255–256

fitting lines to bivariate continuous data 249–253

interpreting regression results 256–261

simple regression model 253–255

linearity 254–255, 269–270

linked graphs/tables 50

logarithmic growth 291

logarithmic models 334–337

longitudinal data 90

longitudinal sampling 28–29

lower fences 55

M

Mann-Whitney U Test 224

margin of error 180

matched pairs 199–200

MDGs (Millennium Development Goals) 88

means

comparing two with JMP 217–224

control charts for 389–392

metadata 5

Millennium Development Goals (MDGs) 88

millimeters of mercury (mmHg) 141

missing data 62, 67

mmHg (millimeters of mercury) 141

model specification 323

modeling types 2–3

modifying analysis 67

Mozart, Wolfgang Amadeus 258

multicollinearity 309

multi-factor experiments 362–369

multiple regression

about 295

applications 319–322

collinearity 309–314

evaluating alternative models 315–319

fitting a model 298–302

model 295–296, 302–304

residuals analysis in Fit Model platform 304–306

visualizing 296–298

multivariate platform, analysis with 98–99

mutually exclusive events 107

N

National Health and Nutrition Examination Survey (NHANES) 33

nominal columns 3

non-linear regression models 323

See also regression analysis

non-linear relationships 330–337

non-normal distributions, comparing two means with JMP 222–224

nonparametric equivalent test 237–238

nonparametric methods 213

non-parametric test 197

non-random sampling 28

normal density function 126

normal distributions 164–165, 218–224

normal model

about 123, 127

applications 138–141

checking data for suitability of 133–137

continuous data and probability 123

density functions 124–126

generating pseudo-random normal data 137–138

normal calculations 128–133

Normal Probability Plot (NPP) 133–137

Normal Quantile function 132

Normal Quantile Plots 133–137

normality 272–273, 294

NPP (Normal Probability Plot) 133–137

null hypothesis 175–176

O

observational data 33, 90

observational units 24, 90

observations 2

one-way analysis of variance (ANOVA) 230–237

optimization 361

ordinal columns 3

ordinary least squares estimation (OLS) 265n1

ordinary sampling variability 167–171

outlier box plots 55

overlap marks 232

P

panel data 25

panel studies 29

panning axes 52

parameter estimates 257–258, 330

parameters, confidence intervals for 276–277

Pareto charts 400–402

partition platform 306–309

patterns, detecting 341–344

percentiles 53–55

Pipeline and Hazardous Materials Program (PHMSA) 117–118

Poisson distribution 115

polynomial functions 331

population means, estimating with JMP 197–199

population proportions, estimating with JMP 179–183

populations 2, 23–25

post-stratification weights 157

power of a test 194–196

predictability, of risks 25

prediction bands 279

prediction intervals, for Y|X 279

Prediction Variance Profile Plot 376

primitives 103

probability and probabilistic sampling

about 105, 163

applications 118–121

assigning values 108

contingency tables and 109–111

continuous data and 123

cumulative probabilities 115, 129–132

events, probability of 107

extraordinary sampling variability 167–171

normal distributions 164–165

ordinary sampling variability 167–171

probability distributions and density functions 163–164

role of in data analysis 105–106

t distributions 164–165

usefulness of theoretical models 166–167

probability distributions 111, 163–164

probability of an event (Pr(A)) 107

probability theory 105–108

process capability 398

processes

about 23–25

in quality improvement 385–386

proportions, charts for 392–395

pseudo-random normal data, generating 137–138

p-value 178–179, 183, 192–194

Q

quadratic models 331–334

quality improvement

about 385

applications 402–405

capability analysis 398–400

control charts 386–395

Pareto charts 400–402

processes 385–386

variation in 385–386

quantile 54

quantitative 3

quantitative variables, distributions of 45–46

R

random error 255

Random function 137–138

random variation, simulating with JMP 116–117

randomization 24, 361–362

Rasmussen, Marianne 106–107, 110

raw case data 36–37

red triangles 6, 100, 134

regression analysis

See also multiple regression

applications 338–340

curvilinear relationships 330–337

dichotomous dependent variable 327–330

dichotomous independent variables 323–327

interpreting results 257

non-linear relationships 330–337

regression tree approach 306–309

relationships

curvilinear 330–337

exploring with Graph Builder 95–98

non-linear 330–337

visualizing multiple 101–103

Relative Frequency method, of assigning probabilities 108

re-launching analysis 67

representativeness, of data 25–28

residuals, normality in 294

residuals analysis

about 267, 268–269

applications 280–283

conditions for least squares estimation 267–268

constant variance 273

curvature 270

in Fit Model platform 304–306

independence 274–276

influential observations 270–272

linearity 269–270

normality 272–273

residuals estimation

about 267, 276

applications 280–283

conditions for least squares estimation 267–268

confidence intervals for parameters 276–277

confidence intervals for Y|X 278–279

prediction intervals for Y|X 279

response combinations, to bivariate data 62

response surface 362

response surface designs 375–379

row states 14–16

Rsquare (r²) 77

Run Chart 386

Rydén, Jesper 258

S

sales lift 376

sample mean, sampling distribution of 152–154

sample proportion, sampling distribution of 148–151

sampling and sampling distributions

about 23–25, 143, 167–168

applications 159–162

Central Limit Theorem (CLT) 154–156

clustering 157–159

complex sampling 157–159

cross-sectional sampling 28–29

defined 2

methods of sampling 144–145

non-random 28

reasons for sampling 143–144

of sample mean 152–154

simple random sampling (SRS) 25–28, 144–145, 145–148

from simulation to generalization 151–152

stratification 157–159

time series sampling 24–25, 28–29

using JMP to select simple random samples 145–148

variability across samples 148–159

sampling error 25

sampling frame 25, 145

sampling variability, ordinary and extraordinary 167–171

sampling weights, comparing two means with JMP 221

saving 20–21

scatterplot 75–76, 78–81

screening 361

script 148

season pattern 341

selected rows 14

session script, saving 21

shadowgrams 51, 126

shape, of distributions 46–47

Shewhart, Walter 405n1

Shewhart Charts

See control charts

shortest half bracket 55

sidereal period of orbit 331

significance testing

about 173

conducting with JMP 174–179, 190–196

simple exponential smoothing 346–348

Simple Moving Average 344–345

simple random sampling (SRS) 25–28, 144–145, 145–148

simple regression model 253–255

simulating

to generalization 151–152

random variation with JMP 116–117

smoothing methods

about 344

linear exponential smoothing (Holt’s Method) 348–349

simple exponential smoothing 346–348

Simple Moving Average 344–345

Winters’ Method 349–350

solving

cumulative probability problems 129–132

inverse cumulative problems 132–133

split plot experiment 199

SRS (simple random sampling) 25–28, 144–145, 145–148

standard deviation 54–55

standard error 155

Standard Normal Distribution 127

starting JMP 3–4

statistics

See descriptive statistics

stratification 157–159

study design 29–36

Subjective method, of assigning probabilities 108

summary data 36–37, 182–183

Summary of Fit 256–257

summary statistics, for single variables 53–55

survey data 33–36

T

t distributions 155–156, 164–165

Table variable note 6

tables, linked 50

See also data tables

tails, in continuous distributions 129

Test Means command 197

testing, for slopes other than zero 258–261

theoretical discrete distribution 111

time series sampling 24–25, 28–29

time-series data 90

transforming the variable 291

treatment effect 229

trend analysis 350–352

trend pattern 341

t-tests 257–258

Tukey’s HSD (Honestly Significant Difference) 235–236, 237–238

two-sample inference, for continuous variables 217–228

two-way analysis of variance 238–245

two-way table 68–71

Type I error 183

Type II error 183–184

U

unequal variances, compared with equal variances 221–222

uniform scaling option 50

union of two events 107

univariate descriptions 92–94

univariate inference

about 285

life expectancy by GDP per capita 291–293

life expectancy by income group 287–291

research context 285

unusual observations, of distributions 46, 47–50

upper fences 55

V

values

assigning probability 108

frequency of 46

variability, across samples 148–159

variability charts 395–397

variables

See also bivariate data; categorical variables; continuous variables

about 39

applications 82–85

defined 2

descriptive statistics 87

dichotomous dependent 327–330

dichotomous independent 323–327

distributions of 40–41

dummy 323–327

indicator 323–327

quantitative 45–46

summary statistics for single 53–55

transforming 291

types of 40–41

variance

heterogeneity of 273

homogeneity of 273

variances, comparing two with JMP 224–226

variation, in quality improvement 385–386

visualizing

multiple regression 296–298

multiple relationships 101–103

W

WDI (World Development Indicators) 87–88

weighting 157

Welch’s test 233

whiskers 55

whole model test 330

Wilcoxon Signed Rank Test 197

Wilson Estimator 180

Winters, Peter 349

Winters’ Method 349–350

word-processor documents, exporting JMP results to 16–20

World Development Indicators (WDI) 87–88

Y

Y-hat 278

Y|X

confidence intervals for 278–279

prediction intervals for 279

Z

z-scores 127
