- σ (standard deviation of an entire population), 76, 313–314
- 25th percentile (first quartile/Q1), 88
- 75th percentile (third quartile/Q3), 88
A
- ACT scores, examining, 86–88
- addition rule
- about, 180–182
- simplifying with mutually exclusive events, 185
- age trend projection example, for pie charts, 103–108
- alternative hypothesis, 42, 343
- analyzing
- data from experiments, 425–427
- results from surveys, 405
- anecdotes, 21
- anonymity, for surveys, 403
- asterisk (*), 144
- average. See mean
- avoiding probability misconceptions, 189–190
- axes, in histograms, 128–129
B
- bar graphs
- about, 108
- evaluating, 112
- lotto example, 110
- pet peeves example, 111–116
- scales on, 110–111
- transportation expenses example, 108–110
- behavior, studying using t-tables, 258
- bell curve, 38
- best-fitting line. See regression line
- bias
- avoiding in experiments, 422
- defined, 32
- binomial distribution
- about, 199
- checking, 204
- finding
- binomial probabilities using formulas, 207–210
- probabilities using binomial tables, 210–212
- identifying binomials, 203–206
- independence of trials, 205
- mean of binomials, 212–213
- number of trials, 204
- practice questions answers and explanations, 214–215
- probability of success (p) changes, 205–206
- quiz, 216–217
- standard deviation of binomials, 212–213
- success vs. failure, 205
- binomial table
- finding binomial probabilities using, 210–212
- illustrated, 507–512
- binomials
- checking conditions, 204
- identifying, 203–206
- mean of, 212–213
- normal approximation to, 236–238
- standard deviation of, 212–213
- blind experiment, 30, 425
- borderline values, in histograms, 128
- boundaries, setting for rejecting H0, 350
- boxplots
- about, 143
- creating, 143–145
- examples of, 146–151
- interpreting, 146–151
- British Medical Journal, 20
C
- calculating
- center, 65–74
- chance of wrong decisions, 352–355
- conditional distributions, 478–479
- confidence intervals for population means, 315–318
- correlation, 440–441
- joint distributions, 476–477
- margin of error
- for sample means, 293–294
- for sample proportions, 291–292
- marginal distributions, 473
- mean, 66–68
- median, 68–70
- percentiles
- about, 83–84
- for normal distribution, 232–233
- for t-distribution, 256–257
- p-values, 347–349
- relative standing with percentiles, 83–89
- sample variability, 289–291
- standard deviation, 75–76
- standard error, 266–269
- test statistic, 345
- totals for two-way tables, 469–471
- variability
- with IQR, 147–148
- using standard errors, 344–345
- categorical data
- about, 34, 49, 99–100
- bar graphs, 108–116
- descriptive statistics, 49–50
- example questions, 51, 52, 56, 104–105, 113
- frequency, 50–52
- graphing, 99–124
- interpreting counts, 55–56
- percents, 50–56
- pie charts, 100–108
- practice questions answers and explanations, 57–59, 117–122
- quiz, 60–63, 123–124
- tables, 50–56
- two-way tables, 54–55
- cause-and-effect relationship
- about, 21, 435
- checking for, 489–490
- compared with correlation, 458–459
- in observational studies, 415
- questioning claims of, 44–45
- cells, setting up in two-way tables, 469
- center
- finding using median, 148
- measuring, 65–74
- Central Limit Theorem (CLT). See also sampling distributions
- about, 38–39, 263
- sampling distribution and, 269–272
- Cheat Sheet (website), 3
- checking
- binomial conditions, 204
- for cause and effect, 489–490
- conditions, 446, 453
- independence, 483–486
- independence for events, 182–183
- match, 16
- shape of boxplots, 146–147
- sources, 20–21
- Chi-square test, 480
- claims
- about, 341–342
- assessing chance of wrong decisions, 352–355
- compiling evidence, 344–345
- decision-making, 346–349
- example questions, 348, 351, 354
- gathering good evidence, 343–344
- making conclusions, 349–352
- practice questions answers and explanations, 356–358
- p-values, 346–349
- quiz, 359–360
- setting up hypotheses, 342–343
- test statistic, 344–345
- weighing evidence, 346–349
- clarifying survey purpose, 396
- clinical trials, 423
- collecting
- data
- about, 343–344
- from experiments, 424–425
- for surveys, 402
- quality data, 30–32
- sample statistics, 344
- comparing
- designing experiments for making comparisons, 417–419
- household incomes, 85–86
- independence and exclusivity, 186
- marginal and conditional distributions, 485–486
- means and medians, 70–74
- results of two conditional distributions, 484–485
- two (independent) population averages, 371–374
- two population proportions, 378–381
- compiling evidence, 344–345
- complements
- about, 173
- complement rule, 178–179
- probabilities, 176
- conclusions
- drawing
- about, 44–45, 349–352
- appropriate, 428–429
- from surveys, 405–406
- using critical value, 366–368
- conditional distributions
- about, 478
- calculating, 478–479
- comparing
- with marginal distributions, 485–486
- results of two, 484–485
- conditional probabilities, 176–177
- conditions, checking, 446, 453
- confidence intervals
- about, 38, 40–41, 305, 426
- calculating for population means, 315–318
- creating
- for difference of two means, 322–325
- for one population proportion, 319–321
- sample size needed, 318–319
- defined, 307
- for the difference of two population means (μ₁ − μ₂), 322–325
- for the difference of two population proportions (p₁ − p₂), 326–329
- estimates and, 306
- estimating difference of two proportions, 326–329
- example questions, 309, 314, 317, 321, 324–325, 327–328
- finding misleading, 329
- interpreting results, 308–309
- linking statistics to parameters, 306–307
- for the population proportion (p), 319–321
- population variability and, 313–314
- practice questions answers and explanations, 330–336
- quiz, 337–340
- sample size and, 312–313
- selecting, 310–311
- selecting values for, 257–258
- terminology for, 307–308
- width of, 310
- confidentiality, for surveys, 403
- confounding variables (confounders), controlling for, 422–423
- connecting
- statistics to parameters, 306–307
- test statistics and p-values, 346–347
- continuous data, 33
- continuous random variables, 200–201
- control group, 29–30, 418
- correlation analysis
- about, 43–44, 426, 435
- calculating, 440–441
- compared with cause and effect, 458–459
- defined, 486
- example questions, 438, 443, 448, 451, 455
- interpreting, 441–443
- practice questions answers and explanations, 460–464
- properties of, 443–445
- quantifying linear relationships using, 440–445
- quiz, 465–466
- scatterplots, 436–439
- correlation coefficient, 440
- countably infinite sample spaces, 170, 200–201
- counts, interpreting, 55–56
- cover letter, 397–398
- creating
- boxplots, 143–145
- conclusions, 349–352
- confidence intervals for difference of two means, 322–325
- histograms, 126–130
- informed decisions, 429–430
- predictions, 453–456, 491
- questions for surveys, 398–399
- scatterplots, 436–437
- crime statistics, 17–18
- criteria, for good experiments, 417
- critical value (z*-value)
- defined, 311
- drawing conclusions using, 366–368
D
- data
- about, 33–34
- analyzing from experiments, 425–427
- categorical
- about, 34, 49, 99–100
- bar graphs, 108–116
- descriptive statistics, 49–50
- example questions, 51, 52, 56, 104–105, 113
- frequency, 50–52
- graphing, 99–124
- interpreting counts, 55–56
- percents, 50–56
- pie charts, 100–108
- practice questions answers and explanations, 57–59, 117–122
- quiz, 60–63, 123–124
- tables, 50–56
- two-way tables, 54–55
- collecting
- about, 30–32
- from experiments, 424–425
- for surveys, 402
- compiling, 344–345
- continuous, 33
- defined, 2
- discrete, 33
- gathering, 343–344
- numerical
- about, 33–34, 65, 99, 125
- examining boxplots, 143–151
- example questions, 128–129, 133–134, 138, 142, 145, 150–151, 155–156
- graphing, 125–166
- handling histograms, 126–143
- handling time charts, 152–158
- practice questions answers and explanations, 159–164
- quiz, 165–166
- quality of, 30–32
- reliability of, 424
- shape of, 130–132
- simplifying excess, 153–158
- time series, 152
- unbiased, 425
- validity of, 424–425
- data set, 34
- debates, statistics in, 17–18
- decision-making
- about, 346–349
- assessing chance of wrong decisions, 352–355
- making informed decisions, 429–430
- defining
- null hypothesis, 342
- p-values, 347
- sample size, 420
- sampling distributions, 264–265
- target population for surveys, 396–397
- degrees of freedom (df), 252
- dependent relationships, 486–488
- dependent variables, 446
- descriptive statistics, 49–50
- designing
- experiments, 29–30, 417–427
- polls, 28–29
- studies, 28–30
- surveys, 28–29, 396–399
- determining
- binomial probabilities
- using binomial table, 210–212
- using formulas, 207–210
- center using median, 148
- confidence intervals for one population proportion, 319–321
- impact of sample size, 296–299
- margin of error, 289–296
- misleading confidence intervals, 329
- misleading histograms, 139–143
- misleading time charts, 153–158
- probabilities
- for normal distributions, 227–230
- for sample mean, 273–274
- for sample proportion, 278–279
- for specific values of X, 210–211
- with t-tables, 253–255
- for X greater-than, less-than, or between two values, 211–212
- for Z with Z-table, 225–226
- sample size needed, 318–319
- slope of a line, 447
- variables with marginal distributions, 472–476
- volunteers for experiments, 421
- which variable is X and which is Y, 445–446
- X when you know the percent, 232–235
- y-intercept, 447–449
- discrete data, 33
- discrete random variables
- about, 200–201
- mean of, 202–203
- variance of, 202–203
- distributions
- about, 38
- binomial
- about, 199
- checking, 204
- finding binomial probabilities using formulas, 207–210
- finding probabilities using binomial tables, 210–212
- identifying binomials, 203–206
- independence of trials, 205
- mean of binomials, 212–213
- number of trials, 204
- practice questions answers and explanations, 214–215
- probability of success (p) changes, 205–206
- quiz, 216–217
- standard deviation of binomials, 212–213
- success vs. failure, 205
- conditional, 478–486
- defined, 264
- joint, 476–478
- marginal, 472–476, 485–486
- normal
- about, 37, 38, 219
- basics of, 219–222
- calculating percentile for, 232–233
- example questions, 222, 225, 229, 231, 234, 237
- finding probabilities for, 227–230
- finding X when you know the percent, 232–235
- normal approximation to the binomial, 236–238
- percentiles, 230–231
- practice questions answers and explanations, 239–247
- quiz, 248–250
- standard normal (Z-) distribution, 223–226
- sampling
- about, 263, 269
- defining, 264–265
- example questions, 272, 274, 277, 279
- finding probabilities for sample mean, 273–274
- finding probabilities for sample proportion, 278–279
- mean of, 265–266
- measuring standard error, 266–269
- practice questions answers and explanations, 280–282
- quiz, 283–284
- of sample proportion, 275–277
- shape of, 269–272
- double-blind experiment, 30, 425
- drawing conclusions
- about, 44–45
- from surveys, 405–406
- using critical value, 366–368
E
- Empirical Rule (68-95-99.7)
- about, 79–83
- using, 138–139
- empty sets, 172
- error, 266
- error of omission, 16
- estimating
- about, 306
- difference of two proportions, 326–329
- Ethical Review Board (ERB), 424
- ethics
- respecting in experiments, 423–424
- for surveys, 397–398
- evaluating
- bar graphs, 112
- pie charts, 104
- time charts, 156
- using conditional probabilities, 177
- events
- checking independence for, 182–183
- independence in multiple, 182–184
- multiplication rule for independent, 183–184
- mutually exclusive, 184–185
- probabilities of, 173–177
- as subsets of sample spaces, 171–172
- evidence
- compiling, 344–345
- gathering, 343–344
- weighing, 346–349
- examining ACT scores, 86–88
- Example icon, 3
- example questions
- categorical data, 51, 52, 56, 104–105, 113
- claims, 348, 351, 354
- confidence intervals, 309, 314, 317, 321, 324–325, 327–328
- correlation, 438, 443, 448, 451, 455
- experiments, 416, 426, 430
- hypothesis tests, 367, 370, 373, 374, 377–378, 381
- independence, 470, 475, 481–482, 487
- margin of error, 290, 295, 298
- mean, 67–68, 69–70, 72–74, 78–79, 81–82, 84, 88–89
- median, 67–68, 69–70, 72–74, 78–79, 81–82, 84, 88–89
- normal distribution, 222, 225, 229, 231, 234, 237
- numerical data, 128–129, 133–134, 138, 142, 145, 150–151, 155–156
- observational studies, 416, 426, 430
- polls, 399, 401, 404, 406
- probability, 180–181, 188, 189, 190
- regression analysis, 438, 443, 448, 451, 455
- sampling distributions, 272, 274, 277, 279
- t-distributions, 255, 256–257
- two-way tables, 470, 475, 481–482, 487
- exclusivity, compared with independence, 186
- experiments. See also observational studies
- about, 413–414, 415–417
- analyzing data from, 425–427
- avoiding bias in, 422
- collecting data from, 424–425
- controlling confounding variables, 422–423
- criteria for good, 417
- designing, 29–30, 417–427
- example questions, 416, 426, 430
- finding volunteers for, 421
- interpreting results of, 428–430
- in observational studies, 414
- practice questions answers and explanations, 431–432
- quiz, 433–434
- random assignments for, 421–422
- respecting ethical issues in, 423–424
- selecting
- sample size for, 419–420
- subjects for, 421
F
- factor, in observational studies, 414
- fake data, 22
- fake treatments, 418–419
- Federal Drug Administration (FDA), 424
- finding
- binomial probabilities
- using binomial table, 210–212
- using formulas, 207–210
- center using median, 148
- confidence intervals for one population proportion, 319–321
- impact of sample size, 296–299
- margin of error, 289–296
- misleading confidence intervals, 329
- misleading histograms, 139–143
- misleading time charts, 153–158
- probabilities
- for normal distributions, 227–230
- for sample mean, 273–274
- for sample proportion, 278–279
- for specific values of X, 210–211
- with t-tables, 253–255
- for X greater-than, less-than, or between two values, 211–212
- for Z with Z-table, 225–226
- sample size needed, 318–319
- slope of a line, 447
- variables with marginal distributions, 472–476
- volunteers for experiments, 421
- which variable is X and which is Y, 445–446
- X when you know the percent, 232–235
- y-intercept, 447–449
- finite sample spaces, 170
- first quartile. See 25th percentile (first quartile/Q1)
- five-number summary, 88–89
- following up, after surveys, 403–405
- formulas
- finding binomial probabilities using, 207–210
- solving
- conditional probabilities with, 176–177
- conditional probabilities without, 176
- frequency, 50–52
G
- gathering
- data
- about, 343–344
- from experiments, 424–425
- for surveys, 402
- quality data, 30–32
- sample statistics, 344
- generalizing results, 429
- generating
- boxplots, 143–145
- conclusions, 349–352
- confidence intervals for difference of two means, 322–325
- histograms, 126–130
- informed decisions, 429–430
- predictions, 453–456, 491
- questions for surveys, 398–399
- scatterplots, 436–437
- graphing
- categorical data, 99–124
- conditional distributions, 479–483
- joint distributions, 477–478
- marginal distributions, 473–476
- numerical data, 125–166
- greater-than probabilities
- about, 253–254
- finding for X, 211–212
- groups, quantity of, 139–141
H
- H0, setting boundaries for rejecting, 350
- handling
- confounding variables, 422–423
- histograms, 126–143
- negative t-values, 365
- small samples, 363–368
- unknown standard deviations, 363–368
- histograms
- about, 70–74, 126
- creating, 126–130
- detecting misleading, 139–143
- examples of, 126–130
- interpreting, 130–136
- time charts compared with, 153
- using, 137–139
- household incomes, comparing, 85–86
- hypotheses, setting up, 342–343
- hypothesis tests
- about, 38, 41, 361, 426
- comparing
- two (independent) population averages, 371–374
- two population proportions, 378–381
- paired t-test, 375–378
- practice questions answers and explanations, 382–386
- quiz, 387–388
- testing
- for average difference, 375–378
- one population mean, 362–363
- one population proportion, 368–370
I
- icons, explained, 3
- identifying binomials, 203–206
- in-bounds, staying, 454–456
- including mutually exclusive events, 184–185
- independence. See also two-way tables
- about, 467
- checking
- about, 483–486
- for events, 182–183
- compared with exclusivity, 186
- example, 187–188
- example questions, 470, 475, 481–482, 487
- in multiple events, 182–184
- multiplication rule for independent events, 183–184
- practice questions answers and explanations, 492–497
- quiz, 498–502
- of trials, 205
- Independent Ethics Committee (IEC), 424
- independent variables, 446
- inequalities, 171
- inflection point, 38
- informed decisions, making, 429–430
- Institutional Review Board (IRB), 424
- interpreting
- boxplots, 146–151
- correlation, 441–443
- counts, 55–56
- experiment results, 428–430
- histograms, 130–136
- percentiles, 85–88
- percents, 55–56
- regression lines, 449–451
- results
- about, 308–309, 489–491
- from surveys, 405–407
- scatterplots, 437–439
- slope of lines, 449–450
- standard deviation, 76
- test statistic, 345
- time charts, 152
- two-way tables, 472–483
- y-intercept, 450–451
- interquartile range (IQR)
- about, 78, 89
- measuring variability using, 133–134, 147–148
- intersection (joint) probabilities, 175
- intersections
- about, 173
- multiplication rule for, 179–180
- interval, 307
J
- joint distributions
- about, 476
- calculating, 476–477
- graphing, 477–478
- joint (intersection) probabilities, 175
- Journal of the American Medical Association (JAMA), 20
L
- leading questions, 398
- least-squares method, simple linear regression analysis using, 446
- less-than, finding probabilities for X, 211–212
- level, in observational studies, 414
- line graph. See time charts
- linear regression
- about, 445
- calculating regression line, 446–449
- checking conditions, 446
- determining which variable is X and which is Y, 445–446
- example regression line, 451–452
- interpreting regression line, 449–451
- using least-squares method, 446
- linear relationships
- about, 437
- quantifying using correlation, 440–445
- lines
- lottery statistics, 18–20
- lotto example
- for bar graphs, 110
- for pie charts, 101–102
M
- margin of error (MOE)
- about, 39–40, 287
- calculating
- for sample means, 293–294
- for sample proportions, 291–292
- confidence level and, 294–296
- defined, 307
- determining impact of sample size, 296–299
- example questions, 290, 295, 298
- finding, 289–296
- measuring sample variability, 289–291
- plus or minus, 287–288
- practice questions answers and explanations, 300–302
- quiz, 303–304
- reporting results, 293
- marginal column totals, 469
- marginal distributions
- calculating, 473
- comparing with conditional distributions, 485–486
- finding variables with, 472–476
- graphing, 473–476
- marginal probabilities, 175
- marginal row totals, 469
- marginal totals, 469
- match, checking, 16
- matched-pairs design, 423
- maximum, 88
- mean
- about, 36, 65
- of binomials, 212–213
- compared with median, 70–74, 132–133
- creating confidence intervals for difference of two, 322–325
- of discrete random variables, 202–203
- Empirical Rule (68-95-99.7), 79–83
- example questions, 67–68, 69–70, 72–74, 78–79, 81–82, 84, 88–89
- finding probabilities for sample, 273–274
- measuring
- about, 66–68
- relative standing with percentiles, 83–89
- practice questions answers and explanations, 90–96
- quiz, 97–98
- of sampling distributions, 265–266
- variation and, 74–79
- measuring
- center, 65–74
- chance of wrong decisions, 352–355
- conditional distributions, 478–479
- confidence intervals for population means, 315–318
- correlation, 440–441
- joint distributions, 476–477
- margin of error
- for sample means, 293–294
- for sample proportions, 291–292
- marginal distributions, 473
- mean, 66–68
- median, 68–70
- percentiles
- about, 83–84
- for normal distribution, 232–233
- for t-distribution, 256–257
- p-values, 347–349
- relative standing with percentiles, 83–89
- sample variability, 289–291
- standard deviation, 75–76
- standard error, 266–269
- test statistic, 345
- totals for two-way tables, 469–471
- variability
- with IQR, 147–148
- using standard errors, 344–345
- median (50th percentile)
- about, 36, 65
- compared with mean, 70–74, 132–133
- defined, 88
- Empirical Rule (68-95-99.7), 79–83
- example questions, 67–68, 69–70, 72–74, 78–79, 81–82, 84, 88–89
- finding center using, 148
- measuring
- about, 68–70
- relative standing with percentiles, 83–89
- practice questions answers and explanations, 90–96
- quiz, 97–98
- variation and, 74–79
- medical breakthroughs, 413–414
- minimum, 88
- misconceptions, probability, 189–190
- misleading confidence intervals, finding, 329
- misleading histograms, detecting, 139–143
- misleading questions, 398
- misleading statistics, 17–22, 23
- missing data, 22
- MOE (margin of error)
- about, 39–40, 287
- calculating
- for sample means, 293–294
- for sample proportions, 291–292
- confidence level and, 294–296
- defined, 307
- determining impact of sample size, 296–299
- example questions, 290, 295, 298
- finding, 289–296
- measuring sample variability, 289–291
- plus or minus, 287–288
- practice questions answers and explanations, 300–302
- quiz, 303–304
- reporting results, 293
- multiplication rule
- about, 179–180
- for independent events, 183–184
- mutually exclusive events, including, 184–185
N
- New England Journal of Medicine, 20
- no treatment, 419
- nonrandom samples, 31
- normal approximation, to binomials, 236–238
- normal distribution
- about, 37, 38, 219
- basics of, 219–222
- calculating percentile for, 232–233
- example questions, 222, 225, 229, 231, 234, 237
- finding
- probabilities for, 227–230
- X when you know the percent, 232–235
- normal approximation to the binomial, 236–238
- percentiles, 230–231
- practice questions answers and explanations, 239–247
- quiz, 248–250
- sampling distribution and, 269
- standard normal (Z-) distribution, 223–226
- not-equal-to alternative, 365
- null hypothesis, 42, 342
- numerical data
- about, 33–34, 65, 99, 125
- examining boxplots, 143–151
- example questions, 128–129, 133–134, 138, 142, 145, 150–151, 155–156
- graphing, 125–166
- handling
- histograms, 126–143
- time charts, 152–158
- practice questions answers and explanations, 159–164
- quiz, 165–166
O
- observational studies. See also experiments
- about, 413–414
- basics of, 414–417
- example questions, 416, 426, 430
- observing, 415
- practice questions answers and explanations, 431–432
- quiz, 433–434
- terminology for, 414–415
- observing observational studies, 415
- omission, error of, 16
- opposites, complement rule for, 178–179
- organizing
- results from surveys, 405
- two-way tables, 468
- outcomes, 205
- outliers
- defined, 36, 67
- finding in boxplots, 148–149
- output, from regression analysis, 456–457
- overstated results, 44, 428
P
- paired differences, 376
- paired t-tests, 41, 375–378
- parameters
- about, 35–36
- linking statistics to, 306–307
- participants, in observational studies, 414
- percentiles
- about, 37, 52–53, 230–231
- calculating
- about, 83–84
- for normal distribution, 232–233
- relative standing with, 83–89
- for t-distribution, 256–257
- finding X when you know the percent, 232–235
- five-number summary, 88–89
- interpreting, 55–56, 85–88
- interquartile range, 89
- personal expenses example, for pie charts, 100–101
- pie charts
- about, 100
- age trend projection example, 103–108
- evaluating, 104
- lotto revenue example, 101–102
- personal expenses example, 100–101
- takeout order example, 102–103
- placebo effect, 30, 418–419
- planning surveys, 396–399
- plus or minus, margin of error and, 287–288
- polls
- about, 391
- designing, 28–29
- example questions, 399, 401, 404, 406
- impact of, 392–395
- practice questions answers and explanations, 408–410
- quiz, 411–412
- surveys, 395–407
- pollsters, 392
- population
- about, 34–35
- projecting from samples to, 490–491
- variability in, 313–314
- population averages, comparing two (independent), 371–374
- population means
- calculating confidence intervals for, 315–318
- defined, 36
- difference of two
- when population standard deviation known, 371–374
- when population standard deviation unknown, 374
- population proportions
- comparing two, 378–381
- determining confidence intervals for one, 319–321
- testing one, 368–370
- population standard deviation
- difference of two population means
- when known, 371–374
- when unknown, 374
- standard error and, 267–269
- practice questions answers and explanations
- binomial distribution, 214–215
- categorical data, 57–59, 117–122
- claims, 356–358
- confidence intervals, 330–336
- correlation, 460–464
- experiments, 431–432
- hypothesis tests, 382–386
- independence, 492–497
- margin of error, 300–302
- mean, 90–96
- median, 90–96
- normal distribution, 239–247
- numerical data, 159–164
- observational studies, 431–432
- polls, 408–410
- probability, 192–195
- random variables, 214–215
- regression analysis, 460–464
- sampling distributions, 280–282
- t-distributions, 259
- two-way tables, 492–497
- predictions
- making, 453–456, 491
- using probability, 190–191
- probabilities
- about, 169
- avoiding misconceptions, 189–190
- defined, 42, 170
- distinguishing independent from mutually exclusive events, 186–188
- of events involving A and/or B, 173–177
- example questions, 180–181, 188, 189, 190
- finding
- for normal distributions, 227–230
- for sample mean, 273–274
- for sample proportion, 278–279
- for specific values of X, 210–211
- with t-tables, 253–255
- using binomial table, 210–212
- using formulas, 207–210
- for X greater-than, less-than, or between two values, 211–212
- for Z with Z-table, 225–226
- including mutually exclusive events, 184–185
- independence of multiple events, 182–184
- notation for, 174–175
- practice questions answers and explanations, 192–195
- predictions using, 190–191
- quiz, 196–197
- rules of, 178–182
- set notation, 169–173
- probability distribution (p(x)), 202
- probability models, 187
- probability of success (p), changes in, 205–206
- properties
- of correlation, 443–445
- of standard deviation, 77
- proportions, estimating difference of two, 326–329
- p-values
- about, 42, 346, 490–491
- calculating, 347–349
- connecting test statistics and, 346–347
- defining, 347
Q
- qualitative data. See categorical data
- quantifying linear relationships using correlation, 440–445
- quantity, of groups, 139–141
- questioning claims of cause and effect, 44–45
- questions, formulating for surveys, 398–399
- quizzes
- binomial distribution, 216–217
- categorical data, 60–63, 123–124
- claims, 359–360
- confidence intervals, 337–340
- correlation, 465–466
- experiments, 433–434
- hypothesis tests, 387–388
- independence, 498–502
- margin of error, 303–304
- mean, 97–98
- median, 97–98
- normal distribution, 248–250
- numerical data, 165–166
- observational studies, 433–434
- polls, 411–412
- probability, 196–197
- random variables, 216–217
- regression analysis, 465–466
- sampling distributions, 283–284
- t-distributions, 260–261
- two-way tables, 498–502
R
- random assignments, for experiments, 421–422
- random digit dialing (RDD), 31
- random number generators, 32
- random samples
- about, 31–32
- for surveys, 400
- random variables
- about, 199–200
- defined, 264
- discrete, 202–203
- discrete vs. continuous, 200–201
- mean of discrete, 202–203
- practice questions answers and explanations, 214–215
- probability distributions, 202
- quiz, 216–217
- variance of discrete, 202–203
- Randomized Controlled Trial (RCT), in observational studies, 414
- range, 78–79
- RDD (random digit dialing), 31
- regression, 43–44, 435
- regression analysis
- about, 426
- example questions, 438, 443, 448, 451, 455
- making predictions, 453–456
- output, 456–457
- practice questions answers and explanations, 460–464
- quiz, 465–466
- residuals, 454, 457–458
- regression line
- about, 446–447
- defined, 445
- interpreting, 449–451
- relationships
- cause-and-effect
- about, 21, 435
- checking for, 489–490
- compared with correlation, 458–459
- in observational studies, 415
- questioning claims of, 44–45
- dependent, 486–488
- scatterplots and, 436–439
- relative standing
- about, 37
- measuring with percentiles, 83–89
- reliability, of data, 424
- Remember icon, 3
- reporting
- results, 293
- standard deviation, 75–77
- residuals, 454, 457–458
- respecting ethical issues in experiments, 423–424
- response, in observational studies, 414
- response bias, 402
- results
- comparing for two conditional distributions, 484–485
- generalizing, 429
- interpreting
- about, 308–309, 489–491
- from experiments, 428–430
- from surveys, 405–407
- overstating, 428
- reporting, 293
- right-tail probability, 253–254
- rules of probability, 178–182
- Rumsey, Deborah (author), Statistics II For Dummies, 415, 446, 454, 480
S
- s (standard deviation)
- about, 36–37
- of binomials, 212–213
- calculating, 75–76
- defined, 2
- handling unknown, 363–368
- interpreting, 76
- lobbying for, 77
- measuring variability using, 133
- properties of, 77
- reporting, 75–77
- sample means, calculating margin of error for, 293–294
- sample proportion
- calculating margin of error for, 291–292
- finding probabilities for, 278–279
- sampling distribution of, 275–277
- sample size
- about, 21
- confidence intervals and, 312–313
- determining
- impact of, 296–299
- needs for, 318–319
- margin of error and, 296–297
- selecting for experiments, 419–420
- standard error and, 266–267
- for surveys, 400–401
- sample spaces, 170–172
- sample statistics, gathering, 344
- sample variability, measuring, 289–291
- sample variance, 75
- samples
- about, 30–32
- defined, 420
- handling small, 363–368
- projecting to populations from, 490–491
- selecting for surveys, 399–402
- sampling distributions
- about, 263
- defining, 264–265
- example questions, 272, 274, 277, 279
- finding probabilities
- for sample mean, 273–274
- for sample proportion, 278–279
- mean of, 265–266
- measuring standard error, 266–269
- practice questions answers and explanations, 280–282
- quiz, 283–284
- of sample proportion, 275–277
- shape of, 269–272
- sampling error, 39–40
- scales
- on bar graphs, 110–111
- in histograms, 141–143
- of time charts, 153
- scatterplots
- about, 436
- creating, 436–437
- interpreting, 437–439
- scientific method, 26
- scientific surveys, 22
- selecting
- confidence levels, 310–311
- sample size for experiments, 419–420
- samples for surveys, 399–402
- subjects for experiments, 421
- survey time/type, 397
- values for confidence intervals, 257–258
- self-selected sample, 31
- set notation
- about, 169
- complements, 173
- empty sets, 172
- events, 171–172
- inequalities, 171
- intersections, 173
- sample spaces, 170
- unions, 172–173
- setting boundaries for rejecting H0, 350
- setup
- cells in two-way tables, 469
- hypotheses, 342–343
- 75th percentile (third quartile/Q3), 88
- shape
- of boxplots, 146–147
- of data, 130–132
- of sampling distributions, 269–272
- simplifying addition rule with mutually exclusive events, 185
- 68-95-99.7 Rule (Empirical Rule)
- about, 79–83
- using, 138–139
- skepticism, 45
- skewed left histogram, 131, 132, 137–138
- skewed right histogram, 131, 132, 137
- slope, of lines, 446, 447, 449–450
- solving conditional probabilities
- with formulas, 176–177
- without formulas, 176
- sources
- about, 392
- checking, 20–21
- stacked bar graph, 479–480
- standard deviation (s)
- about, 36–37
- of binomials, 212–213
- calculating, 75–76
- defined, 2
- handling unknown, 363–368
- interpreting, 76
- lobbying for, 77
- measuring variability using, 133
- properties of, 77
- reporting, 75–77
- standard deviation of an entire population (σ), 76, 313–314
- standard errors
- defined, 311
- measuring
- about, 266–269
- variability using, 344–345
- population standard deviation and, 267–269
- sample size and, 266–267
- standard normal distribution (Z-distribution), 38, 39
- standard scores, 37–38, 345
- standard treatments, 419
- standardizing, from X to Z, 223–225
- start/end points, of time charts, 153
- start/finish lines, in histograms, 141–143
- statistical jargon, 33–44
- statistical significance, 42–43, 349
- statistics
- about, 7, 35
- defined, 2, 25
- descriptive, 49–50
- in everyday life, 25–26
- linking to parameters, 306–307
- media and, 8–13
- misleading, 17–22
- as more than numbers, 26–28
- using at work, 13–14
- Statistics II For Dummies (Rumsey), 415, 446, 454, 480
- studies, designing, 28–30
- subjects, choosing for experiments, 421
- sum of squares error (SSE), 447
- surveys
- about, 393–394, 395
- anonymity and confidentiality for, 403
- carrying out, 402–405
- choosing time/type for, 397
- clarifying purpose of, 396
- collecting data for, 402
- defining target population, 396–397
- designing, 28–29, 396–399
- ethics and, 397–398
- examples of, 394–395
- following up after, 403–405
- formulating questions for, 398–399
- interpreting results from, 405–407
- planning, 396–399
- random samples for, 400
- sample size for, 400–401
- selecting samples for, 399–402
- symmetric histograms, 130, 132, 137
T
- t9, 252
- tables
- binomial, 210–212, 507–512
- t-table, 505–506
- two-way
- about, 43–44, 54–55, 467
- calculating totals, 469–471
- example questions, 470, 475, 481–482, 487
- finding variables with marginal distributions, 472–476
- interpreting, 472–483
- organizing, 468
- practice questions answers and explanations, 492–497
- quiz, 498–502
- setting up cells, 469
- takeout order example, for pie charts, 102–103
- target population
- defining for surveys, 396–397
- survey samples and, 399–400
- t-distributions
- about, 251
- calculating percentiles for, 256–257
- compared with Z-distributions, 251–252
- effect of variability on, 252–253
- example questions, 255, 256–257
- finding probabilities with, 253–255
- practice questions answers and explanations, 259
- quiz, 260–261
- selecting values for confidence intervals, 257–258
- t-table, 253–258
- Technical Support, 4
- terminology, 33–44, 307–308, 414–415
- test statistics
- about, 345
- calculating, 345
- connecting p-values and, 346–347
- interpreting, 345
- testing
- for average difference, 375–378
- hypothesis
- about, 38, 41, 361, 426
- comparing two (independent) population averages, 371–374
- comparing two population proportions, 378–381
- defined, 342
- example questions, 367, 370, 373, 374, 377–378, 381
- handling small samples, 363–368
- handling standard deviations, 363–368
- paired t-test, 375–378
- practice questions answers and explanations, 382–386
- quiz, 387–388
- testing for average difference, 375–378
- testing one population mean, 362–363
- testing one population proportion, 368–370
- t-test, 363–368
- one population mean, 362–363
- one population proportion, 368–370
- third quartile. See 75th percentile (third quartile/Q3)
- time, choosing for surveys, 397
- time charts
- about, 152
- evaluating, 156
- examples of, 153–158
- finding misleading, 153–158
- histograms compared with, 153
- interpreting, 152
- variability and, 153
- time series data, 152
- Tip icon, 3
- tornado statistics, 18
- totals, calculating for two-way tables, 469–471
- transportation expenses example, for bar graphs, 108–110
- treatment group
- about, 29–30, 418
- in observational studies, 415
- trials
- independence of, 205
- number of, 204
- t-tables
- calculating percentiles for, 256–257
- finding probabilities with, 253–255
- illustrated, 505–506
- selecting values for confidence intervals, 257–258
- studying behavior using, 258
- using, 253
- t-test
- about, 363–364
- drawing conclusions using critical value, 366–368
- examining not-equal-to alternative, 365
- handling negative t-values, 365
- relating t to Z, 365
- using, 364
- t-values, handling negative, 365
- 25th percentile (first quartile/Q1), 88
- two-way tables. See also independence
- about, 43–44, 54–55, 467
- calculating totals, 469–471
- example questions, 470, 475, 481–482, 487
- finding variables with marginal distributions, 472–476
- interpreting, 472–483
- organizing, 468
- practice questions answers and explanations, 492–497
- quiz, 498–502
- setting up cells, 469
- type, choosing for surveys, 397
- Type I errors, 353
- Type II errors, 353–354
U
- unbiased data, 425
- uncountably infinite sample spaces, 170
- union probabilities, 175
- unions
- about, 172–173
- addition rule for, 180–182
V
- validity, of data, 424–425
- values, selecting for confidence intervals, 257–258
- variability
- effect of on t-distributions, 252–253
- measuring
- with IQR, 147–148
- using standard errors, 344–345
- in population, 313–314
- time charts and, 153
- viewing, 133–136
- variables
- about, 34
- determining which is X and which is Y, 445–446
- finding with marginal distributions, 472–476
- variance, of discrete random variables, 202–203
- variation
- about, 74
- range, 78–79
- standard deviation, 75–77
- verifying
- binomial conditions, 204
- for cause and effect, 489–490
- conditions, 446, 453
- independence, 483–486
- independence for events, 182–183
- math, 16
- shape of boxplots, 146–147
- sources, 20–21
- viewing variability, 133–136
- volunteers
- finding for experiments, 421
- sample for, 31
W
- Warning icon, 3
- websites
- Cheat Sheet, 3
- clinical trials, 423
- Technical Support, 4
- weighing evidence, 346–349
- width, of confidence intervals, 310
X
- X
- finding
- probabilities for specific values of, 210–211
- probabilities greater-than, less-than, or between two values for, 211–212
- when you know the percent, 232–235
- standardizing to Z from, 223–225
Y
- y-intercept, of lines, 446, 447–449, 450–451
- Your Turn icon, 3
Z
- Z
- relating t to, 365
- standardizing from X to, 223–225
- Z-distribution (standard normal distribution)
- about, 38, 223–225
- compared with t-distribution, 251–252
- Z-table
- finding probabilities for Z with, 225–226
- illustrated, 503–504
- z-values, 39