Show Points
|
Hides or shows the points in the scatterplot. A check mark indicates that points are shown.
|
Histogram Borders
|
Attaches histograms to the x- and y-axes of the scatterplot. A check mark indicates that histogram borders are turned on. See “Histogram Borders”.
|
Group By
|
Lets you select a classification (or grouping) variable. A separate analysis is computed for each level of the grouping variable, and regression curves or ellipses are overlaid on the scatterplot. See “Group By”.
|
Script
|
Contains options that are available to all platforms. These options enable you to redo the analysis or save the JSL commands for the analysis to a window or a file. For more information, see Using JMP.
|
Fit Mean
|
Adds a horizontal line to the scatterplot that represents the mean of the Y response variable.
|
See “Fit Mean”.
|
Fit Line
|
Adds straight line fits to your scatterplot using least squares regression.
|
|
Fit Polynomial
|
Fits polynomial curves of a certain degree using least squares regression.
|
|
Fit Special
|
Transforms Y and X. Transformations include: log, square root, square, reciprocal, and exponential. You can also turn off center polynomials, constrain the intercept and the slope, and fit polynomial models.
|
See “Fit Special”.
|
Fit Spline
|
Fits a smoothing spline that varies in smoothness (or flexibility) according to the lambda (λ) value. The λ value is a tuning parameter in the spline formula.
|
See “Fit Spline”.
|
Fit Each Value
|
Fits a value to each unique X value, which can be compared to other fitted lines, showing the concept of lack of fit.
|
See “Fit Each Value”.
|
Fit Orthogonal
|
Fits lines that adjust for variability in X as well as Y.
|
See “Fit Orthogonal”.
|
Density Ellipse
|
Draws an ellipse that contains a specified mass of points.
|
See “Density Ellipse”.
|
Nonpar Density
|
Shows patterns in the point density, which is useful when the scatterplot is so darkened by points that it is difficult to distinguish patterns.
|
See “Nonpar Density”.
|
Category
|
Description
|
Fitting Commands
|
Regression Fits
|
Regression methods fit a curve through the points. The curve is an equation (a model) that is estimated using least squares, which minimizes the sum of squared differences from each point to the line (or curve). Regression fits assume that the Y variable is distributed as a random scatter above and below a line of fit.
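The least squares idea described above can be sketched numerically. This is a minimal Python illustration (not JSL, and not JMP's implementation) of fitting a straight line by minimizing the sum of squared vertical distances from each point to the line:

```python
# Minimal sketch of simple least squares: fit y = b0 + b1*x by
# minimizing the sum of squared vertical distances to the line.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx        # slope
    b0 = my - b1 * mx     # intercept
    return b0, b1

# Points lying exactly on y = 1 + 2x recover that line.
b0, b1 = fit_line([1, 2, 3], [3, 5, 7])
```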
|
Fit Mean
Fit Line
Fit Polynomial
Fit Special
Fit Spline
Fit Each Value
Fit Orthogonal
|
Density Estimation
|
Density estimation fits a bivariate distribution to the points. You can either select a bivariate normal density, characterized by elliptical contours, or a general nonparametric density.
|
Fit Density Ellipse
Nonpar Density
|
Mean
|
Mean of the response variable. The predicted response when there are no specified effects in the model.
|
Std Dev [RMSE]
|
Standard deviation of the response variable. Square root of the mean square error, also called the root mean square error (or RMSE).
|
Std Error
|
Standard deviation of the response mean. Calculated by dividing the RMSE by the square root of the number of values.
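As a quick numerical sketch of that calculation (the RMSE and sample size here are hypothetical):

```python
import math

# Std Error of the mean = RMSE / sqrt(n); hypothetical values.
rmse = 4.9
n = 25
std_error = rmse / math.sqrt(n)
```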
|
SSE
|
Error sum of squares for the simple mean model. Appears as the sum of squares for Error in the analysis of variance tables for each model fit.
|
RSquare
|
Measures the proportion of the variation explained by the model. The remaining variation is not explained by the model and is attributed to random error. The Rsquare is 1 if the model fits perfectly.
The Rsquare values in Figure 5.10 indicate that the polynomial fit of degree 2 gives a small improvement over the linear fit.
|
RSquare Adj
|
Adjusts the Rsquare value to make it more comparable over models with different numbers of parameters by using the degrees of freedom in its computation.
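A small Python sketch of both quantities, using the usual textbook formulas (RSquare = 1 − SSE/SST, and the adjustment via degrees of freedom). The sums of squares are the ones quoted later in this chapter's example; n = 50 is an assumption based on the 49 corrected degrees of freedom mentioned there:

```python
def r_square(sse, sst):
    # Proportion of total variation explained by the model.
    return 1 - sse / sst

def r_square_adj(sse, sst, n, p):
    # p = number of model parameters excluding the intercept.
    return 1 - (1 - r_square(sse, sst)) * (n - 1) / (n - p - 1)

r2 = r_square(12012.733, 57258.157)       # linear fit in the example
r2_adj = r_square_adj(12012.733, 57258.157, 50, 1)
```

The adjusted value is always at most the unadjusted value, which is what makes it fairer for comparing models with different numbers of parameters.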
|
Root Mean Square Error
|
Estimates the standard deviation of the random error. It is the square root of the mean square for Error in the Analysis of Variance report. See Figure 5.12.
|
Mean of Response
|
Provides the sample mean (arithmetic average) of the response variable. This is the predicted response when no model effects are specified.
|
Observations
|
Provides the number of observations used to estimate the fit. If there is a weight variable, this is the sum of the weights.
|
Source
|
The three sources of variation: Lack of Fit, Pure Error, and Total Error.
|
DF
|
The degrees of freedom (DF) for each source of error.
• The Total Error DF is the degrees of freedom found on the Error line of the Analysis of Variance table (shown under the “Analysis of Variance Report”). It is the difference between the Total DF and the Model DF found in that table. The Error DF is partitioned into degrees of freedom for lack of fit and for pure error.
• The Pure Error DF is pooled from each group where there are multiple rows with the same values for each effect. See “Statistical Details for the Lack of Fit Report”.
• The Lack of Fit DF is the difference between the Total Error and Pure Error DF.
|
Sum of Squares
|
The sum of squares (SS for short) for each source of error.
• The Total Error SS is the sum of squares found on the Error line of the corresponding Analysis of Variance table, shown under “Analysis of Variance Report”.
• The Pure Error SS is pooled from each group where there are multiple rows with the same value for the x variable. This estimates the portion of the true random error that is not explained by model x effect. See “Statistical Details for the Lack of Fit Report”.
• The Lack of Fit SS is the difference between the Total Error and Pure Error sum of squares. If the lack of fit SS is large, the model might not be appropriate for the data. The F-ratio described below tests whether the variation due to lack of fit is small enough to be accepted as a negligible portion of the pure error.
|
Mean Square
|
The sum of squares divided by its associated degrees of freedom. This computation converts the sum of squares to an average (mean square). F-ratios for statistical tests are the ratios of mean squares.
|
F Ratio
|
The ratio of mean square for lack of fit to mean square for Pure Error. It tests the hypothesis that the lack of fit error is zero.
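The decomposition and F ratio above can be sketched with hypothetical sums of squares and degrees of freedom (Total Error partitioned into Lack of Fit and Pure Error):

```python
# Hypothetical lack-of-fit table: Total Error is partitioned into
# Lack of Fit and Pure Error; F compares their mean squares.
ss_total_error, df_total_error = 1250.0, 18
ss_pure_error,  df_pure_error  = 900.0, 12

ss_lack_of_fit = ss_total_error - ss_pure_error   # 350.0
df_lack_of_fit = df_total_error - df_pure_error   # 6

ms_lack_of_fit = ss_lack_of_fit / df_lack_of_fit
ms_pure_error  = ss_pure_error / df_pure_error
f_ratio = ms_lack_of_fit / ms_pure_error
```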
|
Prob > F
|
The probability of obtaining a greater F-value by chance alone if the lack of fit variance and the pure error variance are the same. A high p-value means that there is not a significant lack of fit.
|
Max RSq
|
The maximum R2 that can be achieved by a model using only the variables in the model.
|
Source
|
The three sources of variation: Model, Error, and C. Total.
|
DF
|
The degrees of freedom (DF) for each source of variation:
• A degree of freedom is subtracted from the total number of nonmissing values (N) for each parameter estimate used in the computation. The computation of the total sample variation uses an estimate of the mean, so one degree of freedom is subtracted from the total, leaving 49. The total corrected degrees of freedom are partitioned into the Model and Error terms.
• One degree of freedom from the total (shown on the Model line) is used to estimate a single regression parameter (the slope) for the linear fit. Two degrees of freedom are used to estimate the parameters (β1 and β2) for a polynomial fit of degree 2.
• The Error degrees of freedom is the difference between C. Total df and Model df.
|
Sum of Squares
|
The sum of squares (SS for short) for each source of variation:
• In this example, the total (C. Total) sum of squared distances of each response from the sample mean is 57,258.157, as shown in Figure 5.12. That is the sum of squares for the base model (or simple mean model) used for comparison with all other models.
• For the linear regression, the sum of squared distances from each point to the line of fit is reduced to 12,012.733. This is the residual or unexplained (Error) SS after fitting the model. The residual SS for a second-degree polynomial fit is 6,906.997, accounting for slightly more variation than the linear fit. That is, the model accounts for more variation because the Model SS is higher for the second-degree polynomial than for the linear fit. The C. Total SS minus the Error SS gives the sum of squares attributed to the model.
|
Mean Square
|
The sum of squares divided by its associated degrees of freedom. The F-ratio for a statistical test is the ratio of the following mean squares:
• The Model mean square for the linear fit is 45,265.424. This value estimates the error variance, but only under the hypothesis that the model parameters are zero.
• The Error mean square is 245.2. This value estimates the error variance.
|
F Ratio
|
The model mean square divided by the error mean square. The underlying hypothesis of the fit is that all the regression parameters (except the intercept) are zero. If this hypothesis is true, then both the mean square for error and the mean square for model estimate the error variance, and their ratio has an F-distribution. If a parameter is a significant model effect, the F-ratio is usually higher than expected by chance alone.
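The whole-model F ratio can be sketched from an analysis of variance table. The values below are hypothetical round numbers, not the figures from the chapter's example:

```python
# Hypothetical ANOVA table for a linear fit (1 model parameter
# besides the intercept): Model SS = C. Total SS - Error SS.
ss_c_total, df_c_total = 60000.0, 49
ss_error,   df_error   = 12000.0, 48

ss_model = ss_c_total - ss_error        # 48000.0
df_model = df_c_total - df_error        # 1

ms_model = ss_model / df_model
ms_error = ss_error / df_error          # estimates the error variance
f_ratio = ms_model / ms_error
```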
|
Prob > F
|
The observed significance probability (p-value) of obtaining a greater F-value by chance alone if the specified model fits no better than the overall response mean. Observed significance probabilities of 0.05 or less are often considered evidence of a regression effect.
|
Term
|
Lists the name of each parameter in the requested model. The intercept is a constant term in all models.
|
Estimate
|
Lists the parameter estimates of the linear model. The prediction formula is the linear combination of these estimates with the values of their corresponding variables.
|
Std Error
|
Lists the estimates of the standard errors of the parameter estimates. They are used in constructing tests and confidence intervals.
|
t Ratio
|
Lists the test statistics for the hypothesis that each parameter is zero. It is the ratio of the parameter estimate to its standard error. If the hypothesis is true, then this statistic has a Student’s t-distribution.
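As a one-line numerical sketch (the estimate and standard error are hypothetical):

```python
# t ratio = parameter estimate / its standard error; hypothetical values.
estimate = 3.74
std_error = 0.44
t_ratio = estimate / std_error
```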
|
Prob>|t|
|
Lists the observed significance probability calculated from each t-ratio. It is the probability of getting, by chance alone, a t-ratio greater (in absolute value) than the computed value, given a true null hypothesis. Often, a value below 0.05 (or sometimes 0.01) is interpreted as evidence that the parameter is significantly different from zero.
|
Y Transformation
|
Use these options to transform the Y variable.
|
X Transformation
|
Use these options to transform the X variable.
|
Degree
|
Use this option to fit a polynomial of the specified degree.
|
Centered Polynomial
|
Centering polynomials stabilizes the regression coefficients and reduces multicollinearity. To turn off polynomial centering, deselect the Centered Polynomial check box. See Figure 5.20. Note that polynomial centering is not performed for transformations of the X variable.
|
Constrain Intercept to
|
Select this check box to constrain the model intercept to be the specified value.
|
Constrain Slope to
|
Select this check box to constrain the model slope to be the specified value.
|
R-Square
|
Measures the proportion of variation accounted for by the smoothing spline model. For more information, see “Statistical Details for the Smoothing Fit Reports”.
|
Sum of Squares Error
|
Sum of squared distances from each point to the fitted spline. It is the unexplained error (residual) after fitting the spline model.
|
Change Lambda
|
Enables you to change the λ value, either by entering a number, or by moving the slider.
|
R-Square
|
Measures the proportion of variation accounted for by the kernel smoother model. For more information, see “Statistical Details for the Smoothing Fit Reports”.
|
Sum of Squares Error
|
Sum of squared distances from each point to the fitted kernel smoother. It is the unexplained error (residual) after fitting the kernel smoother model.
|
Local Fit (lambda)
|
Select the polynomial degree for each local fit. Lambda is the degree of the polynomials that the method fits locally; it can be 1 (linear) or 2 (quadratic). Quadratic polynomials can track local bumpiness more smoothly.
|
Weight Function
|
Specify how to weight the data in the neighborhood of each local fit. Loess uses tri-cube. The weight function determines the influence that each (xi, yi) has on the fitting of the line. The influence decreases as the distance between xi and x increases, and finally becomes zero.
|
Smoothness (alpha)
|
Controls how many points are part of each local fit. Use the slider or type in a value directly. Alpha is a smoothing parameter. It can be any positive number, but typical values are 1/4 to 1. As alpha increases, the curve becomes smoother.
|
Robustness
|
Reweights the points to deemphasize points that are farther from the fitted curve. Specify the number of times to repeat the process (number of passes). The goal is to converge the curve and automatically filter out outliers by giving them small weights.
|
Number of Observations
|
Gives the total number of observations.
|
Number of Unique Values
|
Gives the number of unique X values.
|
Degrees of Freedom
|
Gives the pure error degrees of freedom.
|
Sum of Squares
|
Gives the pure error sum of squares.
|
Mean Square
|
Gives the pure error mean square.
|
Univariate Variances, Prin Comp
|
Uses the univariate variance estimates computed from the samples of X and Y. This turns out to be the standardized first principal component. This option is not a good choice in a measurement systems application since the error variances are not likely to be proportional to the population variances.
|
Equal Variances
|
Uses 1 as the variance ratio, which assumes that the error variances are the same. Using equal variances is equivalent to the non-standardized first principal component line. Suppose that the scatterplot is scaled the same in the X and Y directions. When you show a normal density ellipse, you see that this line is the longest axis of the ellipse.
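For the Equal Variances case, the orthogonal slope is the direction of the longest axis of the sample covariance matrix (the first principal component). A small Python sketch of that calculation, using the standard closed-form expression (this is an illustration, not JMP's implementation):

```python
import math

# Orthogonal (equal-variances) slope: major axis of the sample
# covariance matrix, computed from the sample variances and covariance.
def orthogonal_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

# Points lying exactly on y = 2x give an orthogonal slope of 2.
slope = orthogonal_slope([1, 2, 3, 4], [2, 4, 6, 8])
```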
|
Fit X to Y
|
Uses a variance ratio of zero, which indicates that Y effectively has no variance.
|
Specified Variance Ratio
|
Lets you enter any ratio that you want, giving you the ability to make use of known information about the measurement error in X and response error in Y.
|
Variable
|
Gives the names of the variables used to fit the line.
|
Mean
|
Gives the mean of each variable.
|
Std Dev
|
Gives the standard deviation of each variable.
|
Variance Ratio
|
Gives the variance ratio used to fit the line.
|
Correlation
|
Gives the correlation between the two variables.
|
Intercept
|
Gives the intercept of the fitted line.
|
Slope
|
Gives the slope of the fitted line.
|
LowerCL
|
Gives the lower confidence limit for the slope.
|
UpperCL
|
Gives the upper confidence limit for the slope.
|
Alpha
|
Enter the alpha level used in computing the confidence interval.
|
Variable
|
Gives the names of the variables used in creating the ellipse.
|
Mean
|
Gives the means of the X and Y variables.
|
Std Dev
|
Gives the standard deviations of the X and Y variables.
A discussion of the mean and standard deviation is in the section “The Summary Statistics Report” in the “Distributions” chapter.
|
Correlation
|
The Pearson correlation coefficient. If there is an exact linear relationship between two variables, the correlation is 1 or –1 depending on whether the variables are positively or negatively related. If there is no relationship, the correlation tends toward zero.
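The Pearson coefficient can be sketched directly from its definition (covariance divided by the product of the standard deviations):

```python
import math

# Pearson correlation coefficient from its definition.
def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r_pos = correlation([1, 2, 3], [2, 4, 6])   # exact positive linear relationship
r_neg = correlation([1, 2, 3], [6, 4, 2])   # exact negative linear relationship
```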
For more information, see “Statistical Details for the Correlation Report”.
|
Signif. Prob
|
Probability of obtaining, by chance alone, a correlation with greater absolute value than the computed value if no linear relationship exists between the X and Y variables.
|
Number
|
Gives the number of observations used in the calculations.
|
Fitting Command
|
Fitting Menu
|
Fit Mean
|
Fit Mean
|
Fit Line
|
Linear Fit
|
Fit Polynomial
|
Polynomial Fit Degree=X*
|
Fit Special
|
Linear Fit
Polynomial Fit Degree=X*
Transformed Fit X*
Constrained Fits
|
Fit Spline
|
Smoothing Spline Fit, lambda=X*
|
Kernel Smoother
|
Local Smoother
|
Fit Each Value
|
Fit Each Value
|
Fit Orthogonal
|
Orthogonal Fit Ratio=X*
|
Density Ellipse
|
Bivariate Normal Ellipse P=X*
|
Nonpar Density
|
Quantile Density Colors
|
Fit Robust
|
Robust Fit
|
Confid Curves Fit
|
Displays or hides the confidence limits for the expected value (mean). This option is not available for the Fit Spline, Density Ellipse, Fit Each Value, and Fit Orthogonal fits and is dimmed on those menus.
|
Confid Curves Indiv
|
Displays or hides the confidence limits for an individual predicted value. The confidence limits reflect variation in the error and variation in the parameter estimates. This option is not available for the Fit Mean, Fit Spline, Density Ellipse, Fit Each Value, and Fit Orthogonal fits and is dimmed on those menus.
|
Line Color
|
Lets you select from a palette of colors for assigning a color to each fit.
|
Line of Fit
|
Displays or hides the line of fit.
|
Line Style
|
Lets you select from the palette of line styles for each fit.
|
Line Width
|
Gives three line widths for the line of fit. The default line width is the thinnest line.
|
Report
|
Turns the fit’s text report on and off.
|
Save Predicteds
|
Creates a new column in the current data table called Predicted colname where colname is the name of the Y variable. This column includes the prediction formula and the computed sample predicted values. The prediction formula computes values automatically for rows that you add to the table. This option is not available for the Fit Each Value and Density Ellipse fits and is dimmed on those menus.
Note: You can use the Save Predicteds and Save Residuals commands for each fit. If you use these commands multiple times or with a grouping variable, it is best to rename the resulting columns in the data table to reflect each fit.
|
Save Residuals
|
Creates a new column in the current data table called Residuals colname where colname is the name of the Y variable. Each value is the difference between the actual (observed) value and its predicted value. Unlike the Save Predicteds command, this command does not create a formula in the new column. This option is not available for the Fit Each Value and Density Ellipse fits and is dimmed on those menus.
|
Remove Fit
|
Removes the fit from the graph and removes its text report.
|
Linear Fits, Polynomial Fits, and Fit Special, and Fit Robust Only:
|
|
Mean Confidence Limit Formula
|
Creates a new column in the data table containing a formula for the mean confidence intervals.
|
Indiv Confidence Limit Formula
|
Creates a new column in the data table containing a formula for the individual confidence intervals.
|
Confid Shaded Fit
|
Draws the same curves as the Confid Curves Fit command and shades the area between the curves.
|
Confid Shaded Indiv
|
Draws the same curves as the Confid Curves Indiv command and shades the area between the curves.
|
Plot Residuals
|
Produces four diagnostic plots: residual by predicted, actual by predicted, residual by row, and a normal quantile plot of the residuals. See “Diagnostics Plots”.
|
Set Alpha Level
|
Prompts you to enter the alpha level to compute and display confidence levels for line fits, polynomial fits, and special fits.
|
Smoothing Spline Fit and Local Smoother Only:
|
|
Save Coefficients
|
Saves the spline coefficients as a new data table, with columns called X, A, B, C, and D. The X column gives the knot points. A, B, C, and D are the intercept, linear, quadratic, and cubic coefficients of the third-degree polynomial. These coefficients span from the corresponding value in the X column to the next highest value.
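A hypothetical sketch of how such a coefficient table could be evaluated. The assumption here (not confirmed by this document) is that each row (X, A, B, C, D) defines the cubic A + B·(x − X) + C·(x − X)² + D·(x − X)³, valid from knot X to the next knot:

```python
import bisect

# Hypothetical saved-coefficients table: knots with (A, B, C, D) rows.
# Parameterization around each knot is an assumption for illustration.
knots = [0.0, 1.0, 2.0]
coefs = [(1.0, 0.5, 0.0, 0.0),   # segment [0, 1)
         (1.5, 0.5, 0.0, 0.0),   # segment [1, 2)
         (2.0, 0.5, 0.0, 0.0)]   # segment [2, ...)

def spline_eval(x):
    # Find the knot segment containing x, then evaluate its cubic.
    i = max(bisect.bisect_right(knots, x) - 1, 0)
    a, b, c, d = coefs[i]
    t = x - knots[i]
    return a + b * t + c * t ** 2 + d * t ** 3

y = spline_eval(1.5)   # segment starting at knot 1.0: 1.5 + 0.5 * 0.5
```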
|
Bivariate Normal Ellipse Only:
|
|
Shaded Contour
|
Shades the area inside the density ellipse.
|
Select Points Inside
|
Selects the points inside the ellipse.
|
Select Points Outside
|
Selects the points outside the ellipse.
|
Quantile Density Contours Only:
|
|
Kernel Control
|
Displays a slider for each variable, where you can change the kernel standard deviation that defines the range of X and Y values for determining the density of contour lines.
|
5% Contours
|
Shows or hides the 5% contour lines.
|
Contour Lines
|
Shows or hides the contour lines.
|
Contour Fill
|
Fills the areas between the contour lines.
|
Select Points by Density
|
Selects points that fall in a user-specified quantile range.
|
Color by Density Quantile
|
Colors the points according to density.
|
Save Density Quantile
|
Creates a new column containing the density quantile each point is in.
|
Mesh Plot
|
Is a three-dimensional plot of the density over a grid of the two analysis variables. See Figure 5.18.
|
Modal Clustering
|
Creates a new column in the current data table and fills it with cluster values.
Note: If you save the modal clustering values first and then save the density grid, the grid table also contains the cluster values. The cluster values are useful for coloring and marking points in plots.
|
Save Density Grid
|
Saves the density estimates and the quantiles associated with them in a new data table. The grid data can be used to visualize the density in other ways, such as with the Scatterplot 3D or the Contour Plot platforms.
|