Glossary

Add-on:

A utility that can be added to SPSS. Also called a module.

Analysis of covariance:

See ANCOVA.

Analysis of variance:

See ANOVA.

ANCOVA:

Analysis of covariance. ANOVA with the addition of one or more covariates.

ANOVA:

Analysis of variance. Using an F-ratio to test the fit of a linear model.

ascending:

A sorting order. The cases are ordered so the values range from small to large. See also descending.

association:

Variables are said to be associated if knowing the value of one tells you something about the likely value of the other.

autoscript:

A script that executes automatically in response to the output of data. Specific autoscripts can be assigned to specific output types. See also script.

average:

The result of adding several values and then dividing by the number of values. See also mean and mode.

base:

The main system of SPSS. Modules can be added to expand SPSS, but the base system is always present.

BASIC:

See script.

bell curve:

See normal distribution.

binning:

The process of organizing the values of a variable into groups. Each group is defined as a specific range of values, and each value can be thought of as being sorted into a bin. This is also called clustering.
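As a rough illustration of binning, here is a minimal Python sketch. The ages, the cutpoints, and the bin_index helper are all made up for this example, and the rule that a value equal to a cutpoint goes into the higher bin is an arbitrary choice, not necessarily what SPSS does.

```python
from bisect import bisect_right

def bin_index(value, cutpoints):
    """Return the number of the bin that value falls into.

    cutpoints must be sorted from small to large. A value equal to a
    cutpoint is placed in the higher bin (an assumption of this sketch).
    """
    return bisect_right(cutpoints, value)

ages = [3, 17, 18, 42, 65, 80]   # hypothetical values of one variable
cutpoints = [18, 65]             # bins: under 18, 18 to 64, 65 and over

bins = [bin_index(age, cutpoints) for age in ages]
print(bins)
```

Two cutpoints produce three bins, numbered 0, 1, and 2 here.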

bivariate:

Using two variables.

break variable:

When organizing data into tabular form, the break variable is used to group the information. At the point in the report where the break variable changes value, a subtotal line is generated, or a new page is started, or some other break appears in the report.

canonical correlation:

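The correlation between a combination of one set of variables and a combination of another set, with the combinations chosen to make that correlation as large as possible.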

case:

Any single collection of values. All the values in a single row. A case is sometimes called a single record, and it normally contains one constant value for each variable.

case summary:

A simple table that directly summarizes values of the cases.

categorical variable:

A type of variable that can take on only one of a specific set of values, such as year of birth, make of car, or favorite color — in effect, defining a category. See also scale, ordinal, nominal, dichotomy, and binning.

censored case:

A case for which the event being analyzed has not occurred during the time period of the study.

chart:

See graph.

clustering:

See binning.

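coefficient of determination:

A statistic that indicates how well a regression model fits the data. It is usually written as R squared and gives the proportion of the variance in the dependent variable that the model explains.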

command language:

See Syntax.

confidence interval:

A range above and below a sample estimate, such as an average, that is expected to contain the true population value with a specified level of confidence. For example, if a sample of a company's gravel trucks delivers an average of 190 loads per month, and the 95 percent confidence interval runs from 183 to 194 loads, the interval ranges from a low of 7 below the average to a high of 4 above it.

constant:

A number. A quantity that is regarded as fixed or unchanging. See also variable.

continuous:

See scale.

correlation:

The degree to which the values of two variables rise and fall together.

covariance:

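A measure of how much two variables vary together. It is positive when high values of one tend to occur with high values of the other, and negative when high values of one tend to occur with low values of the other.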

covariate:

A variable that takes part in the prediction of an outcome. An independent variable in regression. It is secondary to the relationship of the main independent variable.

cutpoint:

A number used as a divider to split values into groups, as in binning.

dataset:

The data displayed in the Data Editor window, whether loaded from a file, entered from the keyboard, or both. (It's also written as two words: data set.) Multiple datasets can be loaded and will appear in separate windows — they will be labeled DataSet1, DataSet2, and so on.

degrees of freedom:

The minimum number of values that must be specified to determine all the data points. This number is usually one less than the number of values used in the calculation.

delimiter:

A character used to indicate the beginning of, ending of, or separation between individual values in a series of strings of characters. For example, the string of characters 59,21,34 is a series of comma-delimited numbers.
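A short Python sketch shows both roles of a delimiter. The first part splits the comma-delimited string from the definition. The second part uses a made-up line in which quotes delimit a string that itself contains a comma, read here with Python's standard csv module.

```python
import csv
import io

# Commas separate the individual values.
text = "59,21,34"
numbers = [int(part) for part in text.split(",")]
print(numbers)

# Quotes mark the beginning and ending of a value, so the comma
# inside the name is not treated as a separator.
line = '59,21,"Smith, Jane"'   # hypothetical input line
row = next(csv.reader(io.StringIO(line)))
print(row)
```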

dependent variable:

A variable that has its value derived from one or more other variables. Also called a predicted variable. See also independent variable.

descending:

A sorting order that arranges values from large to small. See also ascending.
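The two sorting orders can be illustrated with a few lines of Python. The values are invented, and Python's built-in sorted function merely stands in for the sorting SPSS performs on cases.

```python
# Hypothetical values of one variable across five cases.
values = [42, 7, 19, 7, 88]

ascending = sorted(values)                  # small to large
descending = sorted(values, reverse=True)   # large to small

print(ascending)
print(descending)
```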

deviation:

The amount by which a measurement differs from some fixed value.

dichotomy:

A variable with only two possible values, such as yes/no, true/false, or like/dislike. It is a specific type of categorical variable. See also categorical variable.

dodging:

Plotting points on a graph so they appear next to one another instead of one on top of the other.

error:

Two kinds of errors exist in the world of statistics. The conventional kind comes about when you enter a wrong number and get a bogus result. The other kind is calculated — that is, you calculate the amount of possible error present in the results you get from the data you have. With modern survey techniques, you will often hear the term "margin of error" for this second type.

faceting:

See paneling.

field:

In the SPSS documentation, field is used as a synonym for variable.

F-ratio:

The ratio of the variance that a model explains to the variance it leaves unexplained.

frequency distribution:

The collection of values that a variable takes in a sample, together with the number of times each value occurs.
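One common way to summarize a frequency distribution is to count how often each value occurs. The following Python sketch does this with the standard Counter class; the list of colors is invented for illustration.

```python
from collections import Counter

# Hypothetical values of a "favorite color" variable in a small sample.
colors = ["red", "blue", "red", "green", "red", "blue"]

freq = Counter(colors)

# List each value with its count, most frequent first.
for value, count in freq.most_common():
    print(value, count)
```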

GLM:

General Linear Model. A general procedure for analyzing variance, covariance, and regression.

goodness of fit:

The extent to which observed values approximate values from a theoretical distribution.

graph:

A non-numeric display of values. The terms graph and chart are used in SPSS internal documentation almost interchangeably.

GUI:

Graphical User Interface. Control of an application with windows and a mouse. All versions of SPSS operate this way.

histogram:

A graphical display of a distribution in which the width of each rectangle represents the size of the bin and the height represents the frequency divided by that width. The area of each rectangle thus represents the frequency.

hoc:

See post hoc.

imputation:

The process of calculating numeric values for missing values in the data.
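Many imputation methods exist. The Python sketch below shows only the simplest one, replacing each missing value with the mean of the observed values. The data are hypothetical, and using None to mark a missing value is an assumption of this example rather than the way SPSS stores missing data.

```python
from statistics import mean

# None marks a missing value in this hypothetical variable.
values = [4, None, 6, 8, None]

observed = [v for v in values if v is not None]
fill = mean(observed)   # mean of the values that are present

imputed = [fill if v is None else v for v in values]
print(imputed)
```

Mean imputation is easy to follow, but it makes the data look less spread out than they really are, which is one reason more elaborate methods are often preferred.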

independence:

The degree to which two or more variables have no effect on one another.

independent variable:

A variable whose values are used as the basis for calculation of statistics. See also dependent variable.

kurtosis:

A measure of how peaked a bell curve is. A positive number indicates there is more of a peak than standard; a negative number indicates a flatter line.

Levene test:

A test that determines whether the variances of two or more groups differ significantly.

linear:

A straight line. No curves.

log-linear model:

An analysis of counts in a table of categorical variables in which the natural logarithm of each expected count is modeled as a sum of effects.

longitudinal data:

Data collected from the same cases at several points in time.

mean:

1. Another word for average. 2. A calculated value equally distant from the two extreme values. 3. The temperament of the person making you learn this stuff. See also average and mode.

missing data:

If you declare a value for a variable as representing the fact that no value is present, the missing value will not be included in calculations.

mixed model:

A statistical model containing both fixed effects and random effects.

mode:

The value that occurs most frequently in a given set of data. See also average and mean.
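The difference between the mean and the mode is easy to see with a small example. This Python sketch uses the standard statistics module on a made-up list of values.

```python
from statistics import mean, mode

values = [2, 3, 3, 5, 7]   # hypothetical data

average = mean(values)       # sum of the values divided by how many there are
most_common = mode(values)   # the value that occurs most often

print(average, most_common)
```

Here the values add up to 20, so the mean is 20 divided by 5, while the mode is 3 because it is the only value that appears twice.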

model:

A mathematical representation of some process.

module:

A utility that can be added to SPSS. Also called an add-on.

multiple response set:

A special variable that has its content generated from the content of two or more other variables. In SPSS, it doesn't appear in the Data View (in the Data Editor window), but does appear when you select variable names for other activities.

multivariate:

Using multiple variables.

nominal:

Numbers that specify categories. For example, yes, no, and undecided could be represented by 2, 1, and 0. See also scale, ordinal, and categorical.

nonlinear:

Not in a straight line. Curved.

normality:

The degree to which the values match normal distribution.

normal distribution:

A distribution that is continuous and symmetric. It is used primarily because many quantitative measurements appear to approximate this distribution. It is also called the bell curve.

OLAP cubes:

Online Analytical Processing cubes. A multilevel table containing totals, means, or some other statistics in which each level of the table contains the values relating to one value of a categorical variable.

OMS:

Output Management System. The ability in SPSS to output to different file formats.

Online Analytical Processing:

See OLAP cubes.

ordinal:

Types of numbers that specify the order of occurrences. The ordinal forms of 1, 2, and 3 are first, second, and third. See also scale, nominal, and categorical.

outliers:

The extreme values of a variable. Generally, they are the five largest and five smallest values.

paneling:

Adding another dimension of data to a graphic display causing the layout to be replicated a number of times to accommodate the values of the data along the new dimension. This process is also known as faceting.

parametric:

A procedure that requires one or more seed values that control its processing.

PASW:

Predictive Analytics SoftWare. For a couple of years, SPSS was known as PASW.

Pearson's Product Moment Correlation:

Commonly called Pearson's correlation. It represents the degree of linear relationship between two variables.
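To make the idea of a linear relationship concrete, here is a minimal Python sketch that computes Pearson's correlation from its textbook formula. The pearson function and the data are written for this example only and are not taken from SPSS.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation for two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and the sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]          # hypothetical data
score_up = [2, 4, 6, 8, 10]      # rises in a straight line with hours
score_down = [10, 8, 6, 4, 2]    # falls in a straight line with hours

r_up = pearson(hours, score_up)
r_down = pearson(hours, score_down)
print(r_up, r_down)
```

A result near 1 indicates a rising straight-line relationship, a result near -1 a falling one, and a result near 0 little or no linear relationship.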

periodicity:

The interval of repetition at which data recordings are made.

pivot table:

A table with names identifying the rows and columns. Swapping the positions of the rows and columns to make the table appear in a different form, but containing the same data, is known as pivoting the table. The tables in SPSS Viewer are pivot tables.

post hoc:

The erroneous conclusion that some condition arises as the result of a previous condition.

p-p plot:

A proportion-proportion plot. The observed cumulative proportion is plotted against the expected cumulative proportion.

predicted variable:

See dependent variable.

predictor:

A variable, or collection of variables, the values of which predict the values of some others.

probit:

A function of probability based on the quantiles of the normal distribution.

pyramid:

A special form of a histogram where the bars representing the values extend outward to the sides from a center line. It often assumes the shape of a pyramid.

Python:

A general-purpose programming language that can also be used to write SPSS scripts.

q-q plot:

A quantile-quantile plot. The quantiles of the observed values are plotted against the quantiles of a specified distribution.

quantiles:

A set of values chosen to divide a sampling of data into groups, each containing (as far as possible) an equal number of values.

quartile:

Specific values that divide all the values into four groups, with an equal number of values in each group. The groups are generally called the first, second, third, and fourth quartiles.
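The cut points that separate the four groups can be computed with Python's standard statistics module, as in this sketch. The data are hypothetical, and because statistical packages use slightly different interpolation rules, SPSS may report somewhat different values for the same data.

```python
from statistics import quantiles

values = list(range(1, 12))   # hypothetical data: 1 through 11

# n=4 asks for the three cut points that separate four groups.
cuts = quantiles(values, n=4)
print(cuts)
```

The middle cut point is the median, so half of the values fall on each side of it.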

R:

See coefficient of determination.

recency:

The quality or state of being recent.

recoding:

The conversion of a set of SPSS values to a new set of values. For example, if you have yes/no coded as 0/1, by recoding you can change the values to 1/2 in a single operation.
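The 0/1 to 1/2 example can be sketched in Python with a lookup table. The variable names and the answers are invented, and the dictionary merely stands in for the recoding rules you would define in SPSS.

```python
# Old value -> new value.
recode_map = {0: 1, 1: 2}

answers = [0, 1, 1, 0]   # hypothetical yes/no answers coded as 0/1

recoded = [recode_map[value] for value in answers]
print(recoded)
```

Every value is converted through the same table, which is what makes recoding a single operation rather than a case-by-case edit.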

record:

Any single collection of values for the variables defined in SPSS. A record is all the values of a single row. It is a single case or row.

regression:

Determining the "best fit" equation for the relationship between two variables. See also dependent variable and independent variable.

row:

Any single collection of values for all the variables defined in an SPSS dataset. It appears as a single row in the Data View window. It is a single case.

scale:

A type of number that uses a standard by which something is measured, such as inches, pounds, dollars, or hours. Another name for scale is continuous. See also ordinal, nominal, and categorical.

script:

A program written in either the BASIC or Python programming language. Both are different languages from Syntax.

skewness:

A measure of the unevenness of the distribution of data. Positive skewness indicates a longer tail toward the high values of the distribution; negative skewness indicates a longer tail toward the low values.

SPSS:

Statistical Package for the Social Sciences. The original name of SPSS.

standard deviation:

A calculated indicator of the extent of deviation for a specific collection of data. It shows how far the values typically lie from their mean. It is the square root of the variance.

standard error:

A measurement of how much a statistic, such as the mean, is expected to change from one sample to the next.

statistic:

A single number calculated in a specific way. Some examples of statistics are the sum, mean, and standard deviation.

statistics:

A collection of statistical values.

string:

A series of characters making up a name or even a complete sentence. Quite often the beginning and ending of a string are delimited by quotes.

Syntax:

The name of the programming language fundamental to SPSS. All actions performed by SPSS are in response to the internal interpretation of Syntax commands. In the SPSS documentation, Syntax is sometimes referred to as the command language.

t:

A continuous distribution with a bell-shaped curve and density symmetrical around the null value. Its exact shape depends on the number of degrees of freedom.

univariate:

A statistic derived from the values of one variable. Examples are mean, standard deviation, and sum.

variable:

In statistical software, a place to store constants. A variable can store a number of constants (one for each case). Each case (or row) in SPSS consists of a collection of constant values assigned to variables.

variance:

The average of the squared differences between a set of measured values and their mean. It is the square of the standard deviation.
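The link between variance and standard deviation can be checked with a short Python sketch. It assumes the common definition of variance as the average squared distance of the values from their mean, and the data are made up so the arithmetic comes out evenly.

```python
from statistics import pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data with a mean of 5

var = pvariance(values)   # average squared distance from the mean
sd = pstdev(values)       # square root of the variance

print(var, sd)
```

The functions used here divide by the number of values. The same module also provides variance and stdev, which divide by one less than the number of values and are the usual choice when the data are a sample from a larger population.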
