User Guide

Chapter 26
Linear Regression
Linear Regression estimates the coefficients of the linear equation, involving one or
more independent variables, that best predict the value of the dependent variable.
For example, you can try to predict a salesperson’s total yearly sales (the dependent
variable) from independent variables such as age, education, and years of experience.
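
To illustrate what estimating the coefficients means, here is a minimal sketch in Python. It is not part of the product and assumes NumPy is installed. The function names fit_linear_regression and predict and the data values are hypothetical, chosen only for illustration. The sketch finds the intercept and slopes that minimize the sum of squared differences between observed and predicted values.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return [intercept, b1, ..., bk] that minimize the sum of squared residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted linear equation to new rows of independent variables."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Made-up data: columns are years of education and years of experience.
X = np.array([[12, 1], [16, 3], [14, 5], [18, 2], [12, 8], [16, 10]])
# The dependent variable is constructed to follow an exact linear equation,
# so the fit should recover the intercept 10 and the slopes 2 and 3.
y = 10 + 2 * X[:, 0] + 3 * X[:, 1]

coef = fit_linear_regression(X, y)
print(coef)
print(predict(coef, [[14, 4]]))
```

Because the illustrative data contain no noise, the recovered coefficients match the equation used to generate them. With real data the coefficients are estimates and the predictions carry error.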
Example. Is the number of games won by a basketball team in a season related to the
average number of points the team scores per game? A scatterplot indicates that these
variables are linearly related. The number of games won and the average number
of points scored by the opponent are also linearly related. These variables have a
negative relationship. As the number of games won increases, the average number
of points scored by the opponent decreases. With linear regression, you can model
the relationship of these variables. A good model can be used to predict how many
games teams will win.
Statistics. For each variable: number of valid cases, mean, and standard deviation.
For each model: regression coefficients, correlation matrix, part and partial
correlations, multiple R, R², adjusted R², change in R², standard error of the estimate,
analysis-of-variance table, predicted values, and residuals. Also, 95%-confidence
intervals for each regression coefficient, variance-covariance matrix, variance
inflation factor, tolerance, Durbin-Watson test, distance measures (Mahalanobis,
Cook, and leverage values), DfBeta, DfFit, prediction intervals, and casewise
diagnostics. Plots: scatterplots, partial plots, histograms, and normal probability plots.
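
For readers who want to see how a few of these model statistics relate to one another, the following Python sketch computes multiple R, R², adjusted R², and the standard error of the estimate from the residuals of a least-squares fit. It assumes NumPy and uses the standard textbook formulas. It is not the product's implementation, and the function name regression_summary and the data are hypothetical.

```python
import numpy as np

def regression_summary(X, y):
    """Fit y on X by least squares and return a few common model statistics."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)          # treat a single predictor as one column
    y = np.asarray(y, dtype=float)
    n, k = X.shape                    # n cases, k independent variables
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    residuals = y - design @ coef
    sse = float(residuals @ residuals)            # residual sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))      # total sum of squares
    r2 = 1.0 - sse / sst
    return {
        "coefficients": coef,
        "multiple_r": float(np.sqrt(max(r2, 0.0))),
        "r_squared": r2,
        # Adjusted R-squared penalizes R-squared for the number of predictors.
        "adjusted_r_squared": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        # Standard error of the estimate uses n - k - 1 degrees of freedom.
        "std_error_of_estimate": float(np.sqrt(sse / (n - k - 1))),
        "residuals": residuals,
    }

# Made-up data with one independent variable.
summary = regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(summary["r_squared"], summary["adjusted_r_squared"])
```

For this small data set the fitted line is 2.2 + 0.6x, R² is 0.6, and adjusted R² is lower because it accounts for the single predictor and the small number of cases.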
Data. The dependent and independent variables should be quantitative. Categorical
variables, such as religion, major field of study, or region of residence, need to be
recoded to binary (dummy) variables or other types of contrast variables.
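
As a sketch of what recoding to dummy variables looks like, the Python function below turns one categorical variable into a set of 0/1 indicator columns, leaving out one reference category. The name dummy_code, its reference parameter, and the region values are hypothetical. The sketch shows only the common indicator coding scheme, not other contrast types.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable into 0/1 indicator columns.

    One category (the reference) gets no column of its own; a case in the
    reference category has 0 in every indicator column.
    Returns (column_labels, rows).
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]         # default: first category in sorted order
    if reference not in levels:
        raise ValueError(f"reference category {reference!r} not found in data")
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows

# Made-up region of residence for four cases.
region = ["north", "south", "west", "south"]
labels, rows = dummy_code(region, reference="north")
print(labels)   # ['south', 'west']
print(rows)     # [[0, 0], [1, 0], [0, 1], [1, 0]]
```

A variable with three categories yields two indicator columns. Including a column for every category alongside an intercept would make the columns perfectly collinear.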
Assumptions. For each value of the independent variable, the distribution of the
dependent variable must be normal. The variance of the distribution of the dependent
variable should be constant for all values of the independent variable. The relationship