
Performance Evaluation:

Simple Linear Regression Models
Hongwei Zhang
http://www.cs.wayne.edu/~hzhang

"Statistics is the art of lying by means of figures."
--- Dr. Wilhelm Stekel

Acknowledgement: this lecture is partially based on the slides of Dr. Raj Jain.

Simple linear regression models
Response variable: the variable being estimated
Predictor variables: the variables used to predict the response; also called predictors or factors

Regression model: predicts a response for a given set of predictor variables
Linear regression model: the response is a linear function of the predictors
Simple linear regression model: only one predictor

Outline
Definition of a Good Model
Estimation of Model Parameters
Allocation of Variation
Standard Deviation of Errors
Confidence Intervals for Regression Parameters
Confidence Intervals for Predictions
Visual Tests for Verifying Regression Assumptions

Definition of a good model?

Good models (contd.)
Regression models attempt to minimize the vertical distance between each observation point and the model line (or curve). The length of this line segment is called the residual, the modeling error, or simply the error.

Requiring only that the negative and positive errors cancel out (zero overall error) is not enough: many lines satisfy this criterion.

Instead, choose the line that minimizes the sum of the squares of the errors.

Good models (contd.)
Formally, the model is

ŷ = b0 + b1·x

where ŷ is the predicted response when the predictor variable is x. The parameters b0 and b1 are fixed regression parameters to be determined from the data.

Given n observation pairs {(x1, y1), ..., (xn, yn)}, the estimated response for the i-th observation is:

ŷi = b0 + b1·xi

The error is:

ei = yi − ŷi

Good models (contd.)
The best linear model minimizes the sum of squared errors (SSE):

SSE = Σ ei² = Σ (yi − b0 − b1·xi)²

subject to the constraint that the overall mean error is zero:

(1/n) Σ ei = 0

This is equivalent to the unconstrained minimization of the variance of errors (Exercise 14.1).
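The two criteria above can be evaluated directly. A minimal sketch, using made-up data values (they are illustrative, not from the lecture):

```python
# Evaluate SSE and the mean error for a candidate line y_hat = b0 + b1*x.
# The data values below are illustrative assumptions, not lecture data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.9]

def sse(b0, b1):
    """Sum of squared errors for the line y_hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

def mean_error(b0, b1):
    """Overall mean error; the best line drives this to zero."""
    n = len(xs)
    return sum(y - (b0 + b1 * x) for x, y in zip(xs, ys)) / n
```

Many lines give `mean_error` near zero; only one minimizes `sse`, which is why both conditions are needed.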


Estimation of model parameters
The regression parameters that give the minimum error variance are:

b1 = (Σ xi·yi − n·x̄·ȳ) / (Σ xi² − n·x̄²)
b0 = ȳ − b1·x̄

where x̄ = (1/n) Σ xi and ȳ = (1/n) Σ yi.
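The closed-form estimates above are easy to compute by hand or in code. A minimal sketch on illustrative data (the values are assumptions, not Example 14.1 from the lecture):

```python
# Closed-form least-squares estimates b0, b1 on illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# b1 = (sum(x*y) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)
b1 = (sxy - n * x_bar * y_bar) / (sxx - n * x_bar * x_bar)
b0 = y_bar - b1 * x_bar

# The fitted errors have zero mean, as the minimization requires.
mean_err = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys)) / n
```

For these values the fit is ŷ = 2.2 + 0.6·x, and the mean error of the fitted line is zero, as expected.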

Example 14.1


Derivation of regression parameters?

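The derivation itself is not visible in this extraction, but the standard least-squares argument it refers to runs as follows: differentiate SSE with respect to b0 and b1, set both partial derivatives to zero, and solve the resulting normal equations.

```latex
% Standard least-squares derivation (reconstructed from the surrounding
% definitions; the original slides' algebra is not shown here).
\begin{align*}
SSE &= \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \\
\frac{\partial SSE}{\partial b_0}
    &= -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0
    \;\Rightarrow\; \sum y_i = n b_0 + b_1 \sum x_i \\
\frac{\partial SSE}{\partial b_1}
    &= -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = 0
    \;\Rightarrow\; \sum x_i y_i = b_0 \sum x_i + b_1 \sum x_i^2
\end{align*}
% Solving the two normal equations gives
\begin{align*}
b_1 &= \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2},
&
b_0 &= \bar{y} - b_1 \bar{x}.
\end{align*}
```

These are exactly the estimates stated in the "Estimation of model parameters" slide.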

Least Squares Regression vs. Least Absolute Deviations Regression

Least Squares Regression      | Least Absolute Deviations Regression
------------------------------|--------------------------------------------
Not very robust to outliers   | Robust to outliers
Simple analytical solution    | No analytical solution (requires iterative,
                              | computation-intensive methods)
Stable solution               | Unstable solution
Always one unique solution    | Possibly multiple solutions

The instability of the least-absolute-deviations method means that, for a small horizontal adjustment of a data point, the regression line may jump a large amount. In contrast, the least-squares solution is stable: for any small horizontal adjustment of a data point, the regression line moves only slightly, i.e., it varies continuously with the data.
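The outlier sensitivity of least squares is easy to demonstrate numerically. A sketch on illustrative data (least absolute deviations is not shown, since it needs an iterative solver):

```python
# How a single outlier shifts the least-squares slope.
# Data values are illustrative assumptions, not from the lecture.
def ls_fit(xs, ys):
    """Closed-form least-squares fit; returns (b0, b1)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / \
         (sum(x * x for x in xs) - n * x_bar * x_bar)
    return y_bar - b1 * x_bar, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0]        # points exactly on y = x
b0, b1 = ls_fit(xs, ys)               # recovers slope 1, intercept 0

ys_out = [1.0, 2.0, 3.0, 4.0, 25.0]   # last point replaced by an outlier
b0_o, b1_o = ls_fit(xs, ys_out)       # slope pulled far above 1
```

Here one corrupted point moves the slope from 1 to 5, illustrating why least squares is described as not robust to outliers.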


Allocation of variation

Allocation of variation (contd.)
Without regression, each observation would be predicted by the mean ȳ, and the sum of squared errors would be:

SST = Σ (yi − ȳ)²

This is called the total sum of squares (SST). It is a measure of y's variability and is called the variation of y. SST can be computed as follows:

SST = SSY − SS0

where SSY is the sum of squares of y (Σ yi²), and SS0 is the sum of squares of ȳ, equal to n·ȳ².
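The identity SST = SSY − SS0 can be checked numerically. A minimal sketch on illustrative data (the values are assumptions, not from the lecture):

```python
# Check SST = SSY - SS0 on illustrative data.
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(ys)
y_bar = sum(ys) / n

sst_direct = sum((y - y_bar) ** 2 for y in ys)  # variation of y
ssy = sum(y * y for y in ys)                    # sum of squares of y
ss0 = n * y_bar * y_bar                         # n * y_bar^2

assert abs(sst_direct - (ssy - ss0)) < 1e-9     # the two forms agree
```

The second form is convenient in hand calculations because Σ yi² and ȳ are usually already available from the parameter estimation step.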

Allocation of variation (contd.)