15.6. Assumptions in Regression

The statistical assumptions for regression are as follows:

  • The sample is randomly selected.

  • The relationship between $x$ and the mean of $y$ follows a straight line (linearity).

  • The conditional distribution of $y$ at each fixed $x$ value is normal.

  • The standard deviation of the conditional distribution is the same at each fixed $x$ value (homoscedasticity).
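The last three assumptions can be summarised in a single statement. As a sketch, writing $\alpha$ for the intercept, $\beta$ for the slope and $\sigma$ for the common standard deviation (these symbols are assumed here for illustration and are not defined in this section):

$$
y \mid x \sim N(\alpha + \beta x,\ \sigma^2)
$$

The mean $\alpha + \beta x$ expresses the straight-line assumption, the normal shape expresses the normality assumption, and the fact that $\sigma$ does not depend on $x$ expresses homoscedasticity.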

In practice, the assumptions are never perfectly fulfilled, but the regression model can still be useful. It is adequate to check that no assumption is grossly violated (Agresti, Chapter 14).
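To give a rough idea of what "not grossly violated" might look like in numbers, here is a minimal sketch using NumPy. The data are simulated, the parameter values (intercept 2.0, slope 0.5, standard deviation 1.0) are made up for illustration, and the three summaries are informal rules of thumb rather than formal tests or a method taken from Agresti.

```python
import numpy as np

# Simulated data (hypothetical values, not from the text) that satisfy
# the assumptions: straight-line mean, normal errors, constant spread.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)

# Least-squares straight line; for deg=1 polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Linearity: if the mean of y is linear in x, a quadratic fitted to the
# residuals should have a curvature coefficient close to zero.
curvature = np.polyfit(x, residuals, deg=2)[0]

# Normality: the skewness of the residuals should be close to zero.
z = (residuals - residuals.mean()) / residuals.std()
skewness = np.mean(z**3)

# Homoscedasticity: the residual spread should be similar for low and
# high x, so the ratio of the two standard deviations should be near 1.
is_low = x < np.median(x)
spread_ratio = residuals[~is_low].std(ddof=1) / residuals[is_low].std(ddof=1)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"curvature = {curvature:.4f}")
print(f"skewness = {skewness:.3f}")
print(f"spread ratio (high x / low x) = {spread_ratio:.3f}")
```

With real data, values far from 0 for the curvature or skewness, or a spread ratio far from 1, would suggest looking more closely at the corresponding assumption, ideally with residual plots.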

Note that the first assumption, that the sample is randomly selected, depends on the method of data collection. It cannot be checked with a statistical test!

The remaining assumptions are statistical assumptions, and we will come to these in the next section.