# Statistics: Ellipses of regression


Notation is used inconsistently and needs to be corrected

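Consider a linear regression model:

$$Y = X\beta + \varepsilon$$

where $X$ is an $n \times (p+1)$ matrix, and $\varepsilon \sim N(0, \sigma^2 I)$.

Recall that the least-squares estimator is $\hat{\beta} = (X'X)^{-1}X'Y$.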
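As a minimal sketch of the least-squares computation, the following solves the normal equations $X'Xb = X'Y$ in plain Python for a small made-up data set. The helper names (`transpose`, `matmul`, `solve`, `least_squares`) and the data are illustrative assumptions and not part of the original page, whose figures came from an R script.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(x_rows, y):
    """Return beta_hat for a model with an intercept plus the given predictors."""
    X = [[1.0] + list(r) for r in x_rows]          # prepend the column of ones
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Hypothetical data lying exactly on y = 1 + 2 x.
x_rows = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta_hat = least_squares(x_rows, y)
print(beta_hat)
```

Solving the normal equations directly is fine for an illustration of this size; for ill-conditioned $X$ a QR-based solver is the usual choice.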

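Let $\Sigma_{XX}$ be the variance-covariance matrix for the $X$ matrix, that is, the centered cross-product matrix

$$X'(I - P_1)X$$

divided by $n$ (or by $n - 1$, depending on convention), where $P_1$ is the matrix of the orthogonal projection onto the subspace of $\mathbb{R}^n$ spanned by the vector of ones, $\mathbf{1}$.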

Recall that $E(\hat{\beta}) = \beta$ and $\operatorname{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$.

If we let $s$ be the residual standard error with $\nu = n - p - 1$ degrees of freedom then, under the assumption of normality,

$$\frac{(\hat{\beta} - \beta)'\, X'X\, (\hat{\beta} - \beta)}{(p+1)\, s^2} \sim F_{p+1,\,\nu}$$

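Following Statistics: Ellipses, a 100(1 − α)% confidence ellipse for $\beta$ in $\mathbb{R}^{p+1}$ can be expressed, written out as a quadratic form, as:

$$\{\beta : (\beta - \hat{\beta})'\, X'X\, (\beta - \hat{\beta}) \le (p+1)\, s^2 F_{\alpha;\,p+1,\,\nu}\}$$

where $F_{\alpha;\,p+1,\,\nu}$ is the upper $\alpha$ quantile of the $F$ distribution with $p+1$ and $\nu$ degrees of freedom.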

The data ellipse for the predictors is:

- .

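The ellipse

$$\{\beta : (\beta - \hat{\beta})'\, X'X\, (\beta - \hat{\beta}) \le s^2 F_{\alpha;\,1,\,\nu}\}$$

where $F_{\alpha;\,1,\,\nu} = t_{\alpha/2,\,\nu}^2$ is the upper $\alpha$ quantile of the $F$ distribution with 1 and $\nu$ degrees of freedom, has the property that its projections onto 1-dimensional axes produce 100(1 − α)% confidence intervals for the corresponding parameters.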
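The shadow property can be checked numerically. The sketch below, in plain Python, traces the boundary of a 2-dimensional ellipse $\{\beta : (\beta - b)'V^{-1}(\beta - b) \le c\}$ and compares its extent along each axis with $\sqrt{c\,V_{ii}}$. Here `v` stands in for $s^2 (X'X)^{-1}$ and `c` for the squared radius; the numbers and the function name `axis_shadows` are illustrative assumptions, not values from the page.

```python
import math


def axis_shadows(center, v, c, steps=3600):
    """Trace the boundary of {beta : (beta - center)' V^{-1} (beta - center) <= c}
    for a 2x2 symmetric positive-definite V and return the (min, max) extent
    of the boundary along each coordinate axis."""
    # Cholesky factor G of V (V = G G'), so boundary points are center + sqrt(c) G u
    # with u running over the unit circle.
    g11 = math.sqrt(v[0][0])
    g21 = v[0][1] / g11
    g22 = math.sqrt(v[1][1] - g21 ** 2)
    r = math.sqrt(c)
    xs, ys = [], []
    for k in range(steps):
        t = 2 * math.pi * k / steps
        xs.append(center[0] + r * g11 * math.cos(t))
        ys.append(center[1] + r * (g21 * math.cos(t) + g22 * math.sin(t)))
    return (min(xs), max(xs)), (min(ys), max(ys))


center = [1.0, -2.0]              # hypothetical estimates
v = [[4.0, 1.2], [1.2, 1.0]]      # hypothetical stand-in for s^2 (X'X)^{-1}
c = 3.0                           # hypothetical squared radius

(x_lo, x_hi), (y_lo, y_hi) = axis_shadows(center, v, c)
half_x = math.sqrt(c * v[0][0])   # predicted half-width of the shadow on axis 1
half_y = math.sqrt(c * v[1][1])   # predicted half-width of the shadow on axis 2
print((x_hi - x_lo) / 2, half_x)
print((y_hi - y_lo) / 2, half_y)
```

When `c` is set to $t_{\alpha/2,\nu}^2$, the half-width $\sqrt{c\,V_{ii}}$ is $t_{\alpha/2,\nu}$ times the standard error of the $i$-th coefficient, which is the ordinary confidence interval.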

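In general, the ellipse

$$\{\beta : (\beta - \hat{\beta})'\, X'X\, (\beta - \hat{\beta}) \le d\, s^2 F_{\alpha;\,d,\,\nu}\}$$

(where $F_{\alpha;\,d,\,\nu}$ is the upper $\alpha$ quantile of the $F$ distribution with $d$ and $\nu$ degrees of freedom) has projections that are Scheffé confidence regions with coverage probability at least 100(1 − α)% when the space of parameters estimated has been selected from a space of dimension $d$.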
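For reference, the radii involved can be sketched as follows, assuming $F_{\alpha;\,d,\,\nu}$ denotes the upper $\alpha$ quantile of the $F$ distribution with $d$ and $\nu$ degrees of freedom and $a$ is any coefficient vector in the selected $d$-dimensional space. Both symbols are assumptions introduced here, not necessarily the page's own notation.

$$
r_d = \sqrt{d\,F_{\alpha;\,d,\,\nu}}, \qquad
a'\hat{\beta} \;\pm\; r_d\, s\,\sqrt{a'(X'X)^{-1}a}, \qquad
r_1 = \sqrt{F_{\alpha;\,1,\,\nu}} = t_{\alpha/2,\,\nu}.
$$

With $d = 1$ the interval reduces to the ordinary $t$ interval; a larger $d$ widens every shadow in exchange for protection over the whole $d$-dimensional space.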

The following figures show the projection of a 3-dimensional confidence ellipsoid onto 2-dimensional and 1-dimensional subspaces.

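The 3-dimensional ellipsoid is (using the notation above):

$$\{\beta \in \mathbb{R}^3 : (\beta - \hat{\beta})'\, X'X\, (\beta - \hat{\beta}) \le 3\, s^2 F_{\alpha;\,3,\,\nu}\}$$

The 1- and 2-dimensional ellipses have the form:

$$\{\gamma : (\gamma - L\hat{\beta})'\, [L (X'X)^{-1} L']^{-1}\, (\gamma - L\hat{\beta}) \le 3\, s^2 F_{\alpha;\,3,\,\nu}\}$$

where $L$ is the appropriate transformation. For example, to produce the ellipse for the last two coefficients,

$$L = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$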
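To illustrate the transformation step, the sketch below projects the shape matrix of a 3-dimensional ellipsoid onto lower-dimensional subspaces by computing $L V L'$ in plain Python. The name `L` for the transformation, the helper `project_shape`, and the matrix `V` (a stand-in for $(X'X)^{-1}$) are assumptions made for this example. For the last two of three coefficients, one choice of transformation is the selection matrix with rows $(0, 1, 0)$ and $(0, 0, 1)$.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def project_shape(L, V):
    """Shape matrix L V L' of the shadow of an ellipsoid with shape matrix V."""
    return matmul(matmul(L, V), transpose(L))


# Hypothetical symmetric positive-definite stand-in for (X'X)^{-1}.
V = [[2.0, 0.3, -0.1],
     [0.3, 1.0, 0.4],
     [-0.1, 0.4, 0.5]]

L_last_two = [[0, 1, 0],
              [0, 0, 1]]     # selects the last two coefficients
L_last = [[0, 0, 1]]         # selects the last coefficient only

shape_2d = project_shape(L_last_two, V)
shape_1d = project_shape(L_last, V)
print(shape_2d)
print(shape_1d)
```

For a selection matrix the result is simply the corresponding block of `V`; a general `L` (for example a contrast between two coefficients) goes through the same function unchanged.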

The following shows in green the 2-dimensional joint ellipse for the two slope parameters, 'b.Weight' and 'b.Height'. Also shown in blue are the 1-dimensional ordinary confidence intervals and the 2-dimensional ellipse with the Scheffé radius for dimension 1, whose shadows produce the 1-dimensional ordinary confidence intervals.

The green ellipse is:

The blue ellipse is:

and the blue intervals are ordinary 95% confidence intervals and shadows of the blue ellipse:

where

denotes the partial standard deviation of $X_i$ adjusting for the $X$s other than $X_i$.

Any subset of these confidence regions has *joint* coverage probability of at least 95%. Equivalently, any one of these confidence regions has coverage probability of at least 95% even if the subspace for the parameters was selected as a subspace of this 3-dimensional space after seeing the data. Thus these regions provide protection against 'fishing' or 'data dredging' within a specified space of parameters.

For a general non-technical treatment of multiple testing and adjusting for 'data dredging' of hypotheses, see: Bender, R., Lange, S. (2001) "Adjusting for multiple testing-when and how?", *Journal of Clinical Epidemiology* **54** 343–349 (*http://resolver.scholarsportal.info.ezproxy.library.yorku.ca/resolve/08954356/v54i0004/343_afmtah&form=pdf&file=file.pdf*)

For pairwise comparisons see:

Jaccard J, Becker MA, Wood G. Pairwise multiple comparison procedures: a review. Psychol Bull 1984;96:589–96.

Seaman MA, Levin JR, Serlin RC. New developments in pairwise multiple comparisons: some powerful and practicable procedures. Psychol Bull 1991;110:577–86. (*http://resolver.scholarsportal.info/resolve/00332909/v110i0003/577_ndipmcspapp&form=pdf&file=file.pdf*)

The figures in this page were produced with a script written in R.