# Statistics: Propensity scores

### From MathWiki


## A few notes on propensity scores

The propensity score is the predicted probability of being in a treatment group rather than a control group, given a set of covariates. We can think of an analogue for a continuous predictor variable: its conditional expectation given the other covariates. Although the resulting prediction is not a propensity score in the strict sense, it nevertheless has some linear properties that are similar to those of the propensity score.

I am sure that the following results are well known and I would be very interested in being made aware of a source.

## Basic concept

The idea of the propensity score is based on the idea of a coarsest conditioning partition.
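For a linear analogue, consider a regression of a vector of responses, $Y$, on two sets of predictor variables contained in two matrices $X_1$ and $X_2$. The vector of least-squares regression coefficients on $X_1$ is $\hat\beta_1$, where

$$Y = X_1\hat\beta_1 + X_2\hat\beta_2 + e$$

with $X_1'e = 0$ and $X_2'e = 0$.

Let $Q_2 = I - P_2$, with $P_2 = X_2(X_2'X_2)^{-1}X_2'$, be the projection matrix onto the orthogonal complement of $\operatorname{span}(X_2)$.

Then:

$$Q_2Y = Q_2X_1\hat\beta_1 + Q_2X_2\hat\beta_2 + Q_2e = Q_2X_1\hat\beta_1 + e$$

because $Q_2X_2 = 0$ and $Q_2e = e$.

Now $(Q_2X_1)'e = X_1'Q_2e = X_1'e = 0$, since $Q_2e = e$ (because $X_2'e = 0$) and $X_1'e = 0$.

Thus, $\hat\beta_1$ is the regression coefficient of $Q_2Y$ on $Q_2X_1$.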

This is the basis of added-variable plots and an early theorem of Econometrics known as the Frisch-Waugh-Lovell theorem. [Thanks to Barry Smith]
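The partialling-out result can be checked numerically. The following is a minimal sketch on simulated data, assuming NumPy is available; the helper names `projection` and `fwl_coefficients` and the simulated design are illustrative choices, not part of the original notes. It compares the coefficients on the first block of predictors from the full regression with those obtained after projecting both the response and that block onto the orthogonal complement of the second block.

```python
import numpy as np

def projection(X):
    """Orthogonal projection matrix onto the column space of X."""
    return X @ np.linalg.pinv(X)

def fwl_coefficients(Y, X1, X2):
    """Return (full, partial): the X1 coefficients from regressing Y on [X1, X2],
    and the coefficients from regressing Q2 @ Y on Q2 @ X1."""
    X = np.column_stack([X1, X2])
    full = np.linalg.lstsq(X, Y, rcond=None)[0][: X1.shape[1]]
    Q2 = np.eye(len(Y)) - projection(X2)
    partial = np.linalg.lstsq(Q2 @ X1, Q2 @ Y, rcond=None)[0]
    return full, partial

# Simulated data (hypothetical): X1 is correlated with X2 so the adjustment matters.
rng = np.random.default_rng(0)
n = 200
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
X1 = X2 @ rng.normal(size=(3, 2)) + rng.normal(size=(n, 2))
Y = X1 @ np.array([1.0, -2.0]) + X2 @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

full, partial = fwl_coefficients(Y, X1, X2)
print(np.allclose(full, partial))
```

The two coefficient vectors should agree up to floating-point error, whatever the simulated values are, as long as the combined design matrix has full column rank.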

We could take this a few steps further and show that the residuals are the same and that $R^2$ in this regression is the same as the partial $R^2$ in the multiple regression, etc. But all we need to do here is to contemplate the resulting formula for $\hat\beta_1$:

$$\hat\beta_1 = (X_1'Q_2X_1)^{-1}X_1'Q_2Y$$

The formula reveals that, given $X_1$, $\hat\beta_1$ depends on $X_2$ only through $Q_2X_1$ or, equivalently, $P_2X_1$, since $Q_2X_1 = X_1 - P_2X_1$.

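In other words, if we replace $X_2$ with a different set of variables, $Z$, say, then $\hat\beta_1$ will have the same value if $P_ZX_1 = P_2X_1$ (with $P_Z$ the projection matrix onto the column space of $Z$), that is, if the predicted values from regressing $X_1$ on $Z$ are the same as those from regressing $X_1$ on $X_2$.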

Now, the smallest such space is $\operatorname{span}(\hat X_1)$, where $\hat X_1 = P_2X_1$ is the least-squares predictor of $X_1$ regressing on $X_2$, i.e. the linear analogue to the propensity score.
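As a numerical illustration of this point, the sketch below (simulated data, NumPy assumed; `coef_on_x1` and all variable names are hypothetical) regresses a single predictor on five covariates, keeps only the fitted values, and then compares the coefficient obtained when controlling for all five covariates with the one obtained when controlling for the fitted values alone.

```python
import numpy as np

def coef_on_x1(Y, X1, Z):
    """Least-squares coefficients on X1 when Y is regressed on [X1, Z]."""
    X = np.column_stack([X1, Z])
    return np.linalg.lstsq(X, Y, rcond=None)[0][: X1.shape[1]]

# Simulated data (hypothetical): one predictor of interest, five covariates.
rng = np.random.default_rng(1)
n = 300
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
X1 = X2 @ rng.normal(size=(5, 1)) + rng.normal(size=(n, 1))
Y = 2.0 * X1[:, 0] + X2 @ rng.normal(size=5) + rng.normal(size=n)

# Linear analogue of the propensity score: fitted values of X1 regressed on X2.
X1_hat = X2 @ np.linalg.lstsq(X2, X1, rcond=None)[0]

beta_full = coef_on_x1(Y, X1, X2)       # control for all of X2
beta_score = coef_on_x1(Y, X1, X1_hat)  # control for the single column X1_hat
print(np.allclose(beta_full, beta_score))
```

Only the point estimate is compared here. The residuals of the two regressions differ in general, so standard errors need not agree.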

## Balancing spaces

We can call any space represented by a basis matrix, $Z$, that produces the same residuals as $X_2$ does when $X_1$ is regressed on it, a 'balancing' space [actually this is an inappropriate borrowing from the expression used in the context of propensity scores. I'm sure these concepts are well known, probably in Econometrics ... help!].

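It can be shown that $Z$ is a balancing space if and only if

$$\operatorname{span}(P_2X_1) \subseteq \operatorname{span}(Z) \subseteq \operatorname{span}(Q_2X_1)^{\perp}$$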
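The defining property of a balancing space (same residuals for the predictor of interest as when all the covariates are used) is easy to test numerically. The sketch below assumes NumPy and simulated data; `fitted`, `is_balancing` and the candidate bases are illustrative names and choices, not from the original notes. It tries three candidates: the fitted values alone, the fitted values plus an extra vector constructed to be orthogonal to the residuals, and a subset of the covariates.

```python
import numpy as np

def fitted(Z, X1):
    """Fitted values from the least-squares regression of X1 on Z."""
    return Z @ np.linalg.lstsq(Z, X1, rcond=None)[0]

def is_balancing(Z, X1, X2, tol=1e-8):
    """True if regressing X1 on Z leaves the same residuals as regressing X1 on X2."""
    return bool(np.allclose(X1 - fitted(Z, X1), X1 - fitted(X2, X1), atol=tol))

# Simulated data (hypothetical).
rng = np.random.default_rng(2)
n = 100
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
X1 = X2 @ rng.normal(size=(4, 1)) + rng.normal(size=(n, 1))
X1_hat = fitted(X2, X1)

# A random vector, generally outside span(X2), made orthogonal to the residuals of X1.
resid = X1 - X1_hat
w = rng.normal(size=(n, 1))
w = w - fitted(resid, w)

score_only = is_balancing(X1_hat, X1, X2)
score_plus_w = is_balancing(np.column_stack([X1_hat, w]), X1, X2)
subset_of_x2 = is_balancing(X2[:, :2], X1, X2)
print(score_only, score_plus_w, subset_of_x2)
```

The first two candidates are expected to pass and the third to fail: a space balances when it contains the fitted values and is orthogonal to the residuals, and it need not lie inside the span of the original covariates.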

## Better proofs

There are better ways of seeing this that reveal how the linear analogue of the propensity score allows $\operatorname{span}(X_2)$ to be decomposed into a sum of two orthogonal subspaces.
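One decomposition consistent with this remark can be sketched as follows. This is an assumption about what is intended, not a statement of the original argument; the notation $\hat X_1 = P_2X_1$ for the fitted values and $\mathcal{N}$ for the second subspace is introduced here for illustration.

```latex
\begin{aligned}
\operatorname{span}(X_2) &= \operatorname{span}(\hat X_1) \oplus \mathcal{N},
\qquad
\mathcal{N} = \{\, v \in \operatorname{span}(X_2) : v'\hat X_1 = 0 \,\}, \\
v \in \operatorname{span}(X_2) &\;\Longrightarrow\;
v'\hat X_1 = v'P_2X_1 = (P_2v)'X_1 = v'X_1, \\
\text{so}\quad
\mathcal{N} &= \{\, v \in \operatorname{span}(X_2) : v'X_1 = 0 \,\}.
\end{aligned}
```

Under this reading, $\mathcal{N}$ is orthogonal to both $X_1$ and $\hat X_1$, so dropping it from the regression leaves the coefficient on $X_1$ unchanged; only the component $\operatorname{span}(\hat X_1)$ matters for that coefficient.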

## Broader approaches

This discussion is based on the assumption of a linear model. The concept of propensity scores is developed in a broader context in which the relationship between $Y$ and $X_1$ controlling for $X_2$ is not necessarily linear. Still, we only need to include the prediction (not necessarily linear) of $X_1$ from $X_2$.

But then, we can't merely treat this prediction as a linear predictor. We need to condition on its actual values, i.e. treat it as a categorical variable or a suitable approximation: intervals as categories or a non-parametric fit.
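The "intervals as categories" option can be sketched for the familiar binary-treatment case. Everything below is an illustrative assumption rather than part of the original notes: the data are simulated, the treatment is binary, the score is the true treatment probability (known here by construction, whereas in practice it would be estimated), and `stratified_effect` is a hypothetical helper. The score is cut into ten quantile intervals, the treated-minus-control difference is computed within each interval, and the differences are averaged with weights proportional to stratum size.

```python
import numpy as np

def stratified_effect(y, t, score, n_strata=10):
    """Average within-stratum treated-minus-control mean difference,
    with strata formed from quantile intervals of the score."""
    edges = np.quantile(score, np.linspace(0, 1, n_strata + 1))
    # Use interior edges only, so labels run from 0 to n_strata - 1.
    labels = np.digitize(score, edges[1:-1])
    effects, weights = [], []
    for k in range(n_strata):
        in_k = labels == k
        treated, control = in_k & (t == 1), in_k & (t == 0)
        if treated.any() and control.any():  # skip strata lacking one group
            effects.append(y[treated].mean() - y[control].mean())
            weights.append(in_k.sum())
    return np.average(effects, weights=weights)

# Simulated data (hypothetical): two covariates drive both treatment and outcome.
rng = np.random.default_rng(3)
n, tau = 20000, 1.0
xa, xb = rng.normal(size=n), rng.normal(size=n)
score = 1.0 / (1.0 + np.exp(-(xa + 0.5 * xb)))  # true P(T = 1 | x), known by construction
t = (rng.uniform(size=n) < score).astype(int)
y = tau * t + 2.0 * xa - xb + rng.normal(size=n)

naive = y[t == 1].mean() - y[t == 0].mean()
stratified = stratified_effect(y, t, score)
print(naive, stratified)
```

With a finite number of intervals some imbalance remains within each stratum, so the stratified estimate is only approximately free of confounding; it should nonetheless sit much closer to the simulated effect `tau` than the naive difference in means does.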

## References (annotated)

- G. W. Imbens (2000). The role of the propensity score in estimating dose-response functions. Biometrika 87(3):706-710. doi:10.1093/biomet/87.3.706