Statistics: Propensity scores
A few notes on propensity scores
The propensity score is the predicted probability of being in a treatment group rather than a control group. A natural analogue for a continuous predictor variable is its conditional expectation given the other covariates. Although the resulting prediction is not a propensity score in the strict sense, it nevertheless has some linear properties that are similar to those of the propensity score.
I am sure that the following results are well known and I would be very interested in being made aware of a source.
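The idea of the propensity score is based on the idea of a coarsest conditioning partition. For a linear analogue, consider a regression of a vector of responses, $Y$, on two sets of predictor variables contained in two matrices $X_1$ and $X_2$. The vector of least-squares regression coefficients on $X_1$ is $\hat\beta_1$, where $Y = X_1\hat\beta_1 + X_2\hat\beta_2 + e$ and $e$ is the vector of least-squares residuals.
Let $Q_2 = I - P_2$ be the projection matrix onto the orthogonal complement of $\operatorname{span}(X_2)$, where $P_2$ is the projection matrix onto $\operatorname{span}(X_2)$.
Now $Q_2 Y = Q_2 X_1 \hat\beta_1 + e$, since $Q_2 X_2 = 0$ and $Q_2 e = e$, and $(Q_2 X_1)^{\top} e = X_1^{\top} e = 0$, since $e$ is orthogonal to the columns of both $X_1$ and $X_2$.
Thus, $\hat\beta_1 = (X_1^{\top} Q_2 X_1)^{-1} X_1^{\top} Q_2 Y$ is the regression coefficient of $Q_2 Y$ on $Q_2 X_1$.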
This is the basis of added-variable plots and an early theorem of Econometrics known as the Frisch-Waugh-Lovell theorem. [Thanks to Barry Smith]
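The partial-regression identity above can be checked numerically. The following is a minimal sketch on simulated data, assuming NumPy; the names `ols` and `residualize`, the sample size, and the coefficients used to generate the data are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data (illustrative assumption): X2 holds an intercept and 3 covariates
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
X1 = (X2[:, 1:] @ np.array([0.5, -1.0, 0.3]) + rng.normal(size=n)).reshape(-1, 1)
Y = 2.0 * X1[:, 0] + X2 @ np.array([1.0, 0.5, 0.5, -0.5]) + rng.normal(size=n)


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def residualize(A, X):
    """Residuals of A after least-squares projection onto span(X), i.e. Q A."""
    return A - X @ ols(X, A)


# Full regression of Y on [X1, X2]; the first coefficient belongs to X1
X_full = np.hstack([X1, X2])
beta_full = ols(X_full, Y)
beta1_full = beta_full[0]

# Partial regression: Q2 Y on Q2 X1
Q2X1 = residualize(X1, X2)
Q2Y = residualize(Y, X2)
beta1_partial = ols(Q2X1, Q2Y)[0]

# Since Q2 is symmetric and idempotent, regressing the raw Y on Q2 X1
# gives the same coefficient
beta1_raw_y = ols(Q2X1, Y)[0]

# The residuals of the two regressions coincide as well
resid_full = Y - X_full @ beta_full
resid_partial = Q2Y - Q2X1[:, 0] * beta1_partial

print(beta1_full, beta1_partial, beta1_raw_y)
```

The three printed coefficients should agree up to floating-point error, and `resid_full` should match `resid_partial`.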
We could take this a few steps further and show how the residuals are the same and $R^2$ is the same as the partial $R^2$ in the multiple regression, etc. But all we need to do here is to contemplate the resulting formula for $\hat\beta_1$:
$$\hat\beta_1 = (X_1^{\top} Q_2 X_1)^{-1} X_1^{\top} Q_2 Y$$
The formula reveals that, given $X_1$, $\hat\beta_1$ depends on $X_2$ only through $Q_2 X_1$ or, equivalently, $P_2 X_1$, since $Q_2 X_1 = X_1 - P_2 X_1$.
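In other words, if we replace $X_2$ with a different set of variables, $Z$, say, then $\hat\beta_1$ will have the same value if $P_Z X_1 = P_2 X_1$, where $P_Z$ is the projection matrix onto $\operatorname{span}(Z)$; that is, if the predicted values from regressing $X_1$ on $Z$ are the same as those from regressing $X_1$ on $X_2$.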
Now, the smallest such space is $\operatorname{span}(\hat{X}_1)$, where $\hat{X}_1 = P_2 X_1$ is the least-squares predictor of $X_1$ regressing on $X_2$, i.e. the linear analogue to the propensity score.
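As a numerical illustration of this point, the sketch below replaces the full set of covariates first by the single fitted vector from regressing the first predictor on them, and then by a larger matrix that reproduces the same fitted values. It assumes NumPy and simulated data with a single-column first predictor; the names `x1_hat`, `w` and `Z` are hypothetical and chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data (illustrative assumption): one exposure x1, covariates X2 with intercept
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
x1 = X2 @ np.array([0.2, 0.5, -1.0, 0.3]) + rng.normal(size=n)
y = 2.0 * x1 + X2 @ np.array([1.0, 0.5, 0.5, -0.5]) + rng.normal(size=n)


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Coefficient on x1 when adjusting for all of X2
beta1_full = ols(np.column_stack([x1, X2]), y)[0]

# Linear analogue of the propensity score: fitted values of x1 regressed on X2
x1_hat = X2 @ ols(X2, x1)

# Adjusting for the single column x1_hat instead of X2
beta1_score = ols(np.column_stack([x1, x1_hat]), y)[0]

# A larger alternative: add a column w that is orthogonal to the residual
# x1 - x1_hat, so regressing x1 on Z = [x1_hat, w] still returns x1_hat
q2x1 = x1 - x1_hat
r = rng.normal(size=n)
w = r - q2x1 * (q2x1 @ r) / (q2x1 @ q2x1)
Z = np.column_stack([x1_hat, w])
z_fit = Z @ ols(Z, x1)
beta1_z = ols(np.column_stack([x1, Z]), y)[0]

print(beta1_full, beta1_score, beta1_z)
```

Only the coefficient on the first predictor is preserved here. The fit to the response is generally worse when adjusting for `x1_hat` alone, because the column space of the reduced design is contained in that of the full design.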
We can call any space represented by a basis matrix, $Z$, that produces the same residuals when $X_1$ is regressed on it, a 'balancing' space [actually this is an inappropriate borrowing from the expression used in the context of propensity scores. I'm sure these concepts are well known, probably in econometrics ... help!].
It can be shown that is a balancing space if and only if
There are better ways of seeing this that reveal how the linear analogue of the propensity score allows $\operatorname{span}(X_2)$ to be decomposed into a sum of two orthogonal subspaces.
This discussion is based on the assumption of a linear model. The concept of propensity scores is developed in a broader context in which the relationship between the response $Y$ and $X_1$, controlling for $X_2$, is not necessarily linear. Still, we only need to include the prediction (not necessarily linear) of $X_1$ from $X_2$.
But then, we can't merely treat the prediction $\hat{X}_1$ as a linear predictor. We need to condition on the actual values of $\hat{X}_1$, i.e. treat it as a categorical variable or use a suitable approximation: intervals as categories or a non-parametric fit.
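The "intervals as categories" approximation can be sketched as follows. This is only an illustration of the mechanics, assuming NumPy and simulated data: the variable `score` stands in for a predicted value that is taken as already available, and the choice of ten quantile intervals is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Simulated data (illustrative assumption): `score` plays the role of the
# predicted value of the exposure given the covariates
score = rng.normal(size=n)
x1 = score + rng.normal(size=n)
# The response depends on the score non-linearly
y = 2.0 * x1 + np.sin(2.0 * score) + rng.normal(size=n)

# Cut the score into quantile intervals and treat the interval as a category
n_bins = 10
edges = np.quantile(score, np.linspace(0, 1, n_bins + 1)[1:-1])  # interior cut points
bin_index = np.digitize(score, edges)                            # integers 0 .. n_bins - 1
dummies = (bin_index[:, None] == np.arange(n_bins)).astype(float)

# Regress y on x1 plus one indicator per interval; the indicators sum to one
# in every row, so no separate intercept column is needed
design = np.column_stack([x1, dummies])
beta = np.linalg.lstsq(design, y, rcond=None)[0]
beta1_binned = beta[0]

print(beta1_binned)
```

With finer intervals the score varies less within each category, at the cost of fewer observations per category. A non-parametric fit in the score is the smooth counterpart of the same idea.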
- G. W. Imbens (2000). The role of the propensity score in estimating dose-response functions. Biometrika 87(3): 706-710. doi:10.1093/biomet/87.3.706