SCS Seminars: Measurement error

At the first meeting of the seminar on October 20, we had a brief discussion of measurement error, its impact on regression coefficients, and the effect of adjusting for it in a structural equation model (SEM). The omnipresence of measurement error has fundamental consequences for causal inference with observational data: not only do we need to identify the relevant confounding factors, we also need to measure them with relatively little error.
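As a sketch of why error in a predictor matters, consider the textbook case of classical measurement error in one of two predictors. The notation below is an assumption introduced for illustration and was not part of the seminar discussion: the response y depends linearly on x_1 and x_2, but instead of x_1 we observe w = x_1 + \delta, where \delta has mean 0, variance \sigma_{\delta}^2, and is uncorrelated with x_1, x_2 and \varepsilon. Let b_1 and b_2 be the least-squares coefficients from regressing y on w and x_2.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon,
  \qquad w = x_1 + \delta, \quad \operatorname{Var}(\delta) = \sigma_{\delta}^2 \\
\sigma_{1\cdot 2}^2 &= \operatorname{Var}(x_1) - \frac{\operatorname{Cov}(x_1, x_2)^2}{\operatorname{Var}(x_2)},
  \qquad \lambda = \frac{\sigma_{1\cdot 2}^2}{\sigma_{1\cdot 2}^2 + \sigma_{\delta}^2},
  \qquad \gamma = \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_2)} \\
\operatorname{plim} b_1 &= \lambda \, \beta_1 \\
\operatorname{plim} b_2 &= \beta_2 + (1 - \lambda) \, \beta_1 \, \gamma
\end{align*}
```

Under these assumptions, as \sigma_{\delta}^2 grows, \lambda tends to 0, so b_1 shrinks toward 0 while b_2 tends to \beta_2 + \beta_1 \gamma, which is the slope from regressing y on x_2 alone. In other words, a badly measured confounder provides almost no adjustment.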

The PDF file Visualizing Regression (http://www.math.yorku.ca/~georges/Files/VisualizingRegression.pdf) contains 'freeze-frame animations' on pp. 35--52 that illustrate how the estimated regression coefficients change as successively larger errors are added to one of the predictor variables in a regression with two predictors. (Warning: the file is large and the quality of the graphics is very poor; you might need to adjust the brightness of your screen to see some relevant gridlines.) What's missing is a visualization of the consequences of attempting to correct for measurement error with an SEM, using either the correct or an incorrect value for the variance of the measurement error.

John Fox's book Applied Regression Analysis, Linear Models, and Related Methods (http://socserv.socsci.mcmaster.ca/jfox/Books/Applied-Regression/) (Sage, 1997) has an exercise that uses the 'Duncan' data set readily available through his 'car' package in R.

Exercise 6.16 on p. 133
Recall Duncan's regression of occupational prestige on the educational and income levels of occupations. [...] Following Duncan, regress prestige on education and income. As well, perform a simple regression of prestige on income alone. Then add random measurement errors to education. Sample these measurement errors from a normal distribution with mean 0, repeating the exercise for each of the following measurement error variances \sigma_{\delta}^2 = 10^2, 25^2, 50^2, 100^2. In each case, recompute the regression of prestige on income and education. Then, treating the initial multiple regression as corresponding to \sigma_{\delta}^2 = 0, plot the coefficients of education and income as a function of \sigma_{\delta}^2. What happens to the education coefficient as measurement error in education grows? What happens to the income coefficient?
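The mechanics of the exercise can be tried out in a small simulation. The sketch below is in Python and uses synthetic data, not the Duncan data set: the sample size, the coefficients, and the distributions of the variables are arbitrary illustrative assumptions, and `ols2` and `simple_slope` are hypothetical helpers written for this example. The measurement error standard deviations 0, 10, 25, 50 and 100 match the variances listed in the exercise.

```python
import random


def ols2(y, x1, x2):
    """Slopes (b1, b2) from least-squares regression of y on x1 and x2, with intercept."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simple_slope(y, x):
    """Slope from least-squares regression of y on x alone, with intercept."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Synthetic stand-in data (assumed values, NOT the Duncan data).
rng = random.Random(2024)
n = 5000
education = [rng.gauss(50, 30) for _ in range(n)]
income = [20 + 0.5 * e + rng.gauss(0, 20) for e in education]
prestige = [-6 + 0.55 * e + 0.6 * i + rng.gauss(0, 13)
            for e, i in zip(education, income)]

# Simple regression of prestige on income alone, for comparison.
income_only = simple_slope(prestige, income)

# Refit the multiple regression after adding error of increasing SD to education.
results = []
for sd in (0, 10, 25, 50, 100):
    noisy_education = [e + rng.gauss(0, sd) for e in education]
    b_edu, b_inc = ols2(prestige, noisy_education, income)
    results.append((sd, b_edu, b_inc))
    print(f"error SD = {sd:>3}   education = {b_edu:.3f}   income = {b_inc:.3f}")

print(f"income alone: {income_only:.3f}")
```

Plotting the second and third entries of each tuple in `results` against the squared error SD gives the kind of plot the exercise asks for. With the Duncan data the sample is much smaller (one draw of errors per variance), so the pattern will be noisier than in this large synthetic sample.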