# Team Gini

Hi, dear team members: I am very glad that we are in one group. Could I separate this assignment to each of us roughly. And we can help each other too.

Jing: section 1: 1(d), section 3: 1, section 5 and section 6 (b)
Jinling: section 1: 1(b), 3, 5, 7, 9.
Mary: section 1: 1(f), 11, section 2: 2.2, 2.5, 2.10.
Wei: Section 2: 3.3, 3.7, section 3: 3, 5, 7.
Jinling


## Section 1: 1(b), 1(d), 1(f), 3, 5, 7, 9, 11

1. b) Show that for any X, if GX = I then G is a generalized inverse of X. [Note that no such G will exist unless X is of full column rank.]

Proof:

Theorem (Moore 1920): G is a generalized inverse of X since XGX = X.

Now, X (mxn) is a matrix with full column rank n, if GX = I, G muxt be nxm.

Then XGX = X(GX) = XI = X.

According to Moore's theorem, G is a generalized inverse of X.

Mary 01:20, 12 Jun 2006 (EDT)

1(b)

Since GX = I

then XGX = X(GX) by theorem

= XI = X

so XGX = X

G is a generalized inverse of X by definition.

GM comment: It's good practice to reserve the notation X − 1 for a real inverses and to use X −  for generalized inverses GM comment: All matrices have a generalized inverse, often many in fact. But having a generalized inverse does NOT mean that a matrix is invertible. See the discussion page Talk:Team Gini for an easy counterexample


By Jinling

d) If A − 1 = (ZΔZ' + I) with Δ positive definite, then A = IZ − 1 + Z'Z) − 1Z'

solution: since Δ positive definite, A = (ZΔZ' + I) − 1

Using the Sherman-Morrison identity: (A + UDV) − 1 = A − 1A − 1U(D − 1 + VA − 1U) − 1VA − 1

we can get A = (I) − 1 − (I) − 1Z − 1 + Z(I) − 1Z') − 1Z'(I) − 1 = IZ − 1 + Z'Z) − 1Z'

By Jing

f) In general (X'AX) − 1X'A will not be equal to (X'X) − 1X'. There is however a special case where they are equal. Let A satisfy A − 1 = ZiTZi' + σ2I where $\mathrm{span}(Z_i) \subset \mathrm{span} ( X_i )$. Show that, in this case, (Xi'AXi) − 1Xi'A = (Xi'Xi) − 1Xi'.

[Hint: one way to show that two matrices are equal is to show that they have the same value over a basis for their domain. It is easy to show that the two matrices have the same value on span(Xi). If you can also show that they have the same value on the orthogonal complement of span(Xi), $\mathrm{span}^{\perp}(X_i)$, then the two matrices have the same value on their domain.]

Proof:

The span of a set of vectors is the set of all their linear combinations.

First, show that the two matrices have the same value on span(Xi).

(Xi'AXi) − 1Xi'AXi = I, (Xi'Xi) − 1Xi'Xi = I. So (Xi'AXi) − 1Xi'AXi = (Xi'Xi) − 1Xi'Xi.

Then, show that they have the same value on the orthogonal complement of span(Xi), $\mathrm{span}^{\perp}(X_i)$.

Since A satisfy A − 1 = ZiTZi' + σ2I where $\mathrm{span}(Z_i) \subset \mathrm{span} ( X_i )$, we can get A = σ − 2I − σ − 4Zi(T − 1 + σ − 2Zi'Zi) − 1Zi'.

Let $Y_i {\perp}X_i$. (Xi'AXi) − 1Xi'AYi = (Xi'AXi) − 1σ − 2Xi'IYi − (Xi'AXi) − 1σ − 4Xi'Zi(T − 1 + σ − 2Zi'Zi) − 1Zi'Yi = 0. Here we have $\mathrm{span}(Z_i) \subset \mathrm{span}$.

And (Xi'Xi) − 1Xi'Yi = 0. Therefore, (Xi'AXi) − 1Xi'AYi = (Xi'Xi) − 1Xi'Yi.

According to the hint, the two matrices have the same value on their domain.

Mary 05:05, 12 Jun 2006 (EDT)

3.Solution: The fixed effects model precision weights for $\hat{\gamma_i}^{W}$ under the fixed model:

$\hat{\gamma_i}^{W}= (X_i'X_i)^{-1}X_i'Y_i$

$Var(\hat{\gamma_i}^{W}))= (X_i'X_i)^{-1}{\sigma^2}$

$\hat{\gamma_i}^{W}/Var(\hat{\gamma_i}^{W}) = (X_i'X_i)^{-1}X_i'Y_i /(X_i'X_i)^{-1}{\sigma^2}= X_i'Y_i/{\sigma^2}$

The precision-weighted combination of the within-cluster estimated coefficients are:

$\sum(\hat{\gamma_i}^{W}/Var(\hat{\gamma_i}^{W})) / \sum{(Var(\hat{\gamma_i}^{W}))^{-1}})$

$= \sum{X_i' Y_i {\sigma^{-2}}}/ \sum(((X_i'X_i)^{-1}{\sigma^2})^{-1})= \sum{X_i' Y_i}/ \sum(X_i'X_i)=(X'X)^{-1}X'Y$

The marginal estimated coefficients:

$\hat{\gamma^{M}}= (X'X)^{-1} X'Y = \hat{\gamma_i}^{W}/Var(\hat{\gamma_i}^{W}) \sum{(Var(\hat{\gamma_i}^{W}))^{-1}})$

By Jinling

5.Solution: Assume that each Xi if of full column rank. The mixed model:

Yi = Xiγ + ZUi + εi

$Var(\hat{\gamma_i})= {\sigma}^{2} (X'X)^{-1} + {\Tau}$

$W_i = 1/Var(\hat{\gamma_i})$

the within-cluster estimated coefficients are:

$\hat{\gamma}^{W} =(X_i'X_i)^{-1} X_i'Y_i$

The precision-weighted combination of the within-cluster estimated coefficients are:

$(\sum{W_i})^{-1} \sum{{\hat{\gamma_i}}^{W}} W_i = \sum{(X_i'X_i)^{-1}X_i'W_iY_i }/\sum(W_i)=(X'WX)^{-1} X'WY$

The GLS mixed model estimated coefficients are:

$\hat{\gamma^{GLS}}=(X'WX)^{-1} X'WY= (\sum{W_i})^{-1} \sum{{\hat{\gamma_i}}^{W}} W_i$

By Jinling

7.solution

In questuion # 6,

$(X'X-kX'P_G X)\hat{\gamma}^{GLS}=X'X \hat{\gamma}^M-kX'P_GX\hat{\gamma}^B$

Now we treat the matrices involved in the formula in # 6 as if they were scalars, then

$\hat{\gamma}^{GLS}=((X'X-kX'P_G X))^{-1}(X'X \hat{\gamma}^M - kX'P_GX\hat{\gamma}^B)=\frac{a1}{a1+a2}\hat{\gamma}^M + \frac{a2}{a1+a2}\hat{\gamma}^B$

We can think $\hat{\gamma}^{GLS}$ is a linear combination of $\hat{\gamma}^M$ and $\hat{\gamma}^B$ and where a1 = ((X'XkX'PGX)) − 1X'X ,a2 = − ((X'XkX'PGX)) − 1kX'PGX are constants. Then $\hat{\gamma}^{GLS}$ can be thought as two parts: one is along the direction of $\hat{\gamma}^B$, the magnitude is $\frac{a2}{a1+a2}\hat{\gamma}^B =(1- \frac{a1}{a1+a2})\hat{\gamma}^B$, another one is along the direction of $\hat{\gamma}^M$, the magnitude is $\frac{a1}{a1+a2}\hat{\gamma}^M$. So, if $\frac{a1}{a1+a2}$ is large, namely ((X'XkX'PGX)) − 1X'X is large, then $\hat{\gamma}^{GLS}$ tends to the direction of $\hat{\gamma}^M$. If $\frac{a1}{a1+a2}$ is small, namely ((X'XkX'PGX)) − 1X'X is small, then $\hat{\gamma}^{GLS}$ tends to the direction of $\hat{\gamma}^B$.

By Jinling

9.Solution: For X of full column rank, We have $Y=X \hat{\gamma}^{GLS} + d$

$\hat{\gamma}^{GLS} = (X'WX)^{-1} XWY$

where X'Wd = 0 , W = cVar(Y) − 1

Suppose we have another GLS with $Y=X \hat{\gamma}^* + d$ where X'Wd = 0

then $d = Y - X \hat{\gamma}^*$

and $X'Wd = X'W (Y - X \hat{\gamma}^*)$ $= X'WY - X'W X\hat{\gamma}^*$ = 0

$X'W X \hat{\gamma}^* = X'WY$

$\hat{\gamma}^* = (X'WX)^{-1} XWY = \hat{\gamma}^{GLS}$

So the GLS fits uniquely for any c > 0.

By Jinling

11. For a random intercept model with clusters of equal size, express $\hat{\beta}^{GLS}$ as a weighted combination of $\hat{\beta}^A$ and $\hat{\beta}^B$.

Solution:

$\hat{\beta}^{GLS} = Q_G \hat{\beta}^A + P_G \hat{\beta}^B$

$\hat{\beta}^{GLS} = (X^C{}'WX^C)^{-1} X^C{}'WY^C, W = I - \frac{n}{\tau^2/\sigma^2 + n } P_G = I - k P_G$

$\hat{\beta}^A = (X^C{}' Q_G X^C)^{-1} X^C{}' Q_G Y^C, Q_G = I - P_G$

$\hat{\beta}^B = (X^C{}' P_G X^C)^{-1} X^C{}'P_GY^C, P_G = Z(Z'Z)^{-1}Z'$

We can write the GLS model as: $Y = 1 \hat{\alpha} + Q_G X^* \hat{\beta}^A + P_G X^*\hat{\beta}^B + e,$ where $e \perp L(1,P_G X^*, Q_G X^*)$

In order to show this is GLS solution, we need to show $\begin{bmatrix} 1'\\P_GX^*\\Q_GX^*\end{bmatrix} W e = 0.$

First, 1'We = 1'(IkPG)e = (1' − k1)e = c1'e = 0.

Second, X * 'PGWe = X * 'PG(IkPG)e = X * 'PGekX * 'PGe = 0 − k0 = 0.

Finally, X * 'QGWe = X * 'QG(IkPG)e = X * 'QGekX * 'PGe = 0 − k0 = 0.

Mary 05:32, 12 Jun 2006 (EDT)

## Section 2: 2.2, 2.5, 2.10, 3.3, 3.7

2.2 Solution

In hypothesis testing, the significance level is the criterion used for rejecting the null hypothesis. Exact distribution theory based on the F-distribution is to reject H0 when $-2log \wedge\;\ge\; N log (1+\frac{m-1}{N-m}F_{m-1,N-m,1-a}).$

Asymptotic theory tells us that the large-sample distribution of $-2log \wedge\;$ under H0 is chi-square with degrees of freedom equal to the difference in the number of parameters in the parameter space and the number under H0. The large-sample test would therefore be to reject H0 when $-2log \wedge\;\ge\;\chi_{m-1, 1-\alpha}^2$

If the asymptotic critical value is used instead of the exact critical value, no matter N=20, 50 or 100, the significance level is same: $\chi_{4, 1-\alpha}^2$

Mary 15:20, 15 Jun 2006 (EDT)

2.5 Solution

(a)$Var(\bar{y}_{i.})= \sigma_a^2+\sigma^2/n_i$

As we know, solutions $\dot{\mu}, \dot{\sigma^2}, \dot{\sigma_a^2}$ are ML estimators only if the triplet($\dot{\mu}, \dot{\sigma^2}, \dot{\sigma_a^2})$ is in the 3-space of ($\mu, \sigma^2, \sigma_a^2).$ And in ensuring that this is achieved, we need to consider the negativity problem. For each data set, equations (2.41), (2.42), and (2.43) have to be solved numerically. When $\dot{\sigma_a^2}< 0$, we have to let $\hat \sigma_a^2 = 0.$ Therefore, $\hat \sigma^2 = \frac{SST}{N}, \hat Var(\bar{y}_{i.})= \frac{\hat \sigma^2}{n_i}.$According to the equation (2.41), $\hat{\mu} = \frac{\sum_{i} \frac {\bar{y}_{i.}}{\hat Var(\bar{y}_{i.})}}{\sum_{i} \frac{1} {\hat Var(\bar{y}_{i.})} } = \frac{\sum_{i} \frac {\bar{y}_{i.}n_i}{\sigma^2}}{\sum_{i} \frac{n_i} {\sigma^2} } = \frac{\sum_{i} \bar{y}_{i.}n_i}{\sum_{i} n_i} = \bar {y}..$

(b) From (2.43), we get $\dot{\sigma_a^2}< 0$, but we can't use the negative value as an estimation of any variance. As we mentioned before, we need to consider the negativity problem, so we let $\hat \sigma_a^2 = 0.$ Thus, (2.43) is not used.

Mary 17:23, 15 Jun 2006 (EDT)

2.10 Solution

X and Y are random variables which follow bivariate normal distribution with correlation ρ. Suppose we know X is k standard deviations above its mean, but Y can't be observed. Then we could use that information of X to get a prediction of Y, and the predictor we use is the minimum mean squared error predictor of Y, based on X; it is E[Y|X], the condition mean of Y given X. We can proof that Y|X follows a Normal Distribution with $E[Y|X] = \mu _y + \rho \sigma_y \frac{X- \mu_x}{\sigma_x}.$ Since $\frac{X- \mu_x}{\sigma_x} = k,$ then we get $\frac{E(Y|X)- \mu_y}{\sigma_y} = \rho k.$

Mary 17:24, 15 Jun 2006 (EDT)

## Section 3: 1, 3, 5, 7. Fitting 'Basic' Mixed Models Using SAS and R

1. Null model (fixed intercept and random intercept)

SAS code:

      PROC MIXED ASYCOV CL;
CLASS SCHOOL;
MODEL MATHACH = / DDFM = SATTERTH ;
RANDOM INTERCEPT / SUBJECT = SCHOOL
TYPE=FA0(1) G S; /* Note that the '0' is a ZERO */
RUN;


What benefit might there be from using TYPE=FA0(1) instead of TYPE=UN? What is the estimated intraclass correlation using this model? Run the corresponding problem in R and compare output. Use R to produce trellis plots showing OLS within cluster fits, BLUPS and population-level fits.

Solution:

(1)The TYPE=UN (unstructured) option is useful for correlated random coefficient models.GM comment: I regret to say that I don't understand this paragraph. Perhaps more context and detail is necessary. /*Sorry George, I don't understand it too. I got the above explanation from SAS Help and Document in SAS software 9.0 package.*/ TYPE=FA0(1) here to request a G estimate that is constrained to be nonnegative definite.G containing variance components in a diagonal structure.In our case, the benefit of using TYPE=FA0(1) is to make variance components in a diagonal struction nonnegative.

(2)The intraclass correlation is a helpful diagnostic tool in determining if a multilevel modeling will be superior to a traditional method, such as regression or ANOVA. A rough rule of thumb is when the intraclass correlation is over .10, hidden clusters are present in your data, and a multilevel modeling model would be a more appropriate data analysis technique. GM comment: Can you provide the reference for this rule of thumb? In what way are the clusters hidden since you must presumably know what they are to compute the intraclass correlation? Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models by Judith D. Singer (http://www.gse.harvard.edu/~faculty/singer)

From the SAS output using TYPE=FA0(1), we can get τ00 = 3.0288 and σ2 = 39.8255 so $\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}=0.07$ Also,from the SAS output using TYPE=UNwe can getτ00 = 9.1814 and σ2 = 39.8248 so $\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}=0.187$ In sum, I think a multilevel modeling is useful in this model.

(3)R code for Null model:

 fit <- lme( mathach ~ 1, data= hs , random = ~ 1 | school )
summary(fit)
summary(fit)$coef  R output: GM comment: It's better to include the output in the wiki instead of using an image. That way it's possible to add comments, highlights, etc. To make output line up properly, just enclose it in <pre>, </pre> tags.  > summary(fit) Linear mixed-effects model fit by REML Data: hs AIC BIC logLik 11317.12 11333.47 -5655.562 Random effects: Formula: ~1 | school (Intercept) Residual StdDev: 3.030109 6.310685 Fixed effects: mathach ~ 1 Value Std.Error DF t-value p-value (Intercept) 12.85788 0.5048477 1680 25.46882 0 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -3.04431764 -0.76097490 0.04898448 0.76587527 2.42059025 Number of Observations: 1720 Number of Groups: 40  SAS output:  The SAS System 13:33 Wednesday, June 16, 2006 1 The CONTENTS Procedure Data Set Name WORK.HS Observations 1720 Member Type DATA Variables 9 Engine V9 Indexes 0 Created 14:02 Wednesday, June 16, 1993 Observation Length 72 Last Modified 14:02 Wednesday, June 16, 1993 Deleted Observations 0 Protection Compressed NO Data Set Type Sorted YES Label Data Representation WINDOWS Encoding wlatin1 Western (Windows) Engine/Host Dependent Information Data Set Page Size 8192 Number of Data Set Pages 16 First Data Page 1 Max Obs per Page 113 Obs in First Data Page 87 Number of Data Set Repairs 0 File Name C:\DOCUME~1\bsun\LOCALS~1\Temp\SAS Temporary Files\_TD1472\hs.sas7bdat Release Created 9.0000M0 Host Created WIN_PRO Alphabetic List of Variables and Attributes # Variable Type Len Format 4 FEMALE Num 8 BEST15.2 6 MATHACH Num 8 BEST15.2 9 MEANSES Num 8 BEST15.2 3 MINORITY Num 8 BEST15.2 1 ROWNAMES Char 4 2 SCHOOL Num 8 BEST15.2 8 SECTOR Num 8 BEST15.2 5 SES Num 8 BEST15.2 7 SIZE Num 8 BEST15.2 Sort Information Sortedby SCHOOL Validated YES Character Set ANSI The SAS System 13:33 Wednesday, June 16, 2006 2 The SUMMARY Procedure Lower Upper Variable Minimum Quartile Median Quartile Maximum SCHOOL 1317.00 2995.00 5783.00 8202.00 9586.00 MINORITY 0 0 0 0 1.0000000 FEMALE 0 0 1.0000000 1.0000000 1.0000000 SES -2.3280000 -0.4180000 0.1920000 0.7870000 1.8320000 MATHACH -2.8320000 7.7805000 13.6230000 18.7030000 24.9930000 SIZE 153.0000000 493.0000000 1068.00 1523.00 2403.00 SECTOR 0 0 0 1.0000000 1.0000000 MEANSES -0.7510000 -0.1010000 0.1985000 0.4480000 0.7590000 The SAS System 13:33 Wednesday, June 16, 2006 3 The Mixed Procedure Model Information Data Set WORK.HS Dependent Variable MATHACH Covariance Structure Factor Analytic Subject Effect SCHOOL Estimation Method REML Residual Variance Method Profile Fixed Effects SE Method Model-Based Degrees of Freedom Method Satterthwaite Class Level Information Class Levels Values SCHOOL 40 1317 1374 1436 1461 1909 2030 2336 2651 2658 2755 2995 3039 3152 3332 3377 3657 4511 4642 4868 5192 5783 5838 6074 6366 6415 6469 6808 6897 7101 7919 8202 8367 8775 8854 9021 9104 9158 9340 9347 9586 Dimensions Covariance Parameters 2 Columns in X 1 Columns in Z Per Subject 1 Subjects 40 Max Obs Per Subject 61 Observations Used 1720 Observations Not Used 0 Total Observations 1720 Iteration History Iteration Evaluations -2 Res Log Like Criterion 0 1 11538.61727936 1 2 11311.13751265 0.00000323 2 1 11311.12408759 0.00000000 Convergence criteria met. The SAS System 13:33 Wednesday, June 16, 2006 4 The Mixed Procedure Estimated G Matrix Row Effect SCHOOL Col1 1 Intercept 1317 9.1737 Covariance Parameter Estimates Cov Parm Subject Estimate Alpha Lower Upper FA(1,1) SCHOOL 3.0288 0.05 2.3924 3.9592 Residual 39.8255 0.05 37.2627 42.6634 Asymptotic Covariance Matrix of Estimates Row Cov Parm CovP1 CovP2 1 FA(1,1) 0.1503 -0.01085 2 Residual -0.01085 1.8898 Fit Statistics -2 Res Log Likelihood 11311.1 AIC (smaller is better) 11315.1 AICC (smaller is better) 11315.1 BIC (smaller is better) 11318.5 Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 1 227.49 <.0001 Solution for Random Effects Std Err Effect SCHOOL Estimate Pred DF t Value Pr > |t| Intercept 1317 0.2933 0.9875 1719 0.30 0.7665 Intercept 1374 -2.7094 1.1926 1719 -2.27 0.0232 Intercept 1436 4.7819 1.0173 1719 4.70 <.0001 Intercept 1461 3.5215 1.1249 1719 3.13 0.0018 Intercept 1909 1.3553 1.1926 1719 1.14 0.2559 Intercept 2030 -0.7138 0.9945 1719 -0.72 0.4730 Intercept 2336 3.3503 0.9945 1719 3.37 0.0008 Intercept 2651 -1.5917 1.0704 1719 -1.49 0.1372 Intercept 2658 0.4909 1.0094 1719 0.49 0.6268 Intercept 2755 3.3126 0.9945 1719 3.33 0.0009 Intercept 2995 -3.0262 1.0019 1719 -3.02 0.0026 Intercept 3039 3.4026 1.3215 1719 2.57 0.0101 Intercept 3152 0.3241 0.9611 1719 0.34 0.7360 Intercept 3332 1.2746 1.0704 1719 1.19 0.2339 Intercept 3377 -3.3482 1.0094 1719 -3.32 0.0009 Intercept 3657 -3.0750 0.9674 1719 -3.18 0.0015 Intercept 4511 0.5127 0.9270 1719 0.55 0.5802 Intercept 4642 1.6254 0.9118 1719 1.78 0.0748 Intercept 4868 -0.4857 1.1131 1719 -0.44 0.6626 Intercept 5192 -2.1197 1.1926 1719 -1.78 0.0757 Intercept 5783 0.2544 1.1778 1719 0.22 0.8290 Intercept 5838 0.7295 1.1501 1719 0.63 0.5260 Intercept 6074 0.8549 0.9377 1719 0.91 0.3621 Intercept 6366 2.6036 0.9270 1719 2.81 0.0050 Intercept 6415 -0.9235 0.9491 1719 -0.97 0.3307 Intercept 6469 5.2016 0.9323 1719 5.58 <.0001 Intercept 6808 -3.2510 1.0173 1719 -3.20 0.0014 Intercept 6897 2.0574 0.9806 1719 2.10 0.0360 Intercept 7101 -0.8727 1.1926 1719 -0.73 0.4644 Intercept 7919 1.7829 1.0804 1719 1.65 0.0991 Intercept 8202 -1.0191 1.1018 1719 -0.92 0.3551 Intercept 8367 -6.3394 1.5231 1719 -4.16 <.0001 Intercept 8775 -3.1097 0.9875 1719 -3.15 0.0017 Intercept 8854 -7.5886 1.1372 1719 -6.67 <.0001 Intercept 9021 1.7065 0.9377 1719 1.82 0.0690 Intercept 9104 3.6835 0.9433 1719 3.90 <.0001 Intercept 9158 -3.9862 0.9550 1719 -4.17 <.0001 Intercept 9340 -1.4607 1.1778 1719 -1.24 0.2151 Intercept 9347 0.6327 0.9323 1719 0.68 0.4975 Intercept 9586 1.8683 0.9218 1719 2.03 0.0428  Since this is a Null model, I don't think I can produce trellis plots showing OLS within cluster fits, BLUPS and population-level fits. by Jing 3. One-way ANCOVA with random effects  PROC MIXED; CLASS SCHOOL; MODEL MATHACH = SES / S DDFM = SATTERTH; RANDOM INTERCEPT / SUBJECT = SCHOOL TYPE = FA0(1); RUN;  Is the effect of SES significant? Run the corresponding analysis in R and compare output. Use R to produce trellis plots showing OLS within cluster fits, BLUPS and population-level fits.  R code: ##One-way ANCOVA with random effects hs <- read.csv("http://www.math.yorku.ca/~georges/Data/hs.csv") source("http://www.math.yorku.ca/~georges/R/fun.R") td ( new = T ) xqplot ( hs ) library(nlme) fit <- lme( mathach ~ ses, data= hs ,random = ~ 1 | school ) summary(fit) ## Use R to produce trellis plots showing OLS within cluster fits, BLUPS and population-level fits. hpred <- hs hpred$y0 <- predict(fit, level = 0)     # Population
hpred$y1 <- predict(fit, level = 1) # BLUPS xyplot( mathach ~ ses | school , hpred,skip=c(rep(F,25),T,T), panel = function(x, y, subscripts, ...) { panel.xyplot(x,y,...) panel.lmline(x,y,...,col='black',lwd=2) y0 <- hpred$y0[subscripts]
panel.lmline(x,y0,...,type='l',col='green',lwd=2)
y1 <- hpred$y1[subscripts] panel.lmline(x,y1,...,type='l',col='red',lwd=2) } )   ##Results from R Linear mixed-effects model fit by REML Data: hs AIC BIC logLik 11192.08 11213.88 -5592.041 Random effects: Formula: ~1 | school (Intercept) Residual StdDev: 2.198615 6.116201 Fixed effects: mathach ~ ses Value Std.Error DF t-value p-value (Intercept) 12.531101 0.3812171 1679 32.8713 0 ses 2.530339 0.2146264 1679 11.7895 0 Correlation: (Intr) ses -0.078 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -3.12326426 -0.73360355 0.03418091 0.76324884 2.70656225 Number of Observations: 1720 Number of Groups: 40  The effect of SES is significant because the p_vaule of SES is less than .0001.  The SAS System 00:21 Thursday, June 17, 2006 6 The Mixed Procedure Covariance Parameter Estimates Cov Parm Subject Estimate FA(1,1) SCHOOL 2.1985 Residual 37.4080 Fit Statistics -2 Res Log Likelihood 11184.1 AIC (smaller is better) 11188.1 AICC (smaller is better) 11188.1 BIC (smaller is better) 11191.5 Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 1 91.08 <.0001 Solution for Fixed Effects Standard Effect Estimate Error DF t Value Pr > |t| Intercept 12.5311 0.3812 34.4 32.87 <.0001 SES 2.5304 0.2146 1655 11.79 <.0001 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F SES 1 1655 139.00 <.0001  From the above outputs, I think the result from SAS is the same as the result from R.  OLS within cluster fits, BLUPS and population-level fits:  by Jing 5. Intercepts and slopes as outcomes  PROC MIXED ASYCOV CL; CLASS SCHOOL; MODEL MATHACH = SES SECTOR SES*SECTOR / S DDFM = SATTERTH; RANDOM INTERCEPT SES / /* try various options for TYPE */ SUBJECT = SCHOOL G GC; RUN;  Compare SAS results with the corresponding analysis in R. Use R to produce trellis plots showing OLS within cluster fits, BLUPS and population-level fits.  ## R code fit <- lme( mathach ~ ses+factor(sector)+ses*factor(sector), data= hs ,random = ~ 1 +ses | school ) ##R result: iteration limit reached without convergence (9)  SAS output:  The SAS System 00:21 Thursday, June 17, 2006 10 The Mixed Procedure Estimated G Matrix Row Effect SCHOOL Col1 Col2 1 Intercept 1317 3.5793 2 SES 1317 Estimated Chol(G) Matrix Row Effect SCHOOL Col1 Col2 1 Intercept 1317 1.8919 2 SES 1317 Covariance Parameter Estimates Cov Parm Subject Estimate Alpha Lower Upper Intercept SCHOOL 3.5793 0.05 2.1090 7.3715 SES SCHOOL 0 . . . Residual 37.3310 0.05 34.9244 39.9965 Asymptotic Covariance Matrix of Estimates Row Cov Parm CovP1 CovP2 CovP3 1 Intercept 1.2446 -0.07293 2 SES 3 Residual -0.07293 1.6668 Fit Statistics -2 Res Log Likelihood 11167.7 AIC (smaller is better) 11171.7 AICC (smaller is better) 11171.7 BIC (smaller is better) 11175.1 Solution for Fixed Effects Standard Effect Estimate Error DF t Value Pr > |t|  by Jing 7. Using the general linear hypothesis (SAS ESTIMATE statement) to test plausible questions: Consider the following mixed model:  PROC MIXED ASYCOV ASYCORR CL; CLASS SCHOOL; MODEL MATHACH = SES SECTOR FEMALE SES*SECTOR / S DDFM = SATTERTH; RANDOM SES FEMALE INT / SUBJECT = SCHOOL TYPE = FA0(3) G GC; ESTIMATE ‘Cath – Pub | low SES’ SECTOR 1 SES*SECTOR -2 / CL E; ESTIMATE ‘Cath – Pub | high SES’ SECTOR 1 SES*SECTOR +2 / CL E; RUN;  How are the above ESTIMATE statements obtained and what does each test? Compare the ESTIMATE statement in SAS with the output of 'glh' in R. ## Section 5: all R code: sub<-c(1,1,1,2,2,2,3,3,3,4,4,4) dose<-c(1,2,3,3,4,5,5,6,7,7,8,9) ys<-c(3.1,2.2,1.1,5.0,4.1,3.0,7.0,6.1,5.0,9.1,8.2,7.1) yb<-c(3.1,3.9,1.1,5.0,4.9,3.0,7.0,6.9,5.0,9.1,9.9,7.1) expt<-data.frame(sub, dose,ys, yb) if(F) { # assuming the following has been done: dir.create("/R") download.file( "http://www.math.yorku.ca/~georges/R/fun.R", "/R/fun.R") } source("/R/fun.R") /*Plot data*/ library(lattice) library(car) library(MASS) library(Martix) xyplot(ys ~ dose | sub, data=expt, type = "l") xyplot(yb ~ dose | sub, data=expt, type = "l") scatterplot(ys ~ dose | sub, data=expt) scatterplot(yb ~ dose | sub, data=expt) /*within subject model small sigma*/ expt.lm <- lm(ys ~ dose+factor(sub), data=expt) summary(expt.lm) summary(expt.lm)$sigma
/*within subject model big sigma*/
expt_yb.lm <- lm(yb ~ dose+factor(sub), data=expt)
summary(expt_yb.lm)
summary(expt_yb.lm)$sigma /*compute subject averages and deviations for ys and dose*/ library(nlme) expt.avg<-gsummary(expt, form= ~ sub) expt.avg ys_b <- tapply( expt$ys, expt$sub, mean, na.rm=T ) [ tapply(expt$ys, expt$sub) ] yb_b <- tapply( expt$yb, expt$sub, mean, na.rm=T ) [ tapply(expt$yb, expt$sub) ] dose_b <- tapply( expt$dose, expt$sub, mean, na.rm=T ) [ tapply(expt$dose, expt$sub) ] ys_w <- ys - ys_b yb_w <- yb - yb_b dose_w <- dose - dose_b expt<-data.frame(sub, dose,ys, yb,ys_b,yb_b,dose_b,ys_w,yb_w,dose_w) expt /*between subjec model small sigma*/ expt_ysb.lm <- lm(ys_b ~ dose, data=expt) summary(expt_ysb.lm) summary(expt_ysb.lm)$sigma

/*between subject model big sigma*/
expt_ybb.lm <- lm(yb_b ~ dose, data=expt)
summary(expt_ybb.lm)
summary(expt_ybb.lm)$sigma /*pooled model small sigma*/ expt_ysp.lm <- lm(ys ~ dose, data=expt) summary(expt_ysp.lm) summary(expt_ysp.lm)$sigma

/*pooled model big sigma*/
expt_ybp.lm <- lm(yb ~ dose, data=expt)
summary(expt_ybp.lm)
summary(expt_ybp.lm)$sigma /*proc mixed estimate small sigma*/ fit <- lme( ys ~ dose, data= expt ,random = ~ 1 | sub ) summary(fit) summary(fit)$sigma

/*proc mixed estimate big sigma*/
fit_yb <- lme( yb ~ dose, data= expt ,random = ~ 1  | sub )
summary(fit_yb)
summary(fit_yb)$sigma /*try controlling for the patient average dose:*/ fit_ys_average <- lme( yb ~ dose_w*dose_b, data= expt ,random = ~ 1 | sub ) summary(fit_ys_average) summary(fit_ys_average)$sigma

/*Try a radnom effect of dose_w: */
fit_ys_random <- lme( yb ~ 0 + dose_w * dose_b, data= expt ,random = ~ 1  + dose_w| sub )
summary(fit_ys_random)
summary(fit_ys_random)$sigma /*Try a radnom effect of dose: */ fit_ys_dose_random <- lme( yb ~ 0 + dose_w * dose_b, data= expt ,random = ~ 1 + dose| sub ) summary(fit_ys_dose_random) summary(fit_ys_dose_random)$sigma

/*Plot OLS fits for each patient, BLUPs and population-level fits*/
fit <- lme( ys ~ dose, data= expt ,random = ~ 1 +dose | sub )
hpred <- expt
hpred$y0 <- predict(fit, level = 0) # Population hpred$y1 <- predict(fit, level = 1)     # BLUPS

xyplot( ys ~ dose | sub , hpred,skip=c(rep(F,25),T,T),
panel = function(x, y, subscripts, ...) {
panel.xyplot(x,y,...)
panel.lmline(x,y,...,col='black',lwd=2)
y0 <- hpred$y0[subscripts] panel.lmline(x,y0,...,type='l',col='green',lwd=2) y1 <- hpred$y1[subscripts]
panel.lmline(x,y1,...,type='l',col='red',lwd=2)
}
)


Results: The estimated parameters are as following:

 "Effect" of Dose Small sigma Big sigma Marginal (Pooled) 1.682021 1.847033 Between 0.8419515 0.868264 Within (Ajusted for patient) 0.06172134 0.87831 Mixed TYPE=UN Mixed TYPE=FA0(1) 0.06172449 0.889105

Graphics:

     DATA PLOT


    OLS fits for each patient, BLUPs and popultario-level fits PLots


by Jing

## Section 6: b

/*The following R code is from our course R code examples*/

if(F) { # assuming the following has been done:
dir.create("/R")
}

source("/R/fun.R")
if(F) { # assuming the following has been done:
dir.create("dataIN")
"dataIN/hs.csv")
}

library(nlme)
library(car)    # you might need to install this

library(MASS) # This is the library from Modern Applied

hs$sector table(hs$sector)

hs$Sector <- factor( ifelse( hs$sector == 0, "Public", "Catholic") )
table( hs$Sector ) ### School hs$school
atotal( z <- table(hs$school) ) #number of students in each school cbind ( atotal( table(hs$school) )  )
plot( table(hs$school) ) # first define a useful function: ch <- function(x) as.character(x) # more informative school id's hs$School <- factor( paste ( substring( ch( hs$Sector), 1,1), hs$school, sep = ""))

atotal( table ( hs$School )) atotal( table( hs$School, hs$Sector) ) atotal( table( hs$School, hs$Sector) > 1) ### Sex hs$Sex <- factor( ifelse( hs$female == 1, "Female", "Male")) with( hs, atotal ( table( Sector, Sex))) fit <- xtabs( ~ Sector + Sex, hs) fit summary(fit) # # Create "aggregate" data frame "by school" # -- This is a "Level 2" data frame with one row per school, # we can call it the "short" data frame, and the original the "long" # -- Note we can also think of a "wide" data frame as that with the # full data stored as different variable with one row per school. hs.a <- gsummary( hs, form = ~ School) # from library(nlme) dim( hs.a ) hs.a # # Note 1: gsummary takes the mean of numerical variables and the mode # of factor variables. This works fine for outer numerical, outer # factor and inner numerical variables. It doesn't generally # produce useful results for inner factors. # # Note 2: gsummary is in 'nlme' # # We can add custom-made variables with tapply hs.a$N      <- tapply( hs$ses, hs$School, length )      # number of observations per school
hs.a$ses.sd <- tapply( hs$ses,  hs$School, function(x) sqrt(var(x))) # sd of ses hs.a ### ### Creating 'contextual' variables ### # Note: we can 'lift' a variable from the short data frame to the long data frame # with tapply without a function argument. hs$N <- hs.a$N[ tapply( hs$School, hs$School) ] hs$ses.sd <- hs.a$ses.sd[ tapply( hs$School, hs$School) ] # Note: # - The first argument is not important as long as it has the correct length # - We can also define a Level 2 variable in "one shot": hs$ses.mean <- tapply( hs$ses, hs$School, mean, na.rm=T ) [
tapply(hs$ses, hs$School) ]

# Caution: Don't reorder the "by" factor between uses of tapply.
# The order won't match

# capply in funRDC.R does this more conveniently ('c' stands for 'contextual')

capply

hs$minority.prop <- capply( hs$minority, hs\$School, mean, na.rm = T)

###
###  Saving a data set  ( You may need the create the directory 'data' in the
###  working directory. To find out where you are, use > getwd()
###

dir.create("data")  # only necessary once

save( hs, file = "data/hs.rda" )  # in future session you can load the data.frame with

# Note that you also just save the environment at the end of your session but
# this can get messy and unreliable

data( hs )  # this loads the data saved in 'data/hs.rda'

### Fitting Mixed models with contextual effects using R
fit <- lme( mathach ~ ses+ses.mean+minority+minority.prop, data= hs ,random = ~ 1 +ses| School )
summary(fit)


Results:

Linear mixed-effects model fit by REML
Data: hs
AIC      BIC    logLik
11089.92 11138.94 -5535.959

Random effects:
Formula: ~1 + ses | School
Structure: General positive-definite, Log-Cholesky parametrization
StdDev    Corr
(Intercept) 1.5098857 (Intr)
ses         0.4702158 -0.458
Residual    5.9635896

Fixed effects: mathach ~ ses + ses.mean + minority + minority.prop
Value Std.Error   DF   t-value p-value
(Intercept)   12.417579 0.4081393 1678 30.424853  0.0000
ses            2.038890 0.2305240 1678  8.844588  0.0000
ses.mean       4.294775 0.7490919   37  5.733308  0.0000
minority      -4.146355 0.4491158 1678 -9.232263  0.0000
minority.prop  2.006420 1.4780814   37  1.357449  0.1829
Correlation:
(Intr) ses    ses.mn minrty
ses           -0.096
ses.mean      -0.362 -0.262
minority      -0.008  0.138 -0.041
minority.prop -0.638 -0.039  0.168 -0.291

Standardized Within-Group Residuals:
Min          Q1         Med          Q3         Max
-3.26616600 -0.70279789  0.01105957  0.75313796  2.81928889

Number of Observations: 1720
Number of Groups: 40



From the above result, since the estimated value of ses.mean is 4.294775,I think a low SES child appear to do better in a high SES. This depends on minority status because the p-value of minority is 0.0000. A high SES child can do better in a high SES than a low SES shool. Since the estimated value of minority is -4.146355,I don't think minority students appear to do better with a high proportion of minority students. The limitations in the interpretation of these results may be the data is not enough.

by Jing