# MATH 6643 Applications of Mixed Models 2008

• Web page
• Partial solutions to assignments:
• Assignment 1 (http://www.math.yorku.ca/~georges/Courses/6643/Solutions/MATH_6643_2008_Solutions_for_Assignment_1.pdf)
• Assignment 2 (http://www.math.yorku.ca/~georges/Courses/6643/Solutions/MATH%206643%202008%20Solutions%20for%20Assignment%202.pdf)
• Assignment 3 (http://www.math.yorku.ca/~georges/Courses/6643/Solutions/MATH%206643%202008%20Solutions%20for%20Assignment%203.pdf)
• Sample exam questions (http://www.math.yorku.ca/~georges/Courses/6643/Class11/MATH_6643_2008_Sample_Exam_Questions.pdf)

## General Information

The approach of the course is eclectic, both theoretical and applied. As much as possible, the mathematical content will be conveyed with pictures and diagrams as well as formulas. We will cover the mathematical foundations of the subject through exercises that develop a formal mathematical approach, and its applications through analyses of real data. There will be individual assignments and a major project done individually or in pairs.

### Text

• There is no official text for the course. We will use a number of references, resources on the internet and class notes.

### Course Work

#### Assignments

(30%) There will be approximately weekly individual assignments. You will post selected solutions on the wiki server.

#### Project

(30%) A course project will involve researching some aspect of mixed models. As you work, you should post your results to the wiki server. To avoid name conflicts, prefix every page title with "MATH6643-08 Project X:" where X is a letter identifying your project. During the course, you can also make contributions based on your project to the general information on the wiki, but you should do this only when you're confident that you are posting correct material. The project will include

• writing a description of the problem and your solution along with an up-to-date annotated bibliography,
• if applicable, developing software tools in R or SAS to solve problems in the area you are researching, and
• a 15-minute presentation at the end of the course.

You should choose your topic by Wednesday, May 14. Here are some ideas. You can propose your own. If you have ideas for projects that you would like others to do, please send them to me as soon as possible [to ensure rapid attention please include the course code, MATH 6643, in your subject line].

• Dealing with multilevel missing data and imputation in multilevel studies: what are the special issues?
• Writing and documenting an interface from R to SAS to run PROC MIXED, and writing methods in R to work with the output from SAS.
• [taken] Bootstrapping in multilevel studies: Can we do multilevel bootstrapping or can we bootstrap only at the highest level?
• Causal inference with longitudinal data: What to do if a treatment is initiated during the study? What if everyone receives the treatment? Or if the treatment is randomized? Or if the treatment group is self-selected? Should pretreatment observations on the response be used as covariates?
• [taken] Good graphical diagnostics for multilevel analyses: Implement SAS PROC MIXED diagnostics in R. This project could be done by two students, one specializing in between-cluster diagnostics, the other in within-cluster diagnostics. Ideally, the tools should work with data with more than two levels.
• Explore consequences when the within-cluster effect is different from the compositional effect. Why is the variance estimator bimodal? Explore the geometry of the likelihood function. This project could be done by two students.
• What should be done with non-convergence? Create examples of data sets where the typical mixed-model algorithms fail to converge. Examine how varying the parameters of the process producing the sample is related to convergence. Summarize the lessons that can be drawn from your experiment.
• Construct diagnostics in R corresponding to the Wu-Hausman test and graphical diagnostics to alert analysts to the possibility that compositional effects are different from within-cluster effects.
• Implement 'variance inflation' diagnostics for mixed models in R.
• Categorical Mixed Models
• Implementing the Satterthwaite algorithm for approximate degrees of freedom for Wald tests of linear hypotheses.
• MATH6643-08 Project L: Mixed Models and Model Selection: Mixed Models and AIC-Related Model Selection

#### Final exam

(30%) There will be a 3-hour final exam (15%) and a 2-day take-home exam (15%). About 90% of the in-class final will be based on questions similar to those already seen in assignments. The take-home exam will involve the individual analysis of a multilevel or longitudinal data set.

#### Wiki contributions

(10%) As you work, you will think of good contributions to make to the information on statistics and R on the wiki server. This component of the grade will be based on the extent and quality of your contributions during the course.

### Class list

The projects are individual but may be done with support from a small team working on related projects. A class photo, names, e-mail addresses and assignment to teams can be found at http://www.math.yorku.ca/~georges/Courses/6643 -- coming soon. Note that a userid (mixed) and password (mixed) are needed to access this page.

## Course schedule

| Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|
| May 4 | 5 | 6 | 7: First class | 8 | 9 | 10 |
| 11 | 12: Class 2 | 13 | 14: Class 3 | 15: SORA/TABA Cook/Lawless workshop | 16 | 17 |
| 18 | 19: Class $\emptyset$ (I forgot this is Victoria Day) | 20 | 21: Class 4 | 22 | 23 | 24 |
| 25: SSC Conference | 26: SSC, no class | 27: SSC | 28: SSC, no class | 29: SSC | 30 | 31 |
| June 1 | 2: Class 5 | 3 | 4: Class 6 | 5 | 6 | 7 |
| 8 | 9: Class 7 | 10 | 11: Class 8 | 12 | 13 | 14 |
| 15 | 16: Class 9 | 17 | 18: Class 10 | 19 | 20 | 21 |
| 22 | 23: Class 11 | 24 | 25: Final exam; start of take-home | 26 | 27: Take-home due at noon | 28 |

## Class 1

• The video of the lecture is too large to upload to the server but I will be happy to make it available on a CD if anyone is interested.
• An annotated version of "Playing with Ellipses" (http://www.math.yorku.ca/~georges/Courses/6643/Class01/Ellipses-ANNOTATED.pdf), a review of concepts in linear regression.
• Class notes (http://www.math.yorku.ca/~georges/Courses/6643/Class01/Class01.pdf). Note that you might have to rotate these when viewing in Adobe. In Adobe Professional, follow the tabs: View|Rotate View|Counterclockwise. Perhaps the same thing works in Acrobat Reader.

### Assignment 1

Due May 12 before class

You must work individually but you can use any written source provided you cite it.

1. [10] Prove the Sherman-Morrison identity (state appropriate assumptions). Note that $U$ and $V$ need not be square matrices.
$(A + UDV)^{-1} = A^{-1} - A^{-1}U(D^{-1} + VA^{-1}U)^{-1}VA^{-1}$
Hint: It might be easier to first prove a special case for $(I + UV)^{-1}$ and then use basic facts about inverses and products, e.g. $(AB)^{-1} = B^{-1}A^{-1}$.
2. [10] Consider a linear regression model $Y = X\beta + \varepsilon$ where $X$ is an $n \times (k+1)$ matrix whose first column consists of 1's, $\beta = (\beta_0, \beta_1, \ldots, \beta_k)'$ and $Var(\varepsilon) = \sigma^2 I$. Let $\Sigma_{XX}$ be the $k \times k$ variance matrix of the predictor variables and let $s_E$ be the residual standard error. Find and prove an expression for $Var( (\hat{\beta}_1, \ldots , \hat{\beta}_k )')$.
3. [10] Let Σ be symmetric. Show that Σ is positive-definite if and only if there exists a non-singular matrix A such that Σ = AA'.
4. [10] Show that a symmetric matrix Σ is a variance matrix if and only if there exists a matrix A such that Σ = AA'.
5. [10] Let A and B be square matrices such that AA' = BB' = Σ with Σ positive definite. Show that there exists an orthogonal matrix Γ such that A = BΓ.
6. [50] Retrieve the "Arrests" data set from "library(effects)" in R. Install and load the package, then get information about the variables in the data set in the usual way:
> install.packages('effects')
> library(effects)
> ?Arrests
The Toronto Star has published some stories claiming that this data set reveals a pattern of discrimination in police behaviour. You have been hired by the National Post to study the data set and produce an independent opinion. Your opinion may agree, disagree or otherwise qualify the claim that this data shows a conclusive pattern of discrimination. Write a report with suitable graphs and include the details of your analysis with appropriate graphs as an appendix.
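
Before attempting the proof in problem 1, it can help to see the identity hold numerically. The following is a minimal sketch in base R, not a proof and not part of the assignment. The dimensions, the seed, and the particular choice $V = U'$ with positive definite $A$ and $D$ are assumptions made here only to guarantee that every inverse exists.

```r
# Numerical sanity check (not a proof) of the identity in problem 1.
# All matrices below are hypothetical examples chosen for illustration.
set.seed(6643)                    # arbitrary seed, for reproducibility only
n <- 5                            # A is n x n
k <- 2                            # D is k x k, so U is n x k and V is k x n

# I + X'X is symmetric positive definite, hence invertible
A <- diag(n) + crossprod(matrix(rnorm(n * n), n, n))
D <- diag(c(2, 3))                # invertible (positive definite) k x k matrix
U <- matrix(rnorm(n * k), n, k)   # not square
V <- t(U)                         # assumption: V = U' keeps A + UDV positive definite

Ainv <- solve(A)
lhs  <- solve(A + U %*% D %*% V)
rhs  <- Ainv - Ainv %*% U %*% solve(solve(D) + V %*% Ainv %*% U) %*% V %*% Ainv

max_abs_diff <- max(abs(lhs - rhs))
print(max_abs_diff)               # expected to be at the level of rounding error
```

Note that the right-hand side only inverts the $k \times k$ matrix $D^{-1} + VA^{-1}U$ once $A^{-1}$ is known, which is why the identity is useful when $k$ is much smaller than $n$.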

## Class 2

• A version of "Playing with Ellipses" (http://www.math.yorku.ca/~georges/Courses/6643/Class02/Ellipses-ANNOTATED.pdf) including annotations in Class 2.
• Class notes (http://www.math.yorku.ca/~georges/Courses/6643/Class02/Class02.pdf)

## Class 3

• Class notes (http://www.math.yorku.ca/~georges/Courses/6643/Class03/Class03.pdf)
• R script part 1 (http://www.math.yorku.ca/~georges/Courses/6643/Class03/VisualizingMultipleRegressionPart1.R)
• R script part 2 (http://www.math.yorku.ca/~georges/Courses/6643/Class03/VisualizingMultipleRegressionPart2.R)
• R script part 3 (http://www.math.yorku.ca/~georges/Courses/6643/Class03/VisualizingMultipleRegressionPart3.R)

### Assignment 2

Due May 21 before class

You must work individually but you can use any written source provided you cite it.

1. [10] We define two vectors $x_1$ and $x_2$ to be conjugate with respect to a positive definite matrix $M$ if $x_1' M x_2 = 0$. Given a non-singular matrix $A$, show that the columns of $A$ form a set of mutually conjugate vectors with respect to $\Sigma^{-1}$ where $\Sigma = AA'$.
2. [10] A basis for $\mathbb{R}^p$ that is a conjugate basis with respect to a positive definite matrix $M$ is a sequence of vectors $x_1, x_2, \ldots, x_p$ in $\mathbb{R}^p$ such that $x_i' M x_i = 1$ and $x_i' M x_j = 0$ if $i \neq j$. Show that the columns of a non-singular matrix $A$ form a conjugate basis with respect to $\Sigma^{-1}$ if $\Sigma = AA'$. Note that a conjugate basis is merely an orthogonal basis with respect to the metric defined by $\|x\|^2 = x'\Sigma^{-1}x$.
3. [10] We will call a "square root" of a square matrix M any square matrix A such that M = AA'. Show that a matrix has a square root if and only if it is a variance matrix.
4. [10] Consider a $2 \times 2$ variance matrix $\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22}\end{bmatrix}$ for a random vector $\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}$. Verify that the Cholesky matrix $C = \begin{bmatrix} \sigma_{11}^{1/2} & 0 \\ \sigma_{21}/ \sigma_{11}^{1/2}& \sqrt{\sigma_{22} - \sigma_{12}^2 / \sigma_{11}}\end{bmatrix}$ is a square root of Σ.
Show that the Cholesky matrix can be written as $\begin{bmatrix} \sigma_1 & 0 \\ \beta_{21} \sigma_1 & \sigma_{2 \cdot 1}\end{bmatrix}$ where $\beta_{21}$ is the regression coefficient of $Y_2$ on $Y_1$.
Draw a concentration (or data) ellipse and indicate the interpretation of the vectors defined by the columns of C relative to the ellipse.
5. [10] Show that a non-singular $2 \times 2$ variance matrix $\Sigma$ can be factored so that $\Sigma = AA'$ with $A$ an upper triangular matrix [in contrast with problem 4, where the matrix is lower triangular]. Explain the interpretation of the elements of this matrix as in problem 4.
6. [25] Generate 100 observations for three variables Y, X and Z so that in the regression of Y on both X and Z neither regression coefficient is significant (at the 5% level) but a test of the hypothesis that both coefficients are 0 is rejected at the 1% level. Explain your strategy in generating the data. How should the data be generated to produce the required result? Show a data ellipse for X and Z and appropriate confidence ellipses for their two regression coefficients. What does this example illustrate about the appropriateness of scanning regression output for significant p-values and concluding that nothing is happening if none of the p-values achieves significance?
7. [25] Generate 100 observations for three variables Y, X and Z so that in the separate simple regressions of Y on each of X and Z neither regression coefficient is significant (at the 5% level) but a test of the hypothesis that both coefficients are 0 in a multiple regression of Y on both X and Z is rejected at the 5% level. Explain your strategy in generating the data. How should the data be generated to produce the required result? Show a data ellipse for X and Z and appropriate confidence ellipses for their two regression coefficients. Explain the relationship between the ellipses and the phenomenon exhibited in this problem. What does this example illustrate about the appropriateness of forward stepwise regression to identify a suitable model to predict Y using both X and Z?
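
The Cholesky formula in problem 4 and the conjugate-basis property in problem 2 can be checked numerically before working through the algebra. The sketch below uses base R only. The matrix `Sigma` is a hypothetical example invented for this illustration, and the check is a sanity test, not a substitute for the requested verification.

```r
# Numerical illustration of problems 2 and 4 with a hypothetical variance matrix.
Sigma <- matrix(c(4, 3,
                  3, 9), nrow = 2, byrow = TRUE)

s11 <- Sigma[1, 1]
s21 <- Sigma[2, 1]
s22 <- Sigma[2, 2]

# Lower-triangular Cholesky matrix built from the formula in problem 4
C <- matrix(c(sqrt(s11),       0,
              s21 / sqrt(s11), sqrt(s22 - s21^2 / s11)),
            nrow = 2, byrow = TRUE)

# chol() returns the upper-triangular factor R with t(R) %*% R equal to Sigma,
# so its transpose is the lower-triangular factor
C_builtin <- t(chol(Sigma))

# Regression coefficient of Y2 on Y1 implied by Sigma
beta21 <- s21 / s11

# Problem 2: the columns of C form a conjugate basis with respect to Sigma^{-1},
# so this product should be the 2 x 2 identity matrix
G <- t(C) %*% solve(Sigma) %*% C

print(C %*% t(C))   # should reproduce Sigma
print(G)            # should be the identity up to rounding error
```

Because `chol()` follows the upper-triangular convention, the transpose is needed to match the lower-triangular matrix $C$ of problem 4. The entry `C[2, 1]` equals `beta21 * C[1, 1]`, which is the form asked for in the second part of that problem.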

## Class 7

• Notes (http://www.math.yorku.ca/~georges/Courses/6643/Class07/MATH%206643%20Class%207.pdf)

## Class 9

• Notes (http://www.math.yorku.ca/~georges/Courses/6643/Class09/MATH%206643%20Class%209.pdf)

## Class 10

• Notes (http://www.math.yorku.ca/~georges/Courses/6643/Class10/MATH%206643%20Class%2010.pdf)
• MixedModelsinR.R (http://www.math.yorku.ca/~georges/Courses/6643/Class10/MixedModelsinR.R)

## Class 11

• Notes (http://www.math.yorku.ca/~georges/Courses/6643/Class11/Math_6643_2008_Class_11_Notes.pdf)
• Sample Exam questions (http://www.math.yorku.ca/~georges/Courses/6643/Class11/MATH_6643_2008_Sample_Exam_Questions.pdf)
• Workshop on Longitudinal Models in SAS (http://www.math.yorku.ca/~georges/Courses/6643/Class11/Workshop-v1-0Slides.pdf)
• Contextual effects in R (http://www.math.yorku.ca/~georges/Courses/6643/Class11/ContextualEffectsofSex.R)