# MATH 6643 - 2006

Math 6643 Applications of Mixed Models

## News

### May 11: Scheduling

• Latest on course meetings (updated May 11):
  • Wednesdays 1:30 to 4:30 in S103 Ross
  • Fridays 9 to 12 in S103 Ross
  • Exceptions:
    • Wednesday, May 17, 9 am to 4 pm: Rod Little's symposium on Missing Data. Register at http://www.yorku.ca/isr/spida2006/symposium.html
    • Wednesday, May 31 and Friday, June 2: all classes at York are cancelled due to the 'Learneds'. Note that the SSC meetings in London, Ontario, will be held from May 28 to May 31.
• We need to work out dates for the last class, the take-home final and the in-class final.

## General Information

The approach of the course is eclectic, both theoretical and applied. As much as possible, the mathematical content will be conveyed with pictures and diagrams as well as formulas. We will cover the mathematical foundations of the subject both by doing exercises from the textbook, which takes a formal mathematical approach, and by performing analyses of real data. There will be a number of projects which will involve developing software tools in R for the analysis of mixed models.

### Instructor

• Georges Monette
• Office hours: Wednesdays 4:30 to 5:30

### Text

• McCulloch, C. E. and Searle, S. R. (2001) Generalized, Linear and Mixed Models. Wiley, N.Y.

### Course Work

There will be a number of assignments done on a wiki server by assigned teams or by individuals. The last assignment will include a computing project implementing diagnostics for mixed models in R and a project involving the analysis of a real data set. There will be a 3-hour final exam and a weekend-long take-home exam. Note that all course work except the exams will be posted on a wiki server and available for future reference.

• Possible computing projects:
• Implement graphical diagnostics in R for mixed models. You can take inspiration from those recently implemented in SAS PROC MIXED. This could be two projects, one implementing cluster-level diagnostics and another implementing micro unit level diagnostics.
• Implement 'variance inflation' diagnostics for mixed models in R.

### Class list and teams

Class photo, names, e-mail addresses and assignment to teams can be found at http://www.math.yorku.ca/~georges/Courses/6643. Note that a userid (mixed) and password (mixed) are needed to access this page.

## Week 1

### Assignment 1: due Friday May 12

Create a team page by clicking on your team name below. You can get your team organized by sharing notes entered on the team page.
When you create pages for your assignment, first create a link to each page from your team page. Make sure that the name of the page starts with the team name followed by a colon. For example, someone in Team Erlang could edit the Team Erlang home page to create a link by entering [[Team Erlang: page 10 exer 1]] to work on that exercise.
When uploading a graph, pdf file or Word file, first rename the file on your PC so it begins with your team name. Then use the 'Upload file' link on the left of the wiki page display. See Editing hints for course assignments.
• Team Erlang:
• McCulloch and Searle, p25f: Exercises 1, 6
• Use the Bryk and Raudenbush High School Math Achievement data set MATH 6643 Data sets and compare the relationship between 'mathach' and 'ses' in schools 4511 and 2651. Perform a suitable analysis, including diagnostics. Is there evidence of a difference in the relationship between 'mathach' and 'ses' in these two schools? Plot the data and estimated relationship in appropriate ways. Describe in some detail what your analysis says about this relationship.
• The textbook mentions "Quasi-likelihood estimation". Try to find a few recent references that discuss developments in this area. Include a recent survey article if you can find one. Write brief comments and summaries for each article.
• Team Fisher:
• McCulloch and Searle, p25f: Exercises 2, 3, 4, 5
• Use the Bryk and Raudenbush High School Math Achievement data set MATH 6643 Data sets and compare the relationship between 'mathach' and 'ses' in schools 1641 and 2755. Perform a suitable analysis, including diagnostics. Is there evidence of a difference in the relationship between 'mathach' and 'ses' in these two schools? Plot the data and estimated relationship in appropriate ways. Describe in some detail what your analysis says about this relationship.
• The textbook mentions "generalized estimating equations". Try to find a few recent references that discuss developments in this area. Include a recent survey article if you can find one. Write brief comments and summaries for each article.
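
One way to start on the two-school comparison above is to fit a separate least-squares line in each school and test whether the slopes differ. The following is only a minimal sketch: the data values are made up for illustration (they are not from the Bryk and Raudenbush data set), the function names are hypothetical, and the test assumes a common error variance in the two schools, which the diagnostics should check.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, rss, sxx, n)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss, sxx, n

def slope_difference(x1, y1, x2, y2):
    """t statistic for H0: equal slopes, assuming a common error variance."""
    _, b1, rss1, sxx1, n1 = fit_line(x1, y1)
    _, b2, rss2, sxx2, n2 = fit_line(x2, y2)
    df = n1 + n2 - 4                      # two intercepts and two slopes estimated
    s2 = (rss1 + rss2) / df               # pooled residual variance
    se = math.sqrt(s2 * (1 / sxx1 + 1 / sxx2))
    diff = b1 - b2
    return diff, se, diff / se, df

# Made-up toy values standing in for 'ses' and 'mathach' in two schools
ses_a = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
math_a = [8.0, 10.5, 11.0, 14.5, 15.0, 18.5]
ses_b = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
math_b = [12.0, 12.5, 13.5, 13.0, 14.5, 14.0]

diff, se, t, df = slope_difference(ses_a, math_a, ses_b, math_b)
print(f"slope difference = {diff:.3f}, se = {se:.3f}, t = {t:.2f} on {df} df")
```

The t statistic is the same one obtained for the interaction term in a single regression with school-specific intercepts and slopes; in R this would typically be done with a model formula rather than by hand.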

## Week 2

• We will meet at the regular time and place on Friday, 9 to 12, in S103 Ross

### Topics

Review of the multivariate normal: marginal and conditional expectation and variance, the concentration ellipse for the bivariate normal, Spectral Decomposition Theorem.
Linear regression: confidence regions and intervals for β, relationship to Var(X)
Introduction to the theory of mixed models
A First Look at Multilevel and Longitudinal Models (http://www.math.yorku.ca/~georges/Slides/CourseNotes.pdf) pp 1-51
Re Simpson's and Robinson's Paradox: See the first few pages of Some practical issues applying mixed longitudinal models for observational data (http://www.math.yorku.ca/~georges/Slides/TalkOnContextualEffectsv2.pdf)
Introduction to the Analysis of Hierarchical and Longitudinal Data - Part 1 (http://www.math.yorku.ca/~georges/Slides/IntroHierLong-1.pdf) pp 1-42
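
For reference, the standard conditional results for a bivariate normal vector can be written as follows. This is a summary of textbook facts in the notation of the variance-matrix exercises, not an excerpt from the course notes; the mean symbols $\mu_1, \mu_2$ are introduced here for illustration.

```latex
\begin{aligned}
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
  \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} \right), \\
E(Y_2 \mid Y_1 = y_1) &= \mu_2 + \frac{\sigma_{21}}{\sigma_{11}} \, (y_1 - \mu_1), \\
\operatorname{Var}(Y_2 \mid Y_1 = y_1) &= \sigma_{22} - \frac{\sigma_{12}^2}{\sigma_{11}}.
\end{aligned}
```

The conditional mean is linear in $y_1$ with slope $\sigma_{21}/\sigma_{11}$, and the conditional variance does not depend on $y_1$.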

### Assignment 2 due Monday, May 22

See team memberships in http://www.math.yorku.ca/~georges/Courses/6643
Team Fox
Do the odd numbered questions in the list of exercises below.
Write a summary of the key points in the symposium on missing data by Rod Little

Team Galton
Do the even numbered questions in the list of exercises below.
Write a summary of the key points in the symposium on missing data by Rod Little

### Exercises:

Note that the results of exercises can be used for exercises that are farther down in the list -- but not the other way around! The Spectral Decomposition Theorem and other standard results can be used anywhere.
1. Prove the Sherman-Morrison-Woodbury identity (state appropriate assumptions):
$(A + UDV)^{-1} = A^{-1} - A^{-1}U(D^{-1} + VA^{-1}U)^{-1}VA^{-1}$
Hint: It might be easier to first prove a special case for $(I + UV)^{-1}$ and then use basic facts about inverses and products, e.g. $(AB)^{-1} = B^{-1}A^{-1}$
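2. Consider a linear regression model $Y = X\beta + \varepsilon$ where $X$ is an $n \times (k+1)$ matrix whose first column consists of 1's, $\beta = (\beta_0, \beta_1, \dots, \beta_k)'$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$. Let $\Sigma_{XX}$ be the $k \times k$ variance matrix of the predictor variables and let $s_E$ be the residual standard error. Find and prove an expression for $\mathrm{Var}((\hat\beta_1, \dots, \hat\beta_k)')$, the variance of the least-squares estimator of the slope coefficients.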
3. Let A be a square matrix. Show that Σ = AA' is positive definite if and only if A is non-singular.
4. Let A and B be square matrices such that AA' = BB' = Σ with Σ positive definite. Show that there exists an orthogonal matrix Γ such that A = BΓ
5. We define two vectors $x_1$ and $x_2$ to be conjugate with respect to a positive definite matrix $M$ if $x_1' M x_2 = 0$. Given a non-singular matrix $A$, show that the columns of $A$ form a set of mutually conjugate vectors with respect to $\Sigma^{-1}$ where $\Sigma = AA'$
6. A basis for $\mathbb{R}^p$ that is a conjugate basis with respect to a positive definite matrix $M$ is a sequence of vectors $x_1, x_2, \dots, x_p$ in $\mathbb{R}^p$ such that $x_i' M x_i = 1$ and $x_i' M x_j = 0$ if $i \neq j$. Show that the columns of a non-singular matrix $A$ form a conjugate basis with respect to $\Sigma^{-1}$ if $\Sigma = AA'$. Note that a conjugate basis is merely an orthogonal basis with respect to the metric defined by $\|x\|^2 = x' \Sigma^{-1} x$.
7. Show that a matrix is a non-negative definite matrix if and only if it is a variance matrix.
8. We will call a "square root" of a square matrix M any matrix A such that M = AA'. Show that a matrix has a square root if and only if it is a variance matrix. [Hint: the Spectral Decomposition Theorem might be useful]
9. Consider a $2 \times 2$ variance matrix $\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22}\end{bmatrix}$ for a random vector $\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}$. Verify that the Cholesky matrix $C = \begin{bmatrix} \sigma_{11}^{1/2} & 0 \\ \sigma_{21}/ \sigma_{11}^{1/2}& \sqrt{\sigma_{22} - \sigma_{12}^2 / \sigma_{11}}\end{bmatrix}$ is a square root of Σ.
Show that the Cholesky matrix can be written as $\begin{bmatrix} \sigma_1 & 0 \\ \beta_{21} \sigma_1 & \sigma_{2 \cdot 1}\end{bmatrix}$ where β21 is the regression coefficient of Y2 on Y1.
Draw a concentration (or data) ellipse and indicate the interpretation of the vectors defined by the columns of C relative to the ellipse.
10. Show that a non-singular $2 \times 2$ variance matrix Σ can be factored so that Σ = AA' with A an upper triangular matrix [in contrast with problem 9 where the matrix is lower triangular]. Explain the interpretation of the elements of this matrix as in question 9.
11. Generate 100 observations for three variables Y, X and Z so that in the regression of Y on both X and Z neither regression coefficient is significant (at the 5% level) but a test of the hypothesis that both coefficients are 0 is rejected at the 1% level. Explain your strategy in generating the data. How should the data be generated to produce the required result? Show a data ellipse for X and Z and appropriate confidence ellipses for their two regression coefficients. What does this example illustrate about the appropriateness of scanning regression output for significant p-values and concluding that nothing is happening if none of the p-values achieves significance?
12. Generate 100 observations for three variables Y, X and Z so that in the separate simple regressions of Y on each of X and Z neither regression coefficient is significant (at the 5% level) but a test of the hypothesis that both coefficients are 0 in a multiple regression of Y on both X and Z is rejected at the 5% level. Explain your strategy in generating the data. How should the data be generated to produce the required result? Show a data ellipse for X and Z and appropriate confidence ellipses for their two regression coefficients. Explain the relationship between the ellipses and the phenomenon exhibited in this problem. What does this example illustrate about the appropriateness of forward stepwise regression to identify a suitable model to predict Y using both X and Z?
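
Before attempting the proofs in exercises 1 and 9, it can help to check the identities numerically on a small example. The sketch below does this for $2 \times 2$ matrices using only the Python standard library; the matrices and helper names are arbitrary choices made here for illustration, and a numerical check on one example is of course not a proof.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inv2(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def cholesky_2x2(s11, s12, s22):
    """Lower-triangular C with C C' = [[s11, s12], [s12, s22]]."""
    sigma1 = math.sqrt(s11)                      # standard deviation of Y1
    beta21 = s12 / s11                           # regression coefficient of Y2 on Y1
    sigma2_1 = math.sqrt(s22 - s12 ** 2 / s11)   # residual SD of Y2 given Y1
    return [[sigma1, 0.0], [beta21 * sigma1, sigma2_1]]

# Exercise 1: compare both sides of the Sherman-Morrison-Woodbury identity
A = [[2.0, 0.0], [0.0, 3.0]]
U = [[1.0, 0.0], [1.0, 1.0]]
D = [[1.0, 0.0], [0.0, 2.0]]
V = [[1.0, 2.0], [0.0, 1.0]]
Ai = inv2(A)
lhs = inv2(add(A, matmul(matmul(U, D), V)))
inner = inv2(add(inv2(D), matmul(matmul(V, Ai), U)))
rhs = add(Ai, matmul(matmul(matmul(Ai, U), inner), matmul(V, Ai)), sign=-1.0)
smw_gap = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))

# Exercise 9: the Cholesky factor reproduces the variance matrix
C = cholesky_2x2(4.0, 3.0, 9.0)
CCt = matmul(C, transpose(C))

print("largest SMW discrepancy:", smw_gap)
print("C =", C)
print("C C' =", CCt)
```

The discrepancy should be zero up to floating-point rounding, and `CCt` should reproduce the variance matrix with entries 4, 3, 3, 9.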

## Week 3

• No regular class meeting on Wednesday, May 17, so you can attend the symposium by Rod Little.
• On Friday, we had a look at Pine Trees, Comas and Migraines: Asymptotic functions of time (http://www.math.yorku.ca/%7Egeorges/Slides/TalkOnComasAndMigraines.pdf) to illustrate multilevel models (longitudinal models in this case) that have either a continuous normal response or a dichotomous response.
• We also considered the difference between fixed effects and random effects classification models discussed in Chapter 2 of the text and in pp 52-60 of A First Look at Multilevel and Longitudinal Models (http://www.math.yorku.ca/~georges/Slides/CourseNotes.pdf).

## Week 5

Classes cancelled this week for the 'Learneds'

### Assignment 3 due Wednesday June 15

See team memberships in http://www.math.yorku.ca/~georges/Courses/6643
Problems are in MATH_6643:_Assignment_3
Team Geisser
• Section 1: 1(a), 1(c), 1(e), 2, 4, 6, 8, 10, 12
• Section 2: 2.1, 2.3, 2.9, 3.4, 3.11
• Section 3: 2, 4, 6, 8
• Section 4: all
• Section 6: a
Team Gini
• Section 1: 1(b), 1(d), 1(f), 3, 5, 7, 9, 11
• Section 2: 2.2, 2.5, 2.10, 3.3, 3.7
• Section 3: 1, 3, 5, 7
• Section 5: all
• Section 6: b

## Week 6

• Relationships between models and contextual effects:
Mixed Models -- Behind the Scenes (http://www.math.yorku.ca/~georges/Slides/HandoutM.pdf)
Contextual Effects (http://www.math.yorku.ca/~georges/Slides/TalkOnContextualEffectsv2.pdf)
• R scripts
MATH 6643 hsintro.R
MATH 6643 hsgraph.R

## Week 7

• Diagnostics for Mixed Models:
Mixed Model Influence Diagnostics by Oliver Schabenberger, SAS Institute Inc., Cary, NC (http://www2.sas.com/proceedings/sugi29/189-29.pdf)
Judith Singer on SAS PROC MIXED (http://www.jstor.org/view/10769986/ap040018/04a00030/0)
• Plan for Friday, June 14:
Finish proof re Mixed Model with contextual effects.
Fitting models MATH 6643 hsdetail.R GM comment: Fix this
Interpreting T (http://www.math.yorku.ca/~georges/Slides/NMakingVariance.pdf)
Longitudinal models A First Look at Multilevel and Longitudinal Models: A second look ... pp.67ff (http://www.math.yorku.ca/~georges/Slides/CourseNotes.pdf)
Interpreting R output:
Introduction to Hierarchical and Longitudinal Models Part 1 (http://www.math.yorku.ca/~georges/Slides/IntroHierLong-1.pdf)
Introduction to Hierarchical and Longitudinal Models Part 2 (http://www.math.yorku.ca/~georges/Slides/IntroHierLong-2.pdf)
Longitudinal models A First Look at Multilevel and Longitudinal Models: A second look ... pp.86ff (http://www.math.yorku.ca/~georges/Slides/CourseNotes.pdf)
Introduction to non-linear models:
Comas and Migraines: Non-linear models for time (http://www.math.yorku.ca/~georges/Slides/TalkOnComasAndMigraines.pdf)
PROC NLMIXED Summary (http://www.math.yorku.ca/~georges/Slides/PROC%20NLMIXED%20SUMMARY.pdf)
NLME from José C. Pinheiro
User's guide (http://stat.bell-labs.com/NLME/UGuide.pdf)
Help files of NLME function (http://cm.bell-labs.com/cm/ms/departments/sia/project/nlme/HelpFunc.pdf)

## Week 8

### Assignment 4 due: Friday, June 23, 2006

Use the 'IQ' data at http://www.math.yorku.ca/~georges/Data/IQ.csv.
The following questions refer to the analyses presented in Comas and Migraines: Non-linear models for time (http://www.math.yorku.ca/~georges/Slides/TalkOnComasAndMigraines.pdf)
Where feasible, report confidence intervals as well as p-values.
Include raw output in an appendix. Your discussion of results should only include confidence intervals and/or p-values that are specifically related to the questions you are addressing.
Include a generous number of graphs explaining the nature of the relationships you are describing.
You may use 'library(nlme)' in R or 'PROC NLMIXED' in SAS or other suitable software.
Explore VIQ models with a more complex model for initial deficit: e.g. Can you treat it as having a random effect? Does it depend on the duration of coma?
Explore the possible relationship between sex and age and recovery trajectories for VIQ.
Team assignments are at http://www.math.yorku.ca/~georges/Courses/6643.
Explore PIQ models with a more complex model for initial deficit: e.g. Can you treat it as having a random effect? Does it depend on the duration of coma?
Explore the possible relationship between sex and age and the recovery trajectories for PIQ.
Explore models for VIQ with a more complex model for 'half-recovery time'. Can it be treated as a random effect? Can you find any evidence that it depends on other variables?
Explore alternative models for the effect of the duration of coma on VIQ trajectories. Are there other transformations that would seem more appropriate? Can you use non-linear models to estimate the transformation?
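
To make the terms 'initial deficit' and 'half-recovery time' concrete, here is a minimal sketch of one asymptotic recovery curve. It assumes an exponential approach to a long-run level; the function name, parameter names and numerical values are illustrative assumptions, and the parametrization used in the slides may differ.

```python
import math

def recovery_curve(t, asymptote, initial_deficit, half_time):
    """Expected score at time t: the deficit halves every `half_time` units."""
    return asymptote - initial_deficit * math.exp(-math.log(2) * t / half_time)

# Illustrative values only: long-run level 100, initial deficit 20, half-recovery time 6
for t in [0, 6, 12, 24]:
    print(t, round(recovery_curve(t, 100.0, 20.0, 6.0), 2))
```

In a mixed-model version each parameter could have fixed and random components, which is what the questions about treating the initial deficit or the half-recovery time as a random effect are probing.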

Hello, I am Tianshu Ma and I have uploaded my presentation pdf on the image page of the special pages. I hope that will be helpful for you. Thanks Professor George and everyone in this morning class.