# R: lmer

### From MathWiki


Many new methods have been written for 'lmer' objects. For example, there is now an 'mcmcsamp' method that allows MCMC sampling from the posterior distribution of the parameters. There is also a non-linear version, 'nlmer'. The main limitation of 'lmer' compared with 'lme' is the absence of an "R-side" model.
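As a sketch of that sampling interface (using the sleepstudy example that appears in the postings quoted on this page; the exact arguments of 'mcmcsamp' vary between lme4 versions, so treat this as illustrative):

```r
## Sketch: MCMC sampling from the posterior of an lmer fit
## (interface as of the lme4 version discussed on this page;
##  later versions changed or removed mcmcsamp)
library(lme4)
library(coda)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
samp <- mcmcsamp(fm, 1000)   # draw 1000 posterior samples
HPDinterval(samp)            # 95% highest posterior density intervals
```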

- Display VarCorr in lmer: http://tolstoy.newcastle.edu.au/R/help/05/10/14767.html


## Using lmer and nlme at the same time

Summary: a workaround is to redefine 'print.summary.lme', commenting out the line

    class(corr) <- "correlation"

R-list posting by Martin Maechler:

On Mon, 24 Jul 2006, Dirk Enzmann <dirk.enzmann@uni-hamburg.de> wrote:

> After loading the package "Matrix" (version 0.995-12), using the summary
> function with an lme (package nlme version 3.1-75) object results in an
> error message saying
>
>     Error in dim(x) : no slot of name "Dim" for this object of class "correlation"
>
> (originally in German: "Fehler in dim(x) : kein Slot des Namens 'Dim' für
> dieses Objekt der Klasse 'correlation'"). Without loading "Matrix" this
> error message does not occur. Any ideas?

Yes. Thank you for the report. The error happens when *printing* a summary.lme object, i.e.,

    ssl <- summary(<lme>)  ## works fine
    ssl                    ## -> gives the error

The short answer: don't use 'Matrix' together with 'nlme' for the time being (but rather use 'lme4' and lmer() if possible).

A long answer: the problem happens because 'Matrix' defines an S4 class "correlation" (which inherits from "dpoMatrix"). Hence dim(<correlation>) accesses the 'Dim' slot of the <correlation> object --- which obviously must fail for that part of a "summary.lme" object, which is not a proper S4 object at all.

Here is a reproducible example showing the problem without doing anything nlme-related:

    > x <- round(cor(matrix(rnorm(100), 20, 5)), 2); class(x) <- "correlation"; x
          [,1]  [,2]  [,3]  [,4]  [,5]
    [1,]  1.00 -0.29 -0.07 -0.11  0.03
    [2,] -0.29  1.00  0.10 -0.02  0.20
    [3,] -0.07  0.10  1.00  0.05  0.13
    [4,] -0.11 -0.02  0.05  1.00 -0.21
    [5,]  0.03  0.20  0.13 -0.21  1.00
    attr(,"class")
    [1] "correlation"
    > dim(x)
    [1] 5 5
    > library(Matrix)
    Loading required package: lattice
    > dim(x)
    Error in dim(x) : no slot of name "Dim" for this object of class "correlation"
    > dim
    .Primitive("dim")
    > showMethods(dim)
    Function "dim":
    x = "ANY"
    x = "Matrix"
    x = "correlation"   (inherited from x = "Matrix")
    > showClass("correlation")
    Slots:
     Name:       sd        x      Dim Dimnames     uplo  factors
    Class:  numeric  numeric  integer     list character     list
    Extends:
    Class "dpoMatrix", directly
    Class "dsyMatrix", by class "dpoMatrix"
    Class "ddenseMatrix", by class "dpoMatrix"
    Class "symmetricMatrix", by class "dpoMatrix"
    Class "dMatrix", by class "dpoMatrix"
    Class "denseMatrix", by class "dpoMatrix"
    Class "compMatrix", by class "dpoMatrix"
    Class "Matrix", by class "dpoMatrix"

A workaround -- not for you but for the authors of 'Matrix' and/or 'nlme' -- is for one of the two packages to call the class differently. At the moment, I tend to do the change in 'nlme', since there the "correlation" class has been almost entirely hidden from the user.

A workaround for you: redefine print.summary.lme, e.g., by commenting out the line

    class(corr) <- "correlation"

Regards,
Martin

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
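A minimal sketch of that user-level workaround, assuming an interactive session (the edited copy placed in the global environment masks nlme's own method on the search path):

```r
library(nlme)

## Fetch nlme's print method for summary.lme and open it in an editor;
## comment out the single offending line
##     class(corr) <- "correlation"
## and save. The edited copy in the global environment masks the package
## version, so Matrix's S4 dim() method is no longer dispatched when the
## summary is printed.
print.summary.lme <- edit(getS3method("print", "summary.lme"))
```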


## R-list messages from Doug Bates

Hi Spencer,

On 7/15/06, Spencer Graves <spencer.graves@pdf.com> wrote:

> p.s. I intended to include the following extension to an example from the
> 'lmer' help page:
>
>     fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
>     (fm1.fix <- fixef(fm1))
>     (fm1.fix.se <- sqrt(diag(vcov(fm1))))
>     fm1.fix/fm1.fix.se
>
>     fm1.ran <- VarCorr(fm1)
>     diag(fm1.ran$Subject)

I'm confident that you are aware that sqrt(diag(vcov(fm1))) and sqrt(diag(fm1.ran$Subject)) refer to different parameters of the model. However, some readers of your reply may not see the distinction. Allow me to try to clarify.

The summary for this fitted model is

    > (fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))
    Linear mixed-effects model fit by REML
    Formula: Reaction ~ Days + (Days | Subject)
       Data: sleepstudy
          AIC      BIC    logLik MLdeviance REMLdeviance
     1753.628 1769.593 -871.8141   1751.986     1743.628
    Random effects:
     Groups   Name        Variance Std.Dev. Corr
     Subject  (Intercept) 612.090  24.7405
              Days         35.072   5.9221  0.066
     Residual             654.941  25.5918
    number of obs: 180, groups: Subject, 18

    Fixed effects:
                Estimate Std. Error t value
    (Intercept) 251.4051     6.8246  36.838
    Days         10.4673     1.5458   6.771

    Correlation of Fixed Effects:
         (Intr)
    Days -0.138

The fixed effects described in the lower part of this summary are a familiar type of parameter in a statistical model. They are coefficients in a linear predictor. We produce estimates of these parameters and also provide a measure of the precision of these estimates - their standard errors. The vcov generic function returns an estimate of the precision of the estimated parameters (typically parameters that are coefficients in a linear predictor). Thus

    > sqrt(diag(vcov(fm1)))
    [1] 6.824558 1.545789

provides the standard errors of the fixed-effects estimates.

The random effects, although they also appear in the linear predictor, are not formally parameters in the model. They are unobserved random variables whose variance-covariance matrix has a known form but unknown value. It is the values that determine the variance-covariance matrix that are parameters in the model. These parameters are returned by VarCorr:

    > VarCorr(fm1)
    $Subject
    2 x 2 Matrix of class "dpoMatrix"
                (Intercept)     Days
    (Intercept)   612.09032  9.60428
    Days            9.60428 35.07165
    attr(,"sc")
    [1] 25.59182

In other words, the 2 x 2 matrix shown above is the estimate of the variance-covariance matrix of the random effects associated with the grouping factor "Subject". Thus

    > sqrt(diag(VarCorr(fm1)$Subject))
    [1] 24.740459  5.922132

gives the estimated standard deviations of the random effects. These are estimates of parameters in the model. They are *not* standard errors of parameter estimates.

The lmer function and related software does not return standard errors of the estimates of variance components. This is intentional. Below I give my familiar rant on why I think returning such standard errors is a bad practice. I encourage users of lmer who wish to determine the precision of the estimates of the variance components to create a Markov chain Monte Carlo sample of the parameters and evaluate the HPD intervals.

    > sm1 <- mcmcsamp(fm1, 50000)
    > library(coda)
    Warning message: use of NULL environment is deprecated
    > HPDinterval(sm1)
                               lower        upper
    (Intercept)          236.6518363  266.5465536
    Days                   7.0136243   13.8947993
    log(sigma2)            6.2550082    6.7295329
    log(Sbjc.(In))         5.4928205    7.5751372
    log(Sbjc.Days)         2.8197523    4.6337518
    atanh(Sbj.(I).Dys)    -0.6988632    0.9836688
    deviance            1752.5158501 1766.6461469
    attr(,"Probability")
    [1] 0.95

<rant> Some software, notably SAS PROC MIXED, does produce standard errors for the estimates of variances and covariances of random effects. In my opinion this is more harmful than helpful. The only use I can imagine for such standard errors is to form confidence intervals or to evaluate a z-statistic or something like that to be used in a hypothesis test. However, those uses require that the distribution of the parameter estimate be symmetric, or at least approximately symmetric, and we know that the distribution of the estimate of a variance component is more like a scaled chi-squared distribution, which is anything but symmetric. It is misleading to attempt to summarize our information about a variance component by giving only the estimate and a standard error of the estimate. </rant>

> ########################
> The structure of 'lmer' objects and helper functions is outlined in the
> 'lmer' and 'lmer-class' help pages. The latter mentions "'vcov'
> signature(object = "mer"): Calculate variance-covariance matrix of the
> _fixed_ effect terms, see also 'vcov'." Thus,
> sqrt(diag(vcov(lmer.object))) should give you standard errors of the
> fixed effects.
>
> The parameter estimates can be obtained using 'VarCorr'. However,
> characterizing their random variability is harder, because their
> distribution is not as simply summarized. The 'lmer-class' help page
> says that an 'lmer' object includes a slot, "'Omega': A list of
> positive-definite matrices stored as '"dpoMatrix"' objects that are the
> relative precision matrices of the random effects associated with each
> of the grouping factors." However, I don't know how to use this. For
> problems like this, the 'lme4' and 'Matrix' packages include a function
> 'simulate', which is what I might use, at least until I figured out how
> to get essentially the same answer from the Omega slot.
>
> Hope this helps,
> Spencer Graves
>
> prabhu bhaga wrote:
>
> > Dear all,
> >
> > I'm trying to store/extract the mean and standard error of the
> > fixed-effects parameters and the variance of the random-effects
> > parameters from the "lmer" procedure in the lme4 package developed by
> > Bates and Pinheiro. While storing the fixed-effects parameters is
> > straightforward, the same is not true for storing the variance
> > parameters of the random effects. Kindly help me.
> >
> > ~prabhu
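The distinction Bates draws above can be condensed into a short sketch (same sleepstudy example; accessor details may differ in later lme4 versions):

```r
library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## Standard errors of the fixed-effects ESTIMATES
## (a measure of the precision of the estimates):
sqrt(diag(vcov(fm1)))

## Estimated standard deviations of the random effects
## (estimates of model PARAMETERS, not standard errors of estimates):
sqrt(diag(VarCorr(fm1)$Subject))
```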

On 7/15/06, A.R. Criswell <rstatistics@gmail.com> wrote:

> Hello All,
>
> I would like to retrieve some of the results from the lmer(...) function
> in library lme4. If I run a model, say
>
>     fm.1 <- lmer(y ~ 1 + (1 | x), data = dog)
>
> and try names(fm.1), I get NULL. Is there any way to retrieve the
> information?

Yes. The recommended way of retrieving information from a fitted model object like an lmer object is with extractor functions. See ?lmer-class for a list of such methods. That help page also documents the slots in the object. The lmer class is an S4 class and uses typed slots instead of named components.

The str function displays the structure of pretty well any type of R object, including S3 classed objects or S4 classed objects. That is my favorite way of checking the structure of an object.

Please remember that it is risky to count on being able to reach into an object and pull out slots or components and operate on them. The names and contents of slots are not guaranteed to stay constant. The lme4 and Matrix packages have been under development for a long time and should continue to be regarded as under development. When we change the internal representation we do change the extractor functions accordingly. It is a bug if an internal change causes an extractor function in the package to fail to return correct results. It is not a bug if an internal change causes your code that assumes a particular, obsolete representation to break.
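A sketch of the extractor-function approach recommended above (these extractors are the ones documented on the lmer help pages; output shapes vary by lme4 version):

```r
library(lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

fixef(fm)     # fixed-effects estimates
ranef(fm)     # conditional modes of the random effects
VarCorr(fm)   # estimated variance components
vcov(fm)      # variance-covariance of the fixed-effects estimates

## Inspect the object's structure without depending on it in code:
str(fm, max.level = 2)
```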

The apparent contradiction here is again a case of inadequate documentation and checking. Because the order of the triplets in the triplet form of a sparse matrix is essentially arbitrary, it is permissible to have repeated indices. As you have seen, the interpretation of repeated indices is that the value at any index is the sum of the values in the triplets corresponding to that index.

It is not permissible to have repeated indices in the compressed form. In the compressed form there is a well-defined ordering of the indices, first by column, then by row within column, and the row indices must be increasing within columns. Your matrix Mc should be flagged as invalid. Martin and I should discuss whether we want to add such a test to the validity method. It is not difficult to add the test, but there will be a penalty in that it will slow down all operations on such matrices, and I'm not sure if we want to pay that price to catch a rather infrequently occurring problem.

There is some documentation of the internal representations of these formats in the directory $R_LIB/Matrix/doc/UFSparse/. The User Guides in that directory are taken directly from the various sparse matrix packages that Tim Davis at the University of Florida has written. He has a book that is scheduled for publication this September:

Tim Davis (2006), Direct Methods for Sparse Linear Systems, SIAM, Philadelphia, PA.

I hope we will be able to refer to that book for details of the representations and algorithms.

On 7/8/06, Thaden, John J <ThadenJohnJ@uams.edu> wrote:

> In the Matrix package v. 0.995-11 I see that the dgTMatrix class for
> sparse, triplet-form matrices handles identically indexed data instances
> by summing their values, e.g.,
>
>     library(Matrix)
>     (Mt <- new("dgTMatrix",
>                i = as.integer(c(0,0,1,1,4)),
>                j = as.integer(c(0,1,2,2,4)),
>                x = as.double(1:5),
>                Dim = as.integer(c(5,5))))
>     ## 5 x 5 sparse Matrix of class "dgTMatrix"
>     ## [1,] 1 2 . . .
>     ## [2,] . . 7 . .   <--- 7 = 3 + 4
>     ## [3,] . . . . .
>     ## [4,] . . . . .
>     ## [5,] . . . . 5
>
>     # If instead I make a dgCMatrix-class matrix, the first
>     # instance is overwritten by the second, e.g.,
>
>     (Mc <- new("dgCMatrix",
>                i = as.integer(c(0,0,1,1,4)),
>                p = as.integer(c(0,1,2,4,5)),
>                x = as.double(1:5),
>                Dim = as.integer(c(5,5))))
>     ## 5 x 5 sparse Matrix of class "dgCMatrix"
>     ## [1,] 1 2 . . .
>     ## [2,] . . 4 . .   <-- the datum '3' has been lost
>     ## [3,] . . . . .
>     ## [4,] . . . . .
>     ## [5,] . . . 5 .
>
>     # If one arrives at the dgCMatrix via the dgTMatrix class,
>     # the summed value is of course preserved, e.g.,
>
>     (Mtc <- as(Mt, "dgCMatrix"))
>     ## 5 x 5 sparse Matrix of class "dgCMatrix"
>     ## [1,] 1 2 . . .
>     ## [2,] . . 7 . .
>     ## [3,] . . . . .
>     ## [4,] . . . . .
>     ## [5,] . . . . 5
>
> As there is nothing inherent in either sparse format that would prevent
> recognition and handling of duplicated index pairs, I'm curious why the
> dgCMatrix class doesn't also add x values in those instances. I wonder
> also if others might benefit by being able to choose how these instances
> are handled, i.e., whether they are summed, averaged or overwritten?
>
> -John Thaden, Ph.D.
> Research Assistant Professor of Geriatrics
> University of Arkansas for Medical Sciences
> Little Rock AR, USA
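The summing behaviour described above can be checked directly by building the triplet-form matrix and coercing it to compressed column form, as in the quoted example:

```r
library(Matrix)

## Triplet form: both entries at (row 2, col 3) are kept ...
Mt <- new("dgTMatrix",
          i   = as.integer(c(0, 0, 1, 1, 4)),  # 0-based row indices
          j   = as.integer(c(0, 1, 2, 2, 4)),  # 0-based column indices
          x   = as.double(1:5),
          Dim = as.integer(c(5, 5)))

## ... and summed on conversion to the compressed column form:
Mtc <- as(Mt, "dgCMatrix")
Mtc[2, 3]   # 3 + 4 = 7
```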


## Degrees of freedom for lmer

On 7/26/06, Bill Shipley <bill.shipley@usherbrooke.ca> wrote:

> Hello. Douglas Bates has explained in a previous posting to R why he
> does not output residual degrees of freedom, F values and probabilities
> in the mixed model (lmer) function: because the usual degrees of freedom
> (obs - fixed df - 1) are not exact and are really only upper bounds. I
> am interpreting what he said, but I am not a professional statistician,
> so I might be getting this wrong...
>
> Does anyone know of any more recent results, perhaps from simulations,
> that quantify the degree of bias that using such upper bounds for the
> denominator degrees of freedom produces? Is it possible to calculate a
> lower bound for such degrees of freedom?

I have not seen any responses to your request yet, Bill. I was hoping that others might offer their opinions and provide some new perspectives on this issue. However, it looks as if you will be stuck with my responses for the time being.

You have phrased your question in terms of the denominator degrees of freedom associated with terms in the fixed-effects specification and, indeed, this is the way the problem is usually addressed. However, that is jumping ahead two or three steps from the initial problem, which is how to perform a hypothesis test comparing two nested models - a null model without the term in question and the alternative model including this term.

If we assume that the F statistic is a reasonable way of evaluating this hypothesis test, and that the test statistic will have an F distribution with a known numerator degrees of freedom and an unknown denominator degrees of freedom, then we can reduce the problem of testing the hypothesis to one of approximating the denominator degrees of freedom. However, there is a lot of assumption going on in that argument. These assumptions may be warranted or they may not. As far as I can see, the usual argument made for those assumptions is by analogy. If we had a balanced design, and if we used error strata to get expected and observed mean squares, and if we equated expected and observed mean squares to obtain estimates of variance components, then the test for a given term in the fixed-effects specification would have a certain form. Even though we are not doing any of these things when estimating variance components by maximum likelihood or by REML, the argument is that the test for a fixed-effects term should end up with the same form. I find that argument to be a bit of a stretch.

Because the results from software such as SAS PROC MIXED are based on this type of argument, many people assume that it is a well-established result that the test should be conducted in this way. Current versions of PROC MIXED allow for several different ways of calculating denominator degrees of freedom, including at least one, the Kenward-Roger method, that uses two tuning parameters - denominator degrees of freedom and a scale factor. Some simulation studies have been performed comparing the methods in SAS PROC MIXED, and other simulation studies are planned, but for me this is all putting the cart before the horse. There is no answer to the question "what is the _correct_ denominator degrees of freedom for this test statistic" if the test statistic doesn't have an F distribution with a known numerator degrees of freedom and an unknown denominator degrees of freedom.

I don't think there is a perfect answer to this question. I like the approach using Markov chain Monte Carlo samples from the posterior distribution of the parameters because it allows me to assess the distribution of the parameters, and it takes into account the full range of the variation in the parameters (the F-test approach is conditional on estimates of the variance components). However, it does not produce a nice cryptic p-value for publication. I understand the desire for a definitive answer that can be used in a publication. However, I am not satisfied with any of the "definitive answers" that are out there, and I would rather not produce an answer than produce an answer that I don't believe in.
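One commonly used alternative that sidesteps the denominator-degrees-of-freedom question entirely is a likelihood-ratio test between the two nested models (note the chi-squared reference distribution is itself only an approximation and can be anticonservative in small samples):

```r
library(lme4)

## Null model without the Days fixed effect vs. the model including it;
## anova() refits both models with ML (rather than REML) and reports a
## likelihood-ratio (chi-squared) test of the fixed-effects term.
fm0 <- lmer(Reaction ~ 1 + (Days | Subject), sleepstudy)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
anova(fm0, fm1)
```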