Monday, August 31, 2009

Analyzing mixed (nested) designs in R

Analyzing mixed (nested) designs in R

} I’m trying to understand the differences among the different ways of analyzing mixed models, both in theory and in how they are implemented in various R packages. Specifically, for a classic balanced design where the method of moments/ F ratio approach works, what are the comparable answers from the model-fitting-and-comparison approach? How do the old (nlme) and new (lmer) approaches to this analysis compare? Quinn and Keough [3] have a good discussion of nested designs: they give a worked example based on Andrew and Underwood [1].

Reading data, analyzing it the wrong way (and fixing it by hand)

The following discussion and R code is modified from http://www.stat.sfu.ca/~thompson/stat403-650/complexdesigns.html, by Steve Thompson, Simon Fraser University. Grab data from [1] (from Quinn and Keough’s web site):

> datafile <- "http://www.zoology.unimelb.edu.au/qkstats/chpt9/andrew.csv"
The first three columns (TREAT, PATCH, QUAD) are categorical, the last (ALGAE) is numeric (percent algal cover):

> urchins <- read.csv(file = datafile, colClasses = c(rep("factor",
+ 3), "numeric"))
What do the data look like?

> summary(urchins)
TREAT PATCH QUAD ALGAE
con :20 1 : 5 1:16 Min. : 0.00
rem :20 10 : 5 2:16 1st Qu.: 0.00
t0.33:20 11 : 5 3:16 Median : 5.00
t0.66:20 12 : 5 4:16 Mean :20.26
13 : 5 5:16 3rd Qu.:41.00
14 : 5 Max. :83.00
(Other):50
(plot(urchins) actually does a not-half-bad job of summarizing the experimental design and results with a scatterplot matrix). Fit a two-way, fixed-effect ANOVA:

> lm1 = lm(ALGAE ~ TREAT * PATCH, data = urchins)
> a1 = anova(lm1)
> a1
Analysis of Variance Table

Response: ALGAE
Df Sum Sq Mean Sq F value Pr(>F)
TREAT 3 14429.1 4809.7 16.1075 6.579e-08
PATCH 12 21241.9 1770.2 5.9282 8.323e-07
Residuals 64 19110.4 298.6
The above analysis of variance table has the right mean squares and degrees of freedom, but the wrong F value for testing treatments, because it has used the residual (subsampling) mean square in the denominator. This gives a highly inflated F value and unrealistically small p-value because the variance among nearby quadrats within a patch is small compared to the variability between patches. The correct F value is obtained by dividing the mean square for treatments by the mean square for patches. It is calculated below together with the correct p-value. So the experiment has not shown strong evidence for an effect of sea urchin density on algae cover, even though the pattern of the results is suggestive of such an effect.

> msq <- a1$`Mean Sq`
> fratio <- msq[1]/msq[2]
> fratio
[1] 2.717102
> dfs <- a1$Df
> pf(fratio, dfs[1], dfs[2], lower.tail = FALSE)
[1] 0.091262
Test for significant effect of patches:

> fratio2 <- msq[2]/msq[3]
> fratio2
[1] 5.928207
> pf(fratio2, dfs[2], dfs[3], lower.tail = FALSE)
[1] 8.322613e-07
The "right" way: with ''aov''

Alternatively, we can do this directly with aov:

> a2 <- aov(ALGAE ~ TREAT + Error(PATCH), data = urchins)
> summary(a2)
Error: PATCH
Df Sum Sq Mean Sq F value Pr(>F)
TREAT 3 14429.1 4809.7 2.7171 0.09126
Residuals 12 21242.0 1770.2

Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 64 19110.4 298.6
In this data set PATCH is coded as 1–16 — i.e., distinct values for each treatment type (1–4 are control, 5–8 are 66\ density, etc.), so I don’t actually have to say that they are nested. If they were coded as 1–4 I would have to use Error(TREAT:PATCH) to specify that “patch 1” should not be treated as the same among treatments (if patch 1 were the same in each treatment — e.g. if patch 1 were always the northernmost patch — I would have to specify a crossed design instead). I can do the next level (patch effect) F test by hand:

> fratio = 1770.2/298.6
> pf(fratio, 12, 64, lower.tail = FALSE)
[1] 8.320035e-07
How do I get both levels of error analysis — i.e. the test of MS(patch)/MS(residual) as well — calculated automatically?

Checking assumptions etc.

Other notes: Quinn and Keough say: “there were large difference in within-cell variances. Even the variances among patch means within treatments varied, with very low variances among control patch means. These data are percentages, although an arcsin-[square root transformation] had not effect in improving variance homogeneity, nor did a log transformation. Like Andrew \& Underwood (1993), we analyzed untransformed data, relying on the robustness of tests in balanced ANOVA designs.” Here are the residual values by treatment group and Q-Q plot (which should be a straight line if the residuals are normally distributed). (We don’t have to worry about error structure for these purposes, so we can use plain old lm, not aov.)

> par(mfrow = c(1, 2))
> plot(lm1, which = c(5, 2))
(Quinn and Keough also tried the analysis omitting the control group, and still got the same [non-significant] result for treatment effect, so felt better about the whole thing.) Here are the diagnostic plots if we arcsin-square-root transform:

> par(mfrow = c(1, 2))
> plot(lm(asin(sqrt(ALGAE/100)) ~ TREAT * PATCH, data = urchins),
+ which = c(5, 2))
Or if we log(1+x)-transform:

> par(mfrow = c(1, 2))
> plot(lm(log10(ALGAE + 1) ~ TREAT * PATCH, data = urchins), which = c(5,
+ 2))
Does transforming change the results?

> summary(aov(asin(sqrt(ALGAE/100)) ~ TREAT + Error(PATCH), data = urchins))[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
TREAT 3 3.2830 1.0943 2.8525 0.08181
Residuals 12 4.6037 0.3836
> summary(aov(log(ALGAE + 1) ~ TREAT + Error(PATCH), data = urchins))[[1]]
Df Sum Sq Mean Sq F value Pr(>F)
TREAT 3 72.737 24.246 2.9028 0.07859
Residuals 12 100.229 8.352
Not much.

Using ''lme''

R has another way to do linear mixed models, which allows for more complex designs etc. Pinheiro and Bates recommend this approach, which gives the same result as above:

> library(nlme)
> lme1 = lme(ALGAE ~ TREAT, random = ~1 | PATCH, data = urchins,
+ method = "REML")
> anova(lme1)
numDF denDF F-value p-value
(Intercept) 1 64 18.555081 0.0001
TREAT 3 12 2.717102 0.0913
It is possible (but strongly discouraged) to compare such fits using a likelihood ratio test (from [2], p. 87–88: `Even though a likelihood ratio test for the ML fits of models with different fixed effects can be calculated, we do not recommand using such tests. Such likelihood ratio tests using the standard χ2 reference distribution tend to be “anticonservative”—sometimes quite badly so.’) For example:

> lme1A = lme(ALGAE ~ TREAT, random = ~1 | PATCH, data = urchins,
+ method = "ML")
> lme2A = lme(ALGAE ~ 1, random = ~1 | PATCH, data = urchins, method = "ML")
> anova(lme1A, lme2A)
Model df AIC BIC logLik Test L.Ratio p-value
lme1A 1 6 718.8312 733.1234 -353.4156
lme2A 2 3 721.1250 728.2711 -357.5625 1 vs 2 8.2938 0.0403
almost halves the p value.

With ''lmer''

> detach("package:nlme")
> library(lme4)
> lme1r = lmer(ALGAE ~ TREAT + (1 | PATCH), data = urchins, method = "REML")
> lme1r.ML = lmer(ALGAE ~ TREAT + (1 | PATCH), data = urchins,
+ method = "ML")
> anova(lme1r)
Analysis of Variance Table
Df Sum Sq Mean Sq
TREAT 3 2434.27 811.42
> anova(lme1r.ML)
Analysis of Variance Table
Df Sum Sq Mean Sq
TREAT 3 3245.3 1081.8
Don’t know why these sums of squares/mean square values are so different (even without having a p-value). To get the same values as listed above (but with a warning):

> anova(lmer(ALGAE ~ TREAT + (1 | TREAT), data = urchins, method = "ML"))
Analysis of Variance Table
Df Sum Sq Mean Sq
TREAT 3 14429.1 4809.7
From Doug Bates:

A "p-value" could be formulated from an MCMC sample if we assume
that the marginal distribution of the parameter estimates for beta\_2
and beta\_3 has roughly elliptical contours and you can evaluate that
by, say, examining a hexbin plot of the values in the MCMC sample.
One could take the ellipses as defined by the standard errors and
estimated correlation or, probably better, by the observed standard
deviations and correlations in the MCMC sample. Then determine the
proportion of (beta\_2, beta\_3) pairs in the sample that fall outside
the ellipse centered at the estimates and with that eccentricity and
scaling factors that passes through (0,0). That would be an
empirical p-value for the test.
> mcmcpvalue <- function(samp) {
+ std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
+ transpose = TRUE)
+ sqdist <- colSums(std * std)
+ sum(sqdist[-1] > sqdist[1])/nrow(samp)
+ }
> m1 = mcmcsamp(lme1r, 5000)
> mcmcpvalue(m1[, 2:4])
[1] 0.0848
In the same ballpark as the other methods.

languageR

The languageR has similar methods — the aovlmer.fnc function will produce a (somewhat bogus) ANOVA table based only on the fixed effects degrees of freedom, as well as the MCMC sampling p-value.

> library(languageR)
> aovlmer.fnc(lme1r, noMCMC = TRUE)
Analysis of Variance Table
Df Sum Sq Mean Sq F Df2 p
TREAT 3 2434.27 811.42 2.7174 76.00 0.05
> aovlmer.fnc(lme1r, mcmc = m1, which = 2:4)
$MCMC
$MCMC$p
[1] 0.0848

$MCMC$which
[1] 2 3 4


$Ftests
Analysis of Variance Table
Df Sum Sq Mean Sq F Df2 p
TREAT 3 2434.27 811.42 2.7174 76.00 0.05
> detach("package:lme4")
Randomization tests

A function to permute the sampled data (ALGAE), leaving the experimental design — the first three rows of the data set — intact:

> permdat <- function() {
+ x = cbind(urchins[, 1:3], urchins[sample(nrow(urchins), replace = FALSE),
+ "ALGAE"])
+ names(x)[4] = "ALGAE"
+ x
+ }
The hardest part is digging inside the object for the F statistic:

> getFstat <- function(x) {
+ summary(x)[[1]][[1]]$"F value"[1]
+ }
Permute and run the analysis 5000 times:

> set.seed(1001)
> nsim = 5000
> Fdistrib <- replicate(nsim, getFstat(aov(ALGAE ~ TREAT + Error(PATCH),
+ data = permdat())))
Plot:

> trueFstat = getFstat(a2)
> plot(density(Fdistrib, from = 0), main = "", ylim = c(0, 0.7),
+ xlim = c(0, 10))
> curve(df(x, 3, 12), col = 2, add = TRUE)
> abline(v = qf(0.95, 3, 12), col = 2, lty = 2)
> abline(v = quantile(Fdistrib, 0.95), lty = 2)
> abline(v = trueFstat, lwd = 2)
> legend("topright", c("permutation distrib.", "theoretical F distrib.",
+ "perm. 95% cutoff", "theor. 95% cutoff", "observed F stat"),
+ lty = c(1, 1, 2, 2, 1), col = c(1, 2, 1, 2, 1))
Comparing p-values:

> sum(Fdistrib > trueFstat)/nsim
[1] 0.0882
> summary(a2)[[1]][[1]]$"Pr(>F)"[1]
[1] 0.091262
Bottom line: the ANOVA [[]] conclusion seems very robust in this case!

References

N. L. Andrew and A. J. Underwood. Density-dependent foraging in the sea urchin {Centrostephanus rodgersii} on shallow subtidal reefs in {New South Wales}, {Australia}. MEPS, 99:89–98, 1993.
Jos{\’e} C. Pinheiro and Douglas M. Bates. Mixed-effects models in {S and {S-PLUS}}. Springer, New York, 2000.
Gerry P. Quinn and Michael J. Keough. Experimental Design and Data Analysis for Biologists. Cambridge University Press, Cambridge, England, 2002.
down somewhere \ldots)

No comments: