Calculating R-square values for two LME models with R

Calculating R-squared values cannot be done directly (as far as we’ve found) in R for LME models. Thanks to Corine Horsch, we followed a method found in this book and created a function to calculate R2 straight away, which you can see below:

getRSquare <- function(Model0, Model1){

  sigma0SQ <- Model0$sigma^2
  tau0SQ <- diag(sqrt(getVarCov(Model0)))^2
  sigma1SQ <- Model1$sigma^2
  tau1SQ <- diag(sqrt(getVarCov(Model1)))^2

  Rsquare <- 1 - ((sigma1SQ + tau1SQ) / (sigma0SQ + tau0SQ))

  return(Rsquare)
}


In models with nested random effects, the function getVarCov() may not work. We then have to resort to a sneakier way of getting the tau value, using the more generic VarCorr() function:

vc.Model0 <- VarCorr(Model0)
tau0SQ <- as.numeric(vc.Model0[4,2])^2
vc.Model1 <- VarCorr(Model1)
tau1SQ <- as.numeric(vc.Model1[4,2])^2

Note here that we are accessing the table at a specific cell: [4,2]. If you print vc.Model0 or vc.Model1, the cell in the 4th row/2nd column corresponds exactly to the tau value of the inner nested random effect. To get the outer one, for example, you would use cell [2,2]. Experiment with this table to understand how to access the right cell for models with several nested random effects.
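To see what the function above actually computes, here is a toy numeric example in base R. The variance components are made-up numbers, purely for illustration:

```r
# Hypothetical variance components (illustrative values only):
sigma0SQ <- 2.4; tau0SQ <- 1.6   # null model: residual + random-intercept variance
sigma1SQ <- 1.8; tau1SQ <- 1.2   # full model, after adding fixed effects

# The pseudo-R2: proportional reduction in total (residual + random) variance
Rsquare <- 1 - ((sigma1SQ + tau1SQ) / (sigma0SQ + tau0SQ))
Rsquare  # 0.25 here: the fixed effects account for 25% of the variance
```

The same arithmetic is what getRSquare() performs, with sigma and tau extracted from the two fitted models.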

Specifying correlation matrices for longitudinal models in R

Specifying correlation matrices in the various R packages for longitudinal data analysis (e.g. geepack, nlme) can be quite confusing, especially since a model may converge with its default correlation matrix and thereby subject your analysis to assumptions you did not intend. Here are various ways to specify a correlation matrix, depending on the package.

In geepack, you can use the argument “corstr”:

gee.model <- geeglm(response ~ fixed, id = rand,
data = yourData, family=gaussian, corstr="unstructured")

In nlme, it can be a little more complicated. You can start by specifying and initializing the matrix:

corMatrix <- corSymm(form = ~1|rand)
corMatrix <- Initialize(corMatrix, data = yourData)

And then, you can assign the initialized matrix to your model. Note that there has to be a match between the formula in your matrix and the random effect structure.

lme.model <- lme(response ~ fixed, random = ~1|rand,
data = yourData, correlation = corMatrix)

p.s. The R package “lme4” (lmer) still, as of the writing of this post, does not support a custom correlation matrix.
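For a self-contained sketch of this workflow, here is a hedged example using nlme’s built-in Orthodont data set. Note two deviations from the snippet above: I use corAR1 rather than corSymm (fewer parameters, so it converges more readily), and I pass the correlation structure directly to lme(), which initializes it internally. The data set and model are illustrations only, not the analysis from this post:

```r
library(nlme)  # ships with standard R distributions

# Random-intercept model on the built-in Orthodont data, first with the
# default (independent) within-group errors, then refitted with an AR(1)
# correlation structure across the repeated measurements per subject.
fit.indep <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont)
fit.ar1   <- update(fit.indep,
                    correlation = corAR1(form = ~1 | Subject))

# Compare the two fits (AIC/BIC and a likelihood ratio test):
anova(fit.indep, fit.ar1)
```

The same pattern applies with corSymm or corCompSymm; as noted above, the grouping in the correlation formula has to match the random effect structure.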

Mixed models with nested random effects in R

Once again, SPSS failed me miserably while trying to run my mixed, multilevel model below:

[Figure: nested model]

The first two levels, BSO (i.e. day care center) and pp (participant), are nested random effects. As the model shows, each participant’s “emotional” or behavioral code (e.g. engaged, excited, bored, …) was sampled several times, during two sessions (once for each of the app’s two versions). SPSS would not let me build this model properly, mostly because it could not handle random effects with more than one level.

So I turned to R and found several very interesting (though maybe a bit long to read) tutorials using the package lme4; credit goes to Bodo and Ben for their extensive efforts in simplifying what could be a very complex process.

The way it worked for me is below, described in an overly simplified way.

require(xlsx) #to read excel sheets
require(lme4) #the package
Behavioral <- read.xlsx("behavioral", 1) #the data set

#certain things we need to ensure are treated as factors
#by default they would be treated as covariates
Behavioral$version <- as.factor(Behavioral$version)
Behavioral$pp <- as.factor(Behavioral$pp)

#first we construct the null model
#emotion type as response with no fixed factors
#and a nested BSO and pp as random factors
model.null <- lmer(emo.type ~ 1 + (1|BSO/pp), data = Behavioral, REML=FALSE)

#we then construct our model, adding version as fixed factor
model <- lmer(emo.type ~ version + (1|BSO/pp), data = Behavioral, REML=FALSE)

#see summaries of the models here
summary(model.null)
summary(model)

#to obtain the p-value, i.e. to see if adding
#the fixed factor made a significant difference
anova(model.null, model)

#and finally, this will generate a BibTeX entry
#so you can cite the authors of the package!
citation("lme4")

Guest lecture – Empirical Research Methods course @TU Delft

Gave another short guest lecture in the Empirical Research Methods (ERM) masters course at TU Delft. It is about using bootstrapping to overcome several obstacles in data analysis, including violation of the assumption of normality. Bootstrapping can be used to calculate new confidence intervals, permitting the analyst to once again perform tests that assume normality, such as t-tests, ANOVA, etc.
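As a toy illustration of the core idea (not material from the lecture itself): a percentile bootstrap of the mean for a skewed sample, in base R. The data and the number of resamples are made up for the example:

```r
# Percentile bootstrap sketch: resample the data with replacement many
# times, compute the statistic on each resample, and take percentiles of
# the resulting distribution as a confidence interval.
set.seed(42)
x <- rexp(50, rate = 1)            # a skewed, clearly non-normal sample

boot.means <- replicate(2000, mean(sample(x, replace = TRUE)))
ci <- quantile(boot.means, c(0.025, 0.975))  # 95% percentile bootstrap CI
ci
```

No normality assumption is needed anywhere: the interval comes entirely from the empirical distribution of the resampled means.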

The lecture is recorded for OpenCourseWare, you can access it here.

Generalized Estimating Equations (GEE) models using R

I’ve been having the feeling lately that using SPSS for running multilevel analyses may not be the greatest idea. I had to switch to something more robust and better documented. Enter R: free, open source, well documented, with tons of well-maintained packages; great for both simple and complex analyses, mathematics in general, and plotting, as well as pretty much anything you would have used MATLAB for.

Here’s a quick explanation of how to perform the same analysis that was done earlier using SPSS, this time in R. The package I’m using here is geepack, and it’s documented here.

require(xlsx) #to read excel sheets
require(geepack) #the gee package

#reading my data file in excel, with a first row of column headers
J1 <- read.xlsx("J1 multimodel.xlsx", 1)

#making sure I read my predictors as factors
#default would treat them as covariates.
J1$N <- as.factor(J1$N)
J1$A <- as.factor(J1$A)
J1$Cr <- as.factor(J1$Cr)
J1$Cd <- as.factor(J1$Cd)

#constructing the model:
#slider as response,
#N, A, Cr, and CD as fixed effects
#and pp as random effect
#with slider's distribution as gaussian
# and an unstructured covariance matrix
gee01 <- geeglm(slider ~ N*A*Cr*Cd,
                id = pp, data = J1,
                family = gaussian, corstr = "unstructured")

#and finally, to see the results:
summary(gee01)

Following two master’s courses @TU Delft now

I’m now following two masters courses at TU Delft. It has been quite a long time since I actually took courses, so this feels a bit unusual. The two courses will each last for two quarters, more like a whole semester in US college terms. So each course will add 5 points to my graduate school tally (10 in total) and, well, quite some hefty homework!

One of the two courses is called Intelligent User Experience Engineering (IUXE), a hybrid of interaction design and HCI techniques, along with AI, and evaluation methods. The other is Empirical Research Methods (ERM), a course focused on teaching new researchers the knowledge and competence required for developing models, preparing experiments, and choosing and conducting statistical tests using software tools like SPSS or R.

The lectures in these courses (including audio!) are actually available on TU Delft’s OpenCourseWare, here (IUXE) and here (ERM).