If you’re working with linear mixed effects models in R using the lmer function from the lme4 package, you may have wondered how to calculate a 95% confidence interval (CI) for the root mean squared error (RMSE). In this article, we’ll briefly review linear mixed effects models and RMSE, then show how to calculate an approximate 95% CI for RMSE from an lmer fit.

What is lmer?

lmer is a function in R that fits linear mixed effects models to your data. It’s part of the lme4 package, which also provides glmer and nlmer for generalized and nonlinear mixed models.

Linear mixed effects models are regression models that combine fixed effects with random effects, so they can account for grouped or nested data. They’re commonly used in fields such as psychology, education, and biology, where observations are often clustered, for example repeated measurements on the same subject or students within schools.

What is RMSE?

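RMSE (root mean squared error) is a measure of the typical size of the difference between predicted values and observed values. It is the square root of the average squared difference, so large errors count more heavily than small ones. It’s a widely used metric in statistics and machine learning to evaluate the performance of a model.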

RMSE is calculated using the following formula:

RMSE = sqrt(mean((predicted - observed)^2))
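As a quick illustration of the formula, here is a minimal sketch on made-up numbers. The vectors observed and predicted are hypothetical toy data, not taken from any model.

```r
# Toy data, made up purely to illustrate the formula
observed  <- c(3, 5, 7, 9)
predicted <- c(2, 5, 8, 11)

# Squared errors are 1, 0, 1 and 4, so the mean squared error is 1.5
rmse_value <- sqrt(mean((predicted - observed)^2))
print(rmse_value)
```

Because the errors are squared before averaging, the single error of 2 contributes more to the result than the two errors of 1 combined.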

Why do we need the 95% CI for RMSE?

The 95% CI for RMSE provides a range of plausible values for the true RMSE. This is useful for several reasons:

  • It provides a measure of uncertainty around the RMSE estimate
  • It helps when comparing the performance of different models, since a difference in RMSE that is small relative to the width of the intervals may just be noise
  • It lets you judge whether the error is acceptably small for your application, rather than relying on a single point estimate

Calculating the 95% CI for RMSE from lmer

Calculating the 95% CI for RMSE from lmer involves several steps. We’ll use the sleepstudy dataset from the lme4 package to illustrate the process.


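# Load lme4, which provides lmer and the sleepstudy data
library(lme4)
# Fit the linear mixed effects model with a random intercept per subject
fit <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)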

The first step is to extract the residual standard deviation from the fitted model with the sigma function.


# Extract the residual standard deviation
sigma <- sigma(fit)
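It can help to see how the residual standard deviation relates to an RMSE computed directly from residuals. The sketch below is an illustration rather than part of the original recipe; the names sigma_hat, rmse_conditional and rmse_marginal are chosen here for clarity. It assumes that "conditional" means predictions that include the estimated random effects and "marginal" means predictions from the fixed effects only.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Model-based estimate of the residual standard deviation
sigma_hat <- sigma(fit)

# In-sample RMSE of the conditional residuals (random effects included)
rmse_conditional <- sqrt(mean(residuals(fit)^2))

# In-sample RMSE when predicting from the fixed effects only
marginal_pred <- predict(fit, re.form = NA)
rmse_marginal <- sqrt(mean((sleepstudy$Reaction - marginal_pred)^2))

print(c(sigma = sigma_hat,
        rmse_conditional = rmse_conditional,
        rmse_marginal = rmse_marginal))
```

The three numbers answer slightly different questions, so it is worth deciding which one you mean by "RMSE" before attaching an interval to it.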

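Next, we need the residual degrees of freedom. A common approximation is the number of observations minus the number of fixed-effect parameters. In a mixed model this is only approximate, because the random effects also use up information and the observations within a group are not independent.

n <- nrow(sleepstudy)
p <- length(fixef(fit))
df <- n - p

The residual standard deviation is already on the RMSE scale: it is the model's estimate of the standard deviation of the residual errors, which is what RMSE measures. It should not be divided by sqrt(df) again.

rmse <- sigma

To calculate an approximate 95% CI for RMSE, we can use the chi-squared interval for a standard deviation. The quantiles are given in the order 0.975, 0.025 so that the lower limit comes first:

ci_rmse <- rmse * sqrt(df / qchisq(c(0.975, 0.025), df))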

The qchisq function returns the quantiles of the chi-squared distribution, which we use to calculate the confidence interval.
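The steps above can be collected into one function. This is a minimal sketch, not an lme4 feature: rmse_ci_chisq is a hypothetical helper name, and it assumes that the residual standard deviation is the quantity of interest and that n minus the number of fixed effects is an acceptable stand-in for the residual degrees of freedom.

```r
library(lme4)

# Hypothetical helper (not part of lme4): approximate chi-squared CI for the
# residual standard deviation, assuming df = n - p is adequate.
rmse_ci_chisq <- function(fit, level = 0.95) {
  alpha <- 1 - level
  s <- sigma(fit)                        # residual standard deviation
  df <- nobs(fit) - length(fixef(fit))   # approximate residual df
  # The upper quantile gives the lower limit, and vice versa
  limits <- s * sqrt(df / qchisq(c(1 - alpha / 2, alpha / 2), df))
  c(estimate = s, lower = limits[1], upper = limits[2], df = df)
}

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
result <- rmse_ci_chisq(fit)
print(result)
```

The interval is not symmetric around the estimate, because the chi-squared distribution is skewed.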

Interpreting the Results

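The 95% CI for RMSE gives a range of values that is plausible for the true RMSE given the data. For example, if the CI is (10, 20), the data are consistent with a true RMSE anywhere between 10 and 20, in the units of the response (milliseconds for Reaction in sleepstudy).

A narrower CI indicates that the RMSE is estimated more precisely, which usually reflects a larger number of observations. It does not by itself mean that the model predicts better; that is what the RMSE value itself tells you.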
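As a cross-check on the chi-squared interval, lme4 can compute a profile likelihood interval for the residual standard deviation through confint. The sketch below assumes the default parameter naming, under which the residual standard deviation appears as the row named .sigma; if your lme4 setup labels rows differently, inspect the full matrix first.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# "theta_" requests the variance-covariance parameters, which here are the
# random-intercept standard deviation and the residual standard deviation.
ci_profile <- confint(fit, parm = "theta_", method = "profile")

# With default naming, the residual standard deviation is the ".sigma" row
ci_sigma <- ci_profile[".sigma", ]
print(ci_sigma)
```

If the two intervals roughly agree, the n - p approximation is probably doing little harm for this model.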

Common Issues and Troubleshooting

When calculating the 95% CI for RMSE from lmer, you may encounter some common issues. Here are some troubleshooting tips:

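  • If you encounter an error message when trying to extract the residual standard deviation, check that lme4 is loaded, that the model was fitted without errors, and that fit is the object returned by lmer.
  • If the CI is very wide, check the number of observations. With the chi-squared formula, the width is determined by the degrees of freedom, so a wide interval usually means there is little data.
  • If the CI is very narrow, remember that n - p counts every observation as independent. With few groups or strongly correlated repeated measures, the interval can be too optimistic, and an in-sample RMSE can also understate the error on new data if the model overfits.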
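When the degrees of freedom are in doubt, a parametric bootstrap is one way to cross-check the interval without relying on them. The sketch below uses bootMer from lme4 to refit the model on data simulated from the fitted model and takes percentile limits of the resulting residual standard deviations. The seed and the 200 replicates are arbitrary choices for illustration; more replicates would give steadier limits.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Refit the model on simulated responses and record the residual SD each time
boot_fit <- bootMer(fit, FUN = function(m) sigma(m), nsim = 200)

# Percentile interval from the bootstrap replicates
ci_boot <- quantile(boot_fit$t, probs = c(0.025, 0.975), na.rm = TRUE)
print(ci_boot)
```

Some bootstrap refits may print convergence or singular-fit messages; a few of these are usually harmless, but many would suggest the model is hard to estimate from this amount of data.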


Calculating an approximate 95% CI for RMSE from lmer takes only a few lines of R and shows how much uncertainty sits behind a single RMSE value. By following the steps outlined in this article, and keeping in mind that the degrees of freedom are an approximation in mixed models, you can calculate the 95% CI for RMSE and gain a deeper understanding of your model's performance.


