Chapter 5 Adding Fixed Predictors to MLMs
5.1 Learning Objectives
In this chapter, we will introduce fixed predictors at both level-1 and level-2.
The learning objectives for this chapter are:
- Code and interpret fixed effects in multilevel models;
- Explain the difference between conditional and unconditional effects;
- Evaluate the utility of predictors in a model by considering the information from regression coefficients and variance reduced.
All materials for this chapter are available for download here.
5.2 Data Demonstration
The data for this chapter were taken from chapter 3 of Heck, R. H., Thomas, S. L., & Tabata, L. N. (2011). Multilevel and Longitudinal Modeling with IBM SPSS: Taylor & Francis. Students are clustered within schools in the data.
5.2.1 Load Data and Dependencies
For this data demo, we will use the following packages:
library(dplyr) # for data manipulation
library(ggplot2) # for visualizations
library(lme4) # for multilevel models
library(lmerTest) # for p-values
library(performance) # for intraclass correlation
And the same dataset of students’ math achievement:
5.2.2 MLM with Level-1 Predictor
As a reminder, in Chapter 4 we estimated the random-intercept-only model, also called the null model:
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: math ~ 1 + (1 | schcode)
## Data: data
##
## REML criterion at convergence: 48877.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6336 -0.5732 0.1921 0.6115 5.2989
##
## Random effects:
## Groups Name Variance Std.Dev.
## schcode (Intercept) 10.64 3.262
## Residual 66.55 8.158
## Number of obs: 6871, groups: schcode, 419
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 57.6742 0.1883 416.0655 306.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now that we’ve explored the null model and the variance decomposition it gives us access to, let’s practice adding a level-1 predictor to our model. Level-1 predictors vary at level-1, which in our example is the student level, meaning that students have different values for a variable. In our data, socioeconomic status (`ses`) and sex (`female`) vary across students, at level-1. Let’s add a fixed effect for `ses` as a predictor to our model.
The following equations describe this model:
Level | Equation |
---|---|
Level 1 | \(math_{ij} = \beta_{0j} + \beta_{1j}ses_{ij} + R_{ij}\) |
Level 2 | \(\beta_{0j} = \gamma_{00} + U_{0j}\) |
 | \(\beta_{1j} = \gamma_{10}\) |
Combined | \(math_{ij} = \gamma_{00} + \gamma_{10}ses_{ij} + U_{0j} + R_{ij}\) |
We’ll be estimating four parameters:
- \(\gamma_{00}\): the fixed effect for the intercept, controlling for `ses`;
- \(\gamma_{10}\): the fixed effect for the slope of `ses`;
- \(\tau_0^2\): a random effect variance for the intercept capturing the variance of schools around the intercept, controlling for `ses`;
- \(\sigma^2\): a random effect variance capturing the variance of students around their school mean math achievement, controlling for `ses`.
Notice that the parameters are now conditional on `ses`. The intercept is no longer interpreted as the intercept across all schools; it’s the intercept across all schools conditional on `ses` being equal to 0, or at the mean `ses` level for the sample given that `ses` is z-scored in these data. Additionally, note that there is no \(U_{1j}\) term associated with the coefficient for `ses`; that’s because we’re only adding a fixed effect for `ses` right now. This implies that the relationship between `ses` and math achievement is the same across all schools (i.e., the slope is fixed, not randomly varying). We’ll look at adding random slope effects in the next chapter. For now, let’s run our model.
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: math ~ ses + (1 | schcode)
## Data: data
##
## REML criterion at convergence: 48215.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.7733 -0.5540 0.1303 0.6469 5.6908
##
## Random effects:
## Groups Name Variance Std.Dev.
## schcode (Intercept) 3.469 1.863
## Residual 62.807 7.925
## Number of obs: 6871, groups: schcode, 419
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 57.5960 0.1329 375.6989 433.36 <2e-16 ***
## ses 3.8739 0.1366 3914.6382 28.35 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## ses -0.025
Per the intercept, the average math achievement across all schools at mean `ses` is 57.596. A one-standard-deviation increase in `ses` is associated with a 3.87-point increase in math achievement. The variance term describing how schools vary around the intercept is 3.469, whereas the variance term describing how the students vary within schools, about their schools’ mean, is 62.807. These variance terms are different from our null model that had no predictors; we can quantify that difference in at least two ways.
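For readers who want to see where these four estimates live in a fitted object, here is a minimal sketch on simulated stand-in data, not the chapter’s dataset. The data frame `sim`, the object name `ses_l1_sim`, and the simulated effect sizes are assumptions made for illustration; `lmer()`, `fixef()`, `VarCorr()`, and `sigma()` are the lme4 functions used.

```r
library(lme4)

# Simulated stand-in data (assumption): 40 schools with 25 students each
set.seed(42)
n_schools <- 40
n_per <- 25
sim <- data.frame(
  schcode = factor(rep(seq_len(n_schools), each = n_per)),
  ses = rnorm(n_schools * n_per)           # z-scored-like level-1 predictor
)
u0 <- rnorm(n_schools, sd = 2)             # school-level intercept deviations
sim$math <- 57 + 4 * sim$ses + u0[as.integer(sim$schcode)] +
  rnorm(nrow(sim), sd = 8)                 # student-level residuals

# Random-intercept model with a fixed effect for ses
ses_l1_sim <- lmer(math ~ 1 + ses + (1 | schcode), data = sim, REML = TRUE)

gammas <- fixef(ses_l1_sim)                # gamma_00 and gamma_10
vc <- as.data.frame(VarCorr(ses_l1_sim))   # variance components as a data frame
tau2_0 <- vc$vcov[vc$grp == "schcode"]     # intercept variance (level 2)
sigma2 <- sigma(ses_l1_sim)^2              # residual variance (level 1)

gammas
c(tau2_0 = tau2_0, sigma2 = sigma2)
```

The same extraction pattern should carry over to a model fit on the chapter’s data, with `data` in place of `sim`.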
One option is to calculate how much level-1 variance was reduced by adding `ses` as a level-1 predictor. If we divide the difference between the null model’s level-1 variance and the new model’s level-1 variance by the null model’s level-1 variance, we get the proportion of level-1 variance that was reduced.
## [1] 0.05624991
So we reduced about 5.6% of level-1 variance by adding `ses` as a level-1 predictor. Another way of stating this is that we reduced the unexplained within-school variance by 5.6%.
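As a quick check on that figure, the arithmetic can be sketched directly from the rounded residual variances printed in the two model summaries. The variable names here are illustrative, and because the inputs are rounded, the result agrees with the value above only to about three decimal places.

```r
# Residual (level-1) variances as printed in the model summaries
sigma2_null <- 66.55    # null model: math ~ 1 + (1 | schcode)
sigma2_ses  <- 62.807   # model with ses: math ~ ses + (1 | schcode)

# Proportion of level-1 variance reduced by adding ses
prop_reduced_l1 <- (sigma2_null - sigma2_ses) / sigma2_null
round(prop_reduced_l1, 3)
```

With the fitted model objects available, the unrounded variances can be obtained by squaring `sigma()` of each model instead of typing the values in.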
Another option is to calculate the conditional ICC, or the proportion of variance explained by clustering after we account for `ses`. Recall from last chapter that the adjusted ICC accounts only for random effect variances, while the conditional ICC accounts for the variance of both random effects and fixed effects. With the null model, the adjusted and conditional ICC values from `performance` are the same because there are no predictors in the model, but with a fixed level-1 predictor in the model, we should reference the conditional ICC, which the output below labels as the unadjusted ICC.
## # Intraclass Correlation Coefficient
##
## Adjusted ICC: 0.052
## Unadjusted ICC: 0.046
After accounting for the effect of socioeconomic status, 4.6% of the variance in math achievement is accounted for by school membership.
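The adjusted ICC in this output can be reproduced by hand from the variance components in the model summary, which may help make the distinction concrete. The sketch below uses the rounded printed variances and illustrative variable names. The unadjusted value is not reproduced here because it also places the variance attributable to the fixed effects in the denominator, and that quantity is not printed in the summary.

```r
# Variance components printed for the model with ses
tau2_0 <- 3.469    # intercept variance (between schools)
sigma2 <- 62.807   # residual variance (within schools)

# Adjusted ICC: random-effect variances only
icc_adjusted <- tau2_0 / (tau2_0 + sigma2)

# For comparison, the same calculation for the null model
icc_null <- 10.64 / (10.64 + 66.55)

round(c(adjusted = icc_adjusted, null = icc_null), 3)
```

Adding the fixed-effects variance to the denominator can only make the ratio smaller, which is consistent with the unadjusted ICC (0.046) falling below the adjusted ICC (0.052).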
5.2.3 Compare Regular and Multilevel Regression
In the previous chapter, we compared a regular regression to a cluster-robust standard error regression. Now, let’s compare those two with a multilevel model.
The regular regression from Chapter 4:
##
## Call:
## lm(formula = math ~ ses, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.459 -4.678 1.144 5.355 47.560
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 57.59817 0.09819 586.61 <2e-16 ***
## ses 4.25468 0.12566 33.86 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.132 on 6869 degrees of freedom
## Multiple R-squared: 0.143, Adjusted R-squared: 0.1429
## F-statistic: 1146 on 1 and 6869 DF, p-value: < 2.2e-16
The cluster-robust standard error regression from Chapter 4:
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 57.59817 0.13020 442.38 < 2.2e-16 ***
## ses 4.25468 0.14981 28.40 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
These two models had the same coefficients but different standard errors, and therefore different t values.
This is our multilevel model:
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: math ~ ses + (1 | schcode)
## Data: data
##
## REML criterion at convergence: 48215.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.7733 -0.5540 0.1303 0.6469 5.6908
##
## Random effects:
## Groups Name Variance Std.Dev.
## schcode (Intercept) 3.469 1.863
## Residual 62.807 7.925
## Number of obs: 6871, groups: schcode, 419
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 57.5960 0.1329 375.6989 433.36 <2e-16 ***
## ses 3.8739 0.1366 3914.6382 28.35 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## ses -0.025
The intercepts are nearly the same between the MLM and regular regressions (57.596 vs. 57.598), but the coefficient for `ses` is not (3.87 vs. 4.25). Why? In the MLM, the coefficient for `ses` represents the mean relationship between SES and math achievement across all schools, weighted by the reliability of each cluster. The weighting reflects cluster-level sample size, and thus the estimate differs from the regular regression estimate, which treats all observations equally.
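One complementary way to build intuition for why the weighting matters is to split a predictor into its within-school and between-school parts. The sketch below uses simulated data in base R, with assumed effects of 2 within schools and 5 between schools; every name and value in it is hypothetical. It shows that the pooled regression slope is a weighted average of the within and between slopes, so any method that weights the two sources differently will produce a different coefficient.

```r
set.seed(1)
n_schools <- 50
n_per <- 20
school <- rep(seq_len(n_schools), each = n_per)

# Simulated ses with a school-level and a student-level component
ses <- rnorm(n_schools)[school] + rnorm(n_schools * n_per)
ses_mean <- ave(ses, school)          # school mean of ses
ses_within <- ses - ses_mean          # student deviation from school mean

# Assumed effects: 2 within schools, 5 between schools
math <- 50 + 2 * ses_within + 5 * ses_mean +
  rnorm(n_schools * n_per, sd = 3)

b_pooled  <- coef(lm(math ~ ses))[["ses"]]
b_within  <- coef(lm(math ~ ses_within))[["ses_within"]]
b_between <- coef(lm(math ~ ses_mean))[["ses_mean"]]

# Pooled slope as a weighted average of the two slopes,
# weighted by the within and between sums of squares of ses
ssw <- sum(ses_within^2)
ssb <- sum((ses_mean - mean(ses))^2)
b_weighted <- (ssw * b_within + ssb * b_between) / (ssw + ssb)

c(pooled = b_pooled, within = b_within,
  between = b_between, weighted = b_weighted)
```

In this simulation the pooled slope lands between the within-school and between-school slopes. A random-intercept model also blends the two sources of information, but it generally gives the within-school information relatively more weight than the pooled regression does, which is one plausible reason the two `ses` coefficients in this chapter differ.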
5.2.4 MLM with Level-2 Predictor
We added `ses` as a level-1 predictor to explain some of the student-level variance in math achievement. Now, let’s add a predictor that varies at level-2, meaning that the value is different across level-2 units, which are schools in our example. Level-2 predictors are different across schools but the same for all students within a school. There are three possible level-2 predictors:
- `ses_mean`: the mean SES per school (this variable is centered; we’ll discuss centering more in Chapter 9);
- `pro4yc`: the percentage of students at a school who intend to study at a 4-year college/university;
- `public`: whether the school is private (0) or public (1).
This is where we begin to unlock the potential of MLMs, to ask questions about both individual differences (level-1 variables) and school differences (level-2 variables) at the same time while accounting for clustered data structures. Let’s consider the role of school type in our model by adding a fixed effect for `public` as a predictor of our intercept.
The following equations describe this model:
Level | Equation |
---|---|
Level 1 | \(math_{ij} = \beta_{0j} + \beta_{1j}ses_{ij} + R_{ij}\) |
Level 2 | \(\beta_{0j} = \gamma_{00} + \gamma_{01}public_j + U_{0j}\) |
 | \(\beta_{1j} = \gamma_{10}\) |
Combined | \(math_{ij} = \gamma_{00} + \gamma_{01}public_{j} + \gamma_{10}ses_{ij} + U_{0j} + R_{ij}\) |
A few things to note here: first, \(public_j\) only has a j subscript because only different schools (j’s) have different values of `public`. All students (i’s) within a school have the same value. Second, `public` is currently only a predictor for the intercept. In Chapter 6 we’ll look at using level-2 variables as predictors of level-1 slopes and the cross-level interactions that result.
We’ll be estimating five parameters:
- \(\gamma_{00}\): the fixed effect for the intercept, controlling for `ses` and `public`;
- \(\gamma_{01}\): the fixed effect for the slope of `public`, controlling for `ses`;
- \(\gamma_{10}\): the fixed effect for the slope of `ses`, controlling for `public`;
- \(\tau_0^2\): a random effect variance for the intercept capturing the variance of schools around the intercept, controlling for `ses` and `public`;
- \(\sigma^2\): a random effect variance capturing the variance of students around their school mean math achievement, controlling for `ses` and `public`.
Notice that the parameters are conditional on both `ses` and `public` now. Let’s run our model.
ses_l1_public_l2 <- lmer(math ~ 1 + ses + public + (1|schcode), data = data, REML = TRUE)
summary(ses_l1_public_l2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: math ~ 1 + ses + public + (1 | schcode)
## Data: data
##
## REML criterion at convergence: 48216
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.7718 -0.5541 0.1309 0.6477 5.6916
##
## Random effects:
## Groups Name Variance Std.Dev.
## schcode (Intercept) 3.486 1.867
## Residual 62.807 7.925
## Number of obs: 6871, groups: schcode, 419
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 57.63143 0.25535 381.81733 225.693 <2e-16 ***
## ses 3.87338 0.13673 3928.37427 28.329 <2e-16 ***
## public -0.04859 0.29862 385.93649 -0.163 0.871
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) ses
## ses 0.013
## public -0.854 -0.031
Let’s look at our fixed effects, which describe the conditional mean effect of a variable on the outcome, across all schools. Per the intercept, the average math achievement across all private schools (`public` = 0) at mean SES (`ses` = 0) is 57.63. A one-standard-deviation increase in `ses` is associated with a 3.87-point increase in math achievement, controlling for school type. Public schools at mean `ses` score 0.05 points lower on average in math achievement than private schools, a difference that is not statistically significant (p = 0.871).
From our random effect variances, the variance term describing how schools vary around the intercept (at mean SES at private schools) is 3.49, and the variance term describing how students vary around their school means is 62.81.
Let’s calculate the variance reduced at level 1 and level 2, relative to the null model, now that the model includes both `ses` and school type.
# level-1 variance reduced
sigma2_null <- sigma(null_model)^2
sigma2_public <- sigma(ses_l1_public_l2)^2
(sigma2_null - sigma2_public) / sigma2_null
## [1] 0.05624525
# level-2 variance reduced
tau2_null <- VarCorr(null_model)$schcode[1]
tau2_public <- VarCorr(ses_l1_public_l2)$schcode[1]
(tau2_null - tau2_public) / tau2_null
## [1] 0.6724414
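Relative to the null model, we reduced around 5.6% of variance in math achievement at level-1 and 67.2% of variance at level-2. Because the comparison is against the null model, these reductions reflect `ses` and `public` together, not `public` alone. The model with only `ses` already had an intercept variance of 3.469, and adding `public` left it essentially unchanged at 3.486, so nearly all of the level-2 reduction comes from `ses`. A level-1 predictor can reduce level-2 variance in this way when schools differ in their average `ses`.
So, does it seem like school type is related to math achievement? We have two sources of information to consider so far: the regression coefficient and the variance reduced. The regression coefficient is small and not statistically significant (-0.05, p = 0.871), and adding `public` to the model that already contained `ses` did not reduce the intercept variance, so school type does not appear to add much to this model beyond `ses`.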
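To separate what `public` contributes from what `ses` contributes, one option is to compare the level-2 variance against the `ses`-only model rather than only against the null model. The sketch below does this arithmetic with the rounded intercept variances printed in the three model summaries, so results can differ slightly from values computed on the fitted objects. The variable names `tau2_ses`, `reduced_vs_null`, and `reduced_vs_ses` are illustrative.

```r
# Intercept (level-2) variances as printed in the three model summaries
tau2_null   <- 10.64   # math ~ 1 + (1 | schcode)
tau2_ses    <- 3.469   # math ~ ses + (1 | schcode)
tau2_public <- 3.486   # math ~ 1 + ses + public + (1 | schcode)

# Reduction relative to the null model (ses and public together)
reduced_vs_null <- (tau2_null - tau2_public) / tau2_null

# Reduction relative to the ses-only model (public on its own)
reduced_vs_ses <- (tau2_ses - tau2_public) / tau2_ses

round(c(vs_null = reduced_vs_null, vs_ses_only = reduced_vs_ses), 3)
```

The first value is about 0.672 and the second is about -0.005. A proportion near zero, or slightly negative as here, suggests that the added predictor accounts for essentially none of the remaining between-school variance.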
5.3 Conclusion
In this chapter, we added level-1 and level-2 fixed effects to our models, considered the difference between conditional and unconditional effects, and used regression coefficients and variance reduced to make a decision about retaining model parameters. In Chapter 6, we’ll work with random slopes and explain cross-level interactions.