Multiple Group Growth Model - Multilevel & SEM Implementation in R

Overview

This tutorial illustrates fitting of multiple group linear growth models in the multilevel and SEM frameworks in R.

Example data and code are drawn from Chapter 6 of Grimm, Ram, and Estabrook (2017). Specifically, using the NLSY-CYA Dataset we examine how change in children’s mathematics achievement across grade differs across groups defined by low (< 5.5 lbs) and normal birth weight. Please see the book chapter for additional interpretations and insights about the analyses.

Preliminaries - Loading libraries used in this script.

library(psych)  #for basic functions
library(plyr)   #for data management
library(ggplot2)  #for plotting
library(nlme) #for mixed effects models
library(lme4) #for mixed effects models
library(lavaan) #for SEM 
library(semPlot) #for making SEM diagrams

Preliminaries - Data Preparation and Description

For our examples, we use the mathematics achievement scores from the NLSY-CYA Long Data.

Load the repeated measures data

#set filepath for data file
filepath <- "https://raw.githubusercontent.com/LRI-2/Data/main/GrowthModeling/nlsy_math_long_R.dat"
#read in the text data file using the url() function
dat <- read.table(file=url(filepath),
                  na.strings = ".")  #indicates the missing data designator
#copy data with new name 
nlsy_math_long <- dat  

#Add names the columns of the data set
names(nlsy_math_long) = c('id'     , 'female', 'lb_wght', 
                          'anti_k1', 'math'  , 'grade'  ,
                          'occ'    , 'age'   , 'men'    ,
                          'spring' , 'anti')

#reducing to variables of interest 
nlsy_math_long <- nlsy_math_long[ ,c("id","grade","math","lb_wght")]

#adding another dummy code variable for normal birth weight that coded the opposite of the low brithweight variable. 
nlsy_math_long$nb_wght <- 1 - nlsy_math_long$lb_wght

#view the first few observations in the data set 
head(nlsy_math_long, 10)
##      id grade math lb_wght nb_wght
## 1   201     3   38       0       1
## 2   201     5   55       0       1
## 3   303     2   26       0       1
## 4   303     5   33       0       1
## 5  2702     2   56       0       1
## 6  2702     4   58       0       1
## 7  2702     8   80       0       1
## 8  4303     3   41       0       1
## 9  4303     4   58       0       1
## 10 5002     4   46       0       1

Our specific interest is intraindividual change in the repeated measures of math change across grade, and how those those trajectories differ across lb_wght and nb_wght groups.

As noted in Chapter 2 , it is important to plot the data to obtain a better understanding of the structure and form of the observed phenomenon. Here, we want to examine the data to see how the repeated measures of math are structured with respect to grade, and how that differs across groups.

Longitudinal Plot of Math across Grade at Testing

#intraindividual change trajetories
ggplot(data=nlsy_math_long,                    #data set
       aes(x = grade, y = math, group = id)) + #setting variables
  geom_point(size=.5) + #adding points to plot
  geom_line() +  #adding lines to plot
  theme_bw() +   #changing style/background
  #setting the x-axis with breaks and labels
  scale_x_continuous(limits=c(2,8),
                     breaks = c(2,3,4,5,6,7,8), 
                     name = "Grade at Testing") +    
  #setting the y-axis with limits breaks and labels
  scale_y_continuous(limits=c(10,90), 
                     breaks = c(10,30,50,70,90), 
                     name = "PIAT Mathematics") +
  facet_wrap(~lb_wght)

Note that we are able to see the trajectories for the two groups separately, normal birth weight on the left (a larger group) and low birth weight on the right (a smaller group). Cooll!

Multilevel: Multiple Group Growth Models using nlme() function

To capture the systematic growth in the mathematics scores over grade, we use the grade variable, so that the rate of growth indicates (linear) school-related change in mathematics scores.

We fit a series of 4 models to test a collection of possible group differences.

M1. Invariance model (all parameters equivalent)

M2. Means model (means differ)

M3. Means & Covariances model (means + intercept & slope variances and covariance differ)

M4. Means & Covariances & Residual Variances (all parameters differ)

M1: Invariance model

Note that in fitting the linear growth model, we have recentered the time variable, so that the intercept is the expected score at Grade 2 (the beginning of the data).

We use the nb_wght and lb_wght variables to “turn on” and “turn off” the model formula for each group. Parameters with identical labels are constrained to be equivalent across groups.

#fitting linear growth model with tic and assigning it to an object
mg_math_M1_nlme <- nlme(math ~ nb_wght*((beta_1 + d_1i) + (beta_2 + d_2i)*(grade-2)) +
                               lb_wght*((beta_1 + d_1i) + (beta_2 + d_2i)*(grade-2)),
                          data = nlsy_math_long,
                          fixed = beta_1 + beta_2 ~ 1,
                          random = d_1i + d_2i ~  1,
                          group = ~id,
                          start=c(beta_1=35, beta_2=4),
                          na.action=na.exclude,
                          control = c(returnObject = TRUE))

#obtaining summary of the model using the object we just created
summary(mg_math_M1_nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: math ~ nb_wght * ((beta_1 + d_1i) + (beta_2 + d_2i) * (grade -      2)) + lb_wght * ((beta_1 + d_1i) + (beta_2 + d_2i) * (grade -      2)) 
##   Data: nlsy_math_long 
##        AIC      BIC    logLik
##   15949.39 15983.62 -7968.693
## 
## Random effects:
##  Formula: list(d_1i ~ 1, d_2i ~ 1)
##  Level: id
##  Structure: General positive-definite, Log-Cholesky parametrization
##          StdDev    Corr  
## d_1i     8.0350377 d_1i  
## d_2i     0.8559002 -0.026
## Residual 6.0191136       
## 
## Fixed effects:  beta_1 + beta_2 ~ 1 
##           Value Std.Error   DF  t-value p-value
## beta_1 35.26736 0.3551567 1288 99.30083       0
## beta_2  4.33933 0.0873767 1288 49.66231       0
##  Correlation: 
##        beta_1
## beta_2 -0.532
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.030277915 -0.525876524  0.001632888  0.540751622  2.542250378 
## 
## Number of Observations: 2221
## Number of Groups: 932

M2: Means model

We use the nb_wght and lb_wght variables to “turn on” and “turn off” the model formula for each group. Parameters with identical labels are constrained to be equivalent across groups.

#fitting linear growth model with tic and assigning it to an object
mg_math_M2_nlme <- nlme(math ~ nb_wght*((beta_1_N + d_1i) + (beta_2_N + d_2i)*(grade-2)) +
                               lb_wght*((beta_1_L + d_1i) + (beta_2_L + d_2i)*(grade-2)),
                          data = nlsy_math_long,
                          fixed = beta_1_N + beta_2_N + beta_1_L + beta_2_L  ~ 1,
                          random = d_1i + d_2i ~  1,
                          group = ~id,
                          start=c(beta_1_N=35, beta_2_N=4,beta_1_L=35, beta_2_L=4),
                          na.action=na.exclude,
                          control = c(returnObject = TRUE))

#obtaining summary of the model using the object we just created
summary(mg_math_M2_nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: math ~ nb_wght * ((beta_1_N + d_1i) + (beta_2_N + d_2i) * (grade -      2)) + lb_wght * ((beta_1_L + d_1i) + (beta_2_L + d_2i) *      (grade - 2)) 
##   Data: nlsy_math_long 
##        AIC      BIC    logLik
##   15948.18 15993.83 -7966.092
## 
## Random effects:
##  Formula: list(d_1i ~ 1, d_2i ~ 1)
##  Level: id
##  Structure: General positive-definite, Log-Cholesky parametrization
##          StdDev    Corr  
## d_1i     7.9814728 d_1i  
## d_2i     0.8360326 -0.005
## Residual 6.0266661       
## 
## Fixed effects:  beta_1_N + beta_2_N + beta_1_L + beta_2_L ~ 1 
##             Value Std.Error   DF  t-value p-value
## beta_1_N 35.48755 0.3693084  931 96.09192       0
## beta_2_N  4.29228 0.0906144 1287 47.36865       0
## beta_1_L 32.73331 1.2445560 1287 26.30120       0
## beta_2_L  4.90490 0.3201715 1287 15.31961       0
##  Correlation: 
##          bt_1_N bt_2_N bt_1_L
## beta_2_N -0.530              
## beta_1_L  0.000  0.000       
## beta_2_L  0.000  0.000 -0.529
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.063587390 -0.528511594 -0.003018735  0.536375339  2.542164540 
## 
## Number of Observations: 2221
## Number of Groups: 932

Comparison of the fit of the M1 and M2 models suggests that the means do not differ.

anova(mg_math_M1_nlme,mg_math_M2_nlme)
##                 Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mg_math_M1_nlme     1  6 15949.39 15983.62 -7968.693                        
## mg_math_M2_nlme     2  8 15948.18 15993.83 -7966.092 1 vs 2 5.200611  0.0743

M3: Means & Covariances model

We use the nb_wght and lb_wght variables to “turn on” and “turn off” the model formula for each group. Parameters with identical labels are constrained to be equivalent across groups.

#fitting linear growth model with tic and assigning it to an object
mg_math_M3_nlme <- nlme(math ~ nb_wght*((beta_1_N + d_1i_N) + (beta_2_N + d_2i_N)*(grade-2)) +
                               lb_wght*((beta_1_L + d_1i_L) + (beta_2_L + d_2i_L)*(grade-2)),
                          data = nlsy_math_long,
                          fixed = beta_1_N + beta_2_N + beta_1_L + beta_2_L  ~ 1,
                          #random effects structured to be different across groups 
                          random = pdBlocked(list(d_1i_N + d_2i_N ~ 1, 
                                                  d_1i_L + d_2i_L ~ 1)),
                          group = ~id,
                          start=c(beta_1_N=35, beta_2_N=4,beta_1_L=35, beta_2_L=4),
                          na.action=na.exclude,
                          control = c(returnObject = TRUE))
## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step
#obtaining summary of the model using the object we just created
summary(mg_math_M3_nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + d_2i_N) *      (grade - 2)) + lb_wght * ((beta_1_L + d_1i_L) + (beta_2_L +      d_2i_L) * (grade - 2)) 
##   Data: nlsy_math_long 
##        AIC      BIC    logLik
##   15951.46 16014.22 -7964.728
## 
## Random effects:
##  Composite Structure: Blocked
## 
##  Block 1: d_1i_N, d_2i_N
##  Formula: list(d_1i_N ~ 1, d_2i_N ~ 1)
##  Level: id
##  Structure: General positive-definite
##        StdDev    Corr  
## d_1i_N 7.8142066 d_1i_N
## d_2i_N 0.8143276 0.038 
## 
##  Block 2: d_1i_L, d_2i_L
##  Formula: list(d_1i_L ~ 1, d_2i_L ~ 1)
##  Level: id
##  Structure: General positive-definite
##          StdDev   Corr  
## d_1i_L   9.761297 d_1i_L
## d_2i_L   1.138890 -0.342
## Residual 6.022945       
## 
## Fixed effects:  beta_1_N + beta_2_N + beta_1_L + beta_2_L ~ 1 
##             Value Std.Error   DF  t-value p-value
## beta_1_N 35.48504 0.3647143  930 97.29543       0
## beta_2_N  4.29270 0.0901960 1288 47.59295       0
## beta_1_L 32.85003 1.4112368  930 23.27748       0
## beta_2_L  4.88119 0.3377520 1288 14.45201       0
##  Correlation: 
##          bt_1_N bt_2_N bt_1_L
## beta_2_N -0.527              
## beta_1_L  0.000  0.000       
## beta_2_L  0.000  0.000 -0.564
## 
## Standardized Within-Group Residuals:
##           Min            Q1           Med            Q3           Max 
## -3.0955259339 -0.5313771004 -0.0009827286  0.5352487334  2.5425565635 
## 
## Number of Observations: 2221
## Number of Groups: 932

Comparison of the fit of the M2 and M3 model suggests that the variances and covariances of the intercepts and slopes do not differ.

anova(mg_math_M2_nlme,mg_math_M3_nlme)
##                 Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mg_math_M2_nlme     1  8 15948.18 15993.83 -7966.092                        
## mg_math_M3_nlme     2 11 15951.46 16014.22 -7964.728 1 vs 2 2.728012  0.4355

M4: Means & Covariances and Residual Variances model

We use the nb_wght and lb_wght variables to “turn on” and “turn off” the model formula for each group. Parameters with identical labels are constrained to be equivalent across groups.

#fitting linear growth model with tic and assigning it to an object
mg_math_M4_nlme <- nlme(math ~ nb_wght*((beta_1_N + d_1i_N) + (beta_2_N + d_2i_N)*(grade-2)) +
                               lb_wght*((beta_1_L + d_1i_L) + (beta_2_L + d_2i_L)*(grade-2)),
                          data = nlsy_math_long,
                          fixed = beta_1_N + beta_2_N + beta_1_L + beta_2_L  ~ 1,
                          #random effects structured to be different across groups 
                          random = pdBlocked(list(d_1i_N + d_2i_N ~ 1, 
                                                  d_1i_L + d_2i_L ~ 1)),
                          #residual variances strutured to be different across groups
                          weights = varIdent(form = ~1 | factor(lb_wght)),
                          group = ~id,
                          start=c(beta_1_N=35, beta_2_N=4,beta_1_L=35, beta_2_L=4),
                          na.action=na.exclude,
                          control = c(returnObject = TRUE))
## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step

## Warning in nlme.formula(math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + :
## step halving factor reduced below minimum in PNLS step
#obtaining summary of the model using the object we just created
summary(mg_math_M4_nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: math ~ nb_wght * ((beta_1_N + d_1i_N) + (beta_2_N + d_2i_N) *      (grade - 2)) + lb_wght * ((beta_1_L + d_1i_L) + (beta_2_L +      d_2i_L) * (grade - 2)) 
##   Data: nlsy_math_long 
##        AIC      BIC    logLik
##   15950.12 16018.59 -7963.061
## 
## Random effects:
##  Composite Structure: Blocked
## 
##  Block 1: d_1i_N, d_2i_N
##  Formula: list(d_1i_N ~ 1, d_2i_N ~ 1)
##  Level: id
##  Structure: General positive-definite
##        StdDev    Corr  
## d_1i_N 7.8921982 d_1i_N
## d_2i_N 0.8795751 -0.009
## 
##  Block 2: d_1i_L, d_2i_L
##  Formula: list(d_1i_L ~ 1, d_2i_L ~ 1)
##  Level: id
##  Structure: General positive-definite
##          StdDev     Corr  
## d_1i_L   8.99419583 d_1i_L
## d_2i_L   0.02849714 0.986 
## Residual 5.93062142       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | factor(lb_wght) 
##  Parameter estimates:
##        0        1 
## 1.000000 1.170217 
## Fixed effects:  beta_1_N + beta_2_N + beta_1_L + beta_2_L ~ 1 
##             Value Std.Error   DF  t-value p-value
## beta_1_N 35.48131 0.3647427  931 97.27765       0
## beta_2_N  4.29731 0.0901739 1287 47.65583       0
## beta_1_L 32.79781 1.4062572 1287 23.32277       0
## beta_2_L  4.87806 0.3382249 1287 14.42253       0
##  Correlation: 
##          bt_1_N bt_2_N bt_1_L
## beta_2_N -0.527              
## beta_1_L  0.000  0.000       
## beta_2_L  0.000  0.000 -0.549
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.080667912 -0.533467904 -0.002345872  0.537432380  2.566342194 
## 
## Number of Observations: 2221
## Number of Groups: 932

Comparison of the fit of the M3 and M4 model suggests that the residual variances do not differ.

anova(mg_math_M3_nlme,mg_math_M4_nlme)
##                 Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mg_math_M3_nlme     1 11 15951.46 16014.22 -7964.728                        
## mg_math_M4_nlme     2 12 15950.12 16018.59 -7963.061 1 vs 2 3.334605  0.0678

We get information about how the model fits the data, as well as some information about the relations of the trajectories with the time-invariant covariates. Specifically, we see from the model that there is evidence that differneces in low birth weight status and antisocial behavior are related to individual differences in intercept, but not in linear slope. More detailed interpetation is given in the accompanying video.

Plotting the predicted trajectories.

#obtaining predicted scores for individuals
nlsy_math_long$pred_mg <- predict(mg_math_M4_nlme)

#obtaining predicted scores for prototype
nlsy_math_long$proto_mg <- predict(mg_math_M4_nlme, level=0)

#intraindividual change trajetories
#plotting predicted trajectory
ggplot(data = nlsy_math_long, aes(x = grade, y = pred_mg, group = id)) +
  geom_point(color="gray", size=.5) + 
  geom_line() +  #adding lines to plot
  geom_line(aes(x = grade, y = proto_mg), color="red",size=2) +
  theme_bw() +   #changing style/background
  #setting the x-axis with breaks and labels
  scale_x_continuous(limits=c(2,8),
                     breaks = c(2,3,4,5,6,7,8), 
                     name = "Grade at Testing") +    
  #setting the y-axis with limits breaks and labels
  scale_y_continuous(limits=c(10,90), 
                     breaks = c(10,30,50,70,90), 
                     name = "Predicted PIAT Mathematics") +
  facet_wrap(~lb_wght)

We see in this plot that even in M4, the predicted trajectories, and the variability in those trajectories is similar across the two groups.

SEM: Multiple Group Growth Model using lavaan

For the SEM implementation, we use the mathematics achievement scores and time-invariant covariates from the NLSY-CYA Wide Data.

Load the repeated measures data

#set filepath for data file
filepath <- "https://raw.githubusercontent.com/LRI-2/Data/main/GrowthModeling/nlsy_math_wide_R.dat"
#read in the text data file using the url() function
dat <- read.table(file=url(filepath),
                  na.strings = ".")  #indicates the missing data designator
#copy data with new name 
nlsy_math_wide <- dat  

# Give the variable names
names(nlsy_math_wide)=c('id'     ,'female' ,'lb_wght','anti_k1',
                        'math2'  ,'math3'  ,'math4'  ,'math5'  ,'math6'  ,'math7'  ,'math8'  ,
                        'age2'   ,'age3'   ,'age4'   ,'age5'   ,'age6'   ,'age7'   ,'age8'   ,
                        'men2'   ,'men3'   ,'men4'   ,'men5'   ,'men6'   ,'men7'   ,'men8'   ,
                        'spring2','spring3','spring4','spring5','spring6','spring7','spring8',
                        'anti2'  ,'anti3'  ,'anti4'  ,'anti5'  ,'anti6'  ,'anti7'  ,'anti8')


#view the first few observations (and columns) in the data set 
head(nlsy_math_wide[ ,1:11], 10)
##      id female lb_wght anti_k1 math2 math3 math4 math5 math6 math7 math8
## 1   201      1       0       0    NA    38    NA    55    NA    NA    NA
## 2   303      1       0       1    26    NA    NA    33    NA    NA    NA
## 3  2702      0       0       0    56    NA    58    NA    NA    NA    80
## 4  4303      1       0       0    NA    41    58    NA    NA    NA    NA
## 5  5002      0       0       4    NA    NA    46    NA    54    NA    66
## 6  5005      1       0       0    35    NA    50    NA    60    NA    59
## 7  5701      0       0       2    NA    62    61    NA    NA    NA    NA
## 8  6102      0       0       0    NA    NA    55    67    NA    81    NA
## 9  6801      1       0       0    NA    54    NA    62    NA    66    NA
## 10 6802      0       0       0    NA    55    NA    66    NA    68    NA

Our specific interest is in change across the the repeated measures of math, math2 to math8 and how those changes differ across groups defined by lb_wght.

We fit a series of 4 models to test a collection of possible group differences.

M1. Invariance model (all parameters equivalent)

M2. Means model (means differ)

M3. Means & Covariances model (means + intercept & slope variances and covariance differ)

M4. Means & Covariances & Residual Variances (all parameters differ)

Specifying linear growth model

#writing out linear growth model in full SEM way 
mg_math_lavaan_model <- '
  # latent variable definitions
      #intercept (note intercept is a reserved term)
      eta_1 =~ 1*math2
      eta_1 =~ 1*math3
      eta_1 =~ 1*math4
      eta_1 =~ 1*math5
      eta_1 =~ 1*math6
      eta_1 =~ 1*math7
      eta_1 =~ 1*math8

      #linear slope 
      eta_2 =~ 0*math2
      eta_2 =~ 1*math3
      eta_2 =~ 2*math4
      eta_2 =~ 3*math5
      eta_2 =~ 4*math6
      eta_2 =~ 5*math7
      eta_2 =~ 6*math8

  # factor variances
      eta_1 ~~ eta_1
      eta_2 ~~ eta_2

  # covariances among factors 
      eta_1 ~~ eta_2

  # factor means 
      eta_1 ~ start(35)*1
      eta_2 ~ start(4)*1

  # manifest variances (made equivalent by naming theta)
      math2 ~~ theta*math2
      math3 ~~ theta*math3
      math4 ~~ theta*math4
      math5 ~~ theta*math5
      math6 ~~ theta*math6
      math7 ~~ theta*math7
      math8 ~~ theta*math8
  # manifest means (fixed at zero)
      math2 ~ 0*1
      math3 ~ 0*1
      math4 ~ 0*1
      math5 ~ 0*1
      math6 ~ 0*1
      math7 ~ 0*1
      math8 ~ 0*1
' #end of model definition

M1: Invariance model

We can do the multiple group designation and contraints in the estimation of the model …

#estimating the model using sem() function
mg_math_lavaan_fitM1 <- sem(mg_math_lavaan_model, 
                          data = nlsy_math_wide,
                          meanstructure = TRUE,
                          estimator = "ML",
                          missing = "fiml",
                          group = "lb_wght", #to separate groups
                          group.equal = c("loadings", #for constraints
                                          "means",
                                          "lv.variances",
                                          "lv.covariances",
                                          "residuals"))
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: some cases are empty and will be ignored:
##   741
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage

## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage
## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

We can look more specifically at the numerical summaries of the model.

#obtaining summary of the model using the object we just created
summary(mg_math_lavaan_fitM1, fit.measures=TRUE)
## lavaan 0.6-9 ended normally after 23 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
##   Number of equality constraints                    18
##                                                       
##   Number of observations per group:               Used       Total
##     0                                              857         858
##     1                                               75          75
##   Number of missing patterns per group:                           
##     0                                               60            
##     1                                               25            
##                                                                   
## Model Test User Model:
##                                                       
##   Test statistic                               249.111
##   Degrees of freedom                                64
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     0                                          191.954
##     1                                           57.156
## 
## Model Test Baseline Model:
## 
##   Test statistic                               887.887
##   Degrees of freedom                                42
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.781
##   Tucker-Lewis Index (TLI)                       0.856
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -7968.693
##   Loglikelihood unrestricted model (H1)      -7844.138
##                                                       
##   Akaike (AIC)                               15949.386
##   Bayesian (BIC)                             15978.410
##   Sample-size adjusted Bayesian (BIC)        15959.354
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.079
##   90 Percent confidence interval - lower         0.069
##   90 Percent confidence interval - upper         0.089
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.128
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2   (.17.)   -0.181    1.150   -0.158    0.875
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.18.)   35.267    0.355   99.229    0.000
##     eta_2   (.19.)    4.339    0.088   49.136    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.15.)   64.562    5.659   11.408    0.000
##     eta_2   (.16.)    0.733    0.327    2.238    0.025
##    .math2   (thet)   36.230    1.867   19.410    0.000
##    .math3   (thet)   36.230    1.867   19.410    0.000
##    .math4   (thet)   36.230    1.867   19.410    0.000
##    .math5   (thet)   36.230    1.867   19.410    0.000
##    .math6   (thet)   36.230    1.867   19.410    0.000
##    .math7   (thet)   36.230    1.867   19.410    0.000
##    .math8   (thet)   36.230    1.867   19.410    0.000
## 
## 
## Group 2 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2   (.17.)   -0.181    1.150   -0.158    0.875
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.18.)   35.267    0.355   99.229    0.000
##     eta_2   (.19.)    4.339    0.088   49.136    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.15.)   64.562    5.659   11.408    0.000
##     eta_2   (.16.)    0.733    0.327    2.238    0.025
##    .math2   (thet)   36.230    1.867   19.410    0.000
##    .math3   (thet)   36.230    1.867   19.410    0.000
##    .math4   (thet)   36.230    1.867   19.410    0.000
##    .math5   (thet)   36.230    1.867   19.410    0.000
##    .math6   (thet)   36.230    1.867   19.410    0.000
##    .math7   (thet)   36.230    1.867   19.410    0.000
##    .math8   (thet)   36.230    1.867   19.410    0.000

We get information about how the model fits the data, and the model parameters. We see that the model parameters are identical across groups.

We can make a diagram to check if we specified and estimated the model as intended.

#diagram of fitted model
semPaths(mg_math_lavaan_fitM1,what = "path", whatLabels = "par")

We get 2 diagrams!

M2: Means model

We can do the multiple group designation and contraints in the estimation of the model …

#estimating the model using sem() function
mg_math_lavaan_fitM2 <- sem(mg_math_lavaan_model, 
                          data = nlsy_math_wide,
                          meanstructure = TRUE,
                          estimator = "ML",
                          missing = "fiml",
                          group = "lb_wght", #to separate groups
                          group.equal = c("loadings", #for constraints
                                          #"means", commented out so can differ
                                          "lv.variances",
                                          "lv.covariances",
                                          "residuals"))
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: some cases are empty and will be ignored:
##   741
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage

## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage
## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

We can look more specifically at the numerical summaries of the model.

#obtaining summary of the model using the object we just created
summary(mg_math_lavaan_fitM2, fit.measures=TRUE)
## lavaan 0.6-9 ended normally after 32 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
##   Number of equality constraints                    16
##                                                       
##   Number of observations per group:               Used       Total
##     0                                              857         858
##     1                                               75          75
##   Number of missing patterns per group:                           
##     0                                               60            
##     1                                               25            
##                                                                   
## Model Test User Model:
##                                                       
##   Test statistic                               243.910
##   Degrees of freedom                                62
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     0                                          191.440
##     1                                           52.470
## 
## Model Test Baseline Model:
## 
##   Test statistic                               887.887
##   Degrees of freedom                                42
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.785
##   Tucker-Lewis Index (TLI)                       0.854
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -7966.093
##   Loglikelihood unrestricted model (H1)      -7844.138
##                                                       
##   Akaike (AIC)                               15948.185
##   Bayesian (BIC)                             15986.884
##   Sample-size adjusted Bayesian (BIC)        15961.477
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.079
##   90 Percent confidence interval - lower         0.069
##   90 Percent confidence interval - upper         0.090
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.127
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2   (.17.)   -0.035    1.144   -0.031    0.975
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            35.488    0.369   96.080    0.000
##     eta_2             4.292    0.092   46.898    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.15.)   63.704    5.637   11.301    0.000
##     eta_2   (.16.)    0.699    0.325    2.147    0.032
##    .math2   (thet)   36.321    1.871   19.413    0.000
##    .math3   (thet)   36.321    1.871   19.413    0.000
##    .math4   (thet)   36.321    1.871   19.413    0.000
##    .math5   (thet)   36.321    1.871   19.413    0.000
##    .math6   (thet)   36.321    1.871   19.413    0.000
##    .math7   (thet)   36.321    1.871   19.413    0.000
##    .math8   (thet)   36.321    1.871   19.413    0.000
## 
## 
## Group 2 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2   (.17.)   -0.035    1.144   -0.031    0.975
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            32.733    1.244   26.314    0.000
##     eta_2             4.905    0.320   15.320    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1   (.15.)   63.704    5.637   11.301    0.000
##     eta_2   (.16.)    0.699    0.325    2.147    0.032
##    .math2   (thet)   36.321    1.871   19.413    0.000
##    .math3   (thet)   36.321    1.871   19.413    0.000
##    .math4   (thet)   36.321    1.871   19.413    0.000
##    .math5   (thet)   36.321    1.871   19.413    0.000
##    .math6   (thet)   36.321    1.871   19.413    0.000
##    .math7   (thet)   36.321    1.871   19.413    0.000
##    .math8   (thet)   36.321    1.871   19.413    0.000

We get information about how the model fits the data, and the model parameters.

Comparison of the fit of the M1 and M2 model suggests that the means do not differ.

anova(mg_math_lavaan_fitM1,mg_math_lavaan_fitM2)
## Chi-Squared Difference Test
## 
##                      Df   AIC   BIC  Chisq Chisq diff Df diff Pr(>Chisq)  
## mg_math_lavaan_fitM2 62 15948 15987 243.91                                
## mg_math_lavaan_fitM1 64 15949 15978 249.11     5.2005       2    0.07425 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M3: Means and Covariances model

We can do the multiple group designation and contraints in the estimation of the model …

#estimating the model using sem() function
mg_math_lavaan_fitM3 <- sem(mg_math_lavaan_model, 
                          data = nlsy_math_wide,
                          meanstructure = TRUE,
                          estimator = "ML",
                          missing = "fiml",
                          group = "lb_wght", #to separate groups
                          group.equal = c("loadings", #for constraints
                                          #"means", commented out so can differ
                                          #"lv.variances",
                                          #"lv.covariances",
                                          "residuals"))
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: some cases are empty and will be ignored:
##   741
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage

## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage
## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

We can look more specifically at the numerical summaries of the model.

#obtaining summary of the model using the object we just created
summary(mg_math_lavaan_fitM3, fit.measures=TRUE)
## lavaan 0.6-9 ended normally after 57 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
##   Number of equality constraints                    13
##                                                       
##   Number of observations per group:               Used       Total
##     0                                              857         858
##     1                                               75          75
##   Number of missing patterns per group:                           
##     0                                               60            
##     1                                               25            
##                                                                   
## Model Test User Model:
##                                                       
##   Test statistic                               241.182
##   Degrees of freedom                                59
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     0                                          191.157
##     1                                           50.024
## 
## Model Test Baseline Model:
## 
##   Test statistic                               887.887
##   Degrees of freedom                                42
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.785
##   Tucker-Lewis Index (TLI)                       0.847
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -7964.728
##   Loglikelihood unrestricted model (H1)      -7844.138
##                                                       
##   Akaike (AIC)                               15951.457
##   Bayesian (BIC)                             16004.668
##   Sample-size adjusted Bayesian (BIC)        15969.732
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.081
##   90 Percent confidence interval - lower         0.071
##   90 Percent confidence interval - upper         0.092
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.124
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2             0.243    1.147    0.212    0.832
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            35.485    0.365   97.271    0.000
##     eta_2             4.293    0.091   47.089    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            61.062    5.692   10.727    0.000
##     eta_2             0.663    0.326    2.033    0.042
##    .math2   (thet)   36.276    1.870   19.402    0.000
##    .math3   (thet)   36.276    1.870   19.402    0.000
##    .math4   (thet)   36.276    1.870   19.402    0.000
##    .math5   (thet)   36.276    1.870   19.402    0.000
##    .math6   (thet)   36.276    1.870   19.402    0.000
##    .math7   (thet)   36.276    1.870   19.402    0.000
##    .math8   (thet)   36.276    1.870   19.402    0.000
## 
## 
## Group 2 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2            -3.801    4.912   -0.774    0.439
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            32.850    1.413   23.241    0.000
##     eta_2             4.881    0.341   14.332    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            95.283   24.221    3.934    0.000
##     eta_2             1.297    1.315    0.986    0.324
##    .math2   (thet)   36.276    1.870   19.402    0.000
##    .math3   (thet)   36.276    1.870   19.402    0.000
##    .math4   (thet)   36.276    1.870   19.402    0.000
##    .math5   (thet)   36.276    1.870   19.402    0.000
##    .math6   (thet)   36.276    1.870   19.402    0.000
##    .math7   (thet)   36.276    1.870   19.402    0.000
##    .math8   (thet)   36.276    1.870   19.402    0.000

We get information about how the model fits the data, and the model parameters.

Comparison of the fit of the M2 and M3 model suggests that the latent variable variances and covariances do not differ.

anova(mg_math_lavaan_fitM2,mg_math_lavaan_fitM3)
## Chi-Squared Difference Test
## 
##                      Df   AIC   BIC  Chisq Chisq diff Df diff Pr(>Chisq)
## mg_math_lavaan_fitM3 59 15952 16005 241.18                              
## mg_math_lavaan_fitM2 62 15948 15987 243.91     2.7283       3     0.4354

M4: Means and Covariances and Residual Variances model

To keep our homogeneity of residual variances within group we need to change a specification in the base model and the way we name the residual variance theta parmaters.

Specifying linear growth model with group-specific thetas.

#writing out linear growth model in full SEM way 
mg_math_lavaan_model4 <- '
  # latent variable definitions
      #intercept (note intercept is a reserved term)
      eta_1 =~ 1*math2
      eta_1 =~ 1*math3
      eta_1 =~ 1*math4
      eta_1 =~ 1*math5
      eta_1 =~ 1*math6
      eta_1 =~ 1*math7
      eta_1 =~ 1*math8

      #linear slope 
      eta_2 =~ 0*math2
      eta_2 =~ 1*math3
      eta_2 =~ 2*math4
      eta_2 =~ 3*math5
      eta_2 =~ 4*math6
      eta_2 =~ 5*math7
      eta_2 =~ 6*math8

  # factor variances
      eta_1 ~~ start(60)*eta_1
      eta_2 ~~ start(.75)*eta_2

  # covariances among factors 
      eta_1 ~~ eta_2

  # factor means 
      eta_1 ~ start(35)*1
      eta_2 ~ start(4)*1

  # manifest variances (made equivalent by naming theta)
      math2 ~~ c(theta1,theta2)*math2
      math3 ~~ c(theta1,theta2)*math3
      math4 ~~ c(theta1,theta2)*math4
      math5 ~~ c(theta1,theta2)*math5
      math6 ~~ c(theta1,theta2)*math6
      math7 ~~ c(theta1,theta2)*math7
      math8 ~~ c(theta1,theta2)*math8
  # manifest means (fixed at zero)
      math2 ~ 0*1
      math3 ~ 0*1
      math4 ~ 0*1
      math5 ~ 0*1
      math6 ~ 0*1
      math7 ~ 0*1
      math8 ~ 0*1
' #end of model definition

We can free the rest of the contraints in the estimation of the model …

#estimating the model using sem() function
mg_math_lavaan_fitM4 <- sem(mg_math_lavaan_model4, 
                          data = nlsy_math_wide,
                          meanstructure = TRUE,
                          estimator = "ML",
                          missing = "fiml",
                          group = "lb_wght", #to separate groups
                          group.equal = c("loadings")) #for constraints
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: some cases are empty and will be ignored:
##   741
## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage

## Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan
## WARNING: due to missing values, some pairwise combinations have less than 10%
## coverage
## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations

## Warning in lav_mvnorm_missing_h1_estimate_moments(Y = X[[g]], wt = WT[[g]], : lavaan WARNING:
##     Maximum number of iterations reached when computing the sample
##     moments using EM; use the em.h1.iter.max= argument to increase the
##     number of iterations
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
                                          #"means", commented out so can differ
                                          #"lv.variances",
                                          #"lv.covariances",
                                          #"residuals"))

We can look more specifically at the numerical summaries of the model.

#obtaining summary of the model using the object we just created
summary(mg_math_lavaan_fitM4, fit.measures=TRUE)
## lavaan 0.6-9 ended normally after 62 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
##   Number of equality constraints                    12
##                                                       
##   Number of observations per group:               Used       Total
##     0                                              857         858
##     1                                               75          75
##   Number of missing patterns per group:                           
##     0                                               60            
##     1                                               25            
##                                                                   
## Model Test User Model:
##                                                       
##   Test statistic                               237.836
##   Degrees of freedom                                58
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     0                                          190.833
##     1                                           47.004
## 
## Model Test Baseline Model:
## 
##   Test statistic                               887.887
##   Degrees of freedom                                42
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.787
##   Tucker-Lewis Index (TLI)                       0.846
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -7963.056
##   Loglikelihood unrestricted model (H1)      -7844.138
##                                                       
##   Akaike (AIC)                               15950.111
##   Bayesian (BIC)                             16008.159
##   Sample-size adjusted Bayesian (BIC)        15970.048
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.082
##   90 Percent confidence interval - lower         0.071
##   90 Percent confidence interval - upper         0.092
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.124
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2            -0.063    1.161   -0.054    0.957
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            35.481    0.365   97.257    0.000
##     eta_2             4.297    0.091   47.145    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            62.287    5.728   10.873    0.000
##     eta_2             0.774    0.334    2.314    0.021
##    .math2   (tht1)   35.172    1.904   18.473    0.000
##    .math3   (tht1)   35.172    1.904   18.473    0.000
##    .math4   (tht1)   35.172    1.904   18.473    0.000
##    .math5   (tht1)   35.172    1.904   18.473    0.000
##    .math6   (tht1)   35.172    1.904   18.473    0.000
##    .math7   (tht1)   35.172    1.904   18.473    0.000
##    .math8   (tht1)   35.172    1.904   18.473    0.000
## 
## 
## Group 2 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 =~                                            
##     math2             1.000                           
##     math3             1.000                           
##     math4             1.000                           
##     math5             1.000                           
##     math6             1.000                           
##     math7             1.000                           
##     math8             1.000                           
##   eta_2 =~                                            
##     math2             0.000                           
##     math3             1.000                           
##     math4             2.000                           
##     math5             3.000                           
##     math6             4.000                           
##     math7             5.000                           
##     math8             6.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   eta_1 ~~                                            
##     eta_2             0.745    5.522    0.135    0.893
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            32.800    1.407   23.314    0.000
##     eta_2             4.873    0.341   14.298    0.000
##    .math2             0.000                           
##    .math3             0.000                           
##    .math4             0.000                           
##    .math5             0.000                           
##    .math6             0.000                           
##    .math7             0.000                           
##    .math8             0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     eta_1            79.627   25.629    3.107    0.002
##     eta_2            -0.157    1.477   -0.106    0.915
##    .math2   (tht2)   48.686    8.444    5.766    0.000
##    .math3   (tht2)   48.686    8.444    5.766    0.000
##    .math4   (tht2)   48.686    8.444    5.766    0.000
##    .math5   (tht2)   48.686    8.444    5.766    0.000
##    .math6   (tht2)   48.686    8.444    5.766    0.000
##    .math7   (tht2)   48.686    8.444    5.766    0.000
##    .math8   (tht2)   48.686    8.444    5.766    0.000

We get information about how the model fits the data, and the model parameters.

Comparison of the fit of the M3 and M4 model suggests that the residual variances do not differ.

anova(mg_math_lavaan_fitM3,mg_math_lavaan_fitM4)
## Chi-Squared Difference Test
## 
##                      Df   AIC   BIC  Chisq Chisq diff Df diff Pr(>Chisq)  
## mg_math_lavaan_fitM4 58 15950 16008 237.84                                
## mg_math_lavaan_fitM3 59 15952 16005 241.18     3.3457       1    0.06738 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion

This tutorial has presented how the multiple group growth model can be used to test a vareity of possible group differences, including means, variances and covariances, and residual variances.

Now that is pushing for more research questions! So much to do!