Chapter 17 - Multivariate Latent Change Score Models
Overview
This tutorial walks through the fitting of a bivariate latent change
score model in the structural equation modeling framework in R using
the lavaan package.
The example follows Chapter 17 of Grimm, Ram, and Estabrook (2017). Please refer to the chapter for further interpretations and insights about the analyses.
Preliminaries
Loading Libraries Used in This Script.
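The original library chunk is not shown here; as a sketch, the packages below cover the functions used in this tutorial (tidyverse for data wrangling and plotting, lavaan for model fitting, psych for descriptives, and semPlot for the optional path diagram).

```r
#load packages used in this script
library(tidyverse)  #dplyr, tidyr, ggplot2: wrangling, reshaping, plotting
library(lavaan)     #structural equation modeling
library(psych)      #descriptive statistics via describe()
library(semPlot)    #optional: path diagrams via semPaths()
```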
Reading in Repeated Measures Data
We use data from the NLSY-CYA (Center for Human Resource Research, 2009) that includes repeated measures of children’s math ability (math) and reading comprehension ability (rec) from the second through eighth grade.
#set filepath
filepath <- "https://raw.githubusercontent.com/The-Change-Lab/collaborations/refs/heads/main/GrowthModeling/nlsy_math_hyp_wide_R.dat"
#read in the text data file using the url() function
nlsy_data <- read.table(file=url(filepath), na.strings = ".")
#adding names for the columns of the data set
names(nlsy_data) <- c('id', 'female', 'lb_wght', 'anti_k1',
'math2', 'math3', 'math4', 'math5',
'math6', 'math7', 'math8',
'comp2', 'comp3', 'comp4', 'comp5',
'comp6', 'comp7', 'comp8',
'rec2', 'rec3', 'rec4', 'rec5',
'rec6', 'rec7', 'rec8',
'bpi2', 'bpi3', 'bpi4', 'bpi5',
'bpi6', 'bpi7', 'bpi8',
'asl2', 'asl3', 'asl4', 'asl5',
'asl6', 'asl7', 'asl8',
'ax2', 'ax3', 'ax4', 'ax5',
'ax6', 'ax7', 'ax8',
'hds2', 'hds3', 'hds4', 'hds5',
'hds6', 'hds7', 'hds8',
'hyp2', 'hyp3', 'hyp4', 'hyp5',
'hyp6', 'hyp7', 'hyp8',
'dpn2', 'dpn3', 'dpn4', 'dpn5',
'dpn6', 'dpn7', 'dpn8',
'wdn2', 'wdn3', 'wdn4', 'wdn5',
'wdn6', 'wdn7', 'wdn8',
'age2', 'age3', 'age4', 'age5',
'age6', 'age7', 'age8',
'men2', 'men3', 'men4', 'men5',
'men6', 'men7', 'men8',
'spring2', 'spring3', 'spring4', 'spring5',
'spring6', 'spring7', 'spring8',
'anti2', 'anti3', 'anti4', 'anti5',
'anti6', 'anti7', 'anti8')
#reduce data down to the id variable and the math and reading variables of interest
nlsy_data <- nlsy_data %>%
select(id, math2, math3, math4, math5, math6, math7, math8,
rec2, rec3, rec4, rec5, rec6, rec7, rec8)
psych::describe(nlsy_data)## vars n mean sd median trimmed mad min max
## id 1 933 532334.90 328020.79 506602.0 520130.77 391999.44 201 1256601
## math2 2 335 32.61 10.29 32.0 32.28 10.38 12 60
## math3 3 431 39.88 10.30 41.0 39.88 10.38 13 67
## math4 4 378 46.17 10.17 46.0 46.22 8.90 18 70
## math5 5 372 49.77 9.47 48.0 49.77 8.90 23 71
## math6 6 390 52.72 9.92 50.5 52.38 9.64 24 78
## math7 7 173 55.35 10.63 53.0 55.09 11.86 31 81
## math8 8 142 57.83 11.53 56.0 57.43 12.60 26 81
## rec2 9 333 34.68 10.36 34.0 33.90 10.38 15 79
## rec3 10 431 41.29 11.46 40.0 40.80 11.86 19 81
## rec4 11 376 47.56 12.33 47.0 47.17 11.86 21 83
## rec5 12 370 52.91 13.03 52.0 52.86 13.34 21 84
## rec6 13 389 55.99 12.62 56.0 55.93 13.34 21 82
## rec7 14 173 60.56 13.61 62.0 61.16 14.83 23 84
## rec8 15 142 64.37 12.15 66.0 65.11 13.34 32 84
## range skew kurtosis se
## id 1256400 0.28 -0.91 10738.92
## math2 48 0.27 -0.46 0.56
## math3 54 -0.05 -0.33 0.50
## math4 52 -0.06 -0.08 0.52
## math5 48 0.04 -0.34 0.49
## math6 54 0.25 -0.38 0.50
## math7 50 0.21 -0.97 0.81
## math8 55 0.16 -0.52 0.97
## rec2 64 0.81 1.06 0.57
## rec3 62 0.43 0.05 0.55
## rec4 62 0.32 -0.07 0.64
## rec5 63 0.04 -0.48 0.68
## rec6 61 -0.03 -0.37 0.64
## rec7 61 -0.39 -0.53 1.03
## rec8 52 -0.50 -0.56 1.02
Plotting the Repeated Measures Data
#reshaping wide to long (using tidyverse)
data_long <- nlsy_data %>%
pivot_longer(.,
cols = c(math2, math3, math4, math5, math6, math7, math8,
rec2, rec3, rec4, rec5, rec6, rec7, rec8),
cols_vary = "fastest", #to keep same-id rows close together
names_to = c(".value", "grade"),
names_pattern = "(math|rec)(\\d+)",
names_transform = list(grade = as.integer))
#looking at the long data
head(data_long, 8)
## # A tibble: 8 × 4
## id grade math rec
## <int> <int> <int> <int>
## 1 201 2 NA NA
## 2 201 3 38 35
## 3 201 4 NA NA
## 4 201 5 55 52
## 5 201 6 NA NA
## 6 201 7 NA NA
## 7 201 8 NA NA
## 8 303 2 26 26
#Plotting intraindividual change MATH
data_long %>%
ggplot(aes(x = grade, y = math, group = id)) +
geom_point(color="blue", alpha=.7) +
geom_line(color="blue", alpha=.7) +
xlab("Grade") +
ylab("PIAT Mathematics") +
scale_x_continuous(limits=c(2, 8), breaks=seq(2, 8, by=1)) +
scale_y_continuous(limits=c(0, 90), breaks=seq(0, 90, by=10))
## Warning: Removed 4310 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2787 rows containing missing values or values outside the scale range
## (`geom_line()`).
#Plotting intraindividual change READING
data_long %>%
ggplot(aes(x = grade, y = rec, group = id)) +
geom_point(color="red", alpha=.7) +
geom_line(color="red", alpha=.7) +
xlab("Grade") +
ylab("PIAT Reading Recognition") +
scale_x_continuous(limits=c(2, 8), breaks=seq(2, 8, by=1)) +
scale_y_continuous(limits=c(0, 90), breaks=seq(0, 90, by=10))
## Warning: Removed 4317 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2804 rows containing missing values or values outside the scale range
## (`geom_line()`).
#plotting intraindividual change in the bivariate space
#arrows indicate the direction of movement through time
data_long %>%
ggplot(aes(x = rec, y = math, group = id)) +
geom_point(alpha=.1) +
geom_line(alpha=.3, arrow = arrow(length = unit(0.1, "cm"))) +
xlab("Reading Recognition") +
ylab("Mathematics") +
scale_x_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10)) +
scale_y_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10))
## Warning: Removed 4317 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 4317 rows containing missing values or values outside the scale range
## (`geom_line()`).
This last plot shows individual trajectories in the bivariate space. Our general research goal is to describe how movement in the horizontal direction is coupled with movement in the vertical direction.
Full Coupling Bivariate Dual Change Score Model
Model Specification
We first specify the full coupling model. The model invokes the latent true scores, builds the latent difference scores, and incorporates the constant change factors and the proportional change effects. Comments in the model specification indicate which set of latent variables and paths each block of lines invokes.
The model diagram follows this kind of setup …
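In equation form, the specification implies that each latent change score is the sum of a constant change factor, a proportional effect of the same construct’s previous true score, and (in the full coupling model) a coupling effect of the other construct’s previous true score:

\[\Delta_{math,t} = g_2 + \pi_m(math_{t-1}) + \delta_m(rec_{t-1})\]
\[\Delta_{rec,t} = j_2 + \pi_r(rec_{t-1}) + \delta_r(math_{t-1})\]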
bdcm_lavaan <- ' #opening quote
#MATHEMATICS
#latent true scores (loadings = 1)
lm1 =~ 1*math2
lm2 =~ 1*math3
lm3 =~ 1*math4
lm4 =~ 1*math5
lm5 =~ 1*math6
lm6 =~ 1*math7
lm7 =~ 1*math8
#latent true score means (initial free, others = 0)
lm1 ~ 1
lm2 ~ 0*1
lm3 ~ 0*1
lm4 ~ 0*1
lm5 ~ 0*1
lm6 ~ 0*1
lm7 ~ 0*1
#latent true score variances (initial free, others = 0)
lm1 ~~ start(15)*lm1
lm2 ~~ 0*lm2
lm3 ~~ 0*lm3
lm4 ~~ 0*lm4
lm5 ~~ 0*lm5
lm6 ~~ 0*lm6
lm7 ~~ 0*lm7
#observed intercepts (fixed to 0)
math2 ~ 0*1
math3 ~ 0*1
math4 ~ 0*1
math5 ~ 0*1
math6 ~ 0*1
math7 ~ 0*1
math8 ~ 0*1
#observed residual variances (constrained to equality)
math2 ~~ sigma2_u*math2
math3 ~~ sigma2_u*math3
math4 ~~ sigma2_u*math4
math5 ~~ sigma2_u*math5
math6 ~~ sigma2_u*math6
math7 ~~ sigma2_u*math7
math8 ~~ sigma2_u*math8
#autoregressions (fixed = 1)
lm2 ~ 1*lm1
lm3 ~ 1*lm2
lm4 ~ 1*lm3
lm5 ~ 1*lm4
lm6 ~ 1*lm5
lm7 ~ 1*lm6
#latent change scores (fixed = 1)
dm2 =~ 1*lm2
dm3 =~ 1*lm3
dm4 =~ 1*lm4
dm5 =~ 1*lm5
dm6 =~ 1*lm6
dm7 =~ 1*lm7
#latent change score means (constrained to 0)
dm2 ~ 0*1
dm3 ~ 0*1
dm4 ~ 0*1
dm5 ~ 0*1
dm6 ~ 0*1
dm7 ~ 0*1
#latent change score variances (constrained to 0)
dm2 ~~ 0*dm2
dm3 ~~ 0*dm3
dm4 ~~ 0*dm4
dm5 ~~ 0*dm5
dm6 ~~ 0*dm6
dm7 ~~ 0*dm7
#constant change factor (loadings = 1)
g2 =~ 1*dm2 +
1*dm3 +
1*dm4 +
1*dm5 +
1*dm6 +
1*dm7
#constant change factor mean
g2 ~ start(15)*1
#constant change factor variance
g2 ~~ g2
#constant change factor covariance with the initial true score
g2 ~~ lm1
#proportional effects (constrained equal)
dm2 ~ start(-.2)*pi_m * lm1
dm3 ~ start(-.2)*pi_m * lm2
dm4 ~ start(-.2)*pi_m * lm3
dm5 ~ start(-.2)*pi_m * lm4
dm6 ~ start(-.2)*pi_m * lm5
dm7 ~ start(-.2)*pi_m * lm6
#READING RECOGNITION
#latent true scores (loadings = 1)
lr1 =~ 1*rec2
lr2 =~ 1*rec3
lr3 =~ 1*rec4
lr4 =~ 1*rec5
lr5 =~ 1*rec6
lr6 =~ 1*rec7
lr7 =~ 1*rec8
#latent true score means (initial free, others = 0)
lr1 ~ 1
lr2 ~ 0*1
lr3 ~ 0*1
lr4 ~ 0*1
lr5 ~ 0*1
lr6 ~ 0*1
lr7 ~ 0*1
#latent true score variances (initial free, others = 0)
lr1 ~~ start(15)*lr1
lr2 ~~ 0*lr2
lr3 ~~ 0*lr3
lr4 ~~ 0*lr4
lr5 ~~ 0*lr5
lr6 ~~ 0*lr6
lr7 ~~ 0*lr7
#observed intercepts (fixed to 0)
rec2 ~ 0*1
rec3 ~ 0*1
rec4 ~ 0*1
rec5 ~ 0*1
rec6 ~ 0*1
rec7 ~ 0*1
rec8 ~ 0*1
#observed residual variances (constrained to equality)
rec2 ~~ sigma2_s*rec2
rec3 ~~ sigma2_s*rec3
rec4 ~~ sigma2_s*rec4
rec5 ~~ sigma2_s*rec5
rec6 ~~ sigma2_s*rec6
rec7 ~~ sigma2_s*rec7
rec8 ~~ sigma2_s*rec8
#autoregressions (fixed = 1)
lr2 ~ 1*lr1
lr3 ~ 1*lr2
lr4 ~ 1*lr3
lr5 ~ 1*lr4
lr6 ~ 1*lr5
lr7 ~ 1*lr6
#latent change scores (fixed = 1)
dr2 =~ 1*lr2
dr3 =~ 1*lr3
dr4 =~ 1*lr4
dr5 =~ 1*lr5
dr6 =~ 1*lr6
dr7 =~ 1*lr7
#latent change score means (fixed = 0)
dr2 ~ 0*1
dr3 ~ 0*1
dr4 ~ 0*1
dr5 ~ 0*1
dr6 ~ 0*1
dr7 ~ 0*1
#latent change score variances (fixed = 0)
dr2 ~~ 0*dr2
dr3 ~~ 0*dr3
dr4 ~~ 0*dr4
dr5 ~~ 0*dr5
dr6 ~~ 0*dr6
dr7 ~~ 0*dr7
#constant change factor (fixed = 1)
j2 =~ 1*dr2 +
1*dr3 +
1*dr4 +
1*dr5 +
1*dr6 +
1*dr7
#constant change factor mean
j2 ~ start(10)*1
#constant change factor variance
j2 ~~ j2
#constant change factor covariance with the initial true score
j2 ~~ lr1
#proportional effects (constrained to equality)
dr2 ~ start(-.2)*pi_r * lr1
dr3 ~ start(-.2)*pi_r * lr2
dr4 ~ start(-.2)*pi_r * lr3
dr5 ~ start(-.2)*pi_r * lr4
dr6 ~ start(-.2)*pi_r * lr5
dr7 ~ start(-.2)*pi_r * lr6
#BIVARIATE INFORMATION
#covariances between the latent growth factors
lm1 ~~ lr1
lm1 ~~ j2
lr1 ~~ g2
j2 ~~ g2
#residual covariances
math2 ~~ sigma_su*rec2
math3 ~~ sigma_su*rec3
math4 ~~ sigma_su*rec4
math5 ~~ sigma_su*rec5
math6 ~~ sigma_su*rec6
math7 ~~ sigma_su*rec7
#COUPLING PARAMETERS
#math to changes in reading
dr2 ~ delta_r*lm1
dr3 ~ delta_r*lm2
dr4 ~ delta_r*lm3
dr5 ~ delta_r*lm4
dr6 ~ delta_r*lm5
dr7 ~ delta_r*lm6
#reading to changes in math
dm2 ~ delta_m*lr1
dm3 ~ delta_m*lr2
dm4 ~ delta_m*lr3
dm5 ~ delta_m*lr4
dm6 ~ delta_m*lr5
dm7 ~ delta_m*lr6
' #closing quote
Model Estimation and Interpretation
We fit the model using lavaan.
#Model fitting
dcs_fullcoupling <- lavaan(bdcm_lavaan,
data = nlsy_data, #note that fitting uses wide data
meanstructure = TRUE,
estimator = "ML",
missing = "fiml",
fixed.x = FALSE,
mimic="mplus",
control=list(iter.max=500),
verbose=FALSE)
## Warning: lavaan->lav_data_full():
## some cases are empty and will be ignored: 741.
## Warning: lavaan->lav_data_full():
## due to missing values, some pairwise combinations have less than 10%
## coverage; use lavInspect(fit, "coverage") to investigate.
## Warning: lavaan->lav_mvnorm_missing_h1_estimate_moments():
## Maximum number of iterations reached when computing the sample moments
## using EM; use the em.h1.iter.max= argument to increase the number of
## iterations
#Model summary
summary(dcs_fullcoupling, fit.measures = TRUE)
## lavaan 0.6-19 ended normally after 253 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 58
## Number of equality constraints 37
##
## Used Total
## Number of observations 932 933
## Number of missing patterns 66
##
## Model Test User Model:
##
## Test statistic 166.098
## Degrees of freedom 98
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 2710.581
## Degrees of freedom 91
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.974
## Tucker-Lewis Index (TLI) 0.976
##
## Robust Comparative Fit Index (CFI) 0.876
## Robust Tucker-Lewis Index (TLI) 0.884
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -15680.484
## Loglikelihood unrestricted model (H1) -15597.435
##
## Akaike (AIC) 31402.968
## Bayesian (BIC) 31504.552
## Sample-size adjusted Bayesian (SABIC) 31437.857
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.027
## 90 Percent confidence interval - lower 0.020
## 90 Percent confidence interval - upper 0.034
## P-value H_0: RMSEA <= 0.050 1.000
## P-value H_0: RMSEA >= 0.080 0.000
##
## Robust RMSEA 0.135
## 90 Percent confidence interval - lower 0.104
## 90 Percent confidence interval - upper 0.166
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 0.997
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.111
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## lm1 =~
## math2 1.000
## lm2 =~
## math3 1.000
## lm3 =~
## math4 1.000
## lm4 =~
## math5 1.000
## lm5 =~
## math6 1.000
## lm6 =~
## math7 1.000
## lm7 =~
## math8 1.000
## dm2 =~
## lm2 1.000
## dm3 =~
## lm3 1.000
## dm4 =~
## lm4 1.000
## dm5 =~
## lm5 1.000
## dm6 =~
## lm6 1.000
## dm7 =~
## lm7 1.000
## g2 =~
## dm2 1.000
## dm3 1.000
## dm4 1.000
## dm5 1.000
## dm6 1.000
## dm7 1.000
## lr1 =~
## rec2 1.000
## lr2 =~
## rec3 1.000
## lr3 =~
## rec4 1.000
## lr4 =~
## rec5 1.000
## lr5 =~
## rec6 1.000
## lr6 =~
## rec7 1.000
## lr7 =~
## rec8 1.000
## dr2 =~
## lr2 1.000
## dr3 =~
## lr3 1.000
## dr4 =~
## lr4 1.000
## dr5 =~
## lr5 1.000
## dr6 =~
## lr6 1.000
## dr7 =~
## lr7 1.000
## j2 =~
## dr2 1.000
## dr3 1.000
## dr4 1.000
## dr5 1.000
## dr6 1.000
## dr7 1.000
##
## Regressions:
## Estimate Std.Err z-value P(>|z|)
## lm2 ~
## lm1 1.000
## lm3 ~
## lm2 1.000
## lm4 ~
## lm3 1.000
## lm5 ~
## lm4 1.000
## lm6 ~
## lm5 1.000
## lm7 ~
## lm6 1.000
## dm2 ~
## lm1 (pi_m) -0.293 0.115 -2.560 0.010
## dm3 ~
## lm2 (pi_m) -0.293 0.115 -2.560 0.010
## dm4 ~
## lm3 (pi_m) -0.293 0.115 -2.560 0.010
## dm5 ~
## lm4 (pi_m) -0.293 0.115 -2.560 0.010
## dm6 ~
## lm5 (pi_m) -0.293 0.115 -2.560 0.010
## dm7 ~
## lm6 (pi_m) -0.293 0.115 -2.560 0.010
## lr2 ~
## lr1 1.000
## lr3 ~
## lr2 1.000
## lr4 ~
## lr3 1.000
## lr5 ~
## lr4 1.000
## lr6 ~
## lr5 1.000
## lr7 ~
## lr6 1.000
## dr2 ~
## lr1 (pi_r) -0.495 0.115 -4.308 0.000
## dr3 ~
## lr2 (pi_r) -0.495 0.115 -4.308 0.000
## dr4 ~
## lr3 (pi_r) -0.495 0.115 -4.308 0.000
## dr5 ~
## lr4 (pi_r) -0.495 0.115 -4.308 0.000
## dr6 ~
## lr5 (pi_r) -0.495 0.115 -4.308 0.000
## dr7 ~
## lr6 (pi_r) -0.495 0.115 -4.308 0.000
## dr2 ~
## lm1 (dlt_r) 0.391 0.129 3.029 0.002
## dr3 ~
## lm2 (dlt_r) 0.391 0.129 3.029 0.002
## dr4 ~
## lm3 (dlt_r) 0.391 0.129 3.029 0.002
## dr5 ~
## lm4 (dlt_r) 0.391 0.129 3.029 0.002
## dr6 ~
## lm5 (dlt_r) 0.391 0.129 3.029 0.002
## dr7 ~
## lm6 (dlt_r) 0.391 0.129 3.029 0.002
## dm2 ~
## lr1 (dlt_m) 0.053 0.098 0.540 0.589
## dm3 ~
## lr2 (dlt_m) 0.053 0.098 0.540 0.589
## dm4 ~
## lr3 (dlt_m) 0.053 0.098 0.540 0.589
## dm5 ~
## lr4 (dlt_m) 0.053 0.098 0.540 0.589
## dm6 ~
## lr5 (dlt_m) 0.053 0.098 0.540 0.589
## dm7 ~
## lr6 (dlt_m) 0.053 0.098 0.540 0.589
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## lm1 ~~
## g2 14.165 2.164 6.545 0.000
## lr1 ~~
## j2 26.323 3.483 7.557 0.000
## lm1 ~~
## lr1 56.493 5.487 10.297 0.000
## j2 6.163 3.103 1.986 0.047
## g2 ~~
## lr1 9.791 3.083 3.176 0.001
## j2 0.681 2.552 0.267 0.790
## .math2 ~~
## .rec2 (sgm_) 7.008 1.259 5.566 0.000
## .math3 ~~
## .rec3 (sgm_) 7.008 1.259 5.566 0.000
## .math4 ~~
## .rec4 (sgm_) 7.008 1.259 5.566 0.000
## .math5 ~~
## .rec5 (sgm_) 7.008 1.259 5.566 0.000
## .math6 ~~
## .rec6 (sgm_) 7.008 1.259 5.566 0.000
## .math7 ~~
## .rec7 (sgm_) 7.008 1.259 5.566 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## lm1 32.543 0.446 72.980 0.000
## .lm2 0.000
## .lm3 0.000
## .lm4 0.000
## .lm5 0.000
## .lm6 0.000
## .lm7 0.000
## .math2 0.000
## .math3 0.000
## .math4 0.000
## .math5 0.000
## .math6 0.000
## .math7 0.000
## .math8 0.000
## .dm2 0.000
## .dm3 0.000
## .dm4 0.000
## .dm5 0.000
## .dm6 0.000
## .dm7 0.000
## g2 15.092 0.926 16.291 0.000
## lr1 34.386 0.433 79.425 0.000
## .lr2 0.000
## .lr3 0.000
## .lr4 0.000
## .lr5 0.000
## .lr6 0.000
## .lr7 0.000
## .rec2 0.000
## .rec3 0.000
## .rec4 0.000
## .rec5 0.000
## .rec6 0.000
## .rec7 0.000
## .rec8 0.000
## .dr2 0.000
## .dr3 0.000
## .dr4 0.000
## .dr5 0.000
## .dr6 0.000
## .dr7 0.000
## j2 10.886 0.934 11.650 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## lm1 72.886 6.603 11.038 0.000
## .lm2 0.000
## .lm3 0.000
## .lm4 0.000
## .lm5 0.000
## .lm6 0.000
## .lm7 0.000
## .math2 (sigm2_) 31.306 1.609 19.460 0.000
## .math3 (sigm2_) 31.306 1.609 19.460 0.000
## .math4 (sigm2_) 31.306 1.609 19.460 0.000
## .math5 (sigm2_) 31.306 1.609 19.460 0.000
## .math6 (sigm2_) 31.306 1.609 19.460 0.000
## .math7 (sigm2_) 31.306 1.609 19.460 0.000
## .math8 (sigm2_) 31.306 1.609 19.460 0.000
## .dm2 0.000
## .dm3 0.000
## .dm4 0.000
## .dm5 0.000
## .dm6 0.000
## .dm7 0.000
## g2 5.743 1.697 3.384 0.001
## lr1 71.978 7.273 9.896 0.000
## .lr2 0.000
## .lr3 0.000
## .lr4 0.000
## .lr5 0.000
## .lr6 0.000
## .lr7 0.000
## .rec2 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec3 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec4 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec5 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec6 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec7 (sgm2_s) 33.217 1.763 18.844 0.000
## .rec8 (sgm2_s) 33.217 1.763 18.844 0.000
## .dr2 0.000
## .dr3 0.000
## .dr4 0.000
## .dr5 0.000
## .dr6 0.000
## .dr7 0.000
## j2 18.384 6.806 2.701 0.007
#Model Diagram
# semPaths(dcs_fullcoupling, what="est",
# sizeLat = 7, sizeMan = 7, edge.label.cex = .75)
#Note that semPaths does not quite know how to draw this kind of model.
#So, better to map to the diagram from the book.
The change equations based on the output could be written as:
Change in Math:
\[\Delta_{math} = 15.09 - 0.293(math_{t-1}) + 0.053(rec_{t-1})\]
Change in Reading Recognition:
\[\Delta_{rec} = 10.89 - 0.495(rec_{t-1}) + 0.391(math_{t-1})\]
However, we need to be careful in interpretation, as not all of the parameters are significantly different from zero.
The coupling parameters (math to changes in reading, reading to changes in math) can each be constrained to zero to test the relative fit of models with unidirectional coupling or no coupling, allowing us to reject some models and retain others.
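As a sketch (this string-substitution shortcut is ours, not from the chapter), one convenient way to build the reduced models is to replace a coupling label with 0 in the lavaan syntax, which fixes those paths to zero. A two-line stand-in is used below in place of the full bdcm_lavaan string.

```r
#sketch: build reduced model syntaxes by fixing coupling labels to zero
#(a two-line stand-in is used here in place of the full bdcm_lavaan string)
full_syntax <- 'dm2 ~ delta_m*lr1
dr2 ~ delta_r*lm1'

#no reading-to-math coupling
no_m_coupling <- gsub("delta_m", "0", full_syntax, fixed = TRUE)
#no math-to-reading coupling
no_r_coupling <- gsub("delta_r", "0", full_syntax, fixed = TRUE)
#no coupling in either direction
no_coupling <- gsub("delta_m|delta_r", "0", full_syntax)
```

Each reduced syntax would then be fit with the same lavaan() call as above and compared to the full coupling model with a likelihood ratio test, e.g., anova(reduced_fit, dcs_fullcoupling).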
Predicted scores
#obtaining predicted scores
nlsy_predicted <- cbind(nlsy_data$id,
as.data.frame(lavPredict(dcs_fullcoupling,
type = "yhat")))
names(nlsy_predicted)[1] <- "id"
#looking at data
head(nlsy_predicted)
## id math2 math3 math4 math5 math6 math7 math8 rec2
## 1 201 32.40521 40.43828 46.43794 51.00672 54.52540 57.25279 59.37446 32.34026
## 2 303 23.61177 30.13241 34.98586 38.67431 41.51168 43.70955 45.41867 24.71381
## 3 2702 47.51443 55.21694 60.98346 65.38087 68.77031 71.39871 73.44390 38.52543
## 4 4303 35.87597 43.97910 49.99209 54.55345 58.05867 60.77225 62.88171 31.80798
## 5 5002 33.25369 41.78506 48.24162 53.19648 57.02935 60.00764 62.32767 38.48188
## 6 5005 37.28750 44.27939 49.71251 53.94488 57.24643 59.82385 61.83680 45.58593
## rec3 rec4 rec5 rec6 rec7 rec8
## 1 38.46715 44.70660 50.20687 54.77363 58.45777 61.38633
## 2 29.37223 34.27788 38.65573 42.31092 45.26792 47.62191
## 3 44.66042 50.77456 56.12026 60.54186 64.10208 66.92931
## 4 37.24952 43.17024 48.51473 52.99990 56.63757 59.53725
## 5 46.59422 54.03159 60.31579 65.42965 69.51313 72.74162
## 6 54.91746 62.36797 68.25815 72.89014 76.52224 79.36580
#reshaping wide to long (using tidyverse)
predicted_long <- nlsy_predicted %>%
pivot_longer(.,
cols = c(math2, math3, math4, math5, math6, math7, math8,
rec2, rec3, rec4, rec5, rec6, rec7, rec8),
cols_vary = "fastest", #to keep same-id rows close together
names_to = c(".value", "grade"),
names_pattern = "(math|rec)(\\d+)",
names_transform = list(grade = as.integer))
#looking at the long data
head(predicted_long, 14)
## # A tibble: 14 × 4
## id grade math rec
## <int> <int> <dbl> <dbl>
## 1 201 2 32.4 32.3
## 2 201 3 40.4 38.5
## 3 201 4 46.4 44.7
## 4 201 5 51.0 50.2
## 5 201 6 54.5 54.8
## 6 201 7 57.3 58.5
## 7 201 8 59.4 61.4
## 8 303 2 23.6 24.7
## 9 303 3 30.1 29.4
## 10 303 4 35.0 34.3
## 11 303 5 38.7 38.7
## 12 303 6 41.5 42.3
## 13 303 7 43.7 45.3
## 14 303 8 45.4 47.6
#Plotting intraindividual change MATH
predicted_long %>%
ggplot(aes(x = grade, y = math, group = id)) +
geom_line(color="blue", alpha=.4) +
xlab("Grade") +
ylab("Predicted PIAT Mathematics") +
scale_x_continuous(limits=c(2, 8), breaks=seq(2, 8, by=1)) +
scale_y_continuous(limits=c(0, 90), breaks=seq(0, 90, by=10))
## Warning: Removed 7 rows containing missing values or values outside the scale range
## (`geom_line()`).
#Plotting intraindividual change READING
predicted_long %>%
ggplot(aes(x = grade, y = rec, group = id)) +
geom_line(color="red", alpha=.4) +
xlab("Grade") +
ylab("Predicted PIAT Reading Recognition") +
scale_x_continuous(limits=c(2, 8), breaks=seq(2, 8, by=1)) +
scale_y_continuous(limits=c(0, 90), breaks=seq(0, 90, by=10))
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_line()`).
#plotting predicted intraindividual change in the bivariate space
#arrows indicate the direction of movement through time
predicted_long %>%
ggplot(aes(x = rec, y = math, group = id)) +
geom_line(alpha=.3, arrow = arrow(length = unit(0.1, "cm"))) +
xlab("Reading Recognition") +
ylab("Mathematics") +
scale_x_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10)) +
scale_y_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10))
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_line()`).
These figures (similar to Figure 17.2 in the book) show the nonlinear exponential-type shape that is captured by the dual change score model.
Vector field plot of the Bivariate Dynamics
It is informative to examine how the coupling manifests in the
bivariate space. To portray that, we create a vector field plot that
indicates the expected movement in the bivariate space from any given
starting location. Here we build the plot directly from the model
estimates; the RAMpath library also has special functions for fitting
the bivariate coupling model and producing this kind of plot.
#creating a grid of starting values
df <- expand.grid(math=seq(10, 90, 5), rec=seq(10, 90, 5))
#calculating change scores for each starting value
#changes in math based on output from model
df$dm <- with(df, 15.09 - 0.293*math + 0.053*rec)
#changes in reading based on output from model
df$dr <- with(df, 10.89 - 0.495*rec + 0.391*math)
#Plotting vector field with .25 unit time change
df %>%
ggplot(aes(x = rec, y = math)) +
geom_point(data=data_long, aes(x=rec, y=math), alpha=.1) +
stat_ellipse(data=data_long, aes(x=rec, y=math)) +
geom_segment(aes(x = rec, y = math, xend = rec+.25*dr, yend = math+.25*dm),
arrow = arrow(length = unit(0.1, "cm"))) +
xlab("Reading Recognition") +
ylab("Mathematics") +
scale_x_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10)) +
scale_y_continuous(limits=c(10, 90), breaks=seq(10, 90, by=10))
## Warning: Removed 4317 rows containing non-finite outside the scale range
## (`stat_ellipse()`).
## Warning: Removed 4317 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_segment()`).
Now we’ve got a plot of the change dynamics!
The arrows in the vector field plot (similar to Figure 17.3 in the book) show the expected changes in the bivariate space for a .25 unit of time. Note that the data are located mostly within the ellipse (95%). Outside that area the directional vectors are interpolated based on what was observed and modeled in the actual data (i.e., within the ellipse). So, please be careful not to over-interpret what is happening outside the ellipse.
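As a supplementary calculation (ours, not from the chapter), we can solve the two change equations for the point where both expected changes are zero, i.e., the equilibrium implied by the rounded estimates above:

```r
#solve for the point where both expected changes are zero:
# 0 = 15.09 - 0.293*math + 0.053*rec
# 0 = 10.89 + 0.391*math - 0.495*rec
A <- matrix(c(-0.293,  0.053,
               0.391, -0.495), nrow = 2, byrow = TRUE)
b <- c(-15.09, -10.89)
equilibrium <- solve(A, b)  #first element is math, second is rec
round(equilibrium, 1)
```

With these estimates the implied equilibrium sits near math ≈ 65 and reading ≈ 73, in the region where the predicted trajectories level off; because it lies near the edge of the observed data ellipse, it should be interpreted cautiously.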
Conclusion
The bivariate latent change score model provides a framework for examining relations between two variables (objective #3 of Baltes and Nesselroade’s objectives of longitudinal research). Some care should be taken in interpretation, as we are still working from between-person covariance information, but we begin to have the possibility of talking about “causes and effects” (which variable is leading changes in the other) and of examining the dynamics of change. Lots of new possibilities!
As always, model responsibly when having fun!
Citations
Epskamp, S. (2022). semPlot: Path Diagrams and Visual Analysis of Various SEM Packages’ Output (Version 1.1.6). https://CRAN.R-project.org/package=semPlot
Grimm, K. J., Ram, N., & Estabrook, R. (2017). Growth Modeling: Structural Equation and Multilevel Modeling Approaches. Guilford Publications.
R Core Team. (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University. https://CRAN.R-project.org/package=psych
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48, 1–36. https://doi.org/10.18637/jss.v048.i02
Wei, T., & Simko, V. (2024). R package “corrplot”: Visualization of a Correlation Matrix (Version 0.95). https://github.com/taiyun/corrplot
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686