Two-Occasion Change

Overview

This script works through some basic representations of change: Auto-Regressive Models of Change, and Difference-Score Models of Change. These two types of models consider and answer different kinds of research questions: questions about “Change in Interindividual Differences” or questions about “Interindividual Differences and Intraindividual Change”.

Outline

Data Preparation
Auto-Regression Model
Difference Score Model
Conclusion

Loading Libraries Used In This Script

library(psych)    #data descriptives
library(ggplot2)  #data visualization
library(dplyr)    #data manipulation
library(tidyr)    #tidy data, reshaping

Data Preparation

We use two occasions of the multi-occasion WISC data for our examples.

Load the repeated measures data

#set filepath for data file
filepath <- "https://raw.githubusercontent.com/The-Change-Lab/collaborations/refs/heads/main/GrowthModeling/wisc3raw.csv"
#read in the .csv file using the url() function
wisc3raw <- read.csv(file=url(filepath), header=TRUE)

Subsetting to a dataset with just two occasions. We include the id, verb1 and verb6 variables.

wiscsub <- wisc3raw %>%
  select(id, verb1, verb6)

head(wiscsub)

##   id verb1 verb6
## 1  1 24.42 55.64
## 2  2 12.44 37.81
## 3  3 32.43 50.18
## 4  4 22.69 44.72
## 5  5 28.23 70.95
## 6  6 16.06 39.94

Some basic descriptives …

#descriptives
describe(wiscsub[,c("verb1","verb6")])

##       vars   n  mean    sd median trimmed   mad   min   max range skew kurtosis
## verb1    1 204 19.59  5.81  19.34   19.50  5.41  3.33 35.15 31.82 0.13    -0.05
## verb6    2 204 43.75 10.67  42.55   43.46 11.30 17.35 72.59 55.24 0.24    -0.36
##         se
## verb1 0.41
## verb6 0.75

#correlate
corr.test(wiscsub[,c("verb1","verb6")])

## Call:corr.test(x = wiscsub[, c("verb1", "verb6")])
## Correlation matrix 
##       verb1 verb6
## verb1  1.00  0.65
## verb6  0.65  1.00
## Sample Size 
## [1] 204
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##       verb1 verb6
## verb1     0     0
## verb6     0     0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option

And some bivariate plots of the two-occasion relations.

pairs.panels(wiscsub[,c("verb1","verb6")])

We can also plot intraindividual change, by putting time along the x-axis. This requires reshaping the data.

#reshaping wide to long
wiscsublong <- wiscsub %>%
  pivot_longer(!id,
               names_to = "grade", 
               names_prefix = "verb",
               values_to = "verb")

#making intraindividual change plot
wiscsublong %>%
  mutate(grade = as.numeric(grade)) %>%
  ggplot(aes(x = grade, y = verb, group = id)) +
  geom_point() + 
  geom_line() +
  xlab("Grade") + 
  ylab("WISC Verbal Score") + ylim(0, 100) +
  scale_x_continuous(breaks=seq(1, 6, by=1))

Notice here that each line indicates how an individual’s Grade 6 score differs from their Grade 1 score, that is, their Intraindividual Change.
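To see what the reshaping step produces, here is a minimal sketch on a three-row data frame. The values are copied from the first three rows of the head(wiscsub) output above; the object names wide_demo and long_demo are introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Three rows copied from the head(wiscsub) output (illustration only)
wide_demo <- data.frame(
  id    = 1:3,
  verb1 = c(24.42, 12.44, 32.43),
  verb6 = c(55.64, 37.81, 50.18)
)

# Same reshaping call as above: one row per person per occasion
long_demo <- wide_demo %>%
  pivot_longer(!id,
               names_to = "grade",
               names_prefix = "verb",
               values_to = "verb") %>%
  mutate(grade = as.numeric(grade))  # "1"/"6" become numeric 1/6

print(long_demo)
```

The long data have two rows per child, one for Grade 1 and one for Grade 6, which is what lets ggplot draw one line per id.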

Difference score calculation

Following from the plot, we can create the difference score from verb6 and verb1 by subtraction. We name the new variable verbD.

\[ \Delta verb_{i} = verbD_{i} = verb6_{i} - verb1_{i}\]

#calculating difference score
wiscsub <- wiscsub %>%
  mutate(verbD = verb6-verb1)

head(wiscsub)

##   id verb1 verb6 verbD
## 1  1 24.42 55.64 31.22
## 2  2 12.44 37.81 25.37
## 3  3 32.43 50.18 17.75
## 4  4 22.69 44.72 22.03
## 5  5 28.23 70.95 42.72
## 6  6 16.06 39.94 23.88

Look at the descriptives with the difference score.

describe(wiscsub[,c("verb1","verb6","verbD")])

##       vars   n  mean    sd median trimmed   mad   min   max range skew kurtosis
## verb1    1 204 19.59  5.81  19.34   19.50  5.41  3.33 35.15 31.82 0.13    -0.05
## verb6    2 204 43.75 10.67  42.55   43.46 11.30 17.35 72.59 55.24 0.24    -0.36
## verbD    3 204 24.16  8.15  23.91   23.85  8.09  4.62 50.88 46.26 0.38     0.14
##         se
## verb1 0.41
## verb6 0.75
## verbD 0.57

corr.test(wiscsub[,c("verb1","verb6","verbD")])

## Call:corr.test(x = wiscsub[, c("verb1", "verb6", "verbD")])
## Correlation matrix 
##       verb1 verb6 verbD
## verb1  1.00  0.65  0.14
## verb6  0.65  1.00  0.84
## verbD  0.14  0.84  1.00
## Sample Size 
## [1] 204
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##       verb1 verb6 verbD
## verb1  0.00     0  0.04
## verb6  0.00     0  0.00
## verbD  0.04     0  0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
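The descriptives for verbD follow from those of verb1 and verb6. As a rough check using the rounded values printed above (so small discrepancies are expected), the mean of the difference is the difference of the means, and the standard deviation of the difference depends on both standard deviations and the correlation \(r\) between the two occasions:

\[ \overline{verbD} = \overline{verb6} - \overline{verb1} = 43.75 - 19.59 = 24.16 \]

\[ SD_{verbD} = \sqrt{SD_{verb6}^{2} + SD_{verb1}^{2} - 2\,r\,SD_{verb1}\,SD_{verb6}} \approx \sqrt{113.85 + 33.76 - 2(0.65)(5.81)(10.67)} \approx 8.19 \]

This is close to the reported 8.15; the gap comes from using the correlation rounded to two decimals.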

Of particular interest in questions about intraindividual change is the relation between the “pre-test” score and the amount of intraindividual change. We can look at the bivariate association.

pairs.panels(wiscsub[,c("verb1","verbD")])

Models Models Models

There are two basic models of change …

Auto-Regression Model

The Auto-Regression (AR) model is useful for examining questions about “Change in Interindividual Differences”. The model is written as

\[ verb6_{i} = \beta_{0} + \beta_{1}verb1_{i} + e_{i}\]

We note that this is a model of relations among between-person differences. It is similar to, but not the same as, a single-subject time-series model (which is also called an auto-regression model, but is fit to a different kind of data).

Translating the between-person Auto-Regression Model into code and fitting to two-occasion data …

#AR Model
ARfit <- lm(formula= verb6 ~ 1 + verb1,
            data=wiscsub,
            na.action=na.exclude)
summary(ARfit)

## 
## Call:
## lm(formula = verb6 ~ 1 + verb1, data = wiscsub, na.action = na.exclude)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20.2459  -5.8651   0.1781   4.9048  27.9976 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 20.22485    1.99608   10.13   <2e-16 ***
## verb1        1.20117    0.09773   12.29   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.087 on 202 degrees of freedom
## Multiple R-squared:  0.4279, Adjusted R-squared:  0.425 
## F-statistic: 151.1 on 1 and 202 DF,  p-value: < 2.2e-16

The intercept term, \(\beta_{0}\) = 20.22, is the expected value of Verbal Ability at the 2nd occasion for an individual with a Verbal Ability score = 0 at the 1st occasion. The slope term, \(\beta_{1}\) = 1.20, indicates that for every 1 point difference in Verbal Ability at the 1st occasion, we expect a 1.20 point difference at the 2nd occasion.
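To make this interpretation concrete, here is a small sketch that plugs a few Grade 1 scores into the fitted equation. The coefficients are copied from the summary(ARfit) output above; the helper name predict_verb6 and the chosen input scores are assumptions made for illustration.

```r
# Coefficients copied from the summary(ARfit) output above
b0 <- 20.22485
b1 <- 1.20117

# Hypothetical helper: expected Grade 6 score for a given Grade 1 score
predict_verb6 <- function(verb1) b0 + b1 * verb1

preds <- predict_verb6(c(10, 20, 30))
preds        # 32.23655 44.24825 56.25995
diff(preds)  # each 10-point gap at Grade 1 maps to a 12.0117-point gap at Grade 6
```

With the fitted object available, predict(ARfit, newdata = data.frame(verb1 = c(10, 20, 30))) should give the same values up to rounding of the copied coefficients.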

We can plot the Auto-Regression model prediction with confidence intervals (CI). The function termplot() takes the fitted lm object. The CI bounds are plotted with the se option and the residuals with the partial.resid option.

termplot(ARfit,se=TRUE,partial.resid=TRUE,
         main="Auto-Regression Model",
         xlab="Verbal Score at Grade 1",
         ylab="Verbal Score at Grade 6")

Note that this code makes use of the lm() model object.

We can also do something similar with the raw data using ggplot.

#making interindividual regression plot
wiscsub %>%
  ggplot(aes(x = verb1, y = verb6)) +
  geom_point() + 
  geom_smooth(method="lm", formula= y ~ 1 + x, 
              se=TRUE, fullrange=TRUE, color="red", linewidth=2) +
  xlab("Verbal Score at Grade 1") + 
  ylab("Verbal Score at Grade 6") +
  ggtitle("Auto-Regression Model")

Note that this code embeds an lm() model within the ggplot function.

Difference Score Model

The Difference Score model is useful for examining questions about “Interindividual Differences and Intraindividual Change”. The model is written as

\[ verbD_{i} = \beta_{0} + \beta_{1}verb1_{i} + e_{i}\]

#Difference score model
DIFfit <- lm(formula = verbD ~ 1 + verb1,
             data=wiscsub,
             na.action=na.exclude)
summary(DIFfit)

## 
## Call:
## lm(formula = verbD ~ 1 + verb1, data = wiscsub, na.action = na.exclude)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20.2459  -5.8651   0.1781   4.9048  27.9976 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 20.22485    1.99608  10.132   <2e-16 ***
## verb1        0.20117    0.09773   2.058   0.0408 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.087 on 202 degrees of freedom
## Multiple R-squared:  0.02054,    Adjusted R-squared:  0.0157 
## F-statistic: 4.237 on 1 and 202 DF,  p-value: 0.04083

The intercept term, \(\beta_{0}\) = 20.22, is the expected value of the Difference score (Change in Verbal Ability) for an individual with a Verbal Ability score = 0 at the 1st occasion. The slope term, \(\beta_{1}\) = 0.20, indicates that for every 1 point difference in Verbal Ability at the 1st occasion, we expect a 0.20 point difference in the amount of intraindividual change.
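The two sets of estimates are tied together algebraically. A short derivation, subtracting \(verb1_{i}\) from both sides of the Auto-Regression model (the superscripts AR and D are labels introduced here only to tell the two models apart):

\[ verb6_{i} = \beta_{0}^{AR} + \beta_{1}^{AR}verb1_{i} + e_{i}\]

\[ verb6_{i} - verb1_{i} = \beta_{0}^{AR} + (\beta_{1}^{AR} - 1)verb1_{i} + e_{i}\]

\[ verbD_{i} = \beta_{0}^{D} + \beta_{1}^{D}verb1_{i} + e_{i}, \quad \beta_{0}^{D} = \beta_{0}^{AR}, \quad \beta_{1}^{D} = \beta_{1}^{AR} - 1\]

This matches the output: both intercepts are 20.22485, the slopes are 1.20117 and 0.20117, and the residual summaries are identical.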

The same methods as above can be used to plot the results of the Difference Score model.

termplot(DIFfit,se=TRUE,partial.resid=TRUE,
         main="Difference-score Model",
         xlab="Verbal Score at Grade 1",
         ylab="Difference in G1 and G6 Verbal Scores")

We can also do something similar using ggplot.

#making interindividual regression plot
wiscsub %>%
  ggplot(aes(x = verb1, y = verbD)) +
  geom_point() + 
  geom_smooth(method="lm", formula= y ~ 1 + x, 
              se=TRUE, fullrange=TRUE, color="red", linewidth=2) +
  xlab("Verbal Score at Grade 1") + 
  ylab("Difference Score") +
  ggtitle("Difference Score Model")

Note that each of these model results plots is a regression plot: outcome on the y-axis, predictor on the x-axis.

Conclusion

This session briefly touched on two different basic representations of change: Auto-Regressive Models of Change, which are used to examine “Change in Interindividual Differences”, and Difference-Score Models of Change, which are used to examine “Interindividual Differences and Intraindividual Change”. With only two occasions, the two models cannot be distinguished by goodness-of-fit tests: as the output above shows, they yield the same intercept, the same residuals, and slopes that differ by exactly 1. However, they can be distinguished when t > 2 repeated measures are available. The interpretations of change from the two models are fundamentally different, so choose carefully. Make sure that the model chosen matches the intended research question.
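A quick numerical check of the two-occasion equivalence, as a minimal sketch. It uses only the six rows printed by head(wiscsub), so the estimates will not match the full-sample results; the object names demo, ar_demo and dif_demo are introduced here for illustration.

```r
# First six rows of wiscsub, copied from the head() output (illustration only)
demo <- data.frame(
  verb1 = c(24.42, 12.44, 32.43, 22.69, 28.23, 16.06),
  verb6 = c(55.64, 37.81, 50.18, 44.72, 70.95, 39.94)
)
demo$verbD <- demo$verb6 - demo$verb1

# Fit both models to the same data
ar_demo  <- lm(verb6 ~ 1 + verb1, data = demo)
dif_demo <- lm(verbD ~ 1 + verb1, data = demo)

# Intercepts agree and slopes differ by 1 (up to floating-point error)
coef(ar_demo) - coef(dif_demo)

# Residuals are identical, so the residual standard error is too
all.equal(resid(ar_demo), resid(dif_demo))

# R-squared differs only because the outcome variance differs
c(AR = summary(ar_demo)$r.squared, DIF = summary(dif_demo)$r.squared)
```

Because the residuals coincide, any fit comparison based on them cannot favor one model over the other; the difference in R-squared reflects the different outcome variables, not a difference in fit.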

Citations

R Core Team. (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/

Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University. https://CRAN.R-project.org/package=psych

Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag. https://ggplot2.tidyverse.org/

Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). dplyr: A Grammar of Data Manipulation (Version 1.1.4). https://CRAN.R-project.org/package=dplyr

Wickham, H., Vaughan, D., & Girlich, M. (2024). tidyr: Tidy Messy Data (Version 1.3.1). https://CRAN.R-project.org/package=tidyr