Foundations of Response Surface Analysis for Congruence Hypothesis

Overview

Response surface analysis (RSA), also referred to as response surface methodology or modeling (RSM), is a statistical technique for exploring the relationship between several explanatory variables and one or more response variables through graphical representation and statistical testing.

This tutorial demonstrates fitting, visualizing, and interpreting a basic RSA model for continuous data in which researchers are interested in the joint effects of two predictors on one outcome variable, along with a comparison to the results obtained using a difference-score approach.

History

RSA was first introduced by Box and Wilson in 1951 to determine optimal conditions (the response, or the outcome variable) in chemical investigations and to understand the interplay between various factors (the predictors). One way to determine which combinations of factors produce the optimum outcome is to simply try out every possible combination. A second approach, introduced via RSA, is to select a relatively small number of combinations of the factors (e.g., temperature, pressure, proportions of the reactants) and measure the outcome (i.e., experimental attainment) to estimate a response surface. By doing so, one can identify a sub-region of the whole surface that approximates the optimum outcome using a limited number of factor combinations located within that sub-region (i.e., “sequential experimentation”).

For example, Trinh and Kang (2015) used response surface methods to optimize coagulation tests, estimating the following 3D surface that uses coagulation pH and Alum dose to predict turbidity removal. By fitting a quadratic model using RSA, they concluded that the maximum turbidity removal of 92.5% (i.e., the optimal outcome) was obtained with a 44 mg/L Alum dose at coagulation pH 7.6.

Figure 1

Three dimensional response surface for predicting turbidity removal using Coagulation pH and Alum dose (reproduced using the regression coefficients from Figure 6 in Trinh & Kang, 2015).

Similarly, social scientists are sometimes interested in how two variables work together to predict one outcome. One area of research that finds RSA particularly useful is the investigation of the congruence hypothesis: that congruence (i.e., fit, match, similarity, or agreement) between two predictor variables produces the optimum value of an outcome variable.

For instance, Humberg and colleagues (2018) examined whether well-being is maximized when self-perceived ability is congruent with objective ability (i.e., the congruence hypothesis). They constructed the following response surface predicting well-being from objective vocabulary and self-estimated vocabulary. By fitting a quadratic model using RSA, they did not find evidence for the congruence hypothesis. Instead, they found that well-being is predominantly predicted by self-perceived ability: higher self-perceived ability predicts higher well-being at all levels of objective ability.

Figure 2

Three dimensional response surface for predicting well-being using objective vocabulary and self-estimated vocabulary (reproduced using the regression coefficients from Figure 5 in Humberg et al., 2018).


Other example questions where RSA is useful include “the effects of the fit between a job and an individual on job performance” (person-environment fit, Caldwell & O’Reilly, 1990) and “the effect of the difference between the perceived harm of media on others and on oneself on the likelihood of endorsing media use restrictions” (third-person effect, Feng et al., 2017).

Why RSA?

Before RSA, researchers often used difference scores (or adaptations such as absolute values) and similarity measures (e.g., correlation coefficients) to provide statistical tests (e.g., the significance of the coefficient of the difference-score variable in a regression) of the congruence hypothesis. RSA was adapted from chemical investigations to overcome multiple problems with such approaches, including reduced reliability, ambiguous interpretation, untested constraints on the regression estimation, and oversimplification through dimension reduction (for reviews, see Kenny et al., 2006; Edwards, 2002). RSA untangles those problems by restoring the single component measure (e.g., a difference score or similarity measure) to its initial two predictors and retaining a three-dimensional surface (i.e., two predictors plus one outcome).


Basic Response Surface Analysis with the RSA package

The main advantage of extending the examination of congruence hypotheses to RSA is the construction of a surface whose features facilitate statistical significance tests of such hypotheses. The core of RSA is fitting a polynomial regression model to construct a response surface in a three-dimensional coordinate system:

\[ Z_{i} = b_{0} + b_{1}X_{i} + b_{2}Y_{i} + b_{3}X^{2}_{i} + b_{4}X_{i}Y_{i} + b_{5}Y^{2}_{i} + e_{i} \]

where the outcome variable \(Z\) for individual \(i\) is modeled as a function of the intercept \(b_{0}\), the linear (\(b_{1}\) and \(b_{2}\)) and quadratic (\(b_{3}\) and \(b_{5}\)) relationships with the two predictors \(X\) and \(Y\), the interaction between the two predictors (\(b_{4}\)), and the residual error \(e_{i}\).
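In R model-formula notation, the same five-term polynomial can be sketched generically as follows (Z, X, and Y are placeholder names here, not variables in our data):

```r
#full second-order polynomial: b1*X + b2*Y + b3*X^2 + b4*X*Y + b5*Y^2
#I() protects the quadratic terms; X:Y specifies the interaction term
poly_formula <- Z ~ X + Y + I(X^2) + X:Y + I(Y^2)
```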

In the current tutorial, we will use the perceived agency of the self (\(X\)) and the perceived agency of the partner (\(Y\)) in dyadic social interactions to predict happiness (\(Z\)) using intensive longitudinal data collected from a single individual. Here we demonstrate the five steps of a basic RSA model in R.

Load necessary libraries

If you are running macOS 10.9 or later, you will need to install XQuartz in order to use the RSA package. R will prompt you to do so the first time you install RSA; you can also download XQuartz from its website.

library(psych)            # data descriptives
library(RSA)              # Response Surface Analysis
library(plot3D)           # visualize response surfaces
library(plotly)           # visualize response surfaces
library(scatterplot3d)    # visualize 3D scatter plot
library(rgl)              # visualize 3D scatter plot
library(tidyverse)        # data structures & visualization

Step 1: Specify Congruence Hypothesis

Introduction to iSAHIB data

We use data from the Intraindividual Study of Affect, Health and Interpersonal Behavior (iSAHIB), an intensive longitudinal study that collected rich repeated measures of its namesake variables at multiple time-scales. Participants, 150 adults (age 18-89 years, 51% women) from the Pennsylvania State University and surrounding community, completed three 3-week “measurement bursts” during which they reported on their social interactions, thoughts, feelings, and behaviors.

To request the iSAHIB data, please have a look at the study materials and documentation, and send a note to Nilam Ram.

Variables of Interest

Three variables are of interest here in constructing a response surface. First, Self Agency was measured by asking participants to rate “how you acted” in the social interaction in terms of agency on a 0-100 slider scale, with 0 tagged Submissive and 100 tagged Dominant. Second, Other Agency was measured by asking participants to rate how the other person acted in the corresponding dyadic social interaction on the same slider scale. Both items were adapted from Moskowitz and colleagues (2005). Third, Happiness was measured by asking participants to rate “how happy do you feel right now?” on a 0 to 100 slider, with 0 tagged not at all happy and 100 tagged extremely happy.

Research Question

The current demonstration of RSA explores the effect of the fit between the perceived agency of the self (predictor 1) and the perceived agency of the interaction partner (predictor 2) in dyadic interpersonal interactions on happiness (the outcome). Here we test the congruence hypothesis: the more congruent the perceived self agency and the perceived other agency are, the happier one would be.

#set filepath for data file
filepath <- "https://raw.githubusercontent.com/The-Change-Lab/collaborations/refs/heads/main/iSAHIB_RSA/isahib_RSA.csv"

#read in the .csv file using the url() function
isahib <- read.csv(file=url(filepath), header=TRUE)

#look at the data
head(isahib, 10)
##    SelfAgency_X OtherAgency_Y Happy_Z
## 1            82            22      57
## 2            97            21      66
## 3            55            51      67
## 4            82            13      73
## 5            32            61      94
## 6            14            95      83
## 7            56            52      78
## 8            34            71      74
## 9            50            50      67
## 10           51            48      77
#describe the data
describe(isahib)
##               vars   n  mean    sd median trimmed   mad min max range  skew
## SelfAgency_X     1 866 51.51 15.76   53.0   51.85 14.83   2  97    95 -0.22
## OtherAgency_Y    2 866 55.93 17.07   57.5   56.45 17.05   4  99    95 -0.28
## Happy_Z          3 866 81.41 11.33   84.0   83.44  5.93   5  98    93 -2.91
##               kurtosis   se
## SelfAgency_X     -0.19 0.54
## OtherAgency_Y    -0.14 0.58
## Happy_Z          11.77 0.39
#plot the distribution of each variable and the correlation between them
psych::pairs.panels(isahib)

#plot raw data scatter plot in three dimensions
plot_ly(x = isahib$SelfAgency_X, y = isahib$OtherAgency_Y, z = isahib$Happy_Z, 
        type = "scatter3d", mode = "markers", color = isahib$Happy_Z) %>%
   layout(scene = list(xaxis=list(title = "Self Agency", nticks=10, range=c(0,100)),
                       yaxis=list(title = "Other Agency", nticks=10, range=c(0,100)),
                       zaxis=list(title = "Happiness", nticks=10, range=c(0,100))))

Step 2: Data Centering

Before doing the RSA (i.e., fitting the polynomial regression), it is helpful to center both the predictor variables, \(X\) (SelfAgency_X) and \(Y\) (OtherAgency_Y). The main goal of the centering is to lessen the correlation between the multiplicative terms (i.e., interaction and polynomial terms) and their component variables (both predictors).

There are multiple ways to choose the midpoint for data centering. One common approach is to center predictors on the scale midpoint (Barranti et al., 2017). In our case, this means centering at 50, the midpoint of a 0 to 100 scale. This facilitates the interpretation of the congruence hypothesis, as both predictors \(X\) and \(Y\) are on a commensurable scale and share the scale midpoint, meaning the visualization of the surface directly represents the exact numerical (in)congruence between the two predictors.

Other options include centering predictors at the variable-level mean (e.g., \(X-\bar{X}\), \(Y-\bar{Y}\)) or at the dyad-level mean (e.g., \(X-\frac{\bar{X}+\bar{Y}}{2}\), \(Y-\frac{\bar{X}+\bar{Y}}{2}\)). This complicates the interpretation, as each variable deviates from its respective mean or the shared mean, but it can be especially useful in certain cases, for example, the multilevel extension of RSA that will be discussed in subsequent tutorials.

#center predictors with the scale midpoint
isahib <- isahib %>%
  mutate(cSelfAgency_X = SelfAgency_X - 50,
         cOtherAgency_Y = OtherAgency_Y - 50)
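For comparison, the alternative centering choices described above could be sketched as follows (the m- and g-prefixed variable names are our own and are not used later in this tutorial):

```r
#center predictors at the variable-level mean
isahib_alt <- isahib %>%
  mutate(mSelfAgency_X  = SelfAgency_X  - mean(SelfAgency_X,  na.rm = TRUE),
         mOtherAgency_Y = OtherAgency_Y - mean(OtherAgency_Y, na.rm = TRUE))

#center predictors at the shared (dyad-level) mean of both variables
shared_mean <- mean(c(isahib$SelfAgency_X, isahib$OtherAgency_Y), na.rm = TRUE)
isahib_alt <- isahib_alt %>%
  mutate(gSelfAgency_X  = SelfAgency_X  - shared_mean,
         gOtherAgency_Y = OtherAgency_Y - shared_mean)
```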

Step 3: Fit the Polynomial Regression Model and Plot the Response Surface

To examine the research question on the congruence between perceived self agency and perceived other agency in predicting happiness, we fit a polynomial regression model to the data. The polynomial regression model can be fitted using the RSA() function in the RSA package or the lm() or glm() functions in the stats package.

We use both packages to fit the model to demonstrate that they provide the same results (in the current case, the same parameter estimates with slightly different standard errors), while offering different characteristics and extension possibilities for other types of RSA models.
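As a point of comparison, here is a sketch of the same full polynomial model specified with lm() (the object name lm_happy_agency is ours); its coefficient estimates should match those from the RSA() fit:

```r
#lm() function in the stats package: full second-order polynomial
lm_happy_agency <- lm(Happy_Z ~ cSelfAgency_X + cOtherAgency_Y +
                        I(cSelfAgency_X^2) +
                        cSelfAgency_X:cOtherAgency_Y +
                        I(cOtherAgency_Y^2),
                      data = isahib)
summary(lm_happy_agency)
```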

#RSA() function in the RSA package
rsa_happy_agency <- RSA::RSA(
  formula = Happy_Z ~ cSelfAgency_X * cOtherAgency_Y, #specify the outcome and the predictor
  data = isahib,
  scale = FALSE,           # do not rescale the predictors
  na.rm = TRUE,            # remove missing values
  out.rm = FALSE,          # do not remove outliers
  models = "full",         # full polynomial equation
  missing = "listwise",    # "listwise" to exclude NAs, default as FIML
  estimator = "ML",
  se = "standard"          # reproduce parameter estimates with OLS in lm()
  )
## [1] "Computing polynomial model (full) ..."

Note that the formula interface in the RSA package is relatively simple, as the package was developed specifically for RSA. One only has to specify the outcome variable on the left side of the equation and the product of the two predictors on the right side.

To specify the model, one uses the models = argument. In the current example, we use models = "full" for the full polynomial regression shown at the beginning of this section. Setting models = "default" will compute all models available besides the absolute difference model (\(Z_{i} = b_{0} + b_{1}(X_{i} - Y_{i}) + e_{i}\)). One could also specify models = "all" to compute and compare all 24 models available in the package.

For all other models that can be specified (e.g., onlyx, onlyy, onlyx2), please refer to the RSA package documentation (Schönbrodt & Humberg, 2023).

#model summary
summary(rsa_happy_agency)
## No model has been specified - showing results for the full second-order polynomial model (<full>).
## RSA output (package version 0.10.6)
## ===========================================
## 
## Are there discrepancies in the predictors (with respect to numerical congruence)?
## (A cutpoint of |Δz| > 0.5 is used)
## 
## ----------------------------
## cOtherAgency_Y < cSelfAgency_X                      congruent 
##                          "18%"                          "49%" 
## cOtherAgency_Y > cSelfAgency_X 
##                          "32%" 
## 
## Is the corresponding global model (full second-order polynomial model) significant?
## ----------------------------
## Test on model significance: R^2 = 0.079, p <.001
## 
## 
## Number of observations: n = 866
## ----------------------------
## 
## 
## Regression coefficients for model <full>
## ----------------------------
##                                      label    est    se ci.lower ci.upper
## Happy_Z~1                               b0 84.301 0.533   83.256   85.346
## Happy_Z~cSelfAgency_X                   b1 -0.079 0.024   -0.126   -0.031
## Happy_Z~cOtherAgency_Y                  b2 -0.063 0.023   -0.108   -0.018
## Happy_Z~cSelfAgency_X2                  b3 -0.004 0.001   -0.007   -0.002
## Happy_Z~cSelfAgency_X_cOtherAgency_Y    b4  0.000 0.001   -0.002    0.002
## Happy_Z~cOtherAgency_Y2                 b5 -0.004 0.001   -0.006   -0.002
##                                        beta   pvalue sig
## Happy_Z~1                             7.443  p <.001 ***
## Happy_Z~cSelfAgency_X                -0.110 p = .001  **
## Happy_Z~cOtherAgency_Y               -0.095 p = .007  **
## Happy_Z~cSelfAgency_X2               -0.125  p <.001 ***
## Happy_Z~cSelfAgency_X_cOtherAgency_Y  0.007 p = .842    
## Happy_Z~cOtherAgency_Y2              -0.147  p <.001 ***
## 
## 
## 
## Surface tests (a1 to a5) for model <full>
## ----------------------------
##   label    est    se ci.lower ci.upper   pvalue sig
## 1    a1 -0.142 0.032   -0.204   -0.079  p <.001 ***
## 2    a2 -0.008 0.002   -0.012   -0.004  p <.001 ***
## 3    a3 -0.016 0.035   -0.084    0.052 p = .648    
## 4    a4 -0.009 0.001   -0.011   -0.006  p <.001 ***
## 5    a5  0.000 0.002   -0.004    0.004 p = .911    
## 
## a1: Linear additive effect on line of congruence? YES
## a2: Is there curvature on the line of congruence? YES
## a3: Is the ridge shifted away from the LOC? NO
## a4: Is there curvature on the line of incongruence? YES
## 
## 
## Location of stationary point for model <full>
## ----------------------------
## cSelfAgency_X = -9.372; cOtherAgency_Y = -7.96; predicted Happy_Z = 84.92
## 
## 
## Principal axes for model <full>
## ----------------------------
##                    label     est      se ci.lower ci.upper   pvalue sig
## Intercept of 1. PA   p10  14.519 157.177 -293.542  322.580 p = .926    
## Slope of 1. PA       p11   2.399  17.740  -32.371   37.168 p = .892    
## Intercept of 2. PA   p20 -11.867  29.442  -69.572   45.837 p = .687    
## Slope of 2. PA       p21  -0.417   3.083   -6.460    5.626 p = .892    
##   --> Lateral shift of first PA from LOC at point (0; 0): C1 =  -4.272 
##   --> Lateral shift of second PA from LOC at point (0; 0): C2 =  20.353

The model output of the RSA() function provides additional metrics, parameter estimates, and statistical tests for interpreting the surface and examining the statistical significance of the congruence hypothesis. We will return to the interpretation of those results after visualizing the surface, because visualization first offers a rough reading of the congruence hypothesis and greatly helps in interpreting the model parameters.

The RSA package also offers the plotRSA() function, which should only be used when you directly input the regression coefficients. For plotting a fitted RSA object, we suggest using the plot() function for ease of use and demonstration.

#visualizing the surface
plot(rsa_happy_agency,
     rotation = list(x = -63, y = 32, z = 15), # graph position
     surface = "predict",        # surface based on the predicted values of outcome
     param = FALSE,              # TRUE to display RSA hypothesis testing parameters
     coefs = FALSE,              # TRUE to display polynomial coefficients
     axes = c("LOC", "LOIC"),    # display the line of congruence and incongruence
     project = c("LOC", "PA1"),  # projections onto the bottom
     points = list(show = TRUE), # default to display raw scatter points
     hull = FALSE,               # TRUE to display a bag plot on the surface
     # you can also use type = "contour" here to obtain a contour plot
     type = "3d",                # or use type = "interactive" for an interactive surface
     legend = TRUE,              # TRUE to display color legend
     main = ""                   # title
     )

This figure is a three-dimensional representation of the “response surface”, which is an estimation of the joint distribution of the two predictors (cSelfAgency_X and cOtherAgency_Y) and one outcome (Happy_Z).

We can tell visually from the convex surface that this person is happiest when they perceive the agency of both themself and their interaction partner to be around the middle ground (near the midpoint of the measurement scale), as the predicted highest value of happiness lies around the center of the surface.

This person would be less happy when either their self-perceived agency or the perceived agency of their interaction partner is too extreme (toward either end of the measurement scale), as indicated by the relatively low predicted happiness values when one of the two predictors hits the high (low) end and the other reaches the low (high) end.

In addition, we can tell conceptually from the figure that happiness appears to be higher when the self-perceived agency is similar to the perceived agency of the other while both remain near the mean of the scale (i.e., 0) instead of moving further toward either end (i.e., -50 and 50).

Step 4: Parameter Transformation

To use the fitted polynomial regression and the response surface for congruence hypothesis testing, one first needs to transform the parameter estimates obtained from the regression into another set of parameters that better describe features of the response surface.

This process involves estimating the stationary point, the first principal axis of the surface, the Line of Congruence (LOC), and the Line of InCongruence (LOIC) (Edwards, 2002; Humberg et al., 2019). The book by Khuri and Cornell (1996) details the rationale behind and the calculation of those features from response surfaces.

Here we first estimate those parameters in Step 4 and then map them to the congruence hypothesis testing in Step 5.

Line of Congruence

The line of congruence (LOC) is the collection of points where \(Y = X\).

In the plot above, the LOC is shown as the dotted black line. We can see when looking at the projection of the LOC onto the flat surface at the bottom of the cube, that this is the line where \(Y = X\).

The hypothesis of “perfect” congruence is that the outcome \(Z\) should be maximized when \(Y = X\), which in regression form can be written as \(Y = 0 + 1X\).

#visualizing the surface
plot(rsa_happy_agency,
     rotation = list(x = -63, y = 32, z = 15), # graph position
     surface = "predict",           # surface based on the predicted values of outcome
     param = FALSE,                 # TRUE to display RSA hypothesis testing parameters
     coefs = FALSE,                 # TRUE to display polynomial coefficients
     axes = c("LOC"),               # display the line of congruence and incongruence
     project = c("LOC", "PA1"),     # projections onto the bottom
     points = list(show = FALSE),   # default to display raw scatter points
     hull = FALSE,                  # TRUE to display a bag plot on the surface
     # you can also use type = "contour" here to obtain a contour plot
     type = "3d",                   # or use type = "interactive" for an interactive surface
     legend = TRUE,                 # TRUE to display color legend
     main = ""                      # title
     )

To describe where the LOC (\(Y = X\)) lies on the response surface (i.e., the corresponding parabola for the LOC), we can substitute \(Y = X\) into the polynomial regression:

\[ \begin{split} Z_{i} & = b_{0} + b_{1}X_{i} + b_{2}Y_{i} + b_{3}X^{2}_{i} + b_{4}X_{i}Y_{i} + b_{5}Y^{2}_{i} + e_{i} \\ & = b_{0} + b_{1}X_{i} + b_{2}X_{i} + b_{3}X^{2}_{i} + b_{4}X_{i}X_{i} + b_{5}X^{2}_{i} + e_{i} \\ & = b_{0} + (b_{1}+b_{2})X_{i} + (b_{3}+b_{4}+b_{5})X^{2}_{i} + e_{i} \\ \end{split} \]
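As an illustration, the reduced equation along the LOC can be evaluated with the rounded coefficient estimates reported in the model output above (a sketch; these are approximations, and results from the unrounded coefficients will differ slightly):

```r
#rounded coefficient estimates from the model output above
b0 <- 84.301; b1 <- -0.079; b2 <- -0.063
b3 <- -0.004; b4 <-  0.000; b5 <- -0.004

#predicted happiness along the LOC (Y = X)
x_loc <- seq(-50, 50, by = 25)
z_loc <- b0 + (b1 + b2) * x_loc + (b3 + b4 + b5) * x_loc^2
round(z_loc, 2)
```

The curvature term (b3 + b4 + b5) is negative, so predicted happiness falls off as the congruent predictor pair moves away from the vertex of this parabola.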

First Principal Axis

The first principal axis is the line along which the upward curvature is greatest when the surface is convex (as in the current example) or saddle-shaped, and along which the downward curvature is least when the surface is concave (Edwards, 2002).

In the plot above, the first principal axis is shown as the blue diagonal line. To describe the principal axes in the X, Y plane for hypothesis testing (for a detailed transformation to the 2D X, Y plane, or using the contour plot, see Khuri & Cornell, 1996), the equation for the first principal axis is:

\[ Y_i = p_{10} + p_{11}X_i \]

To solve this equation, one first obtains the Hessian matrix of the polynomial equation and then calculates the eigenvectors by solving for each eigenvalue of the Hessian matrix. For the purposes of this tutorial we do not present the mathematical process of obtaining \(p_{10}\) and \(p_{11}\) here; interested readers can consult the book by Khuri and Cornell (1996) for a detailed derivation (with some adaptations in expressing the eigenvectors and the stationary point adjustment to correspond to their application in psychology).

Here \(p_{10}\) and \(p_{11}\) are given by:

\[ \begin{split} p_{11} & = \frac{b_{5}-b_{3}+\sqrt[]{(b_{3}-b_{5})^{2}+b_{4}^{2}}}{b_{4}} \\ \\ p_{10} & = Y_{0} - p_{11}X_{0} \end{split} \]

where \((X_{0}, Y_{0})\) is the stationary point of the surface, which is the point at which the slope of the surface is zero in all directions, given by:

\[ \begin{split} X_{0} & = \frac{b_{2}b_{4}-2b_{1}b_{5}}{4b_{3}b_{5}-b_{4}^2} \\ \\ Y_{0} & = \frac{b_{1}b_{4}-2b_{2}b_{3}}{4b_{3}b_{5}-b_{4}^2} \end{split} \]
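The transformations above can be collected into a small helper function (surface_params is our own name, not part of the RSA package). Note that it should be applied to unrounded coefficient estimates, since \(b_{4}\) rounds to 0.000 in the output above and \(p_{11}\) divides by \(b_{4}\):

```r
#stationary point and first principal axis from polynomial coefficients
#(direct transcription of the formulas above; assumes b4 != 0)
surface_params <- function(b1, b2, b3, b4, b5) {
  denom <- 4 * b3 * b5 - b4^2
  X0  <- (b2 * b4 - 2 * b1 * b5) / denom            # stationary point, X
  Y0  <- (b1 * b4 - 2 * b2 * b3) / denom            # stationary point, Y
  p11 <- (b5 - b3 + sqrt((b3 - b5)^2 + b4^2)) / b4  # slope of first PA
  p10 <- Y0 - p11 * X0                              # intercept of first PA
  list(X0 = X0, Y0 = Y0, p10 = p10, p11 = p11)
}
```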

Line of InCongruence

The line of incongruence (LOIC) is the collection of points where \(Y = -X\).

We now change the project = argument in plotting to visualize the LOIC as the blue diagonal line. Its associated parabola on the surface is visualized in blue by changing axes =.

#visualizing the surface
plot(rsa_happy_agency,
     rotation = list(x = -63, y = 32, z = 15), # graph position
     surface = "predict", # surface based on the predicted values of outcome
     param = FALSE, # TRUE to display RSA hypothesis testing parameters
     coefs = FALSE, # TRUE to display polynomial coefficients
     axes = c("LOIC"), # display the line of congruence and incongruence
     project = c("LOIC"), # projections onto the bottom
     points = list(show = FALSE), # default to display raw scatter points
     hull = FALSE, # TRUE to display a bag plot on the surface
     # you can also use type = "contour" here to obtain a contour plot
     type = "3d", # or use type = "interactive" for an interactive surface
     legend = TRUE, # TRUE to display color legend
     main = "" # title
     )

To describe where the LOIC (\(Y = -X\)) lies on the response surface (i.e., the corresponding parabola for the LOIC), we can substitute \(Y = -X\) into the polynomial regression:

\[ \begin{split} Z_{i} & = b_{0} + b_{1}X_{i} + b_{2}Y_{i} + b_{3}X^{2}_{i} + b_{4}X_{i}Y_{i} + b_{5}Y^{2}_{i} + e_{i} \\ & = b_{0} + b_{1}X_{i} + b_{2}(-X_{i}) + b_{3}X^{2}_{i} + b_{4}X_{i}(-X_{i}) + b_{5}(-X_{i})^{2} + e_{i} \\ & = b_{0} + (b_{1}-b_{2})X_{i} + (b_{3}-b_{4}+b_{5})X^{2}_{i} + e_{i} \\ \end{split} \]

Step 5: Congruence Hypothesis Testing and Interpretation

Formal statistical significance testing of the congruence hypothesis in psychology has a broad version and a strict version, corresponding to a group of four and a group of six significance tests, respectively (Edwards, 2002; Humberg et al., 2019).

We first describe the four significance tests for the broad sense of congruence and then outline the two additional tests for the stricter version.

Broad Congruence Hypothesis

The first statistical test examines whether the first principal axis significantly differs from the LOC. Visually, this means that the ridge of the surface (or its projection onto the XY plane) should not be significantly different from the diagonal of the XY plane (the LOC, extending from the lower left to the upper right). The properties \(p_{10}\approx0\) and \(p_{11}\approx1\) constitute the first two tests for congruence effects.

The congruence hypothesis suggests that the surface should predict the highest outcome (i.e., happiness) for people with congruent predictors (\(Y = X\)). In incongruent cases where \(Y \neq X\), the predictors deviate from the LOC. All such deviating combinations of X and Y should yield lower outcome Z values than those obtained with numerically equal predictors. If the first principal axis were significantly different from the LOC, the highest outcome Z value would instead be obtained with numerically unequal predictors (\(Y \neq X\)).

The estimate and statistical significance test of \(p_{10}\approx0\) are provided in the last section, Principal axes for model, of the model output from the RSA() function.

In the current demonstration, the first condition is fulfilled, with \(p_{10}\) not significantly different from 0 (\(CI: [-293.542, 322.580]\), \(p=.926\)).

This section only provides a significance test of whether \(p_{11}\) is significantly different from 0, rather than from 1. To test whether \(p_{11}\) is significantly different from 1, we conduct a z-test:

#z-statistic of p11 against the null hypothesis value of 1 (p11 = 2.399, se = 17.740)
z_p11 <- (2.399 - 1) / 17.740
#two-tailed p-value of the z-statistic
p_value <- 2 * (1 - pnorm(abs(z_p11)))
p_value
## [1] 0.9371429

The results indicate that the current demonstration data also fulfill the second condition of the congruence hypothesis: \(p_{11}\) is not significantly different from 1 (\(p = .937\)).

For the third and fourth significance tests of the broad sense of the congruence hypothesis, we look at the line of incongruence and its corresponding parabola on the surface. For brevity, the RSA package sets \(a_{3} = b_{1} - b_{2}\) (the coefficient of \(X_{i}\)) and \(a_{4} = b_{3}-b_{4}+b_{5}\) (the coefficient of \(X_{i}^2\)) to describe the LOIC’s parabola. For any congruence effect to occur, this parabola needs to reach its maximum at the point where \(X = Y\) (perfect congruence) and have an inverted U-shape (Humberg et al., 2019; Parry & Edwards, 1993). These two conditions correspond to the assumptions of the congruence hypothesis that the outcome \(Z\) is largest only at numerically aligned predictor combinations (\(Y = X\)) and that greater deviation from this point predicts a lower outcome (the inverted U-shaped parabola).

We thus obtain the third and fourth significance tests of the congruence effect:

  1. \(a_{3}\), the slope of the tangent line at the point (\(X=0\), \(Y=0\)) on the LOIC parabola, should not be significantly different from \(0\), as a parabola is maximized at its vertex, where the slope of the tangent line is 0; and

  2. \(a_{4}\) should be significantly negative for the inverted U-shape.

Those results are included in the Surface tests (a1 to a5) for model section of the RSA model output. In the current demonstration, these two conditions are fulfilled, with \(a_{3}\) not significantly different from 0 (\(CI: [-.084, .052]\), \(p = .648\)) and \(a_{4}\) significantly negative (\(CI: [-.011, -.006]\), \(p <.001\)).
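As a quick arithmetic check, the rounded regression coefficients reported earlier reproduce these surface-test values (the output reports \(a_{4} = -0.009\) because it is computed from the unrounded coefficients):

\[ \begin{split} a_{3} & = b_{1} - b_{2} = -0.079 - (-0.063) = -0.016 \\ a_{4} & = b_{3} - b_{4} + b_{5} = -0.004 - 0.000 + (-0.004) = -0.008 \end{split} \]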

In sum, the current demonstration data indicate a congruence effect in the broad sense, meaning that the more similar the perceived self agency and the perceived other agency are, the happier this person reports being.

Working through the equations that satisfy those four conditions for the broad sense of the congruence hypothesis, we can see that only three conditions are actually required.

\[ \begin{split} p_{11} & = \frac{b_{5}-b_{3}+\sqrt[]{(b_{3}-b_{5})^{2}+b_{4}^{2}}}{b_{4}} \\ \\ & = 1\\ \text{thus} \qquad \quad & \\ \\ \quad b_{4} & + b_{3} - b_{5} = \sqrt[]{(b_{3}-b_{5})^{2}+b_{4}^{2}} \\ \\ b_{3} & = b_{5} \end{split} \]

Similarly,

\[ \begin{split} p_{10} & = Y_{0} - p_{11}X_{0}\\ \\ & = 0 \\ \\ \text{thus} \qquad \quad & \\ \\ \quad \frac{b_{1}b_{4}-2b_{2}b_{3}}{4b_{3}b_{5}-b_{4}^2} & = 1 * \frac{b_{2}b_{4}-2b_{1}b_{5}}{4b_{3}b_{5}-b_{4}^2}\\ \\ b_{1}b_{4}-2b_{2}b_{3} & = b_{2}b_{4}-2b_{1}b_{5}\\ \\ \\ \text{since} \qquad \quad & \\ \\ \quad a_{3} & = b_{1} - b_{2} \\ \\ & = 0 \\ \\ \text{thus} \qquad \quad & \\ \\ \quad b_{3} & = b_{5} \end{split} \]

We can thus consider the four significance tests of the broad sense of congruence (i.e., \(p_{10}=0\), \(p_{11}=1\), \(a_{3} = b_{1} - b_{2} = 0\), and \(a_{4} = b_{3}-b_{4}+b_{5} < 0\)) to be essentially three tests: \(a_{5} = b_{3} - b_{5} = 0\) (corresponding to \(p_{10}=0\) and \(p_{11}=1\)), \(b_{1} = b_{2}\) (or \(a_{3} = 0\)), and \(b_{3}-b_{4}+b_{5} < 0\) (or \(a_{4} < 0\)). These are all implemented in the RSA package following Schönbrodt et al. (2018).

Strict Congruence Hypothesis

We have established a broad sense of congruence effects in the demonstration data, while allowing for the fact that the predictors can have “main effects.”

For example, as we can see from the response surface for the demonstration data, the predicted happiness (\(Z=56.85\), \(CI:[47.67, 66.03]\)) when perceived self agency and perceived other agency are congruent at the high end of the scale (e.g., \(X=Y=50\)) is lower than the predicted happiness (\(Z=84.30\), \(CI: [83.25, 85.35]\)) when the two predictors are congruent at the center of the scale (e.g., \(X=Y=0\)). Both predictor combinations are congruent (numerically equal), but the predicted outcome differs, with a negative main effect such that higher predictor values are associated with a lower outcome.

Such congruence effects combined with main effects might seem theoretically sound at first sight: for example, adolescents at stages with higher desired and actual separation from family reported higher interpersonal and academic adjustment than they did at stages with lower desired and actual separation from family (Hoffman, 1984).

However, main effects can also violate a basic assumption of the congruence hypothesis: incongruent predictor combinations may predict a better outcome than congruent ones. This is the case in our demonstration data, where the predicted happiness (\(Z=56.85\), \(CI:[47.67, 66.03]\)) for a congruent combination of high perceived self agency and other agency (\(X=Y=50\)) is lower than the predicted happiness (\(Z=75.32\), \(CI:[72.70, 77.94]\)) for an incongruent combination of lower predictor values (\(X = 30\), \(Y = 20\)).

To avoid such violations of the congruence hypothesis when desired, one can adopt the last two significance tests, which address the strict sense of congruence by examining the LOC and its corresponding parabola on the response surface. For brevity, the RSA package defines \(a_{1} = b_{1} + b_{2}\) (the coefficient of \(X_{i}\)) and \(a_{2} = b_{3}+b_{4}+b_{5}\) (the coefficient of \(X_{i}^2\)) to describe the LOC’s parabola. For the strict sense of congruence to hold, the predicted outcome along all congruent combinations of predictors should be constant. The corresponding statistical tests comprise two conditions: neither \(a_{1}\) nor \(a_{2}\) should be significantly different from 0, so that the predicted surface has no main effects. These results can also be found in the Surface tests (a1 to a5) for model section of the RSA model output.
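For illustration, here is a minimal sketch of hypothetical coefficients that would satisfy the strict sense of congruence (the values are invented for this example, not taken from the tutorial data):

```r
# hypothetical coefficients for a strictly congruent surface
b1 <- 0; b2 <- 0; b3 <- -0.004; b4 <- 0.008; b5 <- -0.004

a1 <- b1 + b2       # linear term of the LOC parabola; should be 0
a2 <- b3 + b4 + b5  # quadratic term of the LOC parabola; should be 0
a4 <- b3 - b4 + b5  # curvature of the LOIC; should remain negative

c(a1 = a1, a2 = a2, a4 = a4)
```

With \(a_{1} = a_{2} = 0\), the predicted outcome is constant along the LOC, while \(a_{4} < 0\) preserves the inverted U-shape along the LOIC.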

For our current demonstration data, both \(a_{1}\) (\(β = -0.14\), \(CI: [-0.20, -0.08]\), \(p < .001\)) and \(a_{2}\) (\(β = -0.01\), \(CI: [-0.012, -0.004]\), \(p < .001\)) are significantly negative. We thus observe negative linear and curvilinear associations between the congruent predictor combinations and the outcome, visualized as the corresponding parabola of the LOC on the response surface.

That being said, we note that the broad sense of the congruence hypothesis may be more applicable in psychology, even though it relaxes the strict assumptions of “congruence,” because psychological theories often expect “main” effects of the predictor combination (for example, in studying person-environment fit).

In the following table, we report all six hypothesis tests for examining congruence hypothesis using RSA as well as the results for the current empirical example. As shown in the table, LOIC’s parabola has an inverted U-shape (\(a_{4} = -0.01\), \(p < .001\)) and reaches the maximum happiness when perceived self agency is congruent with (numerically equal to) perceived other agency (\(a_{3} = -0.02\), \(p = .648\)). Correspondingly, the first principal axis is not significantly different from the line of congruence (\(p_{10} = 14.52\), \(p = .926\), \(p_{11} = 2.40\), \(p = .937\); or \(a_{5} = 0.00\), \(p = .911\)), meaning that the highest happiness scores, located on the ridge of the surface, are predicted when perceived self agency and perceived other agency are congruent (numerically equal). In addition, the congruent combinations of perceived self agency and perceived other agency are linearly (\(a_{1} = -0.14\), \(p < .001\)) and curvilinearly (\(a_{2} = -0.01\), \(p < .001\)) associated with happiness, meaning that happiness is maximized with a moderate self agency and other agency (\(X = Y = -7\), \(Z = 84.79\)) and either lower or higher congruent combinations of self agency and other agency would predict lower happiness.

Table 1

Results from Response Surface Analysis Examining How the Congruence of Perceived Self Agency and Perceived Other Agency Influences Happiness

Note. SE: Standard Error, CI: Confidence Interval
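As a check on the values reported above, the location of the maximum along the LOC can be recovered from the vertex of the parabola \(Z = b_{0} + a_{1}X + a_{2}X^2\), using the rounded estimates reported in the text (\(b_{0} = 84.30\) is the predicted happiness at \(X = Y = 0\)):

```r
# rounded estimates from the fitted surface (see text and Table 1)
b0 <- 84.30; a1 <- -0.14; a2 <- -0.01

# vertex of the LOC parabola: X = -a1 / (2 * a2)
x_max <- -a1 / (2 * a2)
z_max <- b0 + a1 * x_max + a2 * x_max^2

round(c(x_max = x_max, z_max = z_max), 2)  # X = Y = -7, Z = 84.79
```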

So far, we have specified, fitted, visualized, and interpreted a polynomial regression model with the RSA package for both the broad and strict senses of the congruence hypothesis.

Next, we provide a comparison with the results obtained with the more conventional difference score approach.

Comparison with the “Difference Score” Approach

The last section of this tutorial provides a comparison of the results and interpretation of the congruence hypothesis tested with RSA and with the conventional difference score approach to discuss the potential advantages of adopting RSA. We will first fit a difference score model and interpret the results, followed by a comparison of the interpretation we could gather from the model results.

The difference score approach operationalizes the congruence hypothesis by assuming that a smaller difference between predictor \(X\) and predictor \(Y\) predicts a higher outcome \(Z\). In our demonstration, the assumption is that a smaller difference between self agency and other agency predicts higher happiness. Thus, the difference score model assumes an (often linear) function relating happiness to the absolute difference between self agency and other agency (representing the magnitude of the difference; other approaches include using the squared difference).

For comparability, we will specify the model using the RSA package. The RSA() function can fit more than 20 model implementations (for details, see Schönbrodt, 2016) for examining the congruence hypothesis. In the absolute difference model, the following constraints are imposed on the original polynomial regression model:

\[ \begin{array}{c} b_{1} + b_{2} = 0 \\ \\ b_{3} = 0 \\ \\ b_{4} = 0 \\ \\ b_{5} = 0 \end{array} \] We then adopt the notation of the RSA package for uniformity,

\[ \begin{array}{ll} w_{2} = b_{1} & \text{if} \hspace{2mm} X_{i} \ge Y_{i} \hspace{2mm} \\ \\ w_{3} = b_{2} & \text{if} \hspace{2mm} X_{i} < Y_{i} \hspace{2mm} \\ \\ \end{array} \] Thus,

\[ Z_{i} = \begin{cases} b_{0} + w_{2}(X_{i} - Y_{i}) + e_{i} & \text{if} \hspace{2mm} X_{i} \ge Y_{i}\\ b_{0} + w_{3}(X_{i} - Y_{i}) + e_{i} & \text{if} \hspace{2mm} X_{i} < Y_{i} \end{cases} \] while,

\[ w_{2} + w_{3} = 0 \]

which is equivalent to

\[ \begin{split} Z_{i} &= b_{0} + w_{2}(|X_{i} - Y_{i}|) + e_{i} \end{split} \]
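The equivalence of the piecewise and absolute-value formulations can be checked numerically. A minimal sketch with arbitrary coefficient values satisfying \(w_{2} + w_{3} = 0\):

```r
# arbitrary values with w2 = -w3, as required by the constraint
b0 <- 80; w2 <- -0.1; w3 <- 0.1

# piecewise formulation: the slope on (x - y) depends on the sign of x - y
piecewise <- function(x, y) {
  if (x >= y) b0 + w2 * (x - y) else b0 + w3 * (x - y)
}

# absolute-value formulation
absdiff <- function(x, y) b0 + w2 * abs(x - y)

piecewise(30, 20) == absdiff(30, 20)  # TRUE
piecewise(20, 30) == absdiff(20, 30)  # TRUE
```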

In the current tutorial, we first compute the absolute difference score between perceived self agency \(X\) and perceived other agency \(Y\), then fit the difference score model, and visualize the fitted linear relationship between happiness \(Z\) and the absolute difference in perceived self agency and other agency \((|X_{i} - Y_{i}|)\).

#calculate the difference score to create a new variable
isahib <- isahib %>%
  mutate(Agency_diff = SelfAgency_X - OtherAgency_Y)

# scatter plot
plot(Happy_Z ~ abs(Agency_diff),
     data = isahib) 

# with regression on top
lm_happy_agency_diff <- lm(formula = Happy_Z ~ abs(Agency_diff),
                           data = isahib)

abline(lm_happy_agency_diff)

We can visually see that the absolute difference scores between perceived self agency and perceived other agency are predominantly less than 20. The negative slope indicates that higher absolute difference between perceived self agency and perceived other agency is associated with lower happiness.

# difference score model with RSA()
rsa_happy_agency_diff <- RSA::RSA(
  formula = Happy_Z ~ cSelfAgency_X * cOtherAgency_Y, #specify the outcome and the predictor
  data = isahib,
  scale = FALSE,         # do not rescale the predictors
  na.rm = TRUE,          # remove missings
  out.rm = FALSE,        # do not remove outliers
  models = "absdiff",    # absolute difference model
  missing = "listwise",  # "listwise" to exclude NAs, default as FIML
  estimator = "ML",
  se = "standard"        # reproduce parameter estimates with OLS in lm()
  )
## [1] "Computing polynomial model (full) ..."
## [1] "Computing constrained absolute difference model (absdiff) ..."
# even when specified as models = "diff", RSA() computes the intended absolute
# difference score model (absdiff) and the full polynomial regression model (full)
rsa_diff <- summary(rsa_happy_agency_diff$models$absdiff)
print(rsa_diff)
## lavaan 0.6-19 ended normally after 6 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         7
##   Number of equality constraints                     4
## 
##                                                   Used       Total
##   Number of observations                           866         869
## 
## Model Test User Model:
##                                                       
##   Test statistic                                27.187
##   Degrees of freedom                                 4
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   Happy_Z ~                                           
##     cSlfAgn_X (b1)    0.000       NA                  
##     cOthrAg_Y (b2)    0.000                           
##     W         (w1)    0.000                           
##     W_cSlfA_X (w2)   -0.099    0.022   -4.447    0.000
##     W_cOthA_Y (w3)    0.099    0.022    4.447    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Happy_Z          82.931    0.512  162.085    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Happy_Z         125.403    6.026   20.809    0.000
## 
## Constraints:
##                                                |Slack|
##     b1 - 0                                       0.000
##     b2 - 0                                       0.000
##     w1 - 0                                       0.000
##     w2 - (-w3)                                   0.000

The fitted model and formal hypothesis test confirmed our visual inspection and supported the congruence hypothesis: higher happiness is associated with a smaller absolute difference between perceived self agency and perceived other agency (\(β = -0.10\), \(p < .001\)). We can also visualize this absolute difference model as a response surface.

#create the data grid over the two predictors
x_grid <- seq(-50, 50, by = 1) #use the range of the scale
y_grid <- seq(-50, 50, by = 1)
newdat <- expand.grid(SelfAgency_X = x_grid, OtherAgency_Y = y_grid)

#the fitted model predicts happiness from the agency difference score
newdat$Agency_diff <- newdat$SelfAgency_X - newdat$OtherAgency_Y

#predict the z value from the lm() model
pred <- predict(lm_happy_agency_diff, 
                newdata = newdat, 
                se.fit = TRUE)

#output for visualization (rows index y_grid, columns index x_grid)
z_matrix <- matrix(pred$fit, nrow = length(y_grid), byrow = TRUE)

#plot
plot_ly(x = x_grid, y = y_grid) %>% 
  add_surface(z = z_matrix,
              colorscale = list(c(0,1),c("red","blue"))) %>% 
  add_trace(x = -50,
            y = -50:50,
            z = z_matrix[1,1:101],
            type = "scatter3d",
            mode = "lines",
            line = list(color = "black", width = 20)) %>%
    layout(
      scene = list(
      xaxis=list(title = "Self Agency", nticks=10, range=c(-50,50)),
      yaxis=list(title = "Other Agency", nticks=10, range=c(-50,50)),
      zaxis=list(title = "Happiness", nticks=10, range=c(70,90))))

This response surface restores the original 3D multivariate structure (\(X-Y-Z\)), rather than compressing it into a bivariate relationship (\(|X-Y|-Z\)). The regression line we plotted earlier in the 2D coordinate system is now denoted by the black line.

Why RSA?

While both the absolute difference score approach and the full polynomial model using RSA supported the congruence hypothesis, RSA offers at least three advantages in examining such hypotheses:

  1. RSA can capture possible nonlinear associations beyond the linear assumption of difference scores;
  2. RSA distinguishes between the broad and strict senses of congruence, allowing for interpretations of congruence combined with linear and/or curvilinear “main” effects of the combined predictors. The difference score approach involves only a single test of whether the congruence effect is statistically significant;
  3. In cases of incongruence, the multiple hypothesis tests in the five steps of RSA can help diagnose the source of incongruence, aided by the visualization of the surface, and facilitate further exploration of the surface (e.g., the estimated stationary point as the maximum/minimum of the outcome).

Conclusion

We have demonstrated fitting, visualizing, and interpreting an RSA model in multiple packages, which either can be interpreted straightforwardly with respect to the congruence hypothesis (RSA()) or offer more flexibility in model specification (lm(), glm(), and sem()), as well as a comparison with the conventional absolute difference score approach. When applying these models, one should be careful about the model assumptions and about which specific hypotheses are being tested when referring to a “congruence” hypothesis. Have fun!

References

Box, G. E. P., & Wilson, K. B. (1951). On the Experimental Attainment of Optimum Conditions. Journal of the Royal Statistical Society: Series B (Methodological), 13(1), 1–38. https://doi.org/10.1111/j.2517-6161.1951.tb00067.x

Caldwell, D. F., & O’Reilly III, C. A. (1990). Measuring person-job fit with a profile-comparison process. Journal of Applied Psychology, 75(6), 648–657. https://doi.org/10.1037/0021-9010.75.6.648

Feng, G. C. (2017). Do difference scores make a difference on the third-person effect? Communications in Statistics - Simulation and Computation, 46(7), 5085–5104. https://doi.org/10.1080/03610918.2016.1143104

Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. Schmitt (Eds.), Measuring and analyzing behavior in organizations: Advances in measurement and data analysis (pp. 350–400). Jossey-Bass/Wiley.

Edwards, J. R., & Parry, M. E. (1993). On the use of polynomial regression equations as an alternative to difference scores in organizational research. Academy of Management Journal, 36(6), 1577–1613. https://www.jstor.org/stable/256822

Gelman, A., & Rubin, D. B. (1992). Inference from Iterative Simulation Using Multiple Sequences. Statistical Science, 7(4), 457–472. https://doi.org/10.1214/ss/1177011136

Hoffman, J. A. (1984). Psychological separation of late adolescents from their parents. Journal of Counseling Psychology, 31(2), 170–178. https://doi.org/10.1037/0022-0167.31.2.170

Humberg, S., Dufner, M., Schönbrodt, F. D., Geukes, K., Hutteman, R., Küfner, A. C. P., van Zalk, M. H. W., Denissen, J. J. A., Nestler, S., & Back, M. D. (2019). Is accurate, positive, or inflated self-perception most advantageous for psychological adjustment? A competitive test of key hypotheses. Journal of Personality and Social Psychology, 116(5), 835–859. https://doi.org/10.1037/pspp0000204

Humberg, S., Nestler, S., & Back, M. D. (2019). Response Surface Analysis in Personality and Social Psychology: Checklist and Clarifications for the Case of Congruence Hypotheses. Social Psychological and Personality Science, 10(3), 409–419. https://doi.org/10.1177/1948550618757600

Humberg, S., Schönbrodt, F. D., Back, M. D., & Nestler, S. (2022). Cubic response surface analysis: Investigating asymmetric and level-dependent congruence effects with third-order polynomial models. Psychological Methods, 27(4), 622–649. https://doi.org/10.1037/met0000352

Kenny, D. A., Kashy, D. A., & Cook, W. L. (2006). Dyadic Data Analysis. Guilford Press.

Khuri, A. I., & Cornell, J. A. (1996). Response Surfaces: Designs and Analyses. Taylor & Francis Group.

Ligges, U., & Mächler, M. (2003). scatterplot3d: An R Package for Visualizing Multivariate Data. Journal of Statistical Software, 8(11), 1–20. https://doi.org/10.18637/jss.v008.i11

Moskowitz, D. S., & Zuroff, D. C. (2005). Assessing Interpersonal Perceptions Using the Interpersonal Grid. Psychological Assessment, 17(2), 218–230. https://doi.org/10.1037/1040-3590.17.2.218

Murdoch, D., & Adler, D. (2025). rgl: 3D Visualization Using OpenGL (Version 1.3.18). https://CRAN.R-project.org/package=rgl

R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University. https://CRAN.R-project.org/package=psych

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399–419). Sage Publications, Inc.

Schönbrodt, F. D. (2016, November 25). Testing fit patterns with polynomial regression models. https://doi.org/10.31219/osf.io/ndggf

Schönbrodt, F. D., & Humberg, S. (2023). RSA: An R package for response surface analysis (Version 0.10.6). https://cran.r-project.org/package=RSA

Schönbrodt, F. D., Humberg, S., & Nestler, S. (2018). Testing Similarity Effects with Dyadic Response Surface Analysis. European Journal of Personality, 32(6), 627–641. https://doi.org/10.1002/per.2169

Sievert, C. (2020). Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC. https://doi.org/10.1201/9780429447273

Soetaert, K. (2024). plot3D: Plotting Multi-Dimensional Data (Version 1.4.1). https://CRAN.R-project.org/package=plot3D

Trinh, T. K., & Kang, L. S. (2010). Application of Response Surface Method as an Experimental Design to Optimize Coagulation Tests. Environmental Engineering Research, 15(2), 63–70.

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686