This script reproduces all of the analysis and graphs for the MMRA of the Rohwer data in the paper and also includes other analyses not described there. It is set up as an R script that can be “compiled” to HTML, Word, or PDF using knitr::knit(). This is most convenient within R Studio via the File -> Compile Notebook option.

Load packages and the data (subsetted)

library(car)
library(heplots)

data("Rohwer", package="heplots")
Rohwer2 <- subset(Rohwer, subset=group==2)
rownames(Rohwer2)<- 1:nrow(Rohwer2) # apply rownames

fit the multivariate regression model, test hypotheses

Here, we use linearHypothesis to test the overall null, B = 0

rohwer.mod <- lm(cbind(SAT, PPVT, Raven) ~ n + s + ns + na + ss, data=Rohwer2)
Anova(rohwer.mod)
## 
## Type II MANOVA Tests: Pillai test statistic
##    Df test stat approx F num Df den Df Pr(>F)   
## n   1     0.202     2.02      3     24 0.1376   
## s   1     0.310     3.59      3     24 0.0284 * 
## ns  1     0.358     4.46      3     24 0.0126 * 
## na  1     0.465     6.96      3     24 0.0016 **
## ss  1     0.089     0.78      3     24 0.5173   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(linearHypothesis(rohwer.mod, 
                       c("n", "s", "ns", "na", "ss")), SSP=FALSE)
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df   Pr(>F)    
## Pillai            5    1.0386    2.753     15  78.00 0.001912 ** 
## Wilks             5    0.2431    2.974     15  66.65 0.001154 ** 
## Hotelling-Lawley  5    2.0615    3.115     15  68.00 0.000697 ***
## Roy               5    1.4654    7.620      5  26.00 0.000160 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

View univariate coeffcients:

(rohwer.coef <- coef(rohwer.mod)[-1,])
##        SAT     PPVT    Raven
## n   3.2571  0.06728  0.05935
## s   2.9966  0.36998  0.49244
## ns -5.8591 -0.37438 -0.16402
## na  5.6662  1.52301  0.11898
## ss -0.6227  0.41016 -0.12116

Alternatively, we can view all coefficients and significance values using the tidy() function in the broom package

library(broom)
tidy(rohwer.mod)
##    response        term  estimate std.error statistic   p.value
## 1       SAT (Intercept) -28.46747   25.7190   -1.1069 2.785e-01
## 2       SAT           n   3.25713    1.2958    2.5136 1.848e-02
## 3       SAT           s   2.99658    1.5000    1.9977 5.631e-02
## 4       SAT          ns  -5.85906    1.5450   -3.7924 8.016e-04
## 5       SAT          na   5.66622    1.3384    4.2335 2.537e-04
## 6       SAT          ss  -0.62265    1.1406   -0.5459 5.898e-01
## 7      PPVT (Intercept)  39.69709   12.2687    3.2356 3.298e-03
## 8      PPVT           n   0.06728    0.6181    0.1088 9.142e-01
## 9      PPVT           s   0.36998    0.7155    0.5171 6.095e-01
## 10     PPVT          ns  -0.37438    0.7370   -0.5080 6.157e-01
## 11     PPVT          na   1.52301    0.6385    2.3854 2.464e-02
## 12     PPVT          ss   0.41016    0.5441    0.7538 4.577e-01
## 13    Raven (Intercept)  13.24384    2.6143    5.0660 2.824e-05
## 14    Raven           n   0.05935    0.1317    0.4506 6.560e-01
## 15    Raven           s   0.49244    0.1525    3.2298 3.346e-03
## 16    Raven          ns  -0.16402    0.1570   -1.0445 3.059e-01
## 17    Raven          na   0.11898    0.1360    0.8746 3.898e-01
## 18    Raven          ss  -0.12116    0.1159   -1.0450 3.056e-01

Fit separate univariate models models to prepare Table 1 in the paper

For a publication-quality summary table of the coefficients, it is most convenient to fit the separate models for each response.

rohwer.mod1 <- lm(SAT   ~ n + s + ns + na + ss, data = Rohwer2)
rohwer.mod2 <- lm(PPVT  ~ n + s + ns + na + ss, data = Rohwer2)
rohwer.mod3 <- lm(Raven ~ n + s + ns + na + ss, data = Rohwer2)

summary table for univariate models:

The stargazer package creates nice tables in HTML, LaTeX or just ASCII text. There are many, many options.

library(stargazer, quietly=TRUE)
stargazer(rohwer.mod1, rohwer.mod2, rohwer.mod3,
          type = "text", digits=2,
          omit.table.layout = "#",
          star.cutoffs = c(0.05, 0.01, 0.001),
          omit = "Constant",
          title="Univariate regression models for Rohwer data")
## 
## Univariate regression models for Rohwer data
## =============================================================
##                                     Dependent variable:      
##                               -------------------------------
##                                   SAT       PPVT      Raven  
## -------------------------------------------------------------
## n                                3.26*      0.07      0.06   
##                                 (1.30)     (0.62)    (0.13)  
##                                                              
## s                                3.00       0.37     0.49**  
##                                 (1.50)     (0.72)    (0.15)  
##                                                              
## ns                             -5.86***     -0.37     -0.16  
##                                 (1.54)     (0.74)    (0.16)  
##                                                              
## na                              5.67***     1.52*     0.12   
##                                 (1.34)     (0.64)    (0.14)  
##                                                              
## ss                               -0.62      0.41      -0.12  
##                                 (1.14)     (0.54)    (0.12)  
##                                                              
## -------------------------------------------------------------
## Observations                      32         32        32    
## R2                               0.56       0.35      0.31   
## Adjusted R2                      0.47       0.23      0.18   
## Residual Std. Error (df = 26)    25.67      12.25     2.61   
## F Statistic (df = 5; 26)        6.54***     2.85*     2.32   
## =============================================================
## Note:                           *p<0.05; **p<0.01; ***p<0.001

Visualize the MMRA model

Set colors to use in the following plots. They correspond to the E ellipse and the H ellipses for all hypotheses.

cols <-  c("red", "blue", "black", "darkgreen", "darkcyan", 
           "magenta", "gray20")

HE plot, with ellipse to test all 5 regressors

hyp <- list("Regr" = c("n", "s", "ns", "na", "ss"))
heplot(rohwer.mod, 
       hypotheses = hyp,
       fill=TRUE, fill.alpha=0.1, cex=1.5, cex.lab=1.4, col=cols,
       lwd=c(1,3))

View all pairs of HE plots

pairs(rohwer.mod, hypotheses=hyp, col=cols, var.cex=4,
      fill=TRUE, fill.alpha=0.1, cex=1.5)

generate a bivariate coefficient plot

This plot shows bivariate confidence regions for the each of the coefficients. Those that don’t overlap (0,0) are shaded.

coefplot(rohwer.mod, lwd=2, fill=(1:5) %in% 3:4, 
         main="Bivariate 95% coefficient plot for SAT and PPVT", level=0.95)

analyze using canonical correlations:

Canonical correlation analysis finds the linear combinations of the Y variables most highly correlated with linear combinations of the _X_s. Only one canonical correlation is significant here, and accounts for 73% of the shared variance between the predictors and responses, but 100% is accounted for in two dimensions.

library(candisc)
X <- as.matrix(Rohwer[1:37, 6:10]) # Extract X variables for High SES students
Y <- as.matrix(Rohwer[1:37, 3:5])  # Extract Y variables for High SES students
cc <- cancor(X, Y, set.names=c("PA", "Ability"))
cc
## 
## Canonical correlation analysis of:
##   5   PA  variables:  n, s, ns, na, ss 
##   with    3   Ability  variables:  SAT, PPVT, Raven 
## 
##     CanR  CanRSQ   Eigen percent    cum                          scree
## 1 0.7165 0.51341 1.05512  72.829  72.83 ******************************
## 2 0.4906 0.24071 0.31702  21.882  94.71 *********                     
## 3 0.2668 0.07117 0.07662   5.289 100.00 **                            
## 
## Test of H0: The canonical correlations in the 
## current row and all that follow are zero
## 
##    CanR WilksL    F df1  df2 p.value
## 1 0.717  0.343 2.54  15 80.5   0.004
## 2 0.491  0.705 1.43   8 60.0   0.203
## 3 0.267  0.929 0.79   3 31.0   0.508

visualize canonical correlation output via HE plot

This HE plot shows the representation of the regression relations in the space of the first two canonical variates for the Y variables, Ycan1 and Ycan2. The X predictors are shown by their H ellipses in this space. The relationship of the Y response variables to their canonical scores is shown by the arrows, reflecting their correlations (structure coefficients) with the canonical variables.

Note we can also visualize the relationship of joint hypothesis to individual ones in canonical space.

heplot(cc, hypotheses=list("na+ns"=c("na", "ns")),
       fill = TRUE, fill.alpha=0.1, 
       label.pos = c(3, rep(1,5), .1),
       cex=1.4, var.cex=1.25, var.lwd=3, var.col="black")

## Vector scale factor set to  1