This file is one of a series of supplemental explanatory documents for the study “Determining optimal parameters of the Self Referent Encoding Task: A large-scale examination of self-referent cognition and depression”. Data and code are located at doi: 10.18738/T8/XK5PXX, and visual R Markdown explanations can be browsed on the paper’s GitHub Pages website.
This file loads the models created with beset (refer to the file on creating the models) and then prints summary statistics and plots. The plots are included in the paper, as are many of the summary statistics. Further information on those summary statistics, and on what can be learned from the models, is available in the beset documentation: help("summary.beset").
If you are viewing this as an HTML file and wish to see the code, please download the R Markdown file from the Texas Data Repository.
# beset provides best-subset model selection with cross-validation;
# tidyverse is used for data manipulation and plotting
library(beset); library(tidyverse)
# Load the fitted models for each sample, plus precomputed summaries
load("utmodel-all.Rdata")   # college student (UT) sample
load("mtmodel-all.Rdata")   # MTurk sample
load("adomodel-all.Rdata")  # adolescent sample
load("model_summaries.Rdata")
The “best model” was the model with the fewest predictors that still fell within one standard error of the absolute best model’s cross-validation error.
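As a minimal sketch of that selection rule (with hypothetical cv_error and cv_se vectors, not beset’s internal code):
# One-standard-error rule: fewest predictors whose cross-validation error
# falls within one SE of the minimum (hypothetical inputs)
pick_1se <- function(cv_error, cv_se) {
  best <- which.min(cv_error)                 # absolute best model size
  threshold <- cv_error[best] + cv_se[best]   # one SE above the minimum
  min(which(cv_error <= threshold))           # fewest predictors under it
}
pick_1se(cv_error = c(1.00, 0.62, 0.55, 0.54, 0.54),
         cv_se    = c(0.04, 0.03, 0.03, 0.03, 0.03))  # picks 3 predictors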
In order to make these cross-entropy errors more comparable between models, we’ve standardized them against the null model (with 0 predictors). Lower (standardized) cross-entropy error indicates a better model fit.
Note on the R-squared estimate (deviance explained):
beset also identifies the \(R_D^2\) (deviance explained) for the models. To quote from the function’s documentation, it “calculates R-squared as the fraction of deviance explained, which … generalizes to exponential family regression models. [It] also returns a predictive R-squared for how well the model predicts responses for new observations and/or a cross-validated R-squared with a bootstrapped confidence interval.” \(R_D^2\) is comparable to \(R^2\) but describes explained deviance rather than variance; higher \(R_D^2\) indicates more deviance explained and thus a better model.
The model fitting also estimates the size parameter of the negative binomial distribution, theta; these estimates are printed along with each model summary.
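For intuition, both quantities can be computed in-sample from a negative binomial fit; a minimal sketch with simulated data (beset itself reports cross-validated versions of \(R_D^2\)):
# Deviance R-squared and theta from a negative binomial fit
# (simulated data, not the study's variables)
library(MASS)
set.seed(1)
d <- data.frame(x = rnorm(300))
d$y <- rnbinom(300, mu = exp(1 + 0.5 * d$x), size = 5)
fit <- glm.nb(y ~ x, data = d)
1 - fit$deviance / fit$null.deviance  # fraction of deviance explained (R2D)
fit$theta                             # estimated negative binomial size, theta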
College student sample:
=======================================================
Best Model:
dep ~ num.neg.endorsed + v.positive + szr
16 Nearly Equivalent Models:
dep ~ num.neg.endorsed + zr.negative + v.positive
dep ~ num.neg.endorsed + numSRnegrecalled + v.positive
dep ~ num.neg.endorsed + numposrecalled + v.positive
dep ~ num.neg.endorsed + v.negative + v.positive
dep ~ num.neg.endorsed + zr.positive + v.positive
...
+ 11 more
...
Deviance Residuals:
Min 1Q Median 3Q Max
-3.8128 -0.7494 -0.0423 0.4748 2.2857
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.652969 0.074644 35.541 < 2e-16 ***
num.neg.endorsed 0.048017 0.005188 9.255 < 2e-16 ***
v.positive -0.132866 0.017212 -7.719 1.17e-14 ***
szr -0.464225 0.192406 -2.413 0.0158 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(6.1336) family taken to be 1)
Log-likelihood: -1437 on 5 Df
AIC: 2883.6
Number of Fisher Scoring iterations: 1
Train-sample R-squared = 0.45, Test-sample R-squared = 0.43
Cross-validated R-squared = 0.44, 95% CI [0.42, 0.46]
=======================================================
MTurk sample:
=======================================================
Best Model:
dep ~ num.neg.endorsed + v.negative + v.positive + st0
34 Nearly Equivalent Models:
dep ~ num.pos.endorsed + zr.negative + v.negative + st0
dep ~ zr.negative + v.negative + v.positive + st0
dep ~ numSRnegrecalled + v.negative + v.positive + st0
dep ~ num.pos.endorsed + numSRnegrecalled + v.negative + st0
dep ~ num.pos.endorsed + num.neg.endorsed + v.negative + st0
...
+ 29 more
...
Deviance Residuals:
Min 1Q Median 3Q Max
-2.74733 -0.88922 -0.09719 0.41308 2.42986
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.34900 0.20878 11.251 < 2e-16 ***
num.neg.endorsed 0.03422 0.01309 2.614 0.00896 **
v.negative 0.13464 0.06529 2.062 0.03918 *
v.positive -0.10619 0.04256 -2.495 0.01259 *
st0 1.43876 0.56658 2.539 0.01110 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(2.337) family taken to be 1)
Log-likelihood: -746.9 on 6 Df
AIC: 1505.8
Number of Fisher Scoring iterations: 1
Train-sample R-squared = 0.43, Test-sample R-squared = 0.29
Cross-validated R-squared = 0.41, 95% CI [0.4, 0.43]
=======================================================
Adolescent sample:
=======================================================
Best Model:
dep ~ zr.negative + v.negative + v.positive + a
Deviance Residuals:
Min 1Q Median 3Q Max
-2.8491 -1.1893 -0.2880 0.4967 3.1040
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.71379 0.31652 5.415 6.15e-08 ***
zr.negative 1.97101 0.46791 4.212 2.53e-05 ***
v.negative 0.43193 0.06897 6.263 3.78e-10 ***
v.positive -0.24173 0.06231 -3.879 0.000105 ***
a -0.52008 0.16460 -3.160 0.001580 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(3.4068) family taken to be 1)
Log-likelihood: -384.7 on 6 Df
AIC: 781.39
Number of Fisher Scoring iterations: 1
Train-sample R-squared = 0.43, Test-sample R-squared = 0.41
Cross-validated R-squared = 0.4, 95% CI [0.36, 0.43]
=======================================================
We also wrote functions (the code for which can be found in the full .Rmd document) to compare models built from every possible pair of predictors. Plotting the results in a grid, much like a correlation table, lets us note specific patterns in our models and then pose questions based on those patterns.
Thus, the following images plot, per sample, the \(R_D^2\) for every two-predictor model. (Note that these are not the best models within 1 SE; those models have more dimensions and are harder to visualize.) These plots, stitched into one figure, are included in the paper.
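A minimal sketch of that pairwise approach, assuming a data frame dat containing the outcome dep and the candidate predictor names in a character vector preds (the study’s actual functions are in the full .Rmd):
# Fit a negative binomial model for every pair of predictors and store
# each model's deviance R-squared in a symmetric matrix
pairwise_r2d <- function(preds, dat) {
  m <- matrix(NA_real_, length(preds), length(preds),
              dimnames = list(preds, preds))
  for (i in seq_along(preds)) {
    for (j in seq_len(i - 1)) {
      fit <- MASS::glm.nb(reformulate(c(preds[i], preds[j]), "dep"),
                          data = dat)
      m[i, j] <- m[j, i] <- 1 - fit$deviance / fit$null.deviance
    }
  }
  m  # the diagonal stays NA; plot the matrix like a correlation table
}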
In order to compare the consistency of the predictors chosen by the best models, we compared \(R_D^2\) across samples for the best models described above. Because \(R_D^2\) is affected by the number of parameters, we compared the best four-predictor models (four being the minimum required for the MTurk and adolescent samples) across all three samples; for the college students, the best four-predictor model included the three predictors described above plus the relative starting point (\(zr\)) for negative words.
For the sample [college students] using the model parameters from the [MTurkers] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ num.neg.endorsed + v.negative + v.positive + st0
is the 56th best model, at the 98th percentile.
The model has a deviance R-squared of 0.446, which is 0.009 less than the best model and 0.445 better than the worst model.
For the sample [college students] using the model parameters from the [Adolescents] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ zr.negative + v.negative + v.positive + a
is the 140th best model, at the 95th percentile.
The model has a deviance R-squared of 0.44, which is 0.015 less than the best model and 0.439 better than the worst model.
For the sample [MTurkers] using the model parameters from the [college students] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ num.neg.endorsed + zr.negative + v.positive + szr
is the 242nd best model, at the 92nd percentile.
The model has a deviance R-squared of 0.405, which is 0.023 less than the best model and 0.401 better than the worst model.
For the sample [MTurkers] using the model parameters from the [Adolescents] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ zr.negative + v.negative + v.positive + a
is the 21st best model, at the 99th percentile.
The model has a deviance R-squared of 0.418, which is 0.009 less than the best model and 0.415 better than the worst model.
For the sample [adolescents] using the model parameters from the [college students] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ num.neg.endorsed + zr.negative + v.positive + szr
is the 51st best model, at the 98th percentile.
The model has a deviance R-squared of 0.38, which is 0.047 less than the best model and 0.376 better than the worst model.
For the sample [adolescents] using the model parameters from the [MTurkers] model:
Compared to other models with 4 predictors (j = 3060 options), the model with formula
--> dep ~ num.neg.endorsed + v.negative + v.positive + st0
is the 80th best model, at the 97th percentile.
The model has a deviance R-squared of 0.378, which is 0.048 less than the best model and 0.374 better than the worst model.
That these models explained at worst 0.05 less deviance than each sample’s own best model indicates a strong degree of consistency. (For comparison, the worst four-predictor models had \(R_D^2\) of less than 0.001.)
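The rankings above can be reproduced with logic along these lines; a hedged sketch, not the study’s exact function (choose(18, 4) = 3060 implies 18 candidate predictors):
# Rank one four-predictor formula's deviance R-squared among all
# four-predictor negative binomial models
rank_formula <- function(target_preds, preds, dat) {
  combos <- combn(preds, 4, simplify = FALSE)
  r2ds <- vapply(combos, function(p) {
    fit <- MASS::glm.nb(reformulate(p, "dep"), data = dat)
    1 - fit$deviance / fit$null.deviance
  }, numeric(1))
  target <- r2ds[vapply(combos, function(p) setequal(p, target_preds),
                        logical(1))]
  list(rank       = sum(r2ds > target) + 1,           # 1 = best model
       percentile = round(100 * mean(r2ds <= target)),
       vs_best    = max(r2ds) - target,               # gap below the best
       vs_worst   = target - min(r2ds))               # gap above the worst
}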
Based on trends highlighted in the plots of the models, we ran small, specific comparisons within each sample. For example, we hypothesized that endorsements of positive and negative words alone were substantially better at predicting depression symptoms than so-called negative/positive processing biases (e.g., the ratio of the number of negative words endorsed to the total number of words endorsed). We tested these comparisons using cross-validated \(R_D^2\) values calculated with the r2d() function from the beset package.
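As an illustration of one such comparison, a minimal in-sample sketch (the bias-ratio construction here is hypothetical; the paper’s comparisons used cross-validated \(R_D^2\) from r2d()):
# Raw endorsement counts vs. a negative processing-bias ratio
dat$neg_bias <- dat$num.neg.endorsed /
  (dat$num.neg.endorsed + dat$num.pos.endorsed)  # illustrative bias ratio
fit_counts <- MASS::glm.nb(dep ~ num.neg.endorsed + num.pos.endorsed,
                           data = dat)
fit_bias   <- MASS::glm.nb(dep ~ neg_bias, data = dat)
1 - fit_counts$deviance / fit_counts$null.deviance  # R2D, counts model
1 - fit_bias$deviance / fit_bias$null.deviance      # R2D, bias-ratio model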
These comparisons can be seen in the file on comparing specific models.