This file is one of a series of supplemental explanatory documents for the study “Determining optimal parameters of the Self Referent Encoding Task: A large-scale examination of self-referent cognition and depression”. Data and code are located at doi: 10.18738/T8/XK5PXX, and websites with visual R Markdown explanations are available on the paper’s GitHub Pages website.
This file plots theoretical data distributions based on R’s dnorm() and dnbinom() functions on top of the observed data for the three samples. If you are viewing this as an HTML file and wish to see the code, please download the R Markdown file from the Texas Data Repository.
The plots generated herein are made using ggplot2 and compare each sample’s observed data to the curves that theoretical Negative Binomial (in red) and Normal (in blue) density distributions with the characteristics of that sample would follow. These plots are intended to illustrate why a negative binomial distribution was used for analyses. The tip of the negative binomial curve closest to 0 for the adolescent sample (sample 3) has been interpolated using the built-in R function spline() to smooth the curve. Note that this sample has a different measure (CDI:S rather than CES-D) and therefore a different scale (scores on the CDI:S range from 0 to 20; on the CES-D, from 0 to 60). The plots extend below 0 to show that the Gaussian distribution extends beyond the range of possible scores; the negative binomial distribution does not include negative numbers.
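The point about impossible scores can be checked numerically. The sketch below is illustrative only: the mean and standard deviation are hypothetical values chosen for the example, not statistics from any of the three samples, and matching the negative binomial by the method of moments is an assumption of this sketch rather than the procedure used in the study.

```r
# Hypothetical summary statistics for a 0-60 scale (NOT taken from the study).
sample_mean <- 12
sample_sd   <- 10

# Share of a Normal(mean, sd) distribution lying below 0, i.e. on scores
# that the questionnaire cannot produce.
mass_below_zero <- pnorm(0, mean = sample_mean, sd = sample_sd)

# Negative binomial with the same mean and variance (mu/size form):
# variance = mu + mu^2 / size, so size = mu^2 / (variance - mu).
# This requires variance > mean (overdispersion).
nb_size <- sample_mean^2 / (sample_sd^2 - sample_mean)

# The negative binomial is defined on 0, 1, 2, ..., so its probability
# at a negative score is exactly 0, while the Normal density is not.
nb_at_minus_five   <- dnbinom(-5, size = nb_size, mu = sample_mean)
norm_at_minus_five <- dnorm(-5, mean = sample_mean, sd = sample_sd)

print(c(mass_below_zero    = mass_below_zero,
        nb_at_minus_five   = nb_at_minus_five,
        norm_at_minus_five = norm_at_minus_five))
```

With these made-up values, a noticeable share of the Gaussian curve falls below 0, whereas the negative binomial assigns no probability there.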
Create theoretical distribution lines to compare against the actual distributions.
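A minimal sketch of how such lines could be built is given here. It is not the study’s code (which is available from the Texas Data Repository): the scores are simulated, the object names (`scores`, `normal_curve`, `nb_points`, `nb_smooth`) are hypothetical, and the method-of-moments parameters are an assumption of this example. It uses dnorm() for the Normal curve and dnbinom(), base R’s negative binomial density, for the red curve, with spline() interpolating between the integer points.

```r
# Simulated stand-in for observed scores; the study data are not used here.
set.seed(1)
scores <- rnbinom(500, size = 1.5, mu = 12)

# Method-of-moments parameters (an assumption of this sketch).
m <- mean(scores)
v <- var(scores)
nb_size <- m^2 / (v - m)   # requires v > m (overdispersion)

# Normal curve on a fine grid that deliberately extends below 0.
normal_curve <- data.frame(x = seq(-20, 60, by = 0.5))
normal_curve$density <- dnorm(normal_curve$x, mean = m, sd = sqrt(v))

# Negative binomial probabilities exist only at non-negative integers.
nb_points <- data.frame(x = 0:60)
nb_points$density <- dnbinom(nb_points$x, size = nb_size, mu = m)

# Interpolate between the integer points with spline() for a smoother line.
nb_smooth <- as.data.frame(spline(nb_points$x, nb_points$density, n = 241))
names(nb_smooth) <- c("x", "density")
# Cubic interpolation can dip slightly below 0; clamp those values.
nb_smooth$density <- pmax(nb_smooth$density, 0)

# Overlay both curves on a density-scaled histogram, if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(score = scores), aes(x = score)) +
    geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                   fill = "grey80", colour = "white") +
    geom_line(data = normal_curve, aes(x = x, y = density), colour = "blue") +
    geom_line(data = nb_smooth, aes(x = x, y = density), colour = "red")
  print(p)
}
```

The histogram is scaled to density so that it shares a y-axis with the two theoretical curves; the Normal grid starts at -20 to make its extension below 0 visible.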
## Warning: Removed 40 rows containing missing values (geom_path).