OK, sorry, I haven’t been updating my blog lately because I’ve been working on my upcoming anthology (an excellent 2023 Christmas gift), so I’ve been falling behind. For my upcoming book, I think I’ve dug up most of my most significant 3,000-word “Cousin Marriage Conundrum”–style essays. But what about my best one-sentence or one-paragraph quotes? Your suggestions are more than welcome.
From MANKIND QUARTERLY:
Emil O.W. Kirkegaard, Ulster Institute of Social Research, London, UK
John G.R. Fuerst*, Cleveland State University, USA; University of Maryland Global Campus
*Corresponding author: firstname.lastname@example.org
We used data from the Adolescent Brain Cognitive Development Study to create a multimodal MRI-based predictor of intelligence. We applied the elastic net algorithm to over 50,000 neurological variables. We find that race can confound models when a multiracial training sample is used, because models learn to predict race and use race to predict intelligence. When the model is trained on non-Hispanic Whites only, the MRI-based predictor has an out-of-sample model accuracy of r = .51, which is 3 to 4 times greater than the validity of whole brain volume in this dataset. This validity generalized across the major socially-defined racial/ethnic groupings (White, Black, and Hispanic). There are race gaps on the predicted scores, even though the model is trained on White subjects only. This predictor explains about 37% of the relation between both the Black and Hispanic classification and intelligence.
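To make the abstract's two technical ideas concrete (an elastic net fit, and accuracy measured as an out-of-sample correlation), here is a minimal sketch in plain Python. It is not the authors' pipeline: the data are synthetic, the 20 features stand in for the 50,000+ MRI variables, and the function names, `alpha`, `l1_ratio`, and the coordinate-descent solver are all illustrative assumptions.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for the objective
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    No intercept is fitted, so X and y are assumed to be roughly centered."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that ignores feature j
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic stand-in data: 3 informative features, 17 pure-noise features.
random.seed(0)
TRUE_W = [1.0, -0.8, 0.5] + [0.0] * 17


def make_data(n):
    X = [[random.gauss(0, 1) for _ in TRUE_W] for _ in range(n)]
    y = [sum(w * x for w, x in zip(TRUE_W, row)) + random.gauss(0, 1) for row in X]
    return X, y


X_train, y_train = make_data(200)  # used for fitting only
X_test, y_test = make_data(200)    # never seen during fitting

weights = fit_elastic_net(X_train, y_train, alpha=0.1, l1_ratio=0.5)
predicted = [sum(w * x for w, x in zip(weights, row)) for row in X_test]
holdout_r = pearson_r(predicted, y_test)
print(f"hold-out r = {holdout_r:.2f}")
```

The key design point is that `holdout_r` is computed only on rows the model never saw, which is what "out-of-sample model accuracy" refers to. Training on a single group and then scoring other groups, as the abstract describes, would amount to choosing which rows go into `X_train` and which into `X_test`.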
Whole brain volume, as measured by Magnetic Resonance Imaging (MRI), and intelligence are known to correlate (Haier, 2016; Pietschnig et al., 2022). This is as expected based on what is known about the evolution of human brain size and intelligence (DeSilva et al., 2021). However, it is also known that whole brain volume is only a small part of the explanation for human differences in intelligence. For instance, a large study of British adults using Structural Equation Modeling (SEM) found a latent correlation of about r = .28 between whole brain volume and intelligence (Cox et al., 2019). As this result is based on SEM, it avoids random measurement error, a major confounding factor in previous analyses of the relations between brain size and intelligence (Gignac & Bates, 2017). Taking this value as the current best estimate and using the usual path squaring rule, this equates to about 8% of the variance in intelligence explained by whole brain volume alone. Multi-modal and fine-grained analyses are needed to better explain intelligence differences (Kievit et al., 2012).

We searched for the best published multivariable model for intelligence. One of the best predictive models so far published was able to achieve a correlation of r = .67 using functional MRI data in the Human Connectome Project dataset (Sripada et al., 2018). However, this study has still not been formally published after four years. There are two additional studies claiming very high accuracies. First, Wang et al. (2015) used a small sample of children (N = 164) and found a cross-validated correlation of r = .72. Given overfitting issues with small samples, and the inability of other research teams to achieve such results in larger samples, it may not replicate. Second, a team of Chinese researchers reported cross-validated correlations of r = .45 and r = .51 across multiple smaller samples in two studies (Jiang et al., 2020a,b).
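As a quick arithmetic check on the path squaring rule used earlier in this passage, the sketch below squares the correlations quoted in the text. The function name is illustrative, and the only inputs are r values already cited above.

```python
def variance_explained(r):
    """Share of variance accounted for by a correlation r (the squaring rule)."""
    return r ** 2


quoted_correlations = [
    ("whole brain volume, latent (Cox et al., 2019)", 0.28),
    ("MRI-based predictor reported in the abstract", 0.51),
]
for label, r in quoted_correlations:
    print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.3f}")
```

Because the rule is quadratic, modest gains in r translate into disproportionately larger gains in variance explained.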
In contrast to these results, teams who took part in the 2019 ABCD Neurocognitive Prediction Challenge to build a structural-MRI predictor of fluid intelligence achieved much weaker results: r’s = .15, .14, .28, .21, .11, and .19 (Brueggeman et al., 2019; Li et al., 2019; Ren et al., 2019; Valverde et al., 2019; Wlaszczyk et al., 2019; Zhang-James et al., 2019). These results were based on a proper hold-out sample that the researchers did not have access to while designing and training their models. As such, it seems like overfitting is a serious concern in this literature. Clearly much more research is needed.

In addition to intelligence, brain volume is also known to correlate with racial or ancestry group membership (Rushton & Ankney, 2009). However, it is not known to what extent race is related to multimodal MRI-predictors of intelligence. Additionally, it is not known if multimodal MRI-predictors of intelligence are equally predictive across racial groups. As it is, some research reports that the validity of specific MRI variables differs across socially-identified races (e.g., Zahodne et al., 2015). So, a multimodal MRI-predictor trained in one group may not work as well in another. Furthermore, we wanted to move beyond whole brain volume towards a machine learning (ML) algorithm trained for optimal predictive validity of intelligence.

MANKIND QUARTERLY 2023 63:3

In this paper, we apply machine learning to a rich set of MRI variables to create MRI-based predictors of intelligence. We then examine the relationship between the MRI-based predictors, intelligence, and socially-identified race. Based on past research, we make the following predictions:
1. Predictors based on all available MRI modalities will achieve validities for intelligence that are significantly higher than that for brain volumes alone.
2. MRI-based predictors will show race differences, and these MRI-based differences will statistically mediate the association between race and intelligence.
3. Validities for multimodal predictors, as for brain volume (Farias et al., 2012; Jensen & Johnson, 1994), will generalize across races, underscoring the neuropsychological unity of mankind.
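The second prediction turns on statistical mediation. This excerpt does not spell out how the authors computed it, so the sketch below shows one common approach, the difference-in-coefficients method, purely as an assumption: compare the group coefficient from a regression of the outcome on group alone (total effect) with the group coefficient once the mediator is added (direct effect). The numbers are made-up toy values, not ABCD data, and every name is illustrative.

```python
def centered(values):
    """Subtract the mean so regressions can be written without an intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def proportion_mediated(group, mediator, outcome):
    """Return (total, direct, share), where share = 1 - direct / total."""
    g, m, y = centered(group), centered(mediator), centered(outcome)
    s_gg, s_mm, s_gm = dot(g, g), dot(m, m), dot(g, m)
    s_gy, s_my = dot(g, y), dot(m, y)
    # Total effect: slope from regressing the outcome on group alone.
    total = s_gy / s_gg
    # Direct effect: group slope in the two-predictor regression
    # (closed-form solution of the 2x2 normal equations).
    det = s_gg * s_mm - s_gm ** 2
    direct = (s_gy * s_mm - s_my * s_gm) / det
    return total, direct, 1 - direct / total


# Toy data built so the answer is known: outcome = -6*group + 4*mediator,
# and the mediator averages 1 point lower in group 1.
group = [0, 0, 0, 0, 1, 1, 1, 1]
mri_score = [1, 2, 3, 4, 0, 1, 2, 3]
outcome = [-6 * g + 4 * m for g, m in zip(group, mri_score)]

total, direct, share = proportion_mediated(group, mri_score, outcome)
print(f"total = {total:.1f}, direct = {direct:.1f}, share mediated = {share:.2f}")
```

In this toy setup the total gap is -10, of which -6 remains after adjusting for the mediator, so 40% of the association runs through the mediator. A figure like the abstract's "about 37%" is a statement of this general kind, although the authors' exact estimator may differ.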
Data & Methods

Data source

We used the 3.01 dataset of the ABCD study (Adolescent Brain Cognitive Development; https://nda.nih.gov/abcd). This is a publicly funded nationally representative sample of nearly 12,000 American children who were first interviewed at age 9-10. It contains high-quality microarray genomic data, multi-modal MRI data, psychological data, as well as a wide range of surveys (Casey et al., 2018; Hagler et al., 2019; Jernigan et al., 2018). Details about sampling are provided by Garavan et al. (2018). The sample is 52.2% male and 47.8% female.

We used a set of binary questions to determine socially-identified race/ethnicity (from now on: “social race”). To be clear, we call this variable social race to distinguish it from morphologically or genetically-identified race or ancestry (e.g., Fuerst, 2015; Sesardic, 2010). In this case, each child’s primary guardian, most often the mother, was asked whether the child belonged to a series of racial groups (“What race do you consider the child to be? Please check all that apply.”): White/European American, Black/African American, American Indian, Alaska Native, Native Hawaiian, Guamanian, Samoan, Other Pacific Islander, Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Other Asian, and Other. The primary guardian was also asked if the child was of Hispanic ethnicity (i.e., Spanish cultural origin). These questions are based on the 1997 Office of Management and Budget guidelines, according to which race may be thought of in terms of ancestry in addition to culture and explicitly refers to the geographic origin of one’s ancestors (Office of Management and Budget, 1997).

While one can code the data in many ways, we were interested in relatively large groupings for our models.
As such, we coded the data so we have categories for non-Hispanic Whites and non-Hispanic Blacks, an inclusive category for Hispanics (with any racial combination), and an Other category, which contains everybody else, including individuals who were identified as belonging to multiple racial categories.
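The coding rule just described can be written as a small function. This is a sketch of the rule as stated in the text, not the authors' code: the function name, the use of a set of checked labels, and the boolean `hispanic` flag are all assumptions about how the questionnaire responses might be represented.

```python
WHITE = "White/European American"
BLACK = "Black/African American"


def code_social_race(checked_races, hispanic):
    """Map questionnaire responses to the four analysis categories.

    checked_races: set of race labels the guardian ticked (hypothetical format)
    hispanic: True if the child was reported to be of Hispanic ethnicity
    """
    if hispanic:
        return "Hispanic"  # inclusive: any racial combination
    if checked_races == {WHITE}:
        return "White"     # non-Hispanic, White only
    if checked_races == {BLACK}:
        return "Black"     # non-Hispanic, Black only
    return "Other"         # everyone else, including multiple categories


examples = [
    ({WHITE}, False),
    ({BLACK}, False),
    ({BLACK}, True),
    ({WHITE, BLACK}, False),
    ({"Chinese"}, False),
]
for races, hispanic in examples:
    print(sorted(races), hispanic, "->", code_social_race(races, hispanic))
```

Checking the Hispanic flag first is what makes that category inclusive, and comparing against a one-element set is what sends multi-category responses to "Other".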
Table 1 shows the sample composition in terms of social race.

Table 1. Sample composition and mean intelligence by social race. SD = standard deviation. The total sample mean IQ was 95.4 with an SD of 16.9.

Group     Count  Percent  IQ Mean ± SD
White     6179   52.0     100.0 ± 15.0
Hispanic  2411   20.3     90.6 ± 15.9
Black     1782   15.0     81.2 ± 16.4
Other     1454   12.2     97.3 ± 17.0