Abstract
Gender and ethnicity are increasingly studied topics within I-O psychology. They are helpful for understanding the composition of collectives, the experiences of marginalized group members, and differences in outcomes between demographic groups, as well as for capturing diversity at higher levels. However, the absence of explicit, structured demographic information online makes it challenging to apply these research questions to Big Data sources. We highlight how deep neural networks can be used to infer demographics from people's names, which are commonly found online (e.g., social media profiles, employee pages, and membership rosters). Using broad international data to train and evaluate these models, we find that validity coefficients meet minimum reliability thresholds at the individual level (r = .91 for gender, r = .80 for ethnicity), highlighting the models' ability to contextualize and facilitate Big Data research. Using empirical data extracted from databases, websites, and mobile apps, we show how these models can be applied to large organizational data sets through illustrative demonstrations of research questions that incorporate the information provided by the model. To promote broader usage, we offer an online application that infers demographics from names without requiring advanced programming knowledge.
