Abstract
For more than a century, self-report inventories have been the traditional method for assessing vocational interests. Little research has examined the use of machine learning techniques, such as natural language processing (NLP), in interest assessment. This paper explores the extent to which natural language on social media can be used to predict individuals’ self-ratings on eight basic interests representing the SETPOINT model: Agriculture, Engineering, Human Resources, Life Science, Management/Administration, Mechanics/Electronics, Media, and Social Science. Leveraging closed- (Linguistic Inquiry and Word Count; LIWC) and open-vocabulary NLP approaches (Latent Dirichlet allocation [LDA] topic modeling), we analyzed 3.2 million Facebook posts from 2,834 participants who completed a 32-item basic interest measure. We found that the convergent validities of these NLP approaches for predicting vocational interest scores (LIWC:
