Abstract
Performance appraisal narratives are qualitative descriptions of employee job performance. This data source has seen increased research attention because natural language processing (NLP) allows insights to be derived from it efficiently. The current study details the development of NLP scoring for performance dimensions from narrative text and then investigates validity and generalizability evidence for those scores. Specifically, narrative valence scores were created to measure a priori performance dimensions. These scores were derived using bag-of-words and word-embedding features and then modeled using modern prediction algorithms. Construct validity evidence was investigated across three samples, revealing that the scores converged with independent human ratings of the text, aligned with numerical performance ratings made during the appraisal, and demonstrated some degree of discriminant validity. However, construct validity evidence differed depending on which NLP algorithm was used to derive the scores. In addition, valence scores generalized to both downward and upward rating contexts. Finally, the performance valence algorithms generalized better in contexts that used the same qualitative survey design than in contexts where different instructions were given to elicit the narrative text.
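The kind of pipeline the abstract describes, in which bag-of-words features from narrative text are modeled against numerical performance ratings, can be sketched as follows. This is a minimal, hypothetical illustration: the narratives, vocabulary, and the plain least-squares model are stand-ins, not the study's actual data, features (which also include word embeddings), or prediction algorithms.

```python
# Hypothetical sketch: bag-of-words features from appraisal narratives,
# fit with a simple linear model against numerical ratings. All data and
# names are illustrative; the study's actual pipeline also used word
# embeddings and modern prediction algorithms.
from collections import Counter

def bag_of_words(text, vocab):
    """Map a narrative to a count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

# Toy training data: (narrative, numerical performance rating) pairs.
narratives = [
    ("consistently exceeds goals and supports the team", 5.0),
    ("meets expectations but needs clearer communication", 3.0),
    ("missed deadlines and needs improvement", 1.5),
]
vocab = sorted({w for text, _ in narratives for w in text.lower().split()})
X = [bag_of_words(text, vocab) for text, _ in narratives]
y = [rating for _, rating in narratives]

# Plain least-squares fit via stochastic gradient descent (a stand-in
# for the prediction algorithms used in the study).
w = [0.0] * len(vocab)
lr = 0.01
for _ in range(2000):
    for xi, yi in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
        w = [wj - lr * err * xj for wj, xj in zip(w, xi)]

def valence_score(text):
    """Predicted valence for a new narrative under the fitted model."""
    x = bag_of_words(text, vocab)
    return sum(wj * xj for wj, xj in zip(w, x))
```

In practice each performance dimension would get its own trained model, and validity would be assessed by comparing `valence_score` outputs against independent human ratings, as the abstract describes.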