Abstract
We propose, illustrate, and evaluate the use of artificial intelligence (AI) to advance rigorous hypothesis-driven scale validation. Using a qualitative approach, we found that AI provided useful suggestions for measures to be used as criteria in scale validation research. Using data and expert predictions previously used to validate nine scales/subscales, we evaluated AI’s ability to produce precise, psychologically reasonable validity hypotheses. ChatGPT and Gemini produced hypotheses with “inter-trial consistency” similar to experts’ “inter-rater consistency,” and their hypotheses agreed strongly with experts’ hypotheses. Importantly, their hypothesized validity correlations corresponded with actual validity correlations roughly as accurately as experts’ hypotheses did. Replicating across nine scales/subscales, the results are encouraging regarding the use of AI to facilitate a precise hypothesis-driven approach to convergent and discriminant validity in a way that saves time at little-to-no cost in psychological or psychometric quality.