Abstract
While natural language documents, such as intervention transcripts and participant writing samples, can provide highly nuanced insights into educational and psychological constructs, researchers often find these materials difficult and expensive to analyze. Recent developments in machine learning, however, have allowed social scientists to harness the power of artificial intelligence for complex data categorization tasks. One approach, supervised learning, supports high-performance categorization yet still requires a large, hand-labeled training corpus, which can be costly to produce. A second approach—zero- and few-shot classification with pretrained large language models—offers a cheaper yet compelling option. This article considers the application of zero-shot and few-shot classification in educational research. We provide an overview of large language models, a step-by-step tutorial on using the Python openai package for zero-shot and few-shot classification, and a discussion of relevant research considerations for social scientists.
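To make the zero-shot setup concrete, the sketch below builds a classification prompt with the Python openai package's chat interface. The label set, example text, model name, and helper function are illustrative assumptions, not the article's own materials; running the commented API call also requires an OpenAI API key.

```python
# Minimal zero-shot classification sketch using the openai package.
# The labels and example text are hypothetical placeholders.

LABELS = ["growth mindset", "fixed mindset", "neither"]

def build_messages(text, labels):
    """Build a chat prompt asking the model to assign exactly one label."""
    system = (
        "You are a text classifier. Reply with exactly one of these labels: "
        + ", ".join(labels)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": text},
    ]

# The actual call needs network access and the OPENAI_API_KEY variable set:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o-mini",  # model name is an assumption
#     messages=build_messages("I get smarter when I keep practicing.", LABELS),
# )
# print(response.choices[0].message.content)
```

A few-shot variant would simply insert labeled example pairs as additional user/assistant turns before the final text to classify.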