Large Language Models for Text Classification: From Zero-Shot Learning to Instruction-Tuning

Abstract

Large language models (LLMs) have tremendous potential for social science research as they are trained on vast amounts of text and can generalize to many tasks. We explore the use of LLMs for supervised text classification, specifically the application to stance detection, which involves detecting attitudes and opinions in texts. We examine the performance of these models across different architectures, training regimes, and task specifications. We compare 10 models ranging in size from tens of millions to hundreds of billions of parameters and test four distinct training regimes: prompt-based zero-shot learning, prompt-based few-shot learning, fine-tuning, and instruction-tuning, which combines prompting and fine-tuning. The largest, most powerful models generally offer the best predictive performance even with few or no training examples, but fine-tuning smaller models is a competitive solution due to their relatively high accuracy and low cost. Instruction-tuning the latest generative LLMs expands the scope of text classification, enabling applications to more complex tasks than previously feasible. We offer practical recommendations on the use of LLMs for text classification in sociological research and discuss their limitations and challenges. Ultimately, LLMs can make text classification and other text analysis methods more accurate, accessible, and adaptable, opening new possibilities for computational social science.
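
To make the distinction between the two prompt-based regimes concrete, the following is a minimal sketch of how a zero-shot and a few-shot prompt for stance detection might be assembled and how a model's free-text reply might be mapped back to a label. The label set, prompt wording, and the function names `build_prompt` and `parse_label` are illustrative assumptions and are not taken from the article or its replication materials; no model is called.

```python
# Hypothetical label set and prompt wording; assumptions for illustration only.
LABELS = ("support", "oppose", "neutral")


def build_prompt(text, target, examples=()):
    """Build a zero-shot prompt when `examples` is empty, else a few-shot prompt."""
    lines = [
        f"Classify the stance of the text toward {target}.",
        f"Answer with one word: {', '.join(LABELS)}.",
        "",
    ]
    # Few-shot learning: prepend labeled demonstrations before the query text.
    for ex_text, ex_label in examples:
        lines += [f"Text: {ex_text}", f"Stance: {ex_label}", ""]
    # The query itself; the model is expected to continue after "Stance:".
    lines += [f"Text: {text}", "Stance:"]
    return "\n".join(lines)


def parse_label(response):
    """Map a free-text model response to one of LABELS, or None if none matches."""
    lowered = response.strip().lower()
    for label in LABELS:
        if lowered.startswith(label):
            return label
    return None


# Zero-shot: instructions only, no labeled examples.
zero_shot = build_prompt("The new policy is a disaster.", "the policy")

# Few-shot: the same instructions plus one labeled demonstration.
few_shot = build_prompt(
    "The new policy is a disaster.",
    "the policy",
    examples=[("I fully back this plan.", "support")],
)
```

The only difference between the two prompts is the presence of labeled demonstrations. Fine-tuning and instruction-tuning, by contrast, update model weights on labeled data and are not shown here.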

Keywords

stance detection, large language models, text classification, natural language processing, computational social science
