
Abstract

A central concern for experimental studies is participant motivation, which is crucial for internal validity. When participants are not committed to the task, internal validity diminishes because responses might not be authentic. This study introduces and tests the seriousness technique as a method for increasing participant investment in political science experiments that use student samples. The seriousness technique aims at creating a sense of responsibility by telling students that their participation is important because science needs quality data. Results from a computer-assisted foreign policy decision-making experiment show that the seriousness technique increased the degree of information participants access during the foreign policy simulation and the time they spent on the study. These findings suggest that political scientists who use student samples in their experiments can nurture serious subjects by employing the seriousness technique. It is argued that the results should be of interest not only to experimentalists but also to all scholars who use human subjects, including survey researchers, in their research.

Keywords

Experimental research, internal validity, participant engagement

Introduction

As Rose McDermott observed ‘[the] success of the experiment depends on the subject taking the task seriously and experimenters can foster such engagement to the degree they can create and establish a situation which forces psychological investment on the part of subjects’ (McDermott, 2011: 45). While participant engagement is important for any study that relies on human subjects (e.g. surveys, interviews, self-reports), participants’ investment in experiments is a particularly critical issue. If participants do not take the study seriously, internal validity decreases, which in turn jeopardizes the validity of the experiment. This paper tests and shows the value of the seriousness technique as a method for increasing participant investment in a foreign policy decision-making experiment conducted with a student sample.

Political science research has benefited substantially from experimental methods over the last decade (Druckman et al., 2006, 2011; Green and Gerber, 2002; Hyde, 2015; McDermott, 2002a, 2002b). Although experiments offer the unparalleled advantage of inferring causal relationships with greater confidence, researchers wrestle with questions of external and internal validity (Druckman and Kam, 2011; McDermott, 2011; Sears, 1986).

External validity is an especially crucial issue in political science experiments that use student and other convenience samples. A number of studies have examined the concerns related to the external validity of experiments based on student samples, and have compared the results from student samples to those obtained from political and military elite samples (Bayram, 2017a; Hafner-Burton et al., 2014; Mintz et al., 2006), or the mass public (Druckman and Kam, 2011; Krupnikov and Levine, 2014; Mullinix et al., 2015). Scholars have also explored the external validity of studies based on MTurk survey respondents (e.g. Berinsky et al., 2012; Clifford et al., 2015; Huff and Tingley, 2015). To increase the generalizability of their findings, political scientists also strive to leverage samples of politicians (e.g. Bayram, 2017b; Hafner-Burton et al., 2014; Mintz et al., 2006).

However, the use of convenience samples in political science experiments is still very common and continues to grow seemingly exponentially. In particular, as Kam et al. (2007: 419) stated, student samples are ‘omnipresent’. There are good reasons for the prevalence of student samples. Beyond the convenience of low-cost recruitment, student samples are suitable for a set of specific research objectives (Kam et al., 2007). They are appropriate for investigating a causal relationship between two factors, performing a ‘critical test’ of a specific hypothesis, or – by definition – examining a phenomenon where students are the population of interest. In addition, if there is little theoretical or empirical reason to suppose that the effect of the experimental treatment would be different in student and non-student populations (Druckman and Kam, 2011), using student samples is an attractive method of data collection.

A major concern for experimental studies that use student samples is participant investment or engagement, a factor crucial for internal validity. Internal validity ‘refers to the extent to which an experimenter can be confident his or her findings result from their experimental manipulations’ (McDermott, 2011: 44). Internal validity is what makes experiments the ‘gold standard’ of causal inference (Campbell and Stanley, 1963; Green and Gerber, 2002; McDermott, 2002a; Shadish et al., 2002). A key requirement for internal validity is participants’ psychological investment in the study (Aronson et al., 1990; McDermott, 2002a). When participants are engaged and committed to the task at hand, internal validity increases. Conversely, when participants are disengaged, internal validity diminishes because the responses subjects provide might not be authentic (McDermott, 2011).

College students typically participate in political science experiments to earn extra credit. They are well-intentioned and likely wish to do a good job. However, they might not have an intrinsic motivation to take the experiment seriously. Extrinsic motivation is also difficult to create. Institutional Review Boards usually prohibit researchers from making the rewards for participation conditional upon the completion of the study. In fact, researchers are required to indicate that participants can withdraw from the study at any time without penalty. Because students might lack an intrinsic motivation, there is a risk that they will not take the experiment seriously. Lack of commitment to the task at hand might cause students to rush through the study; fail to consider carefully the experimental treatments and the response options – or, worse, randomly select responses; stop paying attention as the experiment progresses; or simply drop out from the study.

Political scientists are increasingly interested in controlling for participant attention. Berinsky et al. (2014), for example, encourage scholars to take advantage of ‘screeners’ to weed out inattentive participants. While such instructional manipulation checks are useful to filter out participants who do not pay attention, other techniques aim at increasing participant engagement. One method to increase participant engagement is the seriousness technique developed by Reips (2000, 2002a, 2002b); see also Peden and Tiry (2013). It is part of a set of ‘high-hurdle techniques’, originally created for Internet-based experiments, to facilitate participant motivation and reduce drop out. The seriousness technique involves telling subjects that their participation is serious and important, and that science needs quality data. It aims at strengthening participant motivation by cultivating a sense of responsibility for the advancement of science.

While the effects of the seriousness and other high-hurdle techniques have been explored in online psychology experiments and surveys (e.g. Joinson et al., 2007; Göritz and Stieger, 2008), the seriousness technique has not been applied to political science experiments that use student samples. The technique can be useful for different kinds of survey and experimental research; but here I examine whether the seriousness technique can be employed to encourage college students to be conscientious subjects in political science experiments. The hypothesis that follows from the seriousness technique is: ‘Exposure to the seriousness technique increases participant engagement’.

Testing the seriousness technique

I examine the effect of the seriousness technique in the context of a foreign policy decision-making experiment. This promises to be a demanding test of the seriousness technique. Foreign policy decision-making experiments typically require participants to process large amounts of information about a conflict between two countries, which could be real or fictitious. Students are asked to understand the historical origins of the conflict and the stakes related to the national security of the countries involved, and make sense of the different policy outcomes they could choose. If the seriousness technique succeeds in increasing participant investment in a cognitively taxing foreign policy decision-making experiment, there is reason to suppose that it will be useful in other settings.

Sample

Recruited from several political science courses between November and December 2016, 246 undergraduate students from a West Coast American university participated in a computer-assisted foreign policy experiment. The students had previously signed an agreement to participate in experiments; and all interested students legally considered as adults were included. There was no exclusion criterion other than age. The mean age of the participants was 19.6 (SD = 1.8). About 58% identified as female and 42% as male. The majority (70%) of the participants were white. Six percent identified as African-American, 5% as Latino, 2% as Middle Eastern, 12% as Asian, and about 6% were of mixed or other ethnic origin. The distribution of the data on ideology was skewed to the left. About 67% of the participants identified as ‘extremely liberal’, ‘liberal’, or ‘slightly liberal’. Thirteen percent indicated that their ideological views were ‘moderate–middle of the road’. About 20% identified as ‘extremely conservative’, ‘conservative’, or ‘slightly conservative’. The mean household income was in the range of US$50,000 to US$90,000. There was little variation regarding participants’ interest in politics and public affairs: almost all of them reported that they were either ‘very interested’ or ‘somewhat interested’ in politics. In addition, 90% of the students indicated that they participated in the experiment either to earn extra credit or because participation was required in the course they were taking. Of the 130 who answered questions on motivation for participating, only seven indicated that they wanted to help the researchers; and two mentioned that they thought the study looked interesting. This suggests that the self-selection of students with intrinsic motivations to contribute to research was not a major concern.

Experimental design and manipulation of seriousness

The experiment involved two fictitious countries engaged in a dispute over the territory and oil resources of a group of islands called the ‘Genova Islands’. The foreign policy dispute was borrowed from Beer et al.’s (1987) study on war and public opinion. The dispute chosen for the present study has a degree of real-world realism because it was modelled on the Falklands–Malvinas crisis of 1982 between Great Britain and Argentina.

Prior to participating in the experiment, students responded to questions about their demographic characteristics. The questions were delivered to them via email: Qualtrics survey software was used.¹

The experiment has a 2x1 between-subjects design. Individual participants were randomly assigned to either the control condition or the treatment condition. Those assigned to the control condition were thanked for their participation and asked to consider a foreign policy dispute between two countries. Participants assigned to the treatment condition were similarly thanked for taking part in the study and asked to consider the dispute; however, their instructions included the seriousness treatment which read: ‘Your participation is serious because science needs quality data’.
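The assignment step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not say how randomization was implemented, so simple independent randomization is assumed here, and the function names, the seed, and the wording of the shared instructions are hypothetical. Only the treatment sentence is quoted from the study.

```python
import random

# Treatment sentence quoted from the article; the shared wording below is a
# paraphrase written for this sketch, not the study's actual text.
SERIOUSNESS_SENTENCE = (
    "Your participation is serious because science needs quality data."
)
BASE_INSTRUCTIONS = (
    "Thank you for participating. "
    "Please consider a foreign policy dispute between two countries."
)


def assign_condition(rng: random.Random) -> str:
    """Simple randomization: each participant is assigned independently."""
    return rng.choice(["control", "treatment"])


def instructions_for(condition: str) -> str:
    """Both groups see the base text; only treatment adds the sentence."""
    if condition == "treatment":
        return f"{BASE_INSTRUCTIONS} {SERIOUSNESS_SENTENCE}"
    return BASE_INSTRUCTIONS


rng = random.Random(2016)  # fixed seed only so the sketch is reproducible
conditions = [assign_condition(rng) for _ in range(246)]
print(conditions.count("treatment"), conditions.count("control"))
```

Independent assignment of this kind does not force equal group sizes, so the two conditions will usually differ somewhat in size.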

After these basic instructions, participants in both the treatment and the control conditions were introduced to the following foreign policy dispute:

Two democratic nations, Afslandia and Bagumba share a common border and are involved in a dispute over the possession of a set of offshore islands, called the Genova islands, which are replete with oil resources. Afslandia and Bagumba have been arguing about the possession of the Genova islands for several hundred years. Genova islands were discovered by a team of explorers at the end of the 16th century. In the mid 20th century, rich oil resources were discovered on the Genova islands. The islands also have lucrative fishing bases and serve as a port for the exploration of neighboring natural resource deposits. In short, the Genova islands are a huge source of revenue for a country. One of the explorers who discovered the Genova islands was from Afslandia. The other was from Bagumba. Afslandia is located 25 miles away from the Genova islands. Bagumba is located 100 miles away from the Genova islands. Both countries claim sovereignty over the Genova islands and dispute how the territory and oil resources should be divided.

The foreign policy decision-making experiment consisted of five stages, as shown in Figure 1. At Levels 1–4, participants had the option of accessing more information (provided here in the Supplementary Material) or skipping to the next section. After being introduced to the conflict between Afslandia and Bagumba (Level 1), participants were told that they would be asked questions about how these countries should divide the territory and the oil resources of the Genova islands. They were then given a choice: they could either move on to the questions regarding the distribution of resources, or obtain more information about the dispute.

Figure 1.

Design of the experiment and accessible levels of information.

If participants chose to obtain more information, they accessed the next level of information. Participants who accessed Level 2 information learned that Afslandian settlers occupied the Genova islands in the 17th century, and that Bagumba opposed the colonists. Those participants who chose to access Level 3 information were presented with additional details about the history of the conflict and the international community’s response. At Level 4, participants were informed about the advisory opinion of the International Court of Justice (ICJ). Level 5 served as the last level of information; at this level participants read a report of the result of the latest military skirmishes between Afslandia and Bagumba and the reaction of the United Nations Security Council to the clashes. Irrespective of whether participants stopped at Level 1 or accessed all five levels of information, participants were asked about how Afslandia and Bagumba should divide the territory and the oil resources of the Genova islands.

Dependent variable

I measured participant investment in the experiment by examining the extent of information acquisition and the overall time spent on the study (not including the time spent on answering demographic and individual-difference questions). First, in accordance with Mintz et al. (2006), I relied on process tracing methodology to measure the level of information participants elected to obtain. Process tracing is a method for identifying what information participants access, and in which order (Ford et al., 1989; Mintz et al., 1997). If the seriousness technique is a useful way of inspiring students to be conscientious participants, the expectation would be that those assigned to the treatment condition would access more information than those who did not receive the seriousness manipulation. The variable ‘Information’ is an ordinal one that ranges from ‘1’ if participants stopped at Level 1 information to ‘5’ if they accessed all available levels of information.
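As a rough illustration of how the ‘Information’ score could be derived from a process-tracing log, the sketch below maps a participant's sequence of choices to the deepest level reached. The log format (one boolean per choice point) and the function name are assumptions made for this example; the article does not describe how the software stored the choices.

```python
from typing import Sequence

MAX_LEVEL = 5  # Levels 1-5 in the design


def information_score(requested_more: Sequence[bool]) -> int:
    """Return the deepest level reached, from 1 to 5.

    requested_more[i] is True if the participant asked for more information
    at Level i + 1 (choices exist only at Levels 1-4). Counting stops at the
    first False, because skipping sends the participant to the questions.
    """
    if len(requested_more) > MAX_LEVEL - 1:
        raise ValueError("choices exist only at Levels 1-4")
    level = 1  # everyone reads the Level 1 introduction
    for wants_more in requested_more:
        if not wants_more:
            break
        level += 1
    return level
```

A participant who skips immediately scores 1, and one who requests more information at every choice point scores 5, which matches the range of the variable as defined above.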

The second measure of participant investment is the time spent on the experiment. The time participants spend on the study and the level of information they access are likely to be correlated. However, the strength of this relationship is ultimately an empirical question. It is possible that some participants access a large amount of information but do not commit sufficient time to processing this knowledge. Therefore, examining the effect of the seriousness treatment on the time spent is important. The time participants spent on the simulation was measured by the computer they used to complete the study. ‘Time’ is an interval level variable measured in minutes (M = 21, SD = 6.0, minimum 4.5, maximum 33.33).

Results

Balance tests suggested that there were no important differences between the treatment and the control group in terms of demographic characteristics, ideology, and interest in politics (see Supplementary Material). To test the impact of the seriousness treatment on the degree of information acquisition, I compared the differences between the treatment and the control group in the level of information obtained by performing a Wilcoxon–Mann–Whitney test, a nonparametric alternative to the independent samples t-test. As shown in Table 1, the extent of information accessed is larger for participants who received the seriousness treatment than those who did not (z = −2.009, p = 0.0445). Participants primed to think about the seriousness of their participation for science showed an interest in obtaining more information about the conflict between Afslandia and Bagumba than those who had not been primed. A chi-square test shows that results remain robust (χ² (1, 246) = 13.40, p < 0.001) if the data are collapsed on the level of information accessed to create a binary variable with categories of high (for Levels 4–5) and low (for Levels 1–3) information.

Table 1.

Differences between treatment and control conditions in the level of information accessed shown to be significant.

Condition	Level 1	Level 2	Level 3	Level 4	Level 5	Total
Treatment group	23.31% (n = 31)	9.02% (n = 12)	9.77% (n = 13)	19.55% (n = 26)	38.35% (n = 51)	100% (n = 133)
Control group	23.89% (n = 27)	22.12% (n = 25)	19.47% (n = 22)	4.42% (n = 5)	30.09% (n = 34)	100% (n = 113)
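Both test statistics reported for Table 1 can be recomputed from the cell counts alone. The sketch below is not the author's code: it assumes the usual normal approximation to the Wilcoxon–Mann–Whitney test with a tie correction, and a Pearson chi-square without continuity correction. The function names are illustrative, and the sign of z depends on which group is taken as the reference (here, the control group).

```python
import math

# Counts per information level (Levels 1-5), copied from Table 1.
treatment = [31, 12, 13, 26, 51]
control = [27, 25, 22, 5, 34]


def rank_sum_z(group, other):
    """Normal-approximation z for `group`, with the tie correction."""
    n1, n2 = sum(group), sum(other)
    n = n1 + n2
    rank_sum, tie_term, first_rank = 0.0, 0, 1
    for g, o in zip(group, other):
        t = g + o                           # size of this tie block
        midrank = first_rank + (t - 1) / 2  # average rank within the block
        rank_sum += g * midrank
        tie_term += t**3 - t
        first_rank += t
    expected = n1 * (n + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (rank_sum - expected) / math.sqrt(variance)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


z = rank_sum_z(control, treatment)
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal tail

# Collapse to high (Levels 4-5) versus low (Levels 1-3) information.
chi2 = chi_square_2x2(
    sum(treatment[3:]), sum(treatment[:3]),
    sum(control[3:]), sum(control[:3]),
)

print(f"z = {z:.3f}, p = {p:.4f}, chi2 = {chi2:.2f}")
```

Under these assumptions the counts in Table 1 should give values that agree with the reported z, p, and χ² up to rounding, which suggests the published tests were computed in a comparable way.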

The seriousness technique also increased the time participants spent on the foreign policy simulation. Not surprisingly, time spent and the level of information accessed are positively related, but the correlation coefficient is not statistically significant (r(244) = 0.068, p = 0.28). An analysis of variance test shows that participants significantly differed in the time they spent on the foreign policy task (F(1, 244) = 447.18, p < 0.0001; η² = 0.65).² Post-hoc group analysis using the Scheffé criterion for significance revealed that participants who received the seriousness treatment (M = 25.33, SD = 3.65) spent about 9.63 minutes more than those in the control condition (M = 15.69, SD = 3.44).
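The one-way analysis of variance can be approximately reconstructed from the group sizes, means, and standard deviations given above. This is a sketch from rounded summary statistics, not the original analysis, so the resulting F should be close to, but not identical with, the reported value; the variable names are chosen for this example.

```python
# Group summaries as reported in the text (rounded to two decimals).
groups = {
    "treatment": {"n": 133, "mean": 25.33, "sd": 3.65},
    "control": {"n": 113, "mean": 15.69, "sd": 3.44},
}

total_n = sum(g["n"] for g in groups.values())
grand_mean = sum(g["n"] * g["mean"] for g in groups.values()) / total_n

# Between-group and within-group sums of squares.
ss_between = sum(
    g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups.values()
)
ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
eta_squared = ss_between / (ss_between + ss_within)

print(
    f"F({df_between}, {df_within}) = {f_stat:.2f}, "
    f"eta squared = {eta_squared:.2f}"
)
```

Eta squared is the share of total variance in time spent that lies between the two conditions, so a value near 0.65 means that condition membership alone accounts for roughly two-thirds of the variation in time.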

Table 2 summarizes the average time spent on the study as a function of the experimental conditions and the level of information accessed. As can be seen – and as confirmed by an analysis of variance test – the level of information obtained did not have a significant impact on the time spent (F(4, 240) = 1.11, p = 0.3514).³ What affected the amount of time participants put in was whether they had been exposed to the seriousness treatment.

Table 2.

Average time spent on the experiment by experimental conditions and the level of information accessed.

Condition	Level 1 Mean (SD)	Level 2 Mean (SD)	Level 3 Mean (SD)	Level 4 Mean (SD)	Level 5 Mean (SD)	Total Mean (SD)
Treatment group	25.74 (3.79) (n = 27)	26.47 (3.67) (n = 12)	24.97 (3.96) (n = 13)	24.64 (3.47) (n = 26)	25.24 (3.62) (n = 51)	25.33 (3.65) (n = 133)
Control group	16.12 (3.604) (n = 27)	15.74 (3.08) (n = 25)	15.96 (3.25) (n = 22)	12.46 (4.73) (n = 5)	15.63 (3.44) (n = 34)	15.69 (3.44) (n = 113)

There were no significant differences in participants’ preferences regarding the distribution of territory (z = 0.931, p = 0.3516) and oil resources (z = 0.407, p = 0.6840) between the conflicting states as a function of the seriousness treatment. This is to be expected: the experimental treatment was not intended to change the policy preferences of participants.

To summarize, findings from the foreign policy experiment indicate that the seriousness technique is an effective method for increasing participants’ investment as measured by information acquisition and time spent on the experiment.

Conclusion

This project examined the effect of the seriousness technique as a method for increasing participant investment in experiments that use student samples. To the best of my knowledge, this research is the first application of the seriousness method to political science experiments based on student samples. I have shown that, when encouraged to think about the seriousness of their participation for the advancement of science, college students access more information about the experimental task at hand, and on average, spend nine minutes more on the study. How the seriousness technique will affect average treatment effects (ATEs) in a study is ultimately an empirical question. However, given the reported results it is reasonable to expect that the seriousness technique will make participants more attentive to the treatment at hand, and thus strengthen the difference between the treatment and control groups. Stated differently, the seriousness technique will help guard against failing to find ATEs merely because participants did not pay sufficient attention to the treatment.

Some might worry that the seriousness technique could lead to demand characteristics, namely motivating participants to answer the questions in ways they think the experimenter expects of them. However, this risk is minimal and no greater than any other potential threat to validity. The seriousness technique only mentions the importance of quality data for science. It offers no cues about the direction of the results the experimenter hopes to observe. In the present study, for example, there is little reason to think that mentioning the importance of quality data would affect how participants chose to distribute resources between the conflicting states.

The results contribute to the ‘defense of the narrow database’ (Druckman and Kam, 2011). By adding a simple sentence about the importance to science of their participation, scholars can significantly increase the engagement of students as experimental participants. While future studies should conduct additional tests of the seriousness technique in settings other than foreign policy decision-making and across multiple issue areas, and marry the seriousness technique with ‘screeners’, the findings of this research suggest that the seriousness technique is a viable method of fostering serious subjects. The results should be of interest not only to experimentalists but also to all scholars who use human subjects in their research.

Supplemental Material

Supplemental material, Serious_Subjects_Supplementary_Appendix for Serious subjects: A test of the seriousness technique to increase participant motivation in political science experiments by A. Burcu Bayram in Research and Politics

Footnotes

Acknowledgements

The author would like to thank Marcus Holmes and the Department of Government at the College of William and Mary for the opportunity to conduct this experiment.

Correction (June 2025):

The article has been updated with the correct dataverse link in the supplementary material section. For more details, please see the correction notice.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Supplementary material

The supplementary material is available at: http://journals.sagepub.com/doi/suppl/10.1177/2053168018767453. The replication files are available at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QUJY64&version=DRAFT.

Notes

Carnegie Corporation of New York Grant

This publication was made possible (in part) by a grant from Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.

References

Aronson, Ellsworth, Carlsmith and Gonzales (1990) Methods of Research in Social Psychology (2nd Edition). New York: McGraw-Hill.

Bayram (2017a) Due deference: Cosmopolitan social identity and the psychology of legal obligation. International Organization 71(S1): S137–S163. First published online in April 2017: DOI:10.1017/S0020818316000485.

Bayram (2017b) Good Europeans? How European identity and costs interact to explain compliance with European Union law. Journal of European Public Policy 24(1): 42–60.

Beer, Healy, Sinclair and Bourne (1987) War cues and foreign policy acts. American Political Science Review 81(3): 701–716.

Berinsky, Margolis and Sances (2014) Separating the shirkers from the workers? Making sure respondents pay attention on self-administered surveys. American Journal of Political Science 58(3): 739–753.

Berinsky, Huber and Lenz (2012) Evaluating online labor markets for experimental research: Amazon.com’s Mechanical Turk. Political Analysis 20(3): 351–368.

Campbell and Stanley (1963) Experimental and Quasi-experimental Designs for Research. Boston, MA: Houghton Mifflin.

Clifford, Jewell and Waggoner (2015) Are samples drawn from Mechanical Turk valid for research on political ideology? Research & Politics 2(4): 1–9.

Druckman and Kam (2011) Students as experimental participants: A defense of the ‘narrow data base’. In: Druckman, Green, Kuklinski and Lupia (eds) Cambridge Handbook of Experimental Political Science. New York: Cambridge University Press, pp. 41–57.

Druckman, Green, Kuklinski and Lupia (2006) The growth and development of experimental research in political science. American Political Science Review 100(4): 627–635.

Druckman, Green, Kuklinski and Lupia (eds) (2011) Cambridge Handbook of Experimental Political Science. Cambridge: Cambridge University Press.

Ford, Schmitt, Schechtman, Hults and Doherty (1989) Process-tracing methods: Contributions, problems, and neglected research questions. Organizational Behavior and Human Decision Processes 43(1): 75–117.

Green and Gerber (2002) Reclaiming the experimental tradition in political science. In: Katznelson and Milner (eds) Political Science: State of the Discipline. New York: Norton, pp. 803–832.

Göritz and Stieger (2008) The high-hurdle technique put to the test. Behavior Research Methods 40(1): 322–327.

Hafner-Burton, Le Veck, Victor and Fowler (2014) Decision-maker preferences for international legal cooperation. International Organization 68(4): 845–876.

Huff and Tingley (2015) ‘Who are these people?’ Evaluating the demographic characteristics and political preferences of MTurk survey respondents. Research & Politics 2(3). Available at: https://doi.org/10.1177/2053168015604648

Hyde (2015) Experiments in international relations: Lab, survey, and field. Annual Review of Political Science 18: 403–424.

Joinson, Woodley and Reips (2007) Personalization, authentication and self-disclosure in self-administered Internet surveys. Computers in Human Behavior 23(1): 275–285.

Kam, Wilking and Zechmeister (2007) Beyond the ‘narrow data base’: Another convenience sample for experimental research. Political Behavior 29(4): 415–440.

Krupnikov and Levine (2014) Cross-sample comparisons and external validity. Journal of Experimental Political Science 1(1): 59–80.

McDermott (2002a) Experimental methodology in political science. Political Analysis 10(4): 325–342.

McDermott (2002b) Experimental methods in political science. Annual Review of Political Science 5: 31–61.

McDermott (2011) Internal and external validity. In: Druckman, Green, Kuklinski and Lupia (eds) Cambridge Handbook of Experimental Political Science. New York: Cambridge University Press, pp. 27–40.

Mintz, Geva, Redd and Carnes (1997) The effect of dynamic and static choice sets on political decision-making: An analysis using the decision board platform. American Political Science Review 91(3): 553–566.

Mintz, Redd and Vedlitz (2006) Can we generalize from student experiments to the real world in political science, military affairs, and international relations? Journal of Conflict Resolution 50(5): 757–776.

Mullinix, Leeper, Druckman and Freese (2015) The generalizability of survey experiments. Journal of Experimental Political Science 2(2): 109–138.

Peden and Tiry (2013) Using web surveys for psychology experiments. In: Sappleton (ed.) Advancing Research Methods with New Technologies. Hershey, PA: IGI Global, pp. 70–100.

Reips (2000) The web experiment method: Advantages, disadvantages, and solutions. In: Birnbaum (ed.) Psychological Experiments on the Internet. San Diego, CA: Academic Press, pp. 89–117.

Reips (2002a) Standards for Internet-based experimenting. Experimental Psychology 49(4): 243–256.

Reips (2002b) Internet-based psychological experimenting: Five dos and five don’ts. Social Science Computer Review 20(3): 241–249.

Sears (1986) College sophomores in the laboratory: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology 51(3): 515–530.

Shadish, Cook and Campbell (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, MA: Houghton-Mifflin.
