Abstract
The Open Science movement is taking hold around the world, and language testers are taking part. In this Viewpoint, I discuss how sharing, collaborating, and building trust, guided by Open Science principles, benefit the language testing field. To help more language testers join in, I present a standard definition of Open Science and describe four ways language testing researchers can immediately partake. Overall, I share my views on how Open Science is an accelerating process that improves language testing as a scientific and humanistic field.
The field of language testing needs to adopt Open Science. Fundamentally, Open Science is about making scientific and scholarly research transparent and available to all, a worthy goal. According to the United Nations Educational, Scientific, and Cultural Organization (UNESCO, 2023), Open Science “combines various movements, practices, and actions that aim to make all fields of scientific research accessible to everyone for the benefit not only of scientists but also society as a whole.” UNESCO’s (2021, p. 7) “Recommendation on Open Science” defined Open Science as practices that promote:
Open knowledge: “to make multilingual scientific knowledge openly available, accessible, and reusable for everyone,”
Open collaboration: “to increase scientific collaborations and sharing of information for the benefits of science and society,” and
Open processes: “to open the processes of scientific knowledge creation, evaluation, and communication to societal actors beyond the traditional scientific community.”
In language testing, Open Science advances three vital goals: quality, clarity, and equity. First, the quality of language assessment research will only benefit from increased collaboration and sharing of knowledge and methods. The accountability built into Open Science should encourage excellence. And by encouraging best practices such as preregistration of research protocols, Open Science can foster more deliberate methodologies, resulting in higher caliber research. Second, by promoting clarity of research methods and communication, Open Science will also promote language assessment literacy (Coombe et al., 2020; Deygers & Malone, 2019; Gan & Lam, 2022) through increased access to and comprehension of language-testing research outcomes. Coombe et al. (2020) summarized language assessment literacy as the knowledge and skills a person needs to understand, evaluate, or construct language tests, or to analyze test data. The ability to acquire such knowledge and skills will only increase as more comprehensibly reported and accessible research becomes available. Improving understanding of language testing is especially important because test programs are often built on trust: Tests work when society trusts that test scores are useful, accurate, and fair. Moreover, trust in language testing is important because testing comes with responsibility: Language test scores often serve as an important gatekeeper (McNamara et al., 2019; Shohamy, 2007) to various life-changing opportunities and resources. Test scores can impact test takers’ self-esteem, their motivation for continued learning, and their concepts of identity and worth (Dobson & Fudiyartanto, 2023). Open Science will, I hope, replace trust in the field of language testing that rests on reputation without reasonable proof with trust based on evidence gathered through rigorous and accessible peer-reviewed research. In an age of Open Science, high-stakes language tests will work when society has evidence (from iterative, peer-reviewed validation studies) that their test scores are useful, accurate, and fair. Third and finally, Open Science fosters equity. Research about language learning and language testing should be available to all people and all researchers, not just those from institutions with large budgets.
In a nutshell, using Open Science to increase evidence-based, public trust in language testing will create better tests and promote better uses of the tests’ scores. Open Science is an accelerating, worldwide movement, on its way toward being a normative practice in many fields, including applied linguistics (Marsden & Morgan-Short, 2023). Thus, language testers need to embrace Open Science because language testing is the backbone of applied linguistics and second language acquisition (SLA) research. As Read (2007, p. 22) reported, “SLA researchers can learn from language testing the importance of evaluating the reliability of their elicitation instruments, and the need to consider carefully how adequately they have operationalized their theoretical constructs.” In other words, applied linguists are ubiquitous designers and users of language tests and heavy consumers of language testing principles when they design their data elicitation instruments. Thus, the more transparent, sharing, and explanatory language testers can be about their language testing research, especially their test-validation research, the better the science of applied linguistics will consequently be. With this viewpoint, I ask each language tester to partake in Open Science to improve testing and applied linguistics. To help language testers new to Open Science, I discuss Open Science as a series of four steps that should help anyone get started.
Four Open Science steps and their pertinence within language testing
Step 1. Use persistent identifiers, such as an ORCID
First, locate yourself with accuracy on the Open Science roadmap: Register for an Open Researcher and Contributor ID (ORCID; https://orcid.org/) and use it at every opportunity. ORCIDs are persistent, immutable IDs that identify a researcher. ORCIDs differentiate researchers with the same name, and an ORCID remains stable even when the researcher undergoes a name change or publishes under a different version of their name. Alongside the Research Organization Registry (ROR; https://ror.org/), which provides the equivalent of ORCIDs for organizations, and The DOI Foundation (https://doi.org), which produces and maintains digital object identifiers (DOIs) for products such as research papers, materials, and data deposits, ORCIDs can be viewed as Open Science start-up tools. Similarly, if resources allow, language testing publication venues, grant and award programs, and institutions should have an ROR, should identify their human agents with ORCIDs, and should identify the academic materials they produce or host with DOIs. Doing so will help keep entities and their connections within language testing clear and transparent. As Schiltz (2018) wrote in the preamble to cOAlition-S (a European consortium for the promotion of Open Science and open-access research), “Researchers and research funders have a collective duty of care for the science system as a whole.” Participating where possible through the use of persistent identifiers is a responsible first step.
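Because an ORCID iD is machine readable, it also plugs directly into research infrastructure. As a minimal sketch (my illustration, not part of the original Viewpoint), the following Python snippet queries ORCID’s public v3.0 API to confirm that an iD resolves to the intended researcher. The sample iD is a fictitious demonstration record often used in ORCID examples, and the JSON structure shown is an assumption based on the public API’s documented format.

```python
# Minimal sketch: look up a researcher's public ORCID record.
# Assumptions: ORCID public API v3.0; the demo iD below is a
# fictitious example record, not a real researcher.
import requests

orcid_id = "0000-0002-1825-0097"  # hypothetical example iD
resp = requests.get(
    f"https://pub.orcid.org/v3.0/{orcid_id}/record",
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
record = resp.json()

# Print the name registered for this iD (structure per ORCID's JSON schema).
name = record["person"]["name"]
print(name["given-names"]["value"], name["family-name"]["value"])
```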
Step 2. Earn the four Open Science badges
The Open Science movement has an incentive program to reward good Open Science behavior. If a researcher produces a research paper that involves Preregistration, Open Materials, Open Data, or Open Code, participating journals will recognize those efforts by literally placing an electronic badge representing the effort on the article’s front page, like a teacher putting a gold star on the top of your paper (see Kidwell et al., 2016). See the first three badges and a list of participating journals here: https://www.cos.io/initiatives/badges. The fourth badge, the Open Code badge, is not yet awarded by journals in language testing. It is the newest of the badges, with information here: https://www.comses.net/resources/open-code-badge. Below, I describe how to obtain the badges for your research.
Badge 1: Preregister your research protocol
Before you begin collecting data, preregister your research proposal or plans with a time stamp on the deposit indicating the date of the registration. Researchers can self-publish their research plans using preregistration templates or forms on platforms such as the Open Science Framework (OSF) (https://www.cos.io/initiatives/prereg).
Preregistration improves research transparency: Readers of the final published study can compare the final product to the preregistration to understand which parts of the study were planned prior to data collection, and which parts were added after the data were collected. Even more importantly, preregistration helps avoid p-hacking, which is when researchers explore the data and adjust analyses in pursuit of statistically significant results (see Lombrozo, 2014, for an overview of p-hacking), and HARKing (hypothesizing after the results are known; Kerr, 1998), which is when researchers present exploratory findings as if the original intent of the research was to confirm a prespecified hypothesis. Registration thus keeps researchers honest (“I did the analyses I intended and found X, but then I also explored the data and found Y. . .”).
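To see why p-hacking is so corrosive, consider a small simulation (my illustration, not from this Viewpoint): when a researcher measures many outcomes on pure noise and reports only the “best” test, the nominal 5% false-positive rate balloons to roughly 40%. All numbers below are assumptions chosen for the demonstration.

```python
# Minimal sketch: how cherry-picking among outcomes inflates false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_outcomes, n_per_group = 5_000, 10, 30

hits_single, hits_hacked = 0, 0
for _ in range(n_sims):
    # Two groups with NO true difference, measured on 10 outcome variables.
    a = rng.normal(size=(n_outcomes, n_per_group))
    b = rng.normal(size=(n_outcomes, n_per_group))
    pvals = stats.ttest_ind(a, b, axis=1).pvalue
    hits_single += pvals[0] < 0.05   # the one preregistered test
    hits_hacked += pvals.min() < 0.05  # report only the "best" test

print(f"False-positive rate, preregistered test: {hits_single / n_sims:.3f}")
print(f"False-positive rate, best of {n_outcomes} tests: {hits_hacked / n_sims:.3f}")
```

The first rate hovers near .05; the second approaches 1 − 0.95¹⁰ ≈ .40, which is exactly the inflation that preregistration guards against.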
Some journals, such as Language Learning, Language Testing, and TESOL Quarterly, allow authors to submit their registration to the journal for peer review (see e.g., Isaacs & Winke, 2024). If the reviewers find it sound, the journal will either publish the registration as a report or ask the authors to self-publish the registration in a public repository. At the same time, the journal promises to publish the study no matter what the results are. This mitigates reviewer bias in favor of studies with significant results. The point is that good science can lead to nonsignificant or even messy results, which is useful scientific information that should be published and shared, especially when the study itself is solid. Thus, Open Science should produce more honest and less biased research, which will help language testing researchers know more about what they do not know, expanding language testing’s knowledge base and pathways for exploration. Researchers often aim to “fill the gap” or “address the gaps” in research, but as Lombrozo (2014) wrote, “while the gap can (and sometimes should) be narrowed, it cannot be closed.” Open Science may even help us widen the gap, revealing directions the field still needs to explore.
It may be transformative for language testing grant programs to require a public registration of any successful grant proposal after grant review, but before funding is provided and data are collected. Such processes may hold researchers more accountable to their original plans, and require them to document accurately any changes to those plans when changes become necessary. Transparent writing on such processes would provide readers with unique insights into real-world project planning and management, again bettering the field.
On another note, preregistration will help suppress research-publication biases: Researchers commit themselves to reporting the results of their proposed analyses, even if it turns out they do not like the results. This helps the field remain committed to exploring diverse, forward-thinking hypotheses and not confine itself to a restricted, narrow, or backward-looking research canon. Rather than only publishing research results that protect one’s brand, or that build upon one’s previously established theoretical stance or reputation, preregistration asks branding and theory to follow the science, and defines reputation as consistent methodological rigor and research transparency. Simply put, preregistration makes reporting on hypotheses that did not work out unavoidable. Null, unexpected, or “negative” results see the full light of day. Indeed, scrutiny of noncompleted, preregistered research in the field could eventually serve as a litmus test for the field’s biases.
Admittedly, preregistration may not be feasible for fully exploratory research, and good science can happen without registration. Nonetheless, I encourage language testing researchers to preregister at least one upcoming study, especially if they are doing empirical work, a thesis, or grant-sponsored work for which a proposal was thoroughly written. Doing so is a formative exercise, and it helps authors understand what preregistration can do for science. As examples, see Choi and Winke (2020), which is Choi’s dissertation preregistration on an investigation into video-conference language tests; Dezfuly and Archibald (2022), which is a preregistration of a study that investigated the constructs underlying a newly developed language and working memory test; and Kelly et al. (2022), which is a preregistration of a laboratory-based training study on L2 Japanese prosody perception. Note that researchers can embargo their registrations: That is, a researcher can have a registration logged and date/time stamped, but not have the registration viewable to the public for a set amount of time determined by the researcher. The researcher can also, at will, release the embargo earlier than originally set, or extend the length of the embargo. In addition, within the OSF, for example, researchers can generate a “view-only,” anonymized link to their registration, which they can include in their manuscript for peer review.
Badge 2: Publish the materials used in your study
To the extent possible, publish the background questionnaires, interview protocols, and even test items or tests that underlie your research. Publishing and sharing materials keep field members from reinventing the wheel and help the field build upon prior knowledge and grow. In language testing, not all test material can be shared, as it may be proprietary. This is okay. Researchers are only asked to share what they can. Researchers can put these materials in supplemental files or store them in open-access, public repositories that provide DOIs, such as the OSF, Humanities Commons (https://hcommons.org), or IRIS (https://www.iris-database.org/), with IRIS—formerly known as “Instruments for Research Into Second languages”—having been specifically created for applied linguists (Marsden & Mackey, 2014). Language testing researchers can also use other field-specific repositories, such as PsyArXiv (psychological science; https://psyarxiv.com/) and SocArXiv (social science; https://osf.io/preprints/socarxiv), both of which are hosted on the OSF, and both of which have “applied linguistics” and assessment-related subfield categorizations that researchers can select to identify their work. For a short list of other public science repositories from around the world, such as FigShare (http://figshare.com) and Zenodo (http://zenodo.org/), see Wilkinson et al., 2016. Four examples of open materials are Kremmel and Harding (2020), whose manuscript included a stable URL to the study’s main survey instrument; Rossi (2022), who posted a language assessment literacy test she created on the OSF; Hui (2022), who published his study’s reading-aloud listening test materials (plus the data and analysis code) on the OSF; and O’Reilly (n.d.), who published a test he used in his dissertation and two subsequent publications on IRIS. Such open collaboration is good for science and good for the researchers. With public deposits, materials can be accurately cited and openly utilized by others, with full credit given to the creator. A side effect of sharing materials is that studies that use the same materials are more directly comparable. Publishing one’s materials also specifically facilitates research replication.
Badge 3: Publish the anonymized data underlying your research
Sharing data is perhaps the most important aspect of Open Science. Sharing data provides resources for training, allows for secondary analyses, promotes reproducibility, and garners evidence-based trust. Participating in Open Science by publishing your data takes time and energy, but it is enriching as a process, informative as a scientific endeavor, and ensures others can use the data, and cite you, to continue to build (and test) theory. As described by the Interuniversity Consortium for Political and Social Research (ICPSR, n.d.), which is a public data archive supported by more than 750 academic research organizations and institutions around the world, “repositories that preserve and disseminate social and behavioral data perform a critical service to the scholarly community and to society at large by ensuring that these culturally significant materials are available in perpetuity” (p. 5). Publishing your data means boldly standing by your data, and inviting external researchers to scrutinize them (see O’Grady, 2023; Scheiber, 2023, for information on a recent, if rather sensational, data investigation). Secondary analysts in graduate classes can use your data for training, try out your analyses, or use them to answer new questions.
Publishing data requires early preparation. Begin at the research proposal stage with a robust data management plan that has the public sharing of the data as an end goal. Be sure to obtain the necessary consent from the test subjects and any associated organizations. When you publish the data, include a code book, which is a file containing information (i.e., metadata; data about the data) about each of the variables in the data set (a variable list, with definitions), including the variable name, its type (numeric or string), its scale (nominal, ordinal, continuous), its range, number of decimal points, and, if coded, what each numeric code represents.
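To make the idea concrete, here is a minimal sketch (my illustration, not from any published deposit) of a machine-readable code book built and saved with Python’s pandas library; all variable names, ranges, and codes are invented for the example.

```python
# Minimal sketch: a code book describing each variable in a hypothetical
# test data set, saved as a CSV to deposit alongside the data file.
import pandas as pd

codebook = pd.DataFrame(
    [
        # variable, type, scale, range, decimals, codes or label
        ("id", "string", "nominal", "P001-P350", 0, "anonymized participant ID"),
        ("listening", "numeric", "continuous", "0-100", 1, "listening test score"),
        ("cefr", "numeric", "ordinal", "1-6", 0, "1=A1, 2=A2, 3=B1, 4=B2, 5=C1, 6=C2"),
        ("l1_group", "numeric", "nominal", "1-3", 0, "1=Korean, 2=Spanish, 3=other"),
    ],
    columns=["variable", "type", "scale", "range", "decimals", "codes_or_label"],
)
codebook.to_csv("codebook.csv", index=False)  # deposit with the data set
```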
An excellent repository for one’s first data deposit is ICPSR, because when you prepare your data for deposit with ICPSR, you must fill out a data deposit template that helps you work through the process and understand the information needed for a robust and useful deposit. ICPSR has a free-to-download “Guide to Social Science Data Preparation and Archiving” (ICPSR, n.d.), which walks researchers through data sharing considerations, such as the cost of preparing and documenting your data, the future needs of the potential secondary users of the data, long-term archiving and preservation goals, as well as ethics and privacy, legal requirements, and quality assurance. Examples of data deposits using ICPSR include Ma (2021) and Winke et al. (2020). Other examples of shared data from published studies can be found in Burton (2020), Hui (2022), and Isbell and Son (2022), who used the OSF for their data deposits, and in Isbell and Lee (2022), who used IRIS. Researchers will find the ease of use and “one-stop shopping” aspect of depositing to the OSF a compelling reason to post there. For example, see Hui’s deposit, which combines the materials, data, and data analysis code from one project in one deposit. Researchers do not always have to decide, however, as most repositories allow authors to deposit to other repositories at the same time, and doing so can help with worldwide access to your work. For example, In’nami et al. (2022) published their data and data analysis code on both IRIS and the OSF, and explained in their article that they did so to take advantage of each repository’s unique features and reach. But regardless of the particular repository (or repositories) used, what is key in these studies is that the authors collected the data with public sharing as an initial goal and obtained informed consent. Moreover, the researchers all carefully anonymized the data to preserve confidentiality.
Consent and confidentiality are obviously critical concerns that can constrain your ability to share data. In considering those limits, it is important to distinguish operational testing from research-based testing. Data from operational testing will typically be more sensitive and have more protections. While such data may not be publicly shareable, they can sometimes still be borrowed or investigated in-house with proper controls, nondisclosure agreements, and protections in place. Test data from research projects can typically be shared more freely so long as the researcher collects the data with research-participant consent. But even then, the need to preserve confidentiality can affect what details may be shared. For example, researchers may be able to share all data with the identifying information stripped, or, when needed, may have to aggregate data by group. Researchers may need to employ techniques such as small cell-size suppression to protect the anonymity of single participants or small groups of participants. Small cell-size suppression is when any cell size of, for example, 1 to 10, is not reported (Centers for Medicare & Medicaid Services, 2020) and is replaced with an indication of suppressed data, such as an asterisk, to protect participants from reidentification. For example, a group of 3 students from Nepal in a data set might need to be described as a group of fewer than 10 students from an anonymized Asian country. Extreme care will need to be taken to prevent reidentification in published data sets of any kind, and sensitive data will simply have to be protected as nonpublishable. Thus, it is not wrong for the field of language testing to approach the Open Science agenda of data sharing carefully. At times, publishing data in language testing may mean partial data publication only after appropriate and strong levels of data anonymization.
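Small cell-size suppression is simple to apply programmatically. Below is a minimal sketch (my illustration, with an invented frequency table and a threshold of 10 chosen only for the example) of suppressing small counts before a table is shared.

```python
# Minimal sketch: replace counts below a threshold with an asterisk
# before sharing a frequency table, to reduce reidentification risk.
import pandas as pd

counts = pd.Series({"China": 142, "Brazil": 57, "Nepal": 3, "Turkey": 12})

THRESHOLD = 10  # suppress any cell smaller than this
shared = counts.astype(object).where(counts >= THRESHOLD, "*")
print(shared)
# Prints roughly:
# China     142
# Brazil     57
# Nepal       *
# Turkey     12
```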
Corporations and governments have a vital role to play in data sharing, since they have by far the largest data sets. Most language testing agencies and companies collect extremely valuable information from large volumes of language learners, data that could be—and often have been—mined to answer important questions about language assessment, language acquisition, and the brain. Thus, in the spirit of Open Science, language testing companies and government agencies should aspire to devise more transparent ways to allow researchers to borrow and analyze their treasure troves of test data, when possible, to allow for the investigation of language-learning questions to which society needs answers.
My state of Michigan provides an example of how this process can work. The Michigan Educational Research Initiative (MERI, https://miedresearch.org), which takes in requests for state test data, has researcher members and partners who investigate the data themselves, and who review research based on the data. MERI’s goal is to improve education, including English-language learner education, across the state. As part of this endeavor, MERI provides standardized data request forms on its website. Researchers can fill them out to ask for access to test data and certain participant-background data. Such borrowing privileges always come with strict restrictions. For example, a colleague and I (Winke & Zhang, 2019) were allowed to borrow MERI data after being certified in ethical data-borrowing practices and after signing data-borrowing and security agreements, including the condition that we could neither publish the data nor share them with anyone. While we could not publish the data, we could publish the data request form as we filled it out, thus informing researchers how to borrow the exact same data for replication purposes.
But while many language testing agencies do allow test data to be borrowed for research purposes, private companies’ data have typically been less available to the public, and their data-borrowing procedures have been less transparent. One possible improvement would be for participating companies to publish their data request forms on their websites so that external researchers could see at any time what types of data are available. A step further would be for the companies to describe to the public the data sets that are available and describe how representative the data available for borrowing are of the data that the company owns. For example, suppose that after obtaining proper consent and carefully checking data anonymity, a testing company decided to make 100,000 speech samples with select metadata available to the public, whether for free or for a fee. For that data to be fully useful, the company would need to describe how it selected those 100,000 samples from the larger pool, so that researchers could make appropriately generalizable inferences from their analyses. With that information available, our understanding of testing and learning could advance by leaps and bounds.
Likewise, language testing companies and government agencies that have practice or sample tests available could provide more transparent information to the public on how educators or researchers can request access to them or use them for research purposes. Many companies routinely publish practice or sample tests in books or online in PDF format, publish retired tests in full, or make trials of their online item types available on their websites. These have often become research materials. However, with tests increasingly offered solely online, online sample tests (with scores reported) are often harder to locate or require privately delivered access codes. Transparency about what is available, and how one must request access (what criteria are required of the researchers), would help the field advance.
Badge 4: Publish your data analysis code
Language testing researchers use code to analyze their data. Publishing that analysis code enables other researchers to retrace the steps of your research, and thus to more fully understand it. As summarized by In’nami et al. (2022), “sharing data and code makes it possible to examine the analytic robustness of results.” Committing to publishing your analytical code may even improve your work. As Tackett et al. (2019, p. 1389) wrote, one does not learn how messy one’s analytical code is until one thinks about posting it for others to see: “[Posting] minimizes individual differences in organization and reduces confusion for those who may later have to navigate our project (including our future selves).”
Finally, publishing code provides a useful learning resource for novice researchers and makes it easier for other researchers to accurately replicate your study, as they can faithfully implement your analyses line by line. In addition, some journals require it. For example, the journal Applied Psycholinguistics requires the publication of data analysis code for both original manuscripts and replication studies.
Publishing data analysis code does not have to be difficult. Data analysis software such as SPSS (https://www.ibm.com/products/spss-statistics) or JASP (https://jasp-stats.org/) allows researchers to download the code, even when the analysis is performed through drop-down menus. The code does not need to clog up the manuscript. Some researchers include the code as an appendix in the manuscript. Alternatively, it can be uploaded to a public repository, often together with the data. For examples of published code in applied linguistics and language testing, see Hui (2022) and Kang et al. (2023).
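For readers who want a concrete picture, here is a minimal sketch (my illustration, not any particular study’s deposited code) of the kind of commented, self-contained analysis script one might upload alongside a data set; the file name, variables, and the choice of a Cronbach’s alpha reliability analysis are assumptions for the example.

```python
# Minimal sketch: a depositable, commented analysis script that computes
# Cronbach's alpha from item-level test scores.
import pandas as pd

# Load the deposited item-level scores (one row per test taker,
# one column per item), as documented in the code book.
items = pd.read_csv("listening_items.csv")

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)
k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)
total_var = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"Items: {k}, N = {len(items)}, Cronbach's alpha = {alpha:.3f}")
```

A short script like this, with comments tying each step to the code book and manuscript, is often all that is needed for others to retrace an analysis.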
Step 3. Write transparently
The third Open Science practice is to report your research clearly and straightforwardly. Of course, what is “clear” and “straightforward” in a particular circumstance will depend on the culture of your institution and the subject of your research. Moreover, your supervisors, associations, funders, and editors may constrain your choices. Still, there are a few basic principles to follow.
First, be transparent about not just how the data were collected and analyzed, but about who did the work. Most research in our field is co-authored, and even single-authored studies often rely on the efforts of many people. Explain who did what tasks. Doing so helps replicators understand the resources needed to replicate and can help researchers and evaluators accurately understand the size and scope of the study. A few applied linguistics journals, including Language Learning and Language Testing, have adopted the CRediT Contributor Roles Taxonomy system (Allen et al., 2014; see also https://credit.niso.org), which has authors identify which roles they each undertook while conducting the research. Graduate programs and faculty review systems can start using CRediT too. Early users of this practice were applied linguists McDonough and Chaikitmongkol (2007, p. 117), who described their research-task divisions using their first names, as in this example: “As a final step in the analysis procedure, Wanpen read the entire corpus and checked the validity of the general themes and supporting segments that Kim had identified.” Transparent methodologies will particularly help junior scholars (Hui et al., 2023): When they read papers, they will learn about how research actually gets done, and when they serve as research assistants, they will get credit for their work—as is only fair.
Second, write simply. When possible, use the first person. Avoid jargon. Transparent writing will make the research more accessible. This can be counterintuitive for some of us in the field of language testing because we have spent our academic training and at least some of our professional lives preparing, teaching, or evaluating academic writing with rubrics that value complex sentence structures, uncommon vocabulary, and heavy use of the passive voice. Helpful guides on writing academic papers as simply as possible include the American Psychological Association (2020), Pinker (2014), and Sword (2012).
Step 4. Publish openly
The fourth Open Science practice is to get your work into the hands of the public, enabling as many people as possible to participate in up-to-date academic discussions and research. Research should be quickly and widely available even to researchers at institutions, many in the Global South, that lack the funding for language testing books and journals. They need access to language testing knowledge for the field to move forward with diverse voices.
The most obvious way to make your work widely available is to publish open access. The trend toward open-access publication is spreading across all fields of scholarship. In fact, open access and open data will be mandated for all federally funded research in the United States by 2025 (The White House, 2023). Open access is lauded as the ultimate goal by multigovernment coalitions such as the European Science Foundation’s cOAlition-S (https://www.coalition-s.org), which since 2021 has required that publications based on research funded by its member nations be published in open-access journals, platforms, or repositories. Thus, publishing open access is a good idea for junior, mid-career, and senior scholars.
Preferring open access, however, does not mean authors should simply go for any open-access publication. Quality still matters. Avoid predatory publishers that take an open-access fee (i.e., assess an article processing charge, or APC) but do not engage in rigorous peer review (their articles may, for example, be accepted within days). Recently, more than 50 fast-growing, open-access journals were stripped of their impact factors and removed from the Web of Science for having severely reduced the peer-review process, landing them in the predatory journal category (Brainard, 2023). A journal’s impact factor and its editorial board composition should indicate its peer-review quality. Journal impact factors are debated as a metric of quality for promotion and tenure (Brainard, 2023; Callaway, 2016), but still, the general aim is good: Researchers should strive for high quality and rigorous peer review, and strive for open access to their work.
But what if you are publishing in a hybrid journal and lack funding for the APC? For now, the best answer is to self-publish an early version of the paper in an open-access repository, such as the OSF, so that the information contained in the paper can be accessed by anyone, regardless of their economic situation and resource support network. There are basically two early versions of a manuscript that an author can self-publish: preprints and postprints. Preprints are manuscript versions prior to peer review (also called the original manuscript version). Postprints are manuscript versions that have been peer reviewed and accepted for publication, but not yet typeset or published (also called the accepted manuscript version). Within the field of applied linguistics, there is a strong call to publish postprints, with an online pledge available for those who commit to doing so (see Al-Hoorie & Hiver, 2023). Both preprints and postprints can be posted on preprint servers such as the OSF’s preprint server (https://osf.io/preprints/). A difference is that many publishers place their own restrictions on publishing postprints, including Sage, the publisher of Language Testing. Each publisher normally posts its preprint and postprint restrictions (see Sage, n.d., as an example), and it is incumbent on authors to inform themselves of their publisher’s restrictions. For example, Sage’s (n.d.) online information underscores that preprints can be posted immediately with no restrictions, while postprints can be published immediately on one’s institutional repository or department website, but on a public repository outside one’s institution only 12 months after publication (i.e., after a 12-month embargo). In the OSF, an author can upload a new version of a preprint: Each revision is dated and time stamped, and older versions remain immutable and accessible at the posting site, which has a single DOI. Thus, an author can update their preprint to a postprint over time, rendering the main difference between preprint and postprint one of timing relative to peer review. Other private repositories, such as Elsevier’s Social Science Research Network, Germany-based ResearchGate.net, or US-based Academia.edu, can also be used for publishing preprints or postprints, but these may require membership to download or may not be as well aligned with Open Science principles as others are. For example, Fitzpatrick (2015) summarized a longevity issue with for-profit, venture-capital-backed academic-sharing platforms such as Academia.edu: “There are a limited number of options for the network’s future: at some point, it will be required to turn a profit, or it will be sold for parts, or it will shut down.”
Still, there are pros and cons related to each repository. In addition to longevity, another factor to consider is reach. Many researchers around the world, and especially those in the Global South, are at universities whose libraries do not subscribe to searchable indexes, or they live in countries where there are restrictions on Internet searches; Google and Google Scholar, for example, may be banned. Thus, many researchers search the public and private research repositories themselves as indexes of research in applied linguistics and language testing. For example, researchers in Africa may be more familiar with ResearchGate, and thus use it when searching for articles and works on specific language testing topics. Researchers in South America may be more apt to search Google Scholar (which captures the OSF and other well-indexed repositories), IRIS, or Academia.edu for research within the field. Ergo, choosing which repository to use is similar to choosing which journal to publish in or which social media platform to use: There are many factors to consider, and longevity and reach are just two of them.
The field of language testing as a whole could also better capitalize on public repositories. Conferences within the field could ask presenters to voluntarily upload their materials to the OSF or other repositories with a hashtag like #LTRC2026, so that the collection could be viewed comprehensively without fees to readers. Indeed, an excellent example of this type of work comes from the British Association for Applied Linguistics (BAAL), which is using OSF Meetings to provide public access to BAAL 2023 conference materials (https://osf.io/meetings/BAAL2023). Humanities Commons can be used similarly and, in addition, provides space for entities such as interest groups to have members, a shared multi-member-curated website, and deposits (see the Society for Music Theory’s Popular Music Interest Group’s website at https://smtpmig.hcommons.org/ as an example). Book editors could do similar cross-author, open-access work: Editors could ask all contributing chapter authors to upload their chapter postprints to a public repository with a shared search term (or connected through a shared project), so that those unable to access the full book due to a lack of funding could still read its chapters. Open-access book publishing with standard publishing houses currently costs approximately 8,000 to 24,000 USD per book. Thus, unless funding is obtained to cover such fees, postprint publishing on open-access repositories, when allowed, may be a more feasible, full-access option for books.
Both for individual researchers and the field as a whole, placing language testing work in open-access, public repositories makes language testing knowledge available to everyone, while still allowing for rigorous peer review. Other Open Science goals in the field of language testing should include promoting the field’s diamond open-access publication venues (see Al-Hoorie, n.d., for a list of diamond open-access journals in the field of applied linguistics) and encouraging more fully open-access publishing opportunities within the field of language testing wherever we can.
Conclusion
My main goal in writing this viewpoint was to describe Open Science for language testing to help language testers more readily join the Open Science movement. I hope that by reading this viewpoint, language testers will agree that Open Science supports the growth of language testing through its goals of open knowledge, collaboration, and transparent processes. Coombe et al. (2020, p. 11) wrote that the field of language testing has recognized that language assessment literacy is multifaceted, and because of that, there is a “lack of unanimity within the professional assessment community as to what shapes the assessment knowledge that will be passed on to future experts in the field.” Open Science will help. The Open Science mandate is to pass on all that you can by making language testing information transparent and open, without delay. Open Science will put language testing’s focus on quality, not quantity (Marsden & Morgan-Short, 2023; National Academies of Sciences, Engineering, and Medicine, 2020, p. 23). As Marsden and Morgan-Short (2023) described, Open Science practices, when fully embraced, can be time-consuming, but the reward is better science for all.
The main winner with increased Open Science practices within the field of language testing will be the language testing researcher, policy maker, or test stakeholder who otherwise does not have access to language testing’s vast body of research and teachings due to a lack of funding, library services, and/or resources. As language testers, our goal is to scientifically and socially guide the design of language assessments and the uses of scores from them. Open Science will help us do this as part of a humanistic drive to make language education and assessment more accessible, comprehensible, and fair.
Acknowledgements
Thank you to Dylan Burton, Xiaowan Zhang, and Wenyue Ma, whose discussions with me on Open Science helped me write this Viewpoint. Thank you also to Luke Harding, Emma Marsden, and Kara Morgan-Short for a valuable lunch discussion on Open Science at the American Association of Applied Linguistics (AAAL) conference in 2019. Thank you to Ali Al-Hoorie for advice on Open Science given at AAAL in 2023.
Author Contributions
Paula Winke: Conceptualization; Investigation; Methodology; Resources; Writing—original draft; Writing—review & editing
Declaration of Conflicting Interests
The author declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Paula Winke was Co-Editor of the journal, Language Testing, from 2019 to 2023. Recently, she has served as an advisor to the US Foreign Service Institute and as a senior editor within the British Council. She was Daniel R. Isbell’s dissertation advisor at Michigan State University and will be spending a sabbatical year with the Language Testing Research Group Innsbruck (LTRGI), of which Benjamin Kremmel is the director, during the 2024-25 academic year.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author received financial support for the publication of this article from Michigan State University (MSU) and the MSU College of Arts and Letters.
