Abstract
Sharing research data promotes transparency and cumulative knowledge building. This is especially relevant in teaching effectiveness research, where acquiring classroom data is challenging. Despite these benefits, previous studies indicate low data availability in educational research. This study examines data sharing practices in 167 studies, identified via a two-step process: first, locating meta-analyses in teaching effectiveness research, and second, scanning them for articles meeting specific criteria (longitudinal design, peer review, primary data, publication year). For each article, we checked for a repository link and contacted corresponding authors if none was provided. Only 13 publications included a repository link. One-third of the contacted researchers had outdated contact details. Of the remaining researchers, 36 responded, with 9 agreeing to share their data. The overall availability rate was 13.2%, with the odds of data sharing decreasing by 19% annually post-publication. Our results underscore the need to archive data in repositories, as availability upon request remains limited.
Keywords
Introduction
Sharing research data for secondary analyses is crucial for advancing science in general as it fosters an environment of transparency, reproducibility, and cumulative knowledge building (Kowalczyk & Shankar, 2011). By making empirical data openly available according to the FAIR principles (Findable, Accessible, Interoperable, Reusable), researchers can verify results through replication studies (Ioannidis, 2005; J. A. R. Logan et al., 2021; Open Science Collaboration, 2015; Pashler & Wagenmakers, 2012; Vanpaemel et al., 2015). Beyond replication, open data sharing enhances the visibility and impact of researchers’ work (Piwowar et al., 2007) and allows third-party researchers to explore research questions that extend beyond the original purpose of the data collection (Borgman, 2012).
In the context of teaching effectiveness research, a field that examines how variations in teaching practices relate to student learning and other educational outcomes (Charalambous & Praetorius, 2025), data sharing is especially important. This research area relies on diverse and complex datasets (D. K. Cohen et al., 2003) and commonly adopts longitudinal designs that allow researchers to observe changes in student outcomes over time (e.g., Dimosthenous et al., 2020; Kyriakides & Creemers, 2008). Longitudinal designs are viewed as the minimum requirement for causal inference in teaching effectiveness research, as they help disentangle teaching influences from other factors (Goldstein, 1997; Grützmacher et al., 2025).
Given the significant effort and resources required to gather high-quality longitudinal data, sharing these datasets maximizes their value while supporting a collaborative approach to advancing teaching effectiveness research (Owan & Bassey, 2019). The importance of data sharing is further underscored by inconsistent effects reported across multiple meta-analyses (e.g., J. Hattie, 2009; Kyriakides et al., 2013; Praetorius & Charalambous, 2018; Seidel & Shavelson, 2007), which emphasize the need for broader data access to enable replication and identification of moderating factors (Bayer et al., 2023).
Despite the numerous advantages of data sharing, several studies reveal low sharing rates in education and related fields (e.g., Huff & Bongartz, 2023; Tedersoo et al., 2021). Within the field of teaching effectiveness research, this study aims to assess the frequency of data sharing, identify challenges inherent to the sharing process, and document potential concerns or reasons for denying access to data. To achieve this, we attempted to obtain research data related to teaching effectiveness by contacting over 100 researchers via e-mail.
Data Sharing Policies and Transparency
Journals play a crucial role in advancing data sharing practices by encouraging or requiring researchers to share not only their data but also accompanying materials such as measurement instruments, codebooks, and syntax (Martone et al., 2018; Molloy, 2011). The Transparency and Openness Promotion (TOP) guidelines outline various levels of data transparency that journals can adopt, ranging from not addressing data sharing to mandating researchers to deposit data in a trusted repository and reproduce their findings through independent investigation (Nosek et al., 2015). Among these standards, publishing data in a trusted repository is regarded as one of the highest levels of transparency.
Many journals now mandate authors to include data availability statements, either by providing a URL to the dataset or indicating that data can be accessed upon reasonable request (Houtkoop et al., 2018; Hussey, 2023). However, standards for data publication vary. For example, the Journal of Educational Psychology (2024) mandates that all relevant data be made publicly accessible to facilitate unrestricted replication, while Learning and Instruction (2024) merely encourages data sharing.
Beyond scientific journals, in recent years, research funders have introduced increasingly stringent data sharing requirements, further reinforcing top-down transparency incentives. For example, following a 2022 memo by the U.S. Office of Science and Technology calling for immediate public access to funded publications and data, agencies such as the National Institutes of Health (NIH), the National Science Foundation (NSF), and the Institute of Education Sciences (IES) now require detailed data management and sharing plans with public data availability and minimal embargo periods (Institute of Education Sciences [IES], 2023; National Institutes of Health [NIH], 2023; National Science Foundation [NSF], 2025; Nelson, 2022). Similarly, the German Research Foundation (DFG) and the Federal Ministry of Education and Research promote open data practices, with the DFG recommending data publication in trusted repositories as a crucial part of its guidelines for good scientific practice (Federal Ministry of Education and Research, n.d.; German Research Foundation, 2024).
These policies reflect the principle stated by the Horizon 2020 program, an EU-funded research and innovation initiative, that data should be “as open as possible, as closed as necessary” (European Commission, 2016, p. 4). Furthermore, in situations where data cannot be publicly shared immediately, it is generally expected that the data be made available upon reasonable request. This expectation is reflected in professional guidelines, such as those outlined by the American Psychological Association ([APA], 2020). In light of these established expectations and policies regarding data sharing, numerous studies have empirically examined actual data sharing practices.
Research on Data Sharing
Data availability investigations generally differentiate between research data that is made available via a repository and data that is accessible only upon contacting the corresponding or first author of a publication. For example, Tedersoo et al. (2021) distinguish between initial data availability (i.e., finding data before contacting researchers) and final data availability (i.e., data received after contacting authors). Their findings indicate that initial data availability in the field of psychology varied between 10% and 60%, depending on whether the data were published between 2001 and 2009 or 2010 and 2019. However, Tedersoo et al.’s study focused solely on two high-impact journal groups, Science and Nature, which have stringent data availability policies and may not be representative of all scientific journals.
Early investigations into data sharing practices, particularly those focused on availability upon request, date back several decades. In one of the earliest notable instances, Wolins (1962) revealed that only a small fraction of researchers in psychology were willing to share their data. Specifically, Wolins described a master’s degree student’s attempt to acquire research data for their thesis from 37 authors whose work appeared in APA journals between 1959 and 1961. Notably, only 9 researchers (28%) shared their data, with 21 reporting their data as misplaced, destroyed, or lost. Subsequent investigations in psychology by Craig and Reese (1973) and Wicherts et al. (2006) further emphasized the challenges associated with data sharing. Craig and Reese (1973) managed to obtain 38% of research data, with data availability ranging between 30% and 75%, depending on the journal. Meanwhile, Wicherts et al. (2006) contacted the corresponding authors of every article in the last two 2004 issues of four APA journals, receiving data from 27% of them.
Despite increased awareness, especially in the past decade, data availability has remained relatively stagnant. Recent findings show availability rates of 40% in the social sciences (Tedersoo et al., 2021) and 32% in psychology and psychiatry (Hardwicke & Ioannidis, 2018). Similarly, Towse et al. (2021) found that the prevalence of public data was low, with an availability rate of 4% across 15 psychology journals spanning various subfields. Data availability upon reasonable request (i.e., via contacting the corresponding researcher) faces an additional challenge: outdated e-mail addresses and changes in academic affiliation. Vines et al. (2014) found that in biological research, contacting authors for data became increasingly difficult as articles aged, with the odds of finding a functional e-mail for the first, last, or corresponding author decreasing by 7% per year. Consequently, the odds of obtaining data were strongly correlated with article age, showing a 17% decline per year in the likelihood of gaining access (Vines et al., 2014). Similarly, Hussey (2023) found that it was sometimes impossible to contact corresponding authors in psychology. Additionally, there is a trend suggesting that more experienced academics are less likely to respond to e-mail requests across several disciplines, including psychology and sociology, with response rates declining over time (Allen et al., 2015).
With respect to teaching effectiveness research, studies directly investigating data availability remain scarce. However, the limited research in educational fields reveals similar patterns of low data sharing. For instance, in a survey of 1400 researchers who published in the American Education Research Association journals between 2008 and 2018, around 80% reported never or only once sharing their research data (Makel et al., 2021). Furthermore, in a more recent qualitative analysis, Makel et al. (2025) found that even educational researchers who supported openness faced practical, cultural or institutional barriers that discouraged actual data sharing behavior. Focusing on the availability of research data through repositories, Huff and Bongartz (2023) reported that, in 2020, the availability of research data through repository links was 7.16% across five journals within the field of educational psychology.
To address these challenges of data availability, many journals have established guidelines regarding the availability of research data. However, recent studies suggest that these mechanisms have been ineffective. For instance, in psychology, Hussey (2023) found that the presence of a data availability statement in articles did not correlate with actual data availability, and articles with such statements had lower data availability rates. Furthermore, the transparency level of educational psychology journals did not correlate with actual data availability (Huff & Bongartz, 2023). Taken together, these findings indicate that the implementation of data sharing policies in academic publishing, including in educational research, has not consistently led to improved data availability in practice (Huff & Bongartz, 2023; Hussey, 2023; Makel et al., 2021).
Obstacles to Data Sharing
A multitude of empirical studies have identified a wide range of obstacles to data sharing, which can be broadly grouped into two categories: formal challenges and concerns about repercussions.
Formal challenges include factors such as insufficient time and funding needed to make data accessible. Tenopir et al. (2011) found that these resource constraints are among the most common barriers to open data practices across several disciplines. In a similar vein, Schmidt et al. (2016) noted that in global environmental research, the lack of enforceable mandates and adequate support infrastructure hampered effective data sharing despite policy emphasis on openness. Other reasons include the absence of standardized data publishing protocols and the unavailability of appropriate repositories for data storage (Tenopir et al., 2011). Moreover, privacy and legal considerations also play a crucial role in researchers’ decisions to withhold data, particularly when the data involves human subjects. Sensitive personal information in these datasets is often subject to strict ethical and legal concerns related to consent and confidentiality (Tedersoo et al., 2021; Tenopir et al., 2011). Additionally, study sponsors, especially those from industry, may hinder data sharing, as they are less likely to agree to release raw data (Zuiderwijk et al., 2020).
Beyond these formal barriers, researchers often express concerns about the social and professional consequences of data sharing. In their qualitative investigation, Cheah et al. (2015) found that many medical researchers were concerned about not being properly acknowledged when others used their primary data, raising concerns that secondary research might undermine the original researchers’ investment of time and effort. Survey research provides additional evidence for these barriers. In a study of over 600 psychologists, Houtkoop et al. (2018) found that both perceived time limitations and concerns about research data appropriation represented significant obstacles to data sharing practices. Similarly, educational researchers report limited knowledge and training regarding open science practices, despite holding favorable attitudes toward transparency (Fleming et al., 2024; J. A. Logan et al., 2025). Personal interests have been found to influence data sharing as well, with researchers expressing reluctance to share data that might challenge their interpretations or original conclusions. Concerns often center around the possibility that secondary analyses, especially if done poorly, may lead to misinterpretations (Cheah et al., 2015; Tedersoo et al., 2021).
Overall, a low level of trust between researchers exacerbates these concerns. Many researchers across disciplines such as economics, ecology, sociology and the social sciences fear data misuse, especially when they lack information about the data requester (Enke et al., 2012; Fecher et al., 2015; Zuiderwijk et al., 2020). Additionally, some researchers, including educational researchers, are hesitant to lose control over their data, fearing that sharing it might limit their ability to explore future research questions (Cheah et al., 2015; J. A. Logan et al., 2025). These documented challenges create persistent barriers that hinder the adoption of open data practices despite the recognized benefits of data sharing.
Relevance of Data Sharing in Teaching Effectiveness Research
Data availability in teaching effectiveness research is crucial as it directly impacts the ability to assess and improve educational practices (Darling-Hammond, 2010a). Unlike data availability in general educational research, which encompasses a broad spectrum of variables and outcomes measured on different institutional levels (school-, city- or state-level), teaching effectiveness research covers nuanced metrics immediately relevant to the classroom itself, typically within a longitudinal setting (Grützmacher et al., 2025). These include instructional methods, teacher-student interactions, and classroom management strategies (e.g., Cheung & Slavin, 2016; J. A. Hattie, 2023; Thi & Nguyen, 2021). Access to comprehensive, high-quality data holds practical significance as it enables researchers to perform detailed analyses, identify best practices, and formulate evidence-based recommendations for teacher training and professional development (Darling-Hammond, 2010a, 2010b). Additionally, transparent and accessible data on teaching effectiveness assists policymakers in making informed decisions that enhance educational outcomes (J. Hattie, 2009). This targeted focus is essential for advancing pedagogical techniques and ensuring that students receive the highest standard of education. As highlighted by Hanushek and Rivkin (2010), robust data on teaching effectiveness is fundamental for understanding the complexities of teaching and learning, thereby contributing to the broader goal of improving education.
Research Questions
In this study, we outline the process of data acquisition in the field of teaching effectiveness research while also examining the extent to which current journals mandate open data sharing practices. Additionally, we investigate the reasons researchers provide for refusing to share data.
Previous studies have shown that data availability in psychology, the social sciences, and related fields hovers around 30% to 40% (e.g., Hardwicke & Ioannidis, 2018; Tedersoo et al., 2021). However, understanding the availability of data in teaching effectiveness research is important, as collecting longitudinal classroom-level data is both highly challenging and essential for improving teaching quality and, by extension, educational outcomes. Consequently, our first research question is:
R1: What is the overall availability of research data, both in repositories and through data-sharing requests, as well as the response rate to such requests, in the field of teaching effectiveness research?
Researchers often cite various reasons for their inability to share data, including a lack of time or resources and concerns about potential repercussions. With increased awareness of the replication crisis and the growing prevalence of journal requirements for data availability statements, it is worth exploring whether the reasons for refusing to share data have evolved. This brings us to our second research question:
R2: What reasons do researchers provide for refusing to share their data?
In related fields, studies have demonstrated that factors such as article age influence the likelihood of obtaining research data (e.g., Huff & Bongartz, 2023; Vines et al., 2014). Beyond article age, we aim to investigate whether requests for multiple datasets or researchers’ academic experience affect the likelihood of data acquisition. Accordingly, our third research question is:
R3: How is data availability impacted by publication date, requests for multiple datasets, and researchers’ academic experience?
Finally, the extent to which journals enforce data-sharing policies in teaching effectiveness research remains unclear. While some journals merely encourage voluntary data sharing, others mandate data publication in trusted repositories. This brings us to our fourth research question:
R4: What data sharing policies are implemented in journals focused on teaching effectiveness research?
Method
Study Search and Selection
We identified a sample of teaching effectiveness research publications to investigate data availability. These publications were gathered in two steps: first, by identifying relevant meta-analyses, and second, by selecting studies synthesized in these meta-analyses. We began our search with meta-analyses because they provide a comprehensive overview of existing research, offering a robust foundation for identifying relevant literature.

For the first step of identifying meta-analyses, we utilized search strings that combined terms such as “meta-analysis” with the terms “educational effectiveness,” “teaching,” “teaching quality,” “school effectiveness,” “teaching cognitive activation,” “teaching support,” “student support,” “cognitive activation,” “teaching content knowledge,” “classroom management,” “student outcomes,” and “teaching effectiveness” (for details see appendix A). The search terms were chosen based on the dimensions and subdimensions of the three basic dimensions of teaching quality model (TBD; Klieme et al., 2001). The TBD model differentiates between three core dimensions of teaching quality: classroom management (i.e., ensuring effective use of time and minimizing disruptions), student support (i.e., building a supportive teacher-student relationship and fostering motivation and interest), and cognitive activation (i.e., challenging students intellectually and promoting deep understanding). We chose the TBD model as the basis for our search strategy because (1) it reflects a generic, non-subject-specific framework of teaching quality and (2) it is widely used and one of the most prominent frameworks of teaching quality, especially in German-speaking countries (Praetorius et al., 2018).

We searched for teaching effectiveness meta-analyses on Google Scholar, Sage Advanced Search, and APA PsycNet. As a first screening step, we assessed the relevance of each meta-analysis based on its title and abstract.
To be included, a meta-analysis needed to meet the following criteria:
The meta-analysis included studies on teaching effectiveness, implying an investigation into the effect of at least one aspect of teaching quality on student outcomes (e.g., motivation, performance, or interest).
The publication was available in either English or German.
The meta-analysis was published in 2018 or later. We chose 2018 as a cut-off to avoid outdated meta-analyses and to increase the likelihood of acquiring the data (see Vines et al., 2014).
The meta-analysis was published in a peer-reviewed journal.
Beyond the aforementioned inclusion criteria, we did not restrict our search to specific student populations such as primary or secondary school attendees, those with special needs, gifted individuals, or at-risk students.
In the second step of our search, we analyzed the literature referenced in each meta-analysis to identify potentially relevant articles for our overarching research endeavor. For this phase, we included only articles that met the following criteria:
The article was published in 2000 or later.
The article was published in a peer-reviewed journal.
The study used a longitudinal design, measuring the student outcome variable at a minimum of two time points.
The article was based on primary data collection.
Overall, we found 13 meta-analyses and identified 167 peer-reviewed articles among the more than 800 papers cited (see Figure 1).

Figure 1
PRISMA Flow Chart for Article Selection Process Across Meta-analyses.
Data Availability and Contacting the Authors
Subsequently, we scanned each article for details regarding research data availability. Our process involved searching for a data availability statement. If no explicit statement was present, we thoroughly examined the entire article for any mention of data access. If a URL to a repository was provided, we considered our search complete at that point. In cases where no information was found within the text, we sought out the corresponding author to ask about data availability. If no corresponding author was designated, we searched for the first author’s contact information, either within the article or online.
Of the 167 peer-reviewed articles, 13 included a statement linking to a data repository. Consequently, we only reached out to the corresponding or first authors of the remaining 154 publications. Of these, several publications had the same corresponding or first author. We contacted these authors only once for all their publications. This resulted in 132 data-sharing requests.
The template for the e-mail we sent can be found in the supplemental material (see appendix B). In our inquiry, we first introduced ourselves and then outlined our request for access to their research data. We provided detailed information about our goals and the types of analyses we planned to conduct. Notably, we made actual requests for data that we intended to analyze if provided, rather than hypothetical inquiries about researchers’ willingness to share their data. We assured the recipients that our analyses would not be linked to their original study findings, ensuring that their work would not be criticized or affected. Additionally, we committed to storing the data securely, not sharing it with others, and adhering to any additional conditions they might set. If they were unable to share the data, we asked for a brief explanation. Lastly, we offered to arrange a virtual meeting to address any questions or provide further clarification, if necessary.
After our initial contact attempt, we followed up with the authors two more times, with intervals of approximately two to three weeks between each attempt, if no response was received to our earlier query. We chose a 60-day cut-off for non-response, aligning with previous research (Hussey, 2023; Tedersoo et al., 2021).
Initially, our plan was to rely solely on the contact information provided within the article, seeking updated information for the corresponding or first author only if the initial e-mail was no longer valid. Given the nature of academic careers, where researchers often change institutions, it is not surprising that we observed a low response rate of around 10% during the first wave of contacts. In response, we began proactively checking for updated e-mail addresses. This involved reviewing the institutional websites associated with the original e-mail addresses for any updated information. Additionally, we searched for the authors’ profiles on platforms such as LinkedIn and ResearchGate, or for their personal professional websites via a Google search. If we found more recent contact information and the author had not responded to our previous messages, we reached out again using the new e-mail address and considered the response timer reset from that point onward. Therefore, in select cases, we contacted corresponding authors up to four times, not counting the initial attempt to an e-mail address that was presumably no longer in use.
Data Transparency Level
We examined the data availability policies of the journals where the 167 articles were published, using information from their respective websites. These policies were categorized according to the data transparency levels established by Huff and Bongartz (2023) and Nosek et al. (2015; see appendix table C1). We differentiated between four levels of data transparency, with the lowest level of 0 being assigned when the journal’s policy simply encourages researchers to share their data or does not mention data availability at all. A transparency level of 1 was assigned if the journal mandates a statement on data availability that describes how to gain access (e.g., via a trusted repository or upon reasonable request from a corresponding author). We assigned a data transparency level of 2 if the policy mandates researchers to publish their data in a trusted repository, with exceptions stated. Lastly, the highest transparency level of 3 was assigned if the journal required researchers to publish their data in a trusted repository and report an independent reproduction of the reported analyses prior to publication.
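The four-level scheme described above can be expressed as a simple decision rule. The function below is a hypothetical helper, not part of our actual coding procedure (which was done manually from journal websites); it assumes three policy features have been coded per journal.

```python
def transparency_level(mandates_statement: bool,
                       mandates_repository: bool,
                       requires_reproduction: bool) -> int:
    """Map coded journal policy features to transparency levels 0-3.

    Hypothetical illustration of the scheme described in the text;
    the actual categorization was performed manually.
    """
    if mandates_repository and requires_reproduction:
        return 3  # repository deposit plus independent reproduction required
    if mandates_repository:
        return 2  # repository deposit mandated (exceptions possible)
    if mandates_statement:
        return 1  # data availability statement required
    return 0      # sharing merely encouraged or not addressed

# e.g., a journal that only mandates an availability statement
print(transparency_level(True, False, False))  # 1
```

Note that the levels are ordered: each higher level subsumes the requirements of the one below it, which is why the checks cascade from strictest to weakest.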
Statistical Analyses
To address the first research question of overall data availability and response rate, we present descriptive statistics on the frequency of researchers’ responses and their agreement to share data. In alignment with Tedersoo et al. (2021), we distinguish between initial and final data availability. Initial data availability refers to cases where research data was accessible without needing to contact the corresponding or first author. Final data availability includes both data initially available and data obtained through contacting the corresponding or first author. For the second research question, which examines researchers’ reasons for refusing to share data, we provide a categorized qualitative summary of the reasons researchers gave for being unable to share the data. To address the third research question, we analyze how article age, the request for single or multiple datasets, and the researcher’s academic experience affect the likelihood of (1) locating a URL repository, (2) finding a functional e-mail, (3) receiving a response from the researcher, (4) obtaining a positive reply to our request, and (5) successfully acquiring research data overall. We conducted logistic regressions to model these relationships. Notably, the unit of analysis is the article in models (1) and (5) and the researcher in models (2), (3), and (4).
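Our models were fit in R; as a language-agnostic sketch, the snippet below fits one such logistic model (availability as a function of article age) by iteratively reweighted least squares on simulated data. All variable names and the simulated odds ratio of 0.81 per year are illustrative, not the study’s estimates.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression via iteratively reweighted least squares
    (Newton's method); X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # predicted probabilities
        W = p * (1.0 - p)                      # IRLS weights
        XtWX = X.T @ (X * W[:, None])
        beta = beta + np.linalg.solve(XtWX, X.T @ (y - p))
    return beta

# Simulated data: availability declines with article age at a true odds
# ratio of 0.81 per year, i.e., a 19% annual drop in the odds of sharing.
rng = np.random.default_rng(0)
n = 5000
age = rng.uniform(0, 20, n)                    # years since publication
logit = -0.5 + np.log(0.81) * age
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
X = np.column_stack([np.ones(n), age])         # intercept + age

beta = fit_logistic(X, y)
odds_ratio = float(np.exp(beta[1]))            # multiplicative change in odds per year
print(f"estimated odds ratio per year: {odds_ratio:.2f}")
```

Exponentiating the age coefficient yields the per-year odds ratio, which is how a figure such as "a 19% annual decline in the odds" is obtained from a fitted coefficient.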
Since logistic regression models are highly susceptible to extreme points in both the dependent and independent variables (Pregibon, 1981), we performed comprehensive regression diagnostics. Specifically, we computed hat values, studentized residuals, and Cook’s distance to identify influential data points. Extreme observations meeting our criteria were removed, and the logistic regression analyses were re-run; any changes in the results following this adjustment are reported. Our criteria for identifying extreme data points were: (1) a hat value greater than twice the average hat value, (2) a studentized residual outside the range [−2; 2], or (3) a Cook’s distance greater than 1 (J. Cohen et al., 2013; Fox, 2016; Fox & Weisberg, 2019).
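As a sketch of these flagging rules, the snippet below computes the three diagnostics for an ordinary least-squares fit on simulated data with one planted influential point; the study applies analogous diagnostics to its logistic models, and the data here are purely illustrative.

```python
import numpy as np

def regression_diagnostics(X, y):
    """Hat values, externally studentized residuals, and Cook's distance
    for an OLS fit, flagged with the thresholds described in the text:
    h > 2 * mean(h), |t| > 2, or Cook's distance > 1."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, xtx_inv, X)   # leverages: diag of hat matrix
    beta = xtx_inv @ X.T @ y
    e = y - X @ beta                              # raw residuals
    sse = e @ e
    s2 = sse / (n - k)                            # residual variance
    # externally studentized residuals (leave-one-out error variance)
    s2_loo = (sse - e**2 / (1 - h)) / (n - k - 1)
    t = e / np.sqrt(s2_loo * (1 - h))
    cooks = e**2 * h / (k * s2 * (1 - h) ** 2)    # Cook's distance
    flagged = (h > 2 * h.mean()) | (np.abs(t) > 2) | (cooks > 1)
    return h, t, cooks, flagged

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 30)
y = 2 + 3 * x + rng.normal(0, 0.1, 30)
# plant one influential point far from the bulk of the data
x = np.append(x, 3.0)
y = np.append(y, 20.0)
X = np.column_stack([np.ones(x.size), x])

h, t, cooks, flagged = regression_diagnostics(X, y)
print("planted point flagged:", bool(flagged[-1]))
```

The planted point sits far from the bulk of the predictor values, so its hat value alone exceeds twice the average leverage and triggers the flag.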
Lastly, to answer the fourth research question, which examines the open data policies implemented in journals, we present descriptive statistics on the distribution of data sharing policies across journals. All analyses were conducted using R software (R Core Team, 2024). The data and code that support the findings of this study are openly available in OSF (doi.org/10.17605/OSF.IO/R8CZP).
Results
Initial and Final Data Availability
Of the 167 articles included in this investigation, 13 (7.8%) included a data availability statement with a URL leading to a repository (see Figure 2). For the remaining 154 publications, we sent out 132 data sharing requests after accounting for instances where individuals authored multiple publications. Notably, 33 (25%) of the 132 contact details were inactive, resulting in error messages after we sent out the e-mail. For these cases, we searched online for updated contact information and found that for 17 researchers, no working contact information could be located, reducing our sample size to 115 researchers contacted. To ensure accuracy, we verified the provided contact details of all e-mail addresses for which we received no error message, checking if they were up to date. Out of the 115 e-mails, 11 (9.6%) e-mail addresses were outdated due to researchers changing institutions. This means that a total of 44 (33.3%) contact details were inactive or outdated.

Figure 2
Sankey Plot Representing the Data Acquisition Process Across the 167 Publications.
Regarding overall data availability and response rate (RQ1), of the 115 authors contacted, 36 researchers (31.3%) responded to our data-sharing request, but only 9 (7.8% of those contacted) agreed to share their data. Notably, four of the 36 researchers responded after the 60-day cut-off, and all of them were unable to share their data. Additionally, 9 publications mentioned that data would be available upon request. Of these, one researcher provided the data upon our request. Overall, we achieved a final data availability rate of 13.2% (22 of 167 datasets), combining the 13 datasets initially available through repository URLs with the 9 datasets acquired through contacting the corresponding authors of the remaining publications. On average, researchers responded to our request within 15 days (SD = 20, Md = 8). The 27 researchers who responded but could not share their data provided several reasons, which we describe in the following section.
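The headline rates can be reproduced from the reported counts; note that the two 7.8% figures have different denominators (all screened articles vs. contacted authors), which is purely coincidental:

```python
# Counts reported above
articles = 167          # peer-reviewed articles screened
repo_linked = 13        # articles with a repository URL
contacted = 115         # authors with a working e-mail address
shared_on_request = 9   # authors who agreed to share data

initial_rate = repo_linked / articles                      # 13/167
request_rate = shared_on_request / contacted               # 9/115
final_rate = (repo_linked + shared_on_request) / articles  # 22/167

print(round(100 * initial_rate, 1))  # 7.8
print(round(100 * request_rate, 1))  # 7.8
print(round(100 * final_rate, 1))    # 13.2
```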
Reasons for Denying Data Access
For our investigation into the reasons why researchers may be unable to share data (RQ2), we asked researchers to provide a reason if they declined our initial request. Six researchers gave no specific reason, simply stating their inability to provide the data. Seventeen provided reasons, which we summarize qualitatively below (see Figure 3). Note that some researchers offered multiple reasons, which is why the number of reasons does not match the number of rejections.

Figure 3. Summary of the Reasons for Refusing Data Access Provided by Researchers.
Retirement and Institutional Change
By far the most prevalent reason for refusal was retirement or leaving academia, cited by nine researchers. In one instance, the researcher mentioned that their co-author should have access to the data; after looking up the co-author's contact information, we found that they had also retired. Of the nine, two researchers reported changing institutions and consequently losing access to their research data. One provided the contact information for their previous institution, but we have not received a response to our subsequent request.
Data Loss or Unplanned Destruction
In three cases, the data had been destroyed: in one case, the data was accidentally deleted; in two others, the hardware storing the data was damaged. Two more researchers stated that they were unable to locate their research data.
Planned Data Destruction, Legal Restrictions, and Lack of Resources
In four cases, the destruction of the data was by design, as the informed consent form stipulated that the research data would be destroyed after a certain period. Furthermore, three researchers stated that they were unable to share their data due to very strict guidelines of their institutional review board and/or because their informed consent form explicitly stated that the research data would not be provided to third parties. Lastly, of the 27 rejections, two researchers stated that they lacked the time to prepare the research data in a format understandable to third parties.
Open Data Policies and Influencing Factors
We explored how factors such as publication date, requests for multiple datasets, and researchers’ academic experience influenced various stages of our data acquisition process (RQ3), focusing on outcomes related to (1) locating repository links, (2) finding valid e-mail addresses, (3) receiving responses, (4) receiving positive responses, and (5) acquiring research data overall. The results of every regression model can be seen in Table 1.
Table 1. Results of Logistic Regression Models Using Full and Reduced Datasets
Note. B = unstandardized regression coefficients for the logit outcome; SE = standard error; OR = odds ratio; 95% CI = 95% confidence interval of the odds ratio (lower limit, upper limit).
*p < .05.
Unable to be estimated due to large standard error.
With respect to the first model, article age was a significant predictor of locating repository links. For each additional year since publication, the odds of finding a repository link decreased by 11% (b = −0.13, OR = 0.89, 95% CI [0.76, 0.99], p < .05). After accounting for influential data points with high studentized residuals and hat values, this effect became even stronger: The odds of locating a repository link decreased by 24% per year (b = −0.27, OR = 0.76, 95% CI [0.61, 0.90], p = .01).
Regarding the second model, article age had no significant relationship with the likelihood of finding a valid e-mail address. However, researchers’ academic experience became a significant predictor once influential data points were accounted for: Each additional year of experience since earning a PhD increased the odds of finding a functional e-mail address by 6% (b = 0.06, OR = 1.06, 95% CI [1.01, 1.14], p = .03).
The likelihood of receiving a response (3) from a researcher was significantly influenced by their academic experience. Each additional year of experience increased the odds of receiving a response by 5% (b = 0.05, OR = 1.05, 95% CI [1.01, 1.09], p = .02). Article age, however, showed no significant relationship with response likelihood.
With respect to the fourth model, none of the variables showed a significant association with the likelihood of receiving a positive response. However, the likelihood of acquiring research data overall (5), whether through a repository or by directly contacting the authors, was significantly associated with article age. Each additional year reduced the odds of obtaining data by 14% (b = −0.15, OR = 0.86, 95% CI [0.77, 0.96], p = .01). After adjusting for influential data points, this decline became more pronounced, with the odds decreasing by 19% per year (b = −0.21, OR = 0.81, 95% CI [0.71, 0.92], p < .01).
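The percentage changes reported above follow directly from exponentiating the logit-scale coefficients (OR = e^b); a minimal sketch using two of the reported coefficients, assuming the usual two-decimal rounding:

```python
import math

def odds_ratio(b: float) -> float:
    """Convert a logit-scale regression coefficient to an odds ratio."""
    return math.exp(b)

def percent_change_in_odds(b: float) -> float:
    """Percent change in the odds per one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0

# Coefficients reported above (per additional year since publication)
for label, b in [("repository link, reduced data", -0.27),
                 ("overall acquisition, reduced data", -0.21)]:
    print(f"{label}: OR = {odds_ratio(b):.2f}, "
          f"odds change = {percent_change_in_odds(b):.0f}%")
# → repository link, reduced data: OR = 0.76, odds change = -24%
# → overall acquisition, reduced data: OR = 0.81, odds change = -19%
```

Confidence intervals for odds ratios are typically obtained the same way, by exponentiating the limits of the logit-scale interval.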
Regarding the data sharing policies of the journals included in this investigation (RQ4), the 167 articles were published across 89 different journals, with the top ten journals covering a third of all publications (55; see Table 2). In terms of data availability, 35 journals (40.7%) either did not include any open data policy in their submission guidelines or merely encouraged researchers to share their data. Most journals, namely 46 (53.5%), require researchers to state how their research data can be accessed, while allowing them to state that the data are “available upon reasonable request.” The remaining 5 journals (5.8%) require researchers to publish their research data in a trusted repository. Notably, 5 of the top 10 journals in which the researchers have published to date do not have a proper data sharing policy in place.
Table 2. Top Ten Journals’ Data Transparency and Availability
Note. The levels and descriptions of data transparency are taken from Nosek et al. (2015). The top ten journals covered 55 of all 167 publications.
Discussion
This study investigated data availability rates in teaching effectiveness research, aiming to document the data sharing process. Through a systematic search, 167 longitudinal studies published after 2000 were identified. Among these, 13 studies (7.8%) provided a URL to a data repository. For the remaining 154 publications, 132 data sharing requests were sent to either the first or corresponding author. However, approximately one-third of the contact information was outdated or inactive. Of the 115 researchers successfully contacted, 36 responded, with 9 agreeing to share their data, resulting in an overall data availability rate of 13.2%. Among the 27 refusals to share data, the most common reason was that researchers had left academia or retired and were therefore unable to provide the data. Additionally, exploratory analyses revealed that the age of an article significantly impacted data availability. Specifically, the odds of finding a repository link decreased by 24% per year, while the odds of acquiring research data overall declined by 19% per year. Academic experience was positively linked to researcher responsiveness.
Data Availability and Contacting Process
Although making research data available via repositories minimizes the need for direct contact with researchers, only 7.8% of articles in this investigation provided a URL to a repository, a result consistent with the recent findings of Huff and Bongartz (2023). Moreover, the odds of finding a repository link were inversely related to the age of the article, decreasing by 24% per year. This suggests that older studies are significantly less likely to include accessible links, further complicating data retrieval.
Consequently, the majority of research data remains inaccessible through repositories and must instead be acquired by directly contacting the researchers, which presented a significant obstacle in this study. Approximately 33% of the contact details for corresponding authors were outdated or no longer functional. This introduced the additional challenge of acquiring working contact details through online searches for the researchers. In several cases, it was impossible to find a functioning e-mail address; notably, this included researchers who had published papers as recently as 2016. These findings align with those of Hussey (2023) and Tedersoo et al. (2021), who also noted the challenge of outdated contact details and the passing of researchers. Furthermore, engagement with data access requests was limited, with only 36 out of 115 researchers responding. One possible explanation for the low response rate is that our e-mails may have been perceived as spam rather than legitimate requests. This adds complexity to secondary data acquisition, as researchers must ensure that their communication is perceived as genuine, a criterion that is inherently subjective. Overall, our contacting process illustrates that requesting data access directly from researchers involves numerous avoidable obstacles, highlighting the need for more sustainable data sharing practices.
Overall data availability was found to be just 13.2%, a substantial gap compared with related disciplines. For instance, psychology and the social sciences report data availability rates between 30% and 40% (e.g., Tedersoo et al., 2021), while Hamilton et al. (2023) observed rates ranging from 0% to 37%. These discrepancies may stem from methodological differences, as this study did not limit itself to specific journal groups (e.g., Tedersoo et al., 2021) and covered a broader two-decade timeframe (e.g., Huff & Bongartz, 2023). Nonetheless, they reflect a trend observable in educational research in general, with most researchers rarely, if ever, sharing their data (Makel et al., 2021).
Another possible explanation is the nature of the studies analyzed. This investigation focused on teaching effectiveness research, which involves longitudinal observational classroom-level data collected in real-world settings (Grützmacher et al., 2025). Such studies may involve sensitive data from minors, which makes data sharing more difficult compared to simple questionnaire-based studies in other fields such as sociology or psychology (e.g., Tedersoo et al., 2021; Vines et al., 2014; Wolins, 1962) or laboratory-based studies in medicine or microbiology (e.g., Tedersoo et al., 2021). Nevertheless, the overall low data availability in teaching effectiveness research is concerning given the field’s reliance on empirical evidence to inform teacher development and educational improvement efforts (Darling-Hammond, 2010a).
The results further reveal that the likelihood of acquiring data decreases with article age, with a 19% reduction per year. This aligns with Vines et al. (2014), who noted that data older than six years is unlikely to remain accessible. Since older studies often serve as the foundation for new research, their data inaccessibility risks the loss of valuable insights and undermines scientific progress (National Research Council, 2002).
Inability to Share Data as a Structural Problem
Overall, we identified several barriers to sharing research data, including retirement, planned and unplanned data destruction, legal restrictions, and a lack of resources. Our findings align with those of Hussey (2023), highlighting that current data sharing policies and mechanisms in scientific disciplines, including teaching effectiveness research, are inadequate to address foreseeable challenges. For instance, data access became especially challenging when researchers retired or left academia, underscoring the reliance on individual researchers to manage data availability. To improve data availability, robust data sharing mechanisms are essential to ensure access regardless of the researcher’s circumstances.
Beyond these formal challenges, the unintentional loss or destruction of data poses considerable risks. Researchers often accumulate extensive datasets over decades, and without safeguards against hardware failures or accidental deletions, a single incident can result in the loss of multiple datasets. To mitigate such risks, publishing data in repositories, whose focus is the curation, preservation, and dissemination of research data, is essential, offering a more reliable solution than individual data management approaches (Strecker et al., 2023).
Moreover, the lack of resources and limited institutional support further complicates data sharing. Researchers frequently juggle heavy workloads, leaving insufficient time for careful data curation, documentation, and sharing. Current academic reward structures typically prioritize publication outcomes over data sharing activities, with educational researchers facing limited institutional support and unclear expectations around open science practices (Fleming et al., 2024; Makel et al., 2025). These systemic factors contribute to suboptimal data sharing rates, as the uncompensated effort required for comprehensive data preparation can discourage researchers from making data available (see J. A. Logan et al., 2025).
Lastly, legal and ethical constraints present complex challenges for data sharing in educational contexts. Meyer (2018) provides practical recommendations for navigating these challenges, advising against commitments to destroy data after a certain period or promises of data confidentiality that may limit sharing. Although such assurances are intended to protect participants and facilitate faster review board approvals, they often prevent researchers from sharing their data even when they wish to do so.
Overall, these findings suggest that broader infrastructural shifts are necessary to normalize data sharing practices, reduce implementation concerns, and increase researchers’ confidence in open data sharing beyond simply improving attitudes toward open science principles (Fleming et al., 2024).
Journals Lack Open Data Policies
Despite growing concern over the replication crisis, a significant portion of journals still lack formal open data policies. In our analysis of 89 different journals, we found that 35 journals (40.7%) either had no open data policy in their submission guidelines or simply encouraged data sharing without enforcing it. In contrast, only 5 journals (5.8%) mandated the publication of research data in a trusted repository. Notably, 5 of the top 10 journals, which accounted for a third of all publications, do not currently have a proper data sharing policy in place. These findings highlight considerable inconsistency in the presence and breadth of data sharing policies among journals publishing teaching effectiveness research, with a predominant reliance on voluntary compliance, an approach that has been shown to be ineffective (e.g., Gabelica et al., 2022).
Resources for Data Sharing
While this study highlights several barriers to data sharing in teaching effectiveness research, existing resources can support educational researchers in adopting sustainable practices. J. A. R. Logan et al. (2021), for example, developed a detailed guideline specifically for educational researchers, offering a step-by-step checklist that covers the entire data-sharing process from obtaining informed consent for sharing data to de-identifying sensitive information and thoroughly documenting the study. In addition, a number of international and national initiatives already provide guidance for researchers seeking to improve their data management. For instance, the Data Management and Expert Guide developed by CESSDA (2017), an international consortium of social science data archives, offers extensive guidance on planning, organizing, documenting, and archiving data in line with FAIR principles. Furthermore, the Open Science Framework provides online training for researchers seeking to enhance their open science competence, including modules specifically focused on data sharing and reproducibility (Center for Open Science, n.d.).
Additionally, several large-scale repositories offer both secure data storage and personalized support. Notably, repositories such as the Inter-University Consortium for Political and Social Research (ICPSR) in the United States and the UK Data Archive provide not only long-term storage options but also consulting services for researchers who are concerned about ethical, legal or technical barriers to data sharing (ICPSR, n.d.; UK Data Service, n.d.). These services often include assistance with anonymization, licensing, and compliance with institutional and funder requirements. In Germany, the Verbund Forschungsdaten Bildung serves a similar role within the education research community, although its materials as of now are only available in German (Verbund Forschungsdaten Bildung, 2024). As such, beyond data storage, repositories reflect an infrastructure that is designed to support researchers throughout the data lifecycle.
Limitations
Generalizability
The findings of this investigation are mostly consistent with previous studies on data availability in disciplines related to teaching effectiveness research. However, it is important to emphasize that our results are specific to teaching effectiveness research, and their applicability to other areas of education or psychology remains uncertain. While a meta-analysis of data availability across various fields could offer broader insights, both Hussey (2023) and Hamilton et al. (2023) have noted that such an analysis is challenging due to methodological variations between studies.
Furthermore, several methodological limitations may affect the external validity of these findings. For our literature search, we focused on identifying relevant publications through meta-analyses, as our interest was in quantitative studies that met the quality standards required for inclusion in such analyses. This approach may have systematically excluded recent investigations not yet incorporated into comprehensive reviews, potentially underrepresenting current data sharing practices. Moreover, the search strategy was constrained by the three basic dimensions model of teaching quality, which may have resulted in the omission of meta-analyses addressing constructs not readily captured by the employed search terms. Additionally, the publication search via meta-analyses may have introduced selection bias toward studies with enhanced data availability, particularly repository-based accessibility, as such studies may be more likely to meet meta-analytic inclusion standards. Consequently, these methodological choices suggest that the current sample may not fully represent the broader population of teaching effectiveness research.
Cultural Differences
This study involved contacting researchers from diverse cultural contexts via e-mail in English. Although language may not have been a barrier, as the researchers published in either English or German, this communication method might not align with the preferred practices of researchers in different regions, potentially leading to misunderstandings or perceptions of inauthenticity. In some cultures, direct e-mail requests, especially in a non-native language, may be viewed as unconventional, reducing the likelihood of a response (see Harzing, 2000; Hung et al., 2012). Additionally, our requests were sent from our institution in Germany, and it is unclear whether the response rate or the willingness to share data would have differed had the request come from the researchers’ home country. Moreover, the wording of our data request may have appeared vague or overly exploratory to recipients, potentially lowering its credibility. Future research might investigate how variations in contact methods and message phrasing affect researchers’ responsiveness and willingness to share data across different cultural and institutional settings. Nonetheless, beyond potential variations in response rates due to methodological choices, the fact that requests were predominantly ignored or declined suggests that teaching effectiveness research data remains largely inaccessible.
Data Quality
This study did not systematically assess the quality of the data provided, focusing solely on the acquisition of research data. Future research should incorporate a systematic evaluation of the quality of the data received to better understand its value and usability.
Conclusions and Suggestions for Improving Data Availability
The findings of the present study highlight the notably low level of data sharing in teaching effectiveness research, with several obstacles, such as difficulties in finding functional contact information and receiving responses, hindering data acquisition. As our investigation demonstrates, the practice of making data available upon request has proven ineffective, underscoring the need for a systematic shift toward repository-based sharing.
Current mechanisms to promote data availability are inadequate, particularly since researchers lack incentives, such as career advancement or financial gain, to share their data (Hussey, 2023). While publications are central to career progression, additional research materials like datasets, code, and codebooks are not currently valued in the same way, leading to limited sharing. As the results of this investigation show, researchers often destroy their data after a certain period or face legal restrictions that prevent sharing due to review boards or informed consent requirements.
To address these issues, Tedersoo et al. (2021) suggest measures such as funding agencies covering data management costs and making public availability of research data a mandatory requirement for job and grant applications. They also propose establishing a monitoring structure for data sharing across journals, funders, and academic institutions. Huff and Bongartz (2023) advocate for improving researchers’ data literacy through training on proper data handling in line with FAIR principles. However, not all structural changes need to be large scale and grand in design: For example, Kidwell et al. (2016) found that simply adding an open data badge to publications led to an increase in open data practices and resulted in higher availability of usable and complete data. Furthermore, to enable researchers to share data through repositories, informed consent forms should avoid including commitments to destroying data or guarantees of confidentiality that restrict data sharing (J. A. R. Logan et al., 2021). Overall, the field of teaching effectiveness research could benefit from a paradigm shift in which journals implement more data-sharing-friendly policies and research data is viewed not just as a tool for answering questions but as a research product that can be published and shared via a repository. This shift is particularly important for older publications, where the likelihood of obtaining data upon reasonable request is especially low.
Supplemental Material
Supplemental material, sj-docx-1-ero-10.1177_23328584251409658, for Investigating Data Sharing Practices in Teaching Effectiveness Research by Talha Sajjad, Johannes Hartig, Thomas Lösch, and Carmen Köhler in AERA Open.
Author Note
Portions of these findings were presented as a poster at the DGPs conference in 2024.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Authors
TALHA SAJJAD is a PhD candidate at DIPF | Leibniz Institute for Research and Information in Education, Frankfurt, Germany.
JOHANNES HARTIG is a professor of educational measurement at DIPF | Leibniz Institute for Research and Information in Education, Frankfurt, Germany.
THOMAS LÖSCH is a coordinator of the Research Data Center for Education at DIPF | Leibniz Institute for Research and Information in Education, Frankfurt, Germany.
CARMEN KÖHLER is a postdoctoral researcher at DIPF | Leibniz Institute for Research and Information in Education, Frankfurt, Germany.
References
