Abstract
Introduction
Effective clinical research training is crucial for advancing medical science and improving patient care. However, current evaluation systems in China often focus on theoretical knowledge, neglecting practical skills and innovation. This study aimed to develop a comprehensive evaluation framework for clinical research training programs using the Delphi consensus method.
Method
A 2-round Delphi method was employed, involving healthcare professionals and educators from top tertiary hospitals and leading academic institutions in China. The first round included 15 participants, and the second round included 19 participants. The evaluation framework was based on the Kirkpatrick model, covering Reaction, Learning, Behavior, and Results dimensions. Indicators were evaluated using a 5-point Likert scale, with consensus defined as a mean significance score ≥3.50 and a coefficient of variation ≤0.25.
Results
In the first round, 9 indicators were excluded and 5 added. In the second round, 26 indicators met consensus criteria. Key indicators included “Relevance of training content” (mean = 4.89, CoV = 0.06), “Degree of knowledge mastery” (mean = 4.58, CoV = 0.13), and “Impact on career development” (mean = 4.53, CoV = 0.15). Other significant indicators were “Timeliness of training information” (mean = 4.84, CoV = 0.08) and “Success rate of applying for scientific research funds” (mean = 4.05, CoV = 0.21).
Discussion
This study developed a comprehensive evaluation framework for clinical research training in China, emphasizing the importance of relevant training content, strong learning outcomes, and long-term professional impact. This framework provides a robust tool to assess and enhance clinical research training programs, ultimately contributing to improved healthcare and medical research. Future work should focus on validating this framework through empirical studies and refining it based on ongoing feedback.
Keywords
Introduction
The rapid evolution of medical science and healthcare demands robust clinical research training to cultivate a new generation of physician-scientists capable of driving innovation and improving patient outcomes.1,2 While this need is global, it is especially urgent in China, where clinical research capacity is expanding rapidly but lacks a standardized, multidimensional evaluation system to ensure training quality and effectiveness.
Internationally, structured approaches to evaluating clinical research training have gained traction.1,3,4 In the United States, the National Institutes of Health (NIH) supports program assessment through the Clinical and Translational Science Awards (CTSAs), which emphasize measurable outcomes across multiple domains of trainee development.5 Similarly, the European Union promotes competency-based training frameworks that integrate multidisciplinary collaboration, practical research skills, and translational impact.6,7 These initiatives highlight a shared recognition: effective training must be systematically evaluated—not just by knowledge acquisition, but by changes in behavior, research output, and real-world contributions.
In contrast, evaluation of clinical research training in China remains underdeveloped.8 Most programs rely heavily on theoretical coursework and summative exam scores, offering little assessment of practical research competence, innovation capacity, or long-term professional impact.9 This narrow focus fails to capture essential dimensions of trainee development and hinders the identification and nurturing of high-potential research talent.9 Moreover, no nationally recognized evaluation framework exists that aligns with China's institutional structures, educational priorities, or healthcare goals.
To address this gap, we propose a comprehensive, contextually adapted evaluation framework grounded in the Kirkpatrick model—a well-established theory in training evaluation that assesses outcomes across 4 levels: Reaction, Learning, Behavior, and Results. This model provides a robust theoretical foundation for evaluating not only how trainees perceive and learn from programs, but also how training translates into research practice and institutional impact.10,11
To ensure our framework was both evidence-informed and practically relevant, we began with a systematic literature review and expert interviews. We conducted an extensive search of PubMed, Embase, and Web of Science databases (from inception to May 2025) using keywords including “clinical research training,” “medical training,” “evaluation,” and “framework.” After screening 484 articles, we identified 35 that were most pertinent to evaluating clinical research training or similar educational programs. These articles provided a foundational set of potential indicators. Crucially, we then conducted in-depth interviews with 9 experts—specifically, individuals who serve as program directors of clinical research training programs or are recognized experts in medical education. These experts were purposively selected based on their extensive, hands-on experience and demonstrated expertise in designing, implementing, or evaluating clinical research training initiatives. This dual approach ensured our initial list of 35 indicators was not merely derived from international literature but was actively shaped and refined by practitioners familiar with the specific challenges and opportunities of clinical research training in China. By integrating these expert perspectives with the findings from the literature, we ensured a well-rounded and robust starting point for our Delphi process.
Method
Study Design
A 2-round modified-Delphi method was selected as our primary research method to engage a diverse group of experts in the field of clinical practice and education.12 This method is particularly suited for achieving consensus among experts on complex issues, such as the development of an evaluation model that is both culturally sensitive and scientifically rigorous.13 The goal was to leverage the collective expertise of healthcare educators and professionals with practical experience to develop an evaluation model that is theoretically sound and practically applicable within the Chinese context. The study was conducted between November 27, 2024, and January 17, 2025. The reporting of this study conforms to the DELPHISTAR statement.14
Modified-Delphi Method
In the first round of our Delphi process, we constructed a survey to assess key elements of clinical research training evaluation using the Kirkpatrick model. The survey items were drawn from the literature review of indicators used to evaluate clinical research training or similar courses. Participants were asked to rate each indicator using a 5-point Likert scale, where a score of 5 represented strong agreement, 4 indicated agreement, 3 signified neutrality, 2 denoted disagreement, and 1 reflected strong disagreement. The criteria for indicator deletion were a mean significance score of less than 3.50 or a coefficient of variation greater than 0.25.15–17 After completing the questionnaire, experts were encouraged to identify any additional elements not previously listed.
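Stated formally (assuming, as is conventional, that the coefficient of variation is the sample standard deviation divided by the mean; the exact standard-deviation formula is not specified in the text), the deletion rule for an indicator rated by n experts with scores x_1, ..., x_n is:

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}, \qquad \mathrm{CoV} = \frac{s}{\bar{x}}, \]

with the indicator removed when \( \bar{x} < 3.50 \) or \( \mathrm{CoV} > 0.25 \), and carried forward otherwise.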
Based on the first-round results and feedback, the indicators were modified, and a second-round questionnaire was developed. To achieve a shared understanding and pinpoint the essential aspects of clinical research training, we undertook at least 2 iterations. The Delphi process was terminated once consensus was reached.18,19 The study process is shown in Figure 1.

Figure 1. Screening process of the clinical research training program.
Participants
In our study, we used a 2-round Delphi method with 34 participants in total (15 in the first round and 19 in the second). We recruited healthcare professionals involved in clinical research and education.
Participants were selected based on the following inclusion criteria:
Clinical Research Training Experience: Completion of at least 1 clinical research training program in China.
Professional Role: Current or past involvement as a student, instructor, program manager, or curriculum designer in clinical research training programs in China (Figure 2).
Relevant Expertise: Demonstrated experience in clinical research and/or medical education, evidenced by active participation, teaching, program leadership, or scholarly output (eg, publications or conference presentations) in the field.

Figure 2. Evaluation framework for clinical research training in China based on the Kirkpatrick model.
The following exclusion criteria were applied:
Less than 3 years of professional experience in clinical research or medical education.
No recent contributions to the field (defined as no publications, presentations, or program involvement in the past 5 years).
All participants were affiliated with top-tier tertiary hospitals or leading academic institutions in China to ensure expert-level input. A total of 34 experts participated across 2 Delphi rounds.
In the first round, there were 15 participants, consisting of 3 males (20%) and 12 females (80%). In the second round, the number of participants increased to 19, with 5 males (26.32%) and 14 females (73.68%). The age distribution of participants varied across rounds: in the first round, the majority were aged between 31 and 40 years (46.67%), followed by those aged 41–50 years (40%); in the second round, the 41–50 years age group had the highest representation (47.37%), with those aged 31–40 years also well represented (36.84%).

Participants had diverse areas of expertise, including clinical medicine, education, hospital management, and bioinformatics. In the first round, 9 participants (60%) were from clinical medicine, while 1 participant (6.67%) was involved in education. In the second round, all 19 participants (100%) were from clinical medicine; 1 also had involvement in hospital management (5.26%), and 1 had a background in another discipline (5.26%), indicating a focus on clinical perspectives.

Participants represented a diverse range of stakeholder roles within clinical research training programs, including students, instructors, program managers, and program designers. Importantly, the survey allowed participants to select all roles that applied to them, reflecting the reality that many individuals transition between these roles over time—for example, former students often become instructors or program managers, and some may even contribute to curriculum design. This approach ensured that we captured perspectives from individuals with direct, lived experience across multiple stages of the training lifecycle. In the first round, 8 participants (53.33%) identified as students, 7 (46.67%) as instructors, and 7 (46.67%) as program managers. In the second round, participation broadened further: 10 participants (52.63%) selected “student,” 2 (10.53%) “instructor,” 8 (42.11%) “program manager,” and 12 (63.16%) “program designer.”

Participants had varying levels of involvement in clinical research training programs. In the first round, 7 participants (46.67%) had participated 1–3 times, while 6 (40%) had participated more than 6 times. In the second round, the majority of participants (63.16%) had participated 1–3 times, and only 3 (15.79%) had participated more than 6 times. For a comprehensive overview of participant demographics, please refer to Table 1.
Table 1. Characteristics of Expert Panelists Involved With the Clinical Research Training Evaluation in China Questionnaire.
Ethical Considerations
The study protocol adhered to the ethical guidelines outlined in the Helsinki Declaration and was approved by the Clinical Research Ethics Committee of Jiahui Medical Education and Research Group (Ethics approval number: A-CR-2025003). Written informed consent was obtained from all participants.
Statistical Method
The statistical analysis was conducted using Stata SE 16 and Microsoft Excel. Descriptive statistics (mean, standard deviation, coefficient of variation [CoV], and percentages) were used to summarize the data. Consensus was determined iteratively across the 2 rounds. For each indicator, the mean score and CoV were calculated after each round. An indicator was retained if it met the predefined consensus criteria (mean ≥ 3.50 and CoV ≤ 0.25) in the second round.
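To make the retention rule concrete, the following is a minimal sketch of the per-indicator screening logic in Python. It is illustrative only: the actual analysis was performed in Stata SE 16 and Excel, and the indicator names and ratings below are hypothetical, not study data.

from statistics import mean, stdev

# Hypothetical Likert ratings (1-5) from one Delphi round; values are
# illustrative and do not reproduce the study data.
ratings = {
    "Relevance of training content": [5, 5, 5, 4, 5, 5, 5, 5, 4, 5],
    "Number of patents": [3, 4, 2, 3, 4, 3, 3, 4, 3, 4],
}

MEAN_THRESHOLD = 3.50  # exclude if mean significance score < 3.50
COV_THRESHOLD = 0.25   # exclude if coefficient of variation > 0.25

def screen_indicator(scores):
    """Return (mean, CoV, retained) for one indicator's expert ratings."""
    m = mean(scores)
    cov = stdev(scores) / m  # sample standard deviation divided by the mean
    return m, cov, (m >= MEAN_THRESHOLD and cov <= COV_THRESHOLD)

for name, scores in ratings.items():
    m, cov, retained = screen_indicator(scores)
    print(f"{name}: mean = {m:.2f}, CoV = {cov:.2f}, "
          f"{'retained' if retained else 'excluded'}")

In the study itself, this decision was applied after each round, with final retention judged on the second-round ratings.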
Results
The Delphi process successfully refined an initial set of 35 indicators (Table 2) into a final framework of 26 consensus-based indicators for evaluating clinical research training programs in China. This evolution reflects the iterative nature of expert consultation, with 9 indicators removed after Round 1 due to failing to meet prespecified consensus criteria, and 5 new indicators added based on expert feedback to address critical gaps.
Table 2. Description of the Indicators Used in the Evaluation Model Questionnaire Based on the Kirkpatrick Model.
Indicators added in the second round of the survey.
Consensus Criteria and Indicator Evolution
Consensus for retaining an indicator was defined a priori as a mean significance score ≥3.50 and a coefficient of variation (CoV) ≤ 0.25. In Round 1, 9 indicators failed to meet these criteria and were excluded from the final model. Notably, several indicators related to behavioral outcomes, such as “Frequency of participation in academic exchanges” (mean = 3.20, CoV = 0.24) and “Number of patents” (mean = 3.27, CoV = 0.27), did not achieve consensus. Based on qualitative feedback from Round 1 participants, 5 new indicators were introduced in Round 2 to capture more nuanced aspects of trainee development, including “Innovation ability,” “Student practical ability,” and “Lecturer feedback suggestion.” All 5 new indicators met the consensus criteria in Round 2, demonstrating their perceived value to the expert panel.
Summary of Consensus Indicators by Kirkpatrick Dimension
The final 26 indicators are presented below, organized by the 4 levels of the Kirkpatrick model. The results highlight strong agreement across all dimensions, with particular emphasis on the relevance of content and long-term career impact (Table 3).
Reaction Dimension: Experts strongly endorsed indicators assessing the immediate trainee experience. “Relevance of training content” was among the highest-rated indicators (mean = 4.89, CoV = 0.06), followed closely by “Timeliness of training information” (4.84, CoV = 0.08). “Quality of training materials” achieved a perfect mean score of 5.00, indicating unanimous agreement on its importance.

Learning Dimension: This dimension showed robust consensus, with high scores for core competencies. “Skill improvement” (mean = 4.68, CoV = 0.12) and “Degree of knowledge mastery” (mean = 4.58, CoV = 0.13) were among the top-rated indicators. “Mastery of basic knowledge of clinical research design” also scored highly (mean = 4.79, CoV = 0.09).

Behavior Dimension: Only 1 indicator in this dimension met the consensus criteria: “Success rate of applying for scientific research funds” (mean = 4.05, CoV = 0.21). While this indicator just met the CoV threshold, its inclusion underscores the panel's view that securing funding is a critical behavioral outcome of effective training.

Results Dimension: This dimension yielded the largest number of consensus indicators, reflecting a strong focus on long-term impact. Key outcomes included “Impact on career development” (mean = 4.53, CoV = 0.15), “Academic influence” (mean = 4.37, CoV = 0.16), and “Transformation of scientific research achievements” (mean = 4.26, CoV = 0.22). The newly added indicator “Innovation ability” (mean = 4.53, CoV = 0.14) was also highly rated, highlighting the panel's desire to measure creative capacity.
Table 3. Results of the 2-Round Questionnaire of Experts’ Opinions.
Indicators that did not meet the consensus criteria: mean significance score less than 3.50 or coefficient of variation greater than 0.25.
The analysis reveals a clear prioritization of indicators that link training directly to tangible, long-term outcomes. The high ratings for “Impact on career development” and “Success rate of applying for scientific research funds” suggest that experts view training success through the lens of professional advancement and research productivity. Furthermore, the unanimous support for “Quality of training materials” and the high scores for “Relevance” and “Timeliness” indicate that the foundation of effective training lies in well-designed, current, and applicable content. The addition and subsequent consensus on new indicators like “Innovation ability” and “Student practical ability” demonstrate the panel's commitment to evolving the framework to capture modern competencies beyond traditional metrics.
Discussion
This study, employing the Delphi method, has successfully identified and refined key indicators to form a robust evaluation model based on the Kirkpatrick model. The findings provide valuable insights into the critical elements that should be considered when assessing the effectiveness of clinical research training programs in China. While the study is grounded in the Chinese context, its implications extend globally, offering a comparative lens for international medical education communities to reflect on their own clinical research training frameworks.
Addressing the Core Deficiencies in China's Training Landscape
The most significant contribution of this framework is its explicit response to the identified shortcomings of China's current evaluation system, which remains heavily skewed toward theoretical knowledge and summative exams.8,9 Our findings demonstrate a clear expert consensus that effective evaluation must extend far beyond the “Learning” level to encompass tangible behavioral changes and long-term professional outcomes. This represents a fundamental shift from the prevailing model.
From Theory to Practice: The high consensus scores for indicators like “Success rate of applying for scientific research funds” (mean = 4.05) and “Transformation of scientific research achievements” (mean = 4.26) are particularly telling. In the Chinese context, where career advancement for physicians is often tied to securing national grants and publishing in high-impact journals, these indicators are not abstract metrics; they are direct measures of a trainee's ability to navigate the real-world research ecosystem. This framework forces programs to move beyond assessing what trainees know to evaluating what they can do and achieve. It directly counters the criticism that current programs produce researchers who are strong on paper but weak in practice.20,21

Bridging the Clinical-Research Divide: A major challenge in China is the perceived tension between clinical service and research productivity.20 The framework implicitly addresses this by including indicators that measure the practical application of skills, such as “Student practical ability” (mean = 4.63), which assesses the ability to apply theory to real clinical research design. Furthermore, the emphasis on “Impact on career development” (mean = 4.53) acknowledges that successful training must support both academic and clinical career paths, fostering physician-scientists who can thrive in integrated roles rather than being forced into silos.

Moving Beyond the “Publication Count”: While “Number of Science Citation Index papers published” was excluded for failing to meet consensus, the retention of “Quality of papers” (mean = 4.53) and “Academic influence” (mean = 4.37) suggests a sophisticated understanding among experts. They recognize that quantity alone is an inadequate metric. Instead, they prioritize the impact and quality of research output, which better reflects genuine scholarly contribution and aligns with global trends moving away from simple publication counts toward more nuanced impact assessments.21,22
How This Framework Improves Upon Existing Models
While frameworks such as the NIH's CTSA or Europe's Horizon Europe emphasize comparable domains, this Chinese model distinguishes itself through contextual prioritization and locally grounded design.23 Unlike generic competency-based approaches like CanMEDS or ACGME milestones24,25—often adapted retrospectively to fit diverse settings—our framework was built from the ground up with direct input from Chinese experts who understand the realities of local institutional pressures and incentive structures. This is exemplified by the near-unanimous consensus on “Quality of training materials” (mean = 5.00), which reflects a pressing local need for scientifically rigorous and practically relevant educational resources, addressing widespread concerns about outdated or poorly designed course content in many Chinese institutions.26

Moreover, the framework introduces a dynamic element often absent in traditionally top-down evaluation systems: feedback loops. Newly added indicators such as “Lecturer feedback suggestion” (mean = 4.53) and “Student feedback satisfaction” (mean = 4.37) institutionalize ongoing dialogue between trainers and trainees, transforming evaluation from a static compliance exercise into a mechanism for continuous program improvement.

Finally, the framework directly confronts a well-documented weakness in Chinese medical education—the underemphasis on creative thinking—by incorporating “Innovation ability” (mean = 4.53) as a measurable outcome. This not only validates innovation as a core competency but also mandates that training programs actively cultivate originality and problem-solving alongside technical skills.
Implications for Clinical Research Training Programs in China and Beyond
This study's framework not only addresses China's unique challenges but also contributes to global discussions on optimizing clinical research training, particularly in resource-limited settings where balancing clinical and research demands is a universal challenge.27
Curriculum Design
Clinical research training programs in China should be designed to be dynamic and responsive to the latest developments in medical science. The curriculum must align with the current needs of the healthcare system and the future career aspirations of trainees, ensuring that they acquire both clinical and research competencies. The integration of basic medical sciences with clinical practice and research skills is crucial for developing well-rounded physician-scientists.20 Additionally, the curriculum should emphasize the importance of continuous learning and adaptation to new technologies and methodologies.28
Quality of Training Materials
High-quality educational resources are fundamental for effective knowledge transfer and skill development in clinical research training. Training materials should be regularly updated to reflect the latest advancements in medical research and practice. In China, the use of validated assessment tools and diverse instructional methods, such as problem-based learning (PBL) and team-based learning (TBL), is essential for enhancing the learning experience.26 However, current training programs often rely heavily on lecture-based formats, with limited emphasis on practical skills and behavioral changes.26 Future efforts should focus on diversifying instructional methods and incorporating more interactive and hands-on learning experiences.
Skill Development
Practical skills are a cornerstone of clinical research training in China. Trainees should be equipped with essential skills such as data analysis, scientific writing, and project management to navigate the complexities of clinical research. The current training model in China, particularly the “5 + 3 + X” model, aims to combine clinical subspecialty training with research capabilities. This model comprises 5 years of undergraduate medical education, followed by 3 years of standardized residency training, and an additional “X” years of specialized fellowship training (which may include research components) for those pursuing advanced clinical or academic careers.20,29 However, concerns have been raised about the adequacy of clinical practice experience in these programs, highlighting the need for a balanced approach that integrates both clinical proficiency and research competence.20 Hands-on workshops and mentorship from experienced researchers can further enhance the practical skills of trainees.
Long-Term Impact
The long-term impact of clinical research training programs should be measured through outcomes such as the transformation of research achievements into clinical practice.30 In China, the ability to secure research funding and translate scientific findings into practical applications is crucial for career advancement and institutional contributions.31 Training programs should focus on developing the skills required for successful grant applications and fostering a culture of innovation and collaboration. Additionally, ongoing evaluation and feedback mechanisms are necessary to ensure that training programs continue to meet the evolving needs of the healthcare system and research community.32
Implementation and Evaluation
The next critical phase involves implementing this Delphi-consensus evaluation framework in real-world clinical research training settings to assess its practicality and effectiveness. Pilot programs should be conducted across diverse institutions in China, including tertiary hospitals and academic medical centers, to evaluate how well the framework aligns with local training needs and operational constraints. Feedback from trainees, instructors, and program administrators will be essential to identify gaps, such as discrepancies between theoretical training and practical application, or challenges in measuring intangible outcomes like innovation capacity.
In addition, from a global perspective, the framework's adaptability should be tested in other healthcare systems, particularly in low- and middle-income countries (LMICs), where resource limitations and varying research infrastructures may necessitate modifications.
Limitations
The present study acknowledges several limitations. First, while we aimed to recruit 40 participants, the final sample size was 34 (15 in Round 1, 19 in Round 2). This shortfall may limit the robustness of the consensus achieved and the generalizability of the findings. We mitigated this by ensuring all participants were highly experienced experts from leading institutions and by conducting a second round to refine the indicators. Second, the panel composition, while expert, was predominantly female and heavily focused on clinical medicine, with limited representation from other disciplines like public health, biostatistics, or ethics. This lack of diversity in perspectives may affect the comprehensiveness of the framework, potentially overlooking critical evaluation aspects relevant to nonclinical stakeholders. Future studies should strive for a more diverse panel. Third, the framework was developed for China; its applicability to other regions requires further validation through comparative studies. Fourth, a formal calculation and justification for the target sample size were not performed prior to data collection, which is a methodological limitation common in Delphi studies but nonetheless affects the interpretability of the consensus strength. Finally, the small sample size and potential for selection bias are inherent limitations of the Delphi method that warrant consideration when interpreting the results.
Conclusion
This study developed a comprehensive evaluation framework for clinical research training in China using the Delphi method. The findings emphasize the importance of relevant training content, measurable learning outcomes, and long-term professional impact. While focused on China, the framework aligns with global priorities in clinical research education, offering insights for international medical training programs facing similar challenges in integrating research and practice. Future studies should validate this model in diverse settings to enhance its applicability worldwide. By addressing both local needs and universal training principles, this work contributes to advancing clinical research education globally.
Footnotes
Acknowledgments
The authors thank all 34 expert panelists who generously contributed their time and expertise to this study. Their insights were instrumental in developing this evaluation framework.
Ethics Statement
This study was approved by the Clinical Research Ethics Committee of Jiahui Medical Education and Research Group (ethics approval number: A-CR-2025003), as described under Ethical Considerations.
Consent to Participate
Participation in the Delphi survey was voluntary, and informed consent was obtained electronically from all participants prior to their involvement in the study.
Author Contributions
April Shengjie Zhu: conceptualization, literature review, survey development, data collection, data analysis, manuscript writing. Jeremy Haoqing Zhu: literature review, manuscript writing. Yun Chen: literature review, manuscript writing. Paula Pei Li: project administration, participant recruitment, corresponding author responsibilities, manuscript review and editing. All authors reviewed and edited the final manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Conflict of Interest
Not applicable.
Supplemental Material
Supplemental material for this article is available online.
References