Abstract
Background
High-stakes Objective Structured Clinical Examinations (OSCEs) are resource-intensive and may constrain access and equity, especially for candidates from remote locations. Smartphone-based tele-OSCEs could reduce logistical burdens while maintaining assessment quality.
Objective
To evaluate whether a smartphone tele-OSCE yields assessment outcomes comparable to an in-person OSCE while improving implementation efficiency, costs, and acceptability within an undergraduate medical curriculum.
Methods
We conducted a quasi-experimental historical-control study (2021 tele-OSCE vs 2019 in-person OSCE) in 5th-year medical students at a single university in China. The tele-OSCE comprised 2 stations (history taking, clinical reasoning) aligned with the course blueprint. Primary outcomes included overall scores and pass/fail decisions; secondary outcomes included examiner/standardized patient (SP)/student acceptability, direct per-candidate costs, total examination time, and logistical metrics. Psychometric analyses included interexaminer correlations and descriptive consistency checks.
Results
Of 176 candidates scheduled for the 2021 tele-OSCE, 164 without prior online-OSCE exposure were analyzed; 272 in-person candidates from 2019 served as historical controls. Students in the tele-OSCE cohort obtained lower mean scores than those in the 2019 in-person OSCE cohort (65.6 ± 11.2 vs 72.0 ± 10.6), while pass rates remained comparable (95.1% vs 96.3%). The tele-OSCE shortened total examination time per candidate (∼40 vs ∼48 min) at a modestly higher direct cost per candidate (+12.7%), and acceptability was high among students, SPs, and examiners.
Conclusions
A smartphone tele-OSCE can support curriculum-integrated, high-stakes competency decisions with performance outcomes that remain within an acceptable range relative to the conventional format, while improving logistical feasibility. We provide implementation details and practical guidance to facilitate replication in similar curricular settings.
Background
The Unified National Graduate Entrance Examination (UNGEE) is a mandatory requirement for medical undergraduates in China seeking admission to postgraduate programs. The UNGEE evaluates candidates through 2 distinct parts: a theoretical knowledge examination and a performance-based clinical skills assessment. 1 The clinical skills assessment, administered by universities under national guidelines, primarily utilizes Objective Structured Clinical Examinations (OSCEs)—a globally validated method for assessing core clinical competencies, including history taking, diagnostic reasoning, and treatment planning, through simulated patient encounters with standardized patients (SPs).2–6 However, in-person OSCEs are burdened by substantial logistical and financial constraints, including high operational costs, intensive resource allocation, and geographical barriers that disproportionately affect candidates in remote areas, particularly in national-level examinations requiring mass coordination.7–9
The rising cost of medical education poses significant challenges to healthcare systems worldwide, with implications for equitable access and resource allocation. 10 Organizing large-scale OSCEs, particularly in-person formats, demands substantial logistical and financial investments—ranging from venue coordination to SP recruitment—a burden felt acutely in resource-constrained settings.10–13 While the urgency of remote assessments has diminished with the subsidence of the COVID-19 pandemic, the long-term value of tele-OSCEs extends beyond crisis management. First, by eliminating cross-regional travel, tele-OSCEs reduce both temporal and economic burdens for candidates and institutions, a critical advantage for high-volume examinations or regions with limited infrastructure. 14 Second, the integration of telemedicine technologies into hybrid education frameworks ensures continuity of competency assessments during future public health emergencies while maintaining alignment with accreditation standards.15–17 Finally, mobile-enabled tele-OSCEs catalyze the broader digital transformation of medical education, leveraging smartphone ubiquity to democratize access to standardized assessments and align with global trends toward decentralized, technology-driven learning ecosystems.18–20
Technological advancements now permit the reliable administration of tele-OSCEs through internet-connected devices with students and SPs separated at a distance via computers or mobile devices.21–23 Mobile devices, such as smartphones and tablets, further enhance accessibility through wireless connectivity, portability, and user-friendly interfaces—features that transcend the spatial constraints of traditional desktop-based systems. 24 Emerging evidence suggests that mobile-enabled OSCEs not only improve student engagement and learning outcomes but also optimize time efficiency in both preparation and execution.25–28 However, scalable implementations of mobile-based tele-OSCEs in high-stakes settings remain underexplored, necessitating rigorous feasibility studies to address technical, pedagogical, and equity-related challenges.
To address the limitations of conventional OSCEs and leverage emerging mobile technologies, our team recently developed a WeChat®-based platform designed to facilitate structured virtual interactions between medical students and SPs. 29 During the COVID-19 pandemic, when face-to-face clinical training and traditional OSCEs were suspended due to infection risks, this platform enabled students to sustain foundational clinical skills training through remote SP encounters.18,20,22 Building on these adaptations, we further implemented the platform to administer a tele-OSCE for the 2021 UNGEE clinical skills assessment—a high-stakes, nationally mandated examination—aligning with pandemic containment protocols while ensuring continuity in competency evaluation. Importantly, the adoption of a tele-OSCE format was conducted under regulatory oversight: education authorities provided provisional approval for a remote OSCE version of the UNGEE during the pandemic, contingent on meeting standard examination security and quality requirements. Although tele-OSCEs had been administered in some form at other institutions, 2021 marked the first implementation of a tele-OSCE at our institution using a smartphone-based platform for a high-stakes exam. To validate this approach, we systematically evaluated the feasibility, acceptability, and cost-effectiveness of the mobile tele-OSCE modality, with implications for advancing resilient, technology-driven medical education systems.
Methods
This observational study is reported in accordance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement for cross-sectional studies. 30 The completed STROBE checklist is provided as Supplementary File 1.
Participants
A cohort of 5th-year medical students applying for postgraduate positions at Nanjing Medical University (NMU) in 2021 was recruited for a tele-OSCE. All participants were within 3 months of completing their Bachelor of Medicine degrees in clinical medicine and had passed the UNGEE theoretical knowledge examination, with final admission contingent upon OSCE performance. Students who had prior similar online examination experience were excluded from the study. In practice, this meant we excluded any student who had previously participated in an online OSCE or comparable virtual clinical assessment (eg, those who took part in a pilot tele-OSCE at our institution), so that all included candidates were naive to the tele-OSCE format. This criterion ensured a fair comparison and avoided giving any students an undue advantage from prior exposure. Participants had completed the full medical curriculum including a 1-year clerkship and a minimum 9-month hospital internship. Additionally, all students had previously undergone in-person OSCE training during their 4th year through the National Medical Examination Center's standardized Clinical Skills Test, confirming baseline competency in SP-based diagnostic reasoning.
To enable comparative analysis, historical control data were obtained from 272 candidates who completed the in-person UNGEE OSCE at NMU in 2019. The control cohort met identical eligibility criteria (degree completion timeline, clerkship/internship requirements, and exclusion of prior tele-OSCE experience), with OSCE scores extracted from institutional archives. Because this was a pragmatic observational study, we did not perform a formal a priori sample size calculation; instead, we included all eligible candidates in 2021 and used all available OSCE data from 2019 as historical controls.
The study design was approved by the Ethics Committee of Nanjing Medical University (Approval number 2021 670). According to the “Regulations on Ethical Review of Biomedical Research Involving Humans” issued by the National Health Commission of the People's Republic of China, and the Institutional Review Board's decision, the requirement for informed consent was waived for this observational study of an educational intervention, as it involved no more than minimal risk to participants.
Examination Requirements
Candidates were asked to log onto the university examination website for online registration and identity verification. Each examinee was required to prepare a dual-device, dual-camera setup (details of the requirement are available at https://yjszs.njmu.edu.cn/2022/0921/c10171a221823/page.htm). One device was a laptop computer (or a desktop computer equipped with an external webcam and microphone; headphone use was prohibited for examinees) to serve as the primary front-facing camera. The second device was a smartphone (Android 5.0 or above recommended) using its built-in camera in landscape orientation. During the test, the examinee was instructed to place the smartphone such that its camera provided a lateral or rear side view capturing the examinee's profile and the screen of the main device. Meanwhile, the laptop's webcam (or desktop's external camera) functioned as the front view, showing the examinee's face and hands (students were asked to keep their hands visible on the desk). This dual-camera configuration allowed proctors to monitor both the student's immediate environment and their computer screen in real time (Figure 1). The computer's operating system needed to be Windows 7 or above, and the latest version of Google Chrome was required to access the exam web page. By using 2 synchronized camera feeds, we aimed to enhance exam security and transparency during the remote assessment. All hardware and internet access were provided by students themselves in this implementation.

Figure 1. Schematic diagram of the dual-camera setup.
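As an illustration only, the minimum device requirements described above (Windows 7 or above, Chrome browser, Android 5.0 or above for the side-view smartphone, plus webcam and microphone) could be captured in a simple pre-exam check. The function name and parameters below are hypothetical, not part of the actual platform.

```python
# Hypothetical pre-exam setup check, sketching the published minimum
# requirements. Thresholds mirror the text above; everything else is
# illustrative.

MIN_WINDOWS = 7.0   # Windows 7 or above
MIN_ANDROID = 5.0   # Android 5.0 or above recommended

def setup_ok(windows_version: float, browser: str, android_version: float,
             has_webcam: bool, has_microphone: bool) -> bool:
    """Return True if a candidate's dual-device setup meets the minimums."""
    return (windows_version >= MIN_WINDOWS
            and browser.lower() == "chrome"
            and android_version >= MIN_ANDROID
            and has_webcam
            and has_microphone)
```

In practice such a check would run during the online registration step, before exam day, so that failing setups can be remediated in advance.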
Examination Procedures
Intervention: Tele-OSCE Implementation
We implemented a smartphone-based tele-OSCE aligned with the course blueprint and competency domains. Each candidate used a personal smartphone to connect to the secure platform; identity verification included real-name login and photo ID checks. Proctoring procedures comprised continuous audio–video surveillance (via the dual cameras described above), standardized 360° room scans, and time-stamped recording of sessions. Contingency standard operating procedures (SOPs) covered unexpected events such as connection dropouts, background noise, or third-person entry into the room. SPs and examiners underwent calibration training using case scripts, checklists, and rating exemplars. Network redundancy was provided: candidates were instructed to use stable local Wi-Fi with a mobile hotspot as backup. The examination comprised 2 sequential stations—History Taking (10 min with an SP) and Clinical Reasoning (15 min with examiners)—with synchronized scheduling through a virtual waiting area and automated station switching. All data (video, audio, and scores) were encrypted and archived on institutional servers according to university policy.
The smartphone application-based online platform was developed and validated by the university specifically to enable synchronized interactions among students, SPs, examiners, and technical coordinators. 29
As a pilot implementation in high-stakes settings, the tele-OSCE was intentionally designed with only 2 stations (history taking [Station A] and clinical reasoning [Station B]) to prioritize feasibility and risk mitigation during the pandemic. This streamlined structure aligns with the efficiency requirements of the UNGEE, which mandates core competency assessment within a limited time frame, while accommodating the technical complexities of remote administration. As illustrated in Figure 2, the examination workflow proceeded as follows:
Preassessment: on exam day, students logged into the platform and entered a virtual “Waiting Area” until the coordinator initiated the examination. Once the coordinator sent the “start test” command, the student's interface automatically allowed entry into a virtual exam room to meet the SP.
Station A (History Taking): in the virtual exam room, students engaged in a 10-min structured medical history interview with an SP, focusing on symptom elicitation, communication skills, and other history-taking competencies. Two blinded examiners, physically colocated with the SPs but concealed from the student's view, independently evaluated the student's performance using a standardized competency checklist. Upon completion of the 10-min station, students returned to the virtual waiting area. The platform then automatically proceeded to Station B once both examiners had submitted their Station A assessments.
Station B (Clinical Reasoning): students entered a second virtual room to participate in a 15-min discussion with examiners, demonstrating diagnostic reasoning, differential diagnosis formulation, and treatment planning. Examiners scored the students' responses using a standardized rubric. After 15 min, students exited the virtual room and the platform indicated the end of the exam for that candidate.

Figure 2. Structure and workflow of the tele-OSCE process. Abbreviation: OSCE, Objective Structured Clinical Examination.
The maximum scores for Stations A and B were 60 and 40 points, respectively, reflecting the greater weight of history-taking skills in our curriculum's competency frameworks (total score = 100).
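The automated station-switching rule described above (a candidate advances from Station A to Station B only once both examiners have submitted their Station A scores) can be sketched as follows. This is an illustrative model only, not the platform's actual code; the class and function names are invented.

```python
# Minimal sketch of the auto-advance rule: a candidate returns to the
# virtual waiting area after Station A and is moved to Station B only
# when BOTH blinded examiners have submitted their assessments.

from dataclasses import dataclass, field

@dataclass
class Candidate:
    cid: str
    station: str = "waiting"                      # waiting -> A -> waiting -> B
    submitted: set = field(default_factory=set)   # examiner IDs who scored Station A

def start_station_a(c: Candidate) -> None:
    c.station = "A"

def submit_station_a(c: Candidate, examiner_id: str) -> None:
    c.submitted.add(examiner_id)
    if len(c.submitted) == 2:        # both examiners have submitted
        c.station = "B"              # platform auto-advances the candidate
    else:
        c.station = "waiting"        # candidate waits in the virtual waiting area

c = Candidate("S001")
start_station_a(c)
submit_station_a(c, "E1")   # first examiner submits; candidate waits
submit_station_a(c, "E2")   # second examiner submits; auto-switch to Station B
```

This kind of event-driven transition is what kept the between-station interval to seconds rather than minutes.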
Case Development
Four SP cases were developed by the university's clinical education committee to reflect common clinical presentations in general medicine: chest pain, dyspnea, abdominal pain, and jaundice. The case development process and blueprinting have been described previously. 29 To ensure examination integrity across the 2-day testing window, each candidate was assigned a unique case during their half-day session, preventing content leakage between cohorts.
SP Training
The SPs were recruited and trained by the university to meet the qualifications for high-stakes examinations. They were introduced to their case scripts 1 day before the exam. SP training was facilitated by instructors certified by the Association of Standardized Patient Educators. The training materials (case narratives, checklists, etc) were standardized based on cases provided by the case development group, ensuring that all SPs enacted scenarios consistently.
Examiner Recruitment and Blinding
The examiners were clinicians familiar with the symptom domains and cases included in the exam. Each examiner received the checklist guidelines and scoring rubrics for Stations A and B 1 h before the exam for final review.
During the tele-OSCE sessions, examiners scoring Station A were in the same room as the SP (to observe the live interaction) but they were not visible to the student, and they did not interact with the student directly. All examiners were blinded to candidate identities; the platform assigned code identifiers to each user (student, SP, examiner) upon login to conceal personal information and avoid any bias.
Pilot Run
We conducted a pilot test of the platform and procedures prior to the formal examination. 29 This included a full technical rehearsal and workflow simulation. To ensure a smooth examination process, all examinees completed an online practice exam 19 on the day before the exam, and faculty participated in a pre-OSCE calibration session. Feedback from the pilot run was used to refine SOPs (eg, clarifying steps to take if an SP's connection was lost mid-station, etc).
Evaluation and Outcomes
Feasibility Metrics
1. Total examination time per candidate, from sign-in to completion of both stations.
2. Direct costs per candidate, comparing tele-OSCE and in-person expenditures.
3. Logistical metrics, including on-site personnel density during the examination.
Acceptability
Multistakeholder acceptability was assessed via postexam questionnaires tailored to each group: students, SPs, and examiners. Each questionnaire used 5-point Likert-scale items (1 = strongly disagree to 5 = strongly agree) to gauge perceptions of the tele-OSCE's clarity, ease of use, technical quality, and overall experience. The student survey included items such as “I could hear/see the SP and examiner clearly” and “The online test had no influence on my performance,” among others (results compiled in Supplemental Table 2). SPs and examiners were asked parallel questions (eg, SPs were asked whether they could clearly observe students’ behavior; examiners were asked whether remote evaluation affected their judgment; see Supplemental Tables 3 and 4). The questionnaire was designed and used in our previous study. 29 It demonstrated a Cronbach's alpha of .97, with item-total correlations ranging from .79 to .93. We also collected free-text comments for qualitative feedback.
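For readers unfamiliar with the reliability statistic reported above, the following sketch shows how Cronbach's alpha is computed from an item-response matrix. The ratings below are fabricated purely to illustrate the calculation; they do not reproduce the study data (whose alpha of .97 came from the authors' previous study).

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)

import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of Likert ratings."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Made-up ratings: 5 respondents x 3 Likert items
ratings = np.array([[5, 5, 4],
                    [4, 4, 4],
                    [5, 4, 5],
                    [2, 3, 2],
                    [4, 5, 4]])
alpha = cronbach_alpha(ratings)   # 0.9 for this toy matrix
```

Values above roughly .9, as reported for this questionnaire, indicate very high internal consistency.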
Performance Comparison
The primary performance outcome was the students’ total OSCE score and pass/fail decision, comparing the tele-OSCE (2021) with the in-person OSCE (2019). Students’ performance scores (out of 100) were recorded using a checklist developed by the Academic Affairs Office staff. For pass/fail decisions, a score ≥60 out of 100 was required to pass, following the standard criteria from prior years. Additionally, to evaluate interrater reliability in the tele-OSCE, we analyzed the correlation between the 2 examiners’ scores for each of the 4 cases using Spearman's rank correlation coefficients and intraclass correlation coefficients (ICCs).
Statistical Analysis
We used SPSS (v 26; IBM Corp.) for statistical analyses. For the primary comparison of student performance (tele-OSCE vs in-person), we applied independent-samples t tests; pass/fail rates between cohorts were compared using the chi-square test.
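The analyses were performed in SPSS; an equivalent sketch in Python with SciPy is shown below for reference, using simulated scores. Because the real per-student data are not reproduced here, the outputs will not match the published statistics.

```python
# Sketch of the study's main analyses (t test, chi-square, Spearman)
# using SIMULATED data drawn from the reported means and SDs.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
tele = rng.normal(65.6, 11.2, 164)       # simulated 2021 tele-OSCE total scores
inperson = rng.normal(72.0, 10.6, 272)   # simulated 2019 in-person total scores

# Independent-samples t test on total scores (primary comparison)
t_stat, p_scores = stats.ttest_ind(tele, inperson)

# Chi-square test on pass/fail counts (pass threshold: score >= 60)
table = [[(tele >= 60).sum(), (tele < 60).sum()],
         [(inperson >= 60).sum(), (inperson < 60).sum()]]
chi2, p_pass, dof, _ = stats.chi2_contingency(table)

# Spearman rank correlation, eg, between 2 examiners' scores for one case
examiner1 = tele[:20]
examiner2 = examiner1 + rng.normal(0, 2, 20)   # second rater with small noise
rho, p_rho = stats.spearmanr(examiner1, examiner2)
```

Note that the simulated normal scores produce far more sub-60 "failures" than the real, strongly left-truncated score distributions; the block illustrates the tests, not the results.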
Results
Participant Characteristics
A total of 164 5th-year medical students completed the tele-OSCE in January 2021 and were included in analysis (after excluding 12 students who had participated in a prior online OSCE pilot). These 164 students represented all eligible UNGEE candidates from our institution in 2021 who met inclusion criteria. The control group comprised 272 students who sat for the in-person OSCE in 2019 under identical curricular conditions. These 2 cohorts were final-year medical students of similar age (mean ∼23 years) and had completed equivalent clinical rotations.
Primary Outcome: Performance Comparison
In the primary comparison of total OSCE scores between cohorts, students in the 2021 tele-OSCE cohort obtained lower scores than those in the 2019 in-person OSCE cohort. The mean total score was 65.6 ± 11.2 for the tele-OSCE cohort versus 72.0 ± 10.6 for the in-person cohort; the difference was statistically significant, with a moderate effect size favoring the in-person modality.

Figure 3. (A) The average scores of the tele-OSCE in 2021 and the in-person OSCE in 2019 (65.6 ± 11.2 and 72.0 ± 10.6, respectively). Abbreviation: OSCE, Objective Structured Clinical Examination.
The pass/fail outcomes showed 95.1% of students passed the tele-OSCE versus 96.3% passed the 2019 OSCE (pass rates were high in both years because the UNGEE OSCE is a requirement for graduation and almost all students meet competency expectations). This small difference in pass rates was not statistically significant (χ2 = 0.34).
Interexaminer Reliability
Table 1 shows the interexaminer agreement for each of the 4 cases in the tele-OSCE. Spearman correlation coefficients and intraclass correlation coefficients (ICCs) between the 2 examiners’ scores are reported for each case.
The Correlation Analysis of Scores From Different Examiners of the Four Cases.
Abbreviation: ICC, intraclass correlation coefficient.
Feasibility: Duration and Efficiency
The tele-OSCE demonstrated increased efficiency in certain aspects. The total examination time per student (from sign-in to completion of both stations) was ∼40 min for the tele-OSCE, compared with ∼48 min for the 2019 in-person OSCE (Table 2). On average, the waiting time between Stations A and B was only about 10 to 20 s, essentially the time for an examiner to click “Invite” for the next station. This near-instantaneous transition effectively eliminated the typical 1 to 2 min of downtime seen in physical OSCE circuits (walking to the next room) or in videoconference-based OSCEs (loading a new breakout room).
Feasibility: Cost Analysis
The financial analysis indicated that the tele-OSCE incurred higher direct costs per student (+12.7%) than the traditional OSCE. Total expenditures were lower for the tele-OSCE (¥51 340, ≈USD7960) than for the 2019 OSCE (¥70 400, ≈USD10 915), but when distributed across the different cohort sizes (176 vs 272 students scheduled), this equated to USD45.22 per tele-OSCE candidate versus USD40.13 per in-person candidate (Table 2). The difference was primarily driven by tele-OSCE-specific investment, notably the amortized cost of platform development (∼39% of tele-OSCE expenses) and additional technical support fees. In contrast, the in-person OSCE's largest cost components were personnel (eg, examiner and SP stipends, ∼57% of in-person costs) and logistics (venue rental, materials). Detailed cost breakdowns are provided in Supplemental Table 1. Despite the moderate increase in per-student costs, the tele-OSCE offered savings in other respects: it reduced total examination time and required fewer support staff on-site (Supplemental Table 1). Moreover, the tele-OSCE's compliance with pandemic restrictions yielded operational benefits: on-site personnel density during the exam days dropped from 4.26 persons per 100 m2 in 2019 to 0.34 persons per 100 m2 in 2021 (Table 2), greatly reducing infection risk and resource use.
Financial and Time Cost of the Tele-OSCE and 2019 In-Person OSCE.
Abbreviation: OSCE, Objective Structured Clinical Examination.
Commuting time for non-local students.
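The per-candidate cost figures above follow directly from the reported USD totals and scheduled cohort sizes, as this short arithmetic check shows:

```python
# Reproducing the per-candidate cost arithmetic from the reported totals
# (≈USD7960 for 176 scheduled tele-OSCE candidates vs ≈USD10 915 for
# 272 in-person candidates).

usd_tele, n_tele = 7960, 176
usd_inperson, n_inperson = 10915, 272

per_tele = usd_tele / n_tele                  # ≈45.22 USD per tele-OSCE candidate
per_inperson = usd_inperson / n_inperson      # ≈40.13 USD per in-person candidate
pct_increase = (per_tele / per_inperson - 1) * 100   # ≈ +12.7%
```

The +12.7% figure is thus a per-candidate comparison; the tele-OSCE's total expenditure was actually lower, but spread over a smaller cohort.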
Acceptability: Student Perspective
Out of 164 tele-OSCE participants, 161 (98%) completed the postexam student survey. Overall, student feedback was positive regarding the tele-OSCE format. As seen in Figure 4, a majority of students agreed or strongly agreed that they could clearly hear the SPs (93% agreement) and clearly see the SP's expressions (86% agreement). Most found the smartphone app platform easy to operate (72% agreement) and were comfortable with the idea of tele-OSCEs being used in future assessments (78% agreed they would be willing to participate in tele-OSCEs for other summative exams). Notably, when asked if “the online test had no influence on my performance,” responses were mixed: about 34% strongly agreed or agreed, 37% were neutral, and 29% disagreed (ie, felt the online format did have some influence). Approximately one-third of students indicated that communicating through an online interface affected their performance to some degree. This aligns with qualitative comments we collected: for example, 1 student mentioned feeling that “I cannot think as clearly as usual when talking on the phone,” suggesting that the medium itself introduced some discomfort (Table 3). Another student explicitly commented that “network and device quality would affect exam performance,” underscoring concerns that technical factors could hinder them (Table 3). On a positive note, despite those concerns, the majority of students (∼79%) indicated they could accept the tele-OSCE format in the UNGEE context and even be open to using similar online platforms for future formative practice with SPs.

Figure 4. Results of the acceptability of the tele-OSCE from 164 students. Examinees completed a 7-item online questionnaire rated on a 5-point Likert scale (1-5) after the exam to measure various aspects of acceptability. The results of each item are presented as a 100% stacked bar with different color codes. Detailed data can also be found in Supplemental Table 2. Abbreviation: OSCE, Objective Structured Clinical Examination.
Excerpted Comments on Tele-OSCE.
Abbreviations: OSCE, Objective Structured Clinical Examination; SP, standardized patient.
Acceptability: SP and Examiner Perspectives
All 20 SPs and 39 examiners involved in the tele-OSCE completed their respective acceptability questionnaires. As seen in Figure 5, SPs reported high levels of comfort and effectiveness in the tele-OSCE setting: 95% agreed they could hear students clearly, and 70% felt they could observe students’ nonverbal cues (movements, facial expressions) clearly through the video. SPs did not report major difficulties with the technology; only 1 SP mildly disagreed that the platform was easy to use. In terms of preferences, 85% of SPs said they could accept the tele-OSCE format, and a similar proportion would be willing to participate in such online exams in the future. The main concern raised by SPs in comments was the lack of physical interaction—a few SPs mentioned that not being in the same room made it harder to create empathy or fully assess certain communication aspects. Nonetheless, they were generally satisfied with how the cases were conducted virtually.

Figure 5. Results of the acceptability of the tele-OSCE from 20 SPs. All SPs completed an 8-item online questionnaire rated on a 5-point Likert scale (1-5) after the exam to measure various aspects of acceptability. The results of each item are presented as a 100% stacked bar with different color codes. Detailed data can also be found in Supplemental Table 3. Abbreviations: OSCE, Objective Structured Clinical Examination; SP, standardized patient.
Examiners similarly had a favorable view. As seen in Figure 6, all examiners found the platform user-friendly and agreed that they could hear the students clearly. Furthermore, 95% of examiners reported that evaluating students via devices had no adverse effect on their judgment. In terms of future use, 95% of examiners said they could accept the tele-OSCE format, and around 93% would recommend it for other summative assessments. One consistent piece of qualitative feedback from examiners concerned exam security: a few examiners echoed the sentiment that we must ensure robust measures to prevent cheating when candidates are unsupervised at home (Table 3).

Figure 6. Results of the acceptability of the tele-OSCE from 39 examiners. All examiners completed a 6-item online questionnaire rated on a 5-point Likert scale (1-5) after the exam to measure various aspects of acceptability. The results of each item are presented as a 100% stacked bar with different color codes. Detailed data can also be found in Supplemental Table 4. Abbreviation: OSCE, Objective Structured Clinical Examination.
Discussion
Our study demonstrates that a smartphone-based tele-OSCE can serve as a viable alternative to the traditional in-person OSCE for a high-stakes examination, without fundamentally compromising the assessment of core clinical competencies. Despite using only 2 stations and a remote format, the tele-OSCE produced high pass rates and a reasonable score distribution, and it was well accepted by students, SPs, and examiners, even though mean scores were lower than those observed in the 2019 in-person cohort. Moreover, the tele-OSCE introduced efficiencies and logistical benefits, highlighting important implications for broader implementation.
At the global score level, the 2021 tele-OSCE cohort performed significantly lower than the 2019 in-person cohort, with a moderate effect size favoring the in-person modality. This should not be interpreted as evidence that the tele-OSCE format is inherently inferior, because the 2 cohorts differed in calendar year and educational context. The 2021 cohort experienced substantial disruption of clinical placements during the COVID-19 pandemic, and candidates also had to manage dual devices, camera positioning and connectivity, which may have added cognitive load and performance anxiety. Despite lower mean scores, pass rates and the overall distribution of grades remained within an acceptable range, suggesting that—at least for the focused competencies of history taking and clinical reasoning—the tele-OSCE remained usable for high-stakes decisions. Taken together, our quasi-experimental comparison does not demonstrate equivalence between tele-OSCE and in-person OSCE, but indicates that a smartphone-based tele-OSCE can function as a rigorous assessment that may slightly increase difficulty for candidates. This pattern is broadly consistent with emerging literature reporting no large performance decrements with virtual OSCEs, while highlighting that local contextual factors (such as pandemic-related training disruption and technical demands) can influence scores. For high-stakes decisions, these findings underscore the need to monitor score distributions and pass rates carefully, review standard-setting procedures, and provide targeted preparation (eg, mock tele-OSCEs and technical rehearsal) so that additional challenges introduced by the online format do not unintentionally disadvantage candidates. Importantly, our study was not designed or powered as a formal noninferiority trial, and randomized or counterbalanced comparisons would be needed to determine psychometric equivalence between tele-OSCEs and traditional in-person OSCEs.
Our tele-OSCE was distinctive in using a dedicated mobile platform (built on WeChat) rather than general videoconferencing tools like Zoom. 29 This design addressed technical pitfalls reported in other virtual OSCEs, where manual management of breakout rooms can introduce delays and coordination errors.18,19,31–35 In our system, room assignment and station switching were automated, so the interval between stations was typically only 10 to 20 s. In contrast to Zoom-based OSCEs that rely on manually operated breakout rooms (which can introduce delays and coordination errors), our system's auto-allocation mechanism kept the between-station waiting time to a minimum. We do not claim this ultra-short transition is “optimal.” There is currently no consensus on the ideal break length between OSCE stations, and traditional in-person OSCEs usually allow a brief pause of around 30 to 60 s for movement and mental reset. In our implementation, the 10 to 20 s interval was simply a byproduct of the platform's workflow rather than a deliberate pedagogical choice. Anecdotally, students did not report the short transition as problematic, which may be related to the fact that only 2 stations were used. In longer OSCE circuits, scheduled rest stations or longer pauses are often introduced to combat fatigue. Future research should examine whether very short breaks, such as those in our tele-OSCE, have any impact—positive or negative—on candidate performance, fatigue and anxiety, and whether slightly longer intervals might actually be beneficial.
The tele-OSCE proved to be resource-efficient in some respects but more resource-intensive in others. Direct institutional cost per student was slightly higher for the tele-OSCE, mainly due to platform development and IT support, yet once the platform is built the marginal cost of additional tele-OSCEs is low and amortization over multiple years could make this model cost-effective. Importantly, we explicitly separated institutional costs from personal costs in our analysis: student travel and lodging for the 2019 in-person OSCE were not counted as savings for the institution, but they were effectively eliminated for candidates in 2021. Tele-OSCEs therefore may increase organizer costs modestly while simultaneously reducing time and financial burdens for students. From an equity standpoint, removing the need to travel can widen access for candidates from distant or underserved regions, who might otherwise face substantial relocation expenses. At the same time, the tele-OSCE model shifts responsibility for hardware and internet access to students; in our implementation, all candidates used their own devices and connections, and several students noted that “network and equipment quality” affected their performance. For high-stakes examinations, organizers should therefore not only define minimum technical requirements but also provide practical support—such as mock exams, equipment checks, and loaner devices or on-site tele-exam hubs—to ensure that no student is disadvantaged by technology or socioeconomic constraints rather than their clinical competence.
Exam security and integrity are critical challenges in remote, high-stakes assessments. We implemented multiple safeguards: strict identity verification (real-name login and ID checks), continuous monitoring via 2 camera angles, and unique case assignments to prevent sharing of answers. Additionally, our platform's design had some inherent security advantages; for example, WeChat does not support built-in screen or call recording, which reduces the risk of examinees easily capturing exam content. All users were anonymized by code names to prevent bias or collusion. Despite these measures, we acknowledge the valid concern raised by an examiner: “how to prevent cheating” remains a foremost concern when students are not in a controlled physical environment. It is possible for a student at home to receive unseen assistance or use unauthorized materials if proctoring is insufficient. Going forward, additional steps such as artificial intelligence (AI)-driven proctoring (eg, algorithms to detect suspicious eye movements or the presence of a second person in the room) could augment human oversight. We also suggest that future tele-OSCEs might involve live remote proctors or require students to take the exam at designated centers even if the encounter is virtual, to add another layer of invigilation. In this first implementation, we did not encounter any confirmed cheating incidents, and the fact that performance was on par with the in-person exam suggests no widespread malpractice. Nonetheless, maintaining the credibility of tele-OSCE results will depend on continuously strengthening security protocols as both technology and methods of circumventing it evolve.
Another challenge for our tele-OSCE was the limited number of stations. We effectively assessed 2 major domains (communication/history and reasoning) with 4 clinical case variations, a narrower sampling of skills than in most traditional OSCEs, which often also include physical examination and procedural skills. This was a conscious decision to keep the pilot tele-OSCE feasible and safe under pandemic constraints. However, it raises concerns about content validity and decision reliability. We attempted to mitigate this by weighting the history-taking station more heavily (60%) to reflect its importance and by ensuring the cases were of comparable difficulty to prior exams. We also note that all students had already demonstrated basic clinical skills in earlier in-person assessments (during their 4th year), so the tele-OSCE's goal was to certify continued competence rather than teach new skills. Nonetheless, our study concurs with previous recommendations that increasing the number of stations would likely improve the reliability and robustness of high-stakes OSCEs. Standard setting was another aspect we kept simple, using the same passing threshold the university has traditionally applied. A more rigorous approach in subsequent tele-OSCEs could involve formal standard setting (eg, Angoff) and calculating decision consistency if pass/fail decisions are to be made on fewer stations. The relatively high pass rates in both years suggest the exam was designed to ensure minimal competence rather than to be highly selective; thus, pass/fail decisions were not borderline for most students. Had there been many borderline performances, the limited content could have been a greater concern for fairness. We have tempered our conclusions accordingly: rather than claiming full equivalence, we frame our findings as demonstrating feasibility and preliminary comparability, with the understanding that further validation would strengthen the evidence.
In terms of stakeholder acceptance, our data are encouraging. Students appreciated not having to travel and the innovative aspect of the exam, although about one-third felt the format influenced their performance. Interestingly, this split in perception (some feeling affected, some not) mirrors findings from other studies in which, for example, the average “experience” rating for tele-OSCEs was moderate (around 7.5 out of 10) and students desired more practice and fewer technical issues.36 Those concerns are not unlike the usual reactions to any new format; familiarity improves comfort. Indeed, many of our students requested more mock online sessions ahead of the graded exam. This suggests that incorporating low-stakes practice OSCEs online could help students adapt to the modality, potentially leveling the field between those who are tech-savvy and those who are less so. Faculty (examiners and SPs) were largely positive, which is notable because faculty acceptance can be a barrier to implementing new assessment methods. Examiners in our study cited time savings and ease of grading behaviors on-screen, consistent with reports that examiners find remote OSCEs less burdensome in some respects.37 The key faculty concern was ensuring the exam remained rigorous and fair, emphasizing once again that issues like potential cheating or technical failures must be adequately addressed for faculty to fully trust the system.37,38
Limitations
Our study has several limitations. First, the historical control group (2019 in-person OSCE) differed in time and context, which introduces potential confounders (eg, the 2021 cohort's education was partly during COVID disruptions). These cohort and temporal differences could affect performance in ways unrelated to exam modality. Ideally, a parallel-group design or a controlled trial would compare tele- versus in-person OSCEs under the same conditions; our use of a 2019 baseline, while pragmatic, means results should be interpreted cautiously. Future studies should include a concurrent control group to strengthen causal inferences about modality differences. Additionally, the tele-OSCE's 2-station format sampled a narrow slice of clinical competence, which likely limits the reliability of pass/fail decisions. We did not perform a generalizability (G) study across cases and raters; consequently, the decision consistency of a 2-station exam remains uncertain. Future studies should incorporate more stations or use G-theory modeling to ensure sufficient reliability for high-stakes decisions. Second, while examiners were blinded to candidate identities, their physical proximity to SPs (being in the same room) may have inadvertently influenced scoring or given them additional cues (eg, hearing the SP's live reactions). In future iterations, this could be mitigated by having examiners also participate remotely (fully separate from SPs) to mirror a true double-blind scenario. Third, the platform's dependency on stable internet access may inherently exclude or disadvantage candidates in low-bandwidth regions. Not all students globally have equal access to fast internet or quiet home environments. This underscores the need for hybrid assessment models or institutional support systems that can accommodate students from varying circumstances (eg, providing on-campus facilities for those who cannot take the exam at home). 
Fourth, content coverage was restricted: within the 2 stations we addressed core competencies, but other important skills (physical examination, procedural tasks) were not directly assessed, which limits the exam's scope. Future research should expand the number of stations, integrate AI-driven proctoring, and validate the platform in diverse socioeconomic settings. Lastly, our study did not include a formal a priori sample size calculation. Although we included all eligible students and observed no large differences in performance, the study may not have been powered to detect small or moderate effects or to demonstrate statistical equivalence between modalities.
Conclusions
The mobile tele-OSCE represents a viable alternative to traditional OSCEs, offering logistical flexibility and expanded access without compromising core competency assessment. Our findings provide proof of concept that a high-stakes clinical skills exam can be conducted on a smartphone-based platform with performance outcomes that remain within an acceptable range relative to the conventional format. This has significant curricular implications: tele-OSCEs can be integrated into medical education to enhance resilience (continuity during disruptions), improve geographic equity (by removing travel barriers), and potentially streamline examination processes. As medical education evolves toward decentralized, technology-enhanced paradigms, this study provides a reference for balancing innovation with psychometric validity in high-stakes contexts.
Supplemental Material
Supplemental material (sj-docx-1-mde-10.1177_23821205251413401 and sj-docx-2-mde-10.1177_23821205251413401) for Introducing a Smartphone Tele-Objective Structured Clinical Examination to Support High-Stakes Competency Decisions: A Quasi-Experimental Study and Curricular Implications by Xiaozhi Wang, Junjie Du, Binlin Luo, Liling Chen, Huanhuan Chen, Surong Jiang, Wei Sun, Lei Zhou, Lars Konge, Hua Huang and Qiang Ding in Journal of Medical Education and Curricular Development.
Acknowledgments
The authors would like to thank Mrs Ning Wang and Mr Jianfeng Ge of the Medical Simulation Center for their valuable assistance with data collection. Their efforts in collecting and organizing data on parameters of the study, including expenditure, time, and personnel density of the examinations, were instrumental in ensuring the accuracy and reliability of the study results.
Ethics Approval and Consent to Participate
This study was approved by the Ethics Committee of Nanjing Medical University (Approval No 2021 670). Informed consent was waived in accordance with national regulations and the IRB's determination.
Author Contributions
Conceptualization: Lars Konge, Hua Huang and Qiang Ding. Methodology: Xiaozhi Wang, Junjie Du, Binlin Luo, Liling Chen, Lars Konge and Huanhuan Chen. Data curation: Xiaozhi Wang, Surong Jiang and Wei Sun. Formal analysis: Xiaozhi Wang and Lei Zhou. Investigation: all authors. Visualization: Xiaozhi Wang and Junjie Du. Writing—original draft: Xiaozhi Wang and Junjie Du. Writing—review and editing: all authors. Supervision: Hua Huang and Qiang Ding. All authors approved the final manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Medical Education Branch of the Chinese Medical Association and the National Center for Medical Education Development under Grant 2023A20; Major Educational Research Project of Nanjing Medical University under Grant 2023ZDZB003; Jiangsu Provincial Educational Science Planning Project under Grant B-b/2024/01/108; Jiangsu Provincial Graduate Research and Practice Innovation Program and Teaching Reform Project for Degree and Postgraduate Education under Grant JGKT24_A009 and Special Project on Digital Construction of Teaching Materials in the New Era of Jiangsu Provincial Colleges and Universities under Grant 2024JCSZ04 to Qiang Ding. Additional funds from Nanjing Medical University under Grant 2021ZC043 and 2023LX059 awarded to Xiaozhi Wang and Wei Sun, respectively.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
References
