Abstract
Objective
To evaluate the feasibility and impact of remote consecutive, in-person consecutive, and in-person simultaneous interpreter modalities on clinical workflow, patient experience, and provider experience in a multidisciplinary pediatric craniofacial clinic.
Design
Mixed-methods study incorporating quantitative clinical workflow and patient survey data with qualitative semi-structured interviews.
Setting
Single-site multidisciplinary pediatric craniofacial clinic at an academic medical center.
Patients, Participants
A total of 170 patients were seen during the study period: 126 (74.1%) English-speaking (for no interpreter comparison) and 44 (25.9%) Spanish-speaking using interpreter services. A total of 105 (61.8%) patients completed voluntary patient satisfaction surveys. Sixteen craniofacial providers and four Spanish-speaking interpreters participated in interviews.
Interventions
Language interpretation was provided via three modalities: remote (audio and video) consecutive, in-person consecutive, and in-person simultaneous.
Main Outcome Measure(s)
Clinic cycle time and face-to-face time; patient satisfaction scores; thematic analysis of provider and interpreter interviews.
Results
Clinic time and satisfaction scores did not differ significantly across modalities. Interviews revealed that in-person interpretation supported rapport, trust, and clarity, while remote interpretation posed technical and relational challenges. Simultaneous interpretation was valued for efficiency but required greater cognitive effort from interpreters. Preferences for in-person consecutive and simultaneous interpretation varied with patient and clinic needs.
Conclusions
While quantitative outcomes showed minimal workflow differences among interpreter modalities, qualitative findings highlight important communication benefits of in-person interpretation, especially when interpreters are experienced and integrated into the care team. Tailored interpretation approaches may better promote equitable care in multidisciplinary pediatric craniofacial settings.
Introduction
Effective communication is essential to high-quality healthcare. However, patients who speak a primary language other than English (PLOE) frequently face barriers that compromise their ability to fully engage in medical encounters, resulting in lower quality of care, increased risk of adverse events, and worse health outcomes. 1 These issues are amplified in pediatric care, where communication occurs not only with the patient but also with caregivers and family members.2,3 Children of PLOE caregivers experience health disparities such as fewer recommended preventative care visits compared to children of parents whose primary language is English. 2
In the absence of professional interpretation, families often rely on untrained individuals, including the pediatric patients themselves or other family members, to serve as ad hoc interpreters. Ad hoc interpreting has been associated with higher rates of interpretation errors, potentially compromising care quality and patient safety. 4 In contrast, the use of professional interpreter services has been associated with fewer communication errors and improved patient satisfaction, access, and clinical outcomes. 1 Communication challenges are not only linguistic; they can also arise from differences in cultural beliefs and practices. In these cases, professional interpreters often serve as cultural brokers who facilitate providers’ understanding of a family's cultural beliefs and practices, thereby promoting more culturally responsive and effective care. 5
Several interpretation methods are employed in healthcare to support communication between providers and PLOE patients: in-person interpretation, video remote interpretation (VRI), and audio-only services. Evidence suggests that VRI provides better outcomes than audio-only interpretation because of the presence of visual cues. 6 Among in-person methods, interpretation is often delivered consecutively, with the speaker pausing while the interpreter relays the message. In simultaneous interpretation, by contrast, the interpreter relays the conversation to the other party in real time, using a microphone and headset system, while the provider is speaking. This modality has been associated with higher levels of satisfaction and a perceived reduction in communication errors among Spanish-speaking families, allowing for direct and effective communication and enhancing mutual understanding and care quality. 7 Healthcare providers and interpreters often favor in-person interpretation because it minimizes technical disruptions and fosters rapport.8,9 Evidence also suggests that, in some settings, dedicated in-person interpreters may reduce visit length and improve efficiency compared to video interpretation, which can be slowed by technical difficulties. 10 However, despite these advantages, in-person interpreters are not always available, often due to staffing limitations and competing needs across clinical settings. 11
While research on interpreter services has been performed in pediatric inpatient 7 and primary care settings, 12 less is known about interpretation best practices in multidisciplinary and interdisciplinary outpatient clinics. The American Cleft Palate Craniofacial Association recommends that the healthcare needs of children with craniofacial differences be managed by an interdisciplinary team. 13 In these settings, multiple providers often see the same patient during a single visit, which can make clear, effective communication with patients and their families particularly challenging. For PLOE patients, this can be even more difficult and may increase psychosocial distress. 14 Therefore, identifying effective interpretation practices is key to delivering high-quality, interdisciplinary care for PLOE patients. Studies on patient and provider preferences and satisfaction with different interpretation methods in an outpatient multidisciplinary clinic setting are limited, and no studies have examined in-person simultaneous interpretation in outpatient pediatric settings. Most prior studies focus on comparing in-person versus remote interpretation modalities.10,15 The interpretation modality employed may influence not only the quality of patient–provider communication, but also clinic efficiency, satisfaction, and team dynamics.
This pilot study sought to evaluate the feasibility and impact of three interpreter modalities (remote consecutive, in-person consecutive, and in-person simultaneous) on patient experience and clinic workflow in a multidisciplinary pediatric craniofacial clinic. Using a mixed-methods approach, this study aimed (1) to understand how interpreter modality may affect the delivery of complex outpatient care and (2) to identify best practices for facilitating patient-centered communication and care in this setting.
Methods
An Institutional Review Board reviewed and approved all study procedures. All participants provided informed consent prior to participation.
Study Design
This study used a convergent mixed-methods design. The quantitative component assessed clinic workflow metrics (patient cycle time, provider face-to-face time with patients, and follow-up appointment scheduling) and overall patient satisfaction. The qualitative component consisted of semi-structured interviews with providers and interpreters to understand their experiences with, and preferences among, the different interpretation modalities. A previous study similarly employed this design to explore the strengths and limitations of in-person and remote interpretation. 12
Setting
This study took place in a multidisciplinary pediatric craniofacial clinic at an academic medical center. The Pediatric Craniofacial Clinic (CFC) comprises an interdisciplinary team of providers who collaborate to provide comprehensive team-based care. Each clinic visit involves a series of sequential encounters with up to six core providers: a pediatrician, dentist, orthodontist, plastic surgeon, audiologist, and speech-language pathologist. The clinic coordinator manages overall clinic flow, directing providers into patients’ rooms and ensuring each patient is seen by the necessary providers. Not all patients are seen by all six providers, as provider involvement is based on clinical need.
The clinic patient volume is 531 patients per year. Appointments are scheduled by a clinic coordinator, who documents the family's preferred language when scheduling their appointment. The average number of Spanish-speaking patients per year is 90, making Spanish the most common language spoken other than English in the CFC. For patients requiring a Spanish interpreter, an in-person interpreter is requested from the health system's Language Services Department. Interpretation services are delivered based on interpreter availability.
Participants
Study participants included pediatric craniofacial patients and their caregivers, providers, and interpreters. Eligible participants included CFC pediatric patients ages 0 to 21 years and their accompanying caregivers who attended an in-person CFC appointment between April 2024 and December 2024. CFC patients are followed by the multidisciplinary craniofacial team for their specialized care through annual clinic visits. Because Spanish and English were the most common languages spoken in the CFC and the in-person language interpreters involved in the study were available for Spanish only, families who spoke a primary language other than English or Spanish were not approached for the study. The distribution of patients by primary language and interpreter modality is shown in Table 1.
Table 1. Distribution of Patients by Primary Language and Interpreter Modality.
CFC providers included pediatricians, dentists, orthodontists, plastic surgeons, audiologists, and speech pathologists. Additional members of the multidisciplinary team, including social workers and medical scribes, were included in the interviews to better understand the clinical workflow as a whole. Interpreters were certified professional interpreters from the Language Services Department within the same health system.
Interpreter Modalities
Patients who required interpretation received one of the following three modalities: (1) video remote consecutive interpretation via an iPad (remote interpretation), (2) in-person consecutive, or (3) in-person simultaneous. Interpreter modality was assigned based on availability and was not randomized. On the day of the clinic, availability depended on the interpreters from the Language Services Department. When in-person interpreters were not available, the default modality was video remote consecutive interpretation.
Video remote consecutive interpretation via an iPad, referred to in this study as “remote interpretation,” involved a remotely located interpreter joining the clinical encounter through the iPad on video and interpreting after each speaker finished. During in-person consecutive interpretation, the interpreter was physically present in the patient's room and interpreted after each speaker finished. In-person simultaneous interpretation involved the interpreter rendering the provider's speech in real time through a microphone and headset connected with the patient/caregiver, while interpreting the patient/caregiver's speech consecutively to the provider.
Of note, the in-person simultaneous interpreter modality used in this study was a hybrid approach combining consecutive and simultaneous interpretation, as this was the methodology previously implemented in a study at the same health center on inpatient pediatrics family-centered team rounds. 7 This approach was considered well suited to the fast-paced multidisciplinary clinic workflow, in which patients are seen by multiple providers in a single appointment, much as multiple providers speak during family-centered team rounds. This study refers to the hybrid method as “in-person simultaneous interpretation” to distinguish it from “in-person consecutive interpretation.”
Quantitative Data Collection
Clinical Workflow Metrics
A prospective observational approach was used to collect quantitative data in the multidisciplinary pediatric craniofacial clinic from April 2024 to December 2024. Key metrics included total cycle time, provider face-to-face time, and whether a follow-up appointment was scheduled. These metrics were selected based on previous research studies comparing interpreter modalities.10,16,17 Total cycle time was defined as the duration from the patient entering the exam room to the time of departure, excluding pre-clinic waiting time. Face-to-face time measured the duration each provider spent in the room with the patient and family. The research coordinator tracked these times using a digital stopwatch. Follow-up scheduling data were abstracted from the medical record at the end of each clinic day. For a more controlled comparison, a subset of patient visits in which all six providers of the core team saw the patient in a single day was separately analyzed.
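The two timing metrics defined above can be illustrated with a minimal Python sketch. The visit record, its field names, and the clock times below are hypothetical and chosen only for illustration; in the study itself, times were tracked with a digital stopwatch, and this sketch is not the study's data-collection tool.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M"  # assumed same-day clock format for this sketch


def minutes_between(start, end):
    """Elapsed minutes between two same-day clock times."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds() / 60


def total_cycle_time(visit):
    """Exam-room entry to departure; pre-clinic waiting time is not included."""
    return minutes_between(visit["room_entry"], visit["departure"])


def face_to_face_time(visit):
    """Sum of the time each provider spent in the room with the family."""
    return sum(minutes_between(start, end) for _, start, end in visit["encounters"])


# Hypothetical visit with three of the six core providers.
visit = {
    "room_entry": "09:00",
    "departure": "10:10",
    "encounters": [
        ("pediatrician", "09:05", "09:20"),
        ("orthodontist", "09:30", "09:42"),
        ("audiologist", "09:50", "10:05"),
    ],
}

print(total_cycle_time(visit))   # 70.0
print(face_to_face_time(visit))  # 42.0
```

In this hypothetical visit, the 28-minute gap between the two metrics is time the family spent in the exam room without a provider present.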
Patient Experience Surveys
Patient experience and the patient–provider relationship were assessed using survey items adapted from the Child Hospital Consumer Assessment of Healthcare Providers and Systems (Child HCAHPS)18,19 and the Patient–Doctor Relationship Questionnaire (PDRQ-9). 20 Both surveys have been validated for use with English- and Spanish-speaking patients.18,21 Survey items from the Child HCAHPS domains “Communication with your child's doctor” and “Doctors communication with your child” were merged and simplified to reduce redundancy and improve clarity. Questions from the PDRQ-9 were adapted for use in a pediatric setting, allowing parent/caregiver respondents to answer on behalf of their children.
At the end of each clinic visit, the research coordinator administered the survey in person. Participation was voluntary. Caregivers completed the survey on behalf of pediatric patients; patients over 18 years of age who attended by themselves completed the survey themselves. Surveys were offered in English and Spanish and completed in the participant's preferred language. Surveys are available from the authors upon request and as a Supplemental File.
Statistical Analyses
Descriptive statistics were generated as frequencies and percentages for categorical variables and as medians and interquartile ranges (IQRs) for continuous variables. Comparisons were conducted by language spoken (English vs. Spanish) and interpreter type (in-person vs. remote). Chi-square tests were used to analyze the frequencies of patients who saw individual provider types. The Wilcoxon rank-sum test was used to compare the number of providers seen and the time spent with each provider. To determine whether total cycle time and total face-to-face time differed by language or interpreter type, a linear regression model was used that included a covariate for the number of providers, to account for the fact that not all patients saw all providers during their visit. Subgroup analyses were conducted for the patients who saw all providers, using the Wilcoxon rank-sum test to compare their total cycle time and total face-to-face time.
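To illustrate why the number of providers was included as a covariate, the following Python sketch fits an ordinary least squares model of cycle time on a language indicator and the number of providers seen. It is only a sketch under stated assumptions: the study's analyses were run in SAS, the `ols` helper is a hypothetical hand-written solver, and the six visits are invented, noise-free data constructed so that each additional provider adds 12 minutes and Spanish-language visits add 3 minutes. Standard errors and p-values are not computed here.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical visits: (spanish_speaking, providers_seen, cycle_minutes).
visits = [(0, 4, 58), (0, 5, 70), (0, 6, 82), (1, 5, 73), (1, 6, 85), (1, 6, 85)]

X = [[1, spanish, providers] for spanish, providers, _ in visits]  # intercept first
y = [minutes for _, _, minutes in visits]

intercept, language_effect, per_provider = ols(X, y)
print(round(language_effect, 3), round(per_provider, 3))
```

In these invented data the raw group means differ by about 11 minutes (70.0 vs. 81.0), but most of that gap reflects the Spanish-speaking group seeing more providers; once the covariate is included, the language coefficient is about 3 minutes.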
Survey responses were analyzed using “top-box” scoring methodology. For Child HCAHPS questions, the “top-box” score was defined as a response of “always” on the 4-point Likert scale (never, sometimes, usually, always).7,22 For PDRQ-9 questions, it was defined as a response of “totally appropriate” on the 5-point Likert scale (not at all appropriate to totally appropriate). 20 Survey responses were compared between English-speaking patients (no interpreter used) and Spanish-speaking patients (interpreter used) using Fisher's exact tests. SAS version 9.4 (SAS Institute, Cary, NC) was used for statistical analyses. P-values <.05 were considered statistically significant.
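As a rough illustration of top-box scoring and of a Fisher's exact comparison between two groups, consider the Python sketch below. The function names, responses, and counts are hypothetical. The two-sided p-value uses one common definition (summing the probabilities of all tables no more likely than the observed one) and is not presented as a reproduction of the study's SAS output.

```python
from math import comb


def top_box_rate(responses, top):
    """Proportion of responses equal to the most favorable option."""
    return sum(r == top for r in responses) / len(responses)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are top-box vs. not top-box.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x top-box responses in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical Child HCAHPS-style responses for one item.
responses = ["always", "usually", "always", "sometimes"]
print(top_box_rate(responses, "always"))  # 0.5

# Hypothetical counts: 3 of 4 top-box in one group vs. 1 of 4 in the other.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

Fisher's exact test is a natural choice when some cells are small, as is likely for the interpreter subgroups in this study, because it does not rely on the large-sample approximation behind the chi-square test.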
Qualitative Data Collection
Semi-Structured Interviews
A semi-structured interview guide was developed to probe provider and interpreter perspectives on interpretation modality preference with respect to patient care, clinical workflow, and overall satisfaction. The guide was developed following a literature review and with input from study team members, including one with experience in qualitative methodology. The interview guide is available upon request from the authors and as a Supplemental File.
All clinical providers and interpreters who attended clinic during the study period were invited to participate in virtual interviews between December 2024 and January 2025. Participation was voluntary. One-on-one interviews were conducted virtually by the first author and lasted approximately 45 minutes. Interviews were audio recorded and transcribed using live transcription software. The first author reviewed each transcript for accuracy and clarity. Transcripts were deidentified prior to coding.
Thematic Analysis of Interviews
Interview transcripts were analyzed using thematic analysis. The coding team included four members (S.J., research coordinator; S.N. and N.O., trainees with qualitative training support; and C.T., pediatric faculty with expertise in medical education and training in qualitative methods). An additional study author (J.P.), with expertise in health services and qualitative research, provided methodological consultation but did not participate in coding.
The coding team independently reviewed an initial subset of eight interview transcripts to draft and refine a preliminary codebook. 23 Emerging themes were identified through iterative discussions using a deductive analysis approach focusing on the areas of communication, rapport, workflow, and satisfaction across interpreter modalities. Intercoder reliability was established using a consensus-based approach. Differences in coding were resolved through discussion until agreement was reached. The final codebook was then applied to all remaining transcripts. Code labels were refined continuously until thematic saturation was reached.
To enhance rigor and trustworthiness of the qualitative analysis, the team addressed the criteria of credibility, dependability, confirmability, and transferability.24,25 Credibility was supported through coder triangulation across diverse professional roles and iterative coding with team consensus. Dependability was addressed by maintaining an audit trail of coding decisions and revisions. Confirmability was supported by using verbatim quotes to ground findings in participant voices. Transferability was considered by providing detailed descriptions of the clinical setting, interpreter modalities, and participant characteristics. All coders had experience in pediatric care settings and in the pediatric multidisciplinary craniofacial clinic itself, which informed their interpretation of the data.
Results
Patient Demographics
During the study period, 170 patients attended the multidisciplinary pediatric craniofacial clinic. Of these, 126 (74.1%) were English-speaking and 44 (25.9%) were Spanish-speaking. The number of providers seen during each visit differed significantly by language: English-speaking patients saw a median of 5.0 providers compared to 6.0 providers for Spanish-speaking patients (p = .0416). A significantly higher proportion of Spanish-speaking patients (77.3%) saw an orthodontist compared to English-speaking patients (57.1%) (p = .0177). There was no statistically significant difference in the number of providers seen between interpreter types.
Total Cycle Time and Face-to-Face Times
Among the 170 CFC patients, the median total cycle time was 67.5 minutes (IQR 49.0, 86.0) for English-speaking patients and 73.0 minutes (IQR 59.5, 91.0) for Spanish-speaking patients. Median face-to-face time was similar between groups: 44.5 minutes (IQR 32.0, 57.0) for English-speaking patients and 45.0 minutes (IQR 35.5, 57.5) for Spanish-speaking patients. Follow-up appointments were scheduled by 66.7% of English-speaking patients and 84.4% of Spanish-speaking patients (p = .069).
A linear regression analysis was performed to control for the number of providers seen during the visit, given the group difference noted above. The analysis showed that language was not a significant predictor of total cycle time (estimate (SE) = 1 (4.057); p = .8057) or face-to-face time (estimate (SE) = −0.77 (2.581); p = .7660). Similarly, linear regression comparing in-person and remote interpretation modalities showed that interpreter modality was not a significant predictor of total cycle time (estimate (SE) = 4.584 (7.123); p = .5235) or face-to-face time (estimate (SE) = 2.21 (5.266); p = .6769) (Supplemental Table 1).
Time spent with individual providers was examined across three groups: no interpretation, in-person interpretation, and remote interpretation. English-speaking patients formed the “no interpretation” group. Spanish-speaking PLOE families formed the “in-person interpretation” or “remote interpretation” group, according to the modality used. Figure 1 shows the distribution of time by provider type and interpreter modality. Time with the orthodontist was longer with in-person interpretation, while time with the dentist was similar across all modalities. In contrast, time spent with the audiologist, pediatrician, and speech pathologist was shorter with in-person interpretation than with remote interpretation or no interpreter.

Figure 1. Time spent with individual providers by interpreter type. Median time and interquartile range in minutes spent with individual craniofacial team providers were compared across three groups: English-speaking patients who did not need or use an interpreter (n = 126), Spanish-speaking patients who used in-person interpretation (n = 18), and Spanish-speaking patients who used remote interpretation (n = 26).
A subset of visits in which patients saw all six providers of the core team within a single visit was analyzed descriptively due to the small sample size of the in-person interpretation groups. This subset contained 66 patients: 43 (65.2%) English-speaking and 23 (34.8%) Spanish-speaking. Among the Spanish-speaking patients, 13 (56.5%) used remote interpretation, 5 (21.7%) used in-person consecutive interpretation, and 5 (21.7%) used in-person simultaneous interpretation. As shown in Figure 2, the shortest median total cycle time was observed during visits using consecutive interpretation (70.0 minutes) and the longest during visits using remote interpretation (90.0 minutes). Median total face-to-face time was similar for no interpretation (50.0 minutes), remote interpretation (50.0 minutes), and consecutive interpretation (51.0 minutes). Simultaneous interpretation had a shorter median total face-to-face time (39.0 minutes).

Figure 2. Total time in clinic and total face time with providers by interpreter type. (A) Total clinic time and (B) Total face-to-face provider time (median and interquartile range) in minutes for patients who were seen by all six core providers of the craniofacial team, comparing English-speaking patients (no interpreter used) and Spanish-speaking patients by interpreter modality: remote, in-person consecutive, and in-person simultaneous.
Patient Survey Findings
Of the 170 total CFC patients seen during the study period, 105 (61.8%) completed patient satisfaction surveys. Missing surveys were due to patients declining participation, or the research coordinator being unable to approach them during their clinic visit. Among respondents, 78 (74.3%) spoke English and did not need interpreter services. Of the 27 Spanish-speaking respondents who needed an interpreter, 14 (51.9%) used remote interpretation, 8 (29.6%) used in-person consecutive interpretation, and 5 (18.5%) received in-person simultaneous interpretation.
Median satisfaction ratings for interpreter services, on a scale of 0 (worst) to 10 (best), were high across modalities: remote (9.5), consecutive (10.0), and simultaneous (10.0). Table 2 shows the percentage of respondents in each interpreter group who selected top-box scores, indicating the highest level of satisfaction during the visit.
Table 2. Proportion of Top-Box Responses to the Child Hospital Consumer Assessment of Healthcare Providers and Systems Survey (Child HCAHPS) and Patient–Doctor Relationship Questionnaire (PDRQ-9) by Interpreter Modality.
*P < .05.
The proportions of top-box responses on the Child HCAHPS and PDRQ-9 were generally high for all study participants and across all three interpreter modalities. No statistically significant differences were observed in Child HCAHPS items between groups. However, statistically significant differences emerged on three items from the PDRQ-9: “My child's doctor is dedicated to help my child” (p = .0425), “My child's doctor and I agree on the nature of my child's medical symptoms” (p = .0425), and “I can talk to my child's doctor” (p = .0049).
Provider and Interpreter Interview Findings
In total, 20 semi-structured interviews were conducted (with 16 clinical providers and 4 Spanish medical interpreters). Clinical providers included pediatricians (n = 4), plastic surgeons (n = 2), a dentist (n = 1), orthodontists (n = 2), audiologists (n = 2), speech pathologists (n = 2), a social worker (n = 1), and medical scribes (n = 2). Five major themes emerged from the thematic analysis: (1) Craniofacial patients have unique interpretation needs due to visit complexity and communication challenges; (2) In-person interpretation supports more effective patient–provider communication compared to remote interpretation; (3) In-person interpretation fosters rapport and connection among providers, interpreters, and patients compared to remote interpretation; (4) In-person consecutive interpretation supports flexibility and adaptability during clinical visits; and (5) In-person simultaneous interpretation creates a seamless clinical flow but at the expense of cognitive strain. Interview themes and related subthemes are summarized in Table 3.
Table 3. Themes and Subthemes from Provider and Interpreter Interviews About Experiences with Different Interpretation Modalities.
Theme 1: Craniofacial patients have unique interpretation needs due to visit complexity and communication challenges
Providers and interpreters emphasized how craniofacial care involved highly specialized communication needs that go beyond standard interpretation scenarios. They highlighted the nuanced and specialized care needs of craniofacial patients, with particular attention to differing abilities in language and auditory processing. Effective communication often relied on visual cues such as facial expressions, lip-reading, and hand gestures. These cues were often perceived as limited or distorted in remote formats, even with audio and visual capabilities. Many interviewees shared that in-person interpretation better accommodated non-verbal cues for a smoother and more conducive clinical experience. Additionally, they perceived that in-person simultaneous interpretation with headsets helped patients with hearing impairments by enabling volume adjustment. One provider, an audiologist, summarized the significance of these factors for craniofacial patients:
In person, we can have them wear a device that amplifies the interpreter's voice. Whereas we can kind of do that with [remote interpreting], but it's just not as great. With an in-person [interpreter], [the patient] can read lips, which anyone with hearing loss relies on, and it's … not delayed like sometimes it is with the [remote] interpreting.
For just general medicine, there is less diversity in the medical terminology that's used, as opposed to this [craniofacial] clinic, where [the interpreters] have to know terminology on head and neck surgery … plastic surgery … craniofacial surgery, [and] general medical language. They have to know about dentistry and orthodontics. So, it's so many different areas that they need to specialize in.
I’ll ask them [the in-person interpreters] … “Are they [the patient] using any past tense, are they omitting articles, or are they using any prepositions? I heard [the patients] say ‘on’ … would they [normally] say ‘under’? Is this typically what you would say or does this seem like something that is typically said by children?” So [the interpreters] kind of help me if I’m not sure about something, from a grammatical, expressive, vocabulary aspect.
Theme 2: In-person interpretation supports more effective patient–provider communication compared to remote interpretation
Many providers and interpreters expressed that remote interpretation detracted from the overall care experience for patients. A key concern was the background noise in the patient's room, which made it difficult for patients and remote interpreters to hear each other. Providers attempted to compensate for these external distractors by adjusting their language and the depth of their explanations. An audiologist shared their differing approach when using remote versus in-person interpretation during patient visits:
I might try to simplify my language or keep my sentences a little bit shorter versus when someone [interpreter] is remote. I feel like their [in-person interpreter's] ability to interpret longer, more complex things is a little bit easier, because they can hear me a little bit better than someone [interpreter] who's trying to hear me over an iPad.
Because it's louder, I feel that like it's more challenging for remote interpretation if they [remote interpreters] are unable to hear with the additional noise. I observe them [caregivers] to be more frantic because they’re trying to … manage their children's behaviors to quiet down, so that the remote iPad interpreter is able to hear them.
[We interpreters] can solve those difficulties because we’re responsible. Whenever [we’re] interpreting in the simultaneous modality, and if there's a difficulty with the audio, we’re the ones responsible of helping the patient or patient family navigate, increasing the volume, or maybe just checking the microphone, right? Or us speaking closer to our microphone. We can solve the issues there.
Theme 3: In-person interpretation fosters rapport and connection among providers, interpreters, and patients compared to remote interpretation
Providers reported that repeated interactions with in-person interpreters strengthened collaboration by fostering a sense of familiarity and interpersonal connection. Being able to recognize one another and address each other by name contributed to stronger rapport between providers and interpreters, ultimately supporting more effective communication within the clinical team. One pediatrician reflected on their relationship with in-person interpreters:
I don’t know them [the iPad interpreters] too well, but all the in-person interpreters do an excellent job interpreting things that we say exactly as we want them to… I think, being in person, too, you feel more of a connection or more team-based and working together.
I love how they’re available from start to finish … the same interpreter is there [for] the plastic surgeon to the audiologist, to the pediatrician. They give that whole comprehensive and holistic experience.
In-person interpreters described themselves as active members of the care team, playing a key role in relationship-building with patients and contributing to an enhanced patient care experience. Beyond interpreting information, in-person interpreters perceived themselves as the first point of contact, greeting patients and initiating conversation by asking how they were doing. As interpreters engaged in repeated sessions with the same patients, they often learned patients’ names, fostering a sense of familiarity that made clinic visits feel more tailored to the individual and their family. An interpreter detailed one particular experience with a patient:
The surgeon seemed like he wasn’t really getting through to the caregiver… Once I got there in person, it seemed like the caregiver established some rapport with me and was more confident that I was giving her side of the story to the surgeon. [The surgeon] felt like he was able to explain what he was trying to do and the decision that they’re trying to make. And I think [the caregiver] was able to reflect and think about the situation in a better way, and they got to connect at a deeper level … in the end, they had a very successful outcome.
Furthermore, patients often appeared hesitant to ask questions or raise concerns when using remote interpretation. Providers and interpreters attributed this to the technical challenges of remote interpretation. Interviewees perceived remote interpreters as unable to assess patient understanding. In-person interpreters provided informal support by troubleshooting technical issues in real time and reading patient understanding. They could sense when patients remained silent because they were confused or had difficulty hearing or understanding remote interpreters. One interpreter recalled that patients who had previously received remote interpretation seemed more open and forthcoming with in-person interpretation:
“With the video [remote interpretation], there's a little bit more of like, ‘Okay, let me just answer, and then we’ll move on.’ I think [remote interpretation] … takes a lot more time and so people probably are conscientious of that, and I can see people are not as talkative or elaborate as much with it.”
Theme 4: In-person consecutive interpretation supports flexibility and adaptability during clinical visits
When discussing the differences between in-person consecutive and simultaneous interpreting, interviewees often highlighted how in-person consecutive interpretation supported more deliberate and conversational communication during clinical encounters. Unlike simultaneous interpretation, where interpreters may skip words to maintain speed, consecutive interpretation allowed interpreters to pause, take notes, and rephrase language using more natural vocabulary. Because providers, patients, and interpreters took turns talking during consecutive interpretation, an interpreter noted how this modality allowed them to take the time to adjust:
“[Interpreters are] more involved in the consecutive mode. I also noticed that we have more time as interpreters to reformulate the phrase in a more natural way. Sometimes when you’re interpreting simultaneously, you’re so fast that you’re so focused on not missing a word that at the end what you said might not be very normal, normally understood.”
“If I’m very familiar with the patient and the providers, I feel like simultaneous is preferable. But if it's a new patient, and I don’t know like what is going to be said, and I haven’t had a chance to look at the history, I think consecutive is better because it gives me a bit of a lag time while the providers are talking to be able to kind of establish context and be able to render things better.”
Interviewees also commented that the back-and-forth structure of consecutive interpretation created space to pause and reflect, contributing to a more conversational flow. However, they perceived time as the primary limitation of the consecutive modality. Because information must be conveyed sequentially, they believed such visits often take longer. One interpreter noted how some providers appeared rushed during consecutive interpretation:
“Providers are so focused on finishing up that very often they’re just on their computer typing notes and taking advantage of the fact that we’re [interpreters] in there. In the time that we’re rendering the message [to patients], they’re [providers] typing their notes from their computer… I feel that they’re a little more rushed when we interpret in the consecutive mode.”
“I find that in a session with a social worker, because there are so many questions back and forth, consecutive works best during those types of sessions.”
Theme 5: In-person simultaneous interpretation creates a seamless clinic flow but adds cognitive strain compared to consecutive interpretation
Providers appreciated how the in-person simultaneous interpretation modality generally facilitated a more seamless flow of communication compared to the consecutive modality. The real-time delivery of simultaneous interpretation minimized pauses, thereby allowing patients and their caregivers to feel more aligned in their conversation with the provider. Simultaneous interpretation also accommodated the distinct communication needs of pediatric patients and their caregivers. Interviewees described how English-proficient pediatric patients could follow the provider's speech directly, while caregivers received the interpreted communication in Spanish via headset. One dentist described how dual-channel delivery enhanced the clinical interaction by ensuring both the patient and caregiver were engaged and informed in real time:
“It's useful for the interpreter who's very quietly interpreting my conversation to the teenager for Mom, and I feel like especially one or two of them especially have a way of doing it, not very loudly, very gently, in the background… And that's nice. I like that. That's positive if it's done well … if I’m speaking to the teenager, it's not interfering with our interaction at all.”
A provider, however, described the overlapping audio as distracting:
“Simultaneous is hard for me [a provider]. It's almost like if I’m on a phone call and I’m hearing feedback, or like a delayed response, delayed auditory feedback that is distracting for me, and I feel like I’m not as concise, and I’m not as exact in my language, because I’m auditorily distracted… It's just these are very small rooms, so they [interpreters] can’t stand far enough away from me to where I’m not hearing them.”
Interpreters, in turn, described the fatigue of sustaining simultaneous interpretation and its effect on word choice:
“[Simultaneous interpretation] does have its deficiencies because it's not a modality that we can sustain for very long. There have been many studies that show that there's a fatigue factor that sits between 20 to 40 minutes, depending on the person's experience and physical state of being… Simultaneous is more challenging than consecutive, just because of that level of fatigue that happens after doing it for so long.”
“With simultaneous, something that we are taught to do if we don’t know the word, or its particular meaning, is to just leave it [the word] in English or in the source language. So, leaving it in the source language sometimes, obviously, can create more question … if I know that it's a critical piece of information, then I’ll go back and clarify. But if it's a name … just for the sake of conversation, I may not go back for that piece of information. If it's something that I feel is the heart of the message, then I will clarify.”
Discussion
This mixed-methods study explored how three interpreter modalities (remote, in-person consecutive, and in-person simultaneous) influenced clinic workflow, patient–provider communication, and patient satisfaction in a multidisciplinary pediatric craniofacial clinic. Although interpreter modality did not significantly affect cycle time or satisfaction scores, the qualitative findings highlight meaningful differences in rapport, communication quality, and relational aspects of care. This apparent discrepancy illustrates the value of a mixed-methods design.
Quantitative measures captured workflow and satisfaction outcomes but were not sensitive to the interpersonal nuances that emerged during interviews. In contrast, qualitative data revealed how interpreter presence, immediacy, and communicative style shaped providers’ and families’ experiences in ways not detectable through standardized metrics. Together, these complementary data underscore that evaluating interpreter services requires attention to both efficiency and the relational dimensions of care, as the latter may meaningfully influence trust, understanding, and patient–provider connection even when overall visit metrics appear comparable across modalities.
Similarly, previous studies have demonstrated the value of a mixed-methods approach in comparing patient and provider experiences with in-person and remote interpretation. 10 In this study, cycle time and face-to-face time did not differ significantly across modalities, despite expectations that remote interpretation would slow visits. Although Spanish-speaking patients often saw more providers, their overall visit lengths were comparable to those of English-speaking families. These findings align with previous research 17 but contrast with other reports that associate remote interpretation with longer visits due to connection delays or repeated clarifications. 17 Incorporating qualitative methods helped clarify why time differences were not observed: providers described adapting their communication, often by simplifying explanations, when using remote interpretation. While such adaptation may maintain workflow efficiency, it could limit the richness of provider-family communication. Taken together, these results illustrate the importance of mixed methods for capturing both efficiency metrics and more subtle aspects of care, including rapport and communication quality, across interpretation modalities.
When examining provider patterns during craniofacial visits, this study found that Spanish-speaking patients saw more providers overall compared to English-speaking patients, particularly with respect to orthodontics. This observation suggests differences that extend beyond language alone and may reflect systemic and clinical factors. Prior research shows that Spanish-speaking patients are more likely to be covered by public insurance 26 such as Medicaid, and that patients with public insurance face more difficulty accessing orthodontic care in the community due to limited provider participation and coverage restrictions. 27 Additionally, community orthodontists may feel less comfortable managing patients with complex craniofacial conditions, further narrowing access for patients who require subspecialty care. 28 As a result, Spanish-speaking patients may rely more on the craniofacial clinic for comprehensive services that might otherwise be fragmented or unavailable. This reliance underscores systemic barriers related to insurance and clinical complexity of multidisciplinary care, in addition to language, reinforcing the need for high-quality, modality-appropriate interpreter services in the CFC.
English-speaking patients, who did not require interpretation, reported significantly higher satisfaction scores than Spanish-speaking patients, regardless of modality. However, no significant differences emerged between the remote and in-person interpreter groups, suggesting that overall satisfaction differences were more closely linked to language than to modality. Still, findings should be interpreted with caution given the small sample sizes in some groups.
Patient satisfaction scores were consistently high across all interpretation modalities, echoing prior research. 10 Some PDRQ-9 items revealed differences in perceived provider dedication, agreement on symptoms, and openness to talk, but these may be driven by differences between the English-speaking and Spanish-speaking groups rather than by interpreter modality. However, interviews did emphasize the value of in-person interpretation in fostering rapport, building trust, and creating more personal and collaborative clinical interactions. Providers noted that patients seemed more open and willing to share during in-person interpretation, similar to previous findings on interpreter-mediated care. 12
A key contribution of this study is the comparison between in-person consecutive and simultaneous interpretation. Findings suggest that consecutive interpretation offered conversational pacing and space to clarify complex terms, while simultaneous interpretation preserved natural dialogue and engagement with caregivers. Prior inpatient pediatric research found that simultaneous interpretation improved both patient and provider satisfaction, 7 informing the decision to pilot this modality in an outpatient clinic with a similarly multidisciplinary team and caregiver-child dyads.
However, simultaneous interpretation posed challenges. Providers found audio overlap distracting, and interpreters reported cognitive strain from sustaining real-time interpretation across specialties without breaks. The rapid pace also limited interpreters’ ability to use natural-sounding language, potentially affecting comprehension. Due to small sample size, standard statistical tests could not be used to compare visit length between in-person consecutive and simultaneous interpretation, though interviews suggested few trade-offs and no clear preference between the two.
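As an illustration of one option for comparing two very small groups, the sketch below applies an exact permutation test to the difference in mean visit length. This is not the analysis used in this study; the function name `exact_permutation_test` and the cycle times are hypothetical and are included only to show the mechanics of the technique.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Enumerates every way of splitting the pooled observations into two
    groups of the original sizes and returns the share of splits whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(mean(a) - mean(b))
        # Small tolerance so floating-point noise does not drop ties.
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# HYPOTHETICAL cycle times in minutes; these are NOT data from the study.
consecutive = [95, 110, 102, 120, 98]
simultaneous = [90, 105, 99, 93]

p_value = exact_permutation_test(consecutive, simultaneous)
print(f"two-sided exact p-value: {p_value:.3f}")
```

Because every regrouping is enumerated, the test makes no distributional assumption. With only a handful of visits per group, however, the number of possible regroupings is small, so the smallest attainable p-value is limited and statistical power remains low.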
Ultimately, this study's findings underscore the importance of tailoring interpreter modality to the communication needs of patients, providers, and specific clinical contexts. While all modalities can support efficient care delivery, interviewees perceived in-person interpretation as enhancing understanding, rapport, and diagnostic accuracy, especially when interpreters were experienced and familiar with craniofacial care. Simultaneous interpretation may streamline clinic flow but should be balanced with interpreter workload and cognitive demands. From a practical standpoint, investment in specialized interpreter training for high-complexity, multidisciplinary settings like craniofacial clinics is essential. Interpreter training can emphasize not only linguistic accuracy but also familiarity with craniofacial terminology and team-based care. Likewise, provider training can improve effective collaboration with interpreters by encouraging clear communication, allowing space for interpretation, and recognizing the cognitive demands of simultaneous interpretation. Staffing models and workflow policies can account for interpreter productivity and workload, including adequate staffing, scheduled breaks, and institutional recognition of the specialized expertise required. Taken together, efforts to ensure modality choice aligns with patient and provider needs can improve productivity and are key to providing equitable, patient-centered care. These implications are consistent with a recent study comparing remote and in-person interpretation experiences for clinicians and patients. 12
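The point about scheduled breaks can be sketched as a simple schedule check. The example below assumes a hypothetical limit on continuous simultaneous interpreting (30 minutes, chosen only because it falls within the 20-to-40-minute fatigue window one interpreter described) and hypothetical encounter durations. The name `plan_breaks` is illustrative and is not part of the study's workflow.

```python
def plan_breaks(encounters, max_continuous_min=30):
    """Insert a "BREAK" marker wherever the next encounter would push the
    running total of continuous interpreting past the limit.

    encounters: list of (provider_label, minutes) tuples in visit order.
    max_continuous_min: HYPOTHETICAL limit, not a value from the study.
    A single encounter longer than the limit is kept whole, not split.
    """
    plan = []
    running = 0
    for label, minutes in encounters:
        if running > 0 and running + minutes > max_continuous_min:
            plan.append("BREAK")
            running = 0
        plan.append(label)
        running += minutes
    return plan


# HYPOTHETICAL visit; durations are illustrative, not study data.
visit = [
    ("plastic surgery", 15),
    ("audiology", 10),
    ("pediatrics", 20),
    ("orthodontics", 10),
]

plan = plan_breaks(visit)
print(plan)
```

A clinic could replace the break marker with a hand-off to a second interpreter or a switch to consecutive mode; the sketch only shows where the running total would cross the assumed limit.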
Limitations
The authors acknowledge several limitations in this study. This was a single-site pilot study within one pediatric craniofacial clinic. The sample was small, especially for visits involving in-person simultaneous interpretation and for the subgroups of patients seen by all six core providers, limiting statistical power. Thus, these findings may not be generalizable to other healthcare settings. Additionally, using a stopwatch to time visits may have introduced observer bias, as the presence of a timer or observer could influence workflow or recording accuracy. All visits were timed using standardized procedures to minimize this effect, but it cannot be ruled out.
Patient perspectives were captured through surveys but not through in-depth qualitative interviews. Due to time and grant funding constraints, interviews with patients or caregivers were not conducted. Instead, the study focused on provider and interpreter perspectives, which are less frequently represented in the literature. While prior studies have extensively documented patient experiences with interpreter modalities, 7,8,15 fewer have examined how interpretation impacts providers and interpreters working in multidisciplinary teams.
Future research should include larger, multisite samples to strengthen generalizability. Randomized assignment to interpreter modalities could offer a more rigorous comparison of communication outcomes. Additionally, incorporating qualitative interviews with patients and families would provide richer insights into how interpretation modality affects their care.
Conclusion
Clinical visits with patients and families with PLOE encompass communication challenges that can impact clinical workflow, patient–provider relationship, and overall quality of care. 1 Interpreter modality plays a critical role in addressing these challenges, particularly in complex, multidisciplinary settings such as team-based craniofacial clinics. While remote interpretation was associated with high patient satisfaction and supported some aspects of clinic efficiency, findings highlighted distinct advantages of in-person interpretation over remote interpretation, especially when delivered by interpreters familiar with craniofacial care. Consecutive and simultaneous modalities each offered distinct benefits and challenges.
Although any of the interpreter modalities can effectively support patient care, optimal communication requires interpreter services that are responsive to the specific needs of the patient, provider, and clinical context. In complex, multidisciplinary environments like pediatric craniofacial clinics, where visits often involve multiple specialties, nuanced communication, and diverse family dynamics, a flexible, on-demand model that offers modality choice based on the demands of the visit may be most effective in promoting equitable, high-quality, patient-centered care.
Supplemental Material
sj-docx-1-cpc-10.1177_10556656251408214, sj-docx-2-cpc-10.1177_10556656251408214, and sj-docx-3-cpc-10.1177_10556656251408214 - Supplemental material for Effect of Language Interpreter Modalities on Patient Satisfaction and Clinical Workflow: A Pilot Study in a Multidisciplinary Pediatric Craniofacial Clinic by Selina Juang, Stella Nguyen, Nada Osman, Keymia Ghodrati, Cristian Reyes, Ioannen Maldonado, Jacquelynn Pino, Marlon Duarte, Kimberly Halley, Irene Hendrickson, Marinda Tu, Jennifer Brazier Peralta, Holly Wilhalme, Amanda Kosack, Jessica Lloyd, Carlos Lerner and Christine Katie Thang in The Cleft Palate Craniofacial Journal
Footnotes
Acknowledgments
The authors would like to thank the patients, families, providers, and interpreters in the multidisciplinary pediatric craniofacial clinic for their participation in this research. A portion of this study was presented at the American Cleft Palate-Craniofacial Association Annual Meeting in May 2025 in Palm Springs, California.
Ethical Considerations
This study (IRB#24-000922) was approved by the UCLA Institutional Review Board. UCLA's Federalwide Assurance with the Department of Health and Human Services is FWA00004642.
Consent to Participate
Informed consent to participate was obtained verbally.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the University of California Los Angeles (UCLA) Department of Pediatrics Justice, Equity, Diversity, and Inclusion (JEDI) Grant Program [no grant number], and the National Center for Advancing Translational Science (NCATS) of the National Institutes of Health (NIH) through the UCLA Clinical and Translational Science Institute (CTSI) [grant number UL1TR001881].
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability
Data are available upon reasonable request from the corresponding author.
Supplemental Material
Supplemental material for this article is available online.
References