Abstract
Introduction
With advancements in information and communication technologies, telemedicine has opened an opportunity to revolutionize the delivery of quality health care with regard to the diagnosis and monitoring of patients, along with providing an expert second opinion in complicated clinical scenarios.1,2 The development of telemedicine took significant leaps during the COVID-19 pandemic, and its scope has grown considerably across the world over the years.3,4 Although telemedicine has not been established as a replacement for face-to-face consultation with a physician, it has certainly added a new avenue to patient care. 4
Telemedicine has been available in developed countries for many years, with many large-scale research programs deployed to understand the full scope of its application. 1 However, studies from the developing world are quite limited. Telemedicine has huge potential to deliver high-quality, affordable healthcare to underserved populations in remote parts of the developing world, where limitations in both infrastructure and manpower leave specialist care almost nonexistent, making comprehensive management of not only complex diseases but also common conditions quite challenging.1,5
Remote consultation in the field of otorhinolaryngology is also rapidly expanding. 5 In the field of otorhinolaryngology-head and neck surgery (ORL-HNS), telemedicine usually refers to video consultations or teleconsultations. 5 However, there is a lack of substantial research on its utilization in ORL-HNS in Nepal and other developing countries. For the implementation of a telemedicine program at a national level, it is necessary to understand its effectiveness and limitations in specific clinical situations. 2 This is what we aim to achieve with our pilot study. The primary goal of our model is to streamline early diagnosis, so that common conditions receive appropriate management through consultation with a specialist and complex diseases requiring direct care by an ORL-HNS specialist, such as head and neck cancers, are referred in a timely manner.
Moreover, while telemedicine is essential for countries like Nepal to extend specialized ORL-HNS services to remote areas, it is often challenging for specialists to consistently offer live services because of their regular commitments and the need to align with patients’ availability. Our model has therefore been designed with this challenge from the service provider’s perspective in mind. It adopts an “asynchronous” mode of service delivery, allowing specialists to provide a diagnosis and the most suitable line of management at their convenience after reviewing the data submitted online.
This study thus aims to assess the efficacy of our tailored telemedicine clinic model, operating in an “asynchronous” mode, in diagnosing various ORL-HNS diseases across all age groups.
Materials and Methods
This cross-sectional study was conducted at the Department of Ear, Nose, Throat, Head and Neck Surgery (ENT-HNS), Tribhuvan University Teaching Hospital, Maharajgunj Medical Campus in Kathmandu, Nepal, from January to mid-February 2024, a period of 45 days. A telemedicine clinic model was devised, utilizing the following instruments (Figure 1):
Pen endoscope camera (Teslong®, Shenzhen, China)
Smartphone
Headband with camera adapter
Mimi Hearing App (v. 5.0)
Examining instruments (nasal speculum, tongue depressor, tuning fork of 512 Hz)
Electronic medical record (EMR) form (utilizing Google Forms)

Figure 1. Asynchronous telemedicine clinic model. (a) Otoscopic examination with pen endoscope. (b) Nose examination with camera mounted over a headband. (c) Throat examination with camera mounted over a headband. (d) Hearing examination with Mimi Hearing App. (e) User interface of EMR form at the examiner’s end. (f) User interface of Google Sheet at the end of the researcher making the diagnosis remotely. EMR, electronic medical record.
History taking and examination of patients who had already been diagnosed by consultants in the ward were conducted by 3 undergraduate students. The examining students were blinded to the diagnosis and management plan until history taking and examination were complete. The students were chosen because of their limited exposure to ORL-HNS, reflecting the experience, or lack thereof, of nonspecialized doctors who serve in remote areas. The students documented their findings and captured relevant images of the ear, nose, and throat using a portable endoscope camera and a smartphone mounted on a headband. The portable pen endoscope can be used in the same way as a traditional otoscope for examination of the ears, while inserting it into the nose helps visualize the anterior half of the nasal cavity. Hearing assessments were performed using Mimi (v. 5.0), a mobile-based app. All findings were recorded in an online EMR system. The EMR form was designed to encompass all aspects of history taking and examination, aiming to prevent the omission of crucial information and mitigate the risk of selective examination leading to misdiagnosis.
Following initial examination, the diagnosis and management plans formulated by the examining ORL-HNS consultants were also documented in a separate EMR form. Subsequently, the investigator, working remotely and blinded to the initial diagnosis and management plan, assessed the patient’s history, examination details, and pertinent images within the online EMR. This investigator independently formulated his own diagnosis and management plan based solely on the data provided by the examining students in the EMR system.
Interobserver reliability between the diagnoses made through telemedicine analysis and those made at in-person examinations by the consultants was assessed with Cohen’s kappa using SPSS version 26. The association between the results of the Mimi Hearing App (v. 5.0) and pure tone audiometry (PTA) was assessed using the chi-square test.
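As a rough illustration of the agreement statistic named above, Cohen's kappa compares the observed agreement between two raters (p_o) with the agreement expected by chance (p_e), as kappa = (p_o - p_e) / (1 - p_e). The sketch below is not the SPSS procedure used in the study; the function name `cohen_kappa` and the example labels are hypothetical and serve only to show the calculation.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so 1.0 is returned here as a convention (an assumption of this sketch).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for four cases, not study data.
in_person = ["COM", "COM", "DNS", "DNS"]
remote = ["COM", "DNS", "DNS", "DNS"]
kappa = cohen_kappa(in_person, remote)
print(round(kappa, 3))
```

On the widely used Landis and Koch scale, kappa values of 0.61 to 0.80 are read as substantial agreement and values above 0.80 as almost perfect agreement.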
Results
Ninety-four patients were included in this study over the pilot study duration, of whom 48 were female and 46 male. The patients were between the ages of 2 and 63 years, with a median age of 27 years. Figure 2 shows the geographical distribution of the patients. We had patients from all the provinces of Nepal. The majority of the patients were from Bagmati Province (n = 50), where the study institute is located. This was followed by Lumbini (n = 11) and Madhesh Province (n = 10). Koshi, Gandaki, and Karnali contributed 7 patients each, and the remaining 2 patients were from Sudurpaschim Province.

Figure 2. Heat map showing cases from various geographical regions included in the study.
Table 1 shows how often the diagnoses made by the specialist through our telemedicine model agreed with the diagnoses made by specialists at the center through direct interaction with patients. The degree of agreement between the diagnoses derived through our model and those made by specialists in the hospital for the same patients was at least substantial for all the disciplines within otorhinolaryngology. Correct diagnosis was possible in most rhinology, otology, and pediatric cases, while head and neck pathologies proved the most challenging. Still, the final diagnosis was within the differentials generated via teleconsultation in most cases, as shown in the table. In cases where differential diagnoses were considered, if the diagnosis made by consultants fell within those differentials and was deemed to be made based on investigations proposed by the investigator, the diagnosis was also considered to be correct.
Table 1. Categories of Diagnosis Made Through Telemedicine Across Different Subspecialties and the Degree of Agreement.
Abbreviation: ENT, ear, nose, and throat.
In Figure 3, we present the spectrum of cases evaluated in our study, spanning all disciplines of otorhinolaryngology. Chronic otitis media (COM) emerged as the predominant diagnosis across all age groups, particularly prevalent in the pediatric population. Various head and neck swellings followed closely in frequency. Rhinology cases commonly included symptomatic deviated nasal septum (DNS) and chronic rhinosinusitis (CRS) with polyposis. Additionally, Figure 3 provides a comprehensive depiction of other diagnosed pathologies beyond these prevalent conditions.

Figure 3. Different clinical cases included in the study. COM, chronic otitis media; SSNHL, sudden sensorineural hearing loss; EAC, external auditory canal; H&N, head and neck; OSAS, obstructive sleep apnea syndrome; DSNI, deep space neck infection; DNS, deviated nasal septum; JNA, juvenile nasal angiofibroma; CRS, chronic rhinosinusitis; AC, antrochoanal; NK cell, natural killer cell; CSF, cerebrospinal fluid.
Table 2 outlines the challenges encountered by the primary researcher during the diagnostic evaluation of cases consulted through the telemedicine portal. The lack of tactile sensation to delineate lesions and characterize swellings posed a significant obstacle (Figure 4). Consequently, 4 cases of head and neck swellings were inadequately diagnosed. Additionally, in 6 cases of COM, further classification into tubotympanic or atticoantral type was hindered because ear discharge or wax obscured the relevant anatomical areas and restricted examination (Figure 5). Suboptimal endoscopic imaging, attributed to inaccessibility of relevant areas, resulted in poor visualization in 5 cases, including juvenile nasal angiofibroma, nasofacial natural killer cell lymphoma, DNS, and epistaxis. In one instance, insufficient history and clinical examination data posed challenges in diagnosing a case of papillary carcinoma of the thyroid.
Table 2. Difficulties Encountered by the Researcher While Making Diagnosis Based on Telemedicine Data.
Abbreviations: DNS, deviated nasal septum; JNA, juvenile nasal angiofibroma; NK, natural killer.

Figure 4. Originally a case of papillary carcinoma of right thyroid lobe with no obvious anomaly on photographic documentation.

Figure 5. Originally a case of COM squamous with wax in the attic region obscuring the pathology.
A total of 27 cases underwent hearing assessment with the Mimi hearing app (v. 5.0). Table 3 illustrates the association between the PTA findings of these patients and their hearing outcomes obtained from the Mimi hearing app (v. 5.0). No statistically significant association was observed (P > .05). Nonetheless, although the application did not accurately quantify the degree of hearing loss, it did not produce any false negative results; that is, it never indicated good hearing in a patient with any degree of hearing loss.
Table 3. Association Between Hearing Test Results by PTA and Mimi Hearing Application (v. 5.0).
Abbreviation: PTA, pure tone audiometry.
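The chi-square test of association referred to above compares the observed cell counts of a contingency table with the counts expected if the two classifications were independent. A minimal sketch follows, assuming a table with PTA categories as rows and app categories as columns and no empty row or column. The counts are hypothetical placeholders, not the values of Table 3, and the function name is illustrative.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2 x 2 counts (rows: PTA category, columns: app category).
observed_counts = [[10, 5], [5, 10]]
statistic, dof = chi_square_statistic(observed_counts)
print(round(statistic, 3), dof)
```

For a table with one degree of freedom, the statistic has to exceed roughly 3.84 for the association to reach P < .05.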
Discussion
According to the annual report of 2023 provided by the Nepal Medical Council, there are 328 registered ORL-HNS consultants in Nepal. 6 This provides us with an estimate of 1 specialist per 88,000 people in Nepal. Nepal is rapidly urbanizing yet still largely agrarian, and large segments of the population remain sparsely distributed across unforgiving terrain. This, along with the lack of universal healthcare and of health awareness among underserved communities, has led to a large number of patients presenting to tertiary centers with late-stage complications of disease. The principal rationale of our study is to estimate the diagnostic concordance of our method, so as to gauge whether this telemedicine approach could be used on a large scale to facilitate better management via specialist consultation in remote areas and early referral of complex cases to centers where ORL-HNS consultants are available.
The asynchronous telemedicine portal uses a store-and-share approach to relay the pen endoscope camera findings to the consultant, who then formulates a diagnosis and a management plan. This allows flexibility for patients as well as the consultant. Unlike in live models, however, the consultant does not have the patient available, even through a screen, and so cannot adjust the camera settings, the area exposed, or the course of the examination, or elicit the history that their expertise deems pertinent. The live model is cumbersome in terms of finance as well as human resources in the context of Nepal, and it relies heavily on multiple variables coinciding at one time. The asynchronous model is cheaper, in this case relying on only a pen endoscope and a smartphone with internet connectivity, and is also more suitable for remote areas, since a few co-investigators can cover a larger geographical area at different times.
The asynchronous model used in this study consisted of nonspecialized healthcare workers taking the history, conducting examinations, taking images with the pen endoscope, and conducting a hearing evaluation with an Android application through an earphone. The pen endoscope cost 300 USD, the camera adapter 10 USD, the earphone 10 USD, and the smartphone roughly 300 USD. The Mimi Hearing app (v. 5.0) was free of cost. The portal used for communication with the consultant was Google Forms, which was also free of cost. A dedicated online EMR system would have been ideal and could serve multiple specialties at once, but such systems are costly and require computers and a proper grasp of the record system, which is impractical for use in the field in remote Nepal.
Previous studies have demonstrated that video-otoscopy can produce images suitable for formulating a working diagnosis. Most of these studies share one of the following features: they were real-time, based on video images, limited to otology, or limited to adults or children only.7-12 There have not been many studies on the use of asynchronous smartphone-based telemedicine models for ORL-HNS diseases.
The overall diagnostic agreement in our study was substantial to almost perfect, with kappa values of .843, .784, .737, and .764 for the rhinology, head and neck, otology, and pediatric cases, respectively. The highest degree of agreement, almost perfect, was in the rhinology cases, where 18 of 20 cases were correctly diagnosed, 1 case was within the differentials, and 1 was incorrectly diagnosed. The majority of these cases were symptomatic DNS and CRS. The diagnosis in rhinology cases was based not only on the images but also, to a large extent, on the history and clinical examination of the patient as laid out in the pro forma, which contributed to a higher rate of concordance in the diagnoses. The incorrect diagnosis was partly attributable to the inability to visualize the posterior nose with the pen endoscope used in the study. Yulzhari et al conducted a real-time mobile technology-based otolaryngology care study and reported a diagnostic concordance of 81.3% among nose- and sinus-related cases. 7 This suggests that the method can be deployed effectively for diagnosing rhinology cases.
The head and neck cases proved challenging: 7 of 29 cases were correctly diagnosed, and in a further 18 cases the final diagnosis was within the differentials of the remote otorhinolaryngologist, with the management plan showing substantial agreement. Four cases were incorrectly diagnosed. In all of these cases, the remote consultant’s feedback reported issues including the inability to palpate the lesion to delineate it. The history and examination recorded for telemedicine evaluation lacked targeted details pertinent to the diagnosis, and still images of a neck lesion would always be inadequate to identify whether the swelling originated from superficial connective tissue, a deep neck space, a lymph node, or the thyroid gland, among others. Yulzhari et al also reported a lower rate of concordance (60%) in neck/pharynx/larynx cases, citing similar limitations. 7 Kwok et al reported a diagnostic concordance of 81.9% with telephone consultations and 96.9% concordance in recommended treatment plans. 8 Our findings highlight the importance of direct physician interaction, as the differences in diagnostic concordance are stark in head and neck pathologies. We believe that incorporating ultrasonography in this model could ensure timely referral of cases that have no obvious findings on examination. However, that would require the presence of trained personnel and is constrained by cost and human resources in the current scenario of Nepal. Still, the management approach outlined through the consultation for these pathologies can prove extremely helpful for healthcare workers as well as patients in remote areas of the country.
The diagnostic concordance for otology cases was substantial, with a kappa value of .737: 20 of 23 cases were correctly diagnosed, a further 2 were within the differentials, and only 1 was incorrectly diagnosed. The bulk of the cases were COM and otosclerosis. In the 2 cases requiring the otorhinolaryngologist to formulate differential diagnoses, visualization of the attic region was inadequate because of poor recording and uncleared obstructive wax. For the incorrectly diagnosed case, the submitted image was reported to be of poor resolution. Biagio et al reported that video-otoscopically acquired tympanic membrane details, including surface, texture, color, and position, showed moderate to substantial agreement, with similar results in diagnosis. 9 Eikelboom et al demonstrated statistically significant agreement (P < .001) in all cases except cholesteatoma, with image quality being at least good in most cases, in a tele-otoscopy study among pediatric patients. 10 Biagio et al reported diagnostic accuracy with k = .68 to .75 in an asynchronous otology video-based telemedicine model among pediatric patients. 11 The diagnostic accuracy of our model is already quite good for otology cases, and improvements in both the technology and training in handling the device can enhance it further.
Children pose a unique set of challenges in the diagnosis and management of ear, nose, and throat (ENT) pathologies. Of the 22 pediatric cases, 17 were correctly diagnosed, 2 were within the differentials, 2 were incorrectly diagnosed, and in 1 case no diagnosis could be made. Several studies have examined telemedicine in otorhinolaryngology among children. Smith et al demonstrated that the diagnosis was identical in 81% of cases and the clinical management advised was identical in 76% of cases. 12 That study was asynchronous and used video recordings of pediatric patients. The image quality of the recordings was judged to be good or very good, that is, adequate for diagnosis, in 75%, 79%, and 74% of ear, nose, and throat cases, respectively, showing significant accuracy. A real-time telemedicine model by Smith et al assessing pediatric cases in 19 outreach clinics reported 99% agreement in initial videoconference diagnosis and 93% agreement in management strategy. 13
The study also included a hearing evaluation with the Mimi hearing app (v. 5.0) in patients selected on the basis of their presumed ability to understand the application, judged from their age, knowledge level including grasp of the English language, smartphone usage, and willingness to participate in the study. The app was chosen because it is compatible with both Android and iOS and is free of cost, considering the targeted rural demographic. The study did not demonstrate a statistically significant association between the standard PTA findings and the application-based assessment. Nevertheless, the study also did not demonstrate any false negative results, and the application could therefore be considered as a screening tool in rural areas lacking formal audiometric assessment.
There were several limitations to the study. The application does not specify a standardized or tested audio delivery device for best results. The volunteers in the study did not speak English as a primary language, and they were not excluded if they had ongoing otologic disease or occluded ears. The reference PTA was not performed in the same sitting; on average, a day or 2 separated the formal audiometric assessment from the application-based assessment, which could have produced significant deviation from the earlier finding in a patient with active otologic disease. The Mimi hearing application’s design, which involved pressing continuously on a button until a sound was heard, also proved tedious for several patients. The test was done with foam inserts in the ear, in an examination room adjacent to the general ward, in order to mimic the decibel levels of a home environment, which is the primary intended setting for this application. Our findings were inconsistent with those of a reference study 14 done at another center with the same application but in adult patients without active otologic disease or occluded ears, and with normal hearing or sensorineural hearing loss. That study demonstrated no significant difference between average hearing levels obtained through the Mimi hearing app (v. 5.0) and a standard audiometric booth test, and also reported user satisfaction with the application along with a shorter completion time than standard audiometric evaluation. We also did not have specific recommended earbuds for the Mimi app and used generic earbuds, which could have affected the validity of our outcomes. Further, intranasal imaging was carried out by inserting the pen endoscope camera into the nose, which covered only the anterior half of the nasal cavity and precluded examination of its posterior part; this was an important limitation of our study.
This was a pilot study with voluntary enrollment, and no data were collected regarding those not approached, resulting in a constrained list of differentials covered in the study. Patients under 2 years of age were not included, and hence the study is not entirely representative of the pediatric population. The study followed a policy of 1 visit per case, and the cases were not followed up to monitor for possible complications or improvement, so there is no measure of follow-up. Unlike in many telemedicine-related studies in otorhinolaryngology, only 1 endoscopic image was taken per site of interest, and its acceptability was judged by the co-investigators and not the ENT consultant, mirroring resource-limited, wide-range usage in the actual field in the setting of Nepal. As a result, a few images were rated by the consultant as unacceptable for formulating a plan or making a diagnosis. Another limiting factor was the lack of depth perception when single images are viewed, in contrast to in-person consultations. The study was done using a rigid endoscope, resulting in an inability to obtain relevant images in cases of lesions of the posterior nose, pharynx, and larynx. As a result, cases ideally requiring flexible endoscopic evaluation, especially voice disorders and laryngeal masses, were also not included in the study. The study was done among in-patients and thus did not include many of the patients who visit the OPD daily with no visibly ascertainable findings on clinical examination. Thus, the consultant was not clinically challenged to rule out the presence of any disease in most of the cases.
The use of telemedicine, which entails remote data collection and transfer, raises concerns regarding the privacy of data and challenges regarding the security of stored data. There has to be proper encryption, with secure access codes so that only the required personnel can access the required information. Identifiable information should also be properly coded so as to minimize the risk of a breach of confidentiality. The system also has to be regularly updated and maintained to avoid any breach. The quality of care provided during telemedicine consultations should also mirror that of in-person consultations, with standardization and laid-out protocols so that gaps can be easily identified and quality improved during the process. Healthcare personnel need to be properly trained keeping all these aspects in mind.
Telemedicine in otolaryngology can be expected to reduce cost and human resource use in the resource-limited setting of Nepal, with the model allowing remote but effective communication between patients and the expert doctor for clarifying doubts, making diagnoses, recognizing diseases early, and preventing or decreasing their complications. Kokesh et al reported that 73% of ORL-HNS telemedicine consultations prevented the need to travel to tertiary centers, along with easier planning of surgery where needed and decreased wait times for in-person consultations, after a decade of use of a comprehensive telemedicine model in Alaska. 15 Smith et al reported a reduction in travel need, an increased number of procedures at outreach clinics, and wide coverage of assessment of children after 3 years of implementation of a telemedicine model in ORL-HNS practice. 16 The model can also be used to follow up a patient managed by the consultant in a tertiary center so as to monitor improvement or deterioration in the condition. The model is likely to be welcomed by remote nonspecialized healthcare providers, as it eases their burden and lets them contribute to local health by acting as mediators.
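To make the equipment outlay concrete, the unit costs quoted above can be summed. This is simple arithmetic on the figures in the text; items that the text does not price (the headband and the examining instruments) are left out, and the smartphone figure is approximate.

```python
# Unit costs in USD as reported in the text; the smartphone figure is approximate.
equipment_costs_usd = {
    "pen endoscope": 300,
    "camera adapter": 10,
    "earphone": 10,
    "smartphone": 300,
    "Mimi Hearing App": 0,
    "Google Forms": 0,
}

total_usd = sum(equipment_costs_usd.values())
print(f"Approximate cost per kit: {total_usd} USD")
```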
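The case counts reported for the four subspecialties can be turned into simple proportions for side-by-side comparison. The sketch below uses only the counts stated in the text. The percentages are derived here for illustration and are not figures reported by the study, and the dictionary layout is an assumption of this sketch.

```python
# Counts per subspecialty as stated in the text:
# (correct, within differentials, incorrect, no diagnosis possible)
outcomes = {
    "rhinology": (18, 1, 1, 0),
    "head and neck": (7, 18, 4, 0),
    "otology": (20, 2, 1, 0),
    "pediatric": (17, 2, 2, 1),
}

summary = {}
for name, (correct, within, incorrect, none_made) in outcomes.items():
    cases = correct + within + incorrect + none_made
    summary[name] = {
        "cases": cases,
        "correct_pct": round(100 * correct / cases, 1),
        "correct_or_within_pct": round(100 * (correct + within) / cases, 1),
    }

total_cases = sum(entry["cases"] for entry in summary.values())
for name, entry in summary.items():
    print(name, entry)
print("total cases:", total_cases)
```

The per-subspecialty counts add up to the 94 patients enrolled, which serves as a quick consistency check on the figures quoted in the text.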
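The screening argument above rests on sensitivity: a test with no false negatives flags every patient who has hearing loss on the reference test, whatever its specificity. A minimal sketch of the two measures follows. The counts are hypothetical and chosen only to show the calculation, since the text reports the absence of false negatives but not the full cross-tabulation in this form.

```python
def sensitivity(true_positives, false_negatives):
    """Share of patients with hearing loss on PTA whom the screening test flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of patients with normal hearing on PTA whom the screening test clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only, not study data.
tp, fn, tn, fp = 20, 0, 4, 3
print(sensitivity(tp, fn), round(specificity(tn, fp), 2))
```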
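One way the coding of identifiable information could be approached is keyed pseudonymization, in which each patient identifier is replaced by a code that cannot be reversed without a secret key. The sketch below is an assumption about how such coding might look and not a description of the system used in the study. The key, the identifier format, and the function name are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a keyed, non-reversible code for a patient identifier."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and identifier; a real key would be generated randomly
# and stored separately from the shared form data.
key = b"example-key-kept-off-the-shared-sheet"
code = pseudonymize("PATIENT-0001", key)
print(code[:12])
```

The same identifier always maps to the same code, so records for one patient can still be linked, while anyone without the key cannot recover the identifier from the code.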
Conclusion
Our pilot study is a stepping stone toward wider application of telemedicine in the field of ORL-HNS, especially given the promising results we obtained. The model has numerous limitations, which can be addressed in forthcoming studies. The application of our proposed model of asynchronous telemedicine can help ensure quality medical care in the remote areas of the developing world, which have been deprived of specialist care for far too long.
Footnotes
Acknowledgements
We would like to thank Manish Khanal and Biraj Wagley for their technical assistance.
Author Contributions
B.G. conceptualized this study and created the initial format for its conduct. Ob, S.G., and A.K. conducted data collection and analysis under the supervision of B.G. B.G., Ob, S.G., and A.K. were involved in manuscript writing. U.R., P.T., and K.A. were involved in editing the manuscript and bringing it to its final form. All authors approve of the final manuscript submitted for publication.
Availability of Data and Materials
The datasets used and/or analyzed in this study are available from the corresponding author on reasonable request.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Consent
Written informed consent was obtained from all the participants for their involvement in the study and the publication of this study.
Ethical Approval
Ethical approval for this study was obtained from the Institutional Review Committee (IRC) of the Institute of Medicine (IOM), Tribhuvan University. The IRB approval number for our study is 299 (6-11) E2. The identity of all patients and all sensitive information are kept confidential.
