Abstract
Importance
Efficient and accurate tools for early detection of hearing loss are essential for reducing delays in diagnosis and treatment.
Objective
To determine the accuracy and reliability of Automated iPad Hearing Screening (AIHS) as a screening tool compared to a formal pure tone audiometry (PTA).
Design
A parallel cross-sectional study.
Setting
Tertiary referral center in Hong Kong.
Participants
Seventy-nine adult patients (158 ears) aged from 28 to 87 who were diagnosed with hearing loss were included.
Exposure or Intervention
Participants underwent the AIHS screening at 1, 2, 4, and 0.5 kHz in the right and left ears, respectively, prior to a formal PTA, focusing solely on air conduction thresholds.
Main Outcome Measures
Sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) for detecting hearing loss at 3 thresholds: >25, >40, and >60 dBHL. Intraclass correlation coefficient (ICC) and Bland–Altman analysis were used to assess agreement between AIHS and PTA.
Results
Excellent sensitivity and specificity of the AIHS were identified across 3 age groups and different hearing levels. The AUCs of AIHS were .917 (95% CI: .842-.993), .911 (.863-.960), and .968 (.942-.994) for thresholds over 25, 40, and 60 dBHL, respectively. ICC = .901 (.864-.927) and Bland–Altman analysis indicated good agreement between these 2 methods.
Conclusion
The AIHS is a simple, intuitive, and portable screening test for hearing loss that can be repeated with high accuracy and reliability at relatively low cost.
Key Message
AIHS achieves high accuracy with excellent sensitivity and specificity and good reliability in detecting hearing loss.
It is a simple, portable, and cost-effective screening tool that can be used in primary care without user training.
The study highlights AIHS as a practical alternative to formal audiograms in clinical settings.
Introduction
Hearing loss is a global health problem with increasing prevalence. 1 Hearing loss is known to have a significant psychosocial impact. It is associated with impaired speech development in children 2 and cognitive decline and depression in the elderly. 3 Early diagnosis and rehabilitation are crucial in the treatment of hearing loss in older adults and in reducing its negative impact on communication, quality of life, and overall well-being. 4 A good screening tool should be simple, safe, inexpensive, widely accessible, and should not place a significant burden on the healthcare system. 1 It must demonstrate high diagnostic performance, including both sensitivity and specificity. Acceptable screening tools for adult hearing loss should achieve at least 90% sensitivity and 80% specificity, 5 though higher values are desirable, especially in clinical settings where false negatives may delay intervention.
Pure tone audiometry (PTA) is the diagnostic gold standard for hearing loss. It is traditionally performed manually by an audiologist or trained personnel in a special soundproof environment. 6 Despite being the gold standard, PTA has limitations, such as immobility, operator bias, the need for operator training, and the need for a soundproofed environment. 7,8
Long waiting times from diagnosis to prescription are a major problem in our locality. In the public sector in Hong Kong, a patient with hearing loss typically goes through the following sequence from diagnosis to treatment. The patient first presents to a general practitioner complaining of hearing loss and receives a referral letter to an ear, nose, and throat (ENT) specialist outpatient clinic in their geographical area. For patients with suspected presbycusis, the waiting time for an ENT specialist appointment is typically one and a half years. Subsequent referral to an audiologist for the prescription of hearing aids may require a further wait of 1 year. A total waiting period of approximately 2.5 years from presentation to hearing aid prescription is therefore not uncommon, owing to the lack of manpower and resources. The history of hearing loss given in the referral letter is very important, as severity, time of onset, and associated symptoms play an important role in the allocation of clinic appointments.
In recent years, rapid advances in consumer technology have opened up new possibilities for healthcare applications.9,10 One such innovation is the use of automated iPad audiometry for hearing screening, which provides a convenient and potentially cost-effective alternative to traditional methods. This technology makes it possible to perform hearing screenings outside of specialized clinical settings, making it particularly promising for community-based screening programs and pre-specialist diagnosis. 11
Automated iPad Hearing Screening (AIHS) has been developed in the Department of Otorhinolaryngology, Head, and Neck Surgery, the Chinese University of Hong Kong. The present study aims to evaluate the accuracy and reliability of AIHS for hearing screening in a clinical setting compared to the gold standard pure tone audiometry in a sound booth. By comparing the results obtained with the iPad audiometry system to those of the traditional approach, we will evaluate the feasibility and potential benefits of implementing this technology into broader screening programs. Evaluating the accuracy of hearing screening in a clinical setting will provide valuable insight into the practicality and effectiveness of implementing this approach outside of a controlled environment. The importance of this research lies in its potential to improve the accessibility, cost-effectiveness, and efficiency of hearing screening in healthcare. If proven accurate and reliable, the AIHS could revolutionize hearing screening programs by increasing their reach, reducing the need for extensive infrastructure, and shortening wait times. It also enables early detection of hearing loss in community settings and remote or underserved areas.
Methods
The AIHS Development and Application
Each AIHS device was connected and calibrated with a designated pair of Bose QuietComfort 35 Series II circumaural noise-canceling headphones. AIHS was conducted using a tablet-based system that directly reports results in dBHL, consistent with conventional clinical audiometry. The output levels of the system were pre-calibrated by the software developer in accordance with IEC 60645-1:2017 and ANSI S3.6-2018 standards, using established Reference Equivalent Threshold Sound Pressure Levels for the Bose headphone model employed. Therefore, no manual conversion between dB SPL and dBHL was required during analysis. Output levels were verified using a Type 1 sound level meter and an IEC 60318-1 artificial ear prior to data collection, ensuring conformity with international calibration standards. Automated Hughson-Westlake threshold determination procedures were adopted in the AIHS system. Auditory thresholds at 1, 2, 4, and 0.5 kHz, in the right and left ears, respectively, were determined with the automated bracketing technique of descending and ascending stimulus tones according to the “method of limits” principle. The AIHS always started with a 40 dBHL tone at 1 kHz in the right ear, followed by 2, 4, and 0.5 kHz, and then the same frequencies in the left ear.
As the AIHS system was designed for initial screening and for efficient use of referral and clinical resources, only air conduction thresholds were recorded. The lower limit of the AIHS was set to 5 dBHL and the upper limit to 65 dBHL. No visual cue was shown on the iPad screen to indicate whether any stimulus tone was being presented.
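The bracketing logic described above can be illustrated with a minimal sketch. The start level (40 dBHL) and the 5 to 65 dBHL range come from the text; the step rule (down 10 dB after a response, up 5 dB after no response) and the stopping rule (two responses at the same level on ascending runs) are assumptions borrowed from the conventional Hughson-Westlake procedure, not details confirmed for the AIHS software. The `hears` callback and the function name are hypothetical.

```python
from collections import defaultdict

# Range and start level as stated in the text (dBHL).
MIN_LEVEL, MAX_LEVEL, START_LEVEL = 5, 65, 40


def find_threshold(hears, max_trials=60):
    """Estimate one threshold in dBHL; None means no response at maximum level.

    hears(level) is a hypothetical callback that presents a tone at `level`
    and returns True if the participant responded.
    Assumed rule: down 10 dB after a response, up 5 dB after no response,
    stop at the first level with two responses on ascending runs.
    """
    level = START_LEVEL
    ascending = False                  # True if `level` was reached by stepping up
    ascending_hits = defaultdict(int)  # responses per level on ascending runs

    for _ in range(max_trials):
        if hears(level):
            if ascending:
                ascending_hits[level] += 1
                if ascending_hits[level] >= 2:
                    return level
            if level == MIN_LEVEL:
                return MIN_LEVEL       # floor of the test range
            level = max(MIN_LEVEL, level - 10)
            ascending = False
        else:
            if level == MAX_LEVEL:
                return None            # threshold lies above 65 dBHL
            level = min(MAX_LEVEL, level + 5)
            ascending = True
    raise RuntimeError("no stable threshold within max_trials")


# Simulated listener with a true threshold of 30 dBHL (illustrative only).
estimate = find_threshold(lambda level: level >= 30)
```

With a deterministic simulated listener, the sketch converges on the listener's true threshold when it lies within the test range, and returns `None` when the listener does not respond at 65 dBHL.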
Although the Bose QC35 II headphones communicate via Bluetooth, the test tones were preloaded and generated locally on the iPad to avoid streaming-related latency or compression artifacts. Pretesting confirmed that neither active noise cancelation (ANC) nor Bluetooth signal processing introduced measurable distortion or frequency bias.
To ensure the reliability and consistency of results, the following quality assurance procedures were implemented: Daily verification of headphone output using a sound level meter and coupler setup; Maintenance of ambient noise levels <35 dBA in all test settings; Use of standardized examiner training and procedural checklists to ensure consistency across sessions.
The development and maintenance of the AIHS was supported by the Social Innovation and Entrepreneurship Development Fund (SIE Fund) (KPF19HLF09), a Hong Kong SAR government-funded program. The study was approved for human studies by the Institutional Review Board of the Joint Chinese University of Hong Kong New Territories East Cluster Clinical Research Ethics Committee (2023.201). All recruited patients provided written informed consent for clinical interventions as well as publication purposes.
The Bose QuietComfort 35 Series II Headphones
These were wireless, circumaural noise-canceling headphones. They had 3 different noise-canceling modes: “Off,” “Low,” and “High.” Using the theory of destructive interference and built-in microphones, these headphones detected ambient noise and emitted sound waves that reduced the amplitude of the incoming sound waves, thereby achieving the effect of noise cancellation.
Identification of the Subjects
All patients scheduled for a formal PTA examination between February and December 2023 at the ENT outpatient clinics of Prince of Wales Hospital and Alice Ho Miu Ling Nethersole Hospital in Hong Kong, with different causes and degrees of hearing loss, were invited to take the AIHS screening test. We excluded patients whose physical or cognitive disability prevented them from using the AIHS test and those who declined to participate in the study. Patients were informed about the study, and their consent was obtained.
AIHS Procedure
Otoscopy was performed prior to testing in accordance with standard clinical protocol to ensure the external auditory canal was clear. The hearing screening was conducted in a private consultation room within the outpatient specialist clinics, with the door closed to minimize ambient noise. Ambient noise levels were measured and documented in accordance with ISO 8253-1 standards, ensuring they remained below 35 dBA during testing. Minimal assistance was provided by trained research personnel.
Participants were seated at a desk with an iPad placed in front of them. They wore Bose QuietComfort 35 Series II headphones, which are circumaural and equipped with ANC. The ANC was enabled and set to “High” prior to testing to suppress environmental noise. The headphones were regularly validated for functional performance, and proper placement was ensured by marking the left and right sides with blue and red indicators, respectively.
The test was performed via headphones in a single-ear, sequential manner. Each ear was tested independently using air conduction tones while the non-test ear remained unoccluded and unmasked. Although no masking was applied, the maximum output was limited to 65 dBHL, thereby reducing the likelihood of cross-hearing. 12
Visual instructions were presented on the iPad screen. Participants were instructed to tap a “smiley face” icon whenever they heard a tone, regardless of how faint. Pure tones were delivered at intensities ranging from 5 to 65 dBHL, in 5 dB steps, at frequencies of 0.5, 1, 2, and 4 kHz. The tones were presented randomly to each ear, with short pauses between frequencies. Only air conduction thresholds were recorded. If no response was detected at 65 dBHL, the system recorded it as “no response at maximum level,” corresponding to a hearing threshold greater than 65 dBHL.
The Diagnostic Test
Immediately following completion of the screening test, a formal PTA was performed by a trained member of the audiology department in a soundproof booth in the same outpatient specialty clinic. Air conduction thresholds at 0.5, 1, 2, and 4 kHz were documented.
Statistical Analysis
Thresholds were recorded in dBHL for each ear at 0.5, 1, 2, and 4 kHz, and the average of these 4 frequencies was calculated for analysis. Hearing loss was categorized using 3 levels of severity: >25 dBHL (any hearing loss), >40 dBHL (hearing loss requiring clinical attention), and >60 dBHL (moderate-to-severe hearing loss with communication impact). To evaluate the screening performance of AIHS relative to PTA as the reference standard, receiver operating characteristic (ROC) curves were constructed for each threshold, and the area under the curve (AUC) was computed. Diagnostic accuracy metrics, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), were calculated for each threshold. These were derived from 2 × 2 contingency tables comparing AIHS to PTA results. Because each participant contributed data from both ears, intra-subject correlation was evaluated using a generalized linear mixed model with subject ID as a random effect. The random effect variance was estimated to be 1.753 (P = .076), which was not statistically significant; data from the 2 ears were therefore treated as statistically independent in the primary analyses. A P-value less than .05 was considered statistically significant.
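The 4-frequency average and the 2 × 2 accuracy metrics described above can be sketched as follows. This is an illustration only: the function names and the example thresholds are hypothetical and are not study data, and the same cut-off is applied to both tests for simplicity, whereas the study used separate ROC-derived cut-offs for the AIHS.

```python
def pure_tone_average(thresholds_dbhl):
    """Mean of the thresholds at 0.5, 1, 2, and 4 kHz for one ear (dBHL)."""
    return sum(thresholds_dbhl) / len(thresholds_dbhl)


def diagnostic_metrics(screen_positive, reference_positive):
    """Accuracy metrics from paired boolean results (screening vs reference)."""
    tp = fp = fn = tn = 0
    for screen, reference in zip(screen_positive, reference_positive):
        if screen and reference:
            tp += 1
        elif screen and not reference:
            fp += 1
        elif not screen and reference:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined when a row or column of the 2 x 2 table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


# Hypothetical per-ear averages in dBHL (not study data).
aihs_avg = [20.0, 30.0, 45.0, 22.5, 35.0, 50.0, 27.5, 15.0]
pta_avg = [18.75, 32.5, 41.25, 27.5, 30.0, 55.0, 23.75, 12.5]

CUTOFF = 25  # "any hearing loss" level from the text
metrics = diagnostic_metrics(
    [a > CUTOFF for a in aihs_avg],
    [p > CUTOFF for p in pta_avg],
)
```

Treating each ear as one row of the input mirrors the ear-level analysis described in the text.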
Results
Demographics
This study was conducted between February and December 2023. Seventy-nine patients (158 ears) with various degrees of hearing loss were recruited from the ENT outpatient clinics of the Prince of Wales Hospital and Alice Ho Miu Ling Nethersole Hospital in Hong Kong. There were 32 males and 47 females. Ages ranged from 28 to 87 years, with a mean age of 63.5 years. The participants were distributed as follows: 7 young adults (aged 28-40 years), 71 middle-aged adults (aged 41-65 years), and 80 older adults (aged over 65 years). We found that 22.8% of patients had normal hearing (≤25 dB), 27.2% had mild hearing loss (26-40 dB), 29.7% had moderate hearing loss (41-55 dB), 12.7% had moderately severe hearing loss (56-70 dB), 5.1% had severe hearing loss (71-90 dB), and 2.5% had profound hearing loss (>90 dB). Although individual test durations were not systematically recorded, the average duration of the AIHS screening was approximately 10 minutes per participant. All participants successfully completed the test, and no technical failures were reported.
Accuracy Tests
Stratified analysis by age group revealed some variation in AIHS performance (Table 1). Among participants aged 41 to 65 years, both sensitivity (.946, 95% CI: .823-.985) and specificity (.952, 95% CI: .762-.999) were high, with PPV and NPV values exceeding .90. In older adults (>65 years), sensitivity remained high at .927 (95% CI: .849-.968), and PPV was also strong (.962, 95% CI: .888-.987). However, specificity decreased to .625 (95% CI: .306-.863), and NPV dropped to .455 (95% CI: .213-.720), indicating a higher rate of false negatives in this subgroup. In contrast, the youngest age group (28-40 years) showed moderate sensitivity (.667, 95% CI: .208-.939) and specificity (.857, 95% CI: .487-.974), although sample size was limited.
Table 1. Accuracy of AIHS for 3 Different Age Ranges.
Abbreviations: FN, false negative; NPV, negative predictive value; PPV, positive predictive value; TN, true negative; TP, true positive; FP, false positive.
ROC analysis showed excellent discrimination of AIHS across thresholds, with AUC values of .917 (95% CI: .842-.993) for >25 dBHL (Figure 1), .911 (95% CI: .863-.960) for >40 dBHL (Figure 2), and .968 (95% CI: .942-.994) for >60 dBHL (Figure 3). Using the AIHS cut-off values identified from the ROC analysis, the AIHS demonstrated high sensitivity and specificity (Table 2). For detecting hearing loss >25 dBHL (iPad > 26.875 dBHL), the sensitivity was .910 (95% CI: .846-.954) and the specificity was .889 (95% CI: .738-.963), with a PPV of .965 and an NPV of .744. At the >40 dBHL threshold (iPad > 43.125 dBHL), sensitivity remained high at .911 (95% CI: .823-.961) with a specificity of .835 (95% CI: .737-.906); both PPV and NPV exceeded .84. Performance was strongest at the >60 dBHL threshold (iPad > 56.875 dBHL), with a sensitivity of .913 (95% CI: .732-.981) and a specificity of .911 (95% CI: .847-.950). The NPV was particularly high at this level (.984, 95% CI: .941-.996), while the PPV was lower (.636, 95% CI: .452-.788), likely reflecting the lower prevalence of severe hearing loss in the sample. All comparisons were statistically significant (P < .001).
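The dependence of PPV and NPV on prevalence noted above follows directly from Bayes' rule, and a short sketch makes the effect concrete. The sensitivity and specificity are the values reported for the >60 dBHL threshold; the prevalence values passed in are illustrative assumptions, not figures taken from the study, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported accuracy at the >60 dBHL threshold.
SENSITIVITY, SPECIFICITY = 0.913, 0.911

# Assumed prevalences, chosen only to show the trend.
for assumed_prevalence in (0.05, 0.15, 0.50):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, assumed_prevalence)
    print(f"prevalence={assumed_prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV falls and NPV rises as the condition becomes rarer, which is consistent with the lower PPV and higher NPV observed at the most severe threshold.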

Figure 1. Receiver operating characteristic curve for hearing loss >25 dB (AUC = .917, 95% CI: .842-.993).

Figure 2. Receiver operating characteristic curve for hearing loss >40 dB (AUC = .911, 95% CI: .863-.960).

Figure 3. Receiver operating characteristic curve for hearing loss >60 dB (AUC = .968, 95% CI: .942-.994).
Table 2. Accuracy of AIHS for 3 Different Degrees of Hearing Loss.
Abbreviations: FN, false negative; FP, false positive; NPV, negative predictive value; PPV, positive predictive value; TN, true negative; TP, true positive.
Reliability Tests
We conducted an intraclass correlation coefficient (ICC) analysis and Bland–Altman analysis to evaluate the reliability of this tool. The ICC was calculated using a 2-way mixed-effects model with absolute agreement, yielding an ICC value of .901 (95% CI: .864-.927), indicating good agreement between the 2 methods.
Bland–Altman analysis revealed a mean difference between AIHS and standard audiometry of .95 ± 11.70 dBHL, with 95% limits of agreement ranging from −21.991 to 23.890 dBHL (Figure 4). Most data points fell within these limits, suggesting acceptable agreement with no significant systematic bias.
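The 95% limits of agreement follow from the mean and standard deviation of the paired differences as mean ± 1.96 × SD. The sketch below applies that standard formula to the rounded summary values reported above; the function names are hypothetical, and the raw paired data are not available here.

```python
import statistics


def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """95% limits of agreement: mean difference +/- z * SD of the differences."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def bland_altman(method_a, method_b):
    """Mean difference, SD, and limits of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_diff, sd_diff, limits_of_agreement(mean_diff, sd_diff)


# Rounded summary values reported in the text (dBHL).
lower, upper = limits_of_agreement(0.95, 11.70)
print(f"limits of agreement: {lower:.3f} to {upper:.3f} dBHL")
```

Using the rounded mean and SD reproduces the reported limits to within about 0.01 dB; the small remainder is consistent with rounding of the summary values.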

Figure 4. Scatter plot of Bland–Altman analysis. Most data points fell within the limits of agreement, indicating no significant systematic bias between these 2 methods.
Discussion
Participants
This study evaluated the diagnostic performance and reliability of AIHS across varying hearing loss thresholds and age groups. Our findings demonstrated that AIHS was a highly sensitive and specific tool for detecting mild-to-severe hearing loss, with good agreement compared to standard PTA. The percentage of patients with hearing loss in our sample was much higher than the global estimate of 25% prevalence in older adults. 13 This was because our sample came from ENT outpatient clinics, to which patients with some form of hearing problem are referred from primary care, student care, or private practitioners. We recognized that this sample did not reflect the general population, but this was not a concern in this screening study, as the sample reflected the target population of AIHS in Hong Kong.
Performance of AIHS Across Different Age Range
The sensitivity and specificity of the AIHS were highest in the middle-aged group. In patients aged 41 to 65 years, the overall sensitivity and specificity were 94.6% and 95.2%, respectively, while in patients aged over 65 years, the values for sensitivity and specificity were 92.7% and 62.5%, respectively. The age group of 28 to 40 years had a sensitivity and specificity of 66.7% and 85.7%, respectively. This was surprising, as older subjects generally have poorer hearing, which could affect accuracy. Recent studies have also shown that cognitive load could increase PTA thresholds and affect accuracy. 14 The high reliability of AIHS in older patients was encouraging, as in our region, a large proportion of patients requiring audiologic services fall into this age group. Due to the demographics of the patient pool, particularly the small number of subjects in the younger age group, it was difficult to identify a causal relationship between age and accuracy of AIHS. Further studies would be required to investigate this.
The Performance of AIHS in Different Degrees of Hearing Loss
There were notable differences in sensitivity and specificity depending on the degree of hearing loss. The sensitivity for detecting a hearing loss >60 dBHL was higher overall than for simply detecting a hearing loss >25 and 40 dBHL. The NPV also increased accordingly from 78.4% to 98.4% when the cut-off value was increased. This was promising, as the high sensitivity of AIHS in detecting more severe hearing loss allowed clinicians to identify appropriate patients for intervention. In terms of specificity, the trend was partially reversed: specificity (88.9%) was higher for hearing loss >25 dBHL than for hearing loss >40 dBHL (83.5%). This showed that automated audiometry was able to reliably reassure patients that they did not have a hearing loss.
Comparisons with Other Available Hearing Screening Tests
Several other hearing screening tests are available on the market. The HearCheck Screener 15 is an example that is frequently discussed in the available literature. It was developed by Siemens in 2005 and is a portable device that automatically generates 6 tones per test. The tones are 20, 35, and 55 dBHL at 1 kHz and 35, 55, and 75 dBHL at 3 kHz. The test is usually performed in a soundproof booth with the HearCheck device held to the tested ear. The button on the HearCheck screener is pressed once to play the first 3 tones at 1 kHz, then again to play the second 3 tones at 3 kHz. Patients are instructed to raise their hands or tap the table to indicate that a tone was heard. If the patient does not perceive one of the 6 tones, they are classified as hard of hearing and referred for a formal examination. HearCheck had a sensitivity of 79% and a specificity of 98% for the detection of hearing loss, as well as a PPV of 94% and an NPV of 89%. 16
Although we did not use other hearing screening tests for comparison in our study, our AIHS outperformed the HearCheck screener in both sensitivity and PPV for hearing loss >25 dB. Further advantages over HearCheck were that the AIHS could be performed outside a soundproof booth and that the AIHS could be potentially performed automatically without the need for an employee to hold the device to the tested ear. It would be advantageous to compare our AIHS with other screening tests available on the market in a further study.
Study Limitations
We acknowledge several limitations in the design and implementation of this study. First, the AIHS test was always performed prior to formal PTA, which may have introduced order or learning effects. Participants may have become more familiar with the task or stimuli during the AIHS procedure, potentially influencing their responsiveness in the subsequent PTA test. While this fixed sequence was maintained for logistical consistency, future studies should consider randomizing test order to mitigate this potential bias.
Recruitment from an ENT outpatient clinic may have introduced spectrum bias, as participants were more likely to have existing or suspected hearing loss. This clinical population does not fully reflect the variability seen in general or community-based screening settings, potentially leading to overestimation of diagnostic accuracy. As a result, the findings may not be directly generalizable to broader populations. Future studies should include more diverse cohorts to better assess test performance in real-world screening contexts.
Conclusions
The AIHS test demonstrated consistently high sensitivity across all levels of hearing loss tested, including mild (>25 dBHL), moderate (>40 dBHL), and severe (>60 dBHL) thresholds. Specificity and predictive values were generally robust, particularly at thresholds >40 dBHL, where both PPV and NPV exceeded .90. However, performance was more limited in older adults, where NPV was notably lower, suggesting that false negatives may occur in this subgroup.
The system’s maximum output of 65 dBHL may also restrict its ability to accurately classify profound hearing loss, potentially underestimating severity in some cases. Therefore, while AIHS is a practical and accessible tool for early detection of hearing loss, it is best suited for screening purposes rather than diagnostic confirmation. Basic operator training is recommended to ensure consistent use in primary care or community settings.
Footnotes
Authorship Contribution
Study idea and design: WTC, IHYN, and MCFT; Data collection and analysis: WTC, CXYL, WYH, WLC, SCL, XXC, and IHYN; Manuscript preparation: WTC, CXYL, XXC, WYH, WLC, SCL, IHYN, and MCFT; Final approval: All authors.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The development and maintenance of the automated iPad audiometry (AIHS) was supported by the SIE Fund (KPF19HLF09), which was a Hong Kong SAR government-funded program.
Ethics Approval
The study has been approved for human studies by the Institutional Review Board of the Joint Chinese University of Hong Kong New Territories East Cluster Clinical Research Ethics Committee (2023.201). All recruited patients have provided written informed consent for clinical interventions as well as publication purposes.
