Abstract
The use of two types of speech-in-noise (SPIN) assessment, namely digits-in-noise self-tests and open-set, monosyllabic word tests, to assess the SPIN understanding performance of children with cochlear implants (CI) in mainstream and special education, was investigated. The tests’ feasibility and reliability and the influence of specific cognitive abilities on their results were studied. The results of 30 children with CIs in mainstream and special education were compared to those of 60 normal-hearing children in elementary school. Results indicate that the digit triplet test (DTT) was feasible for all children tested in this study, as seen by the familiarity of all the digits, the high stability of the test results (<3 dB SNR), and a small measurement error (≤2 dB SNR). Remembering full triplets did not form a problem and results did not show systematic attention loss. For children with CIs, the performance on the DTT was strongly related to the performance on the open-set monosyllabic word-in-noise task. However, small but significant differences were observed in the performance of children with CIs in mainstream and special education on the monosyllabic word test. Both tests showed little influence of cognitive abilities, making them both useful in situations where the bottom-up auditory aspect of SPIN performance needs to be investigated or in situations where sentence-in-noise tests are too challenging.
Introduction
Cochlear implants (CIs) are the standard of care for children with severe to profound bilateral hearing loss who cannot benefit from regular hearing aids (Choo et al., 2021). Currently, due to early cochlear implantation, 70–90% of the children attend mainstream education (ME) (De Raeve & Lichtert, 2012; Huber et al., 2008; Uziel et al., 2007; Verhaert et al., 2008), while this was 10% before the era of cochlear implantation (Choo et al., 2021). Nonetheless, even with relatively good sound detection thresholds and understanding in quiet (Firszt et al., 2004), many adults and children with CIs experience difficulty understanding speech under adverse conditions (Davidson et al., 2011; Geers et al., 2003; Neill et al., 2019; Zaltz et al., 2020). Regardless of background noise, worse speech understanding performance in children with CIs has been linked to various demographic characteristics such as late implantation compared to early implantation (Dettman et al., 2016; Tajudeen et al., 2010), bimodal fitting compared to bilateral fitting (Choi et al., 2017) and use of non-verbal communication such as total communication or sign language compared to oral-only communication methods (Boons, Brokx, Dhooge, et al., 2012; Dettman et al., 2013; Sarant et al., 2001). Problems with understanding speech-in-noise (SPIN) cause significant communication problems in real-life situations (Zaltz et al., 2020) and can affect linguistic and cognitive development (Davidson et al., 2019; Eisenberg et al., 2016; Goossens et al., 2016; for a review see: van Wieringen & Wouters, 2015). Therefore, rehabilitating and following up on the ability to understand SPIN is essential for children with CIs.
The performance of children with CIs in adverse conditions not only depends on auditory factors but is also related to top-down processes such as cognitive and language skills (Blomquist et al., 2021; Davidson et al., 2011; Eisenberg et al., 2016; Köse et al., 2022; Moberly, Houston, et al., 2017; Zaltz et al., 2020). For example, research in children with CIs shows that working memory plays a role in the restoration of degraded speech signals and the inhibition of masking signals (Casserly & Pisoni, 2013; Rönnberg et al., 2013), and non-verbal intelligence explains part of the variance in the SPIN understanding performance of these children, on top of the variance explained by age at implantation and word recognition in quiet (Zaltz et al., 2020). Furthermore, processing speed, the ability to process and parse language input and access word meanings in real time, is related to the SPIN understanding of children with CIs (Casserly & Pisoni, 2013). As their semantic networks are less structured than those of their normal-hearing (NH) peers (Kenett et al., 2013), they tend to rely more on slow, effortful processing (AuBuchon et al., 2015; Kronenberger et al., 2018). Lastly, aspects of language, such as expressive vocabulary (Eisenberg et al., 2016) and receptive vocabulary (Zaltz et al., 2020), seem related to the SPIN understanding abilities of children with CIs.
Many of these abilities show large variability in children with CIs (AuBuchon et al., 2015; Boons, Brokx, Dhooge, et al., 2012; Kenett et al., 2013; Kronenberger et al., 2013, 2018; Moberly, Pisoni, et al., 2017; Wass et al., 2008; Yehudai et al., 2011). For example, in the research of Boons et al., half of the children with CIs achieved age-adequate language quotients. However, the CI group performed significantly weaker on every language test, including vocabulary, morphology, syntax, and narratives, than their NH peers (Boons et al., 2013). In addition, Boons, Brokx, Frijns, et al. (2012) showed large variability in spoken language comprehension of children with CIs, ranging from two standard deviations below the norm for NH children to as high as one standard deviation (SD) above this norm. They showed that children with monaural implantation had significantly worse language comprehension skills than children with bilateral implantation (Boons, Brokx, Dhooge, et al., 2012) and that the presence of additional disabilities enlarged the odds of poor vocabulary skills (Boons et al., 2013). The high heterogeneity is often also reflected in their educational outcomes, as 10% to 30% of the children with CIs attend special education (SE) or additional SE classes at regular schools (Choo et al., 2021; De Raeve et al., 2012; Huber et al., 2008; Uziel et al., 2007) and children with lower language levels are more likely to attend SE than ME (Boons, Brokx, Dhooge, et al., 2012; Yehudai et al., 2011).
The extent to which these language and cognitive abilities influence the results obtained with SPIN tests largely depends on the type of test used, making it important to consider the exact aim of the measurement when choosing an adequate SPIN test. For example, when investigating SPIN understanding performance in everyday life, a sentence task, or even a multitask paradigm that mimics everyday life situations (Devesse et al., 2020), is preferred. Such speech materials and paradigms heavily involve top-down processes such as working memory, processing speed (Wingfield, 1996), and linguistic knowledge (Vickers et al., 2016). However, in some situations, it may be more important to evaluate the bottom-up auditory performance, irrespective of language proficiency, for example, to gain insight into the mapping of the CI. Moreover, tests requiring many top-down processes can become impossible for some populations. For instance, several children with CIs who attend SE perform poorer on measures involving language than children attending ME (Boons, Brokx, Dhooge, et al., 2012; Yehudai et al., 2011). While sentence materials are often too difficult, resulting in an unreliable estimation of speech understanding, monosyllabic words in noise and digits in noise could be better options.
A Dutch example of a monosyllabic word test is the Lilliput, used for assessing SPIN performance in children and adults (van Wieringen & Wouters, 2022). The Lilliput consists of meaningful monosyllabic consonant–vowel–consonant (CVC) words for young children (4–5 years old) to adults. The test can be used with and without background noise with an open-set or a closed-set response format. The open-set response format is more closely related to everyday situations as words need to be retrieved from the lexical memory, reflecting real-world listening (Buss et al., 2016). In these tasks, the familiarity and frequency of a word affect its recognition more than in a closed-set task, and when using an open-set task, the listener has to hear the entire target word clearly enough to differentiate it from all other words in their vocabulary, instead of only from the response alternatives in a closed-set task (Buss et al., 2016). Moreover, lexical knowledge can influence the results of meaningful CVC word recognition tasks (Wilson et al., 2010). When using monosyllabic words in quiet, around 1 dB of the 2.5 to 3.9 dB difference observed between age groups in children aged 6 to 12 years old could be attributed to differences in word familiarity, including vocabulary size (Wilson et al., 2010). Most likely, the difference is even bigger when the listening circumstances are more challenging such as during SPIN listening. This could result in an underestimation of the auditory abilities in persons with lower linguistic skills, which is, for example, the case for children with CIs (Kenett et al., 2013), especially in SE (Boons, Brokx, Dhooge, et al., 2012; Kenett et al., 2013; Yehudai et al., 2011). A SPIN test with even lower linguistic demands could be more suitable in this situation.
The digit triplet test (DTT) is one of the paradigms used to assess SPIN understanding abilities with little influence from linguistic skills (Cullington & Agyemang-Prempeh, 2017; De Graaff et al., 2018; Kaandorp et al., 2015; van Wieringen et al., 2021). The DTT is designed as a self-test, meaning that it determines the speech reception threshold (SRT) without a test leader. The DTT uses a closed-set paradigm with digits, which are easy, familiar words (Smits et al., 2004). This limits the effect of top-down processes like linguistic skills and allows the DTT to be used as a baseline for speech recognition (Cullington & Agyemang-Prempeh, 2017; De Graaff et al., 2018; Kaandorp et al., 2015). The DTT has already been investigated in adults using CIs and hearing aids (HAs), and was shown to be useful for, among other applications, telehealth (Cullington & Agyemang-Prempeh, 2017; De Graaff et al., 2018; Kaandorp et al., 2015). Therefore, the DTT can potentially be used for children with CIs in ME and SE as well. Nevertheless, it cannot be ruled out that some cognitive abilities influence the DTT when it is done as a self-test. For example, verbal working memory is needed to store the three-digit sequences while searching for the response buttons, and visual working memory is needed to remember the places of the buttons to reduce visual searching time. Processing speed is required to rehearse the digits so they are not forgotten while searching for the correct buttons. Reasoning and fluid intelligence are potentially needed to understand the test instruction when performing a self-test. Lastly, the DTT takes around six minutes and therefore requires sustained attention. When a self-test takes too long, younger children lose attention, resulting in unreliable test results (Van den Borre et al., 2022).
The Lilliput and the DTT can both be highly valuable in the rehabilitation and follow-up of SPIN understanding of children with a CI. The DTT can be particularly useful for estimating the overall bottom-up auditory skills of children with CIs when testing at home, for example, in between appointments with a clinician, or when the patients have limited linguistic skills. It can also be valuable for providing frequent follow-up of the performance of CI users. An open-set, monosyllabic word-in-noise test, such as the Lilliput, can be helpful when estimating auditory skills in detail, such as difficulties with specific phonemes, when the linguistic skills of the patient allow it and a clinician can be present. However, neither the feasibility nor the validity of these two tests has previously been investigated in children with a CI in ME and SE.
Therefore, this study investigated whether the DTT self-test and a monosyllabic word test could be used as SPIN tasks in children with CIs in elementary school, in ME and SE, both native and non-native Dutch speakers. The performance was compared to that of an age-matched group of children with NH in ME. To make an adequate comparison possible, we first investigated the demographic characteristics of the three groups.
The performance on the DTT and the Lilliput was compared between the groups. We investigated whether all children were familiar with all the digits and their written forms and whether they could remember full triplets. The stability of the test result was assessed with the SD of the SNRs in the trials used for the SRT calculation, and its reliability was assessed with the measurement error between test and retest. Moreover, the configuration of the adaptive staircases was analyzed to investigate the potential effects of attention loss. Furthermore, we investigated the correlation between the results of the DTT self-test and the administrator-controlled, open-set monosyllabic word test to determine the extent to which the results on both tests agree for NH children and children with CIs.
Additionally, the relation between SPIN understanding, measured with the DTT and the Lilliput, and non-verbal working memory, processing speed, and reasoning was investigated as we expected these specific cognitive functions to influence the DTT as a self-test. We expected non-verbal working memory and reasoning to play less of a role when assessing SPIN with the Lilliput, which only required repeating one word at a time. However, the open-set response format of the Lilliput requires retrieval from the lexical memory which is also related to processing speed (Blomquist et al., 2021).
Participants
Ninety children aged between 5 years 9 months and 12 years were tested: 25 had two CIs (bilateral stimulation), 5 had one CI and a hearing aid (bimodal stimulation), and 60 had NH in both ears (PTA0.5–4 kHz < 30 dB HL, measured with headphones in a silent room). All children with CIs were recruited from the ENT department of the University Hospitals Leuven or from outpatient rehabilitation centers in Flanders. All children with NH were recruited from Flemish schools. All children were in elementary school. The children with CIs had been using their devices for at least 3 years. Every child was implanted in Flanders and had been enrolled in the Flemish (Dutch) education system at least since then. Children with additional disabilities unrelated to their hearing loss, such as autism spectrum disorder, were not included. Eleven children with CIs attended SE, and 19 attended ME. The demographics are given in Table 1. More detailed analyses of these demographics and their relation to the test measures used in this study are given in the “Results” section.
Demographics for NH Children, Children with CIs in ME, and Children with CIs in SE. The 70 dB nHL Cutoff was Chosen Based on the Reimbursement law for CIs in Belgium, Which States that Newborns Qualify for Reimbursement When Peak V of the Brainstem Evoked Response Audiometry is Above 70 dB nHL in Both Ears. The Communication Mode ‘Oral + Gestures’ Includes All Types of Gestural Language Such as Total Communication and Official Sign Language.
All parents of the participating children provided written informed consent. The Ethics Committee of the University Hospitals Leuven approved the study (approval no. B322201731501). The participating children received a small toy after testing, and the travel costs were refunded when the participants came to the hospital for testing.
Methods
Materials
The protocol consisted of five tests, including two SPIN and three cognitive tests. The children did the different tests in randomized order.
SPIN Understanding Assessment: Digit Triplet Test and Lilliput Monosyllabic Word Test
The SPIN tasks were administered using direct streaming from the tablet to the CI's sound processor for children with CIs. The CI was put into the streaming mode, meaning that during the streaming of the sounds, surrounding noises were not transmitted through the CI. An HDA-200 headphone was used for the NH children to avoid disturbing surrounding noise. For children with only one CI, the sounds were streamed only to that CI. The speech sounds were never presented to the HA. For children with a bilateral CI, sounds were streamed to both CIs, sequentially per ear with monaural presentation or binaurally, depending on the test. For children with CIs, calibration was done for the individual streaming devices to ensure that differences between manufacturers were considered. The calibration was kept constant for NH children as all children used the same tablet and headphones.
The first SPIN task was the Flemish-Dutch version of the DTT, developed in 2013 (Jansen et al., 2013) and modified later to increase feasibility and reliability (Denys et al., 2019). The DTT used 17 sequences of three digits in noise. The DTT used an adaptive procedure, starting at an SNR of 0 dB. The first five triplets were scored with triplet scoring to converge rapidly to SNRs close to the actual SRT. The following trials were scored with digit scoring to obtain high reliability. Step size was calculated with the formula
Secondly, the Lilliput monosyllabic word test was administered in an administrator-controlled version in which the child had to repeat the presented word, after which the administrator scored the items (van Wieringen & Wouters, 2022). The first word was used to familiarize the child with the test and was therefore not scored. The other eleven words were scored with phoneme scoring (three per word). Every child completed six lists. For children with NH, three lists were presented at an SNR of −10 dB SNR (Lilliput −10 dB SNR), and three had an SNR of −5 dB SNR (Lilliput −5 dB SNR). The first list per SNR was used as training. SNRs were chosen based on the average values (−5.7 dB SNR for 4-year-old children and −10.3 dB SNR for adults) observed previously (van Wieringen & Wouters, 2022).
For children with CIs, three lists were presented at an SNR of 0 dB SNR (Lilliput 0 dB SNR), and three lists were presented at an SNR of +5 dB SNR (Lilliput 5 dB SNR). These specific SNRs were chosen based on former research showing average SRTs to be around 5 to 10 dB SNR poorer for children with CIs than the SRTs found with similar material for their NH peers (Ching et al., 2013; Zaltz et al., 2020). The first list per SNR was used as training.
The Lilliput result was the average of the percentage correct calculated for the last two lists. A separate result was calculated for both SNRs. The average result was transformed with a rationalized arcsine unit (RAU) transform to make them more suitable for statistical analysis (Hoen, 2015; Studebaker, 1985).
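The transform described above can be sketched as follows. This is a minimal illustration that uses the standard RAU formula from Studebaker (1985), not the analysis code of this study. The function names are hypothetical, the example scores are made up, and pooling the phonemes of the last two lists (equivalent to averaging their percentages when both lists have the same length) is an assumption. The 33 scored phonemes per list follow from the eleven scored words with three phonemes each.

```python
import math


def rau(correct: int, total: int) -> float:
    """Rationalized arcsine units (Studebaker, 1985) for `correct` out of `total` items."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0


def lilliput_result(phonemes_correct_per_list, phonemes_per_list=33):
    """Percent correct and RAU for one SNR, using only the last two lists.

    The first list at each SNR is training and is ignored. Pooling the
    phonemes of the two scored lists is an assumption of this sketch.
    """
    scored = phonemes_correct_per_list[-2:]
    correct = sum(scored)
    total = phonemes_per_list * len(scored)
    return 100.0 * correct / total, rau(correct, total)


# Hypothetical child: training list, then 26 and 27 of 33 phonemes correct.
percent, units = lilliput_result([20, 26, 27])
print(f"{percent:.1f}% correct -> {units:.1f} RAU")
```

The transform is nearly linear between roughly 20% and 80% correct and stretches scores near floor and ceiling, which is why it is applied before parametric analyses of percentage scores.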
Cognitive Tests: Matrix Reasoning, Picture Span, and Rapid Automatized Naming Task
Fluid intelligence and reasoning were assessed with the Wechsler Intelligence Scale for Children (WISC)-V Matrix Reasoning (Wechsler, 2014). During Matrix Reasoning, children were presented with a series of incomplete matrices, each of which was a series of abstract patterns and designs. For each matrix, the child was asked to select, from five response options, the missing piece that best completed the matrix. The result was the scaled score between 0 and 20 converted with the age-dependent norm tables of the WISC-V. A scaled score between 1 and 7 is below average and corresponds to a percentile rank between 1 and 16. A scaled score between 8 and 12 is described as average and has a corresponding percentile rank of 25–75. A scaled score between 13 and 19 is above average and corresponds to a percentile rank between 84 and 99. The WISC-V Matrix Reasoning is a fluid intelligence and reasoning test that measures visual processing and abstract spatial perception (Wechsler, 2014).
Non-verbal memory was assessed with the WISC-V Picture Span (Wechsler, 2014). During the Picture Span, the child saw a series with a variable number of pictures, which they had to memorize and later pick from a series with other pictures with the same or longer length. The shortest series consists of one picture and the longest series consists of eight pictures. The number of pictures in every series is predefined in the WISC-V Picture Span, so every child receives the same sequence of picture series. For the first three series, the pictures’ order was unimportant. For the series from 4 to 26, the maximum number of points per series was two. To get two out of two, the child had to identify all the pictures in the correct order. If the child identified all the pictures but selected them in the wrong order, they got one out of two points. The child got zero points when one or more pictures were identified incorrectly. The test was stopped when the child scored zero on three consecutive trials. The Picture Span is a test for non-verbal working memory and is scored with the same type of scaled scores as described for the WISC-V Matrix Reasoning.
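The scoring rule for series 4 to 26 and the discontinue rule can be sketched as follows. This is only an illustration of the rules as described above, not the official WISC-V scoring procedure. The function names and the representation of a response as a list of picture labels are hypothetical, and the first three series (for which order was unimportant) are left out because their point values are not specified here.

```python
def score_series(target, response):
    """Score one Picture Span series (series 4 to 26): 0, 1, or 2 points.

    `target` and `response` are sequences of picture labels (hypothetical
    representation chosen for this sketch).
    """
    if list(response) == list(target):
        return 2  # all pictures identified, in the correct order
    if sorted(response) == sorted(target):
        return 1  # all pictures identified, but in the wrong order
    return 0      # one or more pictures identified incorrectly


def total_with_discontinue(series_scores, max_consecutive_zeros=3):
    """Sum the series scores, stopping after three consecutive zero scores."""
    total = 0
    zeros_in_a_row = 0
    for score in series_scores:
        total += score
        zeros_in_a_row = zeros_in_a_row + 1 if score == 0 else 0
        if zeros_in_a_row == max_consecutive_zeros:
            break  # test is stopped; later series are not administered
    return total


print(score_series(["cat", "dog", "sun"], ["dog", "cat", "sun"]))
print(total_with_discontinue([2, 1, 0, 0, 0, 2]))
```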
Processing speed was assessed with the Rapid Automatized Naming (RAN) Task (Denckla & Rudel, 1974, 1976). The RAN task contains four parts: object naming, color naming, letter naming, and number naming, shown on four different stimulus cards. Each stimulus card consisted of five different items, each replicated ten times. The child was instructed to name all the items on the paper as fast as possible. The result was calculated with the formula
Statistical Analysis
Statistical analyses were performed using R and RStudio (Studio, 2020). Normality was checked with Shapiro–Wilk normality tests and Q–Q plots. Possible inter-correlations between the predictor variables were checked. The following paragraphs describe the specific statistical analyses per research question.
Description of Demographics
Possible differences between children with CIs in ME and in SE in hearing status at birth, communication mode, Dutch as mother tongue, and stimulation mode were investigated with chi-squared tests. Age, age at receiving the first implant, the time between both implantations, and the time of CI usage were investigated with Kruskal–Wallis tests.
The effects of hearing status at birth, communication mode, and stimulation mode on SPIN understanding were investigated with Kruskal–Wallis tests. Independent models were made with one of these demographics as the independent variable in each model, and the DTT results, the Lilliput 5 dB SNR, or the Lilliput 0 dB SNR as the dependent variable. Age at implantation and time between both implants were investigated with simple linear models with one of these demographics as the independent variable in each model and the DTT, Lilliput 5 dB SNR, or the Lilliput 0 dB SNR as the dependent variable. Possible age effects on the RAN were investigated with multiple linear models including age and hearing (CI or NH) as independent variables and the results on the RAN subtests as dependent variables. Differences in results on the cognitive tasks between groups were investigated with Kruskal–Wallis tests when no age effect was present on that specific cognitive task. When an age effect was present, a multiple linear model was used with age included as an independent variable.
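The analyses in this study were run in R. Purely to illustrate what the Kruskal–Wallis statistic measures, the following Python sketch computes the tie-corrected H statistic from ranked data and compares it with a chi-squared distribution. The function names and the example data are hypothetical, and the p-value helper is deliberately limited to 1 and 2 degrees of freedom (two or three groups), which have simple closed forms.

```python
import math
from collections import Counter


def kruskal_wallis(*groups):
    """Return (H, df) for the Kruskal-Wallis test with tie correction."""
    values = sorted(v for g in groups for v in g)
    n_total = len(values)
    # Assign mid-ranks: tied values share the average of their rank positions.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_sizes = Counter(values).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1


def chi2_sf(x, df):
    """Upper-tail chi-squared probability for df = 1 or 2 only (sketch)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise NotImplementedError("this sketch only covers df = 1 and df = 2")


# Made-up SRTs (dB SNR) for two small groups.
h, df = kruskal_wallis([-4.1, -3.0, -5.2, -2.8], [-1.0, -2.5, 0.3, -1.9])
print(f"H({df}) = {h:.2f}, p = {chi2_sf(h, df):.3f}")
```

Because the test works on ranks rather than raw values, it does not require normally distributed scores, which suits the small CI subgroups in this study.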
Feasibility of the DTT and the Lilliput for Children with CIs in ME and SE
Differences in average SPIN test results between education types were estimated with Kruskal–Wallis tests. Confusion matrices for the DTT results were constructed for NH children and children with CIs in ME and SE separately to investigate whether all digits were familiar to all groups.
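A confusion matrix of this kind can be built from pairs of presented and reported digits, as in the sketch below. The data layout, the function names, and the example responses are hypothetical. The 10% chance level is the value used in the Results section.

```python
from collections import Counter, defaultdict


def confusion_matrix(pairs):
    """Row-normalised confusion matrix.

    `pairs` is an iterable of (presented_digit, reported_digit). The result
    maps each presented digit to the proportion of each reported digit.
    """
    counts = defaultdict(Counter)
    for presented, reported in pairs:
        counts[presented][reported] += 1
    return {
        presented: {rep: n / sum(row.values()) for rep, n in row.items()}
        for presented, row in counts.items()
    }


def confusions_above_chance(matrix, chance=0.10):
    """List (presented, reported, proportion) for off-diagonal cells above chance."""
    return sorted(
        (presented, reported, prop)
        for presented, row in matrix.items()
        for reported, prop in row.items()
        if presented != reported and prop > chance
    )


# Made-up responses: '4' is reported as '3' in 3 of 10 presentations.
pairs = [(4, 4)] * 7 + [(4, 3)] * 3 + [(3, 3)] * 10
matrix = confusion_matrix(pairs)
for presented, reported, prop in confusions_above_chance(matrix):
    print(f"digit {presented} reported as {reported} in {prop:.0%} of presentations")
```

High values on the diagonal of such a matrix indicate that a digit is recognized reliably. Off-diagonal cells above chance point to systematic confusions rather than random guessing.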
Differences in the SD of the DTT adaptive staircases between NH children and children with CIs in ME and SE were investigated using Kruskal–Wallis tests.
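The stability measure can be illustrated with a short sketch. It assumes that the SD is taken over the SNRs of the last twelve trials of a 17-trial track and that a track counts as stable when this SD is below 3 dB, as reported in the Results section. Whether the sample or the population SD was used is not stated, so the sample SD here is an assumption. The function names and the example track are hypothetical.

```python
import statistics


def staircase_sd(snrs_db, n_last=12):
    """Sample SD (dB) of the SNRs in the last `n_last` trials of one track."""
    tail = snrs_db[-n_last:]
    if len(tail) < 2:
        raise ValueError("need at least two trials")
    return statistics.stdev(tail)


def is_stable(snrs_db, criterion_db=3.0, n_last=12):
    """True when the SD of the last trials is below the stability criterion."""
    return staircase_sd(snrs_db, n_last) < criterion_db


def aligned_staircase(snrs_db, srt_db):
    """Express each trial's SNR relative to the final SRT of that track.

    Aligning tracks this way allows averaging staircases across children to
    look for drifts that could indicate attention loss.
    """
    return [snr - srt_db for snr in snrs_db]


# Made-up 17-trial track: five descending trials, then oscillation around -9 dB.
track = [0, -2, -4, -6, -8] + [-8, -10] * 6
print(f"SD of last 12 trials: {staircase_sd(track):.2f} dB, stable: {is_stable(track)}")
```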
The measurement error of the Lilliput and the DTT was estimated using the formula
The relation between the results on the Lilliput and the DTT was estimated with simple linear models constructed separately for children with NH and children with CIs. These models included the Lilliput result at one SNR (−5 dB SNR or −10 dB SNR for children with NH; 5 dB SNR or 0 dB SNR for children with CIs) as the independent variable and the DTT result at the best ear as the dependent variable.
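As an illustration of what such a simple linear model yields, the sketch below fits an ordinary least-squares line with one predictor and returns the slope, its t statistic, the residual degrees of freedom, and the proportion of variance explained. It is not the R code used in this study. The function name and the example data are hypothetical.

```python
import math


def simple_linear_model(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    residual_se = math.sqrt(ss_res / df)
    t_slope = slope / (residual_se / math.sqrt(sxx))  # undefined for a perfect fit
    return {
        "slope": slope,
        "intercept": intercept,
        "t": t_slope,
        "df": df,
        "residual_se": residual_se,
        "r_squared": 1 - ss_res / syy,
    }


# Made-up data: Lilliput scores (RAU) as predictor, DTT SRTs (dB SNR) as outcome.
lilliput = [55.0, 62.0, 70.0, 78.0, 85.0, 91.0]
dtt_srt = [1.5, -0.5, -2.0, -3.5, -6.0, -6.5]
fit = simple_linear_model(lilliput, dtt_srt)
print(f"slope = {fit['slope']:.2f} dB per RAU, t({fit['df']}) = {fit['t']:.1f}, "
      f"R^2 = {fit['r_squared']:.2f}")
```

The slope gives the expected change in the dependent variable per unit change in the predictor, while R² gives the proportion of variance explained. These are different quantities from the correlation coefficient, although in a model with a single predictor R² equals the squared correlation.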
Cognitive Effects Influencing Speech Understanding in Noise Measured with the DTT
Possible relations between the cognitive abilities and the results on the SPIN tests were estimated with simple linear models for children with NH and children with CIs. The models contained the DTT results or the Lilliput results at one SNR (−5 dB SNR or −10 dB SNR for children with NH; 5 dB SNR or 0 dB SNR for children with CIs) as the dependent variable and one of the cognitive abilities as the independent variable.
Results
Description of Demographics
Children with CIs in the two different education types did not differ significantly in age (χ2 (1) = 2.7, p = 0.10), in age when receiving the first CI (χ2 (1) = 0.04, p = 0.85), age when receiving the second implant (χ2 (1) = 0.4, p = 0.54), time of CI usage (χ2 (1) = 1.6, p = 0.20), communication mode (χ2(1, n = 30) = 2.9, p = 0.09), and stimulation mode (bimodal/bilateral) (χ2(1, n = 30) = 0.72, p = 0.40). The hearing status at birth (χ2(2, n = 30) = 6.3, p = 0.04) differed significantly between both education types. In ME, five children were born NH, six were born with hearing loss ≤ 70 dB nHL, and seven were born with deafness. In SE, no children were born with NH, nine were born with hearing loss ≤ 70 dB nHL, and two were born with deafness. Whether or not children were native Dutch speakers did differ significantly between both education types (χ2(1, n = 30) = 9.7, p < 0.01). Of the children with CIs tested in SE, 73% spoke another language at home, including Arabic, Turkish, and Tigrinya. Of the children with CIs tested in ME, only 16% spoke another language at home, including Arabic, Chinese, and English.
The specific scores on the cognitive tasks for NH children and children with CIs in ME and SE are given in Table 2. The children with CIs in SE scored significantly worse on most tasks than their NH peers. Only processing speed for colors did not differ significantly between children with CIs in SE and children with NH. Children with CIs in ME differed significantly from NH children only on non-verbal working memory. An age effect was observed for processing speed (Numbers: t (0.34, 87) = 8.2, p < 0.01; Letters: t (0.35, 86) = 7.8, p < 0.01; Objects: t (0.21, 87) = 4.3, p < 0.01; Colors: t (0.25, 87) = 5.1, p < 0.01).
Average Scores on Different Cognitive Tasks per Group. Statistical Test Results on the Differences in Scores Between Groups. Superscript ‘A’ Stands for ‘Age Dependent’.
SPIN understanding measured with the DTT was related to age at implantation (t (3.3, 28) = 3.4, p < 0.01), stimulation mode (χ2(1) = 4.1, p = 0.04), and communication mode (χ2(1) = 8.6, p < 0.01). SPIN understanding measured with the Lilliput at 5 dB SNR was significantly related to implantation age (t (12.2, 28) = -2.3, p = 0.03) and communication mode (χ2(1) = 8.1, p < 0.01). The Lilliput at 0 dB SNR was not significantly related to implantation age (t (14.8, 28) = -1.7, p = 0.11) but was significantly related to communication mode (χ2(1) = 4.3, p = 0.04). The Lilliput was not significantly related to stimulation mode. The time between both implantations and the hearing state at birth did not influence the SPIN understanding measured with the DTT or the Lilliput. Moreover, children who did not speak Dutch at home did not have significantly poorer SRTs on the DTT (χ2(1) = 0.02, p = 0.88) or significantly poorer phoneme scores on the Lilliput at 5 dB SNR (χ2(1) = 0.60, p = 0.44) or 0 dB SNR (χ2(1) = 0.74, p = 0.39) compared to children with CIs who did speak Dutch at home.
Feasibility of the DTT and the Lilliput for Children with CIs in ME and SE
All children tested in this research could perform a DTT self-test reliably, meaning that all children obtained an SRT with an SD smaller than 3 dB. The average DTT SRT was −9.3 ± 1.4 dB SNR for NH children, and −3.3 ± 4.4 dB SNR and −1.9 ± 3.3 dB SNR for children with CIs in ME and SE, respectively. DTT scores of children with CIs did not differ significantly between the two education types (χ2(1) = 1.7, p = 0.19). The DTT scores for NH children and children with CIs in both education types are visualized in Figure 1. The average confusion matrices of these three groups show high diagonal scores, indicating a limited number of confusions between digits (Figure 2). Only the digit ‘4’ (‘vi:r’) was confused with the digit ‘3’ (‘dri:’) more often than the chance level (10%), and this was the case for children with CIs in both education types. For children with NH, the digit ‘4’ was also often confused with the digit ‘3’, but this confusion was not above the chance level. Moreover, children with NH more often confused the digit ‘1’ (‘en’) with the digit ‘9’ (‘neγən’), which is not included in the speech material set.

DTT SRTs for NH children (n = 60) and children with CIs in ME (n = 18) and SE (n = 12), boxplot with mean, SD, and 3*SD.

Average confusion matrix for NH children (Left) and children with CIs in ME (Middle) and SE (Right).
The average SD of the last twelve trials was 1.3 ± 0.3 dB, 1.5 ± 0.3 dB, and 1.5 ± 0.4 dB for children with NH, children with CIs in ME, and children with CIs in SE, respectively. The difference in SD was significant between NH children and children with CIs in ME (χ2(1) = 6.9, p < 0.01) and in SE (χ2(1) = 5.0, p = 0.03). The SD did not differ significantly for children with CIs in ME or SE (χ2(1) = 0.74, p = 0.39). No clear decreasing or increasing trend is visible in the average staircase, as seen in Figure 3.

Adaptive staircases for the DTT for NH children and children with CIs in ME and SE. The y-axis shows the difference between the SNR in a specific trial and the eventual SRT of the subject calculated after testing.
The measurement error was 1.4 ± 0.1 dB, 2.0 ± 0.4 dB, and 2.0 ± 0.3 dB for children with NH, children with CIs in ME, and children with CIs in SE, respectively.
The percentage of trials in which children identified the entire triplet correctly was 60 ± 11%, 58 ± 15%, and 56 ± 15% for children with NH, children with CIs in ME, and children with CIs in SE, respectively. This percentage did not differ significantly between groups (χ2(2) = 1.2, p = 0.54). The DTT results did not differ across age (NH: t = -1.8, p = 0.08, CI: t = -0.03, p = 0.98).
NH children scored 80 ± 6% on the Lilliput at −5 dB SNR and 50 ± 13% on the Lilliput at −10 dB SNR. Children with CIs in ME scored 82 ± 14% on the Lilliput at 5 dB SNR and 67 ± 19% on the Lilliput at 0 dB SNR. Children with CIs in SE scored 76 ± 9% on the Lilliput at 5 dB SNR and 67 ± 8% on the Lilliput at 0 dB SNR. Children with CIs in SE did not score significantly poorer than children with CIs in ME on the Lilliput at 0 dB SNR (χ2(1) = 0.9, p = 0.33), but they did score significantly poorer on the Lilliput at 5 dB SNR (χ2(1) = 5.9, p = 0.01).

Lilliput scores at 0 and 5 dB SNR for children with CIs in ME (n = 18) and SE (n = 12), boxplot with mean, SD, and 3*SD.
The measurement error for NH children was 7.7 ± 0.9% and 8.3 ± 0.8% for the Lilliput at −5 dB SNR and −10 dB SNR, respectively. The measurement error for the Lilliput at 5 dB SNR was 7.5 ± 1.5% and 8.2 ± 1.4% for children with CIs in ME and children with CIs in SE, respectively. The measurement error for the Lilliput at 0 dB SNR was 8.9 ± 1.4% and 7.8 ± 1.5% for children with CIs in ME and children with CIs in SE, respectively.
The Lilliput results did not differ across age for children with CIs at 5 dB SNR (t(12.8, 28) = -1.4, p = 0.17) and at 0 dB SNR (t(15.5, 28) = -0.2, p = 0.88) and for NH children at −5 dB SNR (t(6.40, 58) = 1.4, p = 0.17) and −10 dB SNR (t(12.21, 58) = 0.2, p = 0.83).
For the children with CIs, the SRT-values of the DTT correlated significantly with the scores on the Lilliput (Lilliput 5 dB SNR: t (2.79, 28) = -5.3, p < 0.01, Lilliput 0 dB SNR: t (2.95, 28) = -4.7, p < 0.01). For children with NH, the correlation was not significant for the Lilliput at −10 dB SNR (t (1.31, 58) = 1.9, p = 0.06) nor for the Lilliput at −5 dB SNR (t (1.34, 58) = -0.6, p = 0.56). The relations between the DTT and the Lilliput results in different groups are visualized in Figure 5.

Relation between the DTT SRTs and the Lilliput scores for children with CIs and with NH, the gray area indicates the 95% confidence interval for predictions from a linear model. Lower values on the DTT indicate better SPIN understanding, and higher values on the Lilliput indicate better SPIN understanding.
Cognitive Effects Influencing Speech Understanding in Noise Measured with the DTT and the Lilliput
The SPIN understanding measured with the DTT was not significantly related to processing speed for children with CIs (t (3.8, 28) = -1.6, p = 0.13), but it was for children with NH (t (1.27, 58) = -2.8, p < 0.01). For NH children, the processing speed of numbers explained 11% of the variance in SPIN understanding measured with the DTT. The regression coefficient relating the DTT to processing speed for numbers is −1.0, meaning that for a RAN score that is one item/s higher, the DTT SRT would, on average, be 1.0 dB SNR lower. Processing speed for colors, objects, and letters explained between 6% and 12% of the variance of the DTT results for NH children (Figure 6). The results on the Lilliput were not significantly related to processing speed for children with CIs. For NH children, only processing speed for objects was significantly related to the SPIN results on the Lilliput at −5 dB SNR (t (6.2, 58) = 2.3, p = 0.03). Processing speed for numbers, letters, and colors showed no significant relation with the Lilliput scores at −5 dB SNR or −10 dB SNR.

Relation between the DTT, Lilliput −5 dB SNR, and Lilliput −10 dB SNR results and processing speed (RAN) for NH children.
Discussion
The main objective of the research described here was to investigate whether the DTT self-test and a monosyllabic word test, the Lilliput, could be used as SPIN tasks in children with CIs aged 6 to 12 years old in ME and SE, both native and non-native Dutch speakers. Results indicate that the DTT was feasible for all children tested in this study, as seen by the familiarity of all the digits, the high stability of the test results (<3 dB SNR), and a small measurement error (≤2 dB SNR). Moreover, the SRTs obtained for children with CIs in this study (average SRT: −2.8 ± 4.1 dB SNR) are very similar to the values obtained in previous research with adults with CIs (average SRT: −3.2 ± 4.4 dB SNR) (van Wieringen et al., 2021). In addition, remembering full triplets did not form a problem for any of the three groups, which was visible in a high percentage of trials in which all digits were identified correctly. Nevertheless, it is important to indicate that the DTT used in this research allowed for answering during stimulus presentation, meaning that the entire triplet did not need to be remembered until after the stimulus presentation. More problems could potentially appear when the child has to remember the entire triplet until after the full stimulus presentation.
Besides the general feasibility of the tests, some differences were observed between NH children and children with CIs in ME and SE. To start with, the digit ‘4’ (‘vi:r’) was confused with the digit ‘3’ (‘dri:’) by children with CIs in both education types more often than chance level (10%). This is likely because the vowel /i:/ in both tokens is similar and sounds louder than the surrounding consonants.
Secondly, the stability of the DTT, reflected by the SD with which the SRT-value is estimated, was 1.3 dB for children with NH and 1.5 dB for children with CIs. No differences were observed between the children with CIs in ME and SE. The difference in SD between children with NH and children with CIs is significant. However, the average adaptive staircase of children with CIs did not show a clear increasing trend towards the end, which would have indicated declining attention and motivation, nor a clear decreasing trend, which would have indicated that these children needed more training than their NH peers. Moreover, the difference in stability is only 0.2 dB, which is small compared to the average measurement error of between 1.4 and 2 dB and, therefore, clinically irrelevant.
The measurement error of the DTT was smaller for NH children than for children with CIs. A larger measurement error for listeners with hearing impairment was also determined in previous research (Denys et al., 2019; Kaandorp et al., 2015) and is related to the shallower slopes of the psychometric curves of hearing-impaired listeners compared to NH listeners. The measurement error of NH children is in line with the measurement error in other DTT versions and languages, which is around 1 dB (for a review: Van den Borre et al., 2021).
For children with CIs, the performance on the DTT was strongly related to the performance on the open-set monosyllabic word-in-noise task, the Lilliput. However, when assessing SPIN understanding abilities with the Lilliput, a significant difference of 6% was observed between the SPIN understanding performance of children with CIs in ME and in SE. This task uses meaningful words retrieved from memory (open-set). When using an open-set task, the listener has to hear the entire target word clearly enough to differentiate it from all other words in their vocabulary for it to be identified correctly at phoneme level. In contrast, for a test with a closed-set format, such as the DTT, hearing parts of the stimuli can already be sufficient to choose the correct response alternative (Buss et al., 2016). If children with CIs in SE show more difficulties with listening in noise at phoneme level than their CI-using peers in ME, these difficulties may be shown with the Lilliput but not with the DTT. However, as no data were available in this research on the specific confusions made during the Lilliput, this hypothesis cannot be (dis)proved. Another explanation is that the results on the Lilliput rely more on the listener's linguistic skills. In this research, significantly more children in SE spoke another mother tongue at home than the children in ME. Even though bilingualism is not always linked to lower language levels (Sosa & Bunta, 2019; Thomas et al., 2008), bilingualism other than the typical combinations of official Belgian languages (Dutch–French–German) is, in Belgium as in other European countries, frequently the result of an immigration background. Immigration is frequently associated with poorer competence in the local languages and, eventually, poorer social integration, making it more difficult to access professional help and follow local audiological rehabilitation and follow-up programs.
In turn, this could result in worse language and academic performance of the child with hearing impairment (Forli et al., 2018). Moreover, as SE in Belgium focuses specifically on language development, it is often the better choice for those with language difficulties (Boons, Brokx, Dhooge, et al., 2012; Yehudai et al., 2011). As the Lilliput uses existing words in an open-set response format, understanding the words can rely on linguistic skills on top of auditory skills (Wilson et al., 2010), making it more difficult for the group attending SE. This could also explain why a similar difference is not seen in the DTT results, as the DTT only minimally relies on linguistic skills (Potgieter et al., 2018).
Previous research with adults showed a high correlation between the DTT results and a sentence-in-noise recognition task (Smits et al., 2004; van Wieringen et al., 2021). Of all types of speech material, sentences come closest to capturing real-world listening difficulties. Real-life SPIN understanding is related to top-down processes such as cognitive and language skills (Blomquist et al., 2021; Davidson et al., 2011; Eisenberg et al., 2016; Köse et al., 2022; Moberly, Houston, et al., 2017; Zaltz et al., 2020). We hypothesized that non-verbal working memory, processing speed, and reasoning could influence the results of the DTT and the Lilliput as well. For the DTT, reasoning and fluid intelligence are potentially needed to understand and perform the self-test. Non-verbal working memory is required to store the three-digit sequences, visual working memory to remember the places of the buttons, and processing speed to rehearse the digits while looking for the buttons. For the Lilliput, we expected processing speed to affect lexical retrieval, as shown by Blomquist et al. (2021). Both the Lilliput and the DTT were only influenced minimally by the cognitive skills investigated in this research. Fluid and non-verbal intelligence did not significantly influence the Lilliput or the DTT. Significant relations were observed between processing speed and the DTT for NH children. However, the regression coefficient of −1.0 indicates that for every point (1 item/s) scored better on the RAN numbers, the DTT SRT improved by only 1.0 dB SNR. The SD of the RAN score in the group of children with NH is 0.5 item/s, so for the central 68% of the children (within ±1 SD of the mean, a span of 1.0 item/s), processing speed can cause at most a 1.0 dB SNR difference on the DTT, which is smaller than the measurement error of these tests and can therefore be seen as clinically irrelevant.
For NH children, the processing speed of objects was significantly related to the results on the Lilliput at −5 dB SNR. However, the regression coefficient is again small (8.0), which indicates that within 68% of the children, only a difference of 3.2% on the Lilliput was induced, which is, again, smaller than the measurement error. It is possible that we did not observe a relation between processing speed and the results of the Lilliput for children with CIs because of different types of speech material. While Blomquist et al. (2021) used sentence material, we used monosyllabic words. Sentence material is likely to be more cognitively demanding than short, single words and therefore more dependent on processes such as processing speed. The limited influence of cognitive abilities on the results of the Lilliput and the DTT indicates that both tests are probably less closely related to real-life situations where these cognitive abilities are needed. Nevertheless, the high correlation between the DTT and a sentence-in-noise task in previous research indicates that the DTT is probably a good estimation of the auditory aspect needed for SPIN understanding (Smits et al., 2004).
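The "within 68% of the children" reasoning above can be retraced with a short calculation. The sketch below treats the reported coefficients as regression slopes and multiplies each by the width of the ±1 SD band of the predictor. The function name, the unit assumed for the Lilliput coefficient (% per item/s), and the SD for RAN objects are assumptions: the text does not report that SD, so it is back-calculated here from the reported 3.2% difference.

```python
def band_effect(slope, sd, n_sd=1.0):
    """Outcome difference across the band mean +/- n_sd * sd of the predictor."""
    return abs(slope) * 2.0 * n_sd * sd


# DTT vs. RAN numbers: slope magnitude and SD as reported in the text.
dtt_slope_db_per_item_s = 1.0
ran_numbers_sd = 0.5  # item/s
dtt_effect_db = band_effect(dtt_slope_db_per_item_s, ran_numbers_sd)

# Lilliput at -5 dB SNR vs. RAN objects: the text gives the coefficient (8.0)
# and the resulting difference (3.2%), but not the SD. The SD below is
# back-calculated under the same +/- 1 SD reasoning; it is NOT a reported value.
lilliput_slope_pct_per_item_s = 8.0  # assumed unit: % per item/s
implied_ran_objects_sd = 3.2 / (2.0 * lilliput_slope_pct_per_item_s)
lilliput_effect_pct = band_effect(lilliput_slope_pct_per_item_s,
                                  implied_ran_objects_sd)

# Measurement errors reported in the text, used as the yardstick.
dtt_measurement_error_db = 1.4        # lower bound of the reported 1.4 to 2 dB
lilliput_measurement_error_pct = 7.7  # NH children, Lilliput at -5 dB SNR

print(f"DTT: {dtt_effect_db:.1f} dB across +/- 1 SD "
      f"(measurement error at least {dtt_measurement_error_db} dB)")
print(f"Lilliput: {lilliput_effect_pct:.1f}% across +/- 1 SD "
      f"(measurement error {lilliput_measurement_error_pct}%)")
```

Under these assumptions both effects stay below the corresponding measurement error, which is the basis for calling them clinically irrelevant.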
Both the monosyllabic word-in-noise test and the DTT seem feasible tests to assess the auditory aspect of SPIN performance for children with CIs. Due to differences in test format, both tests can be useful in different situations. An open-set, monosyllabic word-in-noise task, such as the Lilliput, requires listeners to compare the stimulus word to all possible candidate words in their lexical memory. In contrast, in closed-set tests, the listeners need to make only a limited number of comparisons among the response alternatives shown at that moment. The advantage of an open-set test is the information it gives on the specific confusions made by the listener, giving insight into which phonemes are most difficult for that listener specifically. However, an open-set response format relies more on linguistic skills and cannot be done at home, as a test administrator is needed to score the answers. In contrast, a SPIN test with a closed-set format, such as the DTT, can be done as an adaptive self-test without a test administrator. Adaptive procedures are very efficient, and their results, SRT-values, are easily comparable with norm values or with results obtained earlier by the patient. In previous research with adults with a CI, the DTT was very useful for telehealth as it could be done remotely, reducing the number of times the patients have to drive to the clinic (Cullington & Aidi, 2017). Monosyllabic word lists, such as the Lilliput, are less suitable to be presented in an adaptive procedure, as the slope of the psychometric curve of the speech material is too shallow (about 6%/dB), and SRT-values would be estimated with too little precision (van Wieringen & Wouters, 2022).
Therefore, the DTT can be particularly useful when testing at home, for example, in between appointments with a clinician, or when the patients have limited linguistic skills. An open-set, monosyllabic word-in-noise test, such as the Lilliput, can be helpful when estimating auditory skills in detail (van Wieringen & Wouters, 2022), provided that the linguistic skills of the patient allow it and a clinician can be present.
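The link between a shallow psychometric slope and imprecise adaptive SRT estimates can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the procedure used in this study: it assumes a logistic psychometric function, a simple 1-up/1-down track with 2 dB steps, 26 trials of which the last 20 are averaged, and a hypothetical steeper comparison slope of 20%/dB. Only the shallow slope of about 6%/dB comes from the text.

```python
import math
import random
import statistics


def p_correct(snr_db, srt_db, slope_per_db):
    """Logistic psychometric function; slope_per_db is the slope at 50% correct."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


def run_staircase(slope_per_db, rng, srt_db=0.0, start_db=4.0,
                  step_db=2.0, n_trials=26, n_scored=20):
    """Simulate one 1-up/1-down track and return the mean SNR of the last trials.

    All procedure parameters are illustrative assumptions.
    """
    snr = start_db
    levels = []
    for _ in range(n_trials):
        levels.append(snr)
        correct = rng.random() < p_correct(snr, srt_db, slope_per_db)
        # Make the next trial harder after a correct response, easier otherwise.
        snr += -step_db if correct else step_db
    return statistics.mean(levels[-n_scored:])


def srt_spread(slope_per_db, n_runs=2000, seed=1):
    """SD of the SRT estimate over repeated simulated tracks."""
    rng = random.Random(seed)
    estimates = [run_staircase(slope_per_db, rng) for _ in range(n_runs)]
    return statistics.stdev(estimates)


sd_steep = srt_spread(0.20)    # hypothetical steep slope (20%/dB)
sd_shallow = srt_spread(0.06)  # shallow slope of about 6%/dB, as in the text

print(f"SD of SRT estimate, steep slope:   {sd_steep:.2f} dB")
print(f"SD of SRT estimate, shallow slope: {sd_shallow:.2f} dB")
```

With a shallow slope, a 2 dB step barely changes the probability of a correct response, so the track wanders further from the true SRT and the averaged estimate varies more between runs.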
Limitations of the Study and Future Research
Our study investigated the feasibility of the DTT and a monosyllabic, word-in-noise task, the Lilliput, to investigate the auditory aspects of SPIN understanding of young children with CIs, a population that is characterized by the large variability in cognitive and language skills (AuBuchon et al., 2015; Boons, Brokx, Dhooge, et al., 2012; Kenett et al., 2013; Kronenberger et al., 2013, 2018; Moberly, Pisoni, et al., 2017; Wass et al., 2008; Yehudai et al., 2011), whether or not related to specific demographics (Boons, Brokx, Frijns, et al., 2012; Choi et al., 2017; Dettman et al., 2013, 2016; Sarant et al., 2001; Tajudeen et al., 2010).
However, it is important to note that because the population of children with CIs tested in this research shows this expected, high heterogeneity, some of the estimations on the influence of specific demographics on the SPIN understanding performance have limited power and can, therefore, only be seen as an indication. For example, a relation between age at implantation and SPIN understanding performance was expected (Dettman et al., 2016; Tajudeen et al., 2010) and confirmed in our results. However, in our study, not all children were born deaf, meaning some could have experienced auditory input before cochlear implantation. Our sample size did not allow us to break down the analysis into smaller groups based on hearing status at birth.
Additionally, we investigated the use of two SPIN tests for children with CIs in ME and SE. It is expected that children in SE show lower language scores (Boons, Brokx, Dhooge, et al., 2012; Yehudai et al., 2011), which would make a test like the DTT that only requires the knowledge of digits up to ten and is therefore little influenced by language skills (Potgieter et al., 2018) particularly useful. Our research confirmed the hypothesis that the DTT was more suitable for children with CIs in SE than the Lilliput, a test that likely requires more language skills (Wilson et al., 2010). However, as we did not include any standardized language tests, we cannot give a conclusive answer on the exact influence of language skills on both SPIN tests. Furthermore, we did not include bilingual NH children, so that a comparison of performance between bilingual children with CIs and bilingual NH children is not possible.
Moreover, it is important to note that, even though the DTT is a good option for children with CIs with limited language skills, it could still be a highly challenging task for children with even lower language skills, for example, after recent immigration. For these children, alternatives such as fully language-independent testing (Van den Borre et al., 2022) or heritage language testing could be useful. In addition, our study excluded children with CIs with additional disabilities unrelated to their hearing loss. As 30% to 40% of children receiving CIs have a comorbid disorder (Johnson & Wiley, 2009), it would be interesting to perform follow-up research to investigate the feasibility and validity of the DTT and the Lilliput for this population as well.
Conclusion
This research showed that the DTT is a suitable SPIN test to investigate the auditory aspect of SPIN understanding of children with CIs in ME and SE in elementary school. A monosyllabic word test seems feasible for children with CIs in ME but is potentially influenced by linguistic skills, making it less suitable for children with CIs in SE. Both the monosyllabic word task and the DTT show little influence of cognitive abilities, making both tests useful in situations where the bottom-up auditory aspect of SPIN performance needs to be investigated or in situations where sentence-in-noise tests are too challenging.
Footnotes
Acknowledgments
The authors thank all the children, their parents, and teachers who participated in our research project.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a TBM-FWO grant from the Research Foundation-Flanders (grant number T002216N). The authors acknowledge the financial support from the Legacy Ghislaine Heylen (Tyberghein). NV has a senior clinical investigator fund (Flemish Research Foundation 1804816N).
