Abstract
Among the anxiety disorders of childhood and adolescence, selective mutism is characterized by a consistent failure to speak that interferes with social communication and academic performance and invariably requires clinical intervention. Since children and adolescents with this condition may present cognitive difficulties, further investigation is needed to identify a neuropsychological profile. Traditionally, the interventions most used in the literature involve drug treatment and cognitive behavioral therapy. In support of traditional treatment, new technologies are being explored to improve the results already achieved, such as software systems and augmented reality used to support cognitive behavioral therapy. In line with these technologies, this article presents an unprecedented clinical trial, in the context of neuropsychological assessment, in which a humanoid robot is used as a verbal communication tool between the psychologist and the child. The pilot study involved four children with selective mutism. In two cases the approach showed very promising results; in the other two, the children remained mute and chose not to interact with the robot. The results of this initial clinical trial motivated the team to extend the experiment by including new cases of selective mutism.
Introduction
Selective mutism (SM) is a rare disorder linked to anxiety. It is defined by the inability to speak in specific social situations, even when the person shows normal language skills. SM is commonly associated with extreme shyness, fear of social embarrassment, obsessive-compulsive tendencies, isolation, clingy behavior, and temper tantrums. Additionally, individuals with SM often receive diagnoses of other anxiety disorders such as social anxiety disorder and separation anxiety disorder. 1
SM is categorized as an anxiety disorder in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). 2 This categorization reinforces approaches that emphasize the treatment of SM in conjunction with the treatment of underlying anxieties. Symptoms generally manifest in children aged between 2 and 5 years 3 but are often only clinically recognized once children start attending school.4,5
Children diagnosed with SM may have difficulty in certain situations, such as school, restaurants, shopping malls, and parks. 6 SM is a rare condition, with a prevalence of < 1% in the population. 4 Typically, the disorder follows a natural course of remission after 6 months. However, persistent symptoms lasting longer than 6 months or the presence of symptoms after the age of 10 are considered unfavorable prognostic indicators. 7
The most extensively researched cognitive ability in children with SM is language, as per the findings of McInnes et al. 8 Besides language deficits, motor challenges and social difficulties have also been related to SM, as reported by Kristensen and Oerbeck. 9 In clinical practice, individuals with this disorder show inhibition, shyness, and a tendency to remain silent. They may exhibit signs such as avoiding eye contact, feeling embarrassed, and showing reluctance to separate from their parents. 10 The result of these symptoms is a communication barrier that makes it difficult to properly assess and treat the disorder. Consequently, the strategy for evaluating and following up on the individual’s progress primarily involves their parents and teachers.
In this process, neuropsychological assessment is used as a way of obtaining cognitive and behavioral mapping that makes it possible to describe aspects of cognition, personality characteristics, social behavior, emotional state, and adjustment conditions. This information is essential for understanding functionality and conditions for adapting to practical life.
The objective of the pilot study presented here is to evaluate the use of a humanoid robot as an aid tool in verbal tests for the neuropsychological assessment of children with SM.
We consider that humanoid robots have the potential to interact naturally with patients, creating a more comfortable and less intimidating environment than interacting with a human. Additionally, they can be programmed to provide positive and encouraging feedback, which can help reduce children’s anxiety during the assessment.
To the best of our knowledge, this proposal is unprecedented in SM neuropsychological assessment, where interaction is traditionally carried out through writing by the child or through the intermediation of parents or guardians.
It is important to highlight that a complete analysis of this proposal considering statistical analysis and control groups is unfeasible due to the rarity of the disorder. In fact, in a period of 10 years of work in the psychiatric outpatient clinic, the neuropsychologist involved in this work identified only 14 cases of the disorder. In this scenario, this article reports a pilot study involving just four research subjects.
This article is organized as follows. The “Related work” section presents studies on the use of computational or robotic agents in SM treatment. The “Methods” section describes the methodology adopted to use a humanoid robot as an intermediary agent in the interaction with the child and compares it with other methodologies. The “Case study” section presents the implementation environment, the characteristics of the research subjects, and the experiments carried out, and discusses the results achieved in the neuropsychological assessment stage. Finally, the “Conclusion and future work” section presents the conclusion and future perspectives for the use of the proposed system.
Related work
A bibliographic search carried out in the main collection of Web of Science, covering all the years (1945–2023), with the search key child* AND “selective mutism” AND
The first reference identified is the work of Ooi et al., 12 in which a randomized clinical trial (RCT) was proposed using cognitive-behavioral therapy (CBT) in children with SM. The information technology aspect lies in delivering CBT with the support of a computer application. The trial involved 21 children randomly assigned to either the CBT experimental group (11) or the control group (10). The findings suggest that an online CBT program has the potential to enhance oral communication and alleviate the symptoms of SM.
The second reference, proposed by Tan et al., 13 presents a clinical trial based on Ooi et al.’s previous work. 12 The study highlights the challenges of using CBT, such as the considerable effort required to expose children to anxiety-inducing situations in traditional clinical sessions. The authors suggest that virtual reality exposure therapy (VRET) could be a useful extension of CBT. This study involved 20 children who completed six therapist-supervised VRET sessions. All participants successfully completed the sessions and the results indicated the viability of the strategy as an adjuvant modality (and not a substitute) for CBT in the treatment of SM.
Finally, the third reference identified in the search, the work of Manivannan and Fails, 14 highlights the main strategies for using technology as a tool to help children with SM. One strategy is to use technology to relieve social or environmental stress. Another is to establish communication between teachers and children through emails, video conferencing, or audio messages. It is important to note that these children usually use tablets and smartphones for entertainment and sometimes to communicate with teachers or relatives. However, they rarely use these media to communicate with other children, preferring games that require full attention.
Neuropsychological assessment strategies in children with SM
Neuropsychological assessment is used to map cognitive and behavioral aspects, including cognition, personality, social behavior, emotional state, and adjustment. It is essential for understanding functionality and adaptation in daily life. The assessment involves interviews, questionnaires, and tests to analyze the patient’s performance with relative precision. However, it is crucial to prepare the patient for the assessment by seeking their cooperation and motivation to alleviate anxiety, which can impact performance, particularly in cases of SM. 15
In the literature, there is an absence of works that directly address neuropsychological assessment in the context of SM. In fact, there is still a lack of consensus in the community regarding instruments for screening or diagnosing the core symptoms of SM. 16
Klein et al. 17 also suggested that there are few studies on standardized and normative measures for expressive language competence within the scope of SM. Standardized tests are necessary to estimate a person’s abilities in relation to their age group. However, applying them without adaptation becomes unfeasible as children with SM do not communicate with professionals or in certain environments. The utilization of new technologies provides an alternative to overcome this barrier.
This study employs a set of neuropsychological tests in conjunction with a humanoid robot to aid in the assessment of expressive language competence of children with SM.
Methods
In this section, we describe the methodology used for utilizing a humanoid robot as an intermediary between a psychologist and a child during neuropsychological assessment sessions.
First, we present the procedure followed by patients treated in the outpatient clinic. Next, we detail the neuropsychological assessment tests that were administered with the robot’s assistance. Finally, we present the system’s design and an overview of the interface components.
Therapeutic procedure phases in the outpatient clinic
To clarify the moment in the therapeutic process where the humanoid robot was inserted as a support tool, we present in this section the steps taken by patients at the Psychiatry Institute.
Figure 1 shows the phases of this process. It is important to highlight that the robot is only used in phase 3 where the neuropsychological assessment takes place.
Phase 1: The guardian who spontaneously sought treatment for the child takes part in a telephone screening. Once accepted into the program, they are directed to a psychiatric evaluation.
Phase 2: Psychiatric assessment is carried out according to DSM-5 2 criteria, through review of medical records, clinical interview, objective anamnesis, and psychic examination. Children who meet the diagnostic criteria for SM are referred for neuropsychological evaluation. Other disorders, such as psychotic and behavioral disorders, have a different evolution and are not targets of this study.
Phase 3: Neuropsychological assessment is carried out using tests adapted to the symptoms of SM. The objective is to investigate cognitive performance, emphasizing attentional, mnestic, and executive functions. After this assessment, the result becomes part of the medical record and the child returns to the psychiatrist to continue treatment.
Phase 4: The psychiatrist defines the treatment for the disorder. At this stage, those responsible for the child are called for a feedback interview and given information about the next steps of the therapeutic process.

Figure 1. Phases of the therapeutic process for selective mutism at the Psychiatry Institute.
Neuropsychological assessment tests
Below are the subtests used:
The first subtest was Vocabulary, which is part of the Wechsler Abbreviated Scale of Intelligence (WASI). 18 This scale is made up of four subtests, two verbal and two non-verbal, and provides an estimated intelligence quotient that is widely used in research. The activity consists of orally explaining the meaning of the words presented to the subject. The test assesses the individual’s ability to express verbal knowledge and their store of information. In addition, it serves as a gauge of crystallized and general intelligence, allowing cognitive abilities such as memory, learning capacity, concept formation, and language development to be assessed.
The second subtest is Story Memory, which is part of the Wide Range Assessment of Memory and Learning (WRAML). 19 This scale is made up of six subtests that offer verbal or visual stimuli. In the Story Memory subtest, the subject is asked to repeat, as closely as possible, the story they have just heard from the examiner. Two stories are presented with content appropriate for the subject’s age. After a controlled time interval, the subject is asked to tell the same stories again. Next, multiple-choice questions are asked about each story, in which the subject must recognize the correct answer. This subtest makes it possible to assess short-term and long-term memory for contextualized verbal stimuli, as well as recognition memory for these contents.
The third subtest is Verbal Learning, which is also part of the WRAML. 19 The activity consists of presenting the subject with a list of words four consecutive times. On each attempt, the subject is asked to repeat as many as possible of the words they have just heard from the examiner. After a controlled time interval, the subject is again asked to say the words they remember. Next, several words are spoken so that the subject can recognize the ones they heard previously. This subtest thus makes it possible to assess the learning of verbal content, immediate recall, and recognition ability.
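The structure of the Verbal Learning procedure (four learning trials, one delayed recall, one recognition step) can be illustrated with a short Python sketch. It only tallies raw counts; the word list, the class and function names, and the example responses are all hypothetical, and nothing here reproduces the standardized WRAML scoring or its norms.

```python
from dataclasses import dataclass, field


@dataclass
class VerbalLearningRecord:
    """Raw tallies for one administration (hypothetical structure, not WRAML scoring)."""
    word_list: list
    trial_counts: list = field(default_factory=list)
    delayed_count: int = 0
    recognition_hits: int = 0


def count_recalled(word_list, responses):
    """Count distinct list words among the child's responses (case-insensitive)."""
    targets = {w.lower() for w in word_list}
    said = {r.strip().lower() for r in responses}
    return len(targets & said)


def tally_session(word_list, trials, delayed, recognized):
    """Tally four immediate trials, one delayed recall, and the recognition step."""
    if len(trials) != 4:
        raise ValueError("the procedure described presents the list four times")
    record = VerbalLearningRecord(word_list=list(word_list))
    record.trial_counts = [count_recalled(word_list, t) for t in trials]
    record.delayed_count = count_recalled(word_list, delayed)
    # Words the child endorsed that were not on the list are simply not counted.
    record.recognition_hits = count_recalled(word_list, recognized)
    return record


# Placeholder words and responses, invented for illustration only.
words = ["sun", "door", "milk", "bird"]
example = tally_session(
    words,
    trials=[["sun"], ["sun", "milk"], ["Sun", "milk", "bird"],
            ["sun", "door", "milk", "bird"]],
    delayed=["milk", "bird", "car"],
    recognized=["sun", "door", "tree"],
)
```

In this invented example the per-trial counts rise across the four presentations, which is the learning pattern the subtest is designed to reveal.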
System architecture
The proposed architecture is based on the method known as the Wizard of Oz, as described by Riek. 20 In this approach, an operator remains in a remote position and manipulates a robot through real-time commands, giving it a level of behavior and intelligence that autonomous robots are not yet capable of. The Wizard of Oz method requires three agents: the magician, Oz, and the user. In the proposed system, the psychologist is the magician, who remotely operates Oz. Oz is the robot, which enables the child (the user) to interact with the magician. This scenario creates an illusion in which the user believes that the robot (Oz) is alive and acting alone, as illustrated in Figure 2.

Figure 2. Illustration of the Wizard of Oz method.
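The division of roles in the Wizard of Oz method can be sketched in a few lines of Python. This is a minimal model, assuming a stand-in robot that only records the commands it receives; the names `StubRobot`, `Wizard`, `perform`, `send`, and `observe`, as well as the command labels, are hypothetical and are not the GUIPsyin or NAOqi interfaces.

```python
class StubRobot:
    """Stand-in for Oz: records the commands it receives instead of moving."""

    def __init__(self):
        self.executed = []

    def perform(self, command):
        self.executed.append(command)


class Wizard:
    """Remote operator: every action the child sees goes through the robot."""

    def __init__(self, robot):
        self.robot = robot
        self.log = []  # (source, event) pairs in chronological order

    def send(self, command):
        # The operator never acts in the room directly, only via the robot.
        self.robot.perform(command)
        self.log.append(("wizard", command))

    def observe(self, child_response):
        # In this sketch, responses are assumed to reach the operator remotely.
        self.log.append(("child", child_response))


robot = StubRobot()
wizard = Wizard(robot)
wizard.send("wave")                  # hypothetical behavior label
wizard.observe("child waves back")
wizard.send("nod")                   # hypothetical behavior label
```

The point of the sketch is the indirection: the user's side only ever sees what `StubRobot` executes, while the log on the operator's side keeps both directions of the exchange.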
System components
For the Wizard of Oz method to be implemented, two environments are needed. One is where the child interacts with the robot (interview room) and the other where the psychologist operates the robot (operating room). In addition, other resources are needed to complement the system. These include a multimedia system responsible for capturing audio and video from the interview room, and another one responsible for sending audio from the operating room to the interview room.
Figure 3 shows the equipment used in the system: the humanoid robot, a notebook, a router, a headset for the psychologist, a lapel microphone for the child, a mini Bluetooth speaker, and a cell phone used as a camera for viewing the interview room. Some applications are also needed: MorphVox Pro, 21 which alters the psychologist’s voice to prevent the child from recognizing it; OBS Studio, 22 which records the interaction; and our interface, named GUIPsyin, 23 which allows the psychologist to operate the robot remotely. A video that presents the experiment scenario can be accessed at https://bit.ly/3ztz0mP.

Figure 3. Architecture of the system used for the Wizard of Oz method.
Humanoid robot
Figure 4 shows the NAO humanoid robot, developed by Aldebaran Robotics. 24 The robot can be programmed in four languages: C++, Java, JavaScript, and Python. In addition to direct programming, it is also possible to create behaviors for the robot using Choregraphe, 24 a cross-platform desktop application. Another option for creating behaviors is to integrate Python code written by the programmer with NAOqi, 24 a framework that enables the robot to interpret and execute commands.

Figure 4. The NAO humanoid robot.
GUIPsyin, an interface for NAO robot manipulation
As mentioned above, in order for the psychologist to be able to remotely manipulate the robot, a graphical user interface (GUI) called GUIPsyin was developed (Figure 5). This interface allows the psychologist to perform pre-programmed movements, as well as providing a first-person view of the robot. In this way, the psychologist operating the robot has the experience of being in direct contact with the child.

Figure 5. GUIPsyin: interface developed for manipulating the NAO robot.
The interface is made up of five modules, each responsible for one functionality. The first module “Settings,” identified by the number 1 in Figure 5, is responsible for connecting to the robot. Once the connection has been established, the operator has access to the robot’s front camera.
In the second module, called “Session” and identified by the number 2 in Figure 5, we have the functionalities for operating the robot. The operator has a set of buttons for controlling the robot’s head and for initiating engagement with the subject with whom the robot will interact. The third module (block number 3 in Figure 5) is associated with the “Emergency button” which switches off the robot’s motors.
In the fourth module, called “Movements” (block 4 in Figure 5), a set of buttons is provided, each representing a behavior that the operator can trigger, as shown in Figure 6.

Figure 6. Examples of behaviors that the robot can express through the movement module.
Finally, still in Figure 5, we have in block 5 the warning area which signals to the operator the system execution status.
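The way the five blocks relate to one another can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the GUIPsyin source: the class `OperatorPanel`, its method names, the command strings, and the address `robot.local` are hypothetical, and the rule that commands are ignored after an emergency stop is an assumption made for the example.

```python
class OperatorPanel:
    """Toy model of a five-block operator panel (all names are hypothetical)."""

    def __init__(self, behaviors):
        self.behaviors = set(behaviors)  # block 4: available movement buttons
        self.connected = False           # block 1: connection state
        self.motors_on = False
        self.status = []                 # block 5: messages shown to the operator
        self.sent = []                   # commands that would go to the robot

    def connect(self, address):
        # Block 1 ("Settings"): establish the connection to the robot.
        self.connected = True
        self.motors_on = True
        self.status.append("connected to " + address)

    def move_head(self, direction):
        # Block 2 ("Session"): head control.
        self._send("head:" + direction)

    def play(self, behavior):
        # Block 4 ("Movements"): trigger a pre-programmed behavior.
        if behavior not in self.behaviors:
            self.status.append("unknown behavior: " + behavior)
            return
        self._send("behavior:" + behavior)

    def emergency_stop(self):
        # Block 3 ("Emergency button"): switch off the motors.
        self.motors_on = False
        self.status.append("emergency stop: motors off")

    def _send(self, command):
        if not self.connected:
            self.status.append("not connected")
        elif not self.motors_on:
            # Assumption of this sketch: commands are dropped once motors are off.
            self.status.append("motors off, ignored: " + command)
        else:
            self.sent.append(command)
            self.status.append("sent " + command)


panel = OperatorPanel(["wave", "nod"])
panel.play("wave")              # rejected: not connected yet
panel.connect("robot.local")    # hypothetical address
panel.move_head("left")
panel.play("wave")
panel.emergency_stop()
panel.play("nod")               # dropped: motors are off
```

Routing every command through a single `_send` method keeps the connection check, the emergency state, and the status messages in one place, which is one simple way to make sure the warning area always reflects what actually reached the robot.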
Comparison with other methods
In the “Related work” section, we reported that no works were found associating SM and humanoid robots. When the scope of the search was expanded to information technology and SM, only three studies were located. Two of these works 12,13 involve the use of cognitive-behavioral therapy (CBT), whereas the third 14 evaluates the use of technology as a path to the integration of children excluded by the symptoms of the disorder.
Unfortunately, none of these works cover neuropsychological assessment (Phase 3 in Figure 1), the focus of this research. In this context, our proposal can only be compared with the traditional method of neuropsychological assessment, adapted for the symptoms of SM. In this adapted assessment, communication with children occurs exclusively through signs, through parents acting as intermediaries, or, for literate children, through writing. This scenario leads the evaluator to feel distressed due to the child’s silence and the difficulty in interacting during the evaluation.
It is worth mentioning that, according to the testimony of the neuropsychologist who participated in the experiments, the presence of the robot allowed the dynamics of communication to be accelerated, in addition to enabling the observation of other elements such as facial expression and body posture during the tests.
Case study
This study was approved by the Research Ethics Committee of the Hospital das Clínicas of the USP Medicine Faculty – HCFM/USP (opinion 5.602.817) and registered on the Brazil Platform 25 under the Certificate of Submission for Ethical Appraisal – CAAE # 56817022.7.3002.0068.
As stated before, this pilot study describes the use of a humanoid robot as an auxiliary tool for neuropsychological assessments of verbal tasks in children with SM. The neuropsychological assessment tests were only carried out after the guardians had signed a consent form and the child had agreed to interact with the NAO humanoid robot.
Next, the child was shown a video in which a robot invited the child to interact with it. If the child agreed to participate, they and their parents were escorted to a room where the robot was statically positioned. The psychologist introduced the robot to the child and left the room.
Subsequently, the psychologist moved to the adjacent room, where the robot control system was located, and initiated the interaction. The psychologist began the assessment by asking initial questions such as “What is your name?” and “How old are you?” Following this, a playful activity was proposed that used the verbal tasks described in the “Neuropsychological assessment tests” subsection. The child performed the given tasks while the psychologist continuously monitored whether the child wanted to proceed. If the child lost interest in the activity, the interaction was stopped.
Participants
The participants, detailed in Table 1, were recruited over a 12-month period beginning in September 2022. The inclusion criterion was to select children aged between 6 and 12 years, with a primary diagnosis of SM, identified in the group of patients being treated at the Outpatient Clinic for Anxiety in Childhood and Adolescence at the Child and Adolescent Psychiatry Service from the HC/FMUSP Psychiatry Institute. Except for the age range restriction, no exclusion criteria were used in this study, especially considering the rarity of cases that met the DSM-5 2 diagnostic criteria for SM.
Table 1. Participant description.
Results and discussion
As mentioned in the previous section, the interaction with the child, using the humanoid robot as an intermediary, was based on the application of the following tests:
Wechsler Abbreviated Scale of Intelligence (WASI)
Wide Range Assessment of Memory and Learning (WRAML)
In the WASI, the robot was used to carry out the Vocabulary subtest. In this test, several words are presented to the child, who is asked to verbally describe their meaning.
In the WRAML, two verbal tests were applied: the Story Memory subtest and the Verbal Learning subtest. In the first, the patient listens to two stories and is subsequently asked to repeat them in as much detail as possible.
In the second, a sequence of words is presented to the patient, who is then asked to repeat as many of those words as they can remember. The list is presented four times in total, with the aim that the patient remembers more words with each repetition.
Table 2 shows a summary of the information associated with the verbal tests carried out with the patients using the NAO robot. This table shows the duration of the interaction with the robot, the different configurations of people in the interview room, the verbal tests carried out and comments on the interaction.
Table 2. Summary of interactions with research subjects.
The following paragraphs provide additional comments on the behavior of some research subjects, complementing the information in Table 2.
In his first session (25/05), research subject P1 answered only “yes” or “no” in the Vocabulary subtest. In the following session (29/06), he responded appropriately to the Story Memory subtest. The test result was not fully evaluated because he did not answer the first task appropriately. In this second interaction, P1 was visibly fatigued after a physiotherapy session that morning, which left him in no mood to interact with the robot.
Patient P2 was more enthusiastic about engaging with the humanoid robot, which made it possible to conduct all the verbal tests in a total of 30 min. In this experiment, the use of the humanoid robot as a play agent allowed the psychologist to establish a direct dialogue with the child and complete the verbal tests without adjustments such as parental reports or other methods in which the child does not speak directly with the psychologist.
In the second interaction (29/06), another psychologist was invited to be present, in silence, in the interview room. The idea was to see whether P2’s behavior would change in the presence of a person from outside his environment. P2’s attitude remained unchanged: he ignored the other psychologist, which again facilitated the completion of the Vocabulary subtest.
In the first interaction with P3 (27/07), resistance and a certain apathy on the part of the research subject were noted. However, in the second interaction (24/08), P3 engaged in a dialogue with the psychologist with the help of the robotic agent; no test was carried out, only an initial familiarization. It is worth mentioning that on that day P3 was taking an antidepressant (fluoxetine), which helps reduce anxiety and consequently reduces some SM symptoms. In the third interaction (28/09), the Vocabulary subtest was applied again, with the aim of verifying whether patient P3 would perform better. As shown in the table, all proposed verbal subtests could be completed in a total time of 40 min. In this interaction, P3 was still taking the medication.
Finally, research subject P4 demonstrated apathy and chose not to interact with the NAO robot. She later told her mother that she was feeling scared. When her mother asked a second time, P4 admitted that she had recognized the psychologist’s voice coming from the next room. As a result, in the second interaction (28/09), P4 did not perform the test with the robot, and the verbal subtests were administered directly by the neuropsychologist using her mother as an intermediary.
The results of each subtest have been compiled in Table 3. The verbal tests were carried out with the help of the NAO robot, while the non-verbal tests were conducted directly by the neuropsychologist. It can be seen that most of the children performed below average in the verbal tasks, which was expected due to their SM condition. However, with the support of the robot, the neuropsychologist was able to interact directly with the child without the need for parental intermediation or the use of other communication resources.
Table 3. Test results.
Note:
(I) Due to a system failure, the WRAML – Verbal Learning test was carried out directly by the neuropsychologist.
(II) The patient refused to interact with the robot, because she suspected that the robot’s voice was that of the psychologist.
WASI: Wechsler Abbreviated Scale of Intelligence; WRAML: Wide Range Assessment of Memory and Learning.
Conclusion and future work
In this work, we propose a computational system that uses a humanoid robot to mediate between a psychologist and a child diagnosed with Selective Mutism during the neuropsychological assessment process.
The interaction with the robot enables direct communication with the child in verbal activities. It is worth remembering that traditional interaction normally requires adapting the tasks, such as asking the child to write down the answers (if they are literate) or guiding the child to give the answers to their parents. Both strategies increase the application time and the child’s fatigue, and they require care to avoid parental interference in the answers, since the professional would not be present in the room.
This pilot study showed that using a robot to assess a child’s verbal communication could be a viable alternative. However, due to the rarity of the disorder, the limited number of cases prevented consistent validation of the strategy. The arrival of new cases of SM at the Institute of Psychiatry will enable a deeper analysis of the issue. We also believe that, with the dissemination of this study, other research groups can replicate and refine the research with a larger group of local subjects.
In addition to the inclusion of more subjects, new challenges must be overcome, such as adapting assessment protocols to explicitly include interaction with the robot. Despite this, under the careful supervision of psychologists, research in this area continues to advance and offers new perspectives for the diagnosis and treatment of SM in children.
However, more research is needed, and based on this pilot study, some new lines of research should also guide our future work:
Confident: So far a guardian remains in the interview room. Removing this guardian can increase the child’s confidence and stimulate a richer dialog with the humanoid robot. This dialog can be directed towards determining the causes of the underlying anxiety that often accompanies SM.
Progressive: The humanoid robot can encourage the child to interact with human beings invited to take part in the experiment. In this process, the robot gradually reduces its speech in favor of the new people taking part in the interaction. Note that the ultimate goal is to evolve the process towards removing the robot from the environment.
Community: Instead of using individualized interactions, other research subjects who have already interacted with the robot are invited to take part in group sessions.
Reverse: The child with mutism is trained to operate the robot remotely. The aim is to use the robot as an instrument to insert the child into environments where they traditionally remain silent. The strategy uses the advantages of the Wizard of Oz method to motivate the child to overcome their anxiety.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Msc. Diogo H Godoi is funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001, the Brazilian National Research Council (CNPq), the São Paulo Research Foundation (FAPESP) 2017/01687-0 and FAPESP 2018/25782-5, and the National Institute of Science and Technology (INCT-CAPES-CNPq-FAPESP) under the grants #88887.136349/2017-00, CNPq 465755/2014-3 and 2014/50851-0.
