Abstract
Objectives
Art practice is known to yield numerous benefits, and some studies have explored its role in academic learning and child development. The Paris Philharmonic initiated a project, called Exist with the Voice Together (EVE in French), which is focused on reinforced music and choral singing practice in French middle schools. This study aims to assess the impact of this program on adolescents’ self-esteem and oral skills.
Methods
Students (N = 92) were randomly included in the EVE program or a control group. Self-esteem was assessed using the Rosenberg scale, and oral reading skills were measured using a dedicated observation guide. Data were collected at the beginning of the project (T0), after one year (T1), and at the end of the project, after a year and a half (T2).
Results
Compared to the control group, the EVE group demonstrated more homogeneity in self-esteem results, even though no significant mean differences were found. Additionally, the EVE group had significantly higher oral reading scores than the control group across all three measurement times. However, the EVE group’s score trajectory showed a decline at the end of the program, which was not observed in the control group, whose scores remained stable.
Conclusion
This study highlights the benefits of art-based practice, such as choral singing, for adolescents, while raising questions about the most appropriate structure for collective projects at that age.
Introduction
Historically, as psychological research grew in importance, it quickly became connected to the educational field. The first cognitive ability tests, for instance, were designed to assist teachers in better identifying students’ needs and difficulties (Kamphaus et al., 2012). Throughout the 20th century, an increasing number of psychologists have devoted their work to exploring the interconnection between these two fields. This convergence even led to the creation of new research fields, such as educational psychology, acknowledging that “the human psyche development, if not confused, is closely interlaced with education” (Maia Filho & Viana Chaves, 2016, p. 316).
In France and worldwide, clinical psychology is now regularly employed to assess pedagogical and educational initiatives and their usefulness for children and adolescents attending school (see Chehaib et al., 2023; Räsänen et al., 2009; Sella et al., 2016; Welch et al., 2010). These diverse initiatives aim to make the educational field a domain of constant innovation and reflection, benefiting the students’ learning and/or well-being. It is critical to assess their efficacy and to facilitate their national dissemination if necessary.
Among the educational initiatives sometimes implemented in schools, art-based approaches to learning are often pointed out as innovative ways of learning as well as fostering group dynamics (see Kisida et al., 2020; Kumar et al., 2022). Music, theater, or visual arts are thus integrated into some schools as part of the school curriculum or made available as options, from primary school to high school. Most studies highlight their benefits for students, such as enhanced learning capacity, interest, involvement and satisfaction in class (Bowen & Kisida, 2019; Kisida et al., 2020; Leonard et al., 2018). This is further supported by the extensively researched association between art practice and well-being in individuals (Davies et al., 2015; Fancourt & Finn, 2019; Mastandrea et al., 2019). As an example, a study conducted in 2016 in 42 Houston public schools with 3rd- to 8th-grade students demonstrated the multiple benefits of arts education (music, dance, theater, and visual arts) for the students involved. These benefits include improvements in student discipline, writing achievement grades, and compassion for others (Bowen & Kisida, 2019). These results show that arts education contributes not only to academic learning but also to the development of social skills. However, studies on this matter also highlight variations across schools regarding the accessibility of arts education for students (Bowen & Kisida, 2019). Furthermore, although teachers in non-arts subjects generally hold a positive opinion on the use of arts in learning, they rarely include it in their classrooms (Kumar et al., 2022; Pantazidou et al., 2021).
Whereas arts in general are shown to have benefits for students when integrated into their education, it might be of interest to explore if specific types of arts yield specific benefits. This study specifically focuses on music and its role in learning and in the development of children entering adolescence. Practicing music and singing often involve bodily engagement, and choral singing in particular calls upon the body, voice, and movement while being part of a group. This can be especially challenging during adolescence, a time when both body and voice are undergoing significant changes (Özdemir et al., 2016; Papageorgi, 2020; Raoul et al., 2024; Roberts et al., 2024). Research in clinical psychology demonstrates the role of music and singing in body acceptance especially in early adolescence, as teenagers experience rapid bodily changes in the context of puberty (Brault, 2020; Miranda, 2013; Welch et al., 2010; Welch, 2011).
Furthermore, the past three decades have seen growing interest in evaluating the social impact of the arts (Belfiore, 2002; Bille & Olsen, 2018; Brown & Novak-Leonard, 2013; Lindström Sol et al., 2022; Majid et al., 2020; Matarasso, 1996; Reeves, 2002). This issue raises complex questions that span across the arts, humanities, economics, and politics. As noted by Brown and Novak-Leonard, “questions about how art affects audiences may never be fully answered” (2013, p. 223). One of the main challenges identified by researchers in this field is the selection of appropriate variables to measure the potential impact of the arts. While it is a commonly shared idea that the arts play a central role in shaping cultures and identities, measuring or evaluating their impact is complicated by the variety of factors at play. Once variables are selected, other questions emerge: What constitutes a positive outcome in the arts? Belfiore (2002) discusses the relationships between arts impact studies and social and economic policies, noting how the arts are often seen as “agents of social change” (p. 92) and therefore used in efforts to address social exclusion, a complex issue to which we cannot bring a unique response – whether by using arts or not. In the present study, with these concerns in mind, the methodology employed was designed to be as straightforward as possible in terms of what was measured and how these measures can be interpreted. Moreover, as discussed further in this article, the results of the present study underscore the complexity of evaluating the impact of the arts and the variety of factors that must be considered.
This study aims to explore the role of music and singing in early adolescence by evaluating a music program implemented in 11 middle schools in the Parisian region of France. As part of its educational and pedagogical missions, the Paris Philharmonic (Cité de la Musique – Philharmonie de Paris) initiated a project called “Exister avec la Voix Ensemble” (“Exist with the Voice Together”), abbreviated as project EVE. Through this program, the Philharmonic offers adolescents reinforced practice in music and choral singing. Adolescents enrolled in the program all benefit from one hour a week of choral singing practice, as well as regular rehearsals for a choral concert that gathers all 300 students involved at the end of the project. In each of the 11 middle schools where this program has been implemented, one class of either sixth-graders, seventh-graders, or eighth-graders is involved in the project. Thus, in total, 11 classes of students aged 11 to 14 are part of this program. Project EVE responds to a specific preoccupation within the French education system: ensuring that each student has access to arts education and practice, given its well-known developmental impact (Lukaka, 2023). Thus, specific attention is given to the bodily and emotional expression of these students, as well as the quality of the pedagogical relationship between students and the EVE speakers and artists. As the project aims to broaden students’ interests and enhance their well-being, interdisciplinarity is considered a key element in its implementation. This shows not only in the evaluation of this project performed by researchers in psychology but also in the professionals who engage with the students: chorus masters, specialists in Dalcroze pedagogy (Juntunen, 2016), specialists in Alexander technique (Mayers & Babits, 1987), and music therapists.
The musical repertoire is also designed to be diverse: Music professionals offer students the opportunity to work on both historic and contemporary pieces from several continents. For example, they offer a Gregorian antiphon, which also provides an opportunity to work on meditation and body posture, as well as the popular Brazilian song Baianá.
An initial experimental phase was conducted from 2018 to 2021 within two primary schools by another research team. The findings demonstrated a positive effect of the EVE program on the social abilities of students aged 7 to 10, particularly in terms of sharing and cooperating behaviors (Aimé et al., 2021). The program also had a positive impact on school learning, with a significant difference observed in EVE students compared to the control group during a syntactic task, part of the test battery BMT-i, where participants had to repeat grammatically complex sentences (Aimé et al., 2021). Notably, these results were obtained despite the challenges posed by the COVID-19 pandemic and the associated health practices – including the students and speakers wearing masks. This first phase was conducted by a team of cognitive sciences researchers, and because of the observed impact on social abilities, school learning, and overall students’ wellness, it was decided that the second phase could benefit from the viewpoint of clinical psychology.
The second experimental phase took place between 2023 and 2024, involving middle-school students, to examine the effects of the EVE project on adolescents. The present study forms part of a broader research project assessing the EVE program, which was conducted by a team of researchers in clinical psychology and will not be discussed in this article. All team members have clinical experience working with children and/or adolescents, and the lead researcher of the study specializes in music and its role during adolescence. This study was implemented following a call for research projects issued by the institution overseeing the EVE program. As a result, the study was conducted by a different institution than the one managing the EVE project. The researchers were not responsible for defining the EVE program or its content, but rather for evaluating the impact of the project. In this part of the study, the primary aim was to determine whether choral singing could enhance oral academic skills and students’ self-esteem. We hypothesized that the EVE program would have a positive impact on both variables. While collective singing and individual oral reading are distinct activities that require distinct capacities, it can be hypothesized that the skills developed through choral singing – such as breath control, articulation and enunciation, bodily awareness – may transfer to and enhance oral reading skills (Aksoy, 2023; Gutiérrez Cisneros et al., 2023; Strait & Kraus, 2011). These hypotheses were defined by the researchers and submitted to the EVE program's management as part of the call for projects. However, neither the educational team nor the parents or students were involved in the definition of these hypotheses.
Methods
Research Protocol
Among the 11 middle schools participating in the EVE project, three were assigned to this study, with two classes of sixth-graders and one class of seventh-graders. All three schools are located in Paris or its immediate suburbs. Three assessments were performed between January 2023 and June 2024. The first assessment (T0) took place in January 2023, after one to two introductory EVE sessions. The second assessment (T1) took place in November 2023, in the middle of the project, approximately eight months after it began. The last assessment (T2) occurred in June 2024, at the end of the EVE project, after almost a year and a half of choral singing sessions. Timing of measurements was based on the school calendar rather than the EVE calendar due to organizational considerations. Thus, assessment times could not be planned to align with EVE rehearsals or gatherings.
The students participating in the EVE project were randomly selected by the school: A class was chosen at the beginning of the school year, and the project was proposed to the students and their parents. Prior experience in music or singing was not required to participate, and when assessed by music teachers at the start of the program, most of the students had no specific background in either. The EVE project was designed to last a year and a half. For this reason, the same group of students remained together in the EVE classes for two years, with only a few students leaving the project. This contrasts with the usual practice in France, where class compositions change every year.
In each middle school, a class was selected to be the control group. This control-group class was not part of the EVE project or any art-related activities other than the mandatory hour of music and visual arts. The control condition class had been selected by the educational staff, who ensured that the control-group class was similar to the EVE group in terms of general academic levels, population, number of students, and diversity. Due to field constraints, participants knew from the beginning of the study which group they belonged to. The EVE group was aware they were part of an educational project, and the control group was informed about their role in the study. However, one of the two researchers was blind to group membership. To improve the control group's involvement in the study and reduce bias, a museum tour was offered to control group students at the end of the study, after the last measurement time.
The study began with the following population: four classes of sixth-graders of which two were involved in the EVE project, and two classes of seventh-graders of which one was involved in the project. Participants were aged 11 to 13 at the beginning of the study, and 12 to 14 at the end of the study. The only inclusion criterion for participants was to be fluent in French.
Measures
To assess the participants’ self-esteem, the Rosenberg self-esteem scale was used. This scale was validated in its French version in the early 1990s (Gnambs et al., 2018; Rosenberg, 1979; Vallieres & Vallerand, 1990). It was self-administered at the beginning of each of the three assessment times. It is a 10-item scale measuring self-worth through both positive (e.g., “I feel that I have a number of good qualities”) and negative feelings (e.g., “At times I think I am no good at all”) about oneself. Participants answer on a four-point Likert scale ranging from “strongly agree” to “strongly disagree.”
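As an illustration of how a total score can be derived from the ten items, the following minimal sketch sums the responses after reverse-scoring the negatively worded items. The 1-to-4 response coding, the function name `rosenberg_total`, and the positions passed as `reverse_items` are assumptions made for the example; which items are reverse-keyed depends on the item order of the version administered, which is not reported here.

```python
def rosenberg_total(responses, reverse_items, low=1, high=4):
    """Sum a 10-item response vector after reverse-scoring negatively worded items.

    responses: list of 10 integers between low and high (higher = more agreement).
    reverse_items: 1-based positions of the negatively worded items
                   (an assumption here; it depends on the version of the scale).
    """
    if len(responses) != 10:
        raise ValueError("expected 10 item responses")
    total = 0
    for position, value in enumerate(responses, start=1):
        if not low <= value <= high:
            raise ValueError(f"item {position} out of range: {value}")
        # Flip negatively worded items so that a higher score always
        # means higher self-esteem.
        total += (low + high - value) if position in reverse_items else value
    return total


# Hypothetical respondent and hypothetical reverse-keyed positions.
example = [4, 1, 3, 4, 2, 1, 3, 2, 1, 4]
score = rosenberg_total(example, reverse_items={2, 5, 6, 8, 9})
print(score)
```

Under this coding, totals range from 10 to 40.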
To evaluate oral reading skills, each student took turns standing up and reading a short text aloud in front of the rest of the class. Each student was given one of three possible texts to avoid repetition and learning effects, and to minimize differences between the first and last readers. The texts were selected among those used in French classes for sixth-graders. Each assessment was conducted by two researchers, who graded each student separately so that inter-rater reliability could be checked. The two researchers used an assessment grid, slightly adapted from the one used by the schools’ teachers at the end of middle school for the certificate (Supplementary Material). One item from the original grid, evaluating adherence to a set time limit, was removed, as it was irrelevant to this study. Additionally, some of the observations within the item “Reading quality” that relied on a student-written text specific to the original exam context were also excluded (e.g., “never-ending sentences” or “use of useless expressions such as ‘like, hum’”). This choice of tool originates from the researchers’ wish to design a protocol that closely reflects school concerns. The assessment grid includes six items with possible scores ranging from 0 (“poor performance”) to 10 (“excellent performance”). The six items assessed preparation before reading and stress management; reading quality; audible voice; body anchoring and coordination; support from breathing; and gaze and quality of interaction. Each student’s final score was the mean of the two researchers’ evaluations.
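The aggregation of the two researchers' grids can be sketched as follows. The text only states that the score is based on the mean of the two evaluations; averaging across the six items to obtain a single 0-to-10 score, the function name `oral_score`, and the example ratings are assumptions made for illustration.

```python
ITEMS = [
    "preparation and stress management",
    "reading quality",
    "audible voice",
    "body anchoring and coordination",
    "support from breathing",
    "gaze and quality of interaction",
]


def oral_score(rater_a, rater_b):
    """Average the two raters item by item, then average across items.

    Averaging across items is an assumption of this sketch; the source
    only specifies that the two raters' evaluations are averaged.
    """
    if len(rater_a) != len(ITEMS) or len(rater_b) != len(ITEMS):
        raise ValueError("each rater must score all six items")
    item_means = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    return sum(item_means) / len(item_means)


# Hypothetical ratings for one student (0 to 10 per item).
student_score = oral_score([6, 7, 8, 5, 6, 7], [7, 7, 6, 5, 8, 6])
print(student_score)
```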
Ethical Approval
This study received the validation of the ethics committee of the Université Paris Cité (n°00012022-136). Participants received an information letter, and parents or legal guardians signed a consent statement, given that the study involved minors. No refusal was reported.
Given the age group of 11 to 14 years old, we also ensured that, at each measurement time, participants had the option to decline participation, even if the consent form had been signed. This applied whether they wished to leave the classroom for a few minutes, for one of the measurement times, or for all of them. During each measurement time, the two researchers were accompanied by a member of the educational staff – usually the students’ music teacher – who ensured that all students felt comfortable while the study was conducted and could accompany any student needing a break to another room or activity. Only the results of participants who were present at all three measurement times were included in the analysis.
In addition, the researchers – both trained clinical psychologists with ongoing experience working with youth – were available a few minutes before and after the assessment so that participants could ask questions or raise any concerns they might have regarding the study or its process. Finally, if students did not feel comfortable with one or several questions of the scale, the researchers sat with them and resumed the task if necessary.
Analytic Plan
The data were analyzed statistically using the software Jamovi (Version 2.3.28.0). The main goal of the data analysis was to evaluate whether the evolution of the EVE group on Rosenberg scores and oral reading scores differed from the evolution of the control group over the assessment times. Therefore, a repeated measures ANOVA was performed, with time (T0, T1, and T2) as the within-subject factor and group (EVE vs. control) as the between-subject factor. Relationships with gender were examined using t-tests.
Population
To detect a moderate effect size (η² = 0.25, α = .05) with adequate power (.95) in a repeated measures ANOVA with two groups (EVE group vs. control group) and three measurement times (T0, T1, and T2), a total sample size of 44 participants – 22 participants in each group – is required (G*Power 3.1). The sample includes only students present at each measurement time (N = 92). Among these participants, 50 were in EVE classes and 42 in the control classes. Moreover, 68.5% of participants were sixth-graders and 31.5% of participants were seventh-graders. Girls made up 52.2% of the sample, and boys 47.8%. The details of the population distribution are provided in the supplementary material.
It should be noted that at the first measurement time (T0), there was a total of 136 participants in the EVE classes (N = 76) and the control classes (N = 60). This indicates a 32.4% attrition rate over a year and a half; of the students lost, 59.1% were EVE students and 40.9% were part of the control group. The attrition was a result of various factors, primarily related to school organization: changing class, repeating a year, changing schools, absences, and the school's calendar. For example, the last measurement time took place right before the summer break, resulting in less motivation among the students. The analysis comparing the attrition group with the remaining participants showed no significant differences between the two groups in the available variables (i.e., group gender, Rosenberg scores, oral skills scores). The difference between the attrition in the EVE group and in the control group was not significant either.
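The attrition figures follow directly from the headcounts reported above, as the short check below shows. The variable names are illustrative, and the within-group rates on the last line are derived here rather than reported in the text.

```python
# Headcounts reported in the text.
enrolled_t0 = {"EVE": 76, "control": 60}
retained = {"EVE": 50, "control": 42}

lost = {g: enrolled_t0[g] - retained[g] for g in enrolled_t0}
total_lost = sum(lost.values())

# Overall attrition relative to the T0 sample.
overall_rate = 100 * total_lost / sum(enrolled_t0.values())
# Share of the lost students coming from each group.
share_of_lost = {g: 100 * lost[g] / total_lost for g in lost}
# Attrition within each group (derived, not reported in the text).
within_group_rate = {g: 100 * lost[g] / enrolled_t0[g] for g in lost}

print(f"overall attrition: {overall_rate:.1f}%")
for g in lost:
    print(f"{g}: {share_of_lost[g]:.1f}% of students lost, "
          f"{within_group_rate[g]:.1f}% of its T0 headcount")
```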
Results
Self-Esteem
Cronbach’s alpha was calculated for each measurement time and indicates a good internal consistency throughout each measurement time (T0, α = .778; T1, α = .857; T2, α = .836).
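For readers unfamiliar with the coefficient, the sketch below computes Cronbach's alpha from a respondents-by-items matrix using the usual variance-ratio formula. The data are hypothetical and the function name `cronbach_alpha` is illustrative; the values reported above were obtained with Jamovi, not with this code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (items already reverse-scored)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents, 3 items.
data = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```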
Table 1 presents the Rosenberg scale results for the two groups (EVE and control) across the three measurement times. The ANOVA (Table 2) shows no significant difference between the means of the EVE group and the control group at any of the three measurement times. However, the ANOVA over time indicates that the overall evolution in the scores of the total population is significant: EVE and control students both reported higher self-esteem over the year and a half during which measurement took place, with this improvement being particularly visible between T0 and T1 (ηp² = .152). These results might be nuanced by considering the changes in standard deviation over time, and thus the data distribution within the two groups (EVE: T0, SD = 4.80; T2, SD = 4.45; Control: T0, SD = 5.38; T2, SD = 6.36). As shown in Figure 1, the two groups present very similar distributions at T0, but by T2, the EVE group exhibits fewer extreme values compared to the control group. This suggests that the control group becomes less homogeneous over time, while the EVE group shows more homogeneity.

Figure 1. Distribution of the EVE and control groups' Rosenberg scores at T0 and T2.
Table 1. Self-esteem means.
Table 2. Repeated measures ANOVA according to time and group for self-esteem.
Oral Reading Skills
Considering that each student's oral skills were evaluated by two researchers separately to ensure inter-rater reliability, intraclass correlation coefficients were checked for each measurement time. The results show good inter-rater reliability at T2 (ICC = 0.878), and moderate reliability at T0 (ICC = 0.729) and T1 (ICC = 0.686) (Koo & Li, 2016). Moreover, Cronbach's alpha was calculated for each measurement time, indicating high internal consistency (T0, α = .878; T1, α = .895; T2, α = .817).
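To make the inter-rater index concrete, the sketch below computes one common form of the intraclass correlation, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a students-by-raters matrix. The form of ICC used in this study is not specified, so this choice is an assumption, as are the function name `icc_2_1` and the example ratings.

```python
def icc_2_1(ratings):
    """ratings: one row per student, one column per rater."""
    n = len(ratings)       # number of students
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition of the rating matrix.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores given by two raters to four students.
icc = icc_2_1([[6, 7], [8, 8], [4, 5], [5, 5]])
print(round(icc, 3))
```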
Table 3 presents the oral skills scores for the two groups across the three measurement times. The ANOVA per group shows a significant difference in scores between the EVE group and the control group at all three measurement times: The EVE group tends to have higher oral skills scores than the control group, even from T0, at the very beginning of the EVE project (Table 3). However, at T2, the difference is less pronounced than at T0 and T1, indicating a smaller gap between the groups in their measured levels of oral skills (p < 0.001 at T0 and T1; p = 0.056 at T2).
Table 3. Oral skills means.
Table 4 presents the ANOVA results by time and group, and the pairwise comparisons detailing each measurement time. The overall population shows significant score evolution over time (p < .001, ηp² = .371), with the effect of time being particularly visible between T0 and T1 (ηp² = .524). This indicates that oral reading skills change over time among middle-schoolers. Table 4 also indicates a significant interaction between time and group (p < .001), with a moderate effect size (ηp² = .079). The table also shows the pairwise comparisons between measurement times to further explore this interaction: the interaction is not significant between T0 and T1 (p = .978; ηp² = 0), showing no difference in the way the EVE and control groups' scores changed between T0 and T1. However, a significant interaction is found between T1 and T2 (p < .001; ηp² = .139) and between T0 and T2 (p = .002; ηp² = .097).
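When only two measurement times are compared, the time by group interaction can be understood as a comparison of change scores between the groups: the interaction F with 1 and N − 2 degrees of freedom equals the squared pooled-variance t statistic on the individual differences, and partial eta squared follows as F / (F + df_error). The sketch below illustrates this equivalence on hypothetical data; it is not the procedure run in Jamovi, and the function name is illustrative.

```python
from statistics import mean, variance


def change_score_interaction(group_a, group_b):
    """group_a, group_b: lists of (earlier, later) score pairs, one per student.

    Returns (F, df_error, partial_eta_squared) for the time x group
    interaction restricted to these two measurement times.
    """
    diffs_a = [later - earlier for earlier, later in group_a]
    diffs_b = [later - earlier for earlier, later in group_b]
    n_a, n_b = len(diffs_a), len(diffs_b)
    df_error = n_a + n_b - 2
    # Pooled variance of the change scores across both groups.
    pooled_var = (
        (n_a - 1) * variance(diffs_a) + (n_b - 1) * variance(diffs_b)
    ) / df_error
    t = (mean(diffs_a) - mean(diffs_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    f = t ** 2
    return f, df_error, f / (f + df_error)


# Hypothetical (T1, T2) scores for three students per group.
f_value, df_err, eta_p2 = change_score_interaction(
    [(5, 7), (6, 7), (4, 7)],
    [(5, 5), (6, 7), (4, 6)],
)
print(round(f_value, 3), df_err, round(eta_p2, 3))
```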
Table 4. Repeated measures ANOVA according to time and group for oral skills.
When examining the evolution of scores over time within the two groups (Figure 2), the specificities of each group's progression over time become clearer: While the control group shows great improvement between T0 and T1, the mean remains quite stable between T1 and T2. In contrast, the EVE group exhibits the same fast improvement between T0 and T1, but instead of stabilizing or continuing to improve, the mean significantly decreases between T1 and T2, although remaining higher than the control group's mean. This suggests that the EVE project does impact oral reading skills, reflected in the higher scores of the EVE group and in their significant decrease at the end of the project compared to the control group.

Figure 2. Evolution of the EVE and control groups' oral skills scores over time.
Further analysis of each item on the grid can provide more detailed information on the results. As shown in Figure 3, each item follows the same pattern as the overall mean. Table 5 presents the ANOVA results per group and time, item by item. The interaction between time and group is significant for five of the six items, and particularly strong for two of them: stress management (p < .001, ηp² = .085) and body anchoring (p < .001, ηp² = .079). The only item showing no significant difference between the EVE and the control group in its evolution over time is reading quality (p = .320). Table 6 presents the difference between the EVE group's and control group's results for three of the items. Although at T1 the EVE students scored significantly higher than the control group on each item (Table 5), by T2 this significant difference had disappeared for the three items presented in Table 6: stress management, body coordination, and quality of interaction. On these three items, the EVE and the control group scores were almost equivalent.

Figure 3. Evolution of the oral skills results by item between the EVE and control groups.
Table 5. ANOVA per group and time regarding oral reading skills items’ results.
Table 6. Difference between EVE group's and control group's oral reading skills on three items.
Note. Contrary to the other three items of the grid, the three items presented in this table showed no significant difference in scores between the EVE and control groups at T2, while a significant difference was observed at T0, T1, or both.
Gender Differences
For both self-esteem and oral skills, the differences between girls and boys were analyzed, regardless of whether they belonged to the EVE or the control group. The results are presented in Table 7. Regarding self-esteem, the t-test indicates a significant difference at T1 (effect size d = 0.466) and at T2 (d = 0.421) between the scores of girls and boys, with girls tending to report lower self-esteem than boys. No significant difference between girls and boys was found at T0 regarding self-esteem.
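The effect sizes d reported for these comparisons are standardized mean differences. A minimal sketch of Cohen's d with a pooled standard deviation is given below; the pooled-variance variant and the example scores are assumptions, since the exact computation is not detailed in the text.

```python
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference between two independent groups (pooled SD)."""
    n_x, n_y = len(x), len(y)
    pooled_sd = (
        ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    ) ** 0.5
    return (mean(x) - mean(y)) / pooled_sd


# Hypothetical self-esteem totals for two small groups.
d = cohens_d([30, 32, 34], [28, 30, 32])
print(d)
```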
Table 7. Difference between girls’ and boys’ self-esteem and oral skills.
Regarding oral reading skills, no significant differences were observed in this study at any measurement time (p > .05, d < 0.2). This indicates that gender did not play a role in oral reading skills results.
Among the other available parameters (e.g., age difference between sixth- and seventh-graders), none showed significant relationships with the measured variables. However, despite the non-significant results, different measures could lead to a better comprehension of how age and developmental maturity could be important factors.
Discussion
This study explored whether choral singing would benefit students’ self-esteem and oral reading skills. The data showed a more nuanced reality, highlighting the complexity of both the measured variables and adolescent development.
Self-Esteem
Regarding self-esteem, there was no interaction effect between time (T0, T1, and T2) and group (EVE vs. control), suggesting no direct benefits of the EVE program on self-esteem development. Both groups showed a significant increase in their results over time, particularly between T0 and T1, which represented the longest interval between measurements. Findings regarding the evolution of self-esteem during adolescence are inconsistent: While some studies indicate a slow increase during this period (Birkeland et al., 2012), others, conversely, report a gradual decrease (Robins & Trzesniewski, 2005). Most researchers seem to agree that self-esteem changes during adolescence and can vary based on multiple factors (Huang et al., 2022). The increase and subsequent stabilization observed in the present study align with these findings and further highlight the specificity of that life stage, characterized by profound physical and psychological changes.
However, while both groups had very similar data distributions at T0, by T2 there were fewer extremely low values among the EVE group students compared to those in the control group. The EVE project might contribute to maintaining high self-esteem scores and improving lower scores, thereby reducing heterogeneity within the group. In other words, the EVE project might be of more benefit to students who have lower self-esteem. This could be particularly beneficial for middle-school students, given the strong links between self-esteem and other important aspects of school life, such as grades and academic achievements (see Zheng et al., 2020), as well as social bonding and the prevention of negative social behaviors (see Ayoub et al., 2021; Sarkova et al., 2014).
Previous studies suggest that group singing has a positive effect on self-esteem (Reagon et al., 2016). However, whereas qualitative studies often report a beneficial impact of group singing on self-esteem (Baines & Danko, 2010; Gale et al., 2012; Pavlakou, 2009), the few quantitative studies on the subject show more nuanced results (Galinha et al., 2021; Fancourt & Finn, 2019), suggesting a relationship between the two variables, but not as clearly as might initially be expected. This implies that when participants in interviews state that group singing increased their self-esteem, their understanding of self-esteem may not fully align with the definition used to build standardized tests. Moreover, scientific literature broadly suggests that the arts benefit well-being (Davies et al., 2015; Fancourt & Finn, 2019), with benefits ranging from enhanced mental well-being to illness prevention. These previous findings, as well as the differences found between qualitative and quantitative studies exploring the relationship between group singing and self-esteem, suggest that using an alternative measurement scale or survey, instead of the Rosenberg scale, could lead to different results. Indeed, self-esteem is a complex concept that is linked to multiple aspects of one's experience. Exploring the impact of choral singing on various dimensions of well-being could help better understand the relationship between this form of art and one's sense of self. For instance, the impact of choral singing on body motion and experience has been theorized, hypothesized, and researched for several decades (see Balsnes, 2018; Daley, 2012; D’Amario et al., 2023). Choirs are also known for the unique group experience they provide (Delius & Müller, 2022). 
In that sense, as a potential follow-up to this research, focusing on these two particular aspects – body experience and group life – and their role in the adolescent's self-esteem in the context of early adolescence could be particularly enlightening, given the physical changes adolescents experience and the growing importance of peer interactions during this period of time.
Additionally, some studies suggest that the timing of measurement is also critical in identifying the benefit of the practice of arts on individual well-being. For instance, Busch and Gick (2012) found an association between choral singing and positive affect, personal growth, and vitality when measured immediately after a rehearsal. While one might hope these benefits are maintained over time, testing at different measurement times could help better identify the nature and quality of the relationship between art and self-esteem, or well-being in general. Regarding the role of gender in self-esteem, the results are consistent with the existing literature: teenage girls tend to express lower self-esteem than teenage boys (see Quatman & Watson, 2001). However, other studies suggest that breaking down the concept of self-esteem into its various components could be useful to understand the role gender plays in the individual experience of self-esteem and the gender differences that can be observed (Golan et al., 2014). The present findings suggest that the EVE program may benefit girls more in this regard. Since girls tend to score lower than boys, and the EVE program appears to reduce disparities in scores, it is likely that girls are the primary beneficiaries of this effect, protecting them from lower or decreasing self-esteem.
Oral Skills
The results regarding the impact of choral singing on oral reading skills are also more nuanced and complex than initially expected. From T0 onward, that is, from the very first measurement, the EVE group consistently demonstrated higher oral performance than the control group. This statistically significant difference was observed at all three measurement times. Because students were randomly assigned to the EVE program, with no prior experience in music or singing required for participation, these results suggest a positive impact of the program on oral reading skills. Being part of the EVE program seems to contribute to better oral reading skills in middle-school students. However, the fact that this impact was observed from the first measurement time suggests that this positive effect could partly come from the benefits of being part of a specific academic program. Indeed, the EVE program is a new initiative, implemented as a test in a few middle schools by the Paris Philharmonic, introducing students to new knowledge and experiences within the school curriculum. For EVE students, this novelty and the curriculum's new focus on music may favor an individual commitment to the program and the skills it develops, including oral skills. Moreover, due to field constraints, the study began shortly after the start of the EVE project. As a result, the EVE students had already participated in one or two sessions by the time the study began. Therefore, the differences observed between the EVE and control groups at T0 could partly be explained by these initial sessions. Another key factor that could explain these results could also be considered a potential bias in this research: Since the study was presented to students as a study of the EVE program, EVE students might have felt more engaged in it than those in the control group, possibly leading to better results in oral reading for EVE students.
The researchers sought to minimize the impact of this known bias by offering special advantages to students in the control group, such as visits to the Paris Philharmonic or to music-related museums and exhibits, after the last measurement time. However, the study did not measure the effectiveness of this decision.
As with the Rosenberg scale results, the impact of the EVE program on oral reading skills was not as expected. Although there were mean differences at all three measurement times, it had also been hypothesized that the trajectory of results over time would differ between the two groups, with EVE group students showing more rapid progress than the control group. Data analysis revealed score improvements between T0 and T1; however, this improvement was present in both groups and to the same extent. Between T1 and T2, a difference in score evolution is apparent, which seems to contradict the initial hypothesis: The control group's results remained stable between T1 and T2, while the EVE group's scores were significantly lower at T2 than at T1, though they remained significantly higher than both the control group's results and the EVE group's T0 results. According to these findings, the EVE program does have a positive impact on oral reading skills, but this benefit appears to have diminished toward the end of the program.
As stated in the introduction, the arts are known to play a role in school learning (Bowen & Kisida, 2019; Kisida et al., 2020; Leonard et al., 2018). However, when it comes to the specific role of singing in the development of speaking skills, research remains limited. Most existing studies focus on its use in helping with speech disorders (Behaghel & Zumbansen, 2022), in learning foreign languages (Ludke et al., 2014; Setia et al., 2012), or in early language acquisition (Christiner & Reiterer, 2018; Franco et al., 2021). These studies consistently highlight the positive impact that music and singing can have, although never as sole factors; rather, they are found to be beneficial in conjunction with other variables such as a supportive environment or individual abilities – which raises the recurrent “nature vs. nurture” debate. Overall, few studies have explored the role of music in school learning within the participants’ native language, and among a general population without any specific disorders or musical training. In that sense, the present study helps explore the everyday impact that the arts – and singing in particular – can have on learning and growing. The findings presented here suggest that, while integrating singing into classrooms might be beneficial, it is essential to consider the broader context of its implementation.
Several hypotheses could help explain the results of this study and would require further exploration. First, the EVE program included a concert involving all 300 EVE participants, which took place a few weeks after the last measurement. Many students expressed both excitement and anxiety about this event, which may have affected their involvement at the end of the study, as their focus was on the concert. This hypothesis could be supported by the item-by-item results: The three items on which no significant difference between the EVE and the control groups was observed at T2 can be closely linked to stress and anxiety – stress management, body coordination, and quality of interaction. A measurement after the concert could have helped explore the impact of emotions on the results by comparing the results before and after the concert, and could even have revealed a positive impact of the concert, a stressful but also very rewarding experience for the students.
Similarly, these findings raise questions about the ideal length of this type of project for middle-schoolers. A significant improvement is visible after almost a year of the project, which could lead to two interpretations and provide guidelines for further exploration. First, it might be relevant to shorten the project, which could help maintain the results observed at T1 and protect young students from weariness or emotional disengagement from the project. A second suggestion is based on a “dose-dependent” approach: Due to the school year's organization and schedule, there were nearly twice as many EVE sessions between T0 and T1 as there were between T1 and T2 – approximately 29 sessions between T0 and T1, and around 16 between T1 and T2. The difference in results between these two periods could suggest that certain benefits might only become apparent after a specific number of sessions, which was not reached between T1 and T2. Scheduling additional measurements at different times could help investigate and refine the potential effects of time.
Both the oral skills and Rosenberg scale results raise questions about the potential effects of keeping a class unchanged for two years for students aged 11 to 13. Because the project lasted a year and a half, the students in the EVE classes remained together for two consecutive school years. Usually, in French middle schools, class compositions are changed at the beginning of each school year. While this practice might cause complaints from students who wish to stay with their friends, it could also help prevent negative group dynamics that are known to be very present at this age, such as teasing, bullying, or the formation of “cliques” or subgroups within the class (Menesini & Salmivalli, 2017). It could be hypothesized that classes that remain stable for two years reinforce both positive and negative relationships and behaviors among students. The potential negative effect of stability in classrooms has been suggested by a few studies, even though the findings are not consistent and research on this topic is limited (Rambaran et al., 2020). This hypothesis could be further explored through qualitative data, gathered from the subjective experiences of students and teachers involved in the EVE project. If it is supported, even partly, then increased bullying or intensified hierarchical group dynamics could have affected both self-esteem and academic results, partly explaining the current findings. Moreover, from a broader viewpoint, further examining the effects of stable classrooms over two years in middle school could help refine our understanding of social dynamics at that age.
Finally, these results and the suggestions presented above further demonstrate how oral skills are influenced by environmental and external factors. Theories of intelligence vary along a continuum, from entity theorists, who believe that intelligence is fixed, to incremental theorists, who assert that intelligence can change over time through learning and practicing (Gunderson & Hamdan, 2017). In that sense, though one might initially assume that oral skills are deeply rooted in a student's innate abilities and academic potential, they appear to be significantly affected by the student's surroundings and events in both their school and personal lives. Therefore, as this study further demonstrates, it is essential to continue providing students with an appropriate educational environment to foster their ability to develop personal and academic skills.
Limitations
In discussing the results, one of the biases this study encountered was highlighted: The study was presented to students as part of the EVE program, potentially leading to lesser involvement on the part of the control group students, who did not perceive as many benefits for themselves as the EVE group did. As mentioned above, the researchers addressed this bias by supporting the schools’ initiative to organize a school trip for the control group students. This offer was intended to help control group students feel more involved in the study by including them in a specific activity.
While this bias could influence how students responded to and engaged with the study, other biases may also have affected the researchers and their assessments, particularly regarding the oral skills evaluation, which relies on the researchers’ impressions. To limit this bias, we ensured that the evaluation grid had good ecological validity, good internal consistency, and good inter-rater reliability between the two independent researchers. However, the evaluation might still be influenced by the researchers' perceptions of each group of students. For organizational reasons, one of the researchers had to be aware of whether the students belonged to the control group or the experimental group. This prior knowledge could have affected their assessment, potentially leading them to expect one group to do better or worse than the other. To address this bias, the other researcher involved was blind to group membership. The intraclass correlation coefficients (ICCs) calculated for each measurement time indicate good inter-rater reliability, which suggests that this bias did not significantly affect the results. However, this issue should still be considered when interpreting the results.
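For readers unfamiliar with the intraclass correlation, the following is a minimal sketch of how such a coefficient can be computed for two raters. It assumes the ICC(2,1) form (two-way random effects, absolute agreement, single rater); the article does not state which ICC form was used, so this choice, the function name `icc_2_1`, and the example scores are all hypothetical and are not study data.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the score given to student i by rater j.
    The ICC form is an assumption; the study does not specify it.
    """
    n = len(ratings)       # number of rated students
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 students and 2 raters, no missing scores")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between students
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores for five students from two raters (not study data).
example_ratings = [[12, 13], [15, 15], [9, 10], [18, 17], [14, 14]]
print(round(icc_2_1(example_ratings), 3))
```

Values close to 1 indicate that the two raters score students almost identically. In a design where only one rater is blind to group membership, a high coefficient is one indication, though not proof, that knowledge of group membership did not strongly shape the scores.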
Additionally, while the attrition rate is reasonable over that period (32.4% over a year and a half), the loss of participants could still alter the final results. The analysis comparing the participants who dropped out with those who remained showed no significant differences. However, it is important to consider that other factors, which the researchers did not have at their disposal, might differentiate the two groups (e.g., socioeconomic data, general academic level).
This also raises important questions regarding the random assignment of students to the EVE program. While it helped reduce selection bias, making the results more generalizable, it may have also contributed to lower levels of motivation among the participants, partially explaining the results of the study. Although random assignment was considered appropriate given the program's objectives, it also raises ethical concerns about access to arts education that cannot be overlooked in future studies.
Finally, access to more sociodemographic data could have helped refine the results and reveal additional relationships between variables. Working with a population of minors meant that the researchers had limited access to information about the participants. However, all statistical analyses were conducted using the available variables, mainly gender, age difference, the school's geographical area, and the socioeconomic status of that area. Only gender – and group (EVE vs. control) – showed any interaction with the measured variables. We can hypothesize that, for instance, information about students' grades, extracurricular activities – other than music – family socioeconomic status, or cultural background could have provided a deeper understanding of the measured variables.
Conclusion
The study results suggest a significant relationship between choral singing, self-esteem, and oral reading skills, though this relationship is more nuanced than initially hypothesized. In terms of self-esteem, the weekly practice of choral singing within the classroom appears to favor more homogeneous scores, with fewer low self-esteem scores and greater stability in high scores over time. However, no significant differences in mean self-esteem scores were observed between the experimental and control groups. Conversely, the EVE group showed significantly higher scores in oral reading skills compared to the control group across all three measurement times, suggesting a positive impact of choral singing on these academic skills. Yet, the EVE group's score trajectory shows a decline at the end of the program, a pattern not observed in the control group, raising questions about the most appropriate structure for collective projects at that age.
This study further supports the idea that initiatives like the one implemented by the Paris Philharmonic are beneficial for young people, helping them navigate both their academic journey and the changes of adolescence. By using assessment tools closely aligned with the skills and challenges that middle school involves, the study aimed to have a direct impact on practical strategies that can be used daily by students and teachers.
This study also highlights that research on the role of the arts in learning and in adolescent development can still be strengthened. While the current findings give a direction and encourage further exploration, deeper insights could be gained through quantitative data examining more specific aspects of the sense of self and through qualitative data capturing the subjective experiences of students and teachers. More importantly, these types of studies foster interest in creating educational environments where the arts are not just supplementary but contribute meaningfully to both collective and individual growth, encouraging personal and academic development.
Supplemental Material
Supplemental material, sj-docx-1-mns-10.1177_20592043251369986 for Implementing Choral Singing in the School Curriculum to Foster Oral Skills and Self-Esteem by Claire Michel, Anthony Brault, Haya Haidar, Maïa Guinard, Déborah Loyal and Mi-Kyung Yi in Music & Science
Footnotes
Action Editor
Graham Welch, University College London, Institute of Education
Peer Review
One anonymous reviewer
Nardi, Free University of Bozen-Bolzano, Faculty of Education
Contributorship
The study was conceptualized by AB and approved by all authors. The investigation was conducted by CM, AB, and MKY. Data analysis was performed by CM with DL’s assistance. CM drafted the manuscript, DL provided critical review and revision, and HH provided language revisions. All authors critically read the manuscript and approved its final version.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Approval
This study received the validation of the ethics committee of the Université Paris Cité (n°00012022-136).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Paris Philharmonic (Philharmonie de Paris - Cité de la musique) and the charitable organization Fondation Bettencourt Schueller (grant number 52 000€). However, the Philharmonic had no role in the design of the study, data collection, analysis, interpretation of results, or the decision to publish the results.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
