Abstract
Research on choirs and other forms of group singing has been conducted for several decades and there has been a recent focus on the potential health and well-being benefits, particularly in amateur singers. Experimental, quantitative, and qualitative studies show evidence of a range of biopsychosocial and well-being benefits to singers; however, there are many challenges to rigor and replicability. To support the advances of research into group singing, the authors met and discussed theoretical and methodological issues to be addressed in future studies. The authors are from five countries and represent the following disciplinary perspectives: music psychology, music therapy, community music, clinical psychology, educational and developmental psychology, evolutionary psychology, health psychology, social psychology, and public health. This article summarizes our collective thoughts in relation to the priority questions for future group singing research, theoretical frameworks, potential solutions for design and ethical challenges, quantitative measures, qualitative methods, and whether there is scope for a benchmarking set of measures across singing projects. With eight key recommendations, the article sets an agenda for best practice research on group singing.
Introduction
The human singing voice has been a subject of investigation for many decades. However, only over the last 20 years have researchers begun to address the nature and the health and well-being effects of choral singing, particularly in amateur singers. Some pioneering studies in this field were those by Bailey and Davidson (2002), Beck, Cesario, Yousefi, and Enamoto (2000) and Clift and Hancox (2001). Recently, the UK All Party Parliamentary Group on Art, Health and Wellbeing has reported extensively on the health benefits of singing (APPG, 2017; Gordon-Nesbitt & Howarth, 2019). Advancing Interdisciplinary Research in Singing, a Canadian-led collaboration with over 70 researchers from 16 countries, has also contributed to the field (see www.airsplace.ca). In addition, a number of systematic reviews and Cochrane systematic reviews on singing for people with various health conditions have been undertaken (e.g., Clift, Nicols, Raisbeck, Whitmore, & Morrison, 2010; Daykin et al., 2018; Irons, Petocz, Kenny, & Chang, 2016; Lewis et al., 2016; Williams, Dingle, & Clift, 2018). Converging evidence from these reviews shows that group singing has the potential to enhance well-being and quality of life, as well as improve lung function and symptoms of depression, anxiety, and stress, amongst different populations. Despite this consensus, however, systematic reviews have emphasized the limitations and challenges of group singing research such as the following: the use of uncontrolled designs, lack of randomization, lack of blinding to conditions, small sample sizes, high attrition in longitudinal studies, lack of longitudinal or follow-up approaches, and selective reporting of outcomes (Linnemann, Schnersch, & Nater, 2017; Williams, Dingle, & Clift, 2018). In recognition of the need to improve the methods used and reported in this field, the authors met for a workshop to discuss how best to address and overcome such challenges. 
This article summarizes the proceedings of the workshop and sets an agenda for best practice research in singing.
The Workshop
Drawing upon their research networks, the first two authors invited delegates to a workshop on “Setting an Agenda for Best Practice Choir Singing Research,” in December 2018. The 18 delegates were from five countries and represented the following disciplinary perspectives: music psychology, music therapy, community music, clinical psychology, educational and developmental psychology, evolutionary psychology, health psychology, social psychology, and public health. In preparation for the workshop, delegates were asked to provide information about theories, methods, and measures they had used in choir research, along with their critical review of these. The first author compiled this information into a booklet and circulated it to the group. During the workshop, the first author facilitated discussion by posing topics and questions for discussion—as shown in the headings of this article. Some topics were explored with whole group discussion; others involved small group discussions at the four tables, followed by reporting to the whole group; and others involved particular authors addressing the workshop about their own research experiences. After the workshop, the first author prepared a draft manuscript of the proceedings with input from the other delegates, who are listed as co-authors in alphabetical order.
Priority Questions for Group Singing Research
The opening task for discussion at the workshop was identifying the priority questions for future singing research. The question posed was “What should we be focusing on and why is it important?” We discussed this in small groups of about five participants, and then each subgroup reported to the full group. The following list was compiled from these contributions:
Question 1: Are the health and well-being benefits of group singing unique to singing, or is any enjoyable group activity similarly effective for health and well-being? There is mixed initial evidence on this question, with one study reporting that group singing is associated with faster social bonding after 1 month than group creative writing and craft making (Pearce, Launay, & Dunbar, 2015). Another study found that choral singers considered their choirs to be a more coherent or “meaningful” social group than team sport players considered their teams (Stewart & Lonsdale, 2016). This is important in light of research establishing that feeling connected to others is itself a basis for psychological well-being (Baumeister & Leary, 1995). In the study by Stewart and Lonsdale (2016), however, choral singers and team sport players reported equivalent psychological well-being, and both groups’ average well-being scores were significantly higher than those of solo singers. Similarly, no difference was found between group singing and group creative writing in terms of longitudinal well-being outcomes for participants with chronic mental health problems (Williams, Dingle, Jetten, & Rowan, 2019), and no difference was found between the effects of a single session of group exercise and of group singing on emotional state and social connectedness among older adults (Maury & Rickard, 2018). There is room for further research on this question across a range of samples and timeframes, using appropriate control samples.
This is especially the case in light of a recent longitudinal study of 7,305 older adults which revealed that people who believe their life is filled with worthwhile activities (such as involvement in civic society, cultural activities, and volunteering) experience greater well-being and healthier aging (Steptoe & Fancourt, 2019).
Question 2: How does group singing compare with structured group therapy for psychological problems, such as group cognitive behavior therapy? Given the potential physical, psychological, social, and biological benefits of group singing, it is suitable as an adjunct to individualized treatment for people who experience chronic health or mental health problems. Some authors shared anecdotal evidence that similar mechanisms are evoked in both group singing and cognitive behavior therapy (a well-established approach to psychotherapy); for example, exposure to social and performance situations, behavioral activation, and the provision of group-based social support. This suggests that group singing has the potential to be an alternative to individual therapy. Before group singing can be considered a stand-alone intervention, however, empirical evidence is required to test these proposed therapeutic mechanisms. For example, there is recent evidence that social support acts as a mechanism by which group singing leads to mutual recovery from bereavement in people affected by cancer (Warran, Fancourt, & Wiseman, 2019). This question is of particular relevance in light of the recent emergence of social prescribing networks whereby adults experiencing social isolation and low well-being are referred from primary care directly to group programs, including group singing, bypassing psychotherapy and other health services (Chatterjee, Camic, Lockyer, & Thomson, 2018). To our knowledge, no study has thus far compared group singing with other forms of group psychotherapeutic treatment, or compared group singing alone with group singing as an adjunctive treatment.
Question 3: What is the cost effectiveness of group singing for health? The authors agreed that cost-benefit analyses are an important avenue for future research on singing and health. When considering a singing program compared with other treatment or social care resources, both the costs and the benefits of participation in group singing are important indicators for policy makers and health care professionals. Hence, future singing studies are encouraged to include a cost-effectiveness analysis. Researchers are encouraged to collaborate with health economists, using validated methods. Costing metrics can include general practice visits, social care involvement, in-patient stays, outpatient attendance, and use of prescription medication before and after participating in a singing intervention. For example, one singing study used the EQ-5D (EuroQol Group, 1990), a short five-dimension instrument from which quality-adjusted life years can be derived, to assess the cost-effectiveness of a community singing group program for older people (Coulton, Clift, Skingley, & Rodriguez, 2015). Indeed, a range of health cost savings beyond those immediately associated with singing sessions may be of value to the singers as well as stakeholders such as health service providers. One study of 166 ambulatory older adults assigned to a chorale (choir) or a control activity group found that chorale participants fared significantly better in overall health ratings, number of doctor visits, number of over-the-counter medications, number of falls, and other health problems (Cohen et al., 2006). Singing may provide a relatively low-cost therapeutic program, which can optimize some clinical outcomes, but further cost-benefit analyses are needed for singing interventions for a range of health conditions.
Question 4: What makes an effective group singing leader?
Existing research indicates that the leader’s personal qualities are important or even crucial to singers to achieve positive experiences of singing with others (e.g., Lamont, Murray, Hale, & Wright-Bevans, 2018), yet relatively little research into the characteristics of effective group singing leaders has been conducted. For example, are there differences in health and well-being outcomes for high- or low-energy leaders, or for leaders with different types of professional training? Community arts practitioners are trained to be self-aware and to manage the energy levels of the group. They are trained to “think backwards”; that is, to envisage the outcome they are intending to reach and to communicate that to the group to steer the outcome (e.g., Stickley, Hui, Souter, & Mills, 2016). On the other hand, music therapists and psychologists tend to focus more on the therapeutic processes involved in the group (e.g., Sullivan, 2003). More research is required to understand the advantages and disadvantages of different approaches to group singing leadership. For instance, some singing group studies have included predominantly leaders who hold dual qualifications in music and in therapy (e.g., the German Singing Hospitals network, see Kreutz, Clift, & Bossinger, 2015), whereas other leadership models feature collaborations between musicians and therapists (e.g., Williams, Dingle, Jetten, & Rowan, 2019) or between musicians and volunteer supporters (e.g., Skingley & Bungay, 2010).
Indeed, the effectiveness of choir leadership may depend on the purpose of the group. If the purpose is for the group to become musically excellent, then clearly an expert choir leader is needed, but if the group is used as a basis for therapy, then other aspects such as understanding the health needs and challenges of the individual members come to the fore. One study found that facilitators for music programs with older adults could develop their practice by making fuller use of nonverbal modeling, encouraging participants to contribute to setting goals, making more use of attributional feedback that supports autonomy in learners, and varying the organizational structure and style to suit the diverse needs within groups of older learners (Creech, Varvarigou, Hallam, & McQueen, 2014). Effective group singing leaders may require an understanding of a range of health conditions. For example, Lewis and colleagues (2016) report that leaders of singing groups for a specific health condition such as Chronic Obstructive Pulmonary Disease will require specialized skills and support. Further research on the topic of group singing leadership might look to vocal pedagogy (e.g., Wenk, 2014) as well as to social and organizational psychology research on leadership and group dynamics (e.g., Rowold & Rohmann, 2008).
Question 5: How can the message that most people can sing and can enjoy health and well-being benefits from group singing be promoted more widely to the general public? Several authors gave anecdotal accounts of individuals in their choirs who had never sung before, believed that they “couldn’t sing,” and were surprised that singing was “for them” (see also Welch et al., 2011). Several authors mentioned that the very term “choir” might be a barrier to engagement for some individuals who may perceive choir singing as an elitist or professional activity that only occurs in churches or concert halls.
(Possible alternatives are “vocal group,” “singing group,” or “glee club.”) There has been some research on perceptions of choir singing in school students (Sweet, 2010) and in men experiencing homelessness (Bailey & Davidson, 2002). In Australia, PubChoir, a monthly event in which strangers gather for a few hours in a sociable context to learn a song in three-part harmony and record it, has grown from 70 participants to more than 800 participants over the course of a year. Founder and director Astrid Jorgensen credits this widespread appeal to holding sessions in pubs and music venues where people feel a sense of familiarity and can choose to have a drink before they sing, which may help to overcome their anxiety about singing in public (McMillan, 2018). Another way to engage people in singing groups is through families, because group and peer singing is often introduced in early childhood services and schools (see Degé & Schwarzer, 2011). Large-scale studies of public perceptions are required to verify these anecdotal reports about potential barriers and facilitators to engagement. It will be important to engage target recipients in the research design process in order to identify their preferences for things such as group names.
Question 6: How long do the psychological benefits of group singing last after a single session? This is important because the notion of “singing on prescription” implies that a “dose” of singing will support singers’ health and well-being during the week until the next rehearsal. For instance, in terms of mood, preliminary evidence indicates that there is an immediate increase in positive emotions after singing in adults with Parkinson’s disease (Baird et al., 2018), in cancer patients (Fancourt et al., 2016), and in adults with chronic mental health conditions (Dingle, Williams, Jetten, & Welch, 2017).
In the latter study, the increase in positive emotions was short-lived (i.e., diminished over the course of a day) while the effect on negative emotional states was more lasting (i.e., continued to dampen negative mood in the evening; Dingle et al., 2017). There have been reports from older adult singers who experienced a “high” during the rehearsal, followed by a “low” afterwards (Lamont et al., 2018, p. 430). Experience sampling methods (Csikszentmihalyi & Larson, 2014) would be suitable to monitor such effects over the course of a week in order to identify minimal necessary “doses,” with the caveat that people’s moods may fluctuate for many reasons other than the singing group. Beyond mood and emotional states, other potential effects of singing that would be worth investigating over longer durations include sleep quality and pain management. To our knowledge, neither of these topics has been researched to date. Future research could also explore activities that participants could do outside of the group singing sessions in order to maintain or “top up” any identified benefits of participation (e.g., sing along to a recording of their group singing).
Question 7: How effective is group singing for the estimated 85% of the world population not living in Western, educated, industrialized, rich and democratic (WEIRD) societies? Singing is an enjoyable and important social activity throughout the developing world (Huron, 2003; Trehub & Trainor, 1998). In many developing nations, for instance, singing is embedded into parenting practices (Bornstein & Putnick, 2012). Two authors described their work with a women’s singing program for maternal mental health in The Gambia, where the use of singing and dance for emotional support and as ritual to mark important ceremonies is already commonplace and fully participatory (McConnell et al., 2018).
This example emphasizes differences in perspective and culture around group singing between WEIRD and developing societies, and more research is needed to understand these differences more fully.
Question 8: Is there a need for replication in singing research? The need for replication of studies is gaining recognition across the sciences as an index of reliability of the findings. However, a review of psychology studies published in top ranked journals revealed that only around 1% have been replicated (Makel, Plucker, & Hegarty, 2012). In a direct replication, the new research team essentially seeks to duplicate the sampling and experimental procedures of the original research by following the same methods as described in the original publication. In a conceptual replication, the original methods are intentionally altered to test the rigor of the underlying hypothesis. It is striking that very few direct replications of studies on singing and health have so far been undertaken. One example is Kreutz, Bongard, Rohrmann, Hodapp, and Grebe’s (2004) replication of the seminal Beck et al. (2000) cortisol study. Clift, Manship, and Stephens (2017) replicated a study of group singing and mental health by Clift and Morrison (2011). In addition, Skingley, Clift, Hurley, Price, and Stephens (2018) replicated an earlier study of group singing for people with chronic obstructive pulmonary disease by Skingley et al. (2014). In a broader sense, multi-site and cross-national studies allow for replication and comparison of singing program effects in a variety of cultural contexts (see, for example, Livesey, Morrison, Clift, & Camic, 2012).
Recommendation 1: The authors recommended that, in addition to addressing the question of whether group singing is effective for health and well-being, future group singing research should advance the field along new avenues proposed by the eight priority questions identified above.
Theoretical Frameworks
Currently, there appears to be no preferred theory to explain how and when group singing relates to health and well-being among participants (see Williams, Dingle, Calligeros, Sharman, & Jetten, 2019). Many studies have reported outcomes from group singing without any theoretical basis. According to Cramer (2013), six characteristics of a good theory are comprehensiveness, precision and testability, parsimony, empirical validity, and both heuristic and applied value. Various authors critically reviewed the following theories that have been used in singing research.
The biopsychosocial model is a model of health applied to singing that includes biological, psychological, and social factors (Fancourt, 2017, pp. 29–30). Proposed by George Engel in 1977 for understanding health and illness, this model has been criticized for being descriptive rather than holding predictive value and for lacking an overarching framework for explaining associations between the components (e.g., McLaren, 1998). The psychobiological model provides an explanation of behavior in relation to biological and psychological contributing factors. A psychobiological model of choir singing was adopted by Bullack, Gass, Nater, and Kreutz (2018), who reported that amateur singers experienced significantly improved mood and social connectedness in a singing compared with a non-singing condition. However, there were no differences in salivary cortisol or amylase between the conditions, indicating a lack of convergence between the biological and psychological effects of group singing in this study. Similarly, despite significant differences in self-reported anxiety between a choir singing session and a non-singing control session, no difference was found for salivary amylase (Sanal & Gorsev, 2014). Finally, Fancourt and colleagues’ (2016) study of group singing with cancer patients and carers provided preliminary evidence that singing improves mood state and modulates components of the immune system, consistent with a psychoneuroimmunological model (Fancourt, 2018; Fancourt, Ockelford, & Belai, 2014).
Psychological theories include Seligman’s (2011) positive psychology perspective, which has been applied to singing and well-being in older adults (Lamont et al., 2018). Seligman’s PERMA model comprises one aspect of hedonic well-being:
Several researchers have adopted a developmental psychology perspective. For example, Barrett and Bond (2015) and O’Neil (2006) have written about the role of music programs (including singing and other musical activities) in adolescent development and flourishing. Musical preference acts as a “badge of identity” during the adolescent period, aiding in the formation of social groups (North & Hargreaves, 1999). Relatedly, engagement with music allows adolescents to both present a certain image of themselves, aligned to their musical tastes, and address their emotional needs (North, Hargreaves, & O’Neill, 2000). Although it is popularly believed that musical preferences in youth can cause poor mental health or problem behavior, there is so far little evidence to support this; it is more likely that music tastes indicate some vulnerability or are being used to manage extreme emotions (Baker & Bor, 2008; Sharman & Dingle, 2015). The positive youth development perspective provides a framework for understanding young people’s musical development and positive engagement in musical activities across the domains of competency (musical, academic, and social), confidence, connection, character, and caring.
In a review of the literature on music engagement in older adults, Creech, Hallam, McQueen, and Varvarigou (2013) also took a positive development/empowerment perspective. Miranda, Blais-Rochette, Vaugon, Osman, and Arias-Valenzuela (2015) proposed a cultural-developmental psychology of music in adolescence, drawn from cultural psychology and music research at the intersection of evolutionary psychology, music perception, and ethnomusicology. A cultural-developmental perspective of music in adolescence can account for findings of research on music preferences, music motivation and functions, dance, language, social network and multitasking, ethnicity and cultural diversity, and cultural competence in music-based interventions (Miranda, Blais-Rochette, Vaugon, Osman, & Arias-Valenzuela, 2015).
Turning to theories that focus on the social processes involved in group singing, Dunbar and colleagues have applied an evolutionary model to understanding how singing and dancing with other group members may have evolved as a way to allow the group to better socially bond and to solve internal conflicts (Dunbar, 2012). Increasing group sizes in early hominin species may have led to pressure to develop mechanisms that would help these groups stay together, despite increasing internal competition (e.g., Keller, König, & Novembre, 2017; Pearce, Launay, Machin, & Dunbar, 2016; Weinstein, Launay, Pearce, Dunbar, & Stewart, 2016). Speaking to the potential health effects of such social bonding, some studies from this group have measured resilience to pain as a proxy for endogenous opioid release. The findings highlight the importance of a strong social network in maintaining health and well-being.
The social identity approach posits that through group belonging (identification), people can access group-based psychological resources such as meaning, control, support, and esteem, which in turn lead to improved well-being (e.g., Dingle, Brander, Ballantyne, & Baker, 2013; Tarrant et al., 2016; Williams, Dingle, Calligeros, et al., 2019). According to this approach, it is group identification rather than singing per se that confers benefits to the health and well-being of group members. It is recognized, however, that group singing, because of its propensity to bond people (Pearce et al., 2016), may be an effective means of encouraging identification. Consistent with this theory, Williams, Dingle, Jetten, and Rowan (2019) reported that adults with chronic mental health conditions who joined a choir reported similar benefits to those who joined a creative writing group, and these outcomes were related to the extent that participants identified with their arts-based group. This theory also accounts for why singing in a group is more beneficial for participants’ well-being than singing solo (Stewart & Lonsdale, 2016). A recent pilot randomized trial assessed the feasibility of a group singing intervention for the well-being of people with aphasia after a stroke and explored the social identity processes that were activated during singing (Tarrant, Code, Carter, Carter, & Calitri, 2018).
Recommendation 2: Numerous theories are available that have shown empirical promise in explaining the health and well-being effects of group singing. Researchers can select the one most suited to the purpose and context of the singing group being set up. The authors recommended that future research clearly specify a theoretical framework guided by the research question and that researchers measure theoretical constructs that are meaningful to that framework.
Design and Ethics
Working in small groups, the authors were given several methodological challenges to attempt to resolve. We then discussed these as a whole group.
What Ethical Issues Were Raised in the Ethics Review Process and How Did You Address Them?
Some examples of issues raised in the ethics review process were the use of different group leaders and group characteristics in a multiple site study, and how the researchers would achieve recruitment targets when they were relying on (busy) health professionals to approach potential participants and then pass on contact details of any interested individuals to the researchers. Overall, the ethical considerations of group singing projects did not seem to be particularly different from those of other research projects involving psychosocial interventions.
How Do You Achieve Randomization in Different Settings (e.g., Health/Hospital, Community, Education)?
The authors agreed that self-referral works best for group singing programs, with randomization to conditions conducted after the initial assessment. This raises an issue of whether people are willing to be randomized to a wait-list control condition. Wait-listed participants tend to drop out at higher rates than those in the immediate singing condition. This is possibly because they feel they are missing something important or because they make a commitment to an alternative activity during the waiting period that then clashes with their delayed singing group (e.g., Skingley, Bungay, Warden, & Clift, 2013). One suggestion was to use a stepped wedge design (Thabane, Dennis, Gajic-Veljanoski, Paul, & Thabane, 2016). In this design, each site has a control phase followed by an experimental phase; hence the potentially effective intervention is not withheld from any participant. The sites commence at different times, allowing for comparisons to be made within each site between the control and experimental phase and, also, the control phase of one site can be compared with the concurrent experimental phase at another site, thus controlling for seasons and time of year.
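The stepped wedge layout described above can be made concrete with a small scheduling sketch. This is only an illustration under stated assumptions: the function names `stepped_wedge_schedule` and `concurrent_contrasts` are hypothetical, one site is assumed to cross over per period, and the number of periods is assumed to be the number of sites plus one. A real trial would also randomize the order in which sites cross over.

```python
def stepped_wedge_schedule(n_sites):
    """Build a stepped wedge schedule: one row per site, one column per period.

    Hypothetical helper for illustration. Site k (0-based) stays in the
    control phase for periods 0..k and is in the singing (experimental)
    phase from period k + 1 onwards, so no site is denied the intervention.
    """
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    n_periods = n_sites + 1  # one all-control period, then one crossover per period
    return [
        ["singing" if period > site else "control" for period in range(n_periods)]
        for site in range(n_sites)
    ]


def concurrent_contrasts(schedule, period):
    """Return (singing_sites, control_sites) observed in the same period."""
    singing = [s for s, row in enumerate(schedule) if row[period] == "singing"]
    control = [s for s, row in enumerate(schedule) if row[period] == "control"]
    return singing, control


if __name__ == "__main__":
    schedule = stepped_wedge_schedule(3)
    for site, row in enumerate(schedule):
        print(f"site {site}: " + " | ".join(row))
    # Sites singing vs. sites still in control during period 1
    print(concurrent_contrasts(schedule, period=1))
```

Each row supports the within-site comparison (control phase versus experimental phase), while `concurrent_contrasts` lists the sites that can be compared in the same calendar period, which is how the design controls for season and time of year.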
What is a Suitable Control if Randomization is Not Feasible?
Including an active control condition is optimal—i.e., where the participants in the control condition receive an intervention that is similar to group singing but lacks the active ingredients of interest (see, for example, Maury & Rickard, 2018; Särkämö et al., 2014). Where a no-treatment control condition is included, other forms of creative, social or community engagement (similar ingredients) can be assessed in questionnaires and controlled for in the analysis. Suggestions for recruiting participants in a no-treatment control condition included disseminating online questionnaires using social media and word of mouth. For a matched control sample, researchers could request each member of the singing group to invite an age- and gender-matched friend (who is not joining the group) to complete the assessments; potentially offering them an incentive for their participation. Some research included a comparison singing group, conducted by the same director, whose members did not share a characteristic of interest (e.g., Dingle, Williams, Jetten, & Welch, 2017).
How Have you Achieved Blinding of Assessment and Analysis?
Clearly, it is not possible for singing group participants to be blind to their study condition, so only single-blind designs are feasible in this field (e.g., blinding outcome assessors). Single blinding has been achieved in quantitative research (e.g., Coulton et al., 2015) and qualitative research by including coders of interview transcripts who were independent of the choir project (e.g., Dingle et al., 2013). Others have used videotaped or audiotaped recordings of rehearsals and engaged researchers who are blind to the study questions to code specific instances of behaviors of interest (e.g., Tarrant et al., 2018). With biological samples, assay analysis is unlikely to be affected by knowledge of intervention and is often done externally.
How do you Increase Sample Size and Prevent Attrition in a Longitudinal Study?
Sample size should be guided by the power required to detect the expected effect size in the primary analyses of interest. Recruitment strategies include giving talks and presentations to potential participants to describe how fun it is to sing in a group and to highlight the possible benefits of singing for their health beyond the rehearsals, for instance, better posture and breathing (e.g., McNaughton et al., 2017). Singing “taster sessions” and performances to show how easy it is to get involved are helpful recruitment strategies. Using many forms of recruitment, such as social media, word of mouth, email lists, newspapers/magazines, marketing fliers (distributed widely, in the community and hospitals), and talks in the community and at support groups, helps to raise the profile of the singing program. Once the singing group is established, members can be encouraged to recruit others. Although attrition is an issue (as in most longitudinal studies), the authors stated that they had not experienced difficulties contacting participants who had discontinued their participation for follow-up assessments. It is recommended that researchers make the aims clear at the beginning of the project and seek consent to contact participants even if they have dropped out of the singing group.
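As a rough illustration of the power-based guidance at the start of this subsection, the sketch below estimates the number of participants per group for a simple two-group comparison of means and then inflates that number for expected attrition. It is a minimal sketch, not a substitute for a statistician's calculation: it uses the standard normal approximation, the function names are hypothetical, and the effect size (d = 0.5), alpha (.05), power (.80), and attrition rate (20%) are illustrative assumptions rather than values from the workshop. It also ignores clustering of singers within choirs, which would increase the required sample.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants needed per group for a two-sided, two-group comparison of means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized mean difference (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def inflate_for_attrition(n, attrition_rate):
    """Number to recruit so that about n participants remain after dropout."""
    if not 0 <= attrition_rate < 1:
        raise ValueError("attrition_rate must be in [0, 1)")
    return math.ceil(n / (1 - attrition_rate))


if __name__ == "__main__":
    needed = n_per_group(effect_size=0.5)  # illustrative medium effect
    to_recruit = inflate_for_attrition(needed, attrition_rate=0.20)
    print(f"analyze {needed} per group; recruit {to_recruit} per group")
```

Smaller expected effects raise the required sample steeply because the effect size enters the formula squared, which is one reason small uncontrolled studies struggle to detect modest benefits.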
What is the Optimal Timing of Assessments in Longitudinal Studies?
Longitudinal studies have adopted a range of durations and intervals between assessments. In order to analyze rates of change in key variables during the intervention and afterwards, a minimum of three time points is recommended: before the intervention (baseline), immediately after the intervention, and a longer-term follow-up of 3 to 6 months (e.g., Särkämö et al., 2014). In reality, this can be challenging to implement. For instance, people who join another singing group after completing the singing for the study would be expected to show further improvements at longer-term follow-up compared with those who stopped at the end of the researched group. In some contexts, such as a singing group in a hospital ward, participants may be referred in and discharged at varying times and will be more challenging to follow up if they have moved outside of the geographical area in which the study took place. Recommendation 3: Randomization is preferred but where not feasible, researchers should include an appropriate control or comparison sample in their design. Recommendation 4: To achieve adequate power in the main analyses, future quantitative group singing research should recruit sample sizes large enough to detect the predicted effects. In longitudinal designs, consent should be sought to contact participants for follow-up even if they have withdrawn from the singing program. Recommendation 5: Longitudinal studies should ideally include at least three assessment points and adopt (single) blinding of assessors.
Measures Used in Research on Group Singing
In preparation for the workshop, the authors contributed a measure that they had used in group singing research and a critical review of its use. These measures are considered below in the following categories: biomarkers, self-report measures, experience sampling methods, and cognitive/neuropsychological measures.
Biomarkers
Biological measures are desirable for exploring biological processes underlying the health benefits of group singing. Biomarkers, such as stress hormones and immune system proteins, are analyzed through blood, saliva, urine, or hair samples. The timing of assessments in relation to the start of singing activities is important as there is a time delay to peak levels of biomarkers such as cortisol (10 to 30 min; see Bozovic, Racic, & Ivkovic, 2013) and oxytocin (around 15 min; see Seltzer, Ziegler, & Pollak, 2010). For a detailed overview, please refer to the chapter by Theorell (2014). While the authors agreed on the importance of considering biomarkers, there was uncertainty about which measures are most appropriate and reliable, given inconsistent biomarker methods and results across group singing studies to date. One commonly researched biomarker is the hormone cortisol, which is a well-established measure of stress response in relation to hypothalamic–pituitary–adrenal axis (HPA) activity. Decreased salivary cortisol has been found in low-stress singing conditions (such as rehearsals), while high-stress conditions (such as performances) have been connected with increased cortisol levels (Beck, Cesario, Yousefi, & Enamoto, 2000; Fancourt et al., 2015). Short-term group singing has shown reductions in cortisol in cancer patients, carers, and bereaved carers (Fancourt et al., 2016), and in mothers with postnatal depression symptoms, although in the latter study the reduction was not reflected in cortisone (also involved in the stress response) (Fancourt & Perkins, 2018). This research indicates that singing may affect us biologically by modulating the stress response through reductions in cortisol, although this has not been shown across all biomarkers.
Mirroring this, two studies found no difference in salivary alpha-amylase (an indicator of stress-related changes in the autonomic nervous system) between choir singing and a control condition (Bullack, Gass, Nater, & Kreutz, 2018; Sanal & Gorsev, 2014). Furthermore, mixed findings have been reported about blood plasma adrenocorticotropic hormone (ACTH, a measure of stress response) measured during singing of pre-composed music and improvisation (Keeler et al., 2015).
Regarding other types of biomarkers, three blood plasma endocannabinoids (associated with euphoric feelings from exercise) showed increases after 30 minutes of choir singing in healthy females, whereas only one type of endocannabinoid (OEA) increased in the same participants following 30 minutes of cycling exercise or reading, and no changes were found after 30 minutes of dancing (Stone et al., 2018). Kreutz (2014) found salivary oxytocin (an indicator of bonding and attachment, with a role in stress) increased significantly in 21 participants after 30 minutes of singing but not after 30 minutes of chatting together. In contrast, Fancourt et al. (2016) reported that salivary levels of oxytocin decreased during group singing in the cancer choir, mirroring another study where reductions in salivary oxytocin were seen after group singing, alongside reductions in cortisol, suggestive of its role in stress response rather than social interactions (Schladt et al., 2017). Fancourt et al. (2016) also found significant increases in the cytokines (immune system messengers) GM-CSF, IL17, IL2, IL4, TNFα, sIL-2rα and sTNFr1 after singing, suggesting an activation of the immune system and reduction in inflammation.
Some of the neuropeptides of interest to choir researchers (e.g., beta-endorphin, oxytocin) cannot cross the blood–brain barrier; therefore, measuring them in blood or saliva is not likely to give an accurate understanding of levels in the central nervous system (Carson et al., 2015; Kagerbauer et al., 2013; Valstad et al., 2017). Proxy indicators for the release of endorphins can be considered, such as pain resilience measured by the level of pressure that participants can withstand using a blood pressure cuff (e.g., Weinstein et al., 2016), or the duration that participants can sit against a wall without a chair (e.g., Sanfilippo, Pearce, Stewart, & Launay, 2016). Although there are disputes over whether saliva or blood should be chosen for biomarker testing, saliva has additional benefits in that it is non-invasive, does not require a medical professional (sampling can be done by participants themselves), and can be collected from multiple people at the same time.
Overall, there is preliminary evidence to suggest that singing influences health through modulation of biomarkers associated with stress and through the immune system. However, results are inconsistent across studies, and biomarker levels do not always converge with other measures. In light of this, and due to the costs incurred by biomarker analysis, future research should carefully consider the timing of sampling following the intervention. Due to lag times of biomarker production, the use of multiple sampling points is optimal. Considering the different functions of biomarkers and the interactions between them, it is recommended to assess more than one biomarker and to analyze relationships among biomarkers, self-report, and physiological measures. Follow-up measures and longitudinal research are also encouraged to assess how long effects last and whether effects accumulate (Fancourt et al., 2014; Finn & Fancourt, 2018). Recommendation 6: Given the inconsistent relationships between group singing and levels of biomarkers, and the fact that biomarker research is costly, researchers seeking to include biomarkers should collaborate with an endocrinologist, immunologist, or other biological scientist to ensure that appropriate measures and methodologies are used.
Self-Report Measures
A variety of self-report measures have been used in choir research, with mood in longitudinal studies (or emotional states in experimental studies), well-being, and social connectedness the most commonly measured constructs. Mood symptom measures include the Perceived Stress Scale (Cohen, Kamarck, & Mermelstein, 1983), a four-item measure of the degree to which individuals appraise situations in their lives as stressful; and the Kessler-6, which measures anxiety and depression symptoms over the past 30 days (Kessler et al., 2002). Depression and anxiety symptoms have been measured in hospital samples using the Hospital Anxiety and Depression Scale (Zigmond & Snaith, 1983), a 14-item scale designed to assess mood disturbance while avoiding somatic symptoms that may be due to either a medical condition or a mood disorder (e.g., Fancourt et al., 2016; McNaughton et al., 2017). Aligned to depression and anxiety, loneliness has been measured in samples prone to social isolation such as community dwelling older adults (Johnson et al., 2018). Loneliness may be measured using brief scales such as the three-item loneliness scale (e.g., Hughes, Waite, Hawkley, & Cacioppo, 2004); the Roberts UCLA loneliness scale (RULS-8; Roberts, Lewinsohn, & Seeley, 1993); and a subscale of the NIH Toolbox for the Assessment of Neurological and Behavioral Function (Hodes, Insel, & Landis, 2013).
Numerous measures of psychological well-being have been used, including the General Health Questionnaire (GHQ-12; Goldberg & Williams, 1988), the health-related quality of life measure EQ-5D (Group EuroQuoL, 1990), the 14-item Warwick-Edinburgh Mental Wellbeing Scale (Tennant et al., 2007) and the World Health Organization—Five Well-Being Index (WHO-5; WHO, 1998). Bohnke and Croudace (2016) explored the GHQ, WEMWBS and EQ-5D using multidimensional item response theory and found that a two-factor model provided the best account of the data. Further, they showed that the GHQ-12 and WEMWBS items assess mainly the same construct: a general factor that is central to people’s conceptions of well-being (Bohnke & Croudace, 2016). Quality of life has been measured using the SF-12 (Ware, Kosinski, & Keller, 1996) and the four-item Global Quality of Life subscale from the WHOQoL (“How would you rate your quality of life?”; “How satisfied are you with the quality of your life?”; “In general, how satisfied are you with your life?”; and “How satisfied are you with your health?”). Due to its global scope and nonspecific timeframe, this measure would be suited to longitudinal choir studies but not to single session or short-term (e.g., 8 weeks) longitudinal studies.
Beyond symptom measures, some group singing research has included measures of theoretical constructs that may help to explain how group singing works to bring about positive outcomes. Examples include social identification (with others in the singing group), flow, interpersonal emotion regulation, and tests of cognitive functioning. Numerous choir studies have assessed social connectedness among the participants. Relevant measures include the four-item group identification scale (e.g., “I identify with members of the choir” and “I feel strong ties with members of the choir”) adapted from Doosje, Ellemers, and Spears (1995) and the Social Connectedness Scale (Carroll, Bowera, & Muspratt, 2017). Others have used the single-item Inclusion of Ingroup in the Self measure (IIS: Tropp & Wright, 2001), which is an adaptation of Aron, Aron, and Smollan’s (1992) Inclusion of Other in Self Scale (IOS). The IIS is a pictorial measure that asks participants to state how socially close they feel to a group using images of circles which overlap to a greater or lesser extent (e.g., Weinstein et al., 2016), and is a useful way to assess group processes within choirs. Observational methods can afford a more detailed and dynamic understanding of group behaviors, including in situ assessments of group processes as they occur during singing sessions. One group of researchers (Tarrant et al., 2018) video recorded singing group sessions and trained independent coders to rate the degree of group cohesiveness using scales including the Cohesion in Group Psychotherapy measure (Budman et al., 1987).
Emotion regulation has been measured in various ways, such as studies of single sessions of choir singing with repeated assessments using the Positive and Negative Affect Schedule (PANAS; Watson & Clark, 1988); see, for example, Dingle, Williams, Jetten, and Welch (2017) and Weinstein, Launay, Pearce, Dunbar, and Stewart (2016). The Self-Assessment Manikin (Bradley & Lang, 1994) may be useful for studies where low rates of literacy are a consideration (e.g., in a non-WEIRD context). This is a picture-oriented questionnaire developed to measure an emotional response on three dimensions of valence, arousal, and dominance. Interpersonal emotion regulation has also been measured using the Emotion Regulation of Others and Self scale (Niven, Totterdell, Stride, & Holman, 2011).
Experience Sampling Methods
One way of capturing group processes during a rehearsal or across a program is ecological momentary assessment, or experience sampling methodology, in which participants are alerted at random occasions during waking hours and asked to complete a brief online survey or diary entry (Csikszentmihalyi & Larson, 2014; Greasley & Lamont, 2011; Randall & Rickard, 2013). Flow, or optimal experience, is characterized by complete absorption in what one does with no spare attention being available for anything else (Csikszentmihalyi, 1990). Flow can be measured at the end of each singing rehearsal using the Flow Questionnaire (Csikszentmihalyi & Csikszentmihalyi, 1988; Delle Fave & Massimini, 1988).
Cognitive/Neuropsychological Measures
Finally, singing projects designed to support cognitive health in older adults have adopted neuropsychological tests such as the Mini-Mental State Examination (MMSE; Folstein, Folstein, & McHugh, 1975) and the Addenbrooke’s Cognitive Examination (ACE-III; Mathuranath, Nestor, Berrios, Rakowicz, & Hodges, 2000), which has three equivalent forms that can be alternated in repeated-measures designs to avoid practice effects.
The authors considered whether a common set of self-report measures could be used across multiple choir studies for the purposes of benchmarking or pooling for analysis. Researchers could supplement these with other measures specific to the sample and research questions of each project. Based on the criteria of maximum coverage and validity and minimum burden on participants, we propose the following set of measures for this benchmarking set: the Kessler-6 for anxiety and depression; the WHO-5 for well-being; the four-item measure of social identification with the singing group (for belonging/identification); and the EQ-5D quality of life measure, from which a health economic evaluation can be derived. This set of 21 items would take respondents only a few minutes to complete. Recommendation 7: Researchers should consider the psychometric properties of the self-report measures they use, and if they want to compare their sample descriptive statistics against a benchmarking set, they could include the measures suggested above.
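To show how little processing such a benchmarking set would require, here is a hedged sketch of scoring two of the proposed measures. It assumes the commonly used scoring conventions: Kessler-6 items rated 0 to 4 and summed (range 0 to 24), and WHO-5 items rated 0 to 5, summed, and multiplied by 4 to give a 0 to 100 score. The function names are hypothetical, and researchers should confirm item coding against the official scoring instructions for the version and translation they administer.

```python
def _check(responses, n_items, max_rating):
    """Validate that a response set is complete and within range."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(responses)}")
    if any(r not in range(max_rating + 1) for r in responses):
        raise ValueError(f"ratings must be integers from 0 to {max_rating}")


def score_k6(responses):
    """Kessler-6 total: six items rated 0-4, summed (0-24).
    Higher totals indicate more psychological distress."""
    _check(responses, n_items=6, max_rating=4)
    return sum(responses)


def score_who5(responses):
    """WHO-5 percentage score: five items rated 0-5, summed and
    multiplied by 4 (0-100). Higher scores indicate better well-being."""
    _check(responses, n_items=5, max_rating=5)
    return sum(responses) * 4


# Hypothetical responses from one participant at one time point.
print(score_k6([0, 1, 2, 3, 4, 0]), score_who5([3, 2, 4, 3, 3]))
```

Rejecting incomplete response sets, rather than silently summing them, keeps pooled descriptive statistics comparable across projects; how to handle missing items is a separate decision that each study would need to document.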
Qualitative Methods
The authors described the advantages and procedures for several methods of qualitative research with choirs. For example, a World Café approach (Brown, Homer, & Isaacs, 2007) can be conducted immediately following a choir rehearsal and allows participants to work in small groups with rotating members to discuss specific questions. The conversation can be recorded for later transcription and analysis and, in addition, artefacts can be collected, such as drawings and notes on paper tablecloths and photos from the session (e.g., Lamont et al., 2018). Individual interviews with qualitative analysis are less prone to the influence of group dynamics; however, they are time-consuming, with each taking around 30 minutes or more (Williams, Dingle, Calligeros, et al., 2019).
The Sing Yourself Better project included two questions as part of an international online survey: “Are there ways in which you think participating in the choir is good for your health?—If yes please describe”; and “Please add any comments about the benefits of being in a choir” (Moss, Lynch, & O’Donoghue, 2018). Researchers collated the comments and analyzed them using thematic analysis (Braun & Clarke, 2012). Similarly, participants in large-scale choir projects have been asked to provide written feedback about the effects of their involvement in group singing at several time points (e.g., Clift & Morrison, 2011). Regardless of the method of data collection used, there are published guidelines for the quality reporting of qualitative research, such as that published by the Qualitative Methods in Psychology working group for the UK Research Excellence Framework (QMiP REF working group, 2018) and the COREQ 32-item checklist published by Tong, Sainsbury, and Craig (2007). Recommendation 8: Recognizing the importance of using qualitative methods alongside quantitative methods to explore mechanisms of effect, the authors recommend taking advantage of existing guidelines for the conduct and reporting of qualitative research, such as the QMiP report and the COREQ.
Limitations
The workshop and this article based on the proceedings focused predominantly on health and well-being benefits, with little discussion of the potential risks and downsides of group singing. Kreutz and Brunger (2012) analyzed responses from a large sample of longstanding members of choral societies that revealed that there can be negative experiences related to the conductor (50.0% of respondents reported this), fellow choristers (38.1%), and performance aspects (13.6%) of group singing. Their results suggest that social problems as well as conflicting aesthetic goals feature in negative experiences associated with amateur choral singing. Moreover, a large-scale international survey of choir singers found a small number of negative issues raised such as physical stress (throat hurting after singing), a lack of fit with the group you sing with, and issues associated with the skills of the musical director (Moss et al., 2018). The research also indicated that how the choir manages poor performance and lack of confidence is important in contributing to well-being and health benefits. Clearly, there is a need for a balanced understanding of the relationships between group singing and health and well-being that includes both the benefits and costs to singers.
Conclusions
Current research evidence suggests that singing in a choir or group has a number of health and well-being benefits; however, we need to know more about the negative physical (voice) and psychological (social problems) experiences associated with group singing. We also need more research about the societal, educational, and political dimensions. The majority of published studies on group singing have focused on middle-class amateur or professional singers, who are not representative of the general population. To understand better how and why singing works we need more research testing theoretical models and adopting robust methodologies. The ideas recorded here emphasize the importance of interdisciplinary collaboration. We agreed it is important to ensure that singing group leaders are given a voice along with the participants’ views, to obtain input from those “on the ground.” The current article outlines a number of recommendations for future singing studies. Whilst these issues have arisen from group singing research, they have potential relevance more broadly to music researchers and those from other disciplines.
Footnotes
Acknowledgements
The authors would like to thank the Royal Society for Public Health, UK and the Sidney de Haan Research Centre for Arts and Health, Canterbury Christ Church University for supporting the workshop. Dr. Dingle is grateful to the Faculty of Health and Behavioural Sciences at The University of Queensland, Australia, for supporting her involvement in this workshop as part of her Special Studies Program.
Contributorship
GD and SC conceived the study and organized the research meeting. GD wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Peer Review
Gunter Kreutz, Carl von Ossietzky University, Music Department.
Fiona Costa, University of Roehampton, Education Department.
