Abstract
This article investigates the multiple values of audio description (AD) across an increasingly discerning, broad and multi-platform audience of video consumers. While other accessibility features, such as closed captions, are an established aspect of accessible video consumption, AD has more recently emerged as a socially and culturally significant feature for audiences, both with and without vision-based disabilities. This article offers a review of historical accounts of AD and current discussions around both the quality and provision of AD for video. This discussion is presented alongside the findings from our three-way review of the accessibility of the video on demand landscape in Australia. We identify that AD is at a critical juncture, popularised by the rise in audio content and audience demands for personalised viewing options, thus becoming a mainstream entertainment issue as well as an accessibility issue.
Introduction
In 1917, the blind and low-vision soldiers of St Dunstan's Hospital in London were taken to a screening of Herbert G. Ponting's film With Captain Scott in the Antarctic at the Philharmonic Hall. Filmed in sub-zero temperatures using a camera that had to be cranked by hand, the documentary brought together two years’ worth of rare expedition footage. Billed as a ‘Complete Cinematographic Diary […] of the Greatest Adventure of Modern Times’, Ponting's ‘moving picture lecture’ caused a sensation upon its release (Bean, 2019). Long before the Discovery Channel and easy access to online documentaries, it was the first time a professional photographer had visited the polar region and returned with film (Lynch, 1989). The imagery was unique and extreme – a visual treat. Blind and low-vision soldiers from St Dunstan's were able to participate in this cultural moment thanks to the descriptive efforts of Lady Eleanor Waterlow who had apparently honed her skills by providing the same group of men with AD for live theatre performances, although they proclaimed her cinema debut was ‘the most interesting of all’ (Audio Description in Australia, n.d.).
For an early pioneer of the craft, there were no existing guidelines, set standards, or rules to follow; instead, it was a process of trial and error to discover what worked and what did not. As reported in a 1917 issue of the National Institute for the Blind's journal The Beacon (cited in Audio Description in Australia, n.d.), Lady Waterlow had a ‘happy way of creating mental pictures by flashes of suggestive description interjected at appropriate moments’. As such, although they could not see the film, soldiers were provided with ‘a series of lively impressions of the scenes and incidents represented on the screen’ (Audio Description in Australia, n.d.). Without her description of the visuals, they would have been unable to fully access a key moment in media history. As it was, they left the hall with a strong impression of the content and plenty of material to discuss with others (Audio Description in Australia, n.d.).
The screening at the Philharmonic is one of the earliest recorded instances of live audio description (AD) for a blind or low-vision audience (Fryer, 2016). It is, therefore, a useful place to begin our paper because it illustrates not only the social and cultural values of such a service, but also the importance of quality. Indeed, Lady Waterlow's AD was widely acclaimed as a success due to her level of quality detail: When the lecturer referred to the wonders of the Antarctic – the Great Ice Barrier, the glow of the midnight sun, the absorbing animal life, and the heroic personnel of the fated expedition – the listeners were not left in any doubt as to the appearences [sic] recorded, for Lady Waterlow was ready with the vivid words that tersely emphasised what was salient and characteristic, and stimulated the imagination to fill up the picture (Audio Description in Australia, n.d.).
Over a century later, both AD and the visual media it accompanies have evolved significantly, but the issue of quality remains pertinent. Indeed, in this article, we argue that the provision of high-quality AD is more important than ever before. As Louise Fryer points out, if defined as ‘a sighted person painting a verbal picture for a friend or family member with impaired vision’, AD is a practice that can be traced back throughout human history (2016: 15). However, the advent of digital technologies has rapidly transformed the context in which this practice takes place; over the past decade, AD has become increasingly essential as video content has risen to dominate digital culture and communication. Consumer research correctly predicted that by 2022 over 82% of all internet traffic would be shared in video form (Sarika, 2021). Contemporary AD can therefore be defined as ‘a track of narration included between the lines of dialogue which describes important visual elements of a television show, movie or performance’ (Ellis et al., 2018: 7).
For those who are blind or have low vision, accessing this content can present a serious challenge – there is now a widespread need for AD to be provided in an organised way to large audiences across multiple platforms. This article begins by considering the provision and accessibility of AD in relation to a more established assistive feature, closed captions. It then outlines and interrogates the rising profile of AD as an emerging feature of video on demand (VOD) both internationally and within Australia, and reviews audience perspectives via a recent audience survey. We consider how the role and purpose of AD are being negotiated, in conjunction with other audio-based media such as audiobooks. We specifically examine the ways in which the quality of AD is being discussed, mirroring similar historical lineages of other accessibility features such as closed captions.
Historical comparison with closed captions
While AD is relatively new to the Australian television landscape, captions have a longer history and are more familiar to broad audiences. Captions are defined as the display of audio as text, usually at the bottom of the screen. Captions as an accessibility feature were designed to make television more accessible to people who are D/deaf or hard of hearing. Despite their origins in accessibility, captions have now become a mainstream feature of digital entertainment and, in particular, VOD (Ellis, 2019). Quality captions in particular have become increasingly expected by diverse television audiences. However, broadcasters, legislators and audiences have historically resisted captions for the following broad reasons, namely that:
Captions were thought to have a limited audience.
Captions were an unnecessary cost.
Caption technology would be better in the future and audiences should ‘wait’.
People should accept whatever they could get; there was no need for standards.
It has been argued that the current provision of AD is following a similar pathway to that of captions some 20 years previously, particularly in relation to public reception. In this section, we identify four intersecting factors that influenced the availability of captions worldwide which, we argue, are being repeated in the current AD debate – advocacy, technological change, legislation and potential audience reach.
In 1915 motion pictures were described as ‘nearer to the realisation of hearing for the deaf than anything else’ by Alexander Pach, a columnist for the Deaf community newspaper The Silent Worker (as cited in Schuchman, 1984). Yet, with the introduction of sound some 16 years later, D/deaf audiences were excluded; they, therefore, began advocating for spoken words to be somehow ‘[thrown] directly under the screen as well as being spoken’ (as cited in Downey, 2008) to avoid the ‘calamity’ of being excluded (as cited in Johnson, 2017) by sound which had now been added to visual media.
As Katie Ellis and Gerard Goggin explain in their history of captions in the book Disability and the media (2015), the introduction of captions in the US versus the UK followed different trajectories. While in the US, captions became available as a result of civil rights advocacy and activism, in the UK their provision was the result of legislation and incentives to purchase televisions that allowed for the display of captions (Ellis and Goggin, 2015).
All of this took time; it was not until the 1950s that captions, as we understand them today, began to be developed by teachers at a number of US schools for the Deaf, and it was not until the 1970s that activists in the US started to campaign extensively for the introduction of open captions on public broadcast television (Downey, 2008). Advocates in the US drew heavily on available legislation such as the 1973 Rehabilitation Act and the 1934 Communications Act in making their argument that D/deaf and hard of hearing audiences had a right to access captions.
During the 1980s, technological advances made closed captions possible, that is, captions displayed only when the viewer wanted them. However, audiences needed a decoder attached to their television to display the captions and, at the time, these were expensive. This led to a Catch-22 situation – broadcasters would not make captioning available until the audience share for them increased, yet audiences would not purchase a decoder until more captioning became available (Downey, 2007; Youngblood, 2013).
Continued US consumer advocacy with a focus on legislative change alongside these technological advances resulted in the passing of the Television Decoder Circuitry Act in 1990. This piece of legislation mandated that televisions with a screen greater than 13 inches be designed to display captions. With the technology in place to display closed captions in the home environment, and with similar legislation being widely accepted internationally, captions became more available.
However, the availability of captions has not remained constant with each new technological change such as the shift from analogue to digital television or during the introduction of VOD. In fact, advocates have had to reignite the argument for the introduction of captions at the emergence of each new form of audiovisual media. This is mainly because these new media broadcasters usually only offer the minimal legislated amount of captioning. As such, focusing purely on the legislative requirements for captioning as opposed to their innovative potential or their ability to retain broad audiences is problematic. While advocates have often highlighted the mainstream potential of captions (see Downey, 2008), it was not until the introduction of Facebook's autoplay function – when captions became a popular tool for social media audiences to use when turning on the sound would be inappropriate (Facebook Business, 2016) – that this potential demographic has been taken more seriously. Now, 85% of Facebook users watch videos with the sound off (Huxley, 2018). Furthermore, research shows that consumers are 12% more likely to engage with a video if captions are available (Facebook Business, 2016), thus making them a good business decision. This availability of captions in consumers’ social media and entertainment lives means that the zeitgeist has shifted around this accessibility feature; captions are increasingly being viewed as simply a part of everyday audiovisual media.
Alongside this increased availability of captions has been an associated expectation regarding their quality, in particular their accuracy, for both hearing and D/deaf audiences. D/deaf audiences acknowledge that during the 1980s they were happy with whatever captions they could get, at whatever quality (Newell, 1982). However, this is not the case today as audiences are willing to complain to regulators and even litigate. Alongside the growing emphasis on accuracy, there has also been a significant technological shift with the affordances of machine learning and the ability to create automatic captions using artificial intelligence (AI). In this context, the literature has focused on both accuracy and the reduced lag time in live captioning. While in 2016 Ericsson partnered with the BBC to create a new method of AI live captioning to reduce delays (Varshney, 2016), a comprehensive machine learning system had been proposed a decade earlier in order to leverage a wider audience for people engaging the captions track such as those speaking another language (Yuh and Seo, 2006).
The legislation regarding captioning is also changing. For example, legislation such as the US 21st Century Communications and Video Accessibility Act (2010) has had an international impact on the availability of captioning. This Act mandates that all content previously broadcast with captioning must be captioned within 15 days of broadcast (Federal Communications Commission, 2021). Following this legislation, captions are also required on 100% of new programming and 75% of pre-rule programming distributed by cable operators, broadcasters, satellite distributors and other multi-channel video-programmers. As a result, captions are widely available on platforms such as Netflix, Amazon Prime, Stan, Disney+, Apple TV+, Google Play, Binge and Tubi.
In Australia, captions are mandated under both the Broadcasting Services Act (BSA) (1992) and the Disability Discrimination Act (1992) but only for the primary digital broadcast channels, not for VOD or the digital multichannels. As such, while captions are largely available on VOD platforms, particularly international ones due to legislation such as the US 21st Century Communications and Video Accessibility Act (2010), there is no legislative requirement that they be made available when streamed in Australia.
Rising profile of AD as a feature in VOD services – our research
In February 2021, we undertook a three-way review of the accessibility of the VOD landscape in Australia, via an accessibility policy scoping study, a survey conducted through SurveyMonkey, and follow-up interviews. This project aimed to develop an understanding of ways in which people with (and without) disability engaged with VOD, and the impact, availability and use of accessibility features. The project design and survey questions were based on a similar study conducted in 2015, at a time when VOD had just emerged in the Australian media landscape. This was also a time in which broadcast television was increasingly captioned, subject to the BSA (1992), and was beginning to trial the use of other accessibility features such as AD.
The results of the 2015 survey found a limited and inconsistent approach to accessibility by VOD providers, and a frustrated audience of people with disability who found that a lack of quality and consistent accessibility features, challenging set-up, high cost and unreliable internet connections often excluded them from this new service. The provision of AD was absent from VOD services in 2015, aside from a limited introduction by Netflix. Over five years later, while barriers remained that diminished equal access to VOD for all people with disability, the accessibility of VOD had improved significantly.
In 2021, the scoping study of VOD accessibility features and policies found there was a general increase in the quality and consistency of closed caption provision, as well as the introduction of other accessibility features. However, despite a growing inclusion of accessibility features, in part influenced by US legislation and international advocacy effort, accessibility policies were still lacking and poorly communicated. AD was being offered by five of the 20 VOD providers but by none of the free-to-view, broadcast VOD services. This was despite multiple trials and the provision of some AD on one Australian television broadcaster.
The importance and value of VOD for people both with and without disability were clearly indicated in our 2021 SurveyMonkey results. A total of 267 individuals took part in the survey, with 93% using VOD. As we had utilised an advisory group for the purpose of this project, with representatives from both the disability community and the VOD industry, the survey was promoted through the email and social media networks of these organisations and members. We also invited (redacted) students who had a Disability Access and Inclusion Plan (DAIP) via a targeted email to individuals on that email list.
While this sample was a little lower than originally projected, it reflects a margin of error of plus or minus 6%, based on the National Statistics Service sample size calculator, data on the proportion of Australians accessing SVOD services (69.3% in June 2021) (Bureau of Communications, Arts and Regional Research, 2022) and ABS data on the percentage of Australians with a disability (17.7% in 2018). However, given that the sample was recruited primarily through snowball sampling rather than drawn as a true random sample of the population, this margin of error should be treated with caution.
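As a rough check on the margin of error reported above, the standard formula for a proportion can be applied to the sample of 267. The sketch below is illustrative only: it assumes a 95% confidence level (z = 1.96), the most conservative proportion (p = 0.5) and simple random sampling. None of these settings is stated in the text, the function name is hypothetical, and the National Statistics Service calculator may use different inputs.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumptions (not taken from the article): simple random sampling,
    a large population, z = 1.96 for 95% confidence, and p = 0.5 as
    the most conservative proportion.
    """
    return z * math.sqrt(p * (1 - p) / n)


# 267 is the number of survey respondents reported in the article.
moe = margin_of_error(267)
print(f"Margin of error: {moe * 100:.1f}%")
```

Under these assumptions the result is roughly plus or minus 6%, in line with the reported figure. Because the sample was recruited by snowball sampling, the formula's random-sampling assumption does not strictly hold, which is why the figure should be read cautiously.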
The largest group of respondents was aged 18–24 (38%), while 21% were 25–34, 12% were 35–44, 14% were 45–54, 9% were 55–64 and 6% were over 65. 31% of the respondents were male, 64% were female, 4% were non-binary, and 1% listed ‘Other’. The majority of respondents, 81%, lived in urban areas of Australia. In both regional and urban areas, participants noted they had a reliable internet connection, with only 5% stating they did not. However, these statistics did not always correlate with the comments provided in other parts of the survey, where better and faster internet was commonly cited in response to the question of what participants thought would improve access to VOD television. The cost of VOD was also commonly cited as an area that could be improved, or reduced, something which also reflected the household income of participants. The largest proportion of survey respondents, 28%, earned below $30,000 per annum, while 14% earned over $150,000.
We invited people to complete the survey to comment on the accessibility of VOD, and while we expected most participants to have a disability, almost half indicated they did not have a disability, but many still utilised accessibility features. Moreover, the availability of accessibility features was valued by respondents. While 79% stated that flexible viewing was a key reason for using VOD, and 73% cited content as another important feature, 37% used VOD for the availability of accessibility features.
A number of different aspects to accessibility to VOD services were surveyed, namely the respondents’ preferred service provider and device, their disclosed disability and preferred access feature, their ease of set-up of service and access to AD in particular. These are discussed further below.
Most participants used both free-to-air VOD and subscription services (63%), with Netflix being the most popular service – 88% of respondents watched Netflix, followed by ABC iview (58%), SBS On Demand (49%), Amazon Prime (42%), Disney+ (41%) and Stan (40%). Most participants watched multiple, often more than three, services. Moreover, AD had become more widely used and was a pertinent feature of interest for a broad spectrum of survey participants; of note, the most popular subscription services were also the most accessible, with Netflix, Amazon Prime, Disney+ and Stan all offering AD.
However, accessibility of content is not solely reliant on the provision of accessibility features, and the accessibility (and compatibility) of devices was shown to also significantly impact which device people used to watch VOD. Notably, the computer and the smartphone were listed as the most popular devices used, with 57% and 55% of respondents, respectively, choosing these devices. A total of 50% used smart televisions, and 39% utilised a streaming video player such as Chromecast. Again, most survey respondents did not use one device, but accessed VOD on multiple devices. However, the predominant device used also correlated to whether the participant had a disability and, if so, which ‘category’ of disability. For example, the smartphone was significantly more popular (65%) for people with a vision-based disability as this device is cited as being the most compatible with AD and for screen reader capabilities.
For those respondents who had a disability, most cited a vision-based disability (45%), a further 25% had a chronic illness, 22% had a hearing-based disability, and 18% had a mobility-based disability; 13% listed ASD (autism spectrum disorder), 4% an intellectual disability and 3% a head injury, and 19% listed ‘other’ disabilities, predominantly ADHD, sensorimotor disabilities and mental illness. This broad range of disabilities also reflected a range of accessibility features participants cited as making television watching easier for them. The most prominent was AD (45%), a feature which was rarely available five years previously, 44% listed closed captions, 35% used talking menus and 32% required screen reader support. 30% cited that more compatibility with assistive technologies would improve their capacity to watch television, as well as clean audio (29%), spoken subtitles (28%), large or colour-coded remote control keys (15%) and signing (3%). Additionally, 17% suggested other accessibility features would be useful, from ‘better quality’, adjustable or consistent captions, to more accessible, better-designed interfaces, to voice command remotes. While we identified that the accessibility of VOD has increased since our survey in 2015, when asked how available these features were on the VOD they used, almost half of all participants stated it was never, rarely or sometimes available.
Moreover, 30% of respondents with disability had problems finding accessible content, and the set-up and use of VOD were cited as either very difficult or difficult by almost one-quarter of participants (with 38% asking someone to help with set-up, predominantly family or friends). These results offer insights into the ways in which accessibility is layered; while an increase in the provision of features such as AD suggests VOD is becoming more accessible, and as detailed in our follow-up interviews, its use is impacted by device limitations and interface issues, technical literacy and inconsistencies between service providers.
What was a distinct finding of this study, and particularly in contrast to the previous 2015 project, was the increased use of accessibility features by people without disability, and in particular the emerging significance of AD, again for both people with (a range of) and without disability. Notably, in follow-up interviews with 12 participants, AD was a pertinent subject – it was used by all five with low vision or blindness, as well as by a participant without a disability. The quality, consistency and continuity of AD were often raised as a pertinent issue, as well as the general lack of this feature across most service providers and, for those that offered it, that it was only on limited content and not always easy to find. Reflections on the quantity and quality of AD varied – one participant reflected on the value of audio describers and the heightened quality of content that was developed and written with and alongside AD, rather than it being ‘tacked on’ at the end of the production process. This interviewee reflected on the artistic role of the audio describer, and the way accent, narrative skill and age/gender appropriateness were integral to the storyline. For some participants, particularly those who were blind or had low vision, the presence of AD completely determined whether or not they would use a service; they chose to watch Netflix or Disney+ over other providers because of the larger AD content base.
Arguably, international legislation and advocacy for AD (such as the Accessible Netflix Project championed by Robert Kingett in the US) have impacted the uptake of AD across multiple services. However, the rapid increase in AD content, and interest in and movements by national broadcasters to incorporate AD on both broadcast and VOD services, suggest that, as opposed to the trajectory of the provision of closed captions, AD has been introduced without legislative pressure and instead in response to a demonstrated demand by an increasingly savvy audience who want a personalised television experience that includes full accessibility.
Discussion
Background: the growth of AD
The increasing demand for quality AD observed during our more recent research coincides with a larger mainstream trend towards audio-based media and communication. Recently, and particularly during the COVID-19 pandemic, digital audio has become a more popular form of entertainment (Kelly, 2021). Podcasts, audiobooks, music streaming apps and radio dramas have surged in popularity alongside the development of other sonic tools such as audio-based social media apps and the trial of an ‘audio only’ mode on Netflix (Fischer, 2020; Vonau, 2020). This background is important to consider because it shapes public expectations and awareness of audio as a format, and has arguably contributed to greater audience literacy in the experience of consuming audio content.
Of course, the shift towards digital audio began prior to the pandemic as the widespread adoption of networked mobile devices and streaming platforms in the early 2000s allowed listeners to access content that was previously constrained to radio and analogue media devices (Marshall, 2015). In 2019, Richard Yao predicted that ‘the rise of smart speakers, connected cars, and “hearable” devices like AirPods will continue to drive this unbundling of radio content and fuel the growth of audio streaming services’ (Yao, 2019). The spread of COVID-19 has just accelerated this process; as the health crisis shut down public events, businesses and entertainment around the globe, the world of audio experienced a renaissance.
The audiobook industry offers a useful case study. Building on ‘eight straight years of double-digit revenue growth’, audiobooks took on an important role during the pandemic (Audio Publishers Association, 2020). Journalist Arianna Rebolini (2020) claimed audiobooks were ‘saving [her] sanity’ by providing a distraction from the busy monotony of lockdown, ‘I listen to books while I’m taking [my toddler] Theo for walks, while I’m watching him at the park, while I’m cooking, while I’m cleaning. It's multitasking that doesn’t feel like multitasking’.
This technology is not only beneficial for healthy people sheltering in place; listening also provides solace to those who are battling COVID-19. The Guardian writer Alex Preston (2020) turned to audiobooks when struck down by the virus, ‘One of the first things I did when I undertook my own Covid-19 journey in early April – three weeks of coughing and night-sweats in the spare room – was to draw the blinds and put on an audiobook’. This value is not incidental. After all, audiobooks were originally devised for people with disability. As Matthew Rubery notes in the introduction to Audiobooks, literature, and sound studies, ‘the history of the medium from the outset has been intimately connected to disability’ (2011: 2). When the phonograph was first invented in 1877 by Thomas Edison, the creation of ‘phonographic books’ for the blind was part of his initial impetus (Rubery, 2011: 3).
Even today, the power of audio to transcend physical distance, soothe anxiety and establish an important sense of connection remains clear; this has been particularly so during the pandemic as many around the world were forced to isolate. While North America was an early adopter of audiobooks at this time, the trend was not restricted to a specific area of the world. Business Wire reports that ‘the audiobook market in Europe is poised to grow by US$1.23 billion during 2021–2025’ (Business Wire, 2021). Globally, the Asia Pacific region is expected to contribute ‘a significant revenue share’ in the next few years, while Africa and the Middle East experience the fastest growth (Grand View Research, 2020). A tool originally intended to enhance media accessibility has since become a billion-dollar industry.
Other forms of audio drama, such as podcasts, have also become extremely popular. As of 2021, ‘the podcast industry is in a particular moment of creative and innovative renaissance’ (Insider Intelligence, 2021). The pandemic is certainly a contributing factor. Journalist Sigal Samuel (2020) explained that: These days, I prefer listening to looking. Staying at home means a lot of my work and social life happens on Zoom, and staring at a computer screen for so many hours each day feels draining. So when I want to give my eyes a rest and my emotions a boost, I go hunting for podcasts.
Whether this growth will continue as strongly beyond the pandemic is yet to be seen. But in New Zealand, which came out of lockdown much earlier than other nations, the audience remained consistent beyond the initial pandemic surge. As Henrik Isaksson, Managing Director of the Acast podcasting platform explains: Yes, audiences have grown exponentially [during COVID-19], but the interesting part is they’re still listening to the same amount when they’re not in lockdown. There was always uncertainty that podcast listenership would decline when we came out of lockdown, but that's certainly not the case (cited in McKenzie, 2021).
Early signs indicate that the exponential growth of audio is not temporary – it is potentially translating into an ongoing appreciation for digital audio formats and a preference for AD.
Quality AD: what is it and why is quality important?
This greater exposure to diverse audio content over the last few years has contributed to a growing audience demand for AD, and visual media industries are beginning to respond. As noted in our survey results, while AD was absent from VOD services in 2015, by 2021 it had become more widely used and was of interest to a wide variety of participants. As a result of this rapid increase in public awareness and consumption of audio content, the question of quality is also becoming more important across all audio-based industries, including in AD content. For instance, the quality of narration is cited as a key factor in audiobook consumption: [T]he narrator's voice and pronunciation are one of the critical factors impacting the market growth. While good narration elevates the demand for recordings, bad storytelling with improper breath and pitch control, inefficient characterization, and inappropriate articulation may result in loss of interest by the listener (Grand View Research, 2020).
Simply recording the sound of words spoken aloud is not enough. As Thomas Ling (2018) puts it, ‘the very best audiobook narrators don’t just read a novel – they perform it’. Laurence Howell, the Content Director for audiobook platform Audible, argues similarly that ‘narrating is its own art […] It's not easy’ (cited in Ling, 2018). It, therefore, makes sense that an audience attuned to audiobooks would have certain expectations when it comes to AD. Indeed, our research suggests a preference for emotive performances akin to audiobook recordings (Ellis et al., 2019).
However, adding AD to a visual text takes even more performative consideration than narrating an audiobook, because the soundtrack, pace, genre and characteristics of the original source must also be incorporated. As such, in an effort to limit costs and time, some companies and platforms have started using automated text-to-speech technologies to create synthetic AD tracks. A debate has since arisen regarding whether synthetic voice or human narration should be prioritised, and the differences between these modes of presenting AD. The discussion is shaped by a number of factors, including economics, the availability of AD, and whether the people consulted are blind/have low vision or not.
In March 2019, we published a study examining the landscape of AD in Australia, including a series of focus groups. Forty-four people participated in the focus groups, which were held in a secure, accessible online forum. They were divided into five specific groups – television fans, parents of young children, people with ASD, audiobook readers and people with vision impairment. While a significant portion of sighted participants did not know what AD was at the beginning of the focus group, once it was explained (a definition and explanation of AD was provided to all focus groups along with a variety of video examples), participants identified it as useful to sighted people in a wide variety of contexts. For example, participants noted that, due to an aging population, the mainstream need for AD is becoming more important. Participants also perceived a number of benefits of AD, in particular in relation to increasing the clarity and meaning of texts, and claimed that AD has the potential to generate job opportunities in the entertainment and software industries. Participants recognised that there were a number of barriers to accessing AD. However, all sighted participants argued that AD should be available on television, regardless of whether they used it themselves.
One of the major findings was that all participants in our study, both sighted and blind/low vision, believed the quality of AD was very important (Ellis et al., 2019). Presented with examples of synthetic voice versus human AD, sighted participants expressed a unanimous preference for the human voice. As one person explained, ‘I prefer the audio description with warmth and personality. The robotic voice can be jarring and doesn’t allow me to imagine what's going on as clearly’ (Ellis et al., 2019). Another sighted person argued that: The smoothness and natural inflexions of the human voice […] are far easy to listen to and the narrator feels like they’re describing the scene. The synthetic voice does not have that natural feel to it, the tone, pace and inflexions are [sic] don’t fit with the description and seem to be devoid of emotion (i.e. more like reading a list than describing a scene) (Ellis et al., 2019).
Participants who were blind or who had low vision also preferred human AD; however, they were more cautious in expressing this preference and more aware of the reasons why text-to-speech might be used: ‘I think it would be a shame if the audio description were made with synthetic voices as a cost-cutting measure since while describing the movie there is no emotion’ (Ellis et al., 2019). Another study comparing synthetic voice to human narration found that just 1% of blind and low vision participants preferred synthetic AD; however, the majority were willing to accept it if nothing else was available (Fernández-Torné and Matamala, 2015: 73). For a blind and low vision audience who depend on AD to consume visual media, the worst-case scenario is no AD rather than low-quality AD. As a result, this audience hesitates to wholly reject any alternative, no matter how limited. In contrast, sighted audiences appear more confident in defining exactly what they want. As one of our participants stated firmly, ‘I think it's essential to hear human emotion while hearing audio description’ (Ellis et al., 2019). This is confirmed by Fernández-Torné and Matamala, who conclude that ‘the preferential choice of blind and partially sighted persons is the audio description voiced by a human, rather than by a speech synthesis system’, but that text-to-speech was an ‘acceptable’ alternative (Fernández-Torné and Matamala, 2015: 74).
If we return to Lady Waterlow's live narration of With Captain Scott in the Antarctic, it is notable that her voice is not described as monotonous but as ‘a series of lively impressions’ (Audio Description in Australia, n.d.). These were not dry interjections, but ‘vivid words that tersely emphasised what was salient and characteristic, and stimulated the imagination to fill up the picture’ (Audio Description in Australia, n.d.). Over 100 years later, research suggests this form of dynamic human AD continues to be preferred by all. It is our work now to consider the challenges and ensure that the growth of this medium moves in the direction of quality content for everyone.
Conclusion
Lady Waterlow's live narration clearly illustrates the importance of quality AD for blind and low-vision audiences; however, understanding the accessibility preferences of a broad, complex VOD audience was also a central aim of this paper. Returning again to accessibility for audiences who are D/deaf or hard of hearing following the introduction of sound to the medium of film, we note that an article in The New York Times published in 1929 reported the reactions of a cinema audience of patrons who were either blind or hard of hearing and who had been provided with talking descriptions during a screening of the film Bulldog Drummond. While the hard of hearing were provided with sound magnifying apparatus, it was the audience members who could not see who were reported as most enjoying the film: Those without eyesight seemed to enjoy the performance, especially the humorous parts, and there was prolonged applause at the end of the film (The New York Times, 1929).
However, the article concluded with the assertion that no future provisions had been put in place for either group. Despite this gloomy conclusion, the article highlights the importance of universal design and making media accessible to the broadest potential audience. Captions, AD and magnified sound each provide different options to groups with different needs. Both during the pandemic and before, this ability to personalise the way we access media has become increasingly valued and expected.
However, the issue of quality is still one that needs to be addressed. With AD finally becoming available (though for limited content and only a select number of stations) on Australian broadcast television, anecdotal evidence suggests blind and low-vision audiences are reluctant to complain about quality, just like those accessing early attempts at captioning were. Yet at the same time, audio forms of entertainment and associated notions of quality became increasingly popular during the pandemic. Captions, likewise, became more popular during this time.
Media is changing, and to be most inclusive it must be accessible to the broadest potential audience base. It is critical that at this juncture the AD debate learns lessons from the provision of captioning, particularly in relation to advocacy, attracting a broader audience and improving expectations regarding accuracy and quality with technological change. It is important that quality audio (and captions) are recognised as a mainstream entertainment issue as well as an accessibility issue so that the fight for access does not begin again when technology inevitably changes. The reactions of Lady Waterlow's audience in 1917 and the audience of Bulldog Drummond in 1929 are not so different from those of audiences embracing audio, podcasts and Netflix's audio-only mode during the COVID-19 pandemic.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
