Abstract
Since the inception of recorded music there has been a need for standards and reliability across sound formats and listening environments. The role of the audio mastering engineer is prestigious and akin to a craft expert combining scientific knowledge, musical learning, manual precision and skill, and an awareness of cultural fashions and creative labour. With the advent of algorithms, big data and machine learning, loosely termed artificial intelligence in this creative sector, there is now the possibility of automating human audio mastering processes and radically disrupting mastering careers. The emergence of dedicated products and services in artificial intelligence-driven audio mastering poses profound questions for the future of the music industry, already having faced significant challenges due to the digitalization of music over the past decades. The research reports on qualitative and ethnographic inquiry with audio mastering engineers on the automation of their expertise and the potential for artificial intelligence to augment or replace aspects of their workflows. Investigating audio mastering engineers' awareness of artificial intelligence, the research probes the importance of criticality in their labour. The research identifies intuitive performance and critical listening as areas where human ingenuity and communication pose problems for simulation. Affective labour disrupts speculation of algorithmic domination by highlighting the pragmatic strategies available for humans to adapt and augment digital technologies.
Introduction
A.I. From the Heart. LANDR is smart and getting smarter. Developed over 8 years of research, LANDR uses A.I. and machine learning (think self-driving cars and Shazam), to replicate the processes human engineers make when mastering a track. (LANDR, 2018: no pagination)
In 2014, research on big data and machine learning from the Centre for Digital Music (C4DM) at Queen Mary University of London culminated in a Montreal-based startup company, Mixgenius, launching a product offering AI-enabled audio mastering: LANDR. The company adopts the term ‘AI’ for its system in both public descriptions of its processes and in branding and slogans, as in the above extract from the landing page of LANDR’s website. In the same fashion as other forms of AI, ranging from driverless cars to chess-playing supercomputers, LANDR combines algorithms (see Dourish, 2016) with machine learning and big data analytics to simulate human expertise in preparing audio in the form of music and sound creations for wider consumption. The term AI will henceforth be used in this paper to describe the assemblage (Aradau and Blanke, 2015) of these three elements: big data analytics about music trends, machine learning of mastering skills, and algorithms that apply signal processing to sound productions without human intervention.
The intent of the AI’s deployment in this case is to derive profit and efficiency by substituting for expert human labour, bringing costs below those of human engineers and performing mastering faster than humans can. LANDR encourages sound artists/creators to master their own music and, while the product’s rhetoric does not pitch it against engineers per se, the media has certainly speculated on the imminent replacement of humans by machines in audio mastering, since this AI purportedly offers more neutrality and fewer errors than humans (Bilić, 2016). What is noteworthy about this implementation of AI in audio mastering is that it creates an alternative ‘algorithmic culture’ that includes some humans – that is, sound creators – yet excludes others – that is, established human audio mastering engineers.
How does this particular form of AI work? Digital waveforms of audio undergo spectral and frequency analysis and are matched to averages from a large dataset of existing songs in order to determine mastering parameters so that the system is able to apply reasonable templates of signal processing without a human expert ever listening to the result. By utilizing accessible upload and download file-sharing the artist takes the role of quality control. LANDR evolves over time through self-learning processes involving the comparison of thousands of audio tracks alongside descriptions of engineers’ self-perceived processes versus the actual spectral and frequency changes resulting from their physical processing. Here the AI draws on user behaviour for its own education and in this sense mimics human learning and decision-making.
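The matching step described above can be illustrated with a minimal sketch. It is not LANDR's implementation, which is proprietary and not described in detail here; the function names, the two-band split, the reference profile and the gain limit are all hypothetical choices made for illustration. The sketch measures how much energy a track carries in each frequency band, compares that with an assumed dataset average, and proposes a bounded equalization gain per band.

```python
import cmath
import math


def band_energies(samples, sample_rate, band_edges):
    """Total spectral power per frequency band, using a naive DFT.

    A real system would use an FFT over many short windows; the naive
    transform keeps this sketch dependency-free.
    """
    n = len(samples)
    energies = [0.0] * (len(band_edges) - 1)
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        power = abs(coeff) ** 2 / n ** 2
        for b in range(len(band_edges) - 1):
            if band_edges[b] <= freq < band_edges[b + 1]:
                energies[b] += power
    return energies


def suggest_gains_db(track_energy, reference_energy, max_gain_db=6.0):
    """Per-band gain (dB) that moves the track towards the reference.

    The clamp to +/- max_gain_db is an assumed safety limit, not a
    documented parameter of any product.
    """
    gains = []
    for track, reference in zip(track_energy, reference_energy):
        if track <= 0 or reference <= 0:
            gains.append(0.0)
            continue
        gain = 10 * math.log10(reference / track)
        gains.append(max(-max_gain_db, min(max_gain_db, gain)))
    return gains


# Hypothetical pre-master: a strong 100 Hz tone and a weak 1 kHz tone.
SAMPLE_RATE = 8000
track = [
    math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
    + 0.1 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
    for i in range(400)
]
edges = [0, 500, 4000]              # two bands: below and above 500 Hz
genre_average = [0.25, 0.005]       # assumed dataset averages per band
gains = suggest_gains_db(band_energies(track, SAMPLE_RATE, edges), genre_average)
print(gains)
```

In this toy case the low band already matches the assumed average, so the sketch proposes a boost only for the high band. Nothing in the procedure listens to the result, which is the point the paragraph above makes about quality control passing to the artist.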
In future there is perhaps a likelihood that AI with deep neural networks and robotics will threaten to simulate the operations of audio mastering engineers’ cognitive and physical functions (Mimilakis et al., 2016). At present, there is still a need for human intervention and intelligence, in this case that of the sound creator, who replaces the audio mastering engineer and is augmented by the AI. In this paper, I aim to shed new light on how cultural industries (Drake, 2003) accommodate or reject such AI assemblages utilizing big data, machine learning and algorithms. Machine learning is perhaps the crucial ingredient in LANDR as it affords the system the ability to attain a degree of autonomy, radically reducing the cost of the processes through learning capacity (Mittelstadt et al., 2016). I foreground how human labourers problematize speculations of imminent job disruption and displacement through ‘algorithmic mediation’ (Mittelstadt et al., 2016). In opening up this avenue of inquiry, I offer a provocation against estimates of human labour receding and facing redundancy with the ‘rise of the robots’ (Ford, 2015). In particular, I establish how creative products result from the admixture of traditional elements with new through processes of ‘search and recombination’ wherein humans adapt to new technologies (Messeni Petruzzelli and Savino, 2015). While ‘delegating decision-making to algorithms can shift responsibility away from human decision-makers’ not all labourers, particularly those who take pride in their ‘craft’, are willing to forego this role (Mittelstadt et al., 2016: 13). Others are less certain about the future. In this paper, I use the emic term of ‘hybrid workflows’ to schematize how audio mastering engineers project a range of adjustments to their labour practices. 
These visions to compete with, and accommodate, AI present alternative futures to job replacement and redundancy and are timely given systems such as LANDR are already seizing a market share.
Algorithmic culture, affect and AI
The conceptual framework for this paper is critical data studies (Dalton et al., 2016; Iliadis and Russo, 2016), chiefly as a response to more technologically deterministic framings of the impact of AI on human labour (Kaplan, 2016). AI is understood in this paper to be a notional, indeed aspirational, phenomenon, rather than a specific instance of a technological breakthrough towards replicating, even replacing, human intelligence in the present. The paper explores ‘everyday anxieties’ about AI’s use of big data in a specific corner of the cultural industries (Leszczynski and Crampton, 2016). AI fits within a broader debate about algorithmic interaction with music listeners (Airoldi et al., 2016; Karakayali et al., 2018) and festival- (Carah and Angus, 2018) and museum-goers (Wilson-Barnao, 2017). Debate on data’s affective value (Cockayne, 2016) is also pertinent given that LANDR’s algorithms draw on databases of human users’ pre- and post-mastered content and contrast them with actual, human, subjective practices.
Big data are already making inroads into creative practice, such as art (Singer, 2016). With non-human robots and AI intervening in typically human undertakings, such as sex (Cockayne et al., 2017), the appearance of an assemblage of big data, algorithms and machine learning in traditionally human professions is a further incursion on the sanctity of white collar work; future labour disruption appears irresolvable (Susskind and Susskind, 2015). At present, there is still a human profession of audio mastering and close connections between engineers and artists/creators persist, although the career path has faced some profound disruptions in the twenty-first century due to the digitalization of the music industry. In the late 1960s with the spread of home stereo music players, recording studios and music scenes across the globe, specialist studios arose dedicated to mastering and these were until recently the market dominators, such as Sterling Sound in New York, Gateway Studios in Portland, Maine, and Abbey Road Studios in London (Leyshon, 2009: 1318). As the technology changed over the 1980s and various technical skills became industry standards independent ‘freelancers’ began to dominate the market. In the 1990s, the term ‘mastering’ entered popular awareness with a spate of digital remasters on CD, partly as a response to the bootlegging industry: the sale of unmastered (poor quality) live recordings on CD, a trend consequently quashed by Internet file-sharing of MP3s (Melton, 2014). Nowadays audio mastering engineers are either tenants or partners (rather than strictly employees) in a handful of mastering houses or freelancers overwhelmingly utilizing the Internet to locate and communicate with clients. Despite their online presence they also maintain close links with creative communities and infrastructures, invariably in urban cores, where they utilize social networks and word of mouth for ongoing work. 
Reputation is built on the quality of their craft in a given scene (Gibson, 1999) and recording studios continue to be hotbeds of creativity (Gibson, 2005). The mastering engineer nowadays holds a degree of decision-making and since every audio production is unique, they adopt an ad hoc approach to reaching an acceptable standard of sound involving bricolage, experimentation and improvisation (Jencks and Silver, 2013). Developing a ‘feel for error’ is crucial for audio mastering engineers and this affective appreciation highlights the importance of intuition and feeling in an otherwise scientific, critical undertaking (Garnett, 2016).
The key research question informing this paper is: in what ways are audio mastering engineers prescient of emergent algorithmic cultures involving AI? In order to gauge the degree of innovation in audio mastering as a result of AI, I consider the simulation of expertise and the responses of experts in this field to efforts to replicate their skills and competence. In this article, I consider how AI unsettles creativity, leading to unpredictable windows of opportunity for entrepreneurial actors, yet certainly deposing some workers. In some instances, technologies are the root cause of human obsolescence and drive redundancies in occupations, skills and livelihoods. However, evidence also exists showing that in other cases they enable, and even revivify, forms of expertise and pose alternative business models and ways of performing creativity and labour. Interestingly, I show in this paper that for established practitioners, innovations such as AI – those that aim to simulate traditionally human labour in audio mastering – strengthen human skills of communication, performance and critical reflexivity, rather than distance them, or remove them entirely, from labour. To provide an empirical grounding to this topic, the paper reports on qualitative interviews conducted in 20 recording studios with audio mastering engineers in Australia.
The structure of the paper is as follows. The next section introduces the reader more fully to the art and craft of audio mastering, and the changing capacity of the mastering engineer in the music industry in light of recent digitalization trends. The third section describes the conceptual framework of algorithmic culture and the shifts in the culture of audio mastering from craft to creative labour. The fourth section describes the methodology and the fifth section outlines the findings through visions of collaboration between human expertise and machine labour. What emerges from this paper’s inquiry is that currently an algorithmic culture features in the introduction of AI into audio mastering that is inclusive for some humans, namely, artists and sound creators who become collaborators in mastering processes; however, this culture problematically excludes human audio mastering experts and pitches them as competitors rather than collaborators.
Audio mastering as affective labour
In this section, I discuss the affective and intuitive aspects of human audio mastering in order to provide a concrete and accessible account of how human labour is changing, or not, in response to AI. There is a felt, affective, emotional side to this labour alongside the routine and scientific aspects, since audio mastering involves engagement with human socio-cultural notions of noise as desirable or undesirable depending on the genre and taste of the listener (Klett and Gerber, 2014). After Atkinson, sound demarcates territory: the ‘sound of a neighbour’s music does not have to be loud, to compromise our sense of autonomy in the domestic setting’ (2007: 198). Listening to music affects the corporeal experience of homes (Duffy and Waitt, 2013) and mobilities to and from them, such as in car travel (Waitt et al., 2017). Sound also affects people’s sense of themselves and others (Doughty et al., 2016). In some instances, the material threshold of the eardrum can be altered detrimentally, a major concern in modern dance music, where the demand for affective ‘loudness’ and the corporeal experience of overwhelming volumes and frequencies causes ear conditions such as tinnitus (Ash, 2015). Building on this notion that sound is enmeshed in human emotions and placemaking, Marie Thompson (2017) unpacks the contemporary dialectic between an aesthetic moralism governing understandings of sound and its absence. She articulates how aesthetic moralism promotes a binary vision of the tranquil quietude of Western European pastoral landscapes in contrast to the unwanted noise ‘pollution’ of powered technologies, domestic animals, and the prosaic sounds of community life in dense built environments. Despite the legacy of this dialectic, contemporary artistic cultures reject aesthetic moralism in favour of complex sounds and compositions, in the process disputing essentialist ideas of noise and quietude.
Audio mastering engineers tend to be agnostic about the ethico-affective responses of listeners to sound in an effort to be objective about their craft; however, they are also cognizant of the need to deliver a product that conforms with the at times ‘aesthetically radical’ and subjective artistic and aesthetic expectations of listeners (Smith, 2005). A caveat about audio mastering engineers’ affective input is that their role precludes the ‘creation’ of new ideas or content. They describe themselves as picture ‘framers’ rather than painters. Human audio mastering engineers recognize that noise is both subject-oriented and object-oriented and split their role into two distinct parts. The first role is heavily routinized and involves engineers distancing themselves from the recording and examining it scientifically for errors and oversights. Here engineers utilize tools such as spectral analysers that display the waveform visually so that the mix can be examined and then corrected using remedial signal processors, such as multi-band compressors and equalizers, or editing software to virtually ‘splice’, reconnect and edit the signal’s waveform just as their predecessors did with magnetic tape and vinyl cutters in the mid-twentieth century.
The second role is far more creative and involves the engineer inspecting the recording in direct comparison to a similar artistic piece, drawn from the same genre or with the same instrumental elements, in order to match it to listeners’ expectations. The process of A/B referencing involves rapidly switching between two recordings (one mastered and one not) and critically listening to both, making incremental interventions to the signal with processors that shape the ‘power’ (via the root mean square, a measure of the waveform’s average level as distinct from its peaks) and the perceived volume of the mix as well as adding harmonic distortion: a barely audible corruption of the signal that makes the mix exciting but, if done incorrectly, detracts from the quality. In this stage there is scope for the engineer to add their own creative signature to the recording through the application of a discrete toolchain of signal processors, often unique to each engineer. The skill of critical listening (Prince and Shankar Kumar, 2012), the foreknowledge of individual tools and how they complement others, and the intuitive response to the sound’s variety of nuances all vary dramatically across audio mastering engineers, making some far more desirable to clients than others. At the pinnacle of this career are those audio mastering engineers who receive widespread fame. For instance, the GRAMMY Awards describe the mastering category as follows: ‘this person is an engineer who is the last creative bridge between the mix process and the distribution process’ (The Recording Academy, 2015: 2, my italics).
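The measurement behind the ‘power’ of a mix can be sketched briefly. The following is a minimal illustration, not a description of any engineer's toolchain: it computes the root mean square (RMS) level of two signals and the gain needed to level-match them, a common precaution in A/B referencing because the louder of two otherwise similar recordings tends to be judged as better. The function names and the test tones are assumptions made for the example.

```python
import math


def rms(samples):
    """Root mean square: the square root of the mean squared sample value."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_dbfs(level):
    """Convert a linear level (1.0 = digital full scale) to decibels."""
    return 20 * math.log10(level) if level > 0 else float("-inf")


def level_match_gain_db(reference, candidate):
    """Gain in dB to apply to `candidate` so its RMS equals the reference's."""
    return to_dbfs(rms(reference)) - to_dbfs(rms(candidate))


# Hypothetical A/B pair: the same tone at full and at half amplitude.
reference_master = [math.sin(2 * math.pi * 10 * i / 1000) for i in range(1000)]
unmastered_mix = [0.5 * s for s in reference_master]

print(round(to_dbfs(rms(reference_master)), 2))
print(round(level_match_gain_db(reference_master, unmastered_mix), 2))
```

A full-scale sine wave has an RMS level of about -3 dBFS, and halving its amplitude lowers the level by about 6 dB, so the sketch reports a gain of roughly 6 dB to bring the quieter signal up to the reference before the two are compared by ear.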
As this discussion demonstrates, there is a great variety of tasks within audio mastering, some more easily relegated to automation than others. Algorithms can be ‘fetishized’ and integrated into hype cycles and this represents a warning for researchers of AI in creative industries (Thomas et al., 2018). The first role of error correction is often devolved to assistants and apprentices, particularly within the traditional mastering houses, and there is a range of software that assists with the process in order to automate the routine. AI here is a ‘convivial alternative’ to algorithmic paranoia since it is a further tool to increase human efficiency (McQuillan, 2016). Here AI is in a prime position to usurp some human labour. These routines are a counterpoint to the ‘lively performativity’ (Gallagher, 2015) of the second role where the engineer experiments with different tools and often responds to gut instinct, or intuition. Given algorithms within AI packages on their own are inert (Lowrie, 2017), their potential future use by audio mastering engineers is beguiling, since much of the work they do is subjectively constructed through consultation with sound creators. It is this affective aspect of audio mastering that is conceptualized further in the next section.
A changing cultural industry
Workflow models chart.
Craft culture
In the pre-digital era between the 1950s and 1980s there was a linear, mechanical workflow from magnetic tape spliced, mixed and summed by a mixing engineer through the mastering engineer’s toolchain to a record cutter, such as the mid-twentieth century industry standard Neumann VMS (Figure 1). In this craft culture there would be close and physical labour involving the mastering engineer and one or multiple assistants working to assimilate reels of magnetic tape to vinyl disc manually in real-time with a cutting lathe as the signal passes to the cutting head with a ruby stylus that physically cuts the audio waveform into the heated lacquer. In-house consultation was necessary between mixing and mastering engineers regarding specificities in the audio signal and with the music label employees to gauge the expectations of the sound creators. The culture was one similar to that found in a guild of craftspeople since as the signal passed through a mastering console via manual signal processing with limiting and equalization to remove low frequencies a great degree of physical and mental skill was required to avoid damaging the lathe as it cut the groove. Critical listening was required here to establish the loudness of the recording and overall harmonic distortion. Key factors the mastering engineer would need to oversee were the temperature of the cutting head and helium coolant, the pitch denoting the width of the grooves and speed of the lacquer’s spin, and the degree of distortion on the signal coming from the mixing console.
Figure 1. The pre-digital workflow.
Creative culture
With the onset of digitalization in the 1980s the major audio mastering houses devolved into freelancers in response to the automation of many of the craft routines of cutting vinyl. Those engineers able to maintain a client base through respect for their signature expertise were able to reinvent themselves as creative professionals. As engineers came to utilize Internet websites for self-marketing, file transfer protocol (ftp) servers and online (cloud) repositories for client communication, and software toolchains to expand their networks to global proportions beyond local clusters, the monopolies of mastering houses tied to major music labels were undone (Wu, 2017). Since many of the routine, manual aspects of audio mastering, from vinyl lathe-cutting to spectral analysis, underwent automation in the late twentieth century, practitioners undertook visceral duties and became ‘knowledge workers’ through gaining an awareness of cultural genres and the expectations of listeners for certain trends in sound post-production. These tasks involve their immersion within musical ‘scenes’, the development of competences in musical performance and theory, and social networking and marketing of individual prowess and capabilities. In sum, with the shift from manual to knowledge labour, burgeoning opportunities for entrepreneurialism and even creativity accompany the ongoing development of tradition and craft expertise.
As creative professionals, audio mastering engineers tend to work in isolation rather than in collectives and offer critical listening and expert consultation with sound creators and labels, key skills that continue to be a conundrum for simulation systems such as LANDR (Figure 2). In this creative culture there continues to be demand for discrete, expensive, outboard mastering processors with analogue circuits and no, or limited, digital components, since these are deemed by engineers and consumers alike to have a superior sound quality. Despite their mastery of discrete toolchains, however, the mix is blended ultimately with digital technologies either at the end of the process, to produce a digital file for distribution, or at key stages where there is limited impact on the sound quality.
Figure 2. The post-digital workflow.
What can be observed in the creative culture is continuations of the pre-digital craft culture and admixtures of traditional and contemporary technology. Consultation is now available between the mastering engineer and the artists/creators themselves since the master is easily and inexpensively stored digitally and the settings of the toolchain can be recalled and the process of mastering replicated. Once the engineer has completed the master, a lower quality copy can be shared with the client, so they are able to provide feedback and queries to the audio mastering engineer before bulk transfer to physical media or distribution with online digital retailers or repositories.
A benefit of this creative culture for the audio mastering engineer is that routine manual tasks are now largely redundant through automation or transfer to the digital realm. In order to accommodate the demand for both analogue and digital processes, audio mastering engineers exercise critical listening by experimenting with the composition of their studios and the mixture of signal processors in their toolchain. Indeed, each mastering engineer has a unique combination of equipment, which they have become skilled in using over time through learning-by-doing, creative experimentation and troubleshooting when older pieces are not completely compatible with newer acquisitions and require retrofitting or novel fixes.
Algorithmic culture
The workflow AI imagines is radically different from either the craft or creative cultures illustrated above since it attempts to simulate creativity through machine learning and the frequency analysis of databases of existing human-mastered audio alongside a raised expectation that sound creators will ultimately undertake the mastering process themselves without a third-party audio mastering engineer (Figure 3). Hence it is crucial for the sound production to fit with established genres and comply with summing standards in the original upload, otherwise unpredictable results could eventuate. AI also simulates consultation between engineer and client through providing the artist/creator with direct access to the masters and cyclical processes of downloading, processing and uploading their content at very little cost. Despite emulating the two core human activities of critical listening and consultation these continue to remain problematic for AI to simulate since sound creators are too closely connected to their work to be critical and because automated systems are unable to comprehend ultimately what sound creators feel and hear.
Figure 3. The AI workflow.
With AI offering sound creators the capacity to perform audio mastering themselves, criticality emerges as a profound hurdle for simulation. Critical listening is a skill that is challenging to simulate through an algorithm because it requires a combination of human intuition, spatial awareness and learning over time. A key issue here is the so-called ‘loudness wars’ (Devine, 2013) in popular music where sound dynamics are reduced in favour of highly effected signal processing that makes a recording appear louder than the actual volume through reducing the dynamic range and increasing harmonic distortion. When sound creators master their own productions, they tend towards loudness that is fatiguing for listeners and detracts from the recording quality, and this issue is one prominent criticism of LANDR’s results and user control parameters (low, medium or high ‘intensity’). The standard listening environments available to sound creators are largely inadequate for referencing sound reliably since most do not have access to typical spatial treatments, such as sound insulation, absorption, bass capture, reference monitors (speakers), and so on. Such alterations are expensive and require expert knowledge to achieve. Mastering spaces require critical listening over time in order for engineers to learn spatial nuances and correct for them in the sound master. The spatiality of audio mastering is understood to be a form of expertise in and of itself and results in ‘transparent’ or predictable playback across a diverse range of systems and environments. Critical listening is a vital aspect of audio mastering and involves the intervention of a third party who is able to objectively comment on the audio waveform and its characteristics and distinguish these data from aesthetic and cultural features. Critical listening ties into both communication and consultation with sound creators and links to creative infrastructures and sound and music cultures.
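The trade-off at stake in the ‘loudness wars’ can be shown with a deliberately crude sketch. It assumes hard clipping as a stand-in for the limiters and compressors used in practice, which are far more sophisticated; the function names and the 12 dB drive setting are hypothetical. Driving a signal into a fixed ceiling leaves the peak level unchanged while raising the average level, so the crest factor (the peak-to-RMS ratio, one simple indicator of dynamic range) shrinks and the waveform is distorted.

```python
import math


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; lower values mean less dynamic range."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


def push_loudness(samples, drive_db, ceiling=1.0):
    """Apply gain, then hard-clip at the ceiling (a crude limiter stand-in)."""
    gain = 10 ** (drive_db / 20)
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]


# Hypothetical pre-master: a sine tone peaking at digital full scale.
dynamic_mix = [math.sin(2 * math.pi * 10 * i / 1000) for i in range(1000)]
loud_master = push_loudness(dynamic_mix, drive_db=12.0)

print(round(crest_factor_db(dynamic_mix), 2))
print(round(crest_factor_db(loud_master), 2))
```

Both versions peak at the same level, yet the second has a markedly smaller crest factor and would be heard as louder. The flattened wave tops are also the source of added harmonic distortion, which is why an ‘intensity’ control pushed too far yields the fatiguing results described above.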
Methods
Following common methods in social science the research builds theories from empirical inquiry by examining the ‘in-depth investigation of the human experiences, routines, improvisations and accomplishments which implicate digital data in the flow of the everyday’ (Pink et al., 2017: 1). The research empirically studies the actual practices surrounding algorithmic technologies through interviews with those affected by and affecting them (Christin, 2017) in the specific domain of music (Wood et al., 2007). Utilizing ethnographic methods on algorithmic cultures (Seaver, 2017) from this journal the research draws on accounts collected from 20 audio mastering site visits in three Australian cities (locations not disclosed for privacy purposes) in both rural and urban settings during 2016–2017 (see Note 1). Participant recruitment was made through random cold-calling. A web search for audio mastering engineers yielded a list of possible candidates. These were then contacted via email and invited to take part in an on-site semi-structured interview and collaborative studio tour.
Note 1. The Australian Recording Industry Association provides statistics on the music industry in Australia. According to the 2011 census 7900 people reported primary musician occupations such as musicians (instrumental), singers, composers or music directors. Moreover, in 2009/2010, each Australian household spent an estimated $AUS380 on music-related goods and services: over $2 billion economy-wide.
Only two interviewees were tenant freelancers in a major mastering house, although another four had prior experience earlier in their careers in institutional settings (e.g., a music label). The overwhelming majority of the sample were male, a demographic feature of the industry, although one female agreed to participate. The studio observations paid dues to the methodological guidance on phenomenological and non-representational ethnographic approaches to researching automated technologies within everyday environments (Pink and Sumartojo, 2018). The recording and transcription of the interviews led to categorizations utilizing NVivo 11 software, from which themes were articulated. Methods for entrepreneurship research guided the data coding (Dana and Dana, 2005). All participants directly engaged with the topic of AI during the interviews and studio tours and this was a linchpin theme of the discussion. Demonstrations of workflows involved previews of material currently being mastered and descriptions of individual pieces of equipment and their functionality in respect to the suite.
Discussion: Visions of affective AI
In this section, I report on the empirical research via four possible visions that audio mastering engineers contemplate, in which AI will be complementary to their expertise through human-centred design (Baumer, 2017), rather than adversarial to their human labour – these experts are reluctant to harbour a sense of an algorithmic ‘sublime’ about AI (Ames, 2018). Elsewhere there are predictions that the music industry could enlist AI to ‘create algorithms enabling the creation of customized songs for users and help sound creators to focus more on being creative’ thereby boosting revenue (Naveed et al., 2017: 4). A similar hybrid model could also emerge for audio mastering. After Seaver (2017), algorithms are cultural because they are composed of collective human practices, and with LANDR, algorithmic cultures involving hybrids of humans and AI are emerging. I adopt the emic term ‘workflow’ to describe these visions. Building on the conceptual framework’s emphasis on the fusion of tradition with current innovations, each hybrid workflow goes against the grain of speculations about machines replacing humans through simulation.
There’s so much music being made around the planet now, not just in bedrooms and rehearsal rooms, but on trains, in colleges and schools anywhere there’s a laptop, really. Even if we wanted to, there’s no way we could master all of it at Abbey Road, even using our online services. The automatic services allow some of that to be finished to a standard that its creators are satisfied with, and made available around the world. That’s fine by us. (Inglis, 2016: no pagination)
Humans referencing AI
The first hybrid workflow the interviewees construct is one involving AI as a counterpoint to human workflows: as a ‘minimum benchmark’, as one interviewee put it. The model is informal in that it is not clearly an integration of the AI into the workflow per se, but rather a diversion strategy to disincentivize potentially inappropriate clients from premium services without alienating them. These clients might arrive as a result of oversights in the original pre-master that went undetected by the algorithm, or of mixed results from the AI workflow caused by misapplied signal processing, as happens today:
And I do get projects that have been finished and done and the client’s not happy with it and they send it to me and I listen to it and I go, yeah, someone’s just gone to a lot of trouble to make it loud and they’re not really listening to the essence of what’s going on there and there’s no feel to it and it’s distorted and wrong. So, I have to go back and redo it for them and approach it from a musical viewpoint rather than a technical viewpoint. (P1, Freelance Engineer, Male, 60s)
Could we use it? Well, I think, what’s the old saying: Keep your friends close but your enemies closer. Well, if I ever use LANDR as a reference it’s only because (a) it exists, but (b), because it exists I’m somewhat forced to reference it, whether it’s my choice or a customer saying, “Why should I pay …? This is what I did free.” Yeah, “I’ve uploaded a free sample of LANDR. Beat this or …” Or, God forbid, they come back in and say, “This is what LANDR did; sounds horrible. I’ll never use it again. I don’t care what you do with it, just make it sound better than this.” (P2, Freelance Engineer, Male, 40s)
For instance, you could give the same piece of music to 10 different mastering engineers, one of them including LANDR, and you’ll get 10 different results. The end user, the composer, the engineer, or whomever is the purchaser of that service, could then go I like that one. Mastering is working on people’s art, so in that it is entirely subjective. Which makes me somewhat sceptical as to how AI even could be a threat, because it’s art. (P3, Tenant Mastering House Engineer, Male, 30s)
Humans as a premium option
The traditions engineers develop throughout their careers remain with them in hybridizations of their workflows that provide them with a ‘signature sound’ useful to attract clients and rival the benefits of AI as a cost and time saving tool. Intuition, or ‘gut instinct’, is a key facet of signature sounds wherein engineers draw on their aptitude in decision-making to enact their expertise. In the industry, terms such as a ‘good ear’ accompany the marketing of each engineer’s eclectic suite of technologies and track record of successful releases. The simulation of a signature sound is plausible and has been the focus of many software tools (see Tanev and Božinovski, 2014: 237); however, the genuine article is key here and efforts to replicate the human aspects of exemplary individuals’ portfolios lack authenticity.
If there were 10 engineers that decided all they wanted to do was maybe fine tune software like LANDR or iZotope, and there was one guy that was still set up old school, I reckon that guy would always attract a lot of work. Because I think people love the concept of that. I think the only thing that’s going to change is maybe the gradual shift towards digital. There will be a point where people say digital technology has caught up. (P4, Freelance Engineer, Male, 30s)
I raise or lower my attention and intelligence to the level of conversation. If I get a bunch of guys in who are bricklayers and they’re in a rock band and they want to talk about surfboards and sharks and shit that’s the conversation, we have all day. If I get a bunch of EDM guys in who want to talk about Drake’s record and technologies that’s the conversation, we have all day. So I make them instantly feel like, very quickly, that I’m with them: I’m not against them. LANDR can’t do that so you're not just buying an end result, you're buying experience. (P5, Freelance Engineer, Male, 50s)
Humans controlling quality
The possibility of testing sound productions prior to human mastering is also a workflow that would feature AI and humans innovating together, rather than in conflict as imagined in the first hybrid workflow above. Many criticisms of AI rest on the imprecision of sound creators in critically listening to their mastered works, since they lack the balanced and flat environments of mastering studios and the years of experience audio mastering engineers have fostered. Here AI is utilized as a ‘budget option’ in business models, primarily by established ‘premium’ mastering engineers and houses. The difference from a strictly automated process is the critical listening of the engineers, which could distinguish between proficient mixes and those containing errors or oversights in the algorithm’s toolchain.
So what I could do is send them a normalized mix – so that’s the mix as I’ve done it, no gain reduction, no mastering – and I could send them a gently mastered LANDR track (as LANDR has low, medium, high for loudness). And I use a medium setting – if I have used loud I guarantee you that the human mastered one would have been better because there was distortion and clipping in the LANDR one – but that might be a useful way that I could use LANDR to deliver something to a client. Maybe that’s going to save me time explaining to the client this is the source of differences that are going to happen. Because I can explain it with words but for them to hear it’s going to help. (P6, Freelance Engineer, Male, 30s)
I think some clients it’s important to be here, because they have very specific ideas. And I think some producers have very specific ideas. And there's a small group of people or producers that I think need to be here. I like that because they have a certain taste and I don’t want to waste my day doing something that I think is good, and then they love it except for … and then there goes eight hours of the day.
So it’s better that they say … what I’ll often do is, I’ll master a track, then I go out and have a coffee. They sit in the chair and they AB on my console. So the mastered and unmastered are at exactly the same volume level, and then I come back and they give me feedback. Once we’re on the same page, they often leave and go you’ve got your brief; go for it. (P4, Freelance Engineer, Male, 30s)
The hybrid workflow could also be performed by engaging the artist/creator in the processes necessary to accomplish a mix of their sound production that requires less work and fewer interventions from the human mastering engineer. In fact, this is how Justin Evans, the CEO of market leader LANDR, envisages the AI being widely adopted, as he notes in a media interview:
Because LANDR is so affordable, our users are using LANDR to learn how to ‘fix it in the mix’. We consistently see people using LANDR many times over and over again on a single track. When we’ve spoken to the users who do this, they tell us that they use LANDR as a tool to audition their mix, hear what’s wrong with it, go back and retry it and iterate from there. People love it, because it’s like having a huge budget where you can go back and forth with a mastering engineer as many times as you need to get it exactly where you want. (Inglis, 2016: no pagination)
Humans as creative directors
A final possibility for a hybrid workflow is to isolate the AI from the signal path altogether and segregate the automation from decision-making and creative control. Here there is already a tradition stemming from the digitalization of audio production and the music industry. Audio mastering engineers made use of innovations that automated manual record production. For instance, lathes were designed so that they could modify the pitch of the groove automatically according to the level of the signal, or at least according to a template. In the early 1950s, the German company Neumann innovated an automated variable pitch system, eventually leading to digital computerized pitch control. In the late 1970s, digital delay substituted for analogue preview heads and mastering tape decks, which also made vinyl cutting more convenient by giving a predictable early playback for the engineer to monitor the signal before making the manual cut.
You know. So, I guess I feel a little bit the same way about LANDR as I feel about an MP3; it’s got a place, it’s absolutely got a place. And one example would be if I was an advertising executive and my composer had just done this amazing track to go with my ad and we’ve got a meeting in an hour and that track it just needs to be louder and it just needs to go with the visuals because we’ve got this presentation in an hour, I would go “Get it on LANDR”. (P7, Freelance Engineer, Female, 40s)
Mastering CDs has very little to do with mechanics. You might turn a few knobs or mastering online but there’s no microscope inspection, there’s no helium, there’s no chemical analysis, there’s no heat, there’s no cause and effect. So that’s what’s fundamentally changed over the last 30 years is this job was very much cause and effect. And it still is a little bit.
I have a whole bunch of – and every good mastering guy does – I have a whole bunch of custom stuff built for us just because it’s not worth the manufacturer’s time to make these little things that we need when he’s going to sell 30 of them and we all want some. (P5, Freelance Engineer, Male, 50s)
I think people have this view of it like someone sends you a song and you make it loud. You pump it through EQs and compressors. That’s part of it, but if someone sends you a track and it sounds amazing, you don’t have to do anything to it. I have this, it’s like a perversion on the Hippocratic Oath. It’s like first, do no harm. If you don’t have to do anything, don’t do anything. Because there's a tendency if you're an engineer to want to do stuff. But if it sounds great, do nothing. Someone could send you a track and you go it sounds great and just send it right back. You don’t have to touch it. But there is that final output format, which you’ve got to worry about. Formats are always changing, obviously. So you’ve got to think about, like I can make something sound great for this room, but it’s not going to be played in just this room. (P8, Freelance Engineer, Male, 30s)
Conclusions
The project contemplates the distinctive influence of technology on the future of human audio mastering engineers’ affective labour and tracks a new algorithmic culture involving an assemblage of big data analytics, machine learning and algorithms, which is being positioned as AI able to compete with audio mastering experts. At the beginning of this paper I asked the question, in what ways are audio mastering engineers prescient of emergent algorithmic cultures involving AI? One conclusion to draw is that AI in the cultural industry of audio mastering will need to strive toward human-centred algorithm design, encompassing both critical listening and creativity, in collaboration with humans rather than through attempts to replace them. What is notable, perhaps, in the present is how professional, established audio mastering engineers are not currently included in this algorithmic culture, except as competitors with AI. Alongside an ambivalence and even anxiety about the consequences of AI, the creative professionals I interviewed shared a willingness to dispel the present hype about technology’s capacity to replace human labour, a pragmatism Seaver noted was shared by programmers coding algorithms (2017). The engineers I witnessed at work in their studios are keen to muse about how their workflows could change through collaborating with AI and the majority of interviewees had already some existing knowledge of LANDR through curious experimentation or client correspondence.
As it stands, systems such as LANDR are encouraging rather than discouraging human participation in audio mastering, through an algorithmic culture that recruits sound producers to become mastering engineers, and by expanding the client base to people who would not normally consider audio mastering their works, due to the cost and effort involved. Since LANDR’s results are not yet on a par with ‘professional’ level audio mastering (judged, for instance, by how many songs mastered by LANDR end up on popular music charts) there is still a way to go until AI properly challenges human careers or indeed involves professionals in this algorithmic culture as a convivial alternative to taking their clients.
In order to progress theory, the research compiled empirical accounts from audio mastering engineers through engagement with the spaces they labour in and with regard to the workflows they muster in order to perform their expertise in an everyday context. The approach underpinning this paper is captured in the focus on the pragmatic integration of AI into existing workflows already in flux due to the digitalization of the music industry. Diverting from estimations of the displacement of labour through simulation of human skills, I instead highlighted the ability of humans to forgo or augment aspects of orthodox practice in order to accommodate alternative methods of performing labour effectively.
First, this project shows that AI in the music industry, and its simulation of human expertise, in fact stimulates innovation by forcing humans to re-evaluate their skillsets and adapt productively to challenges and competing influences. Participants’ narratives illustrate that AI is not simply a like-for-like competitor for human labour, but rather a further stage in an ongoing reinvention of the role of the mastering engineer in response to a shifting landscape of digitalization, changes in consumer taste and fashion, the decline of the centralized mastering house and the rise of the freelancer. For participants, AI did not bode catastrophe as such, but rather stimulated thinking about what the future might hold for their skillsets and ongoing career development. AI could indeed remove some of the drudgery of the routine aspects of audio mastering, including error correction and media formatting. Their narratives offered a window into the everyday struggles and hardships human labour entails as well as epiphanies and a sense of wellbeing from creative or craft integrity. Such themes emphasize that there are complexities and ambiguities in assessments of the impact of AI and robotics on human labour and economic or social systems. If policy support for audio mastering engineers is to have meaning there will need to be a deeper engagement with the nature of their enterprises and the exact dimensions of their workflows and assemblages derived from their unique career histories.
Second, creativity as a concept requires re-evaluation and alignment with notions of craft and personal commitment to the performance of labour. The creative class as a concept requires honing in order to capture the efforts of humans who contribute to human enterprises without necessarily sharing credit for them on par with sound creators. Despite the ideal of the audio mastering engineer as a third party to creative content, there is a distinct sense of creative license and, controversially, input through their labour. Notwithstanding its shortcomings, the digitalization of music and the slump in profits from the drop in the sale of music media have meant a renewal of traditions of performance and experience and a need for audio mastering engineers who are cognizant of musical scenes and hold propinquity to creative infrastructures. What arises from the influence of AI is the foregrounding of individual traits resistant to simulation, such as communication and networking or intuition and subconscious decision-making, and the backgrounding of those elements of their labour easily replicable by simulation. Here a vision of audio mastering emerges wherein those ‘people skills’ traditionally shunned in the industry become a privileging dynamic for premium services and products.
This study has limitations that flag opportunities for future research. As AI is nascent and many of the algorithms are in their infancy there is scope for unforeseen consequences and disruptions to emerge beyond the prescience of the participating audio mastering engineers. Whether AI has the capacity to develop genuine creativity is a moot point since human agency represents an insurmountable hurdle according to today’s standards. If creative AI emerges of a quality indistinguishable from human levels of achievement, the ramifications for societies would be so significant that upheaval in the music industry would pale in comparison to upheaval in other aspects of human experience.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received funding from Australian Research Council Discovery Project Grant DP160100979.
