Abstract
Matters of sound and listening are increasingly being attended to across the social sciences and humanities, reflecting what has been termed a ‘sonic turn’ since the early 2000s. In urban ethnographic research, scholars are starting to pay attention to the role of sound in social relations, in expressions of identity and senses of belonging, as well as in processes of othering. In this paper, we explore the theoretical and methodological opportunities of sonic urban ethnography, that is, an urban ethnography that foregrounds sound and listening in theoretical and methodological ways. We argue that the promise of sonic urban ethnography lies in its ability to interrupt the predominant focus on text and the visual by developing expanded practices of listening for alternative ways of knowing and engaging with the urban. We share four empirical vignettes from Shanghai, Berlin and London that illustrate, in their different ways, the power exercised through sound in the urban environment. Our discussion of the empirical cases highlights three key ‘lessons’ for doing sonic urban ethnography.
Introduction
The number of studies that critically attune to the rich sonic environments of the city has increased in the past two decades (Atkinson, 2007; Born, 2013; Bramwell, 2015). Attention to sound in an urban context has been bolstered by what is described as the ‘sonic turn’ (McEnaney, 2019), characterised by a growing interest in listening practices and auditory culture since the early 2000s. The foundations for the sonic turn were laid earlier, in the introduction of soundscape as an analytical category with accompanying methodological tools and techniques in the 1970s. The resulting field of sound studies can be characterised as a collection of literatures spanning acoustic ecology, sound and soundscape design, anthropology of the senses, history of everyday life, environmental history, cultural geography, urban studies, auditory culture, art studies, musicology, ethnomusicology and literary studies (Pinch and Bijsterveld, 2013). Indicative of the influence of sound studies through the sonic turn, attention to the relationship between urban space, sound and bodily experience has made its mark on urban and ethnographic research.
In this paper, we seek to illustrate how attention to sound can expand and enrich urban ethnography. We illustrate this in four empirical vignettes from three cities – Shanghai, Berlin and London – that are united around exploring the complexities of the relationship between sound, place and urban experience. The vignettes illustrate how the researchers’ attention to the sonic aspects of interactions in the city helped unlock layers of knowledge about how people navigate belonging in the city. The paper is the result of a writing collaboration between the authors, initiated by the first and second author. The vignettes are individual reflections on the role of sound in our urban ethnographies. Fang provided the vignette from Shanghai, Muhammet provided the first vignette from Berlin, and Katherine the second. Finally, Eva contributed the vignette from London. Ana and Karolina compiled the vignettes, edited them for consistency, contributed the conceptual framing and analysis and wrote all other parts of the paper. In the final part of the article, these vignettes become a reference point for lessons learned about doing sonic urban ethnography and serve as examples of how urban research informed by the sonic turn enriches urban studies with attention to material and embodied practices that are historically and culturally contingent.
The promise of sound in urban ethnography
Over the past two decades, attention to, and theorisation of, the sonic has increasingly begun to inform the tradition of urban ethnography. The growing literature has captured diverse accounts of the interplay between the sonic and the spatial in urban environments, including iPod listening (Bull, 2007), live music in the city (Lashua et al., 2009), music in diasporic urban placemaking (Henriques and Ferrara, 2016) and the vibrant and sociable soundscapes of the daily journeys in Cairo (Battesti and Puig, 2020) or in London (Bramwell, 2015).
What can be described as sonic ethnography is characterised by the adoption of sonic theories and methods within ethnographic practices. Such methodological approaches may, but do not necessarily, involve the production of audio recordings, which is also referred to as phonography (Drever, 2002). They may also involve analysis of recorded sound, and the performativity of audio representation, raising questions about the politics of sonic knowledge (Gallagher, 2019) and how recording ‘mediates or actively constructs particular cultural performances of listening’ (Droumeva, 2016: 82). In more recent years, the proliferation of affordable and mobile audio(-visual) recording technologies has created new opportunities for gathering, analysing, expressing and sharing ethnographic information (Gershon, 2019). Sonic ethnography is not limited to the urban context; however, in this paper we speak specifically to how sonic ethnography as a broader approach can enrich urban-focused ethnographic research.
What sets sonic urban ethnography apart from urban ethnographic approaches more broadly is the foregrounding of sound and listening while employing methodological tools such as interviews, participant observation, filmmaking, archival research and other established ethnographic techniques. Sonic urban ethnography is further distinguished through the development of specialised methods like soundwalks, sound mapping and field recording, all originally developed within the ground-breaking work by R Murray Schafer and his colleagues at Simon Fraser University in Canada on the World Soundscape Project in the 1960s and 1970s, and further elaborated by sonic researchers over the decades since (for an overview of sonic methodology in geography, see Doughty and Drozdzewski, 2023). Schafer’s ([1977]1993) soundscape approach involves a three-tier typology of the sonic environment: keynote sounds that constitute a taken-for-granted background, such as the hum of traffic; signal sounds that we attentively notice, such as bells or sirens; and soundmarks that are unique to a location or community. Fong (2016) provides a pertinent example of Schafer’s typology applied in an urban setting. His thick description of the soundscapes of Bangkok and Los Angeles offers an example of the ‘sonic vignette’ as an evocative way to present urban sonic research. The literature that we speak to in this paper, however, diverges from Schafer’s soundscape typology by placing interactions between the social and the spatial at the centre of analysis. This means that methods and analysis shift from documenting the soundscape to seeking to understand political and experiential aspects of urban life through their sounds.
Contemporary sonic urban ethnography combines the interests and techniques of sound studies, urbanism and sensory ethnographic methodologies (Pink, 2015) and is often characterised by interdisciplinarity rather than being bound by ethnography’s anthropological roots. The different backgrounds of the authors of this article (sociology, geography, architecture) demonstrate the interdisciplinarity enabled by the sonic turn. The number of scholars engaging in sonic ethnography has grown as a critique of the idea that culture is the result of acts of inscription, and consequently that the key task of the ethnographer is to decipher the meanings that result from these inscriptions (Erlmann, 2004). Related is the growing interest in urban ethnography as a distinctive method that brings into focus the everyday lives of city dwellers in relation to processes of urban change (Duijzings, 2018; Jones, 2021; Ocejo, 2012; Smithsimon, 2010).
In the context of predominantly theoretical urban studies, sonic urban ethnography matters, as it provides us with a means to analyse interactions that would go unnoticed in other types of urban research. A focus on the sonic can help uncover small (seemingly insignificant) details of verbal and non-verbal interactions mediated through public spaces, culture and technology, that help us understand power relations in cities. Symbolic meanings of sound are central to how sonic practices distinguish identity and senses of belonging at the urban social scale, or sonic identity at the scale of the city as a whole (Amphoux, 2003). Authors have been concerned with the contested nature of sound in processes of nation-building, nationalist cultural projects, and in the performance and assertions of regional, ethnic and religious identities, as well as in processes of othering (e.g., Birdsall, 2012; Hirschkind, 2006; Lisiak et al., 2021; Stoever, 2016; Sykes, 2015). In this work, sound has been shown to demarcate and reinforce social stratification through the creation of sonic autonomy and segregation (Born, 2013: 27). The spatialising capacities of sound may also work to challenge boundary-making in the city, such as social and material boundaries between public and private (e.g., LaBelle, 2010), or division between ethnic groups in contested cities (Aceska, 2023; Aceska and Doughty, forthcoming), in its capacity to ‘leak’ between environments. Additionally, residents’ daily practices shape the sonic identity of different areas of the city, and such identities are also often governed according to notions of ‘acoustic zoning’ in sound policy and legislation.
We locate the significance of the sonic for urban ethnographic practices in two key elements. First, as alternative ways of engaging with the urban, expanded practices of listening have the ability to interrupt the reliance on the visual as a primary sensory mode of knowledge construction. Second, privileging sound as an object of study emphasises the connections between the materiality of the urban environment and its experiential and symbolic dimensions. We elaborate on these two elements below, before illustrating through four empirical vignettes how they have informed research carried out by the authors of this paper.
Practices of listening
Sonic ethnography’s focus on sound in its various guises has naturally involved close attention to the significance of everyday listening practices, with researchers having to carefully cultivate the sensitivity of their ‘ethnographic ear’. An often-cited technique is ‘deep listening’, developed in the work of composer Pauline Oliveros. She called for learning to expand our perception of the acoustic space that always surrounds us, including the sounds of daily life, nature, music or our own thoughts (Oliveros, 2010). According to Bull and Back (2003) deep listening as a research practice involves careful, considered, and critical attention to what we hear, ‘attuning our ears to listen again to the multiple layers of meaning potentially embedded in the same sound’ (p.3). Such an auditory perspective is a powerful tool for the ethnographer concerned with the intricacies of social life and urban experience, and particularly the relational qualities of community, place and power (Bull and Back, 2003: 4). A commitment to upsetting dominant structures of power and knowledge is also at the heart of the similar term ‘close listening’ (Hoffmann, 2023), which describes a technique of paying special attention to silences, the unspoken, and non-verbal sounds. Close listening has helped historical ethnographers locate agency and resistance in colonial sound archives, revealing the ‘absent presences’ (Hilden, 2022) of the colonial subjects whose voices are not generally part of official historical narratives.
Attending to the sonic ethnographically means engaging in an exercise of unlearning the forms of listening that are constituted within different social practices or communities, to bring awareness to sensory processes and embodied knowledge that operate at a subconscious level. Listening practices ‘must be understood by reference to the broader cultural and historical context within which they are formed’ (Rice, 2015: 102), and as Sterne (2003) reminds us, listening is a directed, learned activity, which means listening includes but is not reducible to hearing. What Helmreich terms the ‘transductive ear’ of the sonic ethnographer listens for ‘how subjects, objects, and presences – at various scales – are made’ (Helmreich, 2007: 632). Charles Hirschkind’s (2001, 2006) sonic ethnography of listening to audiotaped sermons and Qur’anic recitations in Cairo in the late 1990s is exemplary in showing that listening is a ‘cultural practice through which the perceptual capacities of the subject are honed and, thus, through which the world those capacities inhabit is brought into being, rendered perceptible’ (Hirschkind, 2001: 623–624). He shows how the cassette sermons, as a technology of self-improvement, instilled techniques of disciplined ethical listening that helped Egyptian Muslims attune themselves to the broader current of what is now referred to as the Islamic Revival, and to cultivate a range of normative Muslim virtues. Hirschkind’s work elaborates on listening as a ‘worldmaking’ activity, which in its capacity to actively produce meaning should also be understood as a political act (cf. Eidsheim, 2019).
The intersection of the material and the symbolic
The sonic turn references a broader shift in thinking about sound, in which sound emerged as an object of study, as well as a more comprehensive conceptual apparatus, differentiating sound studies from already established fields dealing with types of sound such as music, language or speech. Foundational accounts (Sterne, 2003; Thompson, 2002) charted the dramatic transformations of aural culture over the 20th century, showing the impact of technological development on how we listen and what we hear, and the resulting reformulation of the relationship between sound and space. The sonic turn resonates and coincides with other ‘turns’, such as the sensory, affective, materialist, performative and speculative, that have influenced social science and humanities scholarship over the past couple of decades. Common amongst them is a focus on experience, embodied knowledge and non-visual modes of perception.
Methodologically, this has translated also into an acknowledgment of the embodied presence of the researcher, ‘the importance of considering emotional and affective processes of doing research and of being researchers’ (Waitt et al., 2014: 289; see also Waitt et al., 2020). For example, Pradelle’s (2006) depiction of the cacophony of markets in town squares in Provence invites the readers to the scene and establishes the authenticity of the work by describing in keen detail the characters and relations of the market, the buzz, the haggles and the vendors’ shouts. In research on the ‘visceral politics of sound’ during a climate protest, Waitt and colleagues used their bodies as ‘instruments of research’ (Longhurst et al., 2008), recording their bodily reactions to sound at the climate march and how sounds triggered feelings such as unease and pleasure, like and dislike, pride and shame, acceptance and oppression. These recordings of visceral response helped the researchers reflect on how the embodied experience of sound ‘triggered moments of emotional intensity through which the personal and the public, the individual and the social, indeed shape each other’ (Waitt et al., 2014: 290).
Our contribution in this paper is situated within this broader acknowledgement of the importance of the senses in understanding urban life. We add to previous sonic urban research with a focus on situated communicative interactions that span verbal, non-verbal and technologically mediated sound. We conclude by arguing that a sonically informed urban ethnography does well to listen to the constitution of what we call ‘the urban’ through attention to how sound interacts with physical, social and political dimensions of cities.
Sonic vignettes from Shanghai, London and Berlin
The following part of the paper contains four empirical vignettes from the authors’ research: one from Shanghai, two vignettes from Berlin and one from London. The vignettes are brief research accounts that demonstrate the promise of attending to interactions between sound, space and identity in diverse cities. The four empirical cases focus on the material-symbolic properties of vocal sound heard in different spaces across the city. They emphasise the analytical possibilities of studying language as sound, expanding its social and political meaningfulness from content to form. They help us translate the core elements of sonic ethnography into an urban empirical context and situate their contribution in relation to urban theory.
Sonic vignette: Languages in Shanghai’s Public Transit System
In the 2010s, Shanghai changed beyond recognition – two decades of urban redevelopment resulted in an expansion of the urban population from 8 million according to the 1990 census to 24 million in 2020. More than 40% of the current population of Shanghai are non-natives, and the vast majority do not speak the Shanghai dialect. From 2013 to 2017, and digitally during the COVID-19 pandemic, Fang conducted an ethnography focused on the languages that make up Shanghai’s soundscape.
The Shanghai dialect is unintelligible to Mandarin Chinese speakers who are not originally from the Yangtze River Delta region, where Wu Chinese was spoken and its variant, the Shanghai dialect, developed (Qian, 2007). To Shanghai dialect speakers, Mandarin Chinese sounds like a different language, too, and native Shanghai dialect speakers reportedly ‘need to switch brain’ when code-switching between the two. Historically, the unique linguistic character of the city not only distinguished social class and home origin, but also endowed the speakers with a sense of modernity and urban sophistication when the rest of the country was believed to be backwards and provincial; thus, Shanghai dialect proficiency is associated with a self-aggrandised Shanghainese identity (Xu, 2021). Through interviews with highly educated migrants in 2017 about their experience of language-based discrimination in early-2000s Shanghai, Fang learned how they would miss their bus stops when bus conductors made announcements in dialect. When they complained, the native-speaker riders would side with the conductor, suggesting it was the migrants’ own fault due to their lack of Shanghai dialect proficiency. Such overt, unapologetic discrimination was a way for native Shanghainese to declare their ownership of the urban sonic landscape against the migrants.
In the last two decades, though, Mandarin Chinese has claimed dominance in public spaces, thanks to successful promotion by the state. The non-speaker research participants reported much less language-based discrimination in public, especially in the public transit system, where most native-speaker bus conductors were replaced by automatic ticket sales machines and pre-recorded announcements controlled by the drivers. These recordings provided station announcements in Mandarin Chinese and English, with the glaring omission of the Shanghai dialect. This led native Shanghainese to feel out of place in a sonic landscape devoid of the Shanghai dialect. Interviews with dialect preservation activists in 2013 and 2017 showed that they eventually managed to persuade the authorities to include the Shanghai dialect in the recorded announcements on some of the bus and metro lines, though not all. Since 2017, no new bus lines or metro lines have adopted that trilingual announcement system, and the debates within the dialect preservation activist community have shifted their focus to the authenticity of the pronunciation of the dialect in the announcements.
The transformed soundscape of Shanghai manifests both top-down policies at the central government level, that is, the Law of the People’s Republic of China on the Standard Spoken and Written Chinese Language in effect since 2001, and tactics of urban residents in their everyday encounters within and away from the state’s surveillance. At the central state level, the 2001 Language Law mandates the exclusive usage of Mandarin Chinese in public schools and in TV and radio broadcasting. At the municipal level, the Shanghai Municipal Working Committee on Language and Writing under the Bureau of Education is set up to regulate and survey linguistic practices of civil servants, employees at organisations affiliated with the state, and the language spoken and written in public spaces. Consequently, the venue for speaking the Shanghai dialect is rigidly controlled and policed, eliminating native Shanghainese’s right to speak the vernacular in a plethora of public spaces, from government agencies and schools to banks and clinics. Furthermore, the clearly demarcated space for using Mandarin Chinese, or for prohibiting the usage of the Shanghai dialect, maintains boundaries of properness and acceptability. As a result, bilingual or multilingual native Shanghainese started by consciously code-switching to Mandarin Chinese in public, and gradually came to switch to Mandarin unconsciously or instinctively when communicating with strangers in public, as illustrated by the following observation: It was a day of heavy rain during Shanghai’s annual typhoon season in the mid-summer of 2017. During her ride on the No.3 metro line, Fang observed a young woman (A) carrying multiple plastic document folders, an umbrella and a purse, boarding and sitting down next to another young woman (B). Later, when A got up and moved to a just vacated row of seats, one small stack of paper was left behind.
B immediately called out in Mandarin Chinese, ‘you forgot these!’ A immediately turned around, picked up her papers, apologized for the trouble, and thanked B multiple times, also in Mandarin Chinese. B smiled and nodded in response, then went back to chat with her friend in Shanghai dialect.
The instant code-switch by B was remarkable because her immediate and, to a certain extent, instinctive reaction to the left-behind papers was to call out in Mandarin Chinese, even though the interrupted conversation with her friend was in Shanghai dialect. It indicates a separation between a space where speaking the dialect is acceptable or will be understood, and a space where it presumably is not. Such insights can be unpacked only through a sonic ethnography of fleeting social interactions in cacophonic metropolitan public space.
Sonic vignette: Listening in the public library in Berlin
Open to all, public libraries accommodate people with different needs and expectations. Offering a range of resources and materials and timetabled activities across the day, the sonic environment of the public library is highly textured and varied. From the busy queue at the issue desk, where the beeping and stamping of books accompanies the questions and queries from people in the queue, to the sounds of the baby singalong group which floats across from the children’s area of the library, the public library’s sounds vary according to the time of day and intention of the library user.
Katherine’s sonic urban ethnography attends to the day-to-day sounds in a public library in Berlin’s multi-ethnic neighbourhood, Wedding, in 2012. This vignette opens out how sound in the public library is experienced and negotiated by library staff, who articulated the different sonic practices and tolerances they used to negotiate the public sharing of library space. And so, while most public libraries are no longer silent places, sound in the library remains a site of contestation, control and boundary work. Thinking with sound in this context opens out a rich and more textured rendering of relations in the library that may otherwise be considered ordinary or unimportant.
Discussions with library staff about sound and its close relative, noise, opened out nuanced and multi-layered considerations about how they engaged with the sounds of the public library in their daily working routines, which spanned not only control and displeasure but also a commitment to the principles of the public library’s openness to all. Libraries, of course, must accommodate people with different sonic tolerances and needs. As one librarian said: we don’t have enough workspaces; the school children sit on the floor to do their homework […] People in Wedding seem to lack space to work at home – they really come here in droves just to spend time, whether it’s loud or not, and I think, my God, how can you work here, when it’s so loud! […] and this idea that the library is the living room for Germany, we see it more and more; people really come and spend hours here.
The extension of sonic tolerance across different library users is a necessary prerequisite for the library to be an open, participatory space. A key example is that of local teenagers and their sonic presence in the library. Coming to the library after school to do group projects, attend the homework help sessions or just hang out, young people from minority ethnic backgrounds occupy the library’s work tables and social spaces, and the sounds of their interactions would fill the entire library, sometimes giving rise to consternation and complaints from other users. However, library staff felt that the young people’s noise should be met with ‘a certain level of tolerance [from other library users]’ as one librarian put it. She went on, ‘That has something to do with democracy, and with tolerance, and all those things that we’re always trying to promote, and it’s happening here, in a small way, all the time. And I have to take a stand on this’. These tensions around sonic tolerance illustrate how attentiveness to sound provides an appreciation of the ‘small ways’ in which ‘big’ concepts such as democracy, participation and racism play out in the social space of the public library.
Yet, because the teenagers are young Germans of colour, the displeasure that their sounds incur can be considered a racialised form of othering. The question is how sounds become a space into which other issues are channelled, specifically how sound can become a proxy for directly discussing race and racism. Lisiak et al.’s (2021: 263) call for ‘naming racism beyond words’ offers a consideration of how sound can become a way of highlighting and differentiating racialised ‘others’. In their article, they reflect on sonic boundary drawing in the urban realm of both Berlin and London – highlighting resistance to the sound of foreign languages in cities’ public spaces and the sound of the wheeled suitcase as a signifier of both gentrification and tourism. Another example from Germany is how resistance to plans to build mosques can take the form of (perhaps more socially acceptable) complaints over noise, as Kuppinger (2014) has traced.
In Germany, both discursively and institutionally, there is limited wherewithal to discuss race: for instance, race and ethnicity are not officially tracked by a census category, and the German word for race has a problematic history and is contested in contemporary debates (Kelly, 2021). In the Wedding public library, then, other identificatory indicators, such as accent, language skills and noise levels, became signifiers of racialised forms of otherness while avoiding the mention of race. Thinking about experiences of and responses to sound as racialised also links to an awareness of sound as embodied. In the context of the public library, embodied and racialised sonic boundary drawing emerges through a consideration of whose noise and, by extension, presence in the library is legitimate.
Sonic vignette: Shouting at the Maybachufer Market in Berlin
Established in 1887, the Maybachufer Market is one of the oldest markets in Berlin (Spies, 1988), located in the neighbourhood of Neukölln. Many Turks settled in the area following the labour migration of the 1960s. Turkish vendors began to work at the market, and it became known among Berliners as the Türkenmarkt (Turkish Market). Muhammet conducted sonic urban ethnography in this market at different time intervals between 2019 and 2022, incorporating participant observation and 31 in-depth interviews with vendors.
Shouting while selling products forms a distinct part of everyday practices in many marketplaces (cf. de Certeau et al., 1998). It also creates a social atmosphere (Watson, 2009), heightening dramatic impacts on the place-specific interactions in urban neighbourhoods. Shouting is a key component of how vendors establish positively connoted interactions with the market (Bauman, 2004), and is a way to attract and interact with potential customers (Pradelle, 2006). When the shouts of vendors draw the attention of market users, this opens possibilities for further social interactions between vendors and customers. The ethnic diversity of the neighbourhood within which the Maybachufer Market sits differentiates it from other street markets in Berlin. Muhammet’s research found that some vendors associate shouting at the market with positive interactions between people from different backgrounds; the shouts of the vendors were felt to make ethnic diversity more aurally present. For some ethnic German vendors, by contrast, the shouting was more contentious, more often regarded as a disturbance and something that should be regulated. The sounds of shouting at the market were felt to increase tensions between the market vendors and neighbourhood residents.
Two private market management companies operate the Maybachufer Market. While one of them regulates the Tuesday and Friday market, the second one runs the Saturday market, set up mainly for fabrics and textile products. Although the management companies have enforced an ambiguously implemented rule that prohibits shouting while selling products, the shouts of vendors permeate the market, mainly on Tuesdays and Fridays, and leave discernible traces in everyday life in the neighbourhood. The new policies adopted after the privatisation of the Maybachufer Market, including the ban on shouting, pose fundamental dilemmas. Onur, who has worked at the market for nearly 30 years, felt that recent changes, such as pedestrianisation and the ban on shouting, had ruined the spirit of the market: They have spoiled the market … these men [the market management] have spoiled it in ten years what we used to, what we struggled to do for years. What were we doing before? We were putting the goods in front of the stalls and parking the cars behind. What do we do now? We put the goods behind. Why? ‘Do not park the cars at the market!’. With so many made-up decisions! They ruined it, they slaughtered it. This place had a texture, you know? That texture has gone. People were shouting here … We were not for the upper class, for the middle class, we were for the lower class and the outsiders … how can I put it … we were the alternative.
Disallowing shouting in urban markets can be viewed as a ‘sanitization strategy’ by local authorities (Gonzalez and Waley, 2013: 976). These policies aim to eliminate so-called ‘noise pollution’ in urban neighbourhoods; however, shouting to sell products is an important sonic practice that supports the belonging of vendors to the market (and thus to the neighbourhood). Here, a ban on shouting resulted in a deterioration of the lower class ‘texture’ of the market, in what could be interpreted as a pandering to upper- and middle-class ‘sensibilities’. Onur attributes the ban to the diminishing significance of the lower class and how the market offers an alternative to the supermarkets. In this sense, shouting implies nostalgic feelings, and thus constitutes one of the integral elements of the market.
Sonic vignette: The disembodied female voice in London
One of the recurring sounds in a neoliberal city like London is the sound of the female voice, as Eva’s sonic urban ethnography reveals. When navigating public spaces, the disembodied female voice projected through speakers continuously guides, warns and informs people in their everyday lives – the underground, train stations and buses, even the supermarket self-checkout machines or the default settings for personal assistants on smartphones, all use a female voice. Primary research and field recording aimed at studying this phenomenon in London began in 2020 but were cut short by the onset of the COVID-19 pandemic and the lockdowns that followed. The new restrictions thus became a part of the methodology, in which the preliminary binaural recordings functioned as artefacts of an everyday before the pandemic. The recordings were analysed with a spectral pitch display and compared with other archival material.
As Braidotti (2011) writes, we are caught in the effects of late capitalism where we experience allegedly free borders, yet at the same time increased border controls and security measures. This illusion of freedom of movement can be heard in a train station, where commuters hear they are being observed in a calm and steady female voice: ‘CCTV recording is in operation at this station, for the purpose of security and safety management’ or ‘security personnel tour this station 24 hours a day’. The same gentle voice that guides the commuters and informs them which carriages contain refreshments, also proclaims their luggage – if left unattended – will be ‘removed without warning and destroyed or damaged by the security services’. There is a distinct contrast between linguistic and paralinguistic communication, between what is said (constant observation or destruction of luggage items) and how it is said. The relatively long sentences are uttered in a calm, reassuring, and melodic voice that reverberates throughout the station, and the naming of various destinations is intertwined with heightened-security announcements.
According to the British Library sound archive, female voices in public announcements (PA) nowadays outnumber male voices 5 to 1. This is a big change in the sound of public spaces compared to even just a few decades ago, when it was considered that women’s voices were ‘lacking in gravitas for public announcement’ (Rawes, 2010). Arguably the earliest example of a PA system in public spaces is the famous ‘mind the gap’ introduced in 1968 to warn commuters about the gap between curved platforms and train carriages. The recording of the original warning can still be heard at the Embankment station, and it is a useful comparison for contemporary announcements that have changed in linguistic as well as paralinguistic elements. As Kanngieser (2012) writes, intonation, pitch, speed and resonance of the voice can either promote affirmative relations or reinforce and (re)establish patterns of domination. The disembodied female voice that dictates our behaviour and use of space with ‘go there’, ‘do that’, ‘you are being watched’, is the voice of soft coercion (Power, 2017) that often goes unnoticed; it simply blends into the background. The seemingly kind and caring female voice functions as a crucial instrument in the securitisation and control of contemporary public space, ‘cutting across what little is left of the public realm and providing the illusion of efficiency, calm and reassurance’ (Power, 2017). We are in a state of constant emergency, but don’t worry, we have it under control. Together with CCTV cameras and security personnel, the voice adds a layer of uncertainty – especially for the bodies that have historically been othered. Part of a fully scripted system that carefully balances between a sense of safety and control, the gendered voice performs the city’s ability to efficiently handle various crises (whatever they may be), while obscuring its omnipresent surveillance and control.
The use of the female voice for increasingly complex security announcements is not a coincidence amid the progression to more securitised and privatised public spaces. LaBelle (2010, 2018) discusses Muzak (elevator or shopping centre music) as a form of late-capitalist environmental conditioning that manages the behaviour and mood of the population – not unlike the calm and reassuring disembodied female voice proclaiming you are observed wherever you go. Attending to the ubiquitous sound of the disembodied female voice allows us to question how gender relations perform in space and how spatial relations manifest in constructions of gender (Rendell et al., 2000).
Lessons learned: Doing sonic urban ethnography
The sheer number and diversity of approaches to analysing the city signals ‘the difficulty that contemporary urban scholars face in dealing with cities that are increasingly fractured, centrifugal, and enveloped by a vast mediascape of local, regional, and transnational networks’ (Stirling, 2021: 115). Doing sonic urban ethnography means developing theoretical approaches and analytical tools that are particularly sensitive to the intersections between sound and the city. The vignettes make clear, in their different ways, that even in the context of digital hyper-connectivity and the trans-local nature of cultural expression, urban locality and spatial proximity matter socio-politically and affectively. The vignettes show that attention to urban soundscapes of social relations in everyday spaces of the city situates the subjects and objects of study within the context of policies and other means of governance and control. Each of the vignettes engages with different dimensions of the urban experience; from the languages spoken in more private spaces to the larger scale of power and governance, and how these layered relations shape engagements with the material environment of the city.
Sound and the built environment
The physical environment is inherently sonic. Resonant urban spaces shape the soundscapes of cities as sound reflects off walls and other surfaces in the streets (Birdsall, 2012). As Gallagher et al. (2017: 620) write, sound is not only inherently spatial, but is also ‘a force that disrupts and reworks common spatial concepts such as boundary, territory, place, scale, and landscape’.
Our four vignettes show that doing sonic urban ethnography means understanding the ways sound is connected to the built environment in cities. To know a sound means to situate the subject that is listening, as well as the object(s) listened to, in socio-political geographies as well as their physical site. The sound then performs as a material, spatial and temporal concept that can change existing, and aid in the creation of new, relations with the built environment. Since a listening body in the city is always related to the built environment, attention to sound in cities can help in analysing how the built environment is formed and performed. The design of the Turkish market in Berlin and the physical proximity between the vendors and the residential buildings that surround the market define the relations between sellers, buyers and the local residential population. The public library in the mixed neighbourhood of Wedding in Berlin served as a living room for the young locals mainly due to the lack of residential space and other infrastructure where they can spend time. The language and sound in public spaces in London are inseparable from the built environments that emit them, much as in Shanghai, urban change over the last decades is shown through the languages used in public space.
Sound and language
The four empirical cases demonstrate how senses of belonging to the city are expressed and maintained through language as sound. Sonic urban ethnography diverges from linguistic analysis by approaching language through an auditory lens, examining not only the semantic content of speech but also how power, identity and cultural resonances are embedded in the use and monitoring of language and voice in urban contexts. Our vignettes study language perceived as shouting or noise, or as a means of control or exclusion, in public spaces. They point to the need for developing methodologies for studying language as a type of sound that produces, maintains, and communicates relations in the city. Such methodologies would attend, for example, to the significance of rhythm and volume (vignettes one and three) and tone (vignette four), and to the ways in which language and voice contribute to the construction of public space and influence experiences of inclusion and belonging (El Ayadi, 2022).
Sonic urban ethnography focuses on everyday urban settings and relations. It is not only ‘doing interviews’; it is often a devoted and lengthy engagement with the everyday lives of city dwellers, paying attention to human and non-human agents. While sonic urban ethnographers do carry out interviews, most of what they do is ‘small talk’, observations, and participation in the everyday life of the city. This raises practical questions about skills and positionality – is it possible for a researcher who does not speak the fieldwork language to conduct sonic urban ethnography (or any kind of ethnography) and what does this entail for the future of the method? Though spoken languages are understood to form part of the urban sonic environment, for some research foci, language proficiency is still important to enable the content of what is communicated to be included in the analysis. The sonic vignette from Shanghai is a keen example. The aural dimension of the urban transformation in Shanghai is certainly discernible but less easily investigated by scholars who do not have linguistic proficiency in the Shanghai dialect. An ethnographer who does not speak Turkish might not capture important ethnographic insights when analysing the ways vendors communicate with each other in the Maybachufer Market in Berlin. Investigative boundaries can also be challenging to identify, since both private and public spaces in the whole city qualify.
Our four vignettes show that attending to language as part of the urban soundscape requires unpacking taken-for-granted linguistic practices and technologies. By paying attention to who is speaking, where, what they are saying, in what way and to whom, one can provide a fruitful ground for studying how individuals and groups relate to the city and each other. Such analyses may be productively combined with techniques such as conversation analysis and ethnomethodology. Sound, with its material, spatial and temporal properties, can perform as a tool in the creation and interpretation of ways of becoming and relating to the world. In a time of ecological crises and increasingly polarising attitudes in political discourse, such acts of listening can become a kind of affirmative attuning to the world.
Sound and regimes of power and governance
Regimes of power and ways of governing cities are not only consolidated through sound, but they also continuously produce new soundscapes, some of which cannot be easily traced back to particular policies. The question about who makes noise in a public library in multi-ethnic neighbourhoods in Berlin, for example, addresses much wider problems of ideology and politics of identity and remembrance in post-war Germany. In Germany, there is limited wherewithal to discuss race both discursively and institutionally. Race and ethnicity are not officially tracked by a census category, and there is an awkward lack of German words for race and racialised forms of difference. In this context, other identificatory indicators, such as accent, language skills and noise levels, become signifiers of racialised forms of otherness while avoiding the mention of race. This raises questions about the power of policies to affect not only relationships, meanings and materialities but also soundscapes of cities. Policies are not only tools of governance and regimes of power, but they are instruments of making ‘new’ subjectivities, meanings, and relationships, alongside their main objectives (Shore, 2012; Shore and Wright, 1997; Wedel et al., 2005). As dominant organising principles of societies, policies are not linear and straightforward; rather, they are made and implemented in complex settings, often followed by unforeseen consequences (Wedel et al., 2005).
The ‘sonic’ consequences of policies are visible in our data. Our four sonic vignettes show how policies that regulate security, languages and ethnic and racial relationships produce soundscapes that construct, maintain, and communicate senses of identity and belonging to the city. In Shanghai, languages spoken and heard in public generate a sense of belonging or alienation. In London, the use of the disembodied female voice for heightened security announcements perpetuates gender constructs and relationships, where a seemingly kind personal-assistant voice functions as a veil for the omnipresent control of public spaces. In Berlin, both shouting at the market and making noise in the public library draw boundaries between Germans and non-Germans in multi-ethnic neighbourhoods. Doing sonic urban ethnography means, on the one hand, understanding how policies ‘work’ as instruments for making new soundscapes in the city and, on the other, what those changing soundscapes mean to people in their everyday lives. Our vignettes have engaged in a juxtaposition between the professional solutions to urban conditions and the everyday experience and engagement with the urban from the viewpoint of city dwellers. They put urban policies in a dialogue with the ways they have resulted in particular sonic environments and relations among city dwellers in their everyday engagements with the urban. Good urban sonic ethnography, therefore, provides us with means to better understand regimes of power and governance and to speak up against authorities.
Conclusion
This paper identifies how theoretical and methodological developments arising out of the sonic turn can enrich urban ethnographic research. Our discussion extends the lessons learned from our empirical vignettes from Shanghai, Berlin and London to three key propositions for doing sonic urban ethnography: the value of approaching language as sound-making; the importance of paying attention to how regimes of power and ways of governing cities are not only consolidated through sound, but also continuously produce new soundscapes; and the significance of attending to how sound responds to and reworks the physical environment of the city.
This research demonstrates how attentiveness to sound in doing urban ethnography opens out a rich and under-explored space of social analysis particularly in cities characterised by migration and encounters of different ethnic, religious, and ‘regional’ others. The focus on voice and language as meaningful forms of sound-making reveals an emerging soundscape of encounters in these cities. It helps us unpack new layers of how power struggles play out, and how senses of belonging are formed around the negotiations of sound and noise. Future research should seek to understand how the social and sonic dimensions of encounters unfold in cities that do not belong to the most-researched categories, such as big, global or Global-North cities. Following Robinson’s (2016) lead to (re)think cities ‘through elsewhere’, future research should illustrate how tuning into taken-for-granted sound in everyday settings in cities around the globe gives prominence to facets of urban encounters that may otherwise be missed.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
