Abstract
In addition to personal and contextual reasons for listening to music, users of music streaming services also receive constant recommendations from these platforms’ algorithmic systems. In this article, I explore, from a critical realist perspective, how users reflect on and decide what to listen to in various contexts. First, the article develops a framework for understanding reflexive decision-making in the context of music streaming. Then, based on interviews with Spotify users and an analysis of their donated streaming histories, three listening modes are identified based on what motivates music choices and how the users interact with the platform's content. Findings suggest that in addition to the platform's recommendations, users reflexively choose music to fit their contexts, emotions and goals based on their previous experiences, striking a balance between conscious and habitual music listening.
People have different musical preferences depending on the time of the day, the activities they partake in and their own musical tastes (Avdeeff, 2012; Karakayali et al., 2018; Kinnally and Bolduc, 2020; Lüders, 2019; Mok et al., 2022). The musical choices people make throughout their days may be attributed to a combination of conscious and non-conscious decisions, which are influenced by various external factors in their environments and personal experiences. In this way, listening to music becomes an emergent practice that assumes a different meaning and objective for each person. In a context in which data, datafication and algorithmic recommendations seem to potentially take control away from users (Hagen, 2021; Just and Latzer, 2017; Maasø and Spilker, 2022), I argue that users can still reflect on their streaming choices. This study aims to identify the factors that lead users to adopt various listening modes, behaviours and preferences in the context of music streaming and, at the same time, contribute to broader critical data studies by continuing to explore the relationships between data and users.
Multiple perspectives have been used to explain how and why people decide to listen to music at specific times. Different listening and streaming patterns can be attributed, for example, to particular events, moods and emotions as well as increased access to large quantities of music, daily routines and the integration of algorithmic recommendations into everyday life (Carrigan, 2017; Hesmondhalgh, 2013; Nowak, 2016; Seaver, 2017; Siles et al., 2019, 2020; Walsh, 2024; Webster, 2019). Here, I argue that (new) listening modes emerge as the result of the interaction between a music streaming platform and its recommendations, the music provided and the users, with their goals, motivations and musical tastes. In defining these listening modes, I consider the different goals and listening behaviours users may have, such as exploring or archiving music and listening to focus and study, become motivated, dance or pay attention to the lyrics. Thus, I ask, what are the various listening modes practised by Spotify users, and why do they emerge?
I answer this question by placing existing research on music listening habits and critical data studies within a critical realist framework. According to this perspective, people's actions, decisions and practices are mediated by each person's capacity to practise reflexivity and consider themselves and their context before acting (Archer, 2003, 2007). In this way, listening modes and even routines are also the result of conscious reflexive deliberations. For example, the emergentist theory of action (Elder-Vass, 2007, 2010) aims to explain how people make decisions, including both conscious and non-conscious decisions, and act in the ways they do. While this theory is not typically used in critical data studies or music streaming research, I argue that it provides helpful insights into users’ practices when they are using algorithm-based platforms, and specifically the interplay between deciding what to play and playing it. Critical realism is often applied in philosophical and macro contexts, but I argue that, via theories such as the emergentist theory of action, it can also be useful for micro contexts, such as everyday music-listening habits (Archer, 1995; Mutch, 2020; Stutchbury, 2022). Furthermore, this article contributes to ongoing debates on the relationship between data and user reflexivity processes in digital contexts (e.g. Mahnke et al., 2024).
In this article, I explore how people decide to listen to specific music in particular situations and ask what mediates the use of music-streaming platforms and music listening practices. To do this, I investigate the extent to which Spotify's algorithms and recommendations influence music listening decisions and what role reflexivity plays in these choices. Interviews with Spotify users conducted in Norway in 2022 and 2023 and Spotify data donated by some of these participants were used to map their listening patterns and practices. In the next section, previous work on music listening habits will be reviewed, and a critical realist perspective will be developed. Then, the interviews and data donations will be analysed and discussed.
Theoretical framework
Music listening habits and decisions
Research on music listening habits has shown that music listeners actively use music to manage their mood and find meaning in and assign meaning to multiple situations and interactions (DeNora, 2000; Hesmondhalgh, 2013). DeNora (2000) argues that people reflexively use music as an ordering device to create their identities, remember past events, alter their mood and assign meaning to various social situations. Similarly, Hesmondhalgh (2013) explores how the meanings and uses of music vary from private to public environments. He argues that music plays an important role in the generation of feelings, emotions, and intimate relationships at an individual level and a role in forming community and collective flourishing at a social level.
As streaming platforms become increasingly ubiquitous, more research on music streaming habits has focused on algorithmic recommendations and how people think about them. In addition to what the user wants to listen to at a particular time, the recommendations provided by the platform also play a role in music choices. Research suggests that music recommendations actively shape music listening habits (Freeman et al., 2022; Karakayali et al., 2018) and increase genre blending, genre eclecticism and a preference for playlists over genres (Avdeeff, 2012; Hagen, 2015; Siles et al., 2019; van Venrooij and Schmutz, 2018). When a user opens Spotify, diverse music recommendations are displayed based on that user's listening history and other interactions, such as likes and shares. In this way, a feedback loop is created and Spotify can guide the user towards specific genres, artists, tastes and music listening patterns, as well as offering other personalised recommendations (Prey, 2019). Similarly, Mathieu (2023) argues that data and algorithms are mostly beneficial for producers and industry members who attempt to measure, imagine, make guesses about and control their audiences through quantitative measures. This increase in personalisation and algorithmic recommendations has also led researchers to focus on how much power and influence algorithms and their recommendations might have on users, who are often seen as gullible (Livingstone, 2019). In this way, certain music and music recommendations can be viewed as contributing both to a habit or routine in which a person is not thinking about music and to the acceptance of algorithmic recommendations, through which users become dependent on the platform.
However, even if algorithms can have a certain degree of influence, music streaming is contextual, and users negotiate between social and algorithmic discovery (Johansson, 2017a, 2017b). In other words, users do not necessarily accept algorithmic content without thinking about it (Cole, 2024; Livingstone, 2019; Mathieu, 2023; Siles, 2023). This can be seen when, even if music streaming platforms claim to encourage users to explore and discover new music, these platforms are used for a combination of music exploration and archival practices (Lüders, 2019, 2020), with daily routines, moods and algorithmic recommendations being part of these processes. Furthermore, even if genre blending and playlisting are increasing to some extent, research also suggests that users do not cross ‘categorical borders’ when interacting with music online (Airoldi, 2021; Drott, 2013).
Research exploring patterns using Spotify streaming data shows similar results. For example, Datta et al. (2018) find that as people adopted online music streaming, there was an increase in music diversity, exploration and discovery, especially during the first months after adopting the service. This is linked to a subscription model that provided access to more music in a more economical way as compared to buying albums separately. Further research shows patterns that involve discovering new music and returning to music that has been previously listened to over time (Mok et al., 2022; Park et al., 2019). Music is also linked to people's offline lives: older users show more variety-seeking behaviours than younger users, energetic music increases during the day and relaxing music at night, and seasonal events (e.g. Christmas) encourage listening to certain types of music (Mok et al., 2022).
However, it is necessary to explore the reflexivity processes involved in the development of these listening habits and decisions, as they do not necessarily occur in non-conscious ways, even if one follows algorithmic recommendations. Different perspectives on reflexivity emphasise different aspects of the process, such as musical tastes, rationality or emotions. For example, building on DeNora’s (2000) aesthetic view of reflexivity, Nowak (2016) argues that emotion (emotional reflexivity) is key in music consumption in digital contexts and finds that personal and social contexts trigger music listening through this form of reflexivity. In addition to thinking in terms of emotions, identity and the music's relationship to public and private spaces, as DeNora (2000) and Hesmondhalgh (2013) do, Nowak (2016) also identifies listening modes in terms of settings and the technological options available to a person. In this study, I focus on streaming services as the main technological option available and adopt a critical realist perspective to emphasise the individual's concerns and motivations as part of their reflexive process, which may or may not include emotions.
Within a critical realist perspective, reflexivity acts as a key mediator between structures and actions, as individuals have the capacity to consider themselves, their (previous) actions and their experiences, beliefs, emotions and contexts before acting (Archer, 2003, 2007, 2012). In other words, humans can consciously reflect on these influences before deciding whether and how to act (Archer, 2003, 2007). In this way, even if actors act according to the conditioning of social structures, they are not predetermined to do so, because the courses of action taken by individuals are the result of reflexive deliberations. In this way, even the most mundane social practices are led by intentional reflexive deliberations regarding our concerns and goals.
While this perspective relies on completely conscious actions, other theories, such as Bourdieu's work on the habitus, rely on non-conscious or habitual actions (Elder-Vass, 2007, 2010). Critical realist scholars have attempted to reconcile the ontological contradictions between these perspectives and reintroduce routines and habits into Margaret Archer's model (Elder-Vass, 2007, 2010; Sayer, 2010). For example, Elder-Vass’s (2007) emergentist theory of action argues that some experiences and conditionings can lead to routine practices that do not require intentional reflexivity, meaning that routine and reflexivity can function together.
Elder-Vass (2007, 2010) argues that as humans, we develop a series of emergent mental entities in the form of, for example, reasons, beliefs and dispositions. As humans, we also have the causal power of reflexivity, meaning that we can think about these mental entities, influences and previous experiences to plan and make decisions. Over time, these decisions are stored and implemented mostly non-consciously, thus becoming habits and routines. In sum, Elder-Vass argues that actions are the outcomes of very recent reflections, which are seen as decisions, older reflections (decisions that we may have forgotten we made but still shape us) and experiences (such as habits or skills), which affect how we make decisions. In addition to users encountering music recommendations, their reflections and decisions are key to understanding listening modes and behaviours, as outlined in this theory. These decisions are related to not only the music recommendations made in the moment but also previous musical tastes, time of the day and social contexts, among others.
Exploring music listening modes
Through these reflexive processes, listening modes emerge based on previous experiences as well as current concerns, emotions and motivations. Nowak (2016), for example, identifies role-normative listening modes as patterns of music listening that depend on the affect assigned to particular music as well as the particular material technology used and the context in which the music can be listened to. In this way, he emphasises emotions as the main element determining what makes music adequate for a given context. In this study, I intend to explore similar modes in the sense of what listeners find to be the correct music for a particular context, but I also explore broader definitions of reflexivity, adopting a critical realist perspective.
In other words, I see listening modes as broader mental frameworks that emerge in a person as the result of the interactions between that person's personal context, the platform, and the music. Like reflexivity modes in Archer's work (e.g. 2007), listening modes mediate the interplay between cultural and social influences, the platforms we use, our previous experiences and the actions and practices we choose. In this way, a listening mode is composed of the external elements of music listening, such as how the platform is used or what songs are played, but also internal elements of agency, such as the mental and reflexive processes that lead to an action. In other words, a listening mode is composed of practices and actions, but it is also formed by the motivations, goals, emotions and previous personal experiences that may lead a person to a given practice or listening habit. Thus, here, the term ‘listening mode’ refers to how and why a person decides to listen to music, regardless of whether this is a one-time event or a recurring habit.
Furthermore, as an additional level of external influence on listening modes, it is necessary to consider the role algorithmic music recommendations may have, which has not been explored in previous studies on reflexivity. Although previous work on modes of music consumption has considered the material forms and technologies used to listen to music (Nowak, 2014, 2016), in this study, I focus on music streaming platforms and the influence they may have on users and the potential listening modes. While it is not possible to view the specific way Spotify works for each participant, the recommendations a person receives are expected to match the music the user wants to listen to, and how the data and recommendations are perceived may also change previously planned actions (Carrigan, 2017; Cole, 2024; Walsh, 2024). This also indicates the need to differentiate between media use and media practices (Mathieu, 2023). Media use, such as clicks, likes or plays, can be used to measure engagement with digital content, but not necessarily the practices that give meaning to this engagement. Thus, when exploring listening modes in the context of music streaming platforms, it is necessary to consider how algorithmic music recommendations are perceived, how people think about algorithms and the music they like, and whether they intend to accept these recommendations and play the music. By examining this combination of reflexive and habitual thinking, this study explores the types of music played at different moments and the motivations behind this, as well as the role music recommendation plays here.
Method
Sample and data collection
To explore how and why people listen to music in their everyday lives, a combination of semi-structured interviews and data donations was analysed. Participants were recruited through snowball sampling through acquaintances, social media posts and recruitment posters at universities, libraries and record stores. The sample consisted of 20 Spotify users (12 women and eight men) from five nationalities (Norway, Costa Rica, China, Germany and France) who live in Norway (17) or Costa Rica (3). The participants were over 18 years old (21–65 years old, with an average of 34) and made up a highly educated sample that ranged from bachelor's students to PhD graduates working in diverse fields. All the participants used Spotify as their main music streaming platform and had had a Premium subscription for at least one year. The participants’ names have been changed in the analysis below.
Interviews with the participants were conducted in 2022 and 2023 to discuss their music listening habits, algorithmic knowledge and streaming experiences while using Spotify. Open questions that were previously developed through a literature review were posed to allow participants to speak freely about their experiences; in this way, I adopted a non-evaluative conversational approach (Brinkmann and Kvale, 2018; Smith and Elger, 2014). The interviews began with questions regarding music taste and streaming habits and ended with questions about participants’ understanding of the platform and its algorithms. Following think-aloud components used in previous research (Siles et al., 2020; Swart, 2021), participants were asked to open the platform on their phones or computers during the interview to explain how they engage with the recommendations and how they use them in various moments throughout the day.
At the end of the interviews, the participants were asked to donate their Spotify account data and thus allow me to explore the digital traces they had left behind (Ohme et al., 2023). Any person can log into their Spotify profile and request their data, which Spotify will send in 5 to 30 days. It was explained to the participants what data they would be downloading and how to delete any files containing personal information before donating their files. All participants agreed to donate their data, but only 14 sent their files. The files contained data for up to a year from the day they were requested and included a summary of the user's playlists, streaming histories and search queries, among other files. It should be noted that at the time of the interviews, it was not possible to request the extended streaming history, which includes more detailed information. Since then, Spotify has made these data more accessible.
In the current study, the analysis will focus on the streaming history files donated by the users. A person's streaming history includes the date and time at which a song was played, the name and artist of each song and how long each song was listened to during the past year, which ends the day the data were requested and downloaded. The total number of tracks included for each participant's streaming history ranged between 2836 and 67,761 tracks (17,108 on average). The complete dataset that combined all 14 participants who donated their data included 239,513 tracks played between March 19th, 2021, and January 5th, 2023, depending on when they were interviewed and when they requested their data. Note that Spotify does not define what is considered a streamed track and includes all songs streamed by a user in its data, regardless of whether they were fully streamed or not. Furthermore, the total number of tracks was filtered to remove audio-only tracks, such as podcasts, and focus on music use, including all songs that were played in their entirety, played for just a few seconds or skipped. In addition to these filters, some errors occurred when accessing Spotify's API to collect song information and audio features. This resulted in some songs being excluded from the final dataset and two participants not being included. In the analysis, 193,587 tracks played by 12 participants were used.
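As a minimal sketch of this loading and filtering step, the Python below parses one donated streaming history file and drops non-music entries while keeping short or skipped plays. It is not the script used in the study. The field names (endTime, artistName, trackName, msPlayed) are assumed to follow the layout of Spotify's standard account data export, and the sample records, the participant label and the set of non-music names are hypothetical.

```python
import json
from datetime import datetime

# Hypothetical sample mimicking the assumed layout of a "StreamingHistory" file:
# endTime ("YYYY-MM-DD HH:MM"), artistName, trackName, msPlayed.
SAMPLE_EXPORT = json.dumps([
    {"endTime": "2022-03-01 08:15", "artistName": "Artist A",
     "trackName": "Song 1", "msPlayed": 201000},
    {"endTime": "2022-03-01 08:16", "artistName": "Artist B",
     "trackName": "Song 2", "msPlayed": 3200},      # skipped after a few seconds
    {"endTime": "2022-03-01 21:40", "artistName": "Hypothetical Podcast Show",
     "trackName": "Episode 12", "msPlayed": 1800000},
    {"endTime": "2022-03-02 13:05", "artistName": "Artist A",
     "trackName": "Song 1", "msPlayed": 201000},
])


def load_history(raw_json, participant_id):
    """Parse one export file into a list of play records."""
    plays = []
    for row in json.loads(raw_json):
        plays.append({
            "participant": participant_id,
            "played_at": datetime.strptime(row["endTime"], "%Y-%m-%d %H:%M"),
            "artist": row["artistName"],
            "track": row["trackName"],
            "ms_played": row["msPlayed"],
        })
    return plays


def keep_music(plays, non_music_artists):
    """Drop non-music entries (e.g. podcast shows); keep full, short and skipped plays."""
    return [p for p in plays if p["artist"] not in non_music_artists]


plays = load_history(SAMPLE_EXPORT, "P01")
music = keep_music(plays, {"Hypothetical Podcast Show"})
print(len(plays), len(music))
```

No minimum play duration is applied here, which mirrors the decision to retain songs that were skipped or played for only a few seconds.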
The data donated by the participants were read and analysed using Python code. This was done to identify each song, each artist and the time and day each song was played. The song name and artist were used to search for each song's ID and audio features in Spotify's API with the Spotipy wrapper (Web API Spotify for Developers, n.d.; Welcome to Spotipy! Spotipy 2.0 Documentation, n.d.). Among the audio features provided by the API (e.g. liveness, loudness, mode, speechiness, tempo and time signature), five were considered the most relevant to this project based on the data obtained from the interviews and the need to consider a varied range of features. These audio features were defined according to Spotify's API documentation:
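The lookup described here can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the functions accept any client object exposing `search` and `audio_features` methods shaped like those of a Spotipy `Spotify` client, and `OfflineClient` is a hypothetical stand-in so that the example runs without credentials or network access. Songs that return no match are skipped, which is one way in which tracks can drop out of a final dataset.

```python
# The five audio features retained for the analysis.
FEATURES = ("acousticness", "danceability", "energy", "instrumentalness", "valence")


def find_track_id(client, track, artist):
    """Search by song name and artist; return the first matching track ID or None."""
    query = f"track:{track} artist:{artist}"
    result = client.search(q=query, type="track", limit=1)
    items = result["tracks"]["items"]
    return items[0]["id"] if items else None


def fetch_features(client, track_ids):
    """Request audio features in batches of up to 100 IDs and keep the five of interest."""
    selected = {}
    for start in range(0, len(track_ids), 100):
        batch = track_ids[start:start + 100]
        for track_id, feats in zip(batch, client.audio_features(batch)):
            if feats is not None:  # the API may return no data for some tracks
                selected[track_id] = {name: feats[name] for name in FEATURES}
    return selected


class OfflineClient:
    """Hypothetical stand-in that mimics the two client methods used above."""

    def search(self, q, type, limit):
        if "Unknown" in q:
            return {"tracks": {"items": []}}
        return {"tracks": {"items": [{"id": "track-001"}]}}

    def audio_features(self, tracks):
        return [{"id": t, "acousticness": 0.3, "danceability": 0.6, "energy": 0.5,
                 "instrumentalness": 0.1, "valence": 0.4, "tempo": 120.0}
                for t in tracks]


client = OfflineClient()
track_id = find_track_id(client, "Song 1", "Artist A")
features = fetch_features(client, [track_id])
print(track_id, features[track_id])
```

With real credentials, a Spotipy client would take the place of `OfflineClient`; whether the audio features endpoint is available to a given application depends on Spotify's current API terms.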
Acousticness (M = 0.34, SD = 0.35): ‘A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic’.
Danceability (M = 0.61, SD = 0.17): ‘Describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable’.
Energy (M = 0.57, SD = 0.24): ‘A measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy’.
Instrumentalness (M = 0.11, SD = 0.28): ‘Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater the likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0’.
Valence (M = 0.47, SD = 0.24): ‘A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry)’.
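The means and standard deviations reported for each feature are computed over all plays in the dataset. The short sketch below shows one way of obtaining such a pair of values. The valence scores are made up, and the use of the sample standard deviation (rather than the population version) is an assumption, as the choice is not stated above.

```python
from statistics import mean, stdev


def describe(values):
    """Return (M, SD) rounded to two decimals, using the sample standard deviation."""
    return round(mean(values), 2), round(stdev(values), 2)


# Hypothetical valence scores for three plays.
valence_scores = [0.2, 0.4, 0.6]
m, sd = describe(valence_scores)
print(f"Valence (M = {m}, SD = {sd})")
```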
Analysis procedure
The analysis design follows the general structure suggested by critical realist methodology and its approach to mixed methods, in which quantitative methods are used to establish patterns and guide the use of qualitative methods to reveal mechanisms, structures and explanations (Porpora, 2015; Zachariadis et al., 2013). Furthermore, the analysis follows methodological guidelines currently in development as reciprocal methodologies, which propose a model that merges a ‘computational analysis of user-level digital trace data and interviews with the same users’ (Robinson and Cole, 2024: 1). In the analysis below, the users’ aggregated streaming histories were used to explore their music tastes, top artists and listening patterns via descriptive statistical tests. Furthermore, the data were analysed to explore how frequent each audio feature was at various times of the day and week using a series of analysis of variance (ANOVA) tests.
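To illustrate the kind of test involved, the sketch below groups plays by part of the day and computes a one-way ANOVA F statistic for one audio feature. It is a simplified illustration, not the analysis code of the study: the four six-hour bins are an assumption, the plays are hypothetical, and the F statistic is computed by hand so that the example has no dependencies. In practice, a statistics library would normally be used, which would also return the p-value.

```python
from collections import defaultdict


def part_of_day(hour):
    """Map an hour (0-23) to one of four assumed six-hour bins."""
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def group_by_part_of_day(plays, feature):
    """Collect one feature's values per part of the day from (hour, features) pairs."""
    grouped = defaultdict(list)
    for hour, features in plays:
        grouped[part_of_day(hour)].append(features[feature])
    return dict(grouped)


def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    groups = [g for g in groups if g]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical plays: (hour of day, audio features of the song played).
plays = [
    (7, {"energy": 0.70}), (9, {"energy": 0.80}), (10, {"energy": 0.75}),
    (14, {"energy": 0.60}), (15, {"energy": 0.65}), (16, {"energy": 0.55}),
    (22, {"energy": 0.30}), (23, {"energy": 0.35}), (21, {"energy": 0.25}),
]
energy_by_period = group_by_part_of_day(plays, "energy")
f_value = one_way_anova_f(list(energy_by_period.values()))
print(sorted(energy_by_period), round(f_value, 2))
```

A large F value indicates that the feature differs more between parts of the day than within them; the same grouping could be repeated for days of the week.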
Then, the interview data were used to find explanations and motivations for listening to music at different times. The Spotify data were donated and analysed after the interviews, meaning that the participants were not asked specifically about the patterns identified in their donated data. Still, they were asked about how they use Spotify in their everyday lives, and the explanations given by the participants regarding how they use music and engage with Spotify were useful in exploring their motivations. The interviews were analysed with NVivo, and a critical realist approach to thematic analysis was adopted (Fryer, 2022; Wiltshire and Ronkainen, 2021). First, the interviews were analysed via a data-led approach, allowing descriptive codes to emerge from the data, instead of using previously established or theory-based categories. Then, these codes were reviewed and grouped based on emergent explanatory themes and new theory-based codes when necessary. The resulting themes were considered to be general listening patterns and motivations, and they were compared to the data patterns identified in the data donations.
Analysis
In the analysis below, the Spotify data provide an overview of the participants’ music use in terms of what they listen to and when. These data are used to explore and compare how participants listen to various types of music throughout the day. Then, interviews are used to explore music listening practices and explain the listening patterns identified in the Spotify data. While the participants were not asked specifically about their streaming histories, the ways in which they describe their listening habits provide insights into their motivations for choosing different music at different times. The analysis also explores what causes these listening modes and patterns in terms of user reflexivity, as well as the role of algorithmic recommendations.
Music listening modes
All the participants in this study revealed diverse listening modes based on their concerns, motivations, emotions and intentions related to music. As seen in previous research, the nature of these purposes varies. Here, the participants’ intentions are described in terms of three listening modes, which were identified primarily in the interview data: the attentive mode, the emotional lean-back mode and the emotional lean-forward mode. While these three modes are, to some extent, overlapping, sequential and not exhaustive, they represent three groups of concerns and motivations that lead to particular music listening practices.
The first listening mode, here named the attentive listening mode, refers to moments and practices in which the participants aim to actively listen to and explore music, both old and new, and engage with it by paying attention to it. Jakob, a 28-year-old video game programmer, explains that he enjoys listening to music to pay attention to the musical arrangements and lyrics. In these moments, he will actively explore the music and find new artists and recommendations. He explains this as follows: So, for me, I like to listen to music in two ways. [I listen for] something that's nostalgic or fun or that I enjoy, but then, also, on a technical level, [I listen] to find new tracks and new, interesting, music compositions or interesting bands that do a weird mix of genres and stuff. So, for me, it's comfort music that I know and love and that I [have] listen[ed] to since middle school and, then, finding new bands.
A similar intention is expressed by Emma, a 28-year-old gymnast, and Olivia, who is 34 years old and works for the municipality, who both explain that the music they listen to depends on how much attention they will need to pay to the music. Like Jakob in the quote above, Olivia explains how she separates her emotional, or background, listening from moments during which she can appreciate or pay more attention to the music: ‘On the subway, I listen more intensely to the music. I listen to the lyrics and get really into it, and when I’m working, it's just background noise, sort of ambiance’. It should be noted that this also speaks to how participants reflect on their routines and decide when to adopt various listening modes depending on their needs and goals, as will be discussed below.
The attentive listening mode is guided by a person's interest in music, ability to pay attention to music and desire to explore and (re)discover music. This also means that when participants practise this mode, they are likely to be at their most reflexive and be the most conscious of their listening behaviours and how they use the streaming platform. Thus, while adopting the attentive listening mode, a person may be critical of recommendations but, at the same time, embrace and pay attention to algorithmic playlists and the potential to discover music through them. This listening mode is difficult to identify in streaming histories because Spotify's audio feature data emphasise mood and emotion. There is a closer link between the streaming histories and the emotion-based listening modes reviewed below. As seen in Olivia's and Jakob's quotes above, there is an attempt to separate exploratory and attentive listening from comfortable background listening. This is reflected in the two emotional listening modes described below, which are used by participants in relation to an activity or emotion but do not necessarily represent attempts to pay close attention to the music and lyrics, as in the attentive mode.
Emotion and context play an important role in how music is organised and understood by audiences; this was true before the advent of streaming platforms and remains true today (Airoldi, 2021; Airoldi et al., 2016; DeNora, 2000; Hesmondhalgh, 2013; Siles et al., 2019). Spotify also uses affects, emotions and contextual descriptions to organise music on the platform and develops specific measures for them. In the interviews, the participants explained that they make different music choices throughout the day based on their emotions and what kind of music they need to address them. The participants refer to emotional uses of music when they describe how they decide to play a song or artist for nostalgic reasons, enjoyment or comfort. For example, Emma explains that she often plays online (traditional) radio from her home country for the sake of nostalgia, while Olivia plays music she is comfortable with and can dance to when she is at home. These emotion-based music listening modes have been described in previous research in terms of how music is used for physical activity or as a memory aid (DeNora, 2000; Nowak, 2016).
The first emotion-based mode identified in the interviews is the lean-back emotional mode. In many cases, the participants explain that they play comfort music in the background while they perform other activities, such as reading or playing a video game. While these participants are not passive in terms of how they select music, this emotion-based listening mode refers, to some extent, to leaned-back engagements with the selected music, as it is often played in the background and not necessarily as a mood management strategy. This also means the participants are less critical of algorithmic recommendations as long as they fit their current moods or social situations.
This listening mode refers to listening to music you already know, comfort music and background music, regardless of whether these are found in an algorithmic playlist or a user-created one. Thus, there is less conscious exploration than there would be in the attentive mode, but algorithmic recommendations are not rejected. This can be seen, to some extent, in the participants’ streaming data. Among the 193,658 songs listened to in the final sample, there were 37,584 unique tracks and 11,962 unique artists, suggesting that there is a large amount of repetition and habit, both within and between individuals. However, when we consider which artists were played the most, it seems that there was high variation in this regard. The most-listened-to artist was Taylor Swift (16,598 times), followed by Ariana Grande (3059), Ólafur Arnalds (2818), Karpe (2128), and Hikaru Utada (1796). Given that these account for a small part of the total streams, this fact suggests that many artists are listened to for only a few songs, indicating an attentive listening mode on the part of some participants who aim to explore music.
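The counts in this paragraph can be related to one another with simple arithmetic. The 193,658 plays spread over 37,584 unique tracks correspond to roughly five plays per track, and the five most-played artists together account for 26,399 plays, or about 13.6% of the total. The sketch below shows how such a summary could be derived from a list of play records. The record structure and the sample data are hypothetical.

```python
from collections import Counter


def summarise(plays, top_n=5):
    """Count plays, unique tracks and unique artists, and rank the most-played artists."""
    unique_tracks = {(p["artist"], p["track"]) for p in plays}
    artist_counts = Counter(p["artist"] for p in plays)
    top_artists = artist_counts.most_common(top_n)
    top_plays = sum(count for _, count in top_artists)
    return {
        "plays": len(plays),
        "unique_tracks": len(unique_tracks),
        "unique_artists": len(artist_counts),
        "top_artists": top_artists,
        "top_share": top_plays / len(plays) if plays else 0.0,
        "plays_per_track": len(plays) / len(unique_tracks) if plays else 0.0,
    }


# Hypothetical play records: Artist A dominates, the others appear once or twice.
plays = (
    [{"artist": "Artist A", "track": "Song 1"}] * 4
    + [{"artist": "Artist A", "track": "Song 2"}] * 2
    + [{"artist": "Artist B", "track": "Song 3"}] * 2
    + [{"artist": "Artist C", "track": "Song 4"},
       {"artist": "Artist D", "track": "Song 5"}]
)
summary = summarise(plays, top_n=2)
print(summary)

# Share of the five most-played artists, using the figures reported in the text.
reported_top_five = 16598 + 3059 + 2818 + 2128 + 1796
print(round(reported_top_five / 193658 * 100, 1))
```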
Still, returning to music one has previously listened to also suggests that a group of superfans routinely return to their favourite music in emotional lean-back mode. Taylor Swift was listened to 16,598 times by 10 participants, and her most-listened-to song was listened to over 250 times. However, one participant listened to Taylor Swift 16,361 times, while the others listened to her music between 1 and 259 times. The situation is similar for Ariana Grande, Ólafur Arnalds and Karpe, for whom there are differences of 2585, 2809 and 670 streams between their first and second top listeners, respectively. In the case of Hikaru Utada, a single participant led to this artist placing fifth. Thus, this suggests that participants do return to artists they know. When they do this for comfort and nostalgia, it represents the lean-back mode. However, when this is done to achieve a certain mood or motivate a routine, this represents the forward-leaning mode, which will be discussed later in this section.
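The concentration of an artist's streams in a single listener can be sketched as follows. The function and the `participant` and `artist` field names are hypothetical and the records are toy data; only the final line uses figures reported in the text.

```python
from collections import Counter

def listener_gap(streams, artist):
    """Describe how an artist's plays are distributed across participants."""
    per_participant = Counter(
        s["participant"] for s in streams if s["artist"] == artist
    )
    ranked = per_participant.most_common()
    if not ranked:
        return None
    top = ranked[0][1]
    second = ranked[1][1] if len(ranked) > 1 else 0
    return {
        "listeners": len(ranked),
        "top_plays": top,
        "gap": top - second,                         # first minus second top listener
        "top_share": top / sum(per_participant.values()),
    }

# Toy records with hypothetical field names (not the study's data).
toy_streams = (
    [{"participant": "P1", "artist": "Artist A"}] * 5
    + [{"participant": "P2", "artist": "Artist A"}] * 2
    + [{"participant": "P3", "artist": "Artist A"}]
    + [{"participant": "P2", "artist": "Artist B"}] * 3
)
gap_a = listener_gap(toy_streams, "Artist A")

# Share of the reported Taylor Swift streams that came from one participant.
reported_top_share = 16_361 / 16_598  # roughly 0.99
```

On the reported figures, a single participant accounts for nearly 99% of the Taylor Swift streams, which is why the aggregate ranking alone says little about how widely an artist is shared across the sample.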
Furthermore, while the aim of this article is not to analyse artists or genres, it is worth noting that this top 5 also suggests some of the listening practices discussed below. Most of the artists in the top 5 are pop or rap artists, and due to their large catalogues, their songs range from energetic to acoustic and may be used by participants for a variety of reasons. Similarly, considering that Arnalds’ music is mostly instrumental, it is possible that this artist's music is primarily used during work, as some participants suggest doing as a habit. Still, while some information contained in the streaming histories, such as the time at which a song was streamed, hints at how the song is used, the specific meanings and practices surrounding these artists cannot be identified with these data alone. In this way, the third listening mode strongly reflects the ways in which various routines influence a person's music use throughout the day.
The lean-forward emotional mode, a second type of emotional listening mode, stands in contrast to the first one in that it refers to active engagement with music. While those who adopt this mode are not necessarily more goal driven than the first group, as the examples above also show that participants are conscious of why they play certain music, this listening mode involves clearer mood management objectives and more active engagement with the music, which often cannot be relegated to algorithmic decisions or recommendations. While these two modes can be considered separate, they can also be seen as sequential in that the lean-forward mode is the more reflexive and conscious of the two and leads to the less active lean-back mode, as in the example provided below.
Most participants practise lean-forward emotional listening. For example, Noah, a 32-year-old IT researcher and programmer, describes his listening modes as work states that depend on the moods he feels and needs: calm and focused or upbeat and energetic. These states also depend on his enjoyment of the music. These active emotional listening modes are often used by the participants to assume the mood required for a particular activity. While Noah uses calm instrumental music or upbeat pop to focus at work, depending on the task, Sofie, a 25-year-old university student, uses energetic music to motivate herself before meetings.
The participants’ overall music practices and listening modes are reflected in their music streaming histories, both individually and as an aggregated dataset. When we consider the general trend in each individual participant's donated streaming history, a general habit or preference can be found in that most participants’ histories showed higher levels of energy, valence (positiveness) and danceability than of features such as acousticness and instrumentalness. These features were higher, on average, throughout the year and during the day, both for individual participants and for all participants at the aggregate level.
More specifically, two ANOVAs were used to determine the statistical differences between the relative prevalences of specific audio features on various days of the week and at various times of day. The first ANOVA showed that the differences between groups (days of the week) are statistically significant for all audio features: energy, F(6, 84 836.27) = 47.83, p < .001; acousticness, F(6, 84 835.55) = 57.46, p < .001; instrumentalness, F(6, 84 717.94) = 35.71, p < .001; danceability, F(6, 84 880.66) = 30.23, p < .001; and valence, F(6, 84 897.83) = 16.23, p < .001. Games–Howell's post hoc tests show that the most significant differences in audio features were those between weekdays and weekends. Energy, danceability and valence levels are significantly different and higher on Friday than on most days, especially Sunday. Conversely, acousticness and instrumentalness levels are significantly higher on Sunday than every other day, especially Friday. Table 1 provides a summary of the changes in these audio features over the week.
Table 1. Summary of audio features during the week for all participants.
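The fractional denominator degrees of freedom reported for the day-of-week comparison are consistent with an unequal-variance (Welch-type) one-way ANOVA, which is also the setting in which Games–Howell post hoc tests are commonly used. The article does not name the variant, so the following is only a minimal sketch under that assumption. The function name and the toy energy values are hypothetical, and the p value is omitted because it requires an F-distribution function from a statistics library.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Welch's one-way ANOVA: return (F, numerator df, denominator df)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by its size divided by its sample variance.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    between = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df_between = k - 1
    df_within = (k ** 2 - 1) / (3 * lam)  # usually not an integer
    return f_stat, df_between, df_within

# Toy energy values for three days (hypothetical, not the study's data).
energy_by_day = {
    "Wednesday": [0.1, 0.2, 0.3],
    "Friday": [0.3, 0.4, 0.5],
    "Sunday": [0.2, 0.3, 0.4],
}
f_stat, df_between, df_within = welch_anova(list(energy_by_day.values()))
```

With seven groups, one per day of the week, the numerator degrees of freedom would be 6, as in the values reported above.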
Similarly, the second ANOVA showed that the differences between times of the day (morning, afternoon and evening) are statistically significant for all audio features: energy, F(2, 92 107.72) = 784.71, p < .001; acousticness, F(2, 91 962.88) = 612.97, p < .001; instrumentalness, F(2, 90 352.14) = 1574.16, p < .001; danceability, F(2, 92 517.40) = 548.27, p < .001; and valence, F(2, 93 443.45) = 347.06, p < .001. Games–Howell's test shows that all groups are significantly different (p < .001) except for the levels of energy (p = .611) and acousticness (p = .739) in the morning and afternoon. Table 2 provides a summary of these features at various times of day.
Table 2. Summary of audio features during the day for all participants.
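Grouping streams by time of day before comparing audio features could look like the sketch below. The hour boundaries (morning from 05:00, afternoon from 12:00, evening from 18:00 and through the night) are assumptions made only for illustration, as the cut-offs used in the analysis are not stated here. The `played_at` and `energy` field names and the toy records are likewise hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def time_of_day(timestamp):
    """Map an ISO 8601 timestamp to a period, using assumed hour boundaries."""
    hour = datetime.fromisoformat(timestamp).hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "evening"  # assumed to include night-time streams

def mean_feature_by_period(streams, feature):
    """Average one audio feature within each period of the day."""
    buckets = defaultdict(list)
    for s in streams:
        buckets[time_of_day(s["played_at"])].append(s[feature])
    return {period: mean(values) for period, values in buckets.items()}

# Toy records with hypothetical field names (not the study's data).
toy_history = [
    {"played_at": "2022-03-04T08:15:00", "energy": 0.8},
    {"played_at": "2022-03-04T09:00:00", "energy": 0.6},
    {"played_at": "2022-03-04T14:30:00", "energy": 0.7},
    {"played_at": "2022-03-04T23:10:00", "energy": 0.2},
    {"played_at": "2022-03-05T02:00:00", "energy": 0.4},
]
energy_by_period = mean_feature_by_period(toy_history, "energy")
```

The per-period lists built inside `mean_feature_by_period` are the kind of groups that a one-way ANOVA with three levels would compare.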
While the differences in emotional audio features revealed by the streaming history data and the ANOVAs may seem habitual, these music uses do not occur non-consciously. The users have reasons, goals, motivations and experiences, as reflected in the listening modes, that explain why they might prefer one type of music over another. Nevertheless, as mentioned above, even if these general patterns show the existence of general preferences or routines in the participants’ lives, the exact reasons these users prefer these music features and types of music are not visible in the data donations alone.
Emma, for example, explains that she is a gymnastics coach and uses Spotify to motivate herself and her students during training, as well as when she drives to and from her training sessions. This explains why her music is highly energetic throughout the day in her streaming history data. She also mentions that she likes to use slow music and white noise when she goes to sleep, which accounts for the increase in acousticness during the night that is reflected in her donated data and in the tests above.
However, not all participants followed this initial trend. While ten of the participants had higher average levels of energy and danceability in their individual streaming histories than the rest, two of these ten participants showed additional variation during certain periods, during which acousticness was higher than energy and danceability. Furthermore, the two remaining participants in the sample showed the opposite trend, with higher levels of acousticness and instrumentalness emerging as a general pattern. For example, William, a 41-year-old associate professor who prefers to listen to acoustic and instrumental music, both for enjoyment and at work, shows a different pattern. In contrast to the general pattern, which shows high energy levels, William's streaming history shows higher levels of acousticness throughout the year and on specific days. In the interviews, he explains that while he adopts the emotional listening modes to decide what music to play, for example, at work, he also uses the attentive listening mode in his free time, during which he enjoys curating his own music and making playlists.
Reflexivity, algorithms and other triggers
While music listening modes are, to some extent, related to an activity, the listening modes described by the participants are not necessarily linked to a place. Considering the fact that all the participants have access to Spotify on their phones and computers, they can decide when to listen to music in an emotional or attentive listening mode regardless of place. For example, most participants use travelling between their homes and workplaces as a period during which to listen to music. Some prefer to listen to music without paying attention to it, while others describe their listening as more attentive. As Nowak (2016) describes, music listeners attempt to find an adequate song for a particular context. As seen in the sections describing the listening modes above, each participant's individual goals, motivations and contexts, in addition to affordances and material influences, shape their reflexivity and decision-making in selecting the right music. Sofie explains as follows:
It's really important. I mean, it's every day all the time. Every time I go out to the street, I need my earphones. If I don’t have my earphones, I don’t want to do the activity I have to do. So, it's kind of the motivation boost, I would say, and it's also a moment when I focus on my ideas. Like, a me-time. I need my music all the time, and that's why Spotify is good, because I can have it with me if I don’t have any service or if I don’t have Wi-Fi or 4G or [I am] in the plane or whatever. I need to always have the possibility to hear my music.
Aligned with the emergentist theory of action (Elder-Vass, 2007), the participants show how they have decided which songs will be adequate for various activities. None of the music modes are completely habitual or automatic, but some decisions to play specific bands are non-conscious in that these songs have already been decided upon or organised in the past with a specific moment in mind. In this way, a person has already decided what music works when they are going to sleep or driving and can play it in a habitual way, but the decision to play such music remains a conscious and reflexive one. In this way, Sofie knows which music will motivate her before a meeting, William knows which music works for writing and Emma knows what music helps her sleep. They all reflexively decide to play a particular artist or playlist at a particular moment based on their previous experiences with that artist, playlist or context.
As many decisions on the adequacy of a song are made based on previous experiences, the participants often know what to play and open Spotify with a song, artist or playlist in mind. In many cases, the participants explain that they prefer to avoid the algorithmic recommendations, even though they do use them. There is a difference between what can be seen in a user's data and their music-listening practices. Spotify's recommendations and the activities the music is used for are not always aligned. For example, William explains that even if he likes to listen to a particular type of music, this does not mean that he only wants that type of music to be recommended:
Because there's no way for the system to understand that I don’t want to listen to eight hours of [a band]. I do it because I need to write, but in the evenings or on the way to work, I want to listen to something else. And, then, I would prefer what I’ve chosen myself, but I would also like that the Release Radar would be better at understanding what it is that I listen to from this hour to that hour. I don’t know. Maybe there's just somewhere in the system I could have helped the algorithm.
Thus, algorithmic recommendations play an especially significant role in two of the three listening modes described above. First, they are important when users enter the attentive listening mode and actively explore and look for new music. In this context, recommendations are often useful in terms of finding related music. A similar listening mode and relationship with the algorithmic recommendations are seen in the exploration-archival dynamic discussed by Lüders (2019). Second, in the lean-back emotional mode, recommendations can be helpful when music is played in the background. For example, Fillip, a 32-year-old PhD candidate, explains that Spotify knows his habits and what to recommend when he is standing by his front door on his way out of his apartment, and Sara, a 21-year-old university student, explains that she often plays a Spotify Radio playlist with her friends based on an artist they all know they like because then, they do not need to decide what to play. In the third mode, the lean-forward emotional listening mode, algorithmic recommendations are less influential because users have clearer intentions regarding what to listen to and what they need to get from the music. As they need the right song, the algorithm is less trusted than their own previously curated playlists.
Discussion and conclusion
In this article, music listening patterns, motivations and experiences were explored to identify the listening modes used when selecting and streaming music. Three listening modes were identified via individuals’ reflexive deliberation about their motivations, contexts, concerns and musical tastes, among other topics. The attentive listening mode results from a person's interest in exploring and paying attention to music. The lean-back emotional mode results from a person's interest in playing comforting and background music. Finally, the lean-forward emotional mode results from a person's interest in playing music with a goal in mind, such as mood management. As in previous research, these modes show that there is a balance between passive and active music consumption (Johansson, 2017b).
The dynamic between passive and active ways of listening to music is affected by how and where users decide to listen to music. In this study, users were active in deciding what to listen to, even if it was meant to be background music. The participants report that even when they select music for a routine event, such as leaving the house in the morning, or for background music while at work, they have reasons for selecting this music. This suggests that the role of reflexivity remains central, in that the participants are reflexive and conscious of the music they play during particular moments. The participants have reflected, experimented and decided, over time, which music is adequate for each moment, and they are ready to play music they know they will enjoy in an almost-habitual way. In other words, reflection in the moment and previous reflections and experiences are combined when a person decides what music to listen to, as suggested by Elder-Vass’s (2007, 2010) work. When a person listens to music, they make a conscious and reflexive decision regarding what to play, but this is based on the context in which the music is being played, previous experiences with music and previously established routines and habits in that person's life.
A critical realist view of reflexivity and agency provides a framework within which to integrate these past experiences and the structural and technological influences that affect each user. Thus, it provides a new avenue via which to explore and understand how users make choices when streaming music and how decisions become contextual (Johansson, 2017a, 2017b; Johansson and Werner, 2017). I argue that to understand how users make these contextual choices, it is necessary to further explore the listening modes that emerge through the interaction of the elements discussed here as well as how these modes are transformed by the personal experiences, preferences and context a person provides over time. In this way, these modes have the potential to explain the relationship between spontaneous and habitual decisions as well as, potentially, provide insight into the structures that influence a person's everyday choices.
Considering previous work suggesting that algorithmic recommendations may shape the choices and behaviours of users (Just and Latzer, 2017; Maasø and Spilker, 2022), it was expected that the platform's recommendations would play a central role in the participants’ decisions and listening modes. However, the personal and emotional reasons for playing a specific artist or playlist were more important for users than the algorithmic recommendations and curated playlists provided by Spotify. As in previous research, while technology influences how people interact with music, there are typically reasons for listening to music beyond recommendations (DeNora, 2000; Hesmondhalgh, 2013; Johansson, 2017a; Nowak, 2016). In this case, the influence of the recommendations can be seen in some of the listening modes. The participants explain that they use recommendations with an objective in mind. In this way, algorithmic recommendations are used by the participants practising the attentive and lean-back listening modes, as these modes focus on music discovery or background music. As others have discussed (Mathieu, 2023; Siles, 2023), algorithms are often believed to have sufficient power to overwhelm users and determine user choices, but this is only one side of the argument. User agency remains central to how music is chosen and how technology is used, and it is necessary to address the interaction between users and algorithmic systems to understand how music listening practices are developed.
The interaction between data and the uses of these data was addressed, to some extent, by developing an innovative approach that combines data donations and interviews to reflect the distinction between media uses and media practices. The quantitative streaming histories analysed here reveal when various types of music are preferred and played but not why they are preferred. Energetic music may be used for motivation during sports training or work or at a party, among other possibilities. Thus, the music use identified in the Spotify data donations cannot be fully understood without the explanations provided by the participants regarding their music practices. However, while the size of the sample is not seen to be an issue, the access to the data did present issues. The participants were asked to open Spotify and reflect on their use of the platform, but it was not possible to ask them to reflect on the data provided by the streaming histories, which they donated after the interviews. Ideally, future studies will be able to access the data donations before the interviews or schedule a second interview to expand on the reflections regarding these data. Furthermore, while the sample was varied, it was also quite homogeneous and did not provide any findings regarding the ways in which nationality, age and gender may influence listening modes.
The listening modes and music listening habits identified here depict how music streaming practices are developed, but more work is necessary to describe how they are exercised and how they evolve. This study is one of the first to do so from a critical realist and audience-based perspective using Spotify data donations. There is a large amount of data available in the form of data donations and through Spotify's API that can be used to extend audience research on music streaming. Here, I focused on audio features, but other audio characteristics and genres can still be explored, as well as information on song duration and the playlists in which they are included. Future research can use the participants’ music streaming data to explore other, more nuanced patterns in consumer culture as well as to connect the streaming histories and practices with larger trends in the music industry landscape and on social media, such as TikTok.
Acknowledgements
The author would like to thank Taina Bucher and Marika Lüders for their feedback and support on this project, as well as the editor and reviewers for their comments on this article.
Ethical considerations
The study received ethical approval.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interest
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data is not available.
Informed consent
All participants provided written and verbal consent to participate.
