Abstract
We argue that electronic dance music (EDM) exhibits a parallel structural organization to that which has been proposed for cartoons (comics) after the model of hierarchical structure proposed in theoretical linguistics. According to this parallel, both systems are governed by general cognitive mechanisms for the narrative organization of tension and release, which are not modality-specific. We show that notions from visual narrative analysis, such as an Establisher–Initial–Peak–Release template, can be applied directly to EDM tracks as an Intro/Breakdown–Buildup–Core–Outro/Cut template. In doing so, we focus on how to formally define and operationalize relevant notions such as Breakdown, Buildup, and Core. As part of our analysis, we show that the scene-setting Establisher segments of visual narratives map onto two distinct categories in EDM: they correspond to intro sections at the beginning of a track and to breakdown sections in the middle of a track; we strengthen the analogy to visual narrative analysis by introducing refinements such as a pre-drop break that often occurs at the end of a buildup segment. To adjudicate between competing hypotheses on the hierarchical structure of a given EDM track, we demonstrate that analytical tests from linguistics and visual narrative analysis can be successfully applied. By introducing these analytical tools, this article sets the stage for further explorations in the linguistically informed analysis of the structure and meaning of EDM.
Background: From visual narrative grammar to electronic dance music
Recent years have witnessed the emergence of a body of work that extends tools that stem from—and are inspired by—formal linguistic theory to other research domains, such as visual narrative and music (Patel-Grosz et al., 2023). A central goal of this methodological extension is to reveal cognitive mechanisms that are found to be at play in varied objects of study, for example, the communicative mechanisms that underlie visual narratives, music, gesture, and dance. One prominent finding is the emergence of grouping-based structural organization, which gives rise to a rudimentary syntax in various study objects.
For example, the last 10 years have seen an increase in evidence that cartoons (comics) have a hierarchical structure, that is, a visual narrative grammar (Cohn, 2012, 2013a, and subsequent work). A sample analysis is cited in Figure 1, where panels are mapped onto narrative categories that structure the narrative, such as Peak, where narrative tension reaches its climax. Other narrative categories in Cohn’s system are Initial, where the narrative tension increases, and Release, where it dissipates. The Establisher functions as a scene-setter and is considered an optional category (Cohn, 2015) since it does not contribute to the development of narrative tension. Peak-containing sequences form constituents that inherit a higher-level role from one of the categories they contain. Figure 1 illustrates an Establisher–Initial–Peak constituent followed by an Initial–Peak constituent; since the second constituent has more narrative tension than the first, it serves as the higher-level Peak of the narrative arc. The root nodes of Cohn’s trees are labeled Arc to signify that they contain the entire narrative and do not have a function within the narrative. We argue that the presence of such narrative grammar is not an isolated property of visual narratives but a more general cognitive phenomenon that can also be found in other modalities (see also Antović, 2022 and Patrick et al., 2023, for comparable viewpoints). In this article, we are particularly interested in sound-based narratives found in one of today’s most popular types of music, electronic dance music (EDM).

Figure 1. Structural analysis of a boxing-match comic (Cohn, 2020, p. 361). © Neil Cohn (CC BY-NC-ND 4.0).
Definitions of syntactically relevant concepts
We define EDM as music that is produced electronically with the aim of encouraging dancing; this definition sets it apart from electronic art music, often presented to seated audiences, on the one hand, and dance music performed with non-electronic instruments, on the other hand. Since EDM specifies a type of music rather than a genre, it includes a variety of electronically produced genres, for instance, house, techno, trance, garage, and dubstep. EDM tracks generally emphasize rhythm characterized by a steady pulse (typically between 120 and 130 beats per minute in the case of the house genre and above 170 bpm in the case of drum’n’bass; see Gadir, 2014). The tracks are also primarily based on electronically produced sounds and rarely include harmonic shifts. Depending on the style, many EDM tracks also have a break routine (Solberg & Dibben, 2019; Solberg & Jensenius, 2017, 2019), destabilizing the track and creating excitement through musical peaks for the dancers and/or listeners. It is in connection with such break routines that a narrative grammar of EDM can be detected.
To analyze EDM syntax, we start by defining five terms that can be used to label sections of an EDM track (based on Butler, 2006; Gadir, 2014; Snoman, 2012; Solberg, 2014; Solberg & Dibben, 2019; and Solberg & Jensenius, 2017, 2019). As illustrated in Figure 2, an EDM track can typically be divided into the following sections: intro, breakdown, buildup, core, and outro (we address the drop below). Figure 2 is a macro-level analysis; more fine-grained analyses will be considered later in this article (see Solberg, 2014, for an analysis that splits each core section into two subsections). The core section is defined as the part of a track in which the main groove is maintained and both the bass line and bass drum are present. This is where dance clubbers typically exhibit the largest quantity of motion (Solberg, 2014; Solberg & Jensenius, 2019). Mark J. Butler coined the term core through his ethnographic work with DJs and producers, and in his words, it represents “the track in its most essential form” (Butler, 2006, p. 223). Breakdown is a formal section in which the musical texture and rhythmic framework are usually thinned out (Butler, 2006). For some EDM styles, this involves a sudden removal of the bass and bass drum (Gadir, 2014). However, this is genre-dependent, and a breakdown can be carried out in many ways by different DJs and artists (Butler, 2006; Snoman, 2019). Buildup is a section in which musical layers are gradually added or reintroduced, and various intensifying production techniques are employed, such as filtering and tweaking (Butler, 2006, 2014; Snoman, 2019). The quantity of dancers’ body motion is often reduced during breakdown and then gradually rebuilt during buildup (Solberg & Jensenius, 2019). It is helpful to label two further sections of the EDM track besides buildup and breakdown: the intro and the outro. 
These sections often have a sparse texture that contains the basic rhythmic pattern of the track, to which the main features of the track are either gradually added (in the intro) or from which they are removed (in the outro) (Butler, 2006). In this way, they can be part of a larger DJ performance and cross-fade, beat-match, and blend with other tracks. While they constitute a separate type of segment, the intro may share properties of the buildup in the same way the outro may share properties of a breakdown.

Figure 2. A waveform with a coarse analysis of Icarus 1 by Madeon (2012), genre: electro house.
The sixth (highly) relevant concept is the drop, defined as the (more or less punctual) transition from a lower-intensity to a higher-intensity section. Formally, the drop can be classified as part of the subsequent higher-intensity segment, which is often the core segment; functionally, it serves as a musical peak point in the track, perceived as a destination point by many dancers. DJs may use the term dropping the beat to pinpoint this “process of bringing in the bass drum after a removal or a breakdown” (Butler, 2006, p. 326). There are ongoing discussions, both within the EDM scholarship community and among producers and audiences, about whether the drop constitutes a formal section equivalent to the breakdown and buildup or a peak moment in the track (Barna, 2020; Osborn, 2019; Smith, 2021). This is particularly pertinent because the term drop has subtly different meanings in contemporary popular music and in EDM genres, where it originated.
Figure 2 illustrates a first drop at approximately 69 s and a second drop at approximately 149 s. The drop is often the transition from a buildup section to a core section (as in Figure 2), although this is partly genre-dependent. In other cases, a drop may be preceded by a breakdown without a buildup (Gadir, 2014; Marchiano & Martínez, 2018). 2 The drop usually occurs at “the reintroduction of the bass and bass drum” (Solberg, 2014, p. 65), often included in its definition. Additionally, there are often additions or changes to the musical material in the core section that follow the drop to further intensify the reintroduction. However, there may be exceptions: Gadir’s (2014) analysis of Trümmerfeld (Oliver Huntemann Remix) by Extrawelt (2009) assumes a “two-stage exit from the breakdown” (p. 66) in the shape of two subsequent drops. 3 While the first drop reintroduces the bass and bass drum, it does not lead into a core section (as the main groove is not yet restored) but into a buildup section, culminating in a second drop that reintroduces the core section. Since the bass and bass drum are already present in the buildup before the second drop, their reintroduction cannot be part of the drop. Also, the number of drops in a track can vary based on musical style and the preferences of the producer/DJ.
It is worth noting that Solberg’s (2014) and subsequent work refer to a tripartition of EDM sequences into breakdown, buildup, and drop. This is motivated by a measurable change in intensity, which is connected to these three types of sequence and directly linked to the response on the dance floor. From a syntactic perspective, this notation—while intuitive and functionally motivated—glosses over the observation that the drop is a punctual transition rather than a section of a track. For example, in Figure 2, the initial buildup would be approximately 31 s, the first core segment 45 s, and the first breakdown 19 s; by contrast, drop 1 would be included in the first core segment and last only for the first second, or the first few seconds, at the beginning of this core segment. This motivates our present usage of the label core for the main groove parts, which has precedent in Solberg (2014); we will thus use breakdown, buildup, and core as the relevant terms. 4
In addition to illustrating central concepts, Figure 2 shows the standard template of EDM tracks (Butler, 2006; Snoman, 2012), which Solberg (2014, p. 18) considers the “formal scheme” of EDM, and which we call a two-peak or two-drop structure. After a relatively long intro/buildup section, the first drop (at 1:09) introduces the first core section, which runs for about 45 s (from 1:09 until 1:54). This is followed by the break routine, which Solberg and Dibben (2019) and Solberg and Jensenius (2017, 2019) define as a sequence of breakdown (at 1:54 in Figure 2), buildup (2:12–2:29), and drop (at 2:29). 5 The second core section runs for about 54 s from 2:29 to 3:23. The outro, which concludes the track, is short and ends abruptly after 8 s. 6 The second drop of an EDM track is generally more intense than the first (Solberg, 2014, p. 66), which mirrors the structure of the boxing-match comic in Figure 1. We sketch the hierarchical structure in Figure 3, which is derived from the comic structure illustrated in Figure 1 by mapping the visual grammar notion of Initial to the EDM notion of Buildup, as well as mapping Peak to Core, Establisher to Intro, and Release to Outro. Following the notational conventions in the analysis of visual grammar, we capitalize terms such as Core when referring to the analytical structure categories but not in our discussion of the musical segments themselves. Breakdown is the only category in Figure 3 that does not map directly onto Figure 1; we address its status in a later section, where we group Breakdown and Intro together as two EDM counterparts of the Establisher category, thus refining our understanding of this narrative category in EDM.
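The section lengths quoted for Icarus follow from simple arithmetic on the timestamps. The sketch below is only an illustration of that arithmetic: the names SECTIONS, to_seconds, durations, and drop_times are hypothetical, the end of the outro (3:31) is inferred from the statement that it lasts 8 s, and the intro and first buildup are kept as one span because their internal boundary is not given as a timestamp.

```python
# Coarse section boundaries for the original mix of Icarus (m:ss), taken from
# the timestamps quoted in the text. Assumption: the outro end (3:31) is
# inferred from "ends abruptly after 8 s".
SECTIONS = [
    ("intro/buildup", "0:00", "1:09"),
    ("core 1", "1:09", "1:54"),
    ("breakdown", "1:54", "2:12"),
    ("buildup", "2:12", "2:29"),
    ("core 2", "2:29", "3:23"),
    ("outro", "3:23", "3:31"),
]


def to_seconds(stamp: str) -> int:
    """Convert an 'm:ss' timestamp to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def durations(sections):
    """Map each section name to its length in seconds."""
    return {name: to_seconds(end) - to_seconds(start)
            for name, start, end in sections}


def drop_times(sections):
    """Treat each drop as the onset of a core section (in seconds)."""
    return [to_seconds(start)
            for name, start, _ in sections if name.startswith("core")]


if __name__ == "__main__":
    for name, secs in durations(SECTIONS).items():
        print(f"{name}: {secs} s")
    print("drops at", drop_times(SECTIONS), "s")
```

With these coarse boundaries the two core sections come out at 45 s and 54 s and the drops at 69 s and 149 s, matching the figures in the text; the breakdown comes out at 18 s, close to the approximate value given earlier.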

Figure 3. Sketch of the narrative structure of an EDM track: Icarus (Madeon).
There is a non-trivial issue for a structural analysis of EDM involving the initial sections of tracks. Gadir (2014, p. 77) observes that clubbers “would not usually hear the first one to two minutes of [a] track while on the dance floor”; it would generally only be present for the benefit of the DJ whose task is to beat-match the track to the preceding track. For Icarus, shown in Figure 2, the intro and the first buildup (0:00–1:09) may not be part of the track that is actually experienced. While the musical excerpts of Solberg and Dibben (2019) and Solberg and Jensenius (2017, 2019) reflect this generalization, we include the intro, the initial buildup, and thus the first drop in our analysis. This is empirically motivated: although some club settings are such that the intro and initial buildup would be excluded, other performances may include them. Also, even though EDM tracks are central to dance culture, many people listen to EDM tracks in other settings (e.g., at home, on the bus). Butler (2006) highlights the way that unmixed tracks can serve as a starting point for analysis, recognizing that the experience and perception of unmixed tracks and mixed tracks used in a live performance can vary significantly. A case in point is Madeon’s 2013 performance at the Creamfields EDM festival. 7 Here, Madeon transitions from a remix of the track Rhythm Mode (Charlie Darker Remix) by Black Tiger Sex Machine to the intro and first buildup of Icarus, followed by the first drop and core section. In this context, the part of Icarus that precedes the first drop serves a double role in that it can structurally be analyzed as a breakdown equivalent in relation to the last core section of the preceding track.
Significance of the breakdown–buildup–core tripartition
Solberg (2014) approaches the breakdown–buildup–core tripartition from two perspectives: first, the notion of musical expectancy (Huron, 2006; Meyer, 1956) is applied to the cycle of increasing tension (breakdown–buildup) and subsequent release (via the drop) incorporated in EDM tracks; second, the metaphor of gravity (Lakoff & Johnson, 1999; Larson, 2012) is applied to the EDM template, involving a “feeling of being lifted, held in suspense, then dropped down and grounded” (Solberg, 2014, pp. 65–66). When connecting these notions to Cohn’s visual narrative grammar, it is important to note that EDM involves two distinct concepts of peak, as illustrated in Figure 4. There is a tension peak (red/dashed line), which is reached at the end of a buildup right before the drop (although tension peaks may also occur elsewhere; see Solberg, 2014). By contrast, there is an emotional intensity peak (blue/dotted line), which typically starts at or immediately after the drop and continues into the core section (peak-pleasurable experience; Solberg & Dibben, 2019; Solberg & Jensenius, 2017, 2019). The emotional intensity peak is accompanied by greater motion (Solberg & Jensenius, 2017, 2019) and increased skin conductance response (Solberg & Dibben, 2019). In Figure 4, the red/dashed line sketches the development of tension and release, with two tension peaks just before the drop/core sections; the blue/dotted line sketches the development of emotional intensity (based on Solberg, 2014, p. 73), which peaks after the drops, when the tension starts to be released.

Figure 4. A comparison between the development of tension (red/dashed) and emotional intensity (blue/dotted) in Icarus.
To the extent that the structural analysis in Figure 3 treats core segments as analogous to peaks in Cohn’s (2020) analysis of visual narratives, this type of peak would more closely correspond to the development of emotional intensity (which peaks during the core segments), not to the tension–release cycle (which peaks before the drop that introduces a core segment). The structure of the graph in Figure 3, however, is not inherently associated with any particular interpretation or affective content. Its purpose is to capture grouping and headedness information, as well as contrasts inherent in the music, which serves as a basis for analyzing listeners’ behavioral and/or affective responses to the music. While the EDM parallel to the visual narrative in Figure 1 tracks emotional intensity more than tension release, we do not claim that this dimension is somehow primary in EDM or more structurally transparent than the tension–release cycle. The two dimensions correspond to different structural properties. At a first pass, tension (red/dashed line in Figure 4) starts to increase at the start of higher-level units (corresponding to the intro and breakdown segments) and drops at the start of more prominent (peak) segments (i.e., at the start of core segments). By contrast, emotional intensity (blue/dotted line) increases at the start of more prominent units (i.e., at the drop). It decreases at the end of those prominent units (in the transition to the breakdown or outro segments). The structure diagram in Figure 3 encodes general divisions, distinctions, and relations among musical segments that provide (part of) the basis for predicting more specific responses such as emotional intensity, tension, and release.
Toward a narrative grammar of EDM
To establish a method for the structural analysis of EDM tracks, we can adapt diagnostics (tests that allow us to identify syntactic categories) that were posited for visual narratives (Cohn, 2015) after the model of more traditional linguistic diagnostics (e.g., Cheng & Corver, 2013). In this approach, the panels of a visual narrative (as shown in Figure 1) are assigned to macro-categories, which are ordered in terms of their necessity, as summarized in (1), where > represents “more necessary than.”
(1) Peak > {Initial, Release} > {Establisher, Prolongation}
The canonical narrative phrase in visual narratives is a sequence of Establisher–Initial–Prolongation–Peak–Release (Cohn, 2015), that is, any two or more of these categories (including non-adjacent ones) that occur in the order above may form a constituent.
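The ordering condition on the canonical narrative phrase, together with the necessity ranking in (1), can be stated compactly as a check on category sequences. The following is a minimal sketch under simplifying assumptions that are ours, not Cohn's: each category occurs at most once per phrase, and the names CANONICAL_ORDER, NECESSITY_TIER, is_canonical_phrase, and more_necessary are hypothetical.

```python
# Canonical order of visual narrative categories (Cohn, 2015).
CANONICAL_ORDER = ["Establisher", "Initial", "Prolongation", "Peak", "Release"]

# Necessity tiers from (1): a lower number means "more necessary".
NECESSITY_TIER = {
    "Peak": 0,
    "Initial": 1, "Release": 1,
    "Establisher": 2, "Prolongation": 2,
}


def is_canonical_phrase(categories):
    """True if two or more known categories occur in strictly canonical order.

    Non-adjacent categories are allowed (e.g., Establisher followed by Peak).
    Assumption: no category is repeated within a single phrase.
    """
    if len(categories) < 2:
        return False
    try:
        positions = [CANONICAL_ORDER.index(c) for c in categories]
    except ValueError:  # unknown category label
        return False
    return all(a < b for a, b in zip(positions, positions[1:]))


def more_necessary(a, b):
    """True if category a ranks strictly above category b in (1)."""
    return NECESSITY_TIER[a] < NECESSITY_TIER[b]
```

For instance, is_canonical_phrase(["Establisher", "Initial", "Peak"]) holds, as does the non-adjacent pair ["Establisher", "Peak"], whereas ["Peak", "Initial"] fails the ordering condition.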
The Peak is categorized as the “climax or primary information” (Cohn, 2015, p. 2). In our analysis of EDM structure, Peak corresponds to the core segment, captured by the Core category, as illustrated in Table 1. Since the drop generally serves as the initial point of the core segment, we can say that an EDM track peaks at each drop. The Initial in a visual narrative leads up to the Peak, whereas the Release corresponds to the closure that follows the Peak. We can mirror this setup by building our analysis on emotional intensity and identifying the Initial of visual narratives with the Buildup in EDM tracks, whereas the Release of visual narratives is most clearly related to the EDM Outro (Table 1). Apart from the outro segment, there is another section in EDM tracks in which emotional intensity drops: at the beginning of the breakdown. However, while outro segments map onto the Release of visual narratives, the situation is somewhat different for breakdown segments: breakdowns are generally understood to initiate a new part of the track rather than serve as an endpoint (see Snoman, 2019). The notion of a break routine (Solberg & Dibben, 2019; Solberg & Jensenius, 2017, 2019) encompasses the classification of the breakdown as the first, establishing, phase of the subsequent drop. Upon closer inspection, Breakdowns thus resemble Establishers in comics, which set the scene without progressing toward the Peak; they share this property with the EDM Intro, which can also be classified as a plausible counterpart of the Establisher category. Correspondingly, the EDM counterparts of the Establisher category are given as Intro/Breakdown in Table 1.
Table 1. Counterparts of visual narrative categories in EDM (first version).
The analysis illustrated in Table 1 raises the question of how to model the decrease in emotional intensity at the beginning of a breakdown, which suggests a connection to the Release category in visual narrative. Where a core segment ends and a breakdown segment begins, we often observe a brief transition (often as short as 1–3 s) distinct from each of these segments; we call this kind of transition Cut and illustrate it in Figure 5, which represents the whole of the first core segment (1:09–1:55) and breakdown (1:58–2:12) as shown in Figure 4, with the addition of the cut section in 1:55–1:58. The novel term cut refers to cutting off the energy at the end of a core section. In the original mix of Madeon’s Icarus, the cut occurs at 1:55–1:58, whereas the breakdown proper starts at 1:58. Previous analyses, such as Solberg (2014), have treated the cut as part of the core section rather than as a separate category, whereas our Figure 4 treated it as part of the breakdown. It is often unclear to the listener where exactly a cut ends and a breakdown begins. However, it appears to be specifically within the cut segment that the emotional intensity drops, warranting an analysis of Cut as an EDM counterpart of the visual Release category.

Figure 5. The cut (= transition between core and breakdown) in Icarus; the blue/dotted line sketches the development of emotional intensity.
To summarize, we have so far established two EDM counterparts for the Establisher category of visual narratives (Intro and Breakdown) and two for the Release category (Cut and Outro). The idea that intro and breakdown segments are instantiations of the same category (Establisher) is supported by our previous observation that intro sections can often serve as breakdowns with regard to a preceding track while serving as scene-setters for the track that contains them.
Finally, returning to our discussion of (1), Prolongations in comics are disruptions between Initial and Peak (Cohn, 2013b, 2014) that increase the tension before the Peak, thus maximizing the payoff once the Peak occurs. EDM tracks often exhibit some sort of disruption between buildup and drop, for example, a brief moment of silence, a synthetic drumroll (DR) with a crescendo, or a synth-pad swell with or without stereo panning or phasing. Such disruptions appear to announce the imminent arrival of the drop (see Peres, 2016, and Smith, 2021, for discussion; see also Marchiano & Martínez, 2018, who assume a brief breakbeat segment between buildup and drop). There is no established term for this event, which may be classified as a production technique for realizing the drop. Yet, it precedes the perceived drop, the increase in intensity, and the reintroduction of bass and bass drum. We refer to this as the pre-drop break (PDB). Table 2 summarizes the complete mapping of the relevant five visual narrative categories onto the seven EDM categories we have introduced.
Table 2. Counterparts of visual narrative categories in EDM (final version).
The seven EDM categories we have introduced are comparable to the visual narrative categories in (1) in their necessity for the formal scheme of an EDM track. We hypothesize that the analogous hierarchy in (2) holds. Given the central role of the breakdown segment in a break routine, a reader may find it surprising that the Breakdown in (2) is considered obligatory to a lesser extent than the Buildup; however, the outcome of the discussion of Figure 5 is that the Cut, rather than the Breakdown, gives rise to the observed destabilization of the track. Intuitively, removing the breakdown segment (1:58–2:12) of Icarus will still yield the perception that a break routine is occurring, so long as a Core–Cut–Buildup–Core sequence is maintained (where Cut–Buildup–Core constitutes the break routine). This does not imply that the track that would result from removing the breakdown segment of Icarus would be equivalent to the whole track actually experienced by listeners and/or dancers, only that removing the breakdown would not destroy the structure of the track. This observation is also compatible with approaches that collapse breakdown and buildup into a single segment (Collins & Dunn, 2021) labeled breakdown, implying that a split into a breakdown part and a buildup part is not desirable (or even possible); in our model, we would classify such a combined breakdown–buildup segment as a Buildup rather than as a Breakdown. 8 As a consequence of the analytical setup in (2), the Cut need not occur in every EDM track, as it is considered obligatory to a lesser extent than the Core. We expect to find EDM sequences that consist only of a Buildup segment followed by a Core segment, among other combinatorial options that do not contain a Cut.
(2) Core > {Buildup, Cut/Outro} > {Breakdown/Intro, PDB}
We revisit our first analysis of Icarus in light of these categories. Regarding a potential PDB section, each of the two drops is announced by a 2.5 s sequence, featuring a drum roll culminating in five syncopated snare drum hits, a double tresillo figure (Biamonte, 2014) accenting every third 16th note against a duple metrical pattern, accompanied by white-noise uplifters and a bass slide accentuating the first beat of the next bar (Solberg & Dibben, 2019). We classify this sequence as a PDB for present purposes, leaving open precisely what the properties of a PDB are (i.e., whether its definition should include an increase or decrease of intensity before the drop).
We can now compare Cohn’s (2015) diagnostics to the properties of the EDM segments. The first category to identify when providing a structural analysis for an EDM track is the Core; given the necessity hierarchy in (2), an EDM track without a Core is not well-formed. A noteworthy advantage of using Core rather than Drop as our category label is that EDM tracks (especially in genres such as minimal techno) may not include a drop at all if they have no breakdown and employ a more flat and repetitive form (Butler, 2006; Gadir, 2014). However, such a track would still have a core segment that could span the entire track. 9 A central diagnostic for categories in visual narratives is deletion; in the case of EDM, we suggest that a core segment cannot be removed without affecting the listener’s emotional reaction to the track. This is a hypothesis that could be tested in future research; however, it receives initial support from Solberg and Dibben (2019) and Solberg and Jensenius (2017, 2019), who all find that the introduction of a core segment using a drop gives rise to a peak-pleasurable experience. It is conceivable that removing a core segment would entail removing such an experience. Notably, deviations from an expected pattern can also elicit listeners’ and/or dancers’ responses. DJs and producers can employ various production techniques where removing the core segment deliberately triggers responses by deviating from the audience’s expectations.
Deleting a Buildup from an EDM track can make the drop seem to occur suddenly and abruptly, which parallels the effect of deleting Initials in visual narratives (Cohn, 2015). By contrast, if the Outro at the end of a track is removed, or if no Cut marks the transition from a core segment into a break routine, the core section may be perceived to lack a conclusion; this can be compared to the role of a Release in visual narrative, which is needed to bring the tension from the Peak to a conclusion.
Intro sections and PDB events are necessary to a lesser extent and function analogously to Establishers and Prolongations in a visual narrative; the same is true for the Breakdown, as discussed in connection with (2), as distinct from the Cut, which is necessary to a greater extent. The intro section sets the scene for the track, a breakdown segment sets the scene for the subsequent buildup, and the PDB announces the drop, the transition from the buildup into the core section. It is an open question whether any of these three are optional. The PDB is primarily felt to be an integral part of the drop itself, and the drop may be less effective when the PDB is removed. Figure 6 illustrates the PDB section of Spirit of Hardstyle (Extended Mix), 10 by Noisecontrollers (2017), a track in the EDM genre hardstyle, which we return to later in this section. 11 Removal of the 1.54 s PDB section does not change the categories of Buildup and Core. Intuitively, we would argue, it makes the drop that initiates the core section less effective and thus less satisfying.

Figure 6. PDB of the third drop in the Spirit of Hardstyle (Extended Mix) by Noisecontrollers (2017).
One important aspect of the visual narrative structure of comics is that it gives rise to a hierarchical constituent structure similar to that which we find in natural language, as illustrated in Figure 1 and applied to EDM in Figure 3. A major question introduced by Figure 3 is how to determine the constituents of an EDM track. Translating the canonical narrative phrase of visual narratives (Establisher–Initial–Prolongation–Peak–Release; Cohn, 2015) into EDM terminology, we need to factor in the observation that Establisher and Release each have two EDM analogs. We thus posit the canonical EDM sequence in (3), indicating the visual narrative counterparts in subscripts.
(3) Breakdown/Intro[Establisher]-Buildup[Initial]-PDB[Prolongation]-Core[Peak]-Cut/Outro[Release]
To demonstrate the application of structural diagnostics to EDM tracks, we focus on the properties of the Cut, which, as the transition between a core segment and a subsequent breakdown segment, may in principle form a constituent with either. This entails at least two possible constituent structures for Icarus, as shown in Figure 7. The two structures differ according to whether the Cut is counted as part of the first constituent (c[onstituent]-analysis 1, in red above the categories), or as part of the second constituent (c-analysis 2, in blue below the categories). Our treatment of the cut segment as a Release entails that only c-analysis 1 can be correct, but we can provide independent empirical evidence in favor of such an analysis.
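The claim that treating the Cut as a Release forces c-analysis 1 can be illustrated with the canonical sequence in (3). The sketch below is a hypothetical greedy procedure, not a method proposed in the literature: it assigns each EDM category its slot in (3) and opens a new constituent whenever the slot index fails to increase. The flat label sequence ICARUS is assembled from the segments discussed in the text, and the names SLOT and split_into_constituents are assumptions made for illustration.

```python
# Slot of each EDM category in the canonical sequence (3);
# comments give the visual narrative counterpart.
SLOT = {
    "Intro": 0, "Breakdown": 0,  # Establisher
    "Buildup": 1,                # Initial
    "PDB": 2,                    # Prolongation
    "Core": 3,                   # Peak
    "Cut": 4, "Outro": 4,        # Release
}


def split_into_constituents(labels):
    """Greedy split: start a new constituent when the slot index does not rise."""
    constituents = []
    for label in labels:
        if constituents and SLOT[label] > SLOT[constituents[-1][-1]]:
            constituents[-1].append(label)
        else:
            constituents.append([label])
    return constituents


# Flat sequence for Icarus (assumption: one label per segment, PDBs included).
ICARUS = ["Intro", "Buildup", "PDB", "Core", "Cut",
          "Breakdown", "Buildup", "PDB", "Core", "Outro"]

if __name__ == "__main__":
    for constituent in split_into_constituents(ICARUS):
        print("-".join(constituent))
```

Because Cut occupies the final (Release) slot, it can only close the constituent that contains the preceding Core; the following Breakdown, an Establisher counterpart, opens the next one. This reproduces the grouping of c-analysis 1, but it does so by stipulation, which is why independent evidence is still needed.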

Figure 7. Icarus: possible constituent structures.
Again, we can employ a deletion test to adjudicate between two structural analyses (Cohn, 2015). If we delete an entire constituent, the resulting track should still sound complete; if we delete material in a way that cuts across constituent boundaries, the resulting track will sound incomplete. The deletion test favors c-analysis 1. To see this, let us delete all of the material up to (but not including) the Cut segment, as given in Figure 8. 12 The resulting track, which would amount to the second constituent in c-analysis 2, sounds incomplete because it seems to be missing something before the initial 3 s. By contrast, if we also delete the cut segment at 1:55–1:58 and start the track with the breakdown segment at 1:58, amounting to the second constituent in c-analysis 1, the track sounds complete. This suggests that there is a constituent boundary between the Cut (1:55–1:58) and the Breakdown (around 1:58). 13
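The logic of this deletion test can be spelled out as a comparison between what each analysis predicts and what is reported. The sketch below is illustrative only: it encodes the assumption that a remainder sounds complete exactly when it begins at the onset of a constituent, and it takes the two completeness judgments from the discussion above as given rather than as measured data. All names are hypothetical.

```python
CUT_START = 115        # 1:55 in seconds
BREAKDOWN_START = 118  # 1:58 in seconds

# Onset of the second constituent under each candidate analysis.
SECOND_CONSTITUENT_ONSET = {
    "c-analysis 1": BREAKDOWN_START,  # the Cut closes the first constituent
    "c-analysis 2": CUT_START,        # the Cut opens the second constituent
}

# Judgments reported in the text for a track that begins at each point.
REPORTED_COMPLETE = {CUT_START: False, BREAKDOWN_START: True}


def predicts_complete(analysis, start):
    """Assumption: a remainder sounds complete iff it starts at a constituent onset."""
    return start == SECOND_CONSTITUENT_ONSET[analysis]


def consistent_analyses():
    """Return the analyses whose predictions match every reported judgment."""
    return [
        name for name in SECOND_CONSTITUENT_ONSET
        if all(predicts_complete(name, start) == judged
               for start, judged in REPORTED_COMPLETE.items())
    ]


if __name__ == "__main__":
    print(consistent_analyses())
```

Under these assumptions only c-analysis 1 is consistent with both judgments, since c-analysis 2 wrongly predicts that a track starting at the Cut should sound complete.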

Figure 8. Deletion test applied to Icarus.
An important remaining question regarding constituents in a visual narrative (and, by extension, in EDM) is whether they form higher-level groups and what the labels of the relevant higher-level groups should be. In a track with two constituents, as shown in Figure 7, the higher-level grouping is relatively simple because the group arises from combining the two constituents (see Cohn, 2015). 14 The main question is whether the two constituents have a higher-level function with regard to each other. In the present case, there are two possible analyses: the first constituent could be the higher-level Buildup category, with the second constituent its higher-level Core (= f[unction]-analysis 1, Figure 9); alternatively, the first constituent could be the higher-level Core, with the second constituent its higher-level Outro (= f-analysis 2, Figure 10). General knowledge about the formal scheme of EDM tracks militates in favor of f-analysis 1: if an EDM track has two drops, the second drop is typically more intense than the first (Butler, 2006; Snoman, 2012). This also has the direct consequence that a track that results from deleting the second (= Core) constituent of Icarus may be less satisfying for a listener than a track that results from deleting the first (= Buildup) constituent (in line with the observation that this is precisely what DJs do when they omit the first one or two minutes of a track). We thus propose the analysis of Icarus as shown in Figure 9, marking the Core category that projects to the higher level of the grouping structure in bold/blue type. The (rejected) alternative analysis is shown in Figure 10.

The complete analysis of Icarus (f-analysis 1).

The rejected (alternative) analysis of Icarus (f-analysis 2).
The reader may wonder why the first constituent in Figure 9 is classified as a higher-level Buildup[Initial] rather than a higher-level Intro[Establisher]. As discussed in the section headed Definitions of syntactically relevant concepts, the intro and initial buildup are frequently omitted in live performances, and this optionality is similar to that of the Establisher category (compare Cohn’s, 2013a, discussion of vamps). 15 However, the core and cut segments of the first constituent are not optional, indicating that the first constituent is more closely aligned with the Initial category, as it cannot be omitted in its entirety.
An extended and more complex structure can be seen in Spirit of Hardstyle (Extended Mix), introduced in Figure 6. This track lends itself to structural analysis in that it contains four drops (and thus four core sections), and it has been discussed in community resources, 16 which can be used to gauge intuitions about the structure of the track. The community resources label only the second and third drops as climax (an alternative term occasionally used for the drop in EDM), whereas the first drop is classified as mid-intro and the fourth drop is classified as mid-outro. Each of the four events is classified as a drop segment based on our definitions. However, the qualitative distinction that is made in community resources already gives us some indication of the higher-level structure of this track: the mid-intro (drop 1) is part of the higher-level Buildup category, whereas the mid-outro (drop 4) is part of the higher-level Outro category.
Based on the tests discussed above (in particular, completeness after deletion), we infer the structure in Figure 11. Note that there are four constituents, each incorporating the canonical EDM sequence as illustrated in (3). Figure 11 omits the PDBs for reasons of legibility, but each of the four drops is preceded by a very pronounced PDB, where the music fades out, and a speech sample pronounces the genre name “hardstyle.”

Analysis of Spirit of Hardstyle (Extended Mix); the four PDBs at the end of each buildup segment are omitted for reasons of legibility.
Figure 11 adopts the grouping of the constituents into [1-2]-3-4 (where the first two drops are part of the Buildup/Initial) rather than into 1-[2-3]-4 (where the second and third drop are part of the Core/Peak of the track). This is motivated by the general formal schema of EDM tracks, also found in Figure 9: when there are two drops (counting the second and third drops of Spirit of Hardstyle [Extended Mix]), the second one is typically the more intense one. Applied to Figure 11, the relevant intuition is that the track would seem more incomplete and less satisfying to a listener if it were to end after the second drop/core segment (before the second cut section around 236 s) than if it were to end after the third drop/core segment (before the third cut section around 317 s). The authors share this intuition.
A noteworthy aspect of Spirit of Hardstyle (Extended Mix) is that the music in the third breakdown segment (5:19–5:32) is nearly identical to the first 25 s (2:14–2:39) of the second buildup. This highlights one important feature of a narrative analysis of EDM: narrative categories are not defined by their musical or aural content but by their role in the track.
Both Icarus and Spirit of Hardstyle (Extended Mix) may seem to have simple structures. However, this glosses over some details of the major segments: buildups can contain internal variation, and initial surveys of a range of other tracks indicate that buildups may be the most internally complex parts of a track. For instance, the second buildup of Spirit of Hardstyle (Extended Mix) (time-stamp 134–198 in Figure 11) 17 consists of at least six subsections followed by the PDB, as illustrated in Figure 12. In each of the five phases (phase 1, phase 2, etc.), more layers are added to the track; the buildup then culminates in a DR (drum roll) that instantiates the snare-roll technique, which is followed by the PDB that leads into the drop and core section. Each of phases 1 to 4 is eight bars long; the last part of the buildup, which encompasses phase 5, DR, and PDB, is also eight bars long. 18 We segmented phase 5 and DR as separate sections to highlight the qualitative difference, but nothing hinges on this.

The second buildup of Spirit of Hardstyle (Extended Mix) (at 2:14–3:18 in the track).
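As a rough consistency check on this segmentation, the time stamps and bar counts given above can be combined into an implied tempo. The sketch below assumes a 4/4 meter and a constant tempo, neither of which is stated in the text, and it treats the time stamps as exact although they are rounded to the second. The function names are hypothetical.

```python
BUILDUP_START_S = 134  # 2:14, from the text
BUILDUP_END_S = 198    # 3:18, from the text
UNITS = 5              # phases 1-4, plus the final unit (phase 5, DR, PDB)
BARS_PER_UNIT = 8      # from the text
BEATS_PER_BAR = 4      # assumption: 4/4 meter


def implied_tempo_bpm(start_s, end_s, units, bars_per_unit, beats_per_bar):
    """Tempo implied by fitting the given number of bars into the span."""
    beats = units * bars_per_unit * beats_per_bar
    return beats * 60 / (end_s - start_s)


def unit_boundaries(start_s, end_s, units):
    """Start and end times (in seconds) of equally long units in the span."""
    step = (end_s - start_s) / units
    return [start_s + i * step for i in range(units + 1)]


tempo = implied_tempo_bpm(
    BUILDUP_START_S, BUILDUP_END_S, UNITS, BARS_PER_UNIT, BEATS_PER_BAR
)
print(round(tempo))  # 150
print([round(t, 1) for t in unit_boundaries(BUILDUP_START_S, BUILDUP_END_S, UNITS)])
```

Under these assumptions, the 64-s buildup corresponds to 40 bars at about 150 beats per minute, so that each eight-bar unit lasts roughly 12.8 s.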
Conclusion and outlook
This article proposed a syntactic analysis for EDM, which follows cognitive principles similar to those that have been suggested for the visual grammar of comics. Specifically, EDM tracks follow a narrative progression centered around a Peak event, which we have identified with the core segments (constituting the structural Core category) during which the main groove of the track is present. To our knowledge, a connection between visual narrative structures and EDM structure has not been proposed in the research literature on EDM or visual narratives; our proposal thus opens a new perspective on the structural analysis of EDM, the role of the narrative tension–release cycle in its analysis, and the way EDM is governed by general mechanisms of human perception and cognition. This initial venture into the syntax of EDM opens several lines of inquiry for future research.
First, the question of EDM semantics needs to be addressed. As observed by Garcia (2011), a communicative exchange happens between a DJ and dancers in the club context. An essential property of the drops in an EDM performance is their role in yielding synchronization, through interpersonal entrainment, among the dancers. This suggests that EDM may communicate a directive/imperative semantics.
Second, an avenue to be explored in future research is whether similar syntactic and semantic phenomena can be found in other styles of dance music. Solberg (2014) notes that the structural properties of EDM can be connected to the “chills” observed in connection with classical art music (Guhn et al., 2007). Antović (2022) suggests that a Peak in the sense of Cohn’s (2013a) visual narrative grammar can be found in The Show Must Go On by Queen (1991). Patrick et al. (2023) report experimental work on potential parallels between visual and musical narratives, albeit at the level of chord progression in 12-s-long musical sequences rather than pieces of music that span several minutes.
A third open question concerns the computational properties of visual narratives and EDM tracks, and their place in the hierarchy of formal languages (Jäger & Rogers, 2012). Computability is a mathematically precise way of classifying the difficulty of a set of sequences and the grammar that can generate them. As such, it can provide insight into human cognitive abilities. The computability of patterns (linguistic, visual, or musical) can reveal the minimum power required of human cognition to process, generate, and learn them. It is possible that different modules, even within the same modality, require different levels of computability. For example, in natural language, syntax is known to be at least context-free and, at most, mildly context-sensitive (Shieber, 1985), while phonology is regular (Kaplan & Kay, 1994). The main difference between regular and context-free languages is that context-free languages can have arbitrarily deep center-embedding structures (e.g., a language with center-embedding relative clauses). Our discussion of visual narratives and EDM tracks relates to natural language syntax in that they both concern hierarchical structures that potentially map onto a semantic interpretation. Therefore, it would be interesting to see if visual and EDM narratives also exhibit context-free patterns. In our current analysis, both visual narratives and EDM patterns fit into the class of regular languages, but it is unclear whether they also exhibit patterns that require context-free power. To be classified as properly context-free, it should be possible for an arc α with the hierarchical template in (3) to embed another self-contained arc β, which can further contain a third arc γ, and so on. Because of physical and time constraints, it is impossible to find examples of such infinitely nested structures. Thus, we cannot determine the computability of visual narratives or EDM tracks by studying existing examples.
Instead, to test whether these arcs can have context-free structures, there needs to be a way to tap into readers’ and listeners’ intuitions about possible visual narratives and EDM tracks, similar to the way in which linguists rely on language users’ judgments. A sample embedding of an arc β within an arc α is illustrated in (4a–b), showing that the Breakdown of the main α arc (in bold type) is expanded into a subordinate arc β, which serves as a track within a track. The idea would be that the β arc gives the appearance of a different track altogether while reproducing the narrative structure of an EDM track; at the end of the β arc, the α arc is resumed. This setup is recursive in that the Breakdown of the β arc could be expanded into an arc γ, and so forth.
(4) a. α-arc: Introα-Buildupα-Coreα-Cutα-
b. Breakdownα = β-arc: Buildupβ-Coreβ-Cutβ-Breakdownβ-Buildupβ-Coreβ-Outroβ
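The recursive expansion in (4) can be sketched as a short generating procedure. This is an illustration under stated assumptions rather than a proposed grammar: the embedded arc follows (4b), whereas the material that resumes the α arc after the embedded β arc (Buildup, Core, Outro) is an assumption modeled on the shape of the β arc. The function name `arc` and its parameters are hypothetical.

```python
def arc(depth, level=0):
    """Return the segment labels of an arc that contains `depth`
    further arcs nested inside its Breakdown position."""
    letters = "αβγδε"
    tag = letters[level] if level < len(letters) else f"[{level}]"

    opening = ["Buildup", "Core", "Cut"]
    if level == 0:
        # Only the outermost arc starts with an Intro, as in (4a).
        opening = ["Intro"] + opening

    if depth == 0:
        # No further embedding: the Breakdown is a plain segment.
        middle = [f"Breakdown{tag}"]
    else:
        # The Breakdown is expanded into a subordinate arc, as in (4b).
        middle = arc(depth - 1, level + 1)

    # Assumption: the arc resumes with Buildup-Core-Outro after the
    # embedded material, by analogy with the end of the arc in (4b).
    closing = ["Buildup", "Core", "Outro"]

    return (
        [f"{label}{tag}" for label in opening]
        + middle
        + [f"{label}{tag}" for label in closing]
    )


print("-".join(arc(1)))
```

Each level of embedding contributes material both before and after the arc it contains, so the α material surrounds the β material, which would in turn surround any γ material. This is the center-embedding configuration at issue: a flat, regular template suffices for any fixed depth, whereas unbounded depth would call for context-free power.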
If visual and musical arcs both turned out to be as complex as natural language syntax, it would strengthen the idea that human cognition is domain-general, relying on the same machinery to generate outputs in different domains.
Footnotes
Acknowledgements
For insightful discussion, the authors thank audiences at the 15th International Symposium of Cognition, Logic, and Language (University of Latvia, Riga, 26–27 November 2021), the Crete Summer School of Linguistics, and the LINGUAE seminar. The authors are grateful for the helpful and constructive feedback from two anonymous reviewers.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was partially supported by funding from the Faculty of Humanities career development grant at the University of Oslo [PI: Patel-Grosz], EU Horizon 2020 Marie Skłodowska-Curie R&I program, under grant agreement no 945408 [recipient: Patel-Grosz], RFIEA + LABEX, French national grant, ANR-11-LABX-0027-01 [recipient: Patel-Grosz], and by the Research Council of Norway through its Centres of Excellence scheme, project number 262762 [PI: Jensenius].
