Abstract
Background
This article focuses on communication in team-based esports, particularly in the ways that
Aim
We provide an empirically-based theorization of why callouts appear to be especially important in team-based FPS games, which, because of the limited fields of vision and split-second decision-making, require players to communicate what is happening to the others in the team as they navigate the game environment.
Methods
To describe this distributed perception, we borrow from studies on active military settings that term this seeing together as
Results
Through this paper, we contribute to the ongoing research on understanding
Conclusion
The introduction of this concept to game studies can help in making sense of a key capability in networked team-based games; that is, how players collectively construct a situational awareness that encompasses teammates’ perception. Also, because of the essential role of callouts and interperceptivity in highly-skilled networked play, we point to some of the cultural contexts in which this practice is accomplished.
Introduction
One and a half minutes into a round of COUNTER-STRIKE: GLOBAL OFFENSIVE (CS:GO), a player on the counterterrorist side makes his way through a series of cramped rooms and hallways in Mirage, pausing in a covered walkway with his gun’s crosshair trained on the entrance to an adjacent tunnel to his left. Seconds pass before the last opponent emerges from the tunnel. The player, dubbed Purple, quickly dispatches his opponent, ending the round.
What made this interaction possible? It is utterly unremarkable: one player kill, effective if workmanlike, in one of the millions of team-based First-Person Shooter (FPS) matches played every day in games like HALO, APEX LEGENDS, VALORANT, OVERWATCH, and in the case discussed in this article, CS:GO. Watching without sound, we might attribute Purple’s kill to luck: being in the right place at the right time with his crosshair coincidentally trained on the space where the last opponent emerged. There’s certainly a lot of luck involved in CS:GO – but luck has very little to do with this interaction. Playing the clip with sound on reveals a vital component in Purple’s successful prediction of where and when the last player would be vulnerable to a well-executed burst of rifle fire: his teammates knew where the last opponent was and which direction he was headed. They communicated this through callouts, which are short, repeated utterances, almost gunfire-like in their staccato bursts. Callouts are colloquial words that, in the gaming community and context (Kiourti, 2022), refer to in-game locations (see Figure 1 for the callouts relevant to this paper; Total CS:GO, 2023), and they are used to alert teammates to the presence of opponents at that location (Rusk & Ståhl, 2022). Hence, Purple’s ability to know where and when the last opponent would emerge from the tunnel was a social, collaborative, and communicative accomplishment enabled by the vision of his teammates, translated into verbal communication.
Figure 1. Community-shared callouts for the map Mirage in CS:GO (Total CS:GO, 2023, retrieved from https://totalcsgo.com/callouts).
This article builds on the authors’ shared foundations in documenting the constitutive role that communication plays in team-based esports, particularly in the ways that callouts enable players in team-based FPSs to link their own awareness of in-game actions to that of their teammates. The first-person perspective necessitates communication to know what is happening to the others in the team as they navigate game environments defined by limited fields of vision (Duell, 2014; Halloran, 2011; Manninen, 2001; Rusk & Ståhl, 2022; Tang et al., 2012; Taylor, 2012). However, the centrality of callouts in communication also touches upon gaming’s problematic patterns of marginalization (Cote, 2020; Fickle, 2019; Ståhl & Rusk, 2022; Witkowski, 2013). In the case of competitive gaming with random co-players, it is not a stretch to state that the communication we analyze in this article is mostly limited to young, cis-gendered men through diverse forms of gatekeeping (Elam & Taylor, 2019; Gray, 2012). For fear of harassment, racial and gendered minorities often refuse to participate in game-based voice communication. The fact that verbal callouts are an expected component of networked play, and that voice is so politicized as a marker of social location, means that those whose voices mark them as ‘other’ are at a disadvantage. In this article, we recognize this issue; however, the data we rely on when analyzing the phenomenon of interperceptivity come from a complete, all-male five-stack team communicating through a private voice channel.
We offer an empirically-driven theorization of how callouts function in networked competitive play, particularly their capacity to enable teammates to predict the location and actions of their opponents. Borrowing from studies of communicative actions in active military settings, we describe how callouts constitute a form of distributed perception, or “interperceptivity” (v. Wedelstaedt, 2020, p. 120). We build upon von Wedelstaedt’s consideration of how helicopter gunship crews on scouting missions during the US invasion of Iraq produce ‘shootable’ targets through a routinized, distributed, and technologically mediated series of communicative actions. The account shows how helicopter gunners verbally process live video footage of potential enemy combatants on the ground, effectively transforming footage into “commensurable descriptions for further courses of action” (v. Wedelstaedt, 2020, p. 113) – in this case, the gunners confirm that the people recorded on the helicopter gunships’ roving surveillant gaze are armed enemy combatants and thus, military targets. In its granular and technical analysis of video footage of ‘live’ combat operations leading to lethal violence (which is not depicted in the clip itself), von Wedelstaedt’s analysis is both startling and illuminating. Through the careful application of video analytic techniques normally used for less high-stakes and ethically problematic workplace communications, von Wedelstaedt demonstrates how the gunship crew’s activities reflect a “distributed, technologized, and co-operatively managed epistemic practice” (v. Wedelstaedt, 2020, p. 114) in which descriptions of potential enemy combatants’ activities are deliberately emptied out of dramatic or emotional resonances, in order to produce a socially shared, institutionally actionable and empirically verifiable (though certainly not objective) picture of events on the ground as they unfold: what von Wedelstaedt terms “interperceptivity” (2020, p. 120).
Like von Wedelstaedt, we are concerned here with understanding the socially shared “joint epistemic practice” (v. Wedelstaedt, 2020, p. 114) of verbally processing live video data of a kinetic field of combat. The differences between the two contexts are obvious, and ought not be understated: one is a military operation involving asymmetries of both surveillance and weaponry, in which the aim is to determine whether those on the ground are friend or foe, with lethal outcomes; the other is a video game in which two equally matched teams strive to see the enemy first, with no discernible stakes other than an incremental shift in player and team ranking. At the same time, it is no mere coincidence that a theory used to understand communicative strategies during a military airstrike can be so handily applied to networked competitive gaming (see, e.g., Pötzsch & Hammond, 2016). After all, both are technologically mediated contexts oriented towards the violent elimination of hostile opposition, in which techniques for extending vision – of seeing before, or without, being seen – are paramount. Both contexts are also, often, stressful in that they involve carrying out precise operations under pressure from opponents/enemies and conducting communication in ‘noisy’ conditions (loud, laggy, technically failing equipment, see, e.g., Candy, 2012). These similarities hold true regardless of the representational content of the game; while we focus here on CS:GO, a game deliberately themed after contemporary theatres of military combat, our expectation is that we would find map callouts carrying out a similar role in the fantasy/sci-fi-themed theatrics of team-based FPSs like VALORANT, OVERWATCH or APEX LEGENDS. Furthermore, von Wedelstaedt himself notes how the term is adapted from the concept of “intercorporeality”, particularly as used in studies of team sports (v. Wedelstaedt, 2020, p. 120; Witkowski, 2018). 
That our theorization of communication during networked play should engage a triadic relationship between competitive gaming, sport, and military is further evidence of the shared techniques of extended vision among these three areas, not to mention the much more tangible circulation of capital, expertise, and media technologies within these overlapping industries (Colás, 2017; Crogan, 2011; Elam & Taylor, 2019; Taylor, 2020) and their status as masculinized domains in which weaponized vision is linked to fantasies of domination (Taylor, 2021). We return to this relationship in our discussion.
Perception and Cognition in Studies of Play
The development of military media formed a key part of the ideological, technological, and institutional context for early video games, and militarized modes of perception and cognition feature strongly across multiple game genres regardless of their overt representational and aesthetic themes (Crogan, 2011; Derian, 2009; Elam, 2018; Phillips, 2018; Pötzsch & Hammond, 2016). Simply put, ‘eliminating threats’ remains one of the core mechanics of contemporary games, and detecting threats before they detect you frequently constitutes an imperative. This is particularly the case in networked competitive games. And yet, while perception has been both studied and enacted as a networked capacity in military contexts, perception is overwhelmingly regarded in studies of gaming as a singular and individualized capacity. Much of this has to do with the fact that theorizations of perception in gameplay, particularly (but certainly not limited to) those working within interpretivist and phenomenological traditions, still predominantly take the individual player’s/researcher’s engagement with the game as the epistemological starting point, even when pushing for posthumanist perspectives (Ash, 2013; Giddings, 2017; Keogh, 2018; Shinkle, 2008). As a result, we have wonderful accounts of how perception and attention are distributed across human and non-human actors (including interfaces, controllers, and peripherals) but comparatively little to say about how they are distributed from one player to the next in collaborative contexts.
In a different vein, the educational interest in massively multiplayer online games in the mid-2000s produced robust understandings of how cognition is socially (and technologically) distributed across in-game artifacts and interfaces, paratextual resources, verbal, gestural, and typed interactions between players, and so on. But while theorizations of cognition often either include perception as part of cognitive processes or understand perception to be an unproblematic condition of learning, something that just happens, it is useful to separate them – to treat perception as a necessary precursor for cognition. Teams playing competitive team-based FPS games work, through their interpersonal communication, to (co)construct an epistemic practice that produces interperceptivity: a state of socially distributed perception. But that interperceptivity does not determine a particular course of action. That is, which situationally relevant actions that perception should trigger – how the information provided by callouts should be acted upon – is expected common-ground knowledge that each teammate should just know, and it is part of the communally shared knowledge of how to, normatively, play the game. The interperceptivity is explicitly and demonstrably done and socially accomplished, but what to do with the collective vision is more implicit, since there is seldom time to discuss and negotiate decisions during the fast-paced combat of competitive FPS games (see, e.g., Kiourti, 2019, 2022; Rusk et al., 2023; T. L. Taylor, 2015; Wright et al., 2002).
Perception is an active, agentic process of differentiation which is acquired through and as part of competent navigation through an environment, and it is a socially shared act. It is a precursor to learning which, in this framework, is fundamentally “a process of becoming attuned to certain aspects of the environment in such a way that we gain new affordances, new ways to act and interact with the world” (Linderoth & Bennerstedt, 2007, p. 602). To become attuned to the game environment, to find new ways of handling oneself in the game and make use of the information gained through the accomplished interperceptivity, players rely on a form of expertise that involves locally distributed experience and reflexivity and that is not limited to individuals’ specific skills and knowledge (Arminen & Simonen, 2021; Kirschner & Williams, 2013). This expertise allows the players to understand the implications of in-game events in the same way as others competent in the gameplay, and differently from a lay party. However, this expertise is not a unilateral or monolithic knowledge that is displayed in the same way regardless of the situation, the context, and the players involved. Players can have expertise but still make diverse decisions in similar situations. Instead of expertise being the performance of canonical actions, it can also be reflected through a player’s professional sensitivity to perceive nuances in situations that require actions outside of the canon. Expertise also, usually, involves some kind of moral dimension that connects to players’ authority in the team, both expected and de facto oriented to authority (Arminen & Simonen, 2021).
Interperceptivity is therefore both a social accomplishment, and one that must be continually refreshed over the course of a match, and a precondition for the intricate choreography of highly-skilled FPS play (Rusk et al., 2023; Rusk & Ståhl, 2022; Witkowski, 2012). We propose a complement of, and extension to, Witkowski’s generative account of intercorporeality during CS:GO play, wherein “with each move a player makes, the opponents and indeed teammates are seeing ‘locations’ and reacting to these changing landscapes [...] in which every movement is crucial to the endgame state” (Witkowski, 2012, p. 358). We offer interperceptivity as a constitutive element of players’ distributed sense-making and as a precursor for further coordinated action and, hence, as a way for describing and understanding the choreography that Witkowski (and others) describe. Finally, we offer a consideration of the cultural contexts in which callouts become so central: the masculinized culture of competitive gaming, and the globalized esports industry that incentivizes, and indeed glorifies, the cultivation of competencies that have their origins in the military but are now associated with FPS play. These conditions shape who is encouraged, and permitted, to develop gaming competencies (Rusk & Ståhl, in press; Taylor, 2015; Witkowski, 2013). Contributing to this work on esports and social justice, we consider how capacity to speak without fear of being silenced could be yet another barrier to gender-, race-, and language-based diversity in esports.
Game Context, Methodology and Data
The game context under scrutiny in this article, CS:GO (Valve Corporation & Hidden Path Entertainment, 2012), typically involves matches between two teams of five players each. The first team to win 16 rounds (the maximum time for a round is 115 seconds) wins the match, and matches last approximately 20–45 minutes. Teams start as defenders (counterterrorists) or attackers (terrorists) and then swap roles after 15 rounds. In our data, teams play sabotage, in which they win rounds by detonating/defusing the bomb or eliminating the opposing team in the round. One important point, with regard to seeing together, is that when a player’s avatar dies during an active round, they function as a spectator until the round ends. Spectators can switch between the vantage points of the remaining teammates until the end of the round and still talk to everyone in the team and provide callouts. At the start of the next round, all players are revived (respawned).
In this study, we employ an empirically grounded participant perspective through a descriptive qualitative case study that is informed by ethnographic methods (see Parker-Jenkins, 2018). We analyze the in-game interaction using an applied form of ethnomethodology (EM) (cf. Reeves et al., 2009, 2017) and, in this paper, we focus on an immersed understanding of a phenomenon through a specific case. By studying the interaction and communication conducted by a team playing competitive CS:GO matches, we can offer insight into how interperceptivity, including notions connected to it, and the empirical data are connected. EM studies the methods of participants from a perspective on social life as orderly and as a demonstrable and socially accomplished product of collective human activity (Garfinkel, 1967, p. 31). It provides the tools to analyze the practical methods used by participants. The details of participants’ methods to interpret situations and how they act on those interpretations reveal the taken-for-granted structures that are oriented to as participants arrange their forms of gameplay, which are often “susceptible to be [...] ironized or exoticized in academic work” (Reeves et al., 2009, p. 207). EM research strives to provide insight into the detailed ways in which participants provide accounts to each other and how these accounts are tied to the activities that they are part of (Reeves et al., 2017). Therefore, it includes techniques for documenting social actions and identifying what is characteristic of particular social activities (Sacks, 1995). This approach provides analytical tools for treating different modalities as intertwined and constitutive of the actions performed by the participants. The analysis places social actions in their immediate temporal and sequential contexts and recognizes that the organization of action can involve simultaneous and parallel flows of verbal, embodied, and digital modes (Goodwin, 2013).
Data was collected as part of a project that involved a large dataset from 2017–2018, which has been reported on before (see, e.g., Rusk et al., 2023; Rusk & Ståhl, 2020, 2022; Ståhl & Rusk, 2022). In this paper, we build on the previous works and extend our understanding of how players, collectively, organize their teamplay in networked competitive games. The data used in this study was screen recorded by the participants themselves (17–18 years old, all male), who were, at the time, students at a vocational school in Finland that offered esports as a minor subject. The larger dataset consists of two teams playing competitive CS:GO matches. As part of their esports studies, the participants played a well-known esports game in a team together with classmates to receive credits. The participants are multilingual and speak Swedish (mother tongue), Finnish and English. When playing together, they mostly used Swedish together with some words in English and Finnish. In all matches, all participants were geographically dispersed, and they relied on a Discord voice channel (a well-known VoIP social platform) to be able to communicate freely with each other during the matches.
The broader project (see, e.g., Rusk et al., 2023; Rusk & Ståhl, 2020, 2022; Ståhl & Rusk, 2022) includes almost 9 hours of data gathered from multiple players’ perspectives across 14 matches, played on various maps (each with its own specific community-created collection of callouts). In this paper, we focus on analyzing, in detail, a specific 2v1 situation towards the end of a round played on the map for which we have the most data: Mirage (see Figure 1 for callouts on that map). We refer to players using the color they chose to represent them in-game (Blue, Green, Yellow, Orange, Purple). One team playing as defenders (CT) is in focus in this paper to display the accomplishment of interperceptivity, although the analysis is based on analyses of the entire dataset, as well as the collective knowledge of our previous studies on similar situations. The situation we present and analyze in detail below is not particularly significant; this is the point. The continually updated production of a socially shared perceptual field, distributed across teammates, is, for these players, a taken for granted and unremarkable component of their play. Likewise, through our own shared background in studying game play and teamplay in fast-paced games like CS:GO, we have accumulated a context-sensitive understanding of the game play, similar to the very distinct forms of knowledge that the players possess and employ seamlessly in a taken for granted kind of manner, both individually and collectively (Arminen & Simonen, 2021). Through that level of competence in understanding the game environment and game play, we can analyze how participants were accomplishing interperceptivity.
The transcription builds on the Jefferson (2004) and Mondada (2019) transcription systems. Additionally, we use screenshots of in-game situations and the in-game mini map with colored lines to visualize the general movements of each player on the map. The colors of the lines also correspond with each player's lines in the transcript.
This research has been conducted following the ethical requirements established by the Finnish national board of research ethics. We use color-pseudonyms and the participants were given as much control of the data as possible (Murphy & Dingwall, 2001): (1) participants, parents, and teachers were informed of the study’s aim and what participation entailed; (2) participants volunteered to be part of the study through informed consent; (3) participants handled the screen recordings and decided which matches to send.
Results
The round from which this situation is taken comprises approximately 80 seconds of active play. The team is playing as defenders, and it is the sixth round of the match. At the start of the round, they are leading by 5 points to 0. Before the round starts, the team does not discuss strategy or how they will divide themselves across the map in the upcoming round.
The Run-up to the 2v1 Situation
First off, the team moves into positions to defend the bombsites. They do not explicitly agree on who goes where. Nevertheless, the start of the round seems to be unproblematic, as they divide themselves across the map without anyone objecting (one B, one mid, and three A; see Figure 2).
Figure 2. Moving into position.
As they move into position, they use grenades (smoke, firebomb, flash and frag grenades) to initiate control of the areas. Green also explicitly announces that he does not want to get sniped by a scout (a sniper weapon) at mid; he smokes the window area of the map and waits inside the smoke to listen for enemy presence and callouts from the team. The three who are on bombsite A use grenades and move around actively, making a lot of noise. The one player on bombsite B, Purple, uses a similar strategy and smokes apps. As he moves towards van to throw the next grenade, he gets flashed by an opponent. There have been no calls regarding enemy presence until that point, so we can surmise (as could the team) that the opponents may be moving towards site B as Purple is preparing to secure it.
Purple calls out the flash at site B. However, the teammates do not start moving towards site B, since one flash is not enough evidence of site B being the opponents’ target. Nevertheless, this information is crucial to continue building on. Purple continues running towards van, although he is flashed (that is, he cannot see clearly because his screen has become very bright). He jumps up onto van and gets shot at from apps and, as he runs for cover, calls out that the opponents are attacking site B. This call indicates to the others that the first flash was not just a ruse. They start moving towards site B from their current positions (see Figure 3). Purple takes cover on B, then fires and throws a grenade at positions in apps where he, relying on historical knowledge of the map, knows that the opponents may be. He does so also to signal to the opponents that it is not risk-free to run out on site B while he waits for reinforcements.
Figure 3. Gathering intel and regrouping (1).
Now, the team regroups at site B and attempts to hold it (see Figure 4), so that the opponents will not be able to plant the bomb. Blue stays at mid, to check if the opponents are still trying to flank or move to site A. During this activity there is a more active use of callouts regarding both own and enemy actions.
Figure 4. Holding B.
Purple calls another opponent flash and Green calls that he is flashing apps. Soon after, he calls that one opponent is pushing onto site B and that he can see him at the opening near balcony. Orange aims at the opening from behind and eliminates the opponent. Green announces that he is throwing a firebomb and after that there is a long moment when no opponents are seen. Only a smoke and a flash are thrown onto site B by the opponents and a grenade by Green. After this quiet moment, the team starts to spread out again: Blue holds mid, Orange moves towards mid through short and Yellow moves towards market as Green and Purple stay at site B. As Orange moves towards mid, Blue dies near under (see Figure 5).
Figure 5. Gathering intel and regrouping (2).
Blue calls it out; however, he admits that he does not know if it was from under or mid, since he could not see because of flashes being thrown at his position. Orange aligns to the information and checks under and mid. At the same time, Yellow moves towards window through market while intermittently turning around to check apps. Orange moves closer to under and eliminates the opponents that appear (Figure 6).
Figure 6. Orange eliminates opponents near under.
When Yellow hears Orange shooting at the enemy, he moves more swiftly towards window. Green and Purple are still holding site B. Then Orange gets shot from behind (from top mid) and Blue is unsure, at first, but then he calls top mid. Now Yellow has made it to window and can directly use the information in Blue's call, and he eliminates the opponent at top mid. After Yellow has eliminated the opponent, he stays at window and holds the position, since there may be more enemy presence there. These events culminate in the sequence we analyze more closely, in which Purple and Green – the two remaining teammates – dispatch the last opponent and win the round.
The 2v1: Finding the Last Opponent
In Excerpt 1, Yellow is killed by the last opponent in the round (line 49). Purple and Green are left to find and eliminate the last opponent. As they have done throughout the round, the teammates employ callouts as an epistemic practice to stitch together their separate, unfolding fields of vision, allowing a broader, verbally shared perception of enemy movements which, in turn, allows for a successful (and relatively easy to demonstrate) coordination of gameplay.
Excerpt 1. Finding the last opponent in the round.
In line 50, Green is answering his own question (from before) and calling out that apps seems empty. Yellow overlaps Green’s turn and calls out, twice, where the last opponent (who killed him) was positioned (line 51 and yellow cross and red bomb icon in Figure 7). Armed, as it were, with this refreshed extended vision, Purple (purple line and dot in Figures 7 and 8) and Green (green line and dot in Figures 7 and 8) advance towards positions where the opponent could be moving or from where they can see the opponent. Green asks if it’s the last enemy (line 53). Yellow and Purple confirm (lines 55–56) and Yellow adds to his previous callout that the enemy may now be in connector (line 58). Blue repeats the callouts that have been provided (line 60). During this time, Purple moves to jungle, from where he can see connector, and takes cover (purple line and dot, Figure 8). Green says that he’ll check connector (green line and dot, Figure 8) from B short, but that the opponent may have gone back to bombsite B (line 62). However, Green sees the opponent (red bomb symbol, Figure 8) and speaks in an excited and fast manner when he calls out twice that the opponent is in connector and that the opponent is moving towards bombsite A. Purple is already in an advantageous position, his weapon up and his crosshair trained on the spot where he knows – thanks to the verbally shared perception of enemy movements – the enemy will appear. He waits for the enemy to rush out of connector and eliminates the opponent and the team wins the round (Figure 9).
Figure 7. The search begins.
Figure 8. Elimination of the last opponent.
Figure 9. Purple eliminates the last opponent in the round.


As we have emphasized throughout, this is a mundane moment; but it is also a decisive one, and one where we might clearly grasp the forms of embodied and communicative competencies in play. Normative discourses on FPS games – fueled, in part, by resilient strains of research problematizing their ‘first person’ enactment of militarized combat – might focus on the methodical and workmanlike way Purple lines up a shot and, with reflexes that many of us might look upon jealously, dispatches the last opponent milliseconds after the opponent emerges from connector. But as we have shown, building on the more nuanced and non-sensational work on the communicative practices that constitute FPS play, this final kill is a culmination of a complex sequence of coordinated action. Purple’s advantageous positioning is far from a matter of ‘being in the right place at the right time’. He put himself there in time to orient his field of vision and crosshair towards the exact spot where the opponent would emerge, because his situational awareness of the unfolding coordinated action made it clear that the last opponent would likely be using connector to traverse to bombsite A.
Purple’s actions can be explained using two frames: situational awareness and interperceptivity, of which situational awareness explains the events on an individual level, whereas interperceptivity offers an understanding of the events from a collective perspective. Situational awareness is a term that competitive players use to refer to sets of abilities required for effectively selecting relevant information in an unfolding arena of combat (specifically, but not limited to, the position of teammates and enemies and their health, status, and/or weaponry, the location and accessibility of resources, and the relative urgency of win/loss conditions) and processing this information into a coherent and actionable understanding of available, expected, and preferred, courses of action. Situational awareness is not one singular skill, but is produced through many interlocking competencies, often specific to the mechanics of a specific game. In CS:GO, situational awareness includes a routinized knowledge of the layout of a given map (in this case, Mirage), such that little conscious effort has to be expended to think about how to get from one point to another; an understanding of key chokepoints, those parts on a map which are likely to be more hotly contested, and at which times; and crucially, for our analysis, a handle on a given map’s inventory of callouts. All in-game movements and actions indicate that both Green and Purple align with and orient towards the information that the callouts provide. They are the epistemic practice that glues and converges the teammates’ visualities into a collective perceptual net stretched across the game’s environment; callouts form the (ongoing, continually refreshed) precondition for coordinated play. 
To reiterate, Purple’s advantageous positioning in the jungle is no accident. It results from the entire team recognizing and understanding the implicit, possible, and relevant next actions that are inherent in callouts and that enable a shared sensory field, that is, interperceptivity. But that interperceptivity would not be possible without players having historical, communal knowledge of the map and the game, as well as individual situational awareness.
Discussion
What we have described here through the exhaustive and granular tools of EM is a series of actions that players themselves would regard as mundane and insignificant. A player shouting connector during a match in Mirage is understood, by teammates, as indicating that from their current location, they can see or hear an opponent in or at the passageway. This communication is often provided in stressful and sometimes tumultuous situations where the audible space is contested (Rusk et al., 2023; v.Wedelstaedt, 2020). There is also the risk of the network connection being slow or not working (Candy, 2012). Presuming the message goes through and is heard by teammates, this player has now extended their team’s collective perception in what has elsewhere been termed “verbal screenlooking” (Taylor, 2012). Now, it is the teammates’ responsibility to use the information in one way or another (Rusk & Ståhl, 2022). These next actions are highly contextual: dependent on team strategy, individual player decisions and dispositions, how many teammates and opponents remain alive, which weapons and utilities are at the players’ disposal, the match’s objective, time left in the round, and so on. The key point is that every callout made contributes to a team’s capacity to collectively track the movements and locations of their opponents, constituting a shared, dynamic, and continuously updated perceptual field that exceeds the visual and audible range of any single player. Given its central role in contemporary practices of networked play, it is worth unpacking some of the cultural contexts in which callouts are situated. Below, we consider callouts and the accomplishment of networked perception in the context of the relationship between gaming and the military (with sports forming a third player, though left out here for the sake of brevity). We also ruminate on the politicization of feminine voice in networked gaming that makes it challenging, if not unsafe, for women and non-binary gamers to develop these communicative competencies.
The socially accomplished practice in competitive networked gaming that we have described as interperceptivity is indebted to theorizations of how human and technological modes of perception are woven together in military operations. It is worth considering the longer media genealogy that makes it possible for us to port a theoretical term used to describe such operations (which, itself, was ported from sports studies) over to accounts of networked gameplay. Historical accounts of optical media underscore the central role that visual perception has played in the development of modern military technologies, showing how the resultant techniques of ‘weaponized vision’ became embedded in more domesticated media – including digital games (Crandall, 1999; Kittler, 2010; Virilio, 1989). In media technological terms, detection (perception) is the precursor to processing (cognition). If the central task of militarized information processing is to sort friend from foe, the central task of optical media is to gather this data in the first place and render it available for processing. Packer and Reeves discuss how this sequence of detection and processing was put into practice in the development of the United States’ Semi-Automatic Ground Environment (SAGE) in the mid-twentieth century, a system of radar sites and networked computers that Packer and Reeves see as a catalyst for the present-day proliferation of automated weapon systems: “for SAGE, the first order of business was to locate and track all airborne vehicles. Once detected, the second order of business was to determine whether any given radar blip posed a threat” (Packer & Reeves, 2020, p. 317). The operational logic of SAGE – to see first, and farther – is not only at work in contemporary state-run surveillant technologies (such as networked drones); it is also the premise of many contemporary competitive games (see Crogan, 2011, for more).
The fact that the domains most closely linked in this epistemic practice are live military operations (and, more indirectly, professional sports) poses compelling questions for us as theorists of education, gaming, and media. We are not interested in such explicitly stated educational questions as ‘what are these players learning’ and ‘how might this transfer to other domains’; there is simply too much specificity and specialization in skilled play to consider how players’ highly contextualized communicative competencies might transfer to other game genres or cross over into other domains of experience. Likewise, we are not suggesting that participating in modes of mediated vision that were initially developed for the state’s application of lethal force translates into ideological support for militarism among players. While it may be tempting to find significance in the fact that the CS:GO players in this study live in a nation with compulsory military service for young men, we are more inclined to locate the reasons for the game’s popularity in its status as one of the most long-running, and lucrative, esports in Europe (Witkowski, 2018). Likewise, we find the use of callouts in CS:GO compelling, in part, because callouts are such a core feature of competitive FPS play across multiple titles, from the overtly militarized world of CS:GO to the more fantastical landscapes of VALORANT and APEX LEGENDS. While it is well beyond the scope of this paper to inquire into the communicative practices of those games (or, for that matter, of other CS:GO players), we would expect that the routinized circulation of English-language callouts in these games, and the use of these callouts in professional esports settings, works to impel similar kinds of communicative practices across otherwise different and distinct player communities.
That said, the connections between gaming and militarism are manifold and significant, beginning with the historical and ongoing circulation of technologies, capital, personnel, and technical knowledge between the games industry and the military-industrial complex (Crogan, 2011; Elam, 2018; Pötzsch & Hammond, 2016). Like Phillips in her “textual and technological history” of headshots (2018, p. 136), we are less interested in postulating a direct relationship between proficiency in militaristic games and predisposition towards ‘real life’ violence, than we are in understanding how modes of weaponized vision (and perception) circulate so readily between militarized contexts and the leisure practices of (predominantly) young men. Phillips notes that “there has been no conclusive evidence linking video game violence with aggression in the physical world,” and yet “we face a future in which a growing civilian body considers shooting for the skull a norm, even a joy” (2018, p. 147). Like the headshot, the cultural significance of callouts is bolstered by an esports industry that rewards their competent use—meaning there are powerful incentives for competitive FPS players to hone skills that originate in the military and other arenas of lethal force. While it is beyond the scope of this paper to tease out the implications of this, we can point to another, related concern regarding the cultural context of callouts: the persistent marginalization of women from competitive gaming.
The social accomplishment of interperceptivity using callouts in voice communication is not (necessarily) available to all players, since the flourishing misogyny and racism of the cultural context of gaming silences marginalized players (see, e.g., Rusk & Ståhl, in press; Cote, 2020; Elam & Taylor, 2019; Fickle, 2019; Gray, 2012; Ståhl & Rusk, 2022; Witkowski, 2013), especially in random lobbies. In a domain in which communication is a vital part of competitive players’ skillsets, silencing players based on their (perceived) identity functions as a powerful form of social – and economic – exclusion. It inhibits the accomplishment of interperceptivity within the team. In this article, we recognize this issue; however, the data we rely on when analyzing the phenomenon of interperceptivity come from a complete five-stack team that communicates using a Discord voice channel.
Conclusion
While verbal map callouts are not as consistently used in other competitive gaming communities as in FPSs, and especially CS:GO, all such games put players in situations where they have limited lines of sight and where seeing the opponent before they see you is vital to team success, which means that interperceptivity becomes a key capability. In LEAGUE OF LEGENDS and DOTA 2, for instance, where ‘pinging’ locations on a map often replaces verbal communication, pinging effectively allows for interperceptivity – and knowing when and how to ping becomes as vital a skill in these games as the use of map callouts in CS:GO. Our hypothesis in introducing interperceptivity to games and esports scholars is that it should prove useful in making sense of communication and collaboration in other networked, team-based games.
Supplemental Material
Supplemental Material - The Social Accomplishment of Seeing Together in Networked Team Play
Supplemental Material for The Social Accomplishment of Seeing Together in Networked Team Play by Fredrik Rusk, Nicholas Taylor and Matilda Ståhl in Simulation & Gaming
Footnotes
Acknowledgments
We wish to thank the participants who willingly and openly shared highly personal data with us so that we, as researchers, could better understand what online gaming was to them.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Svenska Kulturfonden, Högskolestiftelsen i Österbotten.
