Abstract
Qualitative researchers often work with texts transcribed from social interactions such as interviews, meetings, and presentations. However, how we make sense of such data to generate promising cues for further analysis is rarely discussed. This article proposes mode-enhanced transcription as a tool for sensitizing social interaction data, defined as a process in which researchers attune their attention to the dynamic interplay of verbal and nonverbal features, expressions, and acts when transcribing and proofreading professional transcripts. Two scenarios for using mode-enhanced transcription are introduced: sensitizing previously collected data and engaging with modes purposefully. Their implications for research focus, data collection, and data analysis are discussed based on a demonstration of the process with a previously collected dataset and an illustrative review of published articles that display mode-enhanced excerpts. The article outlines the benefits and further considerations of using mode-enhanced transcription as a sensitizing tool.
Sensitizing Social Interaction with a Mode-Enhanced Transcribing Process
Qualitative researchers often work with texts transcribed from social interactions, such as interviews, meetings, and presentations. Although significant progress has been made in advancing data analysis (e.g., Clarke et al., 2021; Locke et al., 2022; Mees-Buss et al., 2022; O’Kane et al., 2021; Pratt et al., 2022), little attention has been given to sensitizing such data, that is, generating promising cues for further analysis through “deep engagement with the data” (Jarzabkowski et al., 2021, p. 72; Kohler et al., 2022). In rare cases, researchers have discussed their sensitizing tools, such as using tables to “make sense of their data, even if, ultimately, [these tables] are not included in the final version of a paper” (Cloutier & Ravasi, 2021, p. 118) or visualization by “constantly scribbling, sketching, drawing” intuitive fragments and tentative interpretations to aid reflection on the transcripts (Ravasi, 2017, p. 243). As Corley observed in a Showcase Symposium held at the Academy of Management Annual Meeting in 2016, “you don’t pick up [these sensitizing tools] in a lot of methodology texts and how-to type of articles” (see Gehman et al., 2018, p. 296). Hence, this article proposes mode-enhanced transcription as another sensitizing tool that assists researchers in making sense of their qualitative data for further analysis.
Mode-enhanced transcription as a sensitizing tool refers to a process in which researchers attune their attention to the dynamic interplay of an uninterrupted stretch of speaking with prosodic features (e.g., speed and volume), paralinguistic expressions (e.g., pausing and laughing), and acts (e.g., gazes and body movement) when transcribing and proofreading professional transcripts. By actively engaging in this process, researchers immerse themselves deeply in their data (Kohler et al., 2022), which enables them to contextualize the interaction and become aware of such nuances as emotions, practices, and power dynamics in their data (Pink, 2011; Poland, 1995; Sandberg, 2005). Thus, transcribing and proofreading professional transcripts is not “a mundane, time-consuming chore” (Tilley, 2003, p. 771) but an invaluable tool that enables researchers to attend to nuances in social interaction (Kohler et al., 2022) and generates cues for further analysis.
The article begins with an introduction to transcribing and transcripts in organizational research. I then explain the complexity of transcription and introduce mode-enhanced transcription as a sensitizing tool. Two scenarios are proposed for using the tool. One is to sensitize previously collected data, demonstrated by a personal experience of using mode-enhanced transcription to generate fresh cues from a previously collected dataset about culture at a technology start-up. The other is to engage with modes purposefully, which I exemplify by drawing on eight published articles that suggest such a process in their research design and display more than one mode in their findings. Implications for research focus, data collection, and data analysis are discussed in both scenarios. The article concludes by discussing the benefits and further considerations of using mode-enhanced transcription to sensitize data collected either in physical contexts or via videoconferencing software.
Transcribing and Transcript in Organizational Research
Qualitative researchers often work with texts transcribed from social interactions, such as interviews, meetings, and presentations. Transcribing refers to a process of “turning a strip of ‘naturally’ occurring talk … into writing … to develop insights into the moment-by-moment and in situ construction of social reality and to provide evidence in developing an argument for an academic audience” (Bezemer & Mavers, 2011, p. 191). In organizational research, transcribers are expected to write down “the words heard on a recording” (Hammersley, 2010, pp. 559–560) to produce a neat, plain text containing what is said in social interaction, known as a verbatim transcript. These texts are often produced by professional transcribers. Typically, researchers treat verbatim transcripts as “quarries for potentially quotable and codable content” (Myers & Lampropoulou, 2016, p. 1), and these monomodal texts as their data (Pink, 2011). However, this approach underestimates the “unhidden complexity” of transcription (Hammersley, 2010, p. 554) and misses the opportunity of “understanding and appreciating … fine-grained properties” of social interaction (Pratt et al., 2022, p. 214), which is the data we need to make sense of and analyze.
Unpacking the Unhidden Complexity of Transcription
The “unhidden complexity” of transcription (Hammersley, 2010, p. 554) involves two interrelated issues of textualizing social interaction. The first issue is the nature of real-time social interactions, such as interviews, meetings, and presentations, which are inherently multimodal (cf. LeBaron et al., 2018). Modes are defined as “the culturally and socially produced resources for representation” and include speech, gesture, and facial expression (cf. Pink, 2011, p. 263). In a real-time social interaction, participants are not concerned with the particular words used, but with the understanding of what is being said, along with “other symbolic expressions, and ‘artifacts,’ of thinking, feeling, believing, valuing, and acting” (Gee, 1996, p. 131). They are also less focally aware of their nonverbal features and expressions (Polanyi, 1958), which are all vital modes from which researchers can make meaning in the cultural and social sphere where social interaction occurs (Jefferson, 1996; Kress, 2009). Hence, it is vital to be aware of the constitutive nature of social interaction as a multimodal performance (Sorsa et al., 2014), but this aspect is underrepresented in verbatim transcripts and needs sensitizing.
The second issue is the nature of transcribing as a social practice of construction (Hammersley, 2010). Kress (2005, p. 15) states that “[b]ecause words rely on convention and conventional acceptance, words are always general, and therefore vague. Words being nearly empty of meaning need filling with the hearer/reader's meaning.” When producing texts from the recording of multimodal social interaction (Baralou & Tsoukas, 2015; Myers & Lampropoulou, 2016), the person who does this has to “make significant representational choices [about what and how to textualize], whilst acknowledging that they are constrained by the social context” (Bezemer & Mavers, 2011, p. 194). The choices needed in the reconstruction process (Pink, 2011) range from whether to include silences or time pauses to whether and how to incorporate acts, and all of these choices can be rational (Hammersley, 2010), reflexive (Cunliffe, 2002), and assumption-laden (Flyvbjerg, 2001; Stake, 1995). By engaging with transcription, researchers immerse themselves in these social contexts and become aware of the unsaid, the unusual, and the unexpected, which are not always accessible when working solely with verbatim transcripts but contain interesting cues worth further analysis.
A Mode-Enhanced Transcribing Process as a Sensitizing Tool
This article proposes a mode-enhanced transcribing process as a tool for sensitizing social interaction data. Engaging with mode-enhanced transcription attunes a researcher's attention to the “semiotic resources beyond verbal language” (Jancsary et al., 2016, p. 181), which include (1) prosodic features such as speed, volume, and intonation; (2) paralinguistic expressions including pausing, laughing, and sniffing (Baralou & Tsoukas, 2015; Jefferson, 1996; Myers & Lampropoulou, 2016); and (3) acts such as eye contact, gestures, postures, gazes, body movement, and manipulation of objects in social interaction (Norris, 2004; Wohlwend, 2011). Engaging in the process, either when producing a transcript or proofreading a professionally produced text, enables researchers to attend to such nuances as assumptions and unusual moments and to generate interesting cues for further analysis. The differences between verbatim and mode-enhanced transcription are summarized in Table 1.
Table 1. A Comparison Between Verbatim and Mode-Enhanced Transcription
First, engaging in a mode-enhanced transcribing process allows researchers to (re)live the moments of interaction they have co-created (e.g., interviews) and observed directly (e.g., meetings and presentations) or indirectly (e.g., social interaction recorded by someone else). This can prompt them to be aware of the contexts and the “ongoing, contextualized interpretation by speakers and listeners that shapes the emerging conversational events” (Lapadat & Lindsay, 1999, p. 70). Through this process, researchers can also surface and reflect on the taken-for-granted perceptions, overlooked perspectives, and unusual moments (Hindmarsh & Llewellyn, 2018), which may be valuable to pursue further in data analysis (Davis, 1971; Jonsen et al., 2018).
Second, engaging in a mode-enhanced transcribing process directs a researcher's attention to the material means for representation in the interaction, such as acts, gestures, and body language that participants exhibit as well as the physical materials and tools these participants manipulate in their speech (Baralou & Tsoukas, 2015). Through the process, researchers may notice the patterns of mode interplay between verbal and nonverbal features, expressions, and acts and their relations to the social context as a whole, all of which convey meanings that researchers could make sense of (Sandberg, 2005). These patterns are also invaluable for researchers to sensitize data collected virtually via videoconferencing software, which is challenging to contextualize without a physical context. Finally, presenting some of these nuances could strengthen the trustworthiness of qualitative research (Kohler et al., 2022), which I now discuss.
Mode-Enhanced Transcripts as Artifacts
Engaging in a mode-enhanced transcribing process formally, by adding abbreviated notations of nonverbal features, expressions, and acts to a verbatim transcript, results in a mode-enhanced transcript. This transcript is a text of multimodal ensembles (Kress, 2011) that contains “a socially and culturally shaped set of resources for making meanings, such as speech, gesture or image” (Bezemer & Mavers, 2011, p. 196). Such texts are professional artifacts: the finished products of the transcribing process (Bezemer & Mavers, 2011). Researchers could choose to present these mode-enhanced excerpts to strengthen the trustworthiness of their research and make their interpretations more accessible and comprehensible (Jancsary et al., 2016; Wertsch, 1991). However, it is vital to note that our primary purpose in engaging with mode-enhanced transcription is to interact with our data extensively, rather than to produce a mode-enhanced transcript. The essence is in the process of doing.
Two Scenarios for Sensitizing Social Interaction with the Mode-Enhanced Transcribing Process
Below, I propose two scenarios in which researchers can use mode-enhanced transcription as a sensitizing tool. One is to sensitize previously collected data. In this scenario, researchers attune attention to mode interplay when reviewing data collected previously, either by themselves or someone else, to generate fresh cues for analysis or reanalysis. They may also engage in the process when they feel stuck while analyzing verbatim transcripts. The demonstration below illustrates how I engaged in the transcribing process as a tool for making sense of and generating fresh cues from a dataset that I collected previously. The other scenario is to engage with modes purposefully. In this one, researchers consider mode-enhanced transcription at the outset of research design. They are aware of the epistemological assumptions entailed in transcription (Bezemer & Mavers, 2011) and elucidate some of these aspects in their research process (e.g., Jarzabkowski & Lê, 2017; Pouthier, 2017). I draw on eight articles published in leading journals to identify such practices, and I then discuss implications for research focus, data collection, and data analysis. A summary can be found in Table 2.
Table 2. Two Scenarios for Sensitizing Social Interaction Using a Mode-Enhanced Transcribing Process
Sensitizing Previously Collected Data with the Mode-Enhanced Transcribing Process
Researchers can engage in the mode-enhanced transcribing process to sensitize previously collected interaction data, identifying the nuances that they may have ignored in their monomodal verbatim transcripts (Pink, 2011), and thus generating fresh cues that they may have missed. In these cases, researchers may focus on prosodic features, such as loud emphasis, elongation, quicker utterance, pauses, and their interplay with paralinguistic expressions such as laughter, and distinctive acts such as knocking on a table (Sorsa et al., 2014). The demonstration below shows how this process helps sensitize a previously collected dataset about culture at MAX (a pseudonym for a technology start-up). Implications for research focus, data collection, and data analysis are discussed to guide researchers.
Prolog: Revisiting a Culture Study in a Technology Start-Up
MAX is a technology start-up located in Melbourne, Australia. Back in 2007, it was at the stage of scaling. Concerned with rapid global expansion as an early-stage venture, the founding directors were eager to maintain MAX's culture, which they regarded as crucial for their success. Part of the study was to understand the founders’ perceptions of MAX's culture. Hence, I interviewed Bob and James (both pseudonyms), cofounders and managing directors, separately. Interview questions mostly concerned cultural values, such as “What do people value at MAX?” and “How do you feel about the values MAX has?” I also jotted down notes immediately after these interviews. Here, I focus on the interview with Bob.
The original analysis was based on a verbatim transcript, along with a positivist assumption (Burrell & Morgan, 1979; Cilesiz & Greckhamer, 2022) that data are “produced through objects in the world imprinting their characters upon our sense” (Hammersley, 2010, p. 554; Sandberg, 2005). The study found that the managers’ documentation of values and culture tended to meet their expectation of assisting in the maintenance of corporate culture and its transfer within the global network. As the verbatim quotations in Table 3 suggest, in Bob's view, MAX's culture, such as “having fun value,” emerged at the early stage, “the culture becomes what the business [is],” which was then codified, “grab on that, lock it down,” and enforced in various local offices globally: “the core is going to be the same.”
Table 3. A Comparison Between a Verbatim and a Mode-Enhanced Transcript in MAX's Case
Transcription notations: (()) = a depiction of paralinguistic features or acts confirmed by field notes; >< = a quicker utterance; Co:lon = an extension of the sound or syllable; LOUD = louder talk; ((pause)) = nothing said; Co:::lons = a prolonged stretch; ° = a passage of talk quieter than the surrounding talk.
Several years later, I revisited the MAX dataset, which comprises interview recordings, transcripts, field notes, and some archival documents provided by the venture. First, I proofread my transcript while listening to the interview recording. I soon became intrigued by Bob's linguistic pattern (Gee, 2009). I found that he spoke at a faster pace, in phrases, and in a noticeably stronger Australian accent when describing his past experience as an engineer before cofounding MAX; he slowed down with many pauses, stresses, and elongations when the topic moved to his current experience as a managing director of MAX. I also noticed a repeated thudding sound in the soundtrack, which matched my field note that Bob was playing with a miniature Australian football when interviewed. These materials suggested nuances that I had not noticed previously when focusing on the verbatim transcript. Hence, I decided to try mode-enhanced transcription to explore whether fresh insights could be generated from this dataset.
Mode-Enhanced Transcribing Process as a Sensitizing Tool
Based on Bob's linguistic pattern in the original soundtrack and his acts recorded in the field notes, I focused on some of the most distinctive features, particularly speed, volume, and vocal stress, and added these features to the verbatim transcript with Jefferson's transcription notations (Atkinson & Heritage, 1999). Table 3 illustrates a comparison between a verbatim and a mode-enhanced transcript with notation legends in the footnote. However, it is critical to note that attention should be given to the transcribing process as a means of deep interaction with data rather than to producing a perfect mode-enhanced transcript.
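For researchers who keep their mode-enhanced transcripts in digital form, the notations listed in the Table 3 legend can also be located programmatically as a rough aid to proofreading. The following is a minimal sketch only, not part of the process demonstrated in this article: the regular expressions, the function name `tag_modes`, and the sample line are all assumptions introduced for illustration, and the sample line is invented rather than taken from the MAX dataset.

```python
import re

# Illustrative patterns loosely based on the notation legend for Table 3.
# The regexes are assumptions, not an established standard.
PATTERNS = {
    # ((...)) act or paralinguistic feature, excluding ((pause))
    "act_or_paralinguistic": re.compile(r"\(\((?!pause\)\))[^()]+\)\)"),
    # ((pause)) nothing said
    "pause": re.compile(r"\(\(pause\)\)"),
    # >...< quicker utterance
    "quicker": re.compile(r">[^<>]+<"),
    # one or more colons extend a sound or syllable
    "elongation": re.compile(r"\w+:+\w*"),
    # capitalized words mark louder talk (acronyms would also match)
    "louder": re.compile(r"\b[A-Z]{2,}\b"),
    # degree signs enclose quieter talk
    "quieter": re.compile(r"°[^°]+°"),
}


def tag_modes(line):
    """Return a dict mapping each feature name to the spans found in one line."""
    return {name: pattern.findall(line) for name, pattern in PATTERNS.items()}


# Hypothetical line invented for illustration; NOT from the MAX dataset.
sample = "we had FUN ((bounces ball)) >back then< it was so:: ((pause)) °different°"

if __name__ == "__main__":
    for feature, spans in tag_modes(sample).items():
        print(feature, spans)
```

Such a listing only points to where notations occur. Interpreting the interplay of modes still depends on listening to the recording and consulting the field notes.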
Attuning attention to ignored nuances
When engaging in a mode-enhanced transcribing process, our focal consciousness is naturally drawn to such features in the recording as the elongation, volume, pauses, and other background sounds, such as the ball bouncing in Bob's case. This is when we begin our sensitizing process. Below, I present three examples to show how mode-enhanced transcription oriented my attention to the potential tensions, blurs, and resonances (Wohlwend, 2011) in Bob's talk that had been ignored previously.
The first example is from Bob's response to what corporate culture is. In Excerpt #1 in Table 3, two characteristics immediately attract attention. The first one is the elongation of the “becomes” of culture and the “culture” itself. The elongation tentatively suggests that Bob has a strong view about the emergence of MAX's culture. The second feature is the ball bouncing during Bob's monolog. When Bob bounced his miniature Australian football, he was talking about a hypothetical scenario that he disagreed with. Similarly, in a separate episode, Bob banged the desk when he was describing MAX at the early stage as “young and exciting” and praised an employee who was no longer with the venture as someone who was “really dedicated” and “work[ed] hard.” These patterns suggest that when Bob displayed these nonverbal features, expressions, and acts, he seemed to feel strongly about something positive, such as the emergence of a culture, or negative, such as codifying a culture, which was occurring at the time of the interview.
The second example is Bob's reaction to the question of how to deal with the potential conflict between the Australian culture that MAX intended to promote and the local cultures in the overseas offices. In Excerpt #2, attention is drawn to the loud emphasis on “fun,” the elongation of “because,” and the elongation of “these things,” by which Bob means values. First, Bob emphasized the “having fun value” at MAX. Then he paused, seeking to embed the value in a broader Australian context and thus highlight MAX's Australian origin. Then he sought to explain that such a value could be unique by elongating “because” and pausing to search for a counterexample: Japan. He bounced the football when speaking about his perception of workplace culture in Japan. The act of ball bouncing occurred simultaneously with his emphasis on the office workers in Japan either “at work” or “to work.” Finally, he proposed to bring “some of these things” to other countries. By elongating “these things,” Bob reinforced the values, such as having fun at MAX, and the plan to “bring” these values to other countries. The emphases, elongations, and occasional acts attuned my attention to Bob's dichotomy between the home culture and the host culture, with which he sought to highlight the supremacy of the fun culture at MAX.
The third example is Bob's response to the question about MAX's attempt to “standardize culture.” In Excerpt #3, emphases are given to several dichotomies: different versus same, core versus surface, not to do versus the best way. There are also multiple pauses before elaboration and further explanation. By loudly emphasizing “not” and “same” and elongating “different” and “core,” Bob appeared to be struggling between what to do and what not to do and between what should be different and what should be the same about the work culture across MAX's overseas offices. He seemed to support the current practice of documenting these value statements (e.g., “lock it down”) but signaled, by elongating “becoming aware of it” and murmuring over “as alleged,” that he was not entirely sure whether these values could be enforced. Finally, he wrapped up and reinforced the idea of retaining “the same culture, same core culture” while admitting that it could be different “on the surface.” These emphases and elongations of dichotomized words made me aware of the potential tensions, conflicts, and paradoxes in Bob's perception of culture.
Generating fresh cues for reanalysis
As I attuned my attention to the interplay between various modes when engaging in the mode-enhanced transcribing process, I noticed several areas of potential interest when reanalyzing the data. First, there are tensions and discrepancies between Bob's definition of the home culture that MAX is embedded in and the host cultures that MAX is entering or is about to enter due to its global expansion. Bob was convinced that MAX's culture became what it is because of everybody who lived and worked there at the early stage. He opposed the idea that a company can create a culture by writing down what they would like to become, which was shown in his emphasis on “becomes” and “culture” in the first excerpt. However, when discussing the cultures of other countries, Bob readily objectified these as having or having not. He compared Australian culture with Japanese culture and highlighted having fun as part of Australian culture, and tied this to the success of MAX, as the second excerpt has shown. Hence, one area to consider when reanalyzing the data may be the epistemology of various cultures in the home country versus the host country.
Second, there are tensions and conflicts between building a culture at the early stage and maintaining the culture when the venture grows. Bob's emergence theory about how “culture becomes” soon gave way to the codification argument, which aims to ensure that the culture is enforced in the overseas offices, as the first and third excerpts have shown. This can orient my attention to the founding team's perceptions of corporate culture over time and prompt me to explore the dynamics through which these perceptions remained contradictory or became reconciled.
Third, there are conflicts and blurs in Bob's understanding of culture “on the surface” versus core values. On the one hand, Bob stressed that working culture differed in Australia and Japan and characterized the Japanese as either “at work” or “to work” and “that's it,” as the second excerpt has shown. The remark highlights the superiority of one value in his home culture and indicates his attempt to bring “these things” to the host culture. On the other hand, there is a level of uneasiness when asked whether he was thinking about “standardizing” corporate cultures in his overseas offices by imposing MAX's Australian culture in an attempt to prevail over various host cultures. Bob hesitated by pausing a few times, emphasized the coreness of MAX's culture, and admitted the existence of different ways of acting, as the third excerpt has shown. Hence, I can become attentive to the justifications that informants offered when explaining their understanding of culture.
Epilog: A Reflection
The demonstration above illustrates how engaging with mode-enhanced transcription helps sensitize previously collected data, uncover neglected nuances, and generate fresh cues for reanalysis. By attuning my attention to these nonverbal features, expressions, and acts, I quickly reimmersed myself in the data that I collected years ago, which triggered some new ways of seeing (Gioia et al., 2013; Jarzabkowski et al., 2021). However, effective use of the tool requires us to reflect on our paradigmatic preference (Cilesiz & Greckhamer, 2022). Indeed, when trying this tool, I was receiving “methodological socialization” in a PhD program in the United States, where I found myself experiencing the trajectory of “readily accept[ing] postpositivism” (Cilesiz & Greckhamer, 2022, p. 356) and began to identify myself as an interpretivist. The awareness of such a preference is prominent among researchers who engage with modes purposefully, which is the other scenario, discussed later.
Implications for Research Focus
Researchers who engage in a mode-enhanced transcribing process to sensitize their previously collected data may well have their intended research questions to address and literature to contribute to. Hence, sensitizing their data in this way does not automatically alter their research. Instead, it could sharpen their research focus and add nuances to their interpretations. For example, in the demonstration above, as I became aware of the founders’ contradictory perceptions of their home and host cultures, I could sharpen my research focus from identifying how ventures maintain their culture during global expansion to understanding how ventures manage contradictory perceptions when intending to maintain their cultures. Alternatively, sensitizing the data with the mode-enhanced transcribing process could reshape the research focus, should researchers choose to do so (see Pratt et al., 2022). For example, this case study about how ventures maintain corporate culture globally could be redesigned as a process study that explores how perceptions of corporate culture shift as ventures grow temporally and spatially.
Implications for Data Collection
Sensitizing the previously collected data with the mode-enhanced transcribing process does not affect data collection. However, researchers need to assess the quality of their data, particularly the availability of mode representations in their data, before using this sensitizing tool. The demonstration above shows that this process may benefit researchers with high-quality recordings of social interactions in which prosodic features and paralinguistic expressions are captured effectively. It is also ideal for ethnographers who regularly take notes or write reflective memos when observing and interviewing in their fieldwork because such notes and memos can help identify and confirm various acts in these audio-recorded social interactions (e.g., Jarzabkowski & Lê, 2017; Pouthier, 2017).
Implications for Data Analysis
Sensitizing the previously collected data with the mode-enhanced transcribing process does not always change how data are analyzed with a particular qualitative method. However, as researchers interact with their data deeply, they may become aware of the unsaid, the unusual, and the unexpected in these mode interplays. As a result, their research focus may shift, and so may their data analysis strategy (Pratt et al., 2022). Researchers may also use other sensitizing tools, such as tabulation and visual representation, to generate alternative cues that may also be worth pursuing (e.g., Cloutier & Ravasi, 2021; Ravasi, 2017).
Engaging with Modes Purposefully with the Mode-Enhanced Transcribing Process
Qualitative researchers can choose to engage with modes purposefully by considering mode-enhanced transcription at the outset of their research design (Hindmarsh & Llewellyn, 2018; Jonsen et al., 2018). To illustrate practices that researchers have adopted, I reviewed eight articles published in leading journals that each presented some form of mode-enhanced excerpt. The review shows that purposefully engaged researchers either fully acknowledge the theoretical underpinnings in their research, which often suggests some form of multimodality (Bencherki et al., 2021; Nicolini, 2009), or strive to be transparent about their analytic process, such as showing how emotion is captured in their studies (e.g., Jarzabkowski & Lê, 2017; Liu & Maitlis, 2014). Consequently, these researchers tend to focus on such features as speed, intonation, and volume and their interplay with paralinguistic expressions and physical acts to a great extent. A summary can be found in Table 4.
Table 4. A Summary of Engaging with Modes Purposefully Based on Eight Published Articles
In addition, I identified the research focus, data, and modes in each of these articles, as given in Table 5. I then discuss their implications for research focus, data collection, and data analysis to guide researchers. Unlike the demonstration above that focuses on the process, this section illustrates the outcomes of researchers’ sensitizing, analyzing, and theorizing processes.
Table 5. An Illustrative Review of Eight Published Articles
Implications for Research Focus
Studies that display mode-enhanced transcripts tend to focus on three research areas: emotion (e.g., Jarzabkowski & Lê, 2017; Liu & Maitlis, 2014; Pouthier, 2017), discourse and communication (Bencherki et al., 2021; Nathues et al., 2022; Wenzel & Koch, 2018), and practice theory (Bencherki et al., 2021; Jarzabkowski & Lê, 2017). For example, Jarzabkowski and Lê (2017, pp. 442–443) employed a practice lens to understand how humor was used to balance paradoxical goals. Similarly, Pouthier (2017) focused on how socioemotional behaviors, such as griping and joking, shaped team communication. Other researchers have focused on strategy-as-practice and communication. For example, Wenzel and Koch (2018) explored how keynote speeches came into being as a staged genre of strategic communication from both a strategy-as-practice and a critical discursive perspective. Nathues et al. (2022, p. 9) examined co-authoring practices in strategy making, in which organizational actors intended to “speak and act in the name of their supervisors, their organizations, rules they must follow, etc.” Their detailed analysis also suggests that this mode-enhanced approach may benefit other areas, such as power dynamics in social interaction and reflection on researcher identity, because of such dynamics. Researchers have acknowledged these dynamics when interviewing elite participants (Empson, 2013; Ma et al., 2021) and when reflecting on their researcher identity (e.g., Alcadipani et al., 2015; Cunliffe & Alcadipani, 2016). Hence, researchers interested in power dynamics and researcher identity may also choose to sensitize their data with the mode-enhanced transcribing process.
Implications for Data Collection
In these eight studies, authors either video-recorded social interactions, such as meetings, presentations, and interviews (e.g., Bencherki et al., 2021; Liu & Maitlis, 2014) or audio-recorded them supplemented by extensive field notes and reflective memos (e.g., Jarzabkowski & Lê, 2017; Nicolini, 2009). They often noticed various modes or patterns of mode interplay during data collection and recorded them, wittingly or unwittingly, in their field notes. These field notes and recordings became essential materials for researchers to further sensitize their data and represent it as multimodal. For example, in a study about griping and joking, Pouthier (2017, p. 754) admitted that a focus on these behaviors was not part of her initial research design, but “their significance emerged through observation.” Ultimately, she marked all the incidents based on “paralinguistic, prosodic, and discoursal clues” and reconstructed a mode-enhanced transcript using her field notes and recordings. Similarly, Jarzabkowski and Lê's (2017, p. 433) study “was inspired by an observation [in the field]: people make a lot of jokes about paradoxical conditions.” Hence, researchers can remain reflective in the field. Researchers may also consider their data collection strategies, such as video-recording interviews and meetings (Heath & Luff, 2018; Jarrett & Liu, 2018; Mengis et al., 2018), or using professional recording devices to ensure high-quality soundtracks and specialty software to timestamp field notes while recording. Similar attention to various modes or patterns of mode interplay may be particularly useful when collecting virtual interactions, such as videoconferencing meetings, interviews, and presentations.
Implications for Data Analysis
Engaging purposefully with mode-enhanced transcription does not dictate the direction of data analysis, even though it may help generate codes (Locke et al., 2022). Instead, it is the research design that shapes how researchers analyze their data. For example, Bencherki et al.’s (2021) research question is how members of a community-based organization decide strategic issues in a strategic planning exercise. Their analysis “followed insights from interaction analysis” (p. 613) and focused on “what people were saying and doing as social action, asking, for each turn of talk or gesture: what are people doing here?” (p. 614). Data analysis can also be shaped by theoretical and epistemological underpinnings (Gehman et al., 2018). For example, Nicolini (2009, p. 1396) argues that practices should be represented from “different angles, such as verbal words, vocal expression, and bodily acts, through a toolkit-logic approach.” These examples also echo the recent call for methodological bricolage (Pratt et al., 2022), which requires deep interaction with the qualitative data and can be facilitated by mode-enhanced transcription. Nevertheless, the usual challenges of qualitative work remain (e.g., Jonsen et al., 2018; Sandberg, 2005).
Benefits and Further Considerations of Using Mode-Enhanced Transcription as a Sensitizing Tool
As shown above, sensitizing data with mode-enhanced transcription enables researchers to interact with their data extensively, attune their attention to often-neglected nuances, and generate promising cues, which may be valuable for “creatively combining codes, abstracting from the data, dialoguing with existing theory, and so on” (Locke et al., 2022; Pratt et al., 2022, p. 214). Researchers with rich experience in, familiarity with, and deep knowledge of the context may uncover these insights regardless. However, such insights may be neglected when researchers focus only on verbatim words. Hence, engaging in the mode-enhanced transcribing process is useful for researchers working with social interaction data in several ways.
First, mode-enhanced transcription is useful for researchers who plan to (re)visit data collected previously, whether by themselves or by someone else. Engaging in the process offers researchers a vital opportunity to relive the social interaction and (re)immerse themselves in the data, enabling them to attune their attention to a few features and expressions, such as elongation, loud emphasis, pause, quick utterance, and distinctive acts. By attending to these nuances, researchers become aware of taken-for-granted perceptions, overlooked perspectives, and unusual moments (Hindmarsh & Llewellyn, 2018), which may lead to potential cues for creative analysis (Klag & Langley, 2013). Mode-enhanced transcription can also be valuable for researchers to sensitize data collected with videoconferencing software. Such data are often challenging to contextualize due to a lack of physical contexts.
Second, considering mode-enhanced transcription at the outset of research design is beneficial for researchers to uncover social practices, display emotion empirically, and unpack processes of meaning-making, strategy making, and (inter)organizational communication, as the illustrative review above has shown. Engaging in the process purposefully allows researchers to stay true to the theoretical underpinnings of social theory and be transparent about their analytic process, such as showing how emotion is captured by examining both verbal statements and nonverbal cues (e.g., Liu & Maitlis, 2014; Pouthier, 2017). Considering modes proactively also prepares researchers for collecting mode-rich empirical materials to ensure the availability of modes relevant to detailed analysis (e.g., Nicolini, 2009; Wenzel & Koch, 2018). Finally, researchers who engage with modes purposefully can be well served by noting various modes and mode interplays in their transcripts.
Third, researchers may augment the trustworthiness of their work by presenting a mode-enhanced transcript. It may help researchers defend the analyzing process, which is often built upon heuristics, sign-reading, and sense-making (Poland, 1995). As Tilley (2003, p. 771) commented, researchers can strengthen claims of trustworthiness of data by making visible the complexity of transcription work, acknowledging the interpretive reality of data constructed, and providing insight into the ways in which they specifically address issues of trustworthiness in their research practices.
However, it is vital to note that a mode-enhanced transcript as an artifact does not offer “a full or objective record of ‘what happened’” (Hammersley, 2010, p. 555). Instead, it is always mediated by our perceptions of transcription (Myers & Lampropoulou, 2016) and of social interaction and by our research interests (Lapadat & Lindsay, 1999; Sandberg, 2005). Likewise, such a transcript does not unveil the sensitizing process completely because an article in print is the outcome of an iterative process of sensitizing, interrogating, and navigating between theory and data (Locke et al., 2022; Mees-Buss et al., 2022).
There are some further considerations when using mode-enhanced transcription as a sensitizing tool. First, because transcribing is often considered “a mundane, time-consuming chore” (Tilley, 2003, p. 771), engaging in the mode-enhanced transcribing process can be seen as an overwhelming effort. Researchers can choose to attend to various modes while listening to their recordings as an integrated part of their meaning-making process without producing a mode-enhanced transcript (Hindmarsh & Llewellyn, 2018; Jonsen et al., 2018). Hence, engaging in this process is to “supply heuristic” (Hammersley, 2010, p. 27). Researchers can also make abbreviated notations of such features and expressions as speed, volume, pausing, and laughing when proofreading the professionally transcribed texts to facilitate deep interaction with the data.
Second, mode-enhanced transcription does not “provide recipes for doing research or even specific guidance” (Hammersley, 2010, p. 27). It does not render meanings automatically. Rather, as Hindmarsh and Llewellyn (2018, p. 430) rightly noted, “[the] measure of evidence is not that a particular ‘cue’ occurred, but that the cue is ‘oriented to’ by those in the setting as having a particular quality or implication.” Thus, as a sensitizing tool, mode-enhanced transcription is generative. It is also worth noting the interpretivist assumption underlying such a process (Locke et al., 2022; Mees-Buss et al., 2022). Researchers may need to be mindful of their paradigmatic preferences (Cilesiz & Greckhamer, 2022).
Third, sensitizing the previously transcribed verbatim texts with a mode-enhanced process may lead researchers to believe that a meticulously constructed transcript is important and even essential. Engaging in this process is not about producing an object or artifact, although presenting mode-enhanced excerpts may augment the trustworthiness of research. If researchers choose to construct and present mode-enhanced transcripts, such representations need to fit their research design and focus (see Gehman et al., 2018 for a theory–methods fit discussion) rather than overwhelming readers with unnecessary notations and signs. After all, mode-enhanced transcription is intended to be a sensitizing tool that helps researchers attune attention to nuances and generate potential cues, which may lead to interesting findings and trustworthy presentation of qualitative research.
Acknowledgments
The author would like to acknowledge and thank Lisa Dorner and Ruben van Werven for their helpful comments on the earlier versions of this article. Special appreciation goes to Ray Zammuto, whose support and guidance made this article possible. The author would like to thank Thomas Greckhamer and two anonymous reviewers, whose active engagement and thoughtful comments helped refine this article. Finally, the author would like to thank Paul Bliese for his assistance and support during the reviewing process.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
