Abstract
This article reflects on my experiences as a male researcher using voice-only WhatsApp interviews to study women's affect and Taliban violence in Pakistan's Swat Valley. It considers the opportunities and constraints posed by doing research in supposedly disembodied online space. It also positions remote voice-only interviews as both embodied and embedded practices. This understanding situates the embodied reflexivity and gendered positionality of the researcher in relation to research participants—a relationship largely absent in online, qualitative voice-only interviewing literature. While internet-mediated settings do indeed offer some opportunities, their ability to circumvent gender boundaries is largely over-celebrated and has not received enough critical attention. I demonstrate why researcher feelings, positionality, and embodied reflexivity should be central concerns in post-COVID online, voice-only interviewing.
Introduction
Methodological literature on voice-only interviewing is predominantly concerned with the presence or absence of visual cues, length of words and quality of data, and participants’ access to technology (Johnson et al., 2019). Other well-documented practical considerations include platform security features, participant preference for certain mediums (e.g., video, voice or email), and researchers’ use of remote interviewing in specific contexts (Lawrence, 2022; Mwambari et al., 2022: 972). The literature on remote interviewing (before, during and post-COVID-19) generally overlooks the researcher's gendered positionality in voice-only interviewing (Hall et al., 2021). In contrast, I argue that good interviewing is facilitated by a reflexive awareness of and engagement with the embodied and performed dimensions of the interview (Ezzy, 2010). To support this claim, I offer reflections from my research in the post-COVID researchscape of Pakistan's conflict-affected Swat Valley to examine how embodied gendered experiences (of researcher and participant) influence remote interviews in places with strict cultures of gender segregation.
Knowledge production is deeply embedded in sensory experience, and bodies are always a meaningful presence in research. A body is a material entity whose potential meanings are constituted and circumscribed by cultures through particular discursive systems that privilege certain sets of norms and values that regulate interactions (Ellingson, 2017: 2). Embodiment, as a state, is contingent upon the environment and the context of the body. The literature on conflict-affected regions in Pakistan tends to either accept the methodological inaccessibility of the opposite gender or refuse to discuss gender at all (Khan, 2020). The former tendency is promoted by male anthropologists, sociologists and political scientists, while the latter tendency is prevalent among women scholars in the region.
In this article, I avoid such silencing of gendered bodies—both of the researcher and of the participants (Ellingson, 2017:18)—and position online interviewing as an embodied practice. Voice-only online interviews are often thought to be disembodied due to the physical distance, time zone differences, and a relative lack of interviewer control (Rosalind and Holand, 2013: 48). The physical proximity of bodies in a shared space—which often anchors reflexivity—is absent in virtual space. Therefore, this paper asks how such a disembodied space is embodied and lived through sensory and affective modes (Pink et al., 2016) and how gendered positionality influences the dynamics of interviewing in a virtual environment. These questions are largely overlooked, despite a renewed focus on online interviewing in the wake of COVID-19 and nascent calls for increased reflexivity around researcher positionality in virtual interview settings (Roberts et al., 2021).
The gendered bodies of both the researcher and the participants are important elements of knowledge production (Ellingson, 2012). Space is also important, as lived space surrounds and influences how we act, feel, move, and understand our way of being (Ellingson, 2017: 23). The online interview setting is a disembodied space that only allows for hearing and sight. Such an embodied experience can teach us about our own embodiment and what we take for granted (Turner, 2000). COVID-19 opened the possibility for me to remotely interview Pakhtun women about the affective dimensions of Taliban violence in women's markets. This is not to claim post-COVID methodological solutionism (Fleschenberg and Holz, 2022); rather, I acknowledge a gap in the online interviewing literature around gendered positionalities in the research process. This paper tracks voice-only online interviewing's possibilities, while also warning against its constraints in fragile contexts.
Following Holstein and Gubrium (1995), I consider interviewing to be an active process where two participants (interviewer and the interviewee) generate meanings through verbal and nonverbal communication. It is a co-construction of knowledge. The researcher's gendered positionality and the mode of interviewing (online or face-to-face) have methodological implications for qualitative fieldwork. However, there is limited work on how gender-marked bodies of interviewer and interviewee shape online interviews as an active process of co-constructing knowledge.
The next section offers the study background and situates these reflections within my broader research project. It also explains how I arrived at voice-only interviewing as the most suitable mode for the Swat Valley. Next, I frame the voice-only interview setting as an embodied space. Section four outlines the significance of voice-only interviews for hard-to-reach populations in conflict-affected settings. While acknowledging the benefits of internet-mediated interviewing methods, I also foreground two major challenges. Firstly, interviewing women in the Swat Valley through online methods is difficult without social networks facilitating access to potential participants. Secondly, there are cultural, contextual and gendered limits to voice-only interviewing that demand embodied, culturally sensitive listening.
Methodology and context
The COVID-19 pandemic encouraged deeper and more reflexive engagements with online methods (Johnson et al., 2019). The use of internet or computer-mediated methods is context-sensitive. Certain tools such as Zoom (Howlett, 2022), Facetime (Weller, 2015), Skype (Seitz, 2015), email (James and Busher, 2006), Facebook (Pousti et al., 2021), WhatsApp (Colom, 2021), and telephones (Pell et al., 2020) cannot be proposed
My reflective account of conducting online interviews (both ‘being here’ in a basement in Stockwell, London and ‘being there’ in Pakistan) emerges from my involvement with an ongoing multidisciplinary project. The project, titled (omitted for review), was designed and commenced in 2019, in the pre-pandemic world. The project sought to learn how the communities and landscapes in Swat are healing from and reconciling with the wounds inflicted by Taliban violence more than a decade ago. It covers four thematic dimensions: poetry, historical heritage and archaeological sites, lived heritage (including women's markets), and natural resources. This paper only reflects on the fieldwork related to women's markets in the Swat Valley.
The project team aimed for the conventional (in the pre-COVID world) ‘gold-standard’ of in-person interviews (Johnson et al., 2019). All four team members from western academic institutions had prior experience conducting fieldwork in the Pakistan-Afghanistan borderland, and everyone visited Pakistan to conduct fieldwork on their respective thematic areas. Initially, we never considered conducting online interviews since the subject matter was so concerned with place, space, objects, memories, affect and feelings. The envisioned data generation techniques included textual and audio-visual analyses of archival materials, interviews, focus-group discussions, observations, participatory photography, participatory sketch drawing and transect walks. However, COVID-19 and its travel restrictions forced us to rethink our research design and incorporate remote interviewing (Sy et al., 2020: 602–603).
Our decision to use online communication was taken iteratively and with some trepidation. We initially contemplated emailing our key informants and adding face-to-face interviews at a later stage (James and Busher, 2006). (NAME), an international expert on community-led heritage practices with extensive networks of collaborators in Islamabad and Swat, was the first in our team to (successfully) email a journalist and local heritage activist in Swat. The interview was in English because (NAME) does not speak Pashto and the interviewee is fluent in English. However, when I conducted a follow-up interview with the same interviewee in Pashto over WhatsApp, he stated, “I think if the same [email] interview was conducted in Pashto [first language] or in Urdu [second language], I could have responded better and of course in more detail” (June 4, 2020). This comment from an interviewee, who has published extensively in English, made it clear that synchronous interviews (either video or voice) in Pashto were our only real option. Furthermore, most of our interviewees lacked the technical competency to write in Pashto on a QWERTY keyboard.
Our choice to use voice-only WhatsApp interviews was informed by practical and contextual considerations. Telephone calls from London to Pakistan were more expensive than voice-over-internet protocols. Zoom was not considered due to concerns over its security lapses and video leaks, and the potential consequences for participants in a conflict-affected region (Mwambari et al., 2022). Such lapses pose a great risk to the participants in Swat, where memories of recent conflict are still vivid, and individuals have been targeted by both sides of the conflict (the Pakistani military and the Taliban) for expressing their opinions. Skype was ruled out to save participants the hassle of installing additional software. The app Signal, which has better security than WhatsApp, was not used since asking participants to install software with enhanced security features could generate distrust about the aims of our project. In the first round of synchronous interviews, I co-conducted 13 (11 male and two female) interviews with poets in Swat (alongside my colleague, NAME). Four of the 11 interviews with male poets were converted into voice (from video) calls to enhance the flow of communication due to the interviewees’ unstable connections. The two female interviewees only agreed to voice interviews, not videos.
Voice-only WhatsApp interviews were also used for the women's market interviewees. I conducted 18 semi-structured remote interviews with women participants (between December 2020 and April 2022) and 18 face-to-face interviews with male participants (June–September 2021) in women's markets in Swat. These interviews explored lived experiences of the conflict and post-conflict interactions with human and nonhuman subjects in these markets. Six of the 18 face-to-face interviews with male participants were conducted by (NAME), our male research assistant. Another six of the face-to-face interviews with women were conducted by (NAME), our female research assistant who lives in the study area. Our research assistants were employed for a year: they were insiders in terms of gender, language and culture, and also outsiders since they had never met the participants nor visited most of the locations. All the interviews were conducted in Pashto, which is my native language. After the eighth interview, we added a closing question specifically about the interviewee's experience with voice-only interviewing.
All the interviews offered interesting non-visual background cues specific to each interview setting (cf. Khan, 2020). In some instances, notes on these background non-visual cues were compared with reflective notes from my female RA, who conducted six face-to-face interviews in Swat in November-December 2020. The following section analyses my reflective notes on the nonvisual dynamics of voice-only interview settings in addition to my field notes from in-person fieldwork in the women's markets in August 2021. I compare my field notes from online voice-only interviews with ‘being there’ in the women's markets to offer a reflexive account of my own embodied and embedded experience as a source of methodological insight (Hine, 2015: 16). While the online interviews occurred well before my physical visits to women's markets in Swat, the latter helped me to better understand the embodied and embedded nature of technology and its potentialities for facilitating my interviews with women interviewees.
The voice-only remote interview setting as an embodied space
Voice-only interview space entails embodied affects that direct and redirect the flow of communication between the interviewer and the interviewee. Unlike in broader online virtual space, where the adoption of multiple identities allows for a distinction between the corporeal and virtual body (Taylor, 1999), in voice-only interviewing, the virtual body reflects and extends the corporeal body embedded within its everyday social realities. Against this backdrop, I view virtual space as neither disembodied nor decontextualized; hence, gendered bodies and their boundaries shape the dynamics of digital interviewing (Van Doorn, 2011). Gendered and embodied sensibilities of the interviewer and interviewee shape this space and its affectivity. The flow of conversation may be hindered by the conversation topic, the interviewee's embodied experiences, or the interviewee's background setting and cultural context.
When exploring women's conflict experiences in the marketplaces, I was interested in the objects that resurface past memories associated with Taliban violence in the Swat Valley. A 31-year-old woman interviewee (MW12, 15/8/2021) brought my attention to bras, a sensitive but interesting object within the dynamics of conflict (Khan, 2024). Literature on voice-only interviewing often celebrates its usefulness, over face-to-face interviewing, for exploring sensitive topics (Sipes et al., 2019). However, given the sensitive nature of the topic and culture-specific practices, these were never easy questions to ask. In particular, questions about sensitive objects like bras generated discomfort in the virtual interview setting. I felt uncomfortable with the choice to either avoid the question or make my interviewee uncomfortable. One interviewee declared, “I do not talk about these things even with every woman.” I found myself without words, unsure how to phrase the next question. I paused, but thankfully, the interviewee, who had a copy of the questions, helped move us along, asking, “question 9?” I replied, “yes” (MW13, 8/4/2022). The interviewee's proactive sensibility in response to my confusion prevented an atmosphere filled with discomfort. At the end of the interview, the interviewee and I agreed that the sensitive questions (about bras) would be easier to ask, and the answers more open, with a woman interviewer. Thus, voice-only interviews do not always facilitate conversations on sensitive topics (Sipes et al., 2019; Trier-Bieniek, 2012); gender also determines the ease or difficulty of such work.
Regardless of the topic (gender sensitive or not), embodied experiences shape the visceral dynamics of remote, voice-only interviews. An interview with a 35-year-old university lecturer began with an account of her experience seeing the Taliban marching from vehicle to vehicle with a “man's head chopped from its body” in the busy bazaar. These actions were meant to intimidate women who wanted to come out of their homes—the message was that the next head would be theirs. The interviewee recalled every minute detail of the event, at one point pausing and reflecting, “this memory is so scary that even now when I am describing the event, my body is shivering” (MW11, 2/7/2021). The interviewee's shivering body—as communicated through words—generated an atmosphere of care, and I redirected the conversation. Instead of jumping to a different line of questioning, I asked the interviewee if she wanted to terminate the interview. The interviewee appreciated my concern and stated, “I would love to continue if you want to ask me more general questions related to women's markets” (MW11, 2/7/2021).
Both silences and various noises in the interviewee's background have implications for remote voice-only interviews. For Sipes et al. (2019:212), a lack of background noise indicates a poor internet connection, while silences in the background indicate that interviewees are thinking and shaping their responses to a sensitive topic. In my case, complete silence and a confident tone implied that the interviewee located herself in a space without other people. Disruption to that silence from background noises (e.g., animated conversations, knocking doors, footsteps walking towards the interviewee, someone calling the interviewee's name from outside the room) influenced the flow of conversation and redirected the affective atmosphere of the interview setting. Moreover, these everyday background noises can help us interpret the interviewee's responses. Thus, attention to background sounds is necessary for qualitative interviews, as they help capture the everyday lived social realities of the interviewee.
Scholars conducting online voice-only interviews over Zoom, Skype, or WhatsApp often report major problems like dropped calls, an inability to understand pauses, the absence of visual cues, and uncertainty about when to interject (Sipes et al., 2019:8–9). Indeed, such problems are particularly significant within unequal digital divides. However, the “epistemic limits,” to use Thanem and Knights’ (2019: 23) phrase, of voice-only interviewing literature become evident in its obsession with technological influences, not embodied aspects. The failure to attend to the embodied reality of technology in our participants’ lives risks missing what feminist geographer Richa Nagar (2019) calls “epistemic energy” that is out there in the field. This epistemic energy cannot be explored from behind laptop screens in book-studded offices or interviewers’ cosy global north living rooms.
The relevance of online research for hard-to-reach populations and conflict-affected regions
The “body is the vehicle of being in the world” (Merleau-Ponty, 1945/1962:82, cited in Sharma et al., 2009:1643). I particularly felt the presence of my body while conducting in-person research in women's market shops and alleyways between June and September 2021. The multiple interactions occurring around me demonstrated the limits and potentialities of in-person research. I felt vulnerable to being misperceived by male interlocutors in the women's market, and powerless due to my inability to speak with women despite their presence all around me in the marketplace. These embodied experiences and emotions provided a rich physical context for my ongoing use of voice-over-internet protocols.
In the Pakhtun cultural context, male researchers find it difficult to access women for face-to-face interviews in a shared physical setting. Therefore, the potentialities of internet-mediated tools for interviewing women cannot be fully grasped unless the researcher's body is centred in reflexive accounts (Sharma et al., 2009: 1642–44). The promise of technology for facilitating interviews with difficult-to-access populations is not disembedded from the cultural context within which technology use is socially constructed. Yet, the cultural embeddedness of technology use by research participants receives little attention in the literature on telephone interviews and voice-over-internet protocols (which often celebrates how these modes help recruit difficult-to-access participants (Holt, 2010; Sipes et al., 2019; Self, 2021; Sturges and Hanrahan, 2004)). Furthermore, most methodological texts about WhatsApp as a mediated means of communication consider text-based messages (Colom, 2021; Gibson, 2020:15).
Online methods literature is often silent about intersubjective communication that co-produces knowledge in specific cultural contexts (Ellingson, 2017). Certainly, technology's affordances are valuable; however, without a nuanced discussion about how virtual interactions are connected to the material worlds of the researcher and the research participants, they are meaningless (Morrow et al., 2015: 534). The extant literature on telephone interviewing (Farooq and De Villiers, 2017; Holt, 2010) and voice-only Skype interviews (Sipes et al., 2019) rarely goes beyond its advantages in facilitating discussions about sensitive issues (Mann and Stewart, 2000) and benefits vis-à-vis face-to-face interviews (e.g., accessing hard-to-reach populations and regions, such as war zones (Opdenakker, 2006), travel costs and time savings (Irvine, 2011)).
One often unstated advantage is that online interviews allow interviewees to select and adjust their interview settings (Self, 2021). This is important in contexts where present male family members may start responding on the interviewee's behalf, dismiss the interviewer's questions, or constantly stare at the interviewer and interviewee (Ibtasam et al., 2019:11). Even my interviewees (not in a face-to-face setting) were concerned about being overheard by male family members with whom they do not want to share their experiences (Khan, 2020; Seitz, 2015). To avoid such a situation, some respondents suggested a meeting time when “the male family members are not at home,” others closed the doors of their rooms, and some preferred to be interviewed in their office. One of my interviewees stated, “I can talk in a more relaxed environment there [in the office] because no one from my family will be listening to what I am saying about my experiences in the market” (field note, 3 December 2020). When the home was the only option for the interviewee, I relied on background auditory cues to understand the setting and contextualize her responses rather than ascribing short responses or non-richness to the voice-only interview format (Johnson et al., 2019).
In this case, the interviewee's physical setting was more important than factors associated with unequal digital divides. For instance, a 34-year-old university lecturer speaking to me from her residence in a girls’ hostel suddenly took a 45-s pause while talking about her experiences with Taliban crises. I did not interject, assuming her internet connection was bad. However, once we reconnected, she quickly stated, “a student came into my room, and you know, I cannot talk about these things [memories of conflict] in front of others”. In another instance, a 21-year-old university student speaking to me from her home was providing very short responses with long pauses. Later, when I probed about her memories of coming home after four months of displacement in 2009 due to military conflict, she narrated the story of a beheaded militant without any pause. I was confused by how she narrated this “scary memory” compared to her earlier, guarded responses to more mundane questions. She later clarified, “my elder sister [mashra khor] was here earlier and hence I could not talk openly”.
Can internet-mediated methods work without an added layer of mediation?
Recruiting strangers for non-face-to-face interviews (Sturges and Hanrahan, 2004) often involves social media pages, message boards, organizational leadership, and emails (Crowley, 2007; Sipes et al., 2019; Self, 2021; Trier-Bieniek, 2012). In some instances, researchers directly recruit participants using chat services (Lawrence, 2022). However, none of these methods were useful for this study: my internet-mediated interviews required an added layer of mediation, that is, contact through existing social networks.
COVID-19 amplified the problems associated with using internet-mediated methods for interviewing women. Limitations on women's time and space at home increased due to lockdown measures, especially in Pakistan (Kirmani, 2020). Women's already constrained use of the internet and smartphones was subjected to increased surveillance (Ibtasam et al., 2019; Mwambari et al., 2019). Women could rarely answer phones in isolation, and they often handed phones to their husbands or other male relatives when speaking to a male stranger (i.e., the researcher) (Shah, 2022). In a context where online hate speech, harassment, and privacy breaches characterise women's access to communication technology and conversations with stranger men (Ali Aksar et al., 2020:9), how could I contact women interviewees whom I did not know?
In Pakistan, the majority of women internet users access social media platforms and communicate using mobile phones (Ibtasam et al., 2019). Pakistan has recently narrowed the gender gap in South Asian adoption and use of mobile technology (Shanahan, 2021). Nevertheless, Pakistan has one of the widest gender gaps in internet use and mobile ownership. Women are 38% less likely than men to own a mobile phone and 49% less likely to use mobile data. Only 50% of women own a mobile phone, compared to 81% of men (with only 20% of women owning a smartphone, compared to 37% of men) (Shanahan, 2021). Of the women in Pakistan who own a smartphone, 16% do not use mobile internet (Shanahan, 2021: 10–12). COVID-19 restrictions (e.g., lockdowns, working from home) further curtailed women's ability to use the internet without family surveillance in offices, educational institutions or libraries.
Despite these inequalities, I avoided using social media platforms such as Facebook and Twitter to recruit potential participants. Firstly, the Pakistani state considers social media to be a threat to the national image (Kirmani, 2021). Since our project examined conflict and counterinsurgency operations (Marsden and Hopkins, 2013), there was a potential risk to our participants from the national security forces and Taliban alike. In addition, the Pakistani internet landscape is awash with fake social media accounts, and women do not feel safe sharing information on the internet. This left WhatsApp voice-only contact as the most practical option for carrying out our interviews. The reliance on WhatsApp—and exclusion of Instagram, Facebook, Twitter, IMO chatrooms and Clubhouse—meant that we could not recruit respondents randomly, without the mediation of locally embedded social ties. However, WhatsApp offered a safer interview setting, which allowed the interviewees to reflect in detail.
The random recruitment of women participants for voice-only remote interviews was highly impractical. Records of women as mobile phone owners do not exist. Moreover, cultural family norms limit women's ability to speak to male strangers over the phone. Male family members are often discomforted by women's interactions with men beyond their nuclear family. Therefore, women's mobile phones, the enablers of such interactions, are always under scrutiny. Women either do not respond at all, or do not respond positively to calls from stranger men (Khan, 2020). It is presumably possible to access women interviewees through their male family members but, as Shah (2022) demonstrates, these interview settings do not facilitate free expression, as male family members are present.
Therefore, none of my women interviewees were randomly recruited. Access was negotiated by my personal and professional contacts (men and women) in the region who had personal ties with potential interviewees. Both men and women were helpful in negotiating access with women in professional settings; access to non-working women was only possible through other women. All these women interviewees were educated, so my claim that internet-mediated methods can facilitate a male interviewer's access to women interviewees in hard-to-access populations remains untested with uneducated women in the region. Internet-mediated methods in contexts like the Swat Valley are unlikely to succeed without an additional layer of mediation through locally embedded personal ties (e.g., trusted colleagues, friends, teachers, or students). Interestingly, family ties (both men and women) were not helpful in negotiating online interviews with women participants. The locally embedded personal ties bridged the trust between the women interviewees and a male interviewer who were complete strangers. None of the online interviews featured a male family member lurking in the background. In fact, many women deliberately selected a time or setting where they could elude male presence.
Cultural, contextual and gendered limits: Challenges and solutions
Background sounds during the interview were significant for contextualizing the interviewee's responses. For example, a 23-year-old Master's student was cheerful and confident until a sudden knock on the door. As soon as the interviewee heard the knocking, she asked me to excuse her so that she could attend to the door. The knocking deprived her of a private space, which she had created for the interview, and resulted in a sudden change in tone. I realized that the interview would be challenging; immediately hanging up the call would have generated suspicions that could have been harmful in her family setting. I could hear a male voice calling the interviewee's name while knocking, and footsteps approaching the interviewee when she started talking again after a 10-s pause. The interviewee re-started the conversation with a completely different topic by saying that “I will be in the final year now once the exams are conducted”. I told the interviewee, “You can opt out of the interview any time, even now” (interview notes, 9 December 2020).
Audio interviews are often considered the ‘second-best choice’ (Holt, 2010) since they lack paralinguistic cues and non-verbal communication (Weller, 2015:23). As Deakin and Wakefield (2013:605) note, all the subtle visual and non-verbal cues that help contextualize the interviewee in a face-to-face scenario are lost. However, in my case, audio interviews were the only (not ‘second best’) option. Even when I allowed the interviewees to choose which internet-mediated methods (video or audio) they preferred (Weller, 2015), they favoured audio-only interviews.
Online voice-only interviews require going beyond “effective listening”. As Farooq and De Villiers (2017: 307) suggest, interviewers must pick up on changes in verbal cues like “pauses, hurried answers, tones, etc. and indicate if interviewees are confused, hesitating or experiencing frustration.” These conversational elements need to be contextualized within the cultural dynamics and the physical settings where the interviewee is located. The interviewer's invisibility is juxtaposed with the interviewee's visibility to the people surrounding her, potentially reducing the ability to create a positive ambience and establish rapport (Opdenakker, 2006). Therefore, in voice-only interviews, we must attend to the background auditory cues for effective interviewing. The interviewer's ears are the only sensory connection to the interviewee's physical surroundings.
Culturally sensitive listening is a prerequisite. This includes elements of “reflective listening” (Au, 2019: 64), effective listening, and culturally sensitive communications (Brooks et al., 2019). Transformations in the interviewee's physical settings can be detected through close attention to all kinds of background sounds. Changes in tone, pauses, shifts in subject, refusal to open up, and sudden unresponsiveness can all indicate events in the interviewee's physical space. I noted numerous background sounds, including male and female voices, opening and closing doors, complete or partial silence, message alerts and ringing phones, footsteps, and ongoing cautious, affectionate, or angry exchanges between the interviewee and his/her family members, or among other family members.
Culturally sensitive listening may not be possible when there are cross-cultural differences between the researcher and research participant, especially if the interviewer cannot understand the interviewee's home language. This was evident well before I started interviewing women. In our online interviews with male poets (reference omitted for review), a 60-year-old retired schoolteacher told us he could not read the information sheet and consent form because his feature phone did not allow him to open PDF documents from WhatsApp. He agreed that emailing the documents would resolve the problem, but asked us to wait and briefly left the room. He first asked about his son who was “out with his friends”, and then abruptly asked his wife, “hey, do I have an email?” Our ability to understand the language in this background conversation informed us about the role of technology in the interviewee's life. With the women interviewees (who had the technological competence to handle emails, text messages and phone calls), these background conversations often prompted me to ask whether we should continue the interview, and helped contextualize their pauses, suddenly short responses, or changes in tone.
Conclusion
The literature on voice-only interviews, both with telephones (Holt, 2010) and internet protocols (Crowley, 2007; Self, 2021; Trier-Bieniek, 2012), calls for increased reflexivity. My engagement with online interviews was forced by COVID-19. Nevertheless, it highlighted the absence of gendered reflexivity in voice-only interview settings and dynamics. In this sense, COVID-19 was “not an event, [but] a reminder of the actuality of such debates” (Nyenyezi Bisoka, 2020: 2). Beyond being a mode of communication, voice-only interviews involve an epistemological researcher standpoint. Such reflection is either entirely absent (cf. Deakin and Wakefield, 2013) or not thought to affect online interview dynamics (Lorence, 2020; Weller, 2015). Technology can facilitate access to distanced and geographically dispersed populations; however, a reflexive engagement with the body (of researcher and participant) is needed to capture fragments of epistemic energy ‘out there’ across geographic regions. Our reflections on online interviews will remain incomplete until the embodied gendered and lived realities of the participants and the cultural positionality of the researcher are evaluated, both in telephone interviews (Farooq and De Villiers, 2017; Holt, 2010) and in voice-over-internet protocols (Sipes et al., 2019; Self, 2021; Trier-Bieniek, 2012). Therefore, it is important for researchers to recognize what they are doing, when they do it, and what it means to take data at face value (Sandelowski, 2002).
The context of the researcher and research participants matters (Self, 2021). Online, voice-only interviews appear disembodied to those who do not reflect on how gendered relations embedded within the participant's cultural context shape the intersubjective dynamics of knowledge production through remote interviewing (Deakin and Wakefield, 2013; Seitz, 2015; Sipes et al., 2019; Weller, 2015). Practicing such an embodied reflexivity has broader methodological implications for qualitative interviewing. Reflexivity around internet-mediated tools calls for attention to their emergent, virtual and contextual features (Pousti et al., 2021: 357). Such reflection should be critical in the Arendtian sense, where context is viewed as “irreducibly ‘reflexive space,’ within which, reflection is inevitably shaped by the context in which it occurs” (cited in Pousti et al., 2021: 365). Bids to utilize the opportunities of internet-mediated research often overlook embodied reflexivity. While the importance of gendered bodies in voice-only interviews is recognized (Crowley, 2007; Farooq and De Villiers, 2017; Holt, 2010), there is insufficient reflection on how it shapes the interviewing practice and the data generated. Embodied reflexivity—in conjunction with attention to unplanned mundane occurrences in the interviewee's background—is important to writing the larger political and social world in which the interviewees (and the researcher) are embedded.
This reflexive account has demonstrated how technologies like WhatsApp can help us overcome gender barriers and reach a difficult-to-access population. However, it has also highlighted limitations in terms of who can be included in internet-mediated research. The voice-only interview setting is an embodied space; cultural roles may hinder access to participants, with implications for recruitment in remote interviewing. I have also shown how culturally sensitive listening in the absence of visual and paralinguistic cues can improve interview analysis. This article is not yet another entry in the protracted debate over online vs. in-person interviews, nor does it prescribe how to use technology and voice-only interviews with difficult-to-access populations in conflict-affected regions. Instead, I call on qualitative researchers to continue critically engaging with the embodied and embedded aspects of voice-only interviews.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the British Academy of Management (grant number HDV190288).
Author biography
Muhammad Salman Khan is a Lecturer in development geography at the Department of Geography, King's College London. As an economic and institutional sociologist, he is interested in local political economy, with a special interest in the informal economy, institutional analysis, local governance and trust, borderland markets, cultural production (with a special interest in affect theory), and geopolitics. He is currently working on a project that aims to capture, through engagement between academic and non-academic modes of thinking and expression, the affective and emotional dimensions of development policies in Pakistan.
