Abstract
In this article, we present a comprehensive approach to analysis to assist researchers in conducting and analyzing photovoice studies. A screening of primary studies in four systematic reviews focusing on photovoice research revealed that researchers’ analysis centers on the narratives that participants provide with their photos, which undermines the potential of the photos themselves to provide meaning. In addition, the analytical effort of photovoice researchers is often limited to the interpretive phase of their projects. The question matrices we developed support photovoice researchers who aim to give more weight to photos as an interpretive medium and wish to extend their analytical lens to different phases of a research cycle. They focus analytical attention on three different sites (site of production, site of photo, and site of audiencing) and three different modalities (technological modality, compositional modality, and social modality). The matrices are designed to present an overview of the important dimensions that researchers might need to take into account when conducting photovoice research studies. We provide relevant examples to illustrate the potential risks and benefits of the analytical choices researchers make. Photovoice researchers should increase their awareness of the impact of their choices on the analytical process and avoid analytical strategies that may disempower participants and reproduce existing power relationships.
Background
The interest in visual research methods is increasing beyond disciplines that have a long tradition of using the visual in research, such as anthropology and the arts and design sector (Guillemin, 2004). These methods have gained popularity among researchers from other fields such as health care and the social and behavioral sciences (Pain, 2012). Visual research methods can be conceptualized as methods that use still or moving images as part of a research process (Rose, 2014). They can be used to collect and analyze data as well as disseminate research findings. Many advantages have been attributed to the use of images in research. Images offer a different way of describing everyday activities and people’s understandings of space, place, and relationships. In combination with narratives, they enable researchers to better portray the diversity of human experiences (Collier, 1957; Prosser & Loxley, 2008). When they are used as a tool to elicit narratives in interviews, focus group discussions, or other data collection techniques, images may work as a door opener by providing concrete talking points, keeping the participants’ attention, and allowing them to take on the role of experts on their own lives (Pauwels, 2015a). Images may also trigger a deeper level of human consciousness, latent memory, and the emotional and sensory layers of informants’ life stories (Harper, 2002; Riddett-Moore & Siegesmund, 2012) and reconnect us to elements of our own experiences (Roberts, 2011).
Since C. Wang and Burris (1994) introduced the method of photovoice in a participatory research project conducted to empower rural women in Yunnan Province, China, considerably more social and behavioral scientists have taken this method on board, particularly those working from a critical emancipatory qualitative research tradition. Photovoice is understood as a process in which grassroots communities are asked to document their lived experience through photos and discuss these photos, with the aims of identifying and representing their community, promoting dialogue, encouraging actions, and potentially influencing decision making on the policy level (Hannay et al., 2013; C. Wang & Burris, 1997; C. C. Wang, 1999; C. C. Wang et al., 1998; Q. Wang & Hannes, 2014). It builds on principles of feminist theory, documentary photography, and education for critical consciousness (C. C. Wang et al., 2000). In the original photovoice studies from C. Wang and Burris (1994, 1997), feminist theory inspired the authors to engage the rural Chinese women as authorities and experts on their own lives. Through documentary photography, rural women were able to portray social life circumstances and mental welfare states as embedded in society. The process of taking and critically discussing the photos and presenting the findings to policy makers enabled the women to advocate for change and to make policy makers more aware of their needs and the social conditions influencing their health status. It is through constant dialogue between different stakeholders that changes at the individual and community levels can be achieved (C. Wang & Burris, 1994).
The general procedure to collect and analyze data and represent findings from photovoice is as follows (C. C. Wang, 1999, 2006): (1) a group of participants is recruited; (2) an information session is organized to introduce the participants to the project and familiarize them with the method, the use of the camera, and related ethical issues, and informed consent is obtained from the participants; (3) a period of time to take photos is provided; (4) focus groups or interviews to discuss the photos are arranged; (5) an analysis of narratives and photos is conducted; (6) the findings are often shared with policy makers and disseminated among the public; (7) when authors decide to write a report on such studies, a selection of photos is often brought in to support the identified categories and statements made and to strengthen the findings section.
Problem Statement
Our team conducted a series of photovoice projects on challenges of international students studying abroad (Swarts et al., 2019; Q. Wang & Hannes, 2014; Q. Wang et al., 2018) according to the methodological state of the art presented by the developers of the method (C. Wang & Burris, 1994, 1997). While the results we generated were interesting, we felt that the analytical process adopted in our photovoice projects did not fully do justice to the complexity involved in analyzing and interpreting visual data collected. In an attempt to optimize procedures for upcoming projects, we scoped the literature to identify reviews on photovoice methodology to learn from the practice of others. We screened primary studies included in four reviews (Catalani & Minkler, 2010; Coemans et al., 2017; Evans-Agnew & Rosemberg, 2016; Hergenrather et al., 2009). What we learned from extracting and analyzing these data is as follows.
First, we noticed that there was a tendency to downplay the role of photos in the way photovoice trajectories were conducted and reported on. Their role was often limited to either assisting participants in verbalizing their experiences or supporting a particular narrative story line evolving from the data (Coemans et al., 2017; Pain, 2012). Researchers also tended to use textual data from the interview transcripts or group discussions as their main source for analysis and interpretation (Catalani & Minkler, 2010). Photos themselves were seldom analyzed based on their intrinsic qualities such as aesthetic value and composition.
Second, we observed that analytical attention, and the description thereof, mainly focused on the interpretation phase, in which the photos are read in conjunction with the narratives provided. Other phases in the research process in which important analytical decisions were made did not seem to receive much attention in the reports. We therefore identify a need to further explore these different layers of analysis and their potential influence on photovoice research projects. Based on these findings, we argue that photovoice methodology would benefit from an analytical makeover.
Research Objectives and Questions
Building on the work of respected visual research scholars (Pauwels, 2015a, 2015b; Rose, 2012, 2016) who push toward methodological progress in visual research, this study aims to develop several question matrices to support photovoice researchers in developing a more comprehensive approach to analysis that moves beyond narratives and beyond a phased approach to conceptualizing photovoice studies. In line with the original aims of photovoice (C. Wang & Burris, 1997), we pay particular attention to analytical layers related to the critical, participatory, and empowerment principles that underpin photovoice methodology. The overall research questions guiding this study are as follows: What are the potential dimensions in a photovoice project that demand researchers’ analytical attention? What sort of supportive questions can be proposed to guide photovoice researchers in developing an analytical plan for their photovoice research studies? How do the choices we make in terms of producing, modifying, interpreting, and disseminating findings in photovoice studies influence our analytical gaze?
The following sections provide insight into both the development process of the supportive question matrices and their structure.
Development of the Question Matrices
C. C. Wang (1999), the original developer of photovoice methodology, introduces a three-pronged approach to understand the influence of photos and identifies the following dimensions to consider in the analytical process: “(1) the production of the images, (2) the reception of the images and the meanings attributed to them by audiences, and (3) the content of the images themselves” (p. 186). These dimensions seem to match the suggestions of other expert scholars in the field of visual research. Rose (2012), a cultural geographer, proposes to approach analysis from three different sites: (1) site of production, which is where an image is made, (2) site of images, referring to the visual content, and (3) site of audiencing, which is where an image is received by spectators. In a more recent edition of her work (Rose, 2016), Rose adds a fourth site, the site of circulation, outlining where and how an image might travel. In this study, we consider Rose’s (2016) site of circulation as part of the site of audiencing. These dimensions proposed by C. C. Wang (1999) and Rose (2012) form the basis for the rows of our question matrices. Rose (2012) further divides each site into three modalities: a technological modality, a compositional modality, and a social modality. These modalities form the basis for the columns in our question matrices. In this article, technological modality refers to the technologies used to make, modify, communicate, and display a photo and the technical training and skills of participants. When analyzing the compositional modality, researchers can consider not only the specific material qualities of a photo but also the organization of training, story lines, and the materials for dissemination. As a photo is seen as a representation of reality, its perception is influenced by the contexts of production and reception (Sturken & Cartwright, 2001). Here, these contexts are referred to as the social modality. They may encompass the broad range of economic, political, cultural, and historical circumstances that influence how a photo is made, the ethical values and belief systems of participants, and the identities of the photo-taker, the subject, and the audience.
The matrices were further refined based on our reading of relevant academic resources on visual analysis and visual methodologies (Kress & Van Leeuwen, 2006; Lester, 1995; Margolis & Pauwels, 2011; Parsons, 2002; Pauwels, 2015b; Siegesmund, 2004, 2015; Sturken & Cartwright, 2001; Van Leeuwen & Jewitt, 2001). We identified key analytical issues to consider and matched each with an appropriate analytical question to ask. Some of the questions were identified in the work described above (Rose, 2016). The question approach was chosen to promote critical thinking and reflection, stimulate in-depth and comprehensive exploration of the subject matter, generate new analytical ideas, and raise awareness about the potential benefits and risks of particular analytical decisions.
A draft version of the question matrices was commented on by three experts in visual research and iteratively revised through discussion. The experts provided open-ended comments on the content included in the question matrices to ensure the comprehensibility of the questions introduced and to identify any issues with the definitions or wording of the different dimensions discussed. The draft also received feedback from the qualitative inquiry research team from the SoMeTHin’K (Social, Methodological and Theoretical Innovation ’Kreative) at our university and several delegates of the 3rd European Congress of Qualitative Inquiry during an oral session.
In what follows, we describe the analytical purpose of the questions populating the matrices, and how we, as researchers, have analytically responded to the questions proposed. The core analytical dimensions identified as relevant for photovoice methodology are illustrated with some examples from previously conducted studies, mostly drawn from our own research.
Structure of the Question Matrices
Site of Production
Site of production refers to where a photo is made in photovoice projects. Central to the analytical gaze for this site is the question of how a photo is taken. The site of production can be divided into two parts: preproduction and production. Preproduction is related to the planning and preparation for shooting (mostly training), and production is the actual shooting of photos in the field. An overview of the important questions to consider in the site of production is listed in Table 1.
Question Matrix 1 - The Analytical Questions to Consider in the Site of Production.
Site of production—Technological modality
At the technological preproduction level, analysis can intentionally or unintentionally be influenced by participants’ expertise and training in photography. Familiarity with photographic devices and knowledge about photography can be advantages in photovoice projects, as participants can make use of these techniques to better portray their lives, for example by adjusting the shutter speed to emphasize certain aspects in a photo. Depending on the expertise of the participants, technical training could be provided to familiarize them with the basics of photography and composition. Such training can be an empowering process, as it may provide new perspectives for participants and open up new possibilities for them to express themselves and portray their living conditions. For example, the option of using metaphors gives participants an alternative means of expression to capturing the exact scene. However, the downside of technical training is that it may alter participants’ naturalistic practices and styles of representation, which may prevent researchers from seeing particular layers (Harrison, 2002, as cited in Catalani & Minkler, 2010).
In the production part of the technological modality, the analytical focus is on the technologies used for taking the photo. First and foremost, it concerns the question of who takes the photo. Photovoice is a participant-generated photo-producing technique, but this does not mean that it is the participant who takes photos at all times. Sometimes, a participant may instruct someone else to take a photo while he/she is the subject of the photo (Pauwels, 2015a). Figure 1, taken from a photovoice project exploring Asian international students’ adjustment experiences in Flanders (Q. Wang & Hannes, 2014), shows a participant playing with a toy gun. The participant himself appears in the photo, which suggests that it was taken by a third party. This raises the questions of who owns the photo and what this might mean in the context of intellectual property, as the photographer is neither a researcher nor a participant here.

Figure 1 - Playing with a toy gun.
The technologies used, such as the choice of camera, can influence the composition of the photo taken and therefore have an impact on the analysis. For example, it makes a difference whether the camera allows settings such as the shutter speed to be adjusted. A fast shutter speed can capture motion sharply, as if it were frozen, whereas with a slow shutter speed it is more likely that moving subjects will be blurred or that certain parts will not be sharp enough to be fully visible. Figure 2 from the study of Asian international students (Q. Wang & Hannes, 2014) shows a person playing in a band. Because of his movement, his face is blurred and barely recognizable. When the face is blurred, our attention is naturally drawn to the rest of his body and the surroundings. Some participants use this strategy purposefully to protect the subjects’ identities (Hannes & Parylo, 2014; Wang, 2020).

Figure 2 - Competition of bands.
Site of production—Compositional modality
On the compositional preproduction level, the analytical focus is on the organization of the training: It could be provided in a one-way lecturing and/or interactive style, an individual coaching and/or group style, or a face-to-face and/or distance learning style. Often training is organized in a one-way manner: Researchers deliver the information face-to-face in front of a group of participants. There is a potential risk that participants may not fully understand what has been delivered. Providing interactive sessions, combined with individual coaching and possibly an online forum, could potentially introduce a deeper level of engagement and understanding.
In the production part of the compositional modality, the analysis of a photo may be influenced by whether and how a photo is staged. Figure 3, taken by a participant in a photovoice project conducted by Johnson and colleagues (2012), is a good illustration of a staged photograph. In staged photography, the agenda of the participants is put ahead of that of the subjects (Shanidze, 2016). In this case, the participant lined the children up on the wall. As the children appeared quite stiff in front of the camera in the first few shots, the participant instructed them on how to behave in the photo. By contrast, the subject of a photo can be photographed in his or her natural behavior or environment, paying little attention to the presence of the camera. This kind of photo is often known as a candid photograph. An example is Figure 4, which depicts some kids playing on a sloping wall and was taken by a participant in our study on the adjustment challenges of South American international students in Flanders (Q. Wang et al., 2018). Asking for consent before taking a photo can affect the spontaneity of the subjects (Q. Wang & Hannes, 2014), which is why many prefer candid photography.

Figure 3 - “Silly” group photo of residents of orphanage in Njabini, Kenya.

Figure 4 - Kids can do dangerous things here and their parents will just say it is okay.
Also, the relation between camera and subject can be complicated, with different factors influencing how a photo is taken. The angle and position of the camera in relation to the subject create different types of views: bird’s-eye view, eye-level view, worm’s-eye view, frontal view, side view, and back view. If the maker of a photo shoots something from a bird’s-eye view, he or she looks down at the subject, which might suggest a position of symbolic power over, or even condemnation of, the subject. If the photo-taker uses a worm’s-eye view, he or she looks up at the subject, which may give the subject a kind of symbolic power over the photo-taker and the potential viewers. Equality of power is shown in the eye-level point of view. Frontality invites involvement with the subjects, which is nicely shown in Figure 3. Profiles and backs would probably suggest a sense of detachment, as illustrated in Figure 4 (Jewitt & Oyama, 2004). Training in the use of cameras might influence whether participants choose the camera angle and position intentionally or not, based on what they have learned about the impact of these stances. Analyzing camera angle and position in combination with narratives may be much more informative than analyzing photos on their own.
Site of production—Social modality
In the preproduction part of social modality, the focus of analysis is on training in ethical issues. As participatory action research, photovoice disrupts the traditional researcher–researched power relationships, allowing the participants to decide what is useful and important to represent in the photographs (Harley, 2012; Prins, 2010). As the participants are the ones in control in the field, ethical training is especially important for them to be able to judge the potential risks and protect the subjects (Hannes & Parylo, 2014). It could, for example, include information on privacy, safety, and ownership of the photos, and a discussion of potential solutions for shooting relevant photos in the absence of informed consent (Evans-Agnew & Rosemberg, 2016; Wang, 2020). Ethical training can influence the way participants take photos. For instance, participants could become creative in their strategies for collecting photos to avoid having to ask for consent (Q. Wang & Hannes, 2014; Q. Wang et al., 2018). They may choose to photograph scenery, parts of the body, themselves, and their friends because they are unwilling or uncomfortable approaching strangers (Hannes & Parylo, 2014). These alternatives for bypassing consent may work well under the ethical rules. However, they might limit our understanding of certain issues that might be best captured by showing the faces of certain subjects. In that sense, photovoice researchers need to be aware of the impact of ethical training on participants and of how training should be provided to better serve both the ethical rules and the research objectives.
As to the production part, the analytical focus of social modality is on the identities of the photo-taker and the subject and on the broad context of taking the photo. Social identities such as the gender, race, sexuality, and class of the photo-taker and the subject can be an important element to consider when analyzing a photo. The relations between the photo-taker and the subject and the time, place, and reasons for making the photo are also factors to be taken into account. Producing photos in photovoice projects is an act linked to broader social, economic, political, cultural, and historical influences. For example, in a study exploring learners’ perspectives on, and the changes brought about by, attending literacy classes in the two rural villages of Colima and Rosario de Mora (Prins, 2010), participants encountered suspicion and criticism from the local villagers when taking photos, making them feel vulnerable and embarrassed. Suspicion was associated with the local political environment: Political repression and the civil war had instilled fear and distrust toward outsiders among the local villagers. Suspicion was also brought about by the locals’ belief that photography was a tool for sorcery. As cameras were not affordable for the local villagers, photo-taking was uncommon in these villages. Therefore, there was a prevailing social norm that people who took photos were considered abnormal. This led to criticism of the participants. When designing photovoice research, researchers should consider the contextual factors that might cause problems and obstacles impeding the production of photos.
Site of Photo
Site of photo can be understood as the content of a photo. Content is an umbrella term that may include many different elements such as the visual symbols through which we can read meaning into the photo, the intrinsic art and design-related qualities of the photo, the potential missions, and the cultural patterns revealed in the photo. Site of photo explores the following questions: How is the photo modified? What does the photo show to the research team? The same three modalities could be looked at in developing questions in the site of photo (see Table 2).
Question Matrix 2 - The Analytical Questions to Consider in the Site of Photo.
Site of photo—Technological modality
On the technological level, one could ask whether and how traces of modification of a photo influence analysis. Modification, also referred to as postproduction, can be understood as adjusting the photo and preparing it for use in certain contexts, for example by (1) removing certain parts or elements (e.g., blocking or blurring), (2) replacing or substituting elements (e.g., replacing certain objects with others), (3) amplifying or reducing elements (e.g., manipulating the size, color, or intensity), or (4) emphasizing or schematizing elements (e.g., cropping, drawing, or taking notes on a photo for emphasis; Hook & Glaveanu, 2013). Modification can be applied for different purposes, such as to emphasize certain aspects of a photo or to fit a particular affective dimension of experience. For instance, in a project applying a sensory approach, especially using photos, to explore the relationship between people and their living environment (Coemans et al., 2019), Annemie Moriau made a photo collage (see Figure 5) out of the photos of buildings she had photographed in a Belgian city. The original colors of the photos were removed, and the black and white tone was used to create a silhouette effect. The extremely bright light rendered the details of the photos less visible, which emphasized the shape of vertical buildings and their windows. She compressed the width of the photos, making them thinner so that they were better able to capture her feeling of being squeezed between buildings and the idea that the city had to be read vertically instead of horizontally (Coemans et al., 2019). In photovoice projects, modification does not always remove or change information; it may also add information and therefore increase our understanding of what exactly is at stake.

Figure 5 - Photo collage made by Annemie Moriau.
Site of photo—Compositional modality
In the site of photo, compositional modality focuses on both the symbols and the intrinsic qualities of the photo itself. The task of a symbolic or semiotic reading of a photo is to read the symbols (signifier) to identify the meaning assigned to them (signified). The layer of denotative meaning addresses the question of “what, or who is being depicted here?” These are usually visible characteristics. Symbols could also convey a more abstract layer of connotative meaning. This layer deals with the ideas and values expressed through what is represented and the way it is represented (Van Leeuwen, 2001). Take as an example Figure 6, from our photovoice study investigating South American students’ acculturation experiences (Q. Wang et al., 2018). On the denotative level, it shows four people holding four cups of coffee and making a “cheers” gesture. In the background, we can see a bag of snacks, two knives on a cutting board, some cups, a table, a chair, a refrigerator, and kitchen cabinets. On the connotative level, the photo shows that these people are having a get-together in the kitchen and celebrating something, which indicates that this might be a photo depicting social network and friendship. Besides the symbolic layer of a photo, interpretation is also informed by the qualities that shape a photo. Analysis based on these qualities, also known as qualitative reasoning, provides a critical approach to comprehending a photo (Coemans et al., 2019). This analytical lens focuses on the elements of art and principles of design (Coemans et al., 2019; Eisner, 1994; Siegesmund, 2015). The elements of art can include line, shape, color, space, and form. When these elements are combined or related to each other, they reveal interesting patterns of structures that are known as the principles of design. They may encompass balance, emphasis/center of interest, contrast, repetition/variation, and pattern. The organization of the elements of art and principles of design can signify meanings, give visual effects, generate a sensory experience for the viewers, or evoke a particular somatic response. The sensory experience one feels is different from simply knowing the meaning of a photo. It calls for one to feel a photo’s qualities and experience its “presence and power” (Smith, 1995, as cited in Parsons, 2002, p. 26). Analyzing the quality layer of Figure 6 reveals a symmetry created by four arms (lines) and four cups of coffee (shapes) that brings balance to the photo and provides us with a sense of stability. The vertical and horizontal lines reach toward the middle, directing our attention to the center of interest of the photo: the four cups of coffee. The round shape of the cups, the coffee inside, and the similar posture of the hands create a pattern. These generate a feeling of unity and solidarity, reflecting the relationship between these people. Analysis through the quality lens opens up a new space for understanding photos and provides an additional layer of information (Coemans et al., 2019). It may sometimes reinforce, as in this case, or contradict the meaning revealed in the symbolic layer of the photos.

Figure 6 - Opportunities to share my culture.
Site of photo—Social modality
The last level in the site of photo is social modality, focusing on the social missions to be achieved and the cultural patterns residing in the symbols and qualities of a photo. We can link symbols and qualities with social missions by analyzing the particular purposes, the intended audience of a photo, and how the symbols and qualities of the photo contribute to the missions. Certain aspects of participants’ culture, such as their cultural patterns and norms, might be revealed in the symbols and qualities of a photo. Take Figure 7 for example. It is a photo of a card wishing people a happy Chinese New Year, taken by a participant from one of our photovoice projects (Q. Wang & Hannes, 2014). Based on this photo, we suppose it is part of the participant’s culture to send cards during the Chinese New Year festival. The choice of the red color for the card symbolizes happiness and blessings in his culture. Decoding the symbols and qualities into a participant’s own or local cultural patterns can reveal the broader cultural context of a photo, which might be particularly interesting for photovoice researchers.

Figure 7 - Sending Chinese New Year wish card to family.
Site of Audiencing
An actual photo created by a participant might in the end be used to spark discussions on a phenomenon of interest. Site of audiencing is therefore where a photo is disseminated and received by the participants, the research team, and the public. It deals with the questions: How is the photo interpreted and disseminated? What does the photo do to the audience? Audiencing in photovoice research can be divided into two categories: the dialogue between the participants and the research team in interviews or group discussions and the dialogue with the public. Three modalities can be discussed in the site of audiencing (see Table 3).
Question Matrix 3 - The Analytical Questions to Consider in the Site of Audiencing.
Site of audiencing—Technological modality
In terms of technological modality, the analytical focus for the participants and research team is on the way a photo is discussed with the participants. Hence, it is important to consider whether the photos are selected by participants, researchers, or in a joint effort and to explore the criteria on which the selection is based. Another important element is the way photos are displayed for discussions: whether they are prints or displayed on a screen (e.g., laptop, smartphone). The color, size, and paper type (matte, glossy…) of the prints or the size of the screen may influence people’s gaze. The organization of discussions, whether in the form of focus groups, interviews, or a combination of the two, can also make a difference in the analysis. The advantage of focus groups over interviews is that they encourage participants to draw links between each other’s photos and ideas, and this may bring out interesting topics or reveal important issues. However, focus group discussions also pose challenges: It may be difficult to find a time and place that suits all participants, and participants may be more reluctant to reveal very personal or sensitive stories or responses to photos in a group (Murray & Nash, 2017; Pauwels, 2015a).
As to the public and research team, the analytical focus of technological modality is on the way a photo is disseminated among the public. First, the medium chosen for displaying, communicating, and disseminating the photos is important from the perspective of outreach. It could range from posters and exhibitions to various media including the internet, television, and newspapers. The next focus of analysis is the question of who designs and decides on the medium and who controls the dissemination phase: This could be participants, researchers, professional artists, members of the community, or a combination of different stakeholders. In addition, attention should be paid to how dissemination is organized, how photos and texts (i.e., titles and narratives) are displayed (whether as prints or in digital form on a screen), and how the medium influences the way photos and texts are disseminated and interpreted. Based on the reviews of photovoice studies (Coemans et al., 2017; Evans-Agnew & Rosemberg, 2016; Hergenrather et al., 2009), the exhibition is the most common medium chosen for dissemination. However, this medium is constrained by its physical location and is therefore often confined to local communities. Depending on the research goals, researchers can choose to exhibit online (e.g., Adams et al., 2012; Booth & Booth, 2003) or to disseminate via radio, television, or newspapers (e.g., Kwok & Ku, 2008; Necheles et al., 2007) when they wish to reach a broader audience.
Site of audiencing—Compositional modality
Regarding compositional modality, the analytical focus for participants and research team is on the story lines providing meaningful context to the symbols and qualities of the photo. Questions can trigger stories explicitly told by the participant about the photo. Discussion can also be generated around the ideas and values expressed by the participant and the significance of the photo to the participant (Pauwels, 2015a). Tacit knowledge, such as the way participants organize the relationships of qualities to construct meaning and express their feelings and emotions, and the sensory experience or somatic response they bring can also be important focuses when discussing and analyzing the photo. Here, tacit knowledge refers to what people know before they can put it into words. It might require researchers to think about interview questions that can access participants’ deeper layers of knowledge and understandings. In the study conducted by Coemans and colleagues (2019), participating artists experienced difficulties in explaining their creation experiences, as applying the quality lens had become an unconscious process. To help them articulate their intentional or unintentional process of making use of the quality lens, researchers adjusted the interview questions with the help of a professional visual artist. For example, regarding the photo collage discussed above (see Figure 5), the researcher asked, “the shapes, in particular the rectangular shapes and the verticality in your work, are striking to me. When did you get the idea to recognize with vertical lines?” and “another principle that your work evokes in me is emphasis, or rather the lack of emphasis. You let various elements in the photos fade, which creates a blurred image for me. What did you want to express with that?” The researcher redirected discussions to a more emotional layer of participants’ experiences. 
This suggests that the level of engagement with compositional modality depends heavily on how the interview guide is designed and how much emphasis the questions place on the qualities of photos.
For the public and research team, the analytical focus is on the composition of the research materials for dissemination. The selection of photos and texts used in these materials can be one element of analysis: How are they selected? Are they selected by researchers, artists, or participants, or through a joint effort among them? The inner relationship between photos, or between photos and texts, can be an interesting analytical point to focus on. The texts may take priority, with the photos considered an illustration of the texts, or vice versa. Another possible relation is that the texts and photos are considered equally important and are presented in interplay with each other to enhance meaning or stimulate reflection (Pain, 2012). Necheles and colleagues (2007) investigated how youth perceived health issues and advocated for health in their communities. In the dissemination phase, the participants selected some photos and came up with appropriate texts to communicate the findings. They compiled the photos and texts and designed several posters (see Figure 8) jointly with a graphic artist. This created unity in the posters by providing a balanced and holistic account of texts and photos. The texts communicated the main message and purpose, and the photos were vivid illustrations of the texts.

The three posters displayed at the California Science Center and student-created captions articulating their intent.
Site of audiencing—Social modality
Regarding social modality, the focus for participants and research team is on the identity formation of participants. More specifically, the focus is on how meaning is co-constructed in the dialogue between participants and researchers. When responding to a photo, participants and researchers have to refer to their referential systems to make connections. The referential system includes knowledge frames, cultural frames, ideas, values, and past experiences (Parsons, 2002). One example of how cultural frames can evolve during a discussion comes from a Chinese participant who discussed a photo of a man walking with a baby stroller in the street (see Figure 9; Q. Wang & Hannes, 2014). As the phenomenon of men taking care of their babies was not as common in China as in some Western countries such as Belgium, this participant expressed feelings of culture shock: “luckily I am not a Belgian, but it is a good thing, and we need to share something with girls.” This shows that empowerment through photovoice is not limited to changes in life circumstances. It may also appear through an internal change of awareness and attitude (Parsons, 1988, in Sadan, 1997). The narrative given by this participant suggests that he became more critical toward his own culture and more open to accepting new cultural values.

Men taking care of their babies.
In addition, the choice of location and time frame within which photos are discussed might influence participants’ engagement in the discussions. Discussing photos at home may bring more comfort and a sense of familiarity and safety for some participants compared to other settings such as an office, a café, or a community house. However, it might be the other way around for those who experience domestic violence. Saturday afternoons might be more engaging for participants than Friday nights after they finish school or work (Phelan & Kinsella, 2011). Perceptions might also change in the light of ongoing political or cultural events taking place in public. Researchers can take these practical arrangements into consideration when designing and analyzing photovoice research.
Furthermore, silencing or avoiding talking about particular issues in the dialogue between participants and researchers can be an interesting point to focus on. In a study by Meo (2010), the researcher reported that students from working-class and marginalized families tried to distance themselves from negative images and identities related to their disadvantaged living conditions by bypassing their economic situations in the discussions. However, particular traces in the photos, such as the poor condition of the walls and the use of curtains instead of doors, showed their deprived living conditions. This enabled the researcher to connect verbal story lines with elements that were silenced by participants but visible in the photos, without necessarily having to ask about them. When the narratives fall short, a reading of the photos from background to foreground becomes more relevant. This is particularly the case in research on trauma or violence, or on shameful or embarrassing topics for which people lack the vocabulary to discuss them openly. It also invites us to think about how to ethically relate to what is silenced.
For the public and research team, the analytical focus regarding the social modality is on the identity formation of the public. More specifically, it can be on the composition of the target audience of dissemination. The audience can include people with authority (e.g., policy makers), community members, nongovernmental organizations, and other stakeholders. It is possible to liaise with the photo-takers and, where possible, the subjects of the photos in the public dissemination phase of a photovoice research project (Wang, 2020). The social identities of the audience (e.g., gender, race, sexuality, and class) and their impact on the audience’s interpretations can be another element of analysis. The research findings disseminated may evoke new understandings and sensory experiences in the public, which can be captured through on-the-spot surveys and interviews. This is considered co-construction of meaning, in this case between the public and research team. Furthermore, the context of viewing could influence the interpretations of photos: Looking at photos in a quiet place such as a library or a community room may invite more careful and deeper thought than viewing them on a street corner, where people are easily distracted.
Also, notions about what is an appropriate photo to share publicly are partially culturally constructed, which can be an ethical challenge in terms of dissemination. For example, sharing a photo of a naked infant was considered normal for Vietnamese participants in a project investigating mothers’ experiences of infant settling (Murray & Nash, 2017). However, even with explicit consent from those responsible for the subjects, in this case the parents, researchers are advised to use the photos carefully and obscure them where necessary to protect the subjects from potential embarrassment and exploitation. The intended outcomes of the dissemination and the changes and transformations made on the community level can also be important elements to pay attention to, as they reveal people’s response patterns to the findings.
Discussion
Our question matrices have been developed as a tool that can support a more comprehensive and informed approach to photovoice analysis and highlight the analytical potential of often neglected dimensions. Our intention is to move the analytical framework in photovoice beyond a mere narrative approach and unfold the multilayered meaning of experiences as inventories in photovoice research projects. While the three modalities in our question matrices have been presented separately, they are in fact interrelated and can influence one another. For instance, particular choices made on the level of technological modality and social modality may influence the composition of a photo. Analyzing one modality in relation to other modalities therefore facilitates researchers’ interpretation and provides a richer account of the data available. The matrices are helpful for both beginning and expert photovoice researchers and should not be used as a strict checklist but rather as a guide to support good practice. The difference between a checklist approach and a guide approach is that a checklist is like an anchor, providing a list of items to comply with when conducting research, while a guide acts like a compass, giving researchers a direction. Our matrices are by no means a tool designed to routinize or instrumentalize researchers’ thinking into a series of prescribed questions to follow blindly in their photovoice projects. Rather, they provide clarity about the options when conducting research. The responsibility for choosing the ones appropriate to a specific research project remains with the researchers. Our matrices help researchers not only to evaluate potential analytical options but also to justify their choices. The matrices are meant to spark our imagination on how analytical choices work against or in favor of power mechanisms introduced in a photovoice research process. We discuss two elements that feed into this argument.
First, we have integrated into our question matrices the compositional modality, which refers to an interpretation of photos based on the quality lens. Compared with the more conventional strategy of analyzing the narratives and the symbolic layer of photos, it provides an alternative way of interpreting photos, focusing more on the “experiential, embodied response” of people (Rose, 2016). Elements of visual art and principles of design may disrupt old perceptions and allow deeper interpretations. They help us to enter the tacit knowledge level of participants (Polanyi, 1967). Analyzing through a quality lens enriches, extends, mutates, and challenges other complementary or competing interpretations of visual data (Coemans et al., 2019), an argument that requires formal testing on an empirical level. In addition, it places more analytical weight on the photos, which might be beneficial for people who have a limited command of the spoken language or are hesitant to speak.
A second important consequence of proposing comprehensive matrices is that analytical attention shifts from the actual analysis and interpretation phase in a research cycle to an alertness to how decisions on all sites and modalities influence our constant flow of analytical thinking. Our question matrices highlight several aspects currently underinvestigated in photovoice. They also invite us to pay attention to what is not photographed, particularly when considering the social modality related to a project.
If photovoice is meant to embrace critical social theory and social constructionism as its underpinning philosophical stance, then we might fall short in bringing people’s voices out if we stick to a conventional analytical approach. In critical social theory, research should aim at diminishing social injustice as well as changing and transforming circumstances on individual and community levels (Savin-Baden & Major, 2013). Photovoice provides an opportunity for participants to co-construct voice by entrusting cameras to the less powerful, such as grassroots communities, and letting them have a “say” in public discourse and policy making (Luttrell & Chalfen, 2010). Social constructionist research emphasizes dialogue (Savin-Baden & Major, 2013), which means that participants and researchers are treated as equals and co-learners and are in a constant dialogue to exchange knowledge and understandings (Johnston, 2016). The different dimensions discussed provide a gateway through which researchers, participants, and the public can co-construct meaning throughout the process. For example, a dialogue can be integrated into photography training (site of production, technological modality), discussions of ethical issues (site of production, site of audiencing, social modality), or discussions of photos (site of audiencing, compositional modality).
Working our way through the analytical dimensions reveals particular risk areas for disempowerment as well. Researchers are vested with the power to give a particular type of voice to the participants as well as to remove certain possibilities from the process. Our analytical choices may well reinforce or reproduce the power mechanisms that we aim to avoid. For instance, as described in Catalani and Minkler’s review (2010), some researchers prefer not to include technical training, as they believe that the untrained eye of participants provides a rich source of data in itself. Training might interfere with participants’ spontaneous way of expressing themselves, because participants may start to take into account the expectations of others who present themselves as experts more skilled in photographic practice. This may disempower the participants. We argue that engaging participants in deciding which analytical options to adopt in different sites and modalities, rather than merely giving them a voice, reveals the true spirit of participation. Participants need to be aware of all the choices they have for the sites and modalities, for example, how to voice their concerns, how to frame things, and whether or not to stage a photo. This facilitates informed choice and speaks to the idea of power with the people. Our role as researchers is to talk these options through with them. Instead of serving only our interests, we also work toward awareness raising and capacity building of participants, for example, by creating learning opportunities on the technological level, which empowers them to see things differently or more clearly; by providing ethical training on the social level, which gives them a sense of how to gather data under ethical guidelines; or by discussing both the symbols and qualities of their photos, which makes them more aware of their tacit knowledge and helps them learn from other participants and gain a deeper understanding of their photos.
Several authors have stated that reading photos based on their intrinsic qualities provides access to emotional layers of information that might not appear in a narrative (Coemans et al., 2019). In this way, interpretations resulting from compositional readings add to the story lines developed from the narratives provided. While different types of readings may complement each other, leading to a more comprehensive understanding, they may also unintentionally introduce a bias related to the overinterpretation of data. Readings that require specific skills and knowledge frames may shift the interpretation in a different direction than originally intended by participants. This raises an ethical concern that invites us to consider the credibility of the interpretations. Who speaks for whom, and is this legitimate? Also, do we potentially expose something participants do not wish to share? The fragile symbolic relationship between different members of the team should be carefully monitored for unintentional power mechanisms that might do harm to participants or keep hierarchies in place.
The question matrices have two limitations. One is pragmatic; the other is of a more scientific nature and relates to the trustworthiness of the findings. On the pragmatic side, the dimensions added in the comprehensive framework for analysis, compared to conventional analytical strategies, might increase the workload for researchers. In addition, a certain level of aesthetic academic literacy is required from researchers to be able to use all dimensions. It is therefore advisable to invest in the development of training that can be embedded in university curricula to prepare students for working in a multimodal scientific culture and society.
The matrices have mainly been developed with photovoice projects in mind and from a critical empowerment perspective. The questions adapted from Rose (2016) make them relevant for other types of visual research projects such as photo diary, photo elicitation, and drawing. However, their value has yet to be proven in such cases. The question matrices are the result of studying the literature. Their relevance has to be judged based on the application and replication of this framework to a variety of study topics approached via photovoice research. This will allow us to evaluate how the matrices can be optimized for further use.
Conclusion
The comprehensive question matrices presented in this paper support photovoice researchers in thinking their analytical options through and justifying their choice of particular analytical strategies. The examples provided are meant to raise awareness among photovoice researchers of the potential risks and benefits associated with each analytical decision made. First, the matrices are an invitation to researchers to increase transparency on the analytical decisions they make for others to learn from. Second, the question matrices provide a basis for discussing analytical options with participants and making well-informed decisions about what, how, where, and when to wear or share a pair of analytical glasses. They open up opportunities to think beyond analysis as an isolated phase in a research process. They particularly emphasize the social modality involved in critical, participatory research of a visual nature. As photovoice is a method that is continuously in development, we encourage researchers to add new dimensions and additional questions to the matrices and to consider them a living document in constant need of refinement as visual methodologies progress over time.
Acknowledgments
We wish to thank Professor Dr. Luc Pauwels, Professor Dr. Ching Lin Pang, Professor Dr. Richard Siegesmund, and Professor Dr. Lode Vermeersch for their constructive comments and suggestions to improve the article. We appreciate the helpful feedback received from the entire qualitative inquiry team of our research group. We are also grateful to the anonymous reviewers for their insightful comments on the previous versions of the article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This study was funded by the China Scholarship Council (File No. 201307650005) and a policy mandate from KU Leuven.
