Abstract
Gender and intersectional data are recognized as vital to addressing gender-based violence. We engage this thesis through a case study of a gender data project at the Colombia-Venezuela border. Coming from an underexplored vantage point in the literature, we trouble the assumption that more data are always better for advancing feminist objectives around GBV. We advance the concept of “negotiated refusal” to make sense of the decision of the project's frontline implementers to collect less data. We argue that the complex character of inequalities and the dynamic nature of context require flexibility in what gender and intersectional data should consist of and that top-down frameworks may ultimately prove counter-productive to gender equality efforts.
Introduction
On June 18, 2024, “CRAF’d,” a joint fund inspired by the UN Secretary General's Data Strategy and made up of the European Union, Finland, Germany, the Netherlands, and the United States, launched a 3 million dollar open call for projects using data to address gender-based violence in fragile and crisis contexts. 1 This call forms part of the “gender data revolution,” a theory of change that first emerged in international development approximately a decade ago that posits that more and better data on women and girls are imperative to achieving gender equality. Over the past several years, the gender data revolution has evolved, broadened, and crystallized into a “gender data agenda” that functions as a mode of governance, training the attention of development and humanitarian actors on data collection not only as part of monitoring and evaluation processes but as tantamount to all areas of sustainable development (Cookson & Fuentes, 2025).
Recently, there have been two notable shifts within the gender data agenda. The first relates to calls for more granularity and data disaggregation beyond gender to highlight intersecting inequalities (Badiee & Buvinic, n.d.), including the creation of frameworks to guide these efforts. The second has been an increasing recognition of alternative or “complementary” forms of data to fill gaps in official statistics, including to help generate the intersectional data solicited through the first shift. One such alternative approach is citizen-generated data (CGD), which involves communities or organizations representing them collecting data about the problems that affect them, including through the use of digital technologies. 2 While the goals and motivations of CGD vary significantly based on a myriad of factors (Crooks & Currie, 2021; D’Ignazio et al., 2022; Heeks & Shekhar, 2019), broadly speaking, these initiatives seek to produce more localized and empirically textured data in order to (re)direct attention and resources. These recent shifts constitute what some refer to as an “inclusive” or “intersectional data” approach.
Efforts to generate gender data have largely been understood either as “top-down” governance imperatives (e.g., Taylor, 2020) or “bottom-up” activist responses (e.g., Suárez Val, 2023), both of which are shaped by the prevailing assumption that more data is better. This article troubles this assumption through an empirical case of a citizen-generated gender data project we operated called “Cosas de Mujeres” (in English, “Women’s Stuff”) that generated gender-based violence (GBV) data during a development-humanitarian crisis. The case presents an example of frontline implementers deciding to collect less data despite requests from a range of development and humanitarian actors to collect more, including in cases where the data could have been used to better direct attention and resources.
In analyzing this case, we draw on the concept of “refusal” in feminist and critical data studies (Garcia et al., 2020) to advance the concept of negotiated refusal. Negotiated refusal helps us make sense of two data collection decisions: (1) to forego an opportunity to collect multiply disaggregated “intersectional data” because it was deemed neither reasonable nor obviously useful, and (2) to stop collecting location data that were useful but became unreasonable in a rapid change of context. In both cases of refusal, we negotiated as a team, with local collaborators and with donors, ultimately arriving at alternative actions. Through analysis of the motivations and constraints that shaped our data practices, this article complicates a top-down/bottom-up narrative about gender data. Moreover, it underscores that collecting gender data is as much a social activity as it is a technical one. In the context of growing interest in “inclusive” and “intersectional” gender data, and the development of technical frameworks to guide its generation, we ultimately argue for principles of flexibility: not all gender or intersectional data should be collected. For citizen-generated data projects in particular, data collection priorities should be defined iteratively, mediated by what is useful and reasonable in a particular context.
The first section of this article reviews the gender data revolution as it relates to GBV data, discussing how its guiding thesis has been promoted, critiqued, adapted, and evolved. The second and third sections introduce our empirical case, “Cosas de Mujeres,” and outline our action research approach and data sources. Section four presents our findings in two parts, focusing on intersectional data and location data, and considering these, section five advances the concept of “negotiated refusal” to make sense of our decision to produce less rather than more data. We conclude with implications for both scholarship and practice.
The Gender Data Revolution: An Evolving “Measurement Imperative”
In 2012, then U.S. Secretary of State Hillary Clinton called for more and better data on the lives of women and girls to catalyze progress on gender equality. This spurred a series of high-profile financial and partnership commitments to close gender data gaps in alignment with the United Nations Sustainable Development Goals (SDGs) (Rogers, 2016). This call built on what is now over a quarter century of feminist and women's organizing for improvement in the generation and use of gender statistics in development (Azcona & Bhatt, 2020). Many of the resulting efforts have revolved around generating cross-comparable quantitative data for tracking progress on the SDGs, prompting some to critique a “governance by indicators” approach (Taylor, 2020). Notably, the gender data revolution emerged within a broader accountability and evidence-based policy regime in international development, one informed by an “audit culture” in high-income countries funding development work (Eyben et al., 2015). At the same time, the adoption of the gender data agenda also reflects more grassroots concerns with the need to generate visibility of problems that otherwise go ignored and unaddressed across many different spheres of social, political, and economic life.
The gender data revolution as it applies to GBV builds upon a long history, pre-dating the SDGs, of efforts by women's rights advocates to generate and use different forms of data to bring the then-largely hidden problem of GBV to light and to help women to understand that they were not alone. Among many other initiatives, this includes qualitative research endeavors of the 1970s to document domestic violence (Leung et al., 2019), efforts of activists in Mexico to count femi(ni)cides in Ciudad Juárez that the government systematically denied (Benítez, 1999), and larger scale quantitative and population-based prevalence studies by the World Health Organization and others in the 1990s and early 2000s (García-Moreno et al., 2005). Where such efforts were previously siloed and on the margins of the development agenda, the establishment of the SDGs and calls for a gender data revolution normalized gender data collection as a development priority.
Today, calls to “urgently close the data gap to end violence against women” abound (UNECE, 2019). The United Nations Economic Commission for Europe, for example, asserts that while “having the full picture is crucial for effective action to end violence against women … efforts to address this critical sustainable development and human rights challenge remain severely hampered by lack of data” (UNECE, 2019). This theory of change drives a range of data efforts, from an Ernst and Young-funded initiative that uses data analytics to “eradicate gender-based violence in Kenya's urban slums” (Ernst & Young, n.d.), to the inclusion of improving data and evidence on violence as a key activity in a global plan of action on GBV adopted by the 193 Member States of the 69th World Health Assembly (World Health Organization, 2016), to the most recent iteration of the U.S. government's strategy to eliminate GBV worldwide, of which advancing gender data is a core pillar (U.S. State Department, 2022). The Women, Peace and Security agenda, meanwhile, sees data collection on serious issues such as conflict-related sexual violence as imperative to better outcomes, even while recognizing that these data are extremely difficult to collect (UN Security Council, 2010).
The mandate to generate more data on gender inequalities has also given rise to various critiques. The mandate has largely (though not exclusively, as discussed below) focused on the production of quantitative data that are treated as objective and comparable across contexts and thus lend themselves to measuring progress. Many critics have argued that quantitative indicators only provide a façade of objectivity and that their very construction rests upon the subjective application of norms and ideas about what matters. Some have theorized data collection mandates from elite institutions as a “measurement imperative” (Buss, 2015) constituting a governance regime shaping gender politics and development practice. They raise several problems with quantitative governance. The “seduction of quantification,” for example, masks the underlying drivers of gendered inequalities (Merry, 2016). This is because indicators are the result of a process of simplification, which “results from the exclusion of social dimensions that cannot easily be translated into categories, not because they are unimportant, but because they are rather complex and fluid” (Liebowitz & Zwingel, 2014, p. 356). Development interventions that appear objectively successful when impact is measured on a select handful of indicators may be masking unjust gender dynamics that are otherwise revealed through the use of a broader range of information and data-gathering techniques (Cookson, 2018; Mills, 2016). This all has important implications for gender and development practice because funding and ultimately policy and programming decisions get made based on what is measurable (Fejerskov, 2017).
Scholars of measurement and violence specifically have challenged the thesis that “if more are counted, they’ll count more” (Nelson, 2015, p. 41). Rather, projects of quantitative accounting are never as simple as counting the number of victims (or survivors) of violence (Nelson, 2015). Numbers can obscure different experiences, and there is vast underreporting due to factors such as stigma and inadequate institutions and referral pathways (True, 2020). This is true with regard to intimate partner violence and understanding women's experiences of trying to access support in its aftermath (Fuentes, 2020). For these and other reasons, feminist scholars have long emphasized the role of qualitative data for uncovering and explaining inequalities and as part of a broader conceptualization of methods of knowledge production (Ellsberg & Heise, 2005; Sen, 2020). Elsewhere, we have built on these calls, advocating for “widen[ing] the catchment of data that is considered important for understanding the impact of … violence, and for identifying financial, strategic, and programmatic forms of support that could help catalyze meaningful change” (Fuentes & Cookson, 2020, p. 14).
Such critical engagement appears to have had some purchase. Taylor (2020), for example, demonstrates how UN Women engages strategically with quantitative data collection mandates by harnessing the “language of numbers” to bring attention to dynamics of gender inequality such as unpaid labor and GBV. In making a call for better-disaggregated data across the SDGs, UN Women helped advance women's representation in the quantitative knowledge base from which many development decisions are made, while also recognizing the limits of indicators for explaining progress or stasis (Taylor, 2020). Efforts to leverage the tools and frames from hegemonic data epistemologies are also increasingly notable at the grassroots and local levels (D’Ignazio, 2024). Reflecting on the work of community-based femi(ni)cide observatories, Suárez Val (2023) has developed the concept of “strategic datafication” to make sense of the motivations behind data activism that seeks to mobilize the power and legitimacy that may derive from quantification, without reducing victims to numbers.
A recent shift within the gender data agenda favors greater disaggregation beyond sex or gender to highlight intersecting inequalities. This shift builds on the recognition of discrimination based on a range of factors (e.g., disability, race, national origin, property, political opinion) in the 2030 Sustainable Development Agenda (Azcona & Bhatt, 2020). The Inclusive Data Charter was founded in 2018 with 10 partner organizations (“Inclusive Data Charter Champions”) as “a global initiative to mobilize political commitments and meaningful actions to advance inclusive and disaggregated data.” 3 Its recent 2023 strategic review found an increased interest in intersectional approaches to data that include and extend beyond data disaggregation to analyses aimed at reducing inequalities within and between groups. 4
At the end of 2023, the Executive Director of Data2x, an organization founded to advance the gender data revolution globally, wrote in a blog post: “We all know that granular evidence on those features of exclusion that intersect with gender is the first step towards the design of effective policies that will empower the excluded, close gender inequality gaps and leave no one behind.” (Baptista, 2023). The post announced a new commitment in the organization “to demystify what it takes to form a well-functioning gender data system that roots intersectionality in a gender and development perspective and derives practical implications for effective policymaking in the Global South” (Baptista, 2023). Such a shift in an organization that has been foundational to the rise and reach of the gender data revolution is noteworthy, not least because it raises questions around shifting data goalposts: how much data is necessary to spur action on injustices? (Cookson & Fuentes, 2025).
In the spirit of recognizing the possibilities and limitations of different forms of knowledge production, in the past decade, there has also been a shift toward recognizing alternative or “complementary” forms of data. This includes passively generated “big data” that provide real-time insights into gender differences at a large scale (Vaitla et al., 2020), though relatively little of it is used to identify or address GBV (cf. Belotti et al., 2021).
More recent is a shift to recognize citizen-generated data (CGD), “a problem-focused data source that can complement and fill gaps in national statistics” and that “involves communities collaborating to collect data they need to understand and tackle a problem that affects them directly” (Global Partnership, 2018). The Global Partnership for Sustainable Development Data notes several advantages of CGD, including that it can foster public engagement and participation in more inclusive government decision-making, is typically lower-cost, can be real time, and can be more disaggregated (Global Partnership, 2018). CGD often makes use of digital technology to facilitate grassroots or crowd-sourced data collection including on sexual and gender-based violence (Cookson et al., 2023). A task force set up by the Global Partnership is currently working toward a set of recommendations around the adoption and use of CGD, in what appears to be a new iteration of “normalizing” a data collection agenda. While CGD very recently appears to be gaining some acceptance in high-level development discourse and practice, this emerging international agenda is predated by a rich history of subnational and community-level analog “data activism” around GBV, in which individuals take on the painstaking work of combing through media reports to monitor and track cases of femi(ni)cide (D’Ignazio, 2024).
There is also some trepidation about what CGD can achieve, particularly considering the power relations shaping data collection, governance, and (in)action (Crooks & Currie, 2021). Some note that over-indexing on the importance of CGD risks pushing responsibility for documentation of harms onto the very communities experiencing them (Crooks & Currie, 2021). Furthermore, the dynamics that CGD seeks to capture are often already known to the state, and CGD initiatives lean on the financial and time investments of marginalized citizens to map inequalities, while the state retreats and divests (Crooks & Currie, 2021). In these cases, the data produced through CGD are unlikely to lead to action because lack of data is not the critical barrier. Others note that the intended data users of CGD initiatives (service providers, government agencies, or international donors) may be reluctant to use the data due either to concerns or doubts over data quality or because of a priori devaluation of the communities doing the data collection and/or whose experiences are the focus of the data (Heeks & Shekhar, 2019). Feminist critical data studies scholars focused on GBV have shown why it is necessary to both acknowledge the significant contributions of local-level efforts to map or “countermap” violence through CGD initiatives, while also recognizing the risks, costs, and limitations of these data practices in the face of structural inequalities (D’Ignazio et al., 2022; McIlwaine et al., 2023).
Cumulatively, these more recent shifts in the gender data revolution embrace, at the highest levels of the gender data agenda, citizen-generated data and greater disaggregation toward what is increasingly referred to as an “intersectional data” approach. Much of the literature reviewed above treats gender data either as a “top-down” governance imperative or as a “bottom-up” grassroots activist response. In both cases, there is an implicit assumption that more data is better. These assumptions, however, present tensions that are under-explored in the literature. This article presents an empirical case that troubles these assumptions.
Cosas De Mujeres: A Citizen-Generated Gender Data Project
Cosas de Mujeres is a citizen-generated gender data project that operated in Colombia between January 2020 and August 2022. At the time, Colombia had received some 2.5 million Venezuelan migrants since 2015 in what became the largest recorded refugee crisis in the Americas (UNHCR, 2022, p. 5). It was also the site of conflict-fueled internal displacement of an additional 9.5 million people (Government of Colombia, 2023), constituting a “humanitarian-development nexus” (Hinds, 2015). Cosas de Mujeres operated in three cities: Cúcuta (pop. 711,715) on the border with Venezuela, Cartagena (pop. 914,552) on the Caribbean coast, and Bucaramanga (pop. 581,130), an inland city. The project was funded by USAID Colombia Transforma, Global Affairs Canada, UN Women, Turn.io, and Ladysmith (an organization cofounded by the authors).
The idea for Cosas de Mujeres (CDM) followed from the publication of an op-ed in the Washington Post written by one of the project's founders (Zulver, 2019). Dr. Zulver had received reports of GBV and an unmet need for prevention and response services. Upon questioning, local officials explained that while they too had observed the problem, they were unable to respond because of a lack of systematically collected data. In the words of one official, as a result, “perpetrators take advantage of the fact that [women] are almost invisible” (Zulver, 2019). While frontline humanitarian and development agency staff knew that GBV was widespread, they cited a lack of data as preventing them from acting. Upon discussion, we decided to develop a program that addressed the specific barrier to action cited by the service providers: a lack of gender data.
The development of CDM included a design period in which we consulted with intended data users about what they needed. Our consultations included program staff and lawyers at justice institutions, local women's rights advocates and civil society organizations serving women and LGBTQ populations, social workers, clergy at a Catholic church serving peripheral neighborhoods, local academics focused on GBV, NGO staff at clinics that provide women's health services, multilateral humanitarian and global development organizations, and foreign aid institutions. We also consulted migrant women (and some men, when they approached us) about their access to services and resources, including technology (e.g., phones, internet, data). Consultations took place where migrants gathered: the border crossing, health clinics and dispersal points for in-kind transfers, and a neighborhood where women engaged in sex work.
During this initial period of field research, we learned about various constraints service providers faced, in particular a shortfall of resources to meet demand. We also learned about the deeply challenging context migrant women navigated. After crossing into Colombia via pathways where sexual and gender-based violence was common, they arrived at a city and a system of social services that was entirely new to them. We found that they were often unaware of the services that did exist. Many arrived with children in tow, for whom they were solely responsible because their partners had migrated earlier and elsewhere. They were not eligible to work formally and as such scrambled to secure food and rent in overcrowded housing, often in quite peripheral neighborhoods. Women sold their hair and engaged in survival sex to make ends meet. Venezuelan women were framed in popular narratives as hyper-sexualized, a stereotype that informed the kinds of sexual violence perpetrated against them. At the same time, we were made aware of instances of Colombian women opening their doors to Venezuelan women in an act of solidarity. We also learned that many of them had access to a smartphone (either their own or through a family or community member) and were using WhatsApp to coordinate their journeys and stay in touch with family located elsewhere. 5
Ultimately, we landed on a project that sought to address GBV through three pathways: first, by using WhatsApp to provide women with information about where they can access services that prevent and respond to GBV; second, by generating anonymous data that can be used to inform a medium and long-term advocacy strategy for more effective service delivery; and third, by fostering solidarity among Venezuelan and Colombian women in host communities. 6 This approach aimed to avoid a “data extractivism” model that collects personal data to generate value that may not be experienced as a benefit to the population from whom the data were collected (Horst et al., 2024). Rather, we sought to meet women's needs in the short term, by providing them with information about existing services, as well as in the medium and long terms, by using the data to advocate for service improvements that would ultimately reduce vulnerability to violence and better respond when it was perpetrated. While our project had these interrelated objectives and pathways for addressing GBV, in practice, when there was tension between meeting a short-term need (e.g., a woman seeking out a shelter) and gathering data for advocacy, we always traded off data collection in favor of service provision (e.g., if a woman did not provide her consent to have the data she shared anonymized and recorded, this did not preclude her from receiving the service).
Following advice from a local feminist organization, we advertised the project as one addressing broadly “cosas de mujeres” (women's stuff) to reduce the risk that women would be made unsafe if found to be interacting with an antiviolence project. We only discussed the project's aim of addressing GBV explicitly when in direct conversation with service providers, donors, and with women themselves in safe spaces, such as when our community workers connected one-on-one with women and during community meetings and workshops. This ultimately had the advantage of enabling us to collect data and respond more comprehensively to complicated situations in which poverty, housing instability, care responsibilities, and lack of access to sexual and reproductive and other health services compounded to exacerbate vulnerability to violence (Maclin et al., 2022).
The project worked as follows. A team of local community workers disseminated project information through GBV workshops, posting flyers with the WhatsApp number on telephone poles and bathroom stalls in bars, speaking with people about it at service points such as health posts, beauty salons, and soup kitchens, and networking with community leaders. When someone messaged the WhatsApp number, they received an automated response explaining that CDM provided information about services relevant to women, asking for consent to share de-identified data, and asking how CDM could help. Afterward, a CDM telephone operator—typically a Colombian or Venezuelan woman living in Colombia, although at first it was our founding team members—took over the messaging. The conversation from that point on was between two humans. The content generated by these message threads was our data. From the message threads, and where women consented, we extracted anonymized quantitative and qualitative data on challenges women were facing and their perceptions of their own needs. We never asked for names, identification numbers, street addresses, or any other identifying information. When these were provided without our request, they were never recorded in our data sheets.
We routinely analyzed these data and shared them through workshops and “Gender Data Briefs” with the intended data users we initially consulted, plus others in the consortium of international organizations working on the refugee and migrant crisis response (GIFMM), the local police who received and could respond to GBV reports, and the national statistics office (DANE, 2021). The project eventually expanded to Cartagena and Bucaramanga where it operated until 2022. Ultimately, CDM met the criteria for a citizen-generated data project by being explicitly problem-focused, filling gaps in national statistics, and involving communities in collecting data on a problem that affects them directly. CDM challenges typical characterizations of citizen data projects because it was neither entirely community-rooted (e.g., data projects in D’Ignazio, 2024), nor was it a purely top-down “helicoptered in” initiative (Heeks & Shekhar, 2019).
Research Approach and Data Sources
This article presents a qualitative case study of Cosas de Mujeres. Case studies fulfill a range of unique functions in social science research. Significant among these is the identification of causal relationships and “strategic structure,” that is “how interaction effects of one kind or another influence options, processes and outcomes” (Widner et al., 2022, p. 4). Case studies thus document how preexisting contextual factors like environment or income influence outcomes, or similarly how practices of negotiation and contestation do (Widner et al., 2022, p. 8).
We adopt an action research approach to this case study. While action research encompasses a range of different models, the approach is broadly distinguished by a dual commitment to address practical concerns and contribute to social scientific knowledge (Koshy, 2005). It involves “learning in and through action and reflection” (McNiff, 2013, p. 24), including by “thinking carefully about the circumstances you are in, how you got here, and why the situation is as it is” (McNiff, 2013, p. 25). As “constructive enquiry” (Koshy, 2005, p. 9), action research is an attractive mode of knowledge production for research-practitioners, including at the intersection of technology and humanitarian action (Dekker et al., 2022; Holeman & Kane, 2020). Our engagement with CDM at the time of its implementation was practice oriented, though we brought our scholarly training to bear on it. We assumed lead roles in fundraising, context and user research, project design and iteration, research on technology considerations and requirements, network building at local, national, and international scales, responding to WhatsApp messages, analyzing the data, and disseminating findings.
This article draws on five sources of primary data in English and Spanish, including two years of (1) field notes; (2) internal project communication (e.g., emails, project documentation); (3) program policies and protocols; (4) external project presentations; and (5) interviews with eight CDM staff members conducted after the project was completed. Spanish language data presented in this article have been translated by the authors. We analyzed these data with a view to understanding how a data collection imperative plays out in practice.
Findings: Collecting Gender (and Other) Data for a GBV Project
An overarching finding is that we produced fewer data points, rather than more data points, and this was largely a conscious choice. This dynamic emerged in two ways—with regard to what is increasingly referred to as “intersectional data,” and with regard to location (geographic) data. We discuss both in turn through the lens of “negotiated refusal.”
Refusing “Intersectional Data”
The consultations we conducted during the design phase of our project yielded requests for 25 different data points (33 total requests, including duplicates):
Number of victims of violence; Norms and beliefs; Type of violence; Inadequacy of collective response (gaps in response?); Trafficking (“trata”); Location (“zona”); Perpetrator; Relationship status of perpetrator; Sought response services? (yes/no/why); Age (child/adolescent/adult); Sought services? (follow-up); What happened at the services?; Type of violence; Education level; Income level; Revictimization; Type of GBV; Men (number/experiences of); Mobility; Pregnant?; Children?; Service availability; Service quality; Migration status; Sexuality (LGBTQ)
Some of these data points were demographic in the sense of being identity related. These included age, education level, income level, migration status, and sexuality. 7 Others were specifically related to violent incidents: type of violence, location of the incident, relationship to the perpetrator, whether it was a revictimization. Others still related to women's experiences with service provision: whether they sought response services after the incident, what happened when they did, and perceptions of availability, accessibility, and quality. Yet others related to the woman's broader context: was she pregnant or did she have children, was she in a situation of trafficking, and where in the city was she located.
Many of these requests came with reasonable motivations. For example, age was useful to a legal defense organization because Colombian law afforded children and adults different protections. Age and gender were requested because of concerns that children and men could be left out of the humanitarian response. Trafficking was viewed as a pervasive issue hidden in plain sight, and thus was not being adequately addressed; data could bring it to light. We received requests for data on location and experiences with service provision to evidence what was already widely perceived to be an inadequate response, particularly in peripheral neighborhoods, and to better direct funding and efforts. The request for data on pregnancy status arose from widespread stories about obstetric mistreatment and violence against Venezuelan women trying to access maternity services in a deeply overwhelmed public health system. Requests for data on whether a woman had children reflected a recognition that “children change things,” including the need for income and care services. Notably, we were both asked to collect and asked not to collect data on migration status (and asked not to collect data on involvement in sex work), the latter for fear that collecting such data could render women more vulnerable. Many of these requests reflected a desire for what, in the words of a donor, were perceived as “practical data,” particularly that which could help identify “gaps in the collective response” (Field Notes, October 2019).
We ultimately accommodated only some of these data requests. Our decision-making process for which data points to collect was informed by several constraints, some of which were technical, others of which were social, and these shifted somewhat over time due to a rapidly evolving context. We originally decided to collect the data points depicted in Table 1, based on the listed rationale. Some of the data points were actively collected, such as age and location. Others surfaced through the data analyst's interpretation of the messages (e.g., type of violence).
Table 1. Data Points Originally Collected by Cosas de Mujeres.
There were several motivations driving our refusal to collect some data. The parameters around our data collection were iterative: they were mediated by constantly refocusing on what we perceived to be “practical” data, based on our interactions with women, service providers, and donors in the specific context. We wanted to collect data that were reasonable to solicit, given that we were collecting data from women in vulnerable situations, and useful for achieving both the medium- to longer-term aim of informing stronger GBV prevention and response services and the complementary, short-term aim of meeting women's immediate need for information about existing services. We determined that soliciting the maximum number of demographic data points related to women's identities was not compatible with these aims.
First, we focused on what was considered reasonable. A few factors motivated our decision to ask the minimum number of questions. One was prior experience in frontline GBV practice and established GBV response “best practice”: we were aware that the more questions asked, the higher the likelihood of losing the person because the line of questioning felt suspicious (Fulfer et al., 2007; Heron & Eisma, 2021). A CDM team member recounted this experience: …it so quickly became like, oh, we can't actually get all of those data points reasonably because you would end up just, like, [firing off questions] like, how old are you? Where are you located? What's your nationality? And as soon as you got into stuff like that around nationality, for example, it was like, ooh, then the person on the other end of the line is going to be like, why are you asking me that? Particularly in a context, if it's like they're, you know, they are undocumented…. You know … it was like, too much. And then, like, as if you're going to ask someone like, oh, you know, are you a sex worker? Have you been trafficked?… You can understand why organizations are interested in that kind of data, but it was really not clear. It definitely wasn't helping us meet the short-term needs. And then all of those data points actually sort of, like, became [about] trying to collect them. (CDM Staff #7 Interview)
This tension was notable, for example, when a woman wrote requesting information about where to access abortion services. In such cases, women often declined (refused) to answer any other identity-based questions. This repeated refusal was itself an important form of useful data (CDM Staff #8 Interview). Ultimately, it enabled us to improve the service we provided by iterating on our approach: if we asked fewer questions, we were more likely to reach the point in the messaging thread where we provided the needed information, and to build trust for a repeated interaction should a woman need further help.
We were also aware of the risks women were taking by discussing violence on the phone. This surfaced in two ways. One was keeping women engaged in a line of questioning for too long when the perpetrator could be nearby. Our earlier field research had made us aware that women were often living in over-crowded housing, including sometimes with perpetrators of violence. We also knew that they were sharing phones with family and community members. We wanted to avoid a dynamic where women felt compelled to answer strings of identity-related questions before their need for information was met.
We were also concerned about the risks posed by “data joining” and how these might shape what was deemed reasonable to collect. Data joining is a process whereby anonymized data from different sources are layered together in ways that together constitute personally identifiable information. Our awareness of the risks of data joining came from a separate stream of gender data work we were undertaking at the time, which focused on bridging the concerns and interests of technology companies working on “data for good” initiatives. Reducing the number of identity-related data points collected helped to minimize the risk of identifying specific individuals or communities.
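The logic of data joining can be illustrated with a minimal sketch. The records below are entirely synthetic and hypothetical; the field names, the second “registry” dataset, and the function `reidentified` are assumptions introduced only for illustration and do not describe the data or tooling used by Cosas de Mujeres. The sketch shows how records without names can be linked to named records when enough identity-related fields overlap, and how collecting fewer fields can remove the unique matches.

```python
from collections import defaultdict

# Synthetic, hypothetical records for illustration only (no real data).
messages = [  # "anonymized" records: no names attached
    {"id": 1, "age_band": "18-24", "neighborhood": "North", "children": 2},
    {"id": 2, "age_band": "18-24", "neighborhood": "North", "children": 0},
    {"id": 3, "age_band": "25-34", "neighborhood": "North", "children": 2},
    {"id": 4, "age_band": "25-34", "neighborhood": "South", "children": 1},
]
registry = [  # a second, hypothetical dataset that does carry names
    {"name": "Person A", "age_band": "18-24", "neighborhood": "North", "children": 2},
    {"name": "Person B", "age_band": "18-24", "neighborhood": "North", "children": 0},
    {"name": "Person C", "age_band": "25-34", "neighborhood": "North", "children": 2},
    {"name": "Person D", "age_band": "25-34", "neighborhood": "South", "children": 1},
    {"name": "Person E", "age_band": "25-34", "neighborhood": "South", "children": 3},
]


def reidentified(anon, named, fields):
    """Map each anonymized id to a name when exactly one named record matches on `fields`."""
    index = defaultdict(list)
    for row in named:
        index[tuple(row[f] for f in fields)].append(row["name"])
    matches = {}
    for row in anon:
        candidates = index.get(tuple(row[f] for f in fields), [])
        if len(candidates) == 1:  # a unique match singles out one person
            matches[row["id"]] = candidates[0]
    return matches


all_fields = reidentified(messages, registry, ["age_band", "neighborhood", "children"])
fewer_fields = reidentified(messages, registry, ["age_band"])
print(len(all_fields), len(fewer_fields))  # prints "4 0"
```

In this toy example, joining on three fields links every anonymized record to a single named person, while joining on one field links none. Real datasets behave less neatly, but the direction of the effect is what motivated limiting identity-related data points.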
In addition to grappling with what data seemed reasonable to collect, the team also considered what would be useful. One of the ways we made sense of what counted as “useful” data was whether the data were helping tell a story about unmet need. The following excerpt from one CDM staff member emphasizes this point: It seemed important to me [to collect the data we did] to the extent that this allowed us, let's say, to characterize who those women were who were writing to us … well they are women between the ages of such and such, they are women who have between one and six children … who are requesting these services. And there we would count how many services were requested in the month for legal assistance, and suddenly a month later there were no requests for legal assistance, but there were many requests for information on regularization…. Another month it was another healthcare service, for example. Those situations helped us understand a bit how those women's needs were changing, it helped us understand a bit what they needed…. So for women who wrote to us and told us, look friend, I have my three children here, I’m begging on the street, I haven’t been able to feed them, [the data collection prompts] allowed us to understand in a quantified way what I was hearing in a qualitative manner. (CDM Staff #5 Interview)
Emphasizing data on needs was not straightforwardly a matter of disregarding the importance of understanding “who these women were.” Yet this focus does contrast with GBV data collection approaches that seek out multiple demographic data points and solicit information about victimization (which women often must repeat). Such approaches are oriented toward generating a picture about the women and which identity categories they fall into—sometimes because these are data the donor is interested in. 8 To be sure, we did actively collect some demographic points that were significant in shaping service access and referral pathways (such as migration status or children). Yet the motivation driving our data collection efforts was problem prevention and response, rather than problem description. In this context and bearing in mind the constraints around the number of questions it was reasonable to ask, it made sense to prioritize data that immediately indicated what types of interventions might address the inequalities and injustices being experienced. When we collected data on age and migration status (which feature commonly in intersectional/inclusive data frameworks) this decision was motivated by recognition that some service pathways were age or migration status dependent. Other demographic data points, such as the presence of children (which does not feature in intersectional/inclusive data frameworks), were motivated by what we heard in our project design consultations: that women's vulnerability to violence and their service access was often mediated by their responsibility for children.
In some instances, these decisions (or “refusals”) regarding data collection required us to negotiate and push back with donors and service providers, as described by team members: Some donors required like a big chunk of data. So, for example, one of them had like a monitoring and evaluation team that asked for very disaggregated data on our beneficiaries, as they call them. So, for example, knowing how many men and how many women, knowing their age, knowing if they were part of the LGBT community or if they have disabilities or some sort of, like a lot of categorization of the population that we were attending. And … we don't collect as many data points on their demographics, on their identifiable information because we just want to know, like, the situation they're going through and if their needs and interests are being met, and if not, why not, and if yes why, why are they doing that? So, the thing is that I think we could negotiate and push back and just taking a clear stance on why we want to collect as few data points as possible … just the necessary ones. (CDM Staff #4 Interview)
A key distinction requires underscoring here. Top-down guidance for the collection of inclusive or intersectional data tends to provide an ever-growing formula of identity-related data points for disaggregation. 9 These data points commonly include sex and/or gender, race, and age, though they increasingly also include disability, sexual orientation and gender identity (or “SOGI”), citizenship, and sometimes ethnicity. 10 Other factors such as economic status (class), religion, geography, and language are only sometimes included. This dynamic is not unique to practice-oriented conceptual frameworks. Bentley et al.'s (2023) recent review of 172 scholarly research articles that self-declare as deploying an “intersectional approach,” found that these are dominated by “additive thinking, in which separate lines of inquiry (around gender, race, class, etc.) are added together” (Bentley et al., 2023, p. 2). In practice, this leads to a layering of selected individual or group identifiers, as though these were sufficiently explanatory for situations of inequality. For Bentley et al., one answer in scholarly data science research is to practice “articulation,” that is, to recognize the historical origins of inequality, engage in systems thinking, and practice reflexivity in relation to one's own positionality (Bentley et al., 2023, p. 13). These reflections are particularly useful for improving scholarly inquiry; the authors acknowledge that “recommending intersectionality as a solution in data-related policy and practice often lacks clarity” for practitioners (Bentley et al., 2023, p. 13).
Some data science scholars have sought to move intersectional data practices beyond buzzwords, including through recommendations for data minimalism and relationship-building with communities affected by data practices (D’Ignazio & Klein, 2020; Vannini et al., 2019). But as the gender data agenda at the highest levels of the international development system expands into new measurement imperatives around “intersectional data,” there remain some key tensions that matter for practice. What are the implications of a “top-down” call for an intersectional data approach in the “specific datasets or data contexts” (Bentley et al., 2023) within which a citizen-generated data project like Cosas de Mujeres operates? First, what information is required to identify practical solutions (in addition to identifying inequalities), and second, what limitations or guardrails to an intersectional approach may be required to minimize risk to the very populations from and about whom data are being collected? 11
Refusing Useful Data (Location)
One of the data points we were asked to collect was data on location. We considered these to be useful data because they had immediate practical implications. Donors and service providers had a sense that migrant women were settling in neighborhoods that were not currently served by international cooperation (Field Notes, November 2019). Many (though not all) of the services available were located in central neighborhoods in Cúcuta that tended to be higher income. 12
In our fieldwork, we noted that women in peripheral barrios “feel left behind by international agencies, who they say show up at other barrios but not theirs” (Focus Group Discussion #1, February 2020). The distance to services, the cost of transportation, and the lack of childcare centers all served as barriers to access: “everything is just too far” (Focus Group Discussion #2, March 2020). Donors thus asked for location data so that they could better target their investments. In our project dissemination efforts, we were accordingly encouraged to “work ‘outwards in,’” starting with neighborhoods that aren’t being reached by international cooperation (Meeting with Donor 1). In communication with our donors upon completion of our pilot, we noted: Resolving mobility barriers is beyond the scope of our intervention, but we foresee that the data we collect on this may be useful to other mobility-focused initiatives, and there may be some opportunity in a scale-up to link to these initiatives. (Email, Feb 2020)
In the initial months of 2020, the situation in Cúcuta was deteriorating. Service providers related that the women they were seeing were poorer and “more desperate,” with an increasing need for psychosocial services in addition to help with necessities (Notes, February–March 2020). Then the COVID-19 pandemic arrived in Colombia. The border with Venezuela closed on March 14, 2020, restricting passage between the two countries, and the government also instructed humanitarian responders to reduce service provision by half (Welsh, 2020). On March 24, the president declared a national lockdown that lasted until August 31 of the same year. Schools, workplaces, and public transportation were closed, and services vital to GBV prevention and response were also closed or had access restricted.
We immediately noticed the impacts of the lockdown on women's physical safety and economic security in our message threads, reflecting a broader dynamic observed elsewhere where hotlines were available (Rocha et al., 2024). Women expressed their fear of sleeping on the streets with their children due to their inability to pay rent or access shelters, concern about where they would find income or vouchers to feed their families, and concern about being in overcrowded conditions with abusers, including potential abusers of their children. Even when the Colombian President ordered local services to provide resources to women and children who were experiencing domestic violence, we still received reports from women and from grassroots organizations that when women attempted to denounce violence to the officials, they were told that “the services are closed,” “what you are denouncing doesn’t constitute an emergency,” “you have to put up with the aggressor because of the movement ban,” and “woman, try to create a peaceful environment, so your man doesn’t get angry” (Zulver et al., 2020).
In this context, we made two significant decisions with regard to data collection. The first was to stop proactively requesting location and age data. With so many services closed and/or shifting to virtual models, the team members operating the phone felt that location and age were no longer essential data points. Moreover, they felt that requesting these data extended the time that a woman was on her phone and in a messaging conversation with CDM and, in a situation of violence, increased the risk that she could be caught messaging in. By June 2020, our team was discussing a “data minimization” policy, and by July, our data protocol no longer included active solicitation of location data, though these data were recorded if they were provided proactively.
These decisions were not always straightforward. In particular, the discussions around location involved working through a tension: we knew that location data were among the most useful data points we collected. In terms of enabling the identification of a problem, we heard consistently from women that services were located too far away from them. Arriving there would require multiple bus rides, and they did not have the fare to bring their children along. They also could not leave their children behind, because they often resided in overcrowded boarding houses where predatory men also stayed. When we shared these data with one of our donors, they explored the possibility of sending mobile service units to peripheral/underserved neighborhoods. On this basis, team members who interacted more closely with the donors were reluctant to stop collecting location data.
Meanwhile, the team members who were engaging with women via messages and experiencing first-hand the sense of urgency inherent to these interactions advocated for a minimalist approach that prioritized meeting women's immediate needs and reducing risk wherever possible in a deeply challenging situation: …in the course of just the aftermath of the pilot, Covid hit and I was the one who had possession of the phone and I saw [the messages from women experiencing GBV] in real time… …I remember that, thinking, why am I asking for her location? These services have all, for the most part, moved online. And this is extending the interaction. And she's potentially right now sheltered in place with a perpetrator, so we're no longer going to ask that. And so at that time … we made the decision to stop asking, what neighborhood are you in? (CDM Staff #3 Interview)
Understanding this decision requires appreciating that the work was not only technical—it was also deeply affective, influenced by human emotion, intuition, and connection. The urgency of women's situations came through in the qualitative content of the messages and influenced the data collection decisions that were made. The affective nature of the work is evident in an interview with a CDM digital platform operator, who was responsible for answering messages and systematizing the data. In reflecting upon what it was like collecting data through the platform, she said: I liked being able to talk to women everyday, learn about their lives, their thoughts, their feelings. Many of them opened up and told me about all kinds of situations…. It was like, let's say, suddenly removing all the barriers and getting to be closer, because it really is something that I also wanted to know about them, or that they knew that they are not alone, that I could understand them. And many of these women said nice things to me, others said “well, thank you.” I don’t know, it was nice really, I think nice is an understatement, but it was satisfying to be able to say “I’m here if you need to talk, I’ll listen to you.” (CDM Staff #5 Interview)
Appreciating the affective nature of the work—and how there was a sense of relation to the women who messaged in—helps to explicate the reflex to collect less data, rather than more. The argument for doing so certainly was not without context. The team members who operated the digital platform and answered the messages had educational and professional backgrounds in GBV response and there was a respect for the value of this intuition. Ultimately, we agreed to pursue the data minimization approach, though this was not without recognition of the costs that decision might also eventually have for our longer-term advocacy strategy.
Developing Alternative Actions
The second significant decision we took because of the rapidly evolving Covid context was to collect more data on service availability, accessibility, and quality. We had always collected these data passively through the messaging service, for example, when a woman sent us a message reporting that a service was closed or had refused her service. When we launched CDM, we collected data on services actively through what we referred to as a “ground-truthing” or “service-mapping” exercise, in which we assembled a list of available services and contacted these directly to verify that their advertised services were accurate. We also partnered with local civil society organizations to review and iterate on this service map. We occasionally updated the list to reflect changes when we were made aware of them by service providers or our CDM staff.
Even before the pandemic, we had begun observing that service provision was erratic and unreliable. In early March, one of our team members noted: I think we need to be programming in “re-ground truthing” service offerings (and adding new ones) every 2 months if possible. Changes happen quickly with service providers (quality of service; hours of operation; which services are offered to for free, etc). We can’t risk referring women to closed or poorly operating services. (Notes, February–March, 2020)
We were keenly aware at the time of the challenges service providers faced in meeting needs that far outstretched their resources and capacities. After the Colombian government mandated service and travel restrictions, updates to the service mapping became even more urgent. Because of deteriorating conditions in the broader humanitarian crisis response, this urgency remained even when measures were loosened. The work of collecting information about service availability was undertaken by CDM community workers, who visited service providers in person, called them, and corresponded directly via WhatsApp with staff at those organizations. This work was also increasingly done by the CDM staff who operated the messaging platform.
When we were designing the project, we did not initially conceive of our “ground-truthing” work as a source of data. Yet as the project progressed, it became clear that the resulting service mapping was an invaluable source of actionable gender data. When we discovered that a hotline number was not in service, or some operating hours were inaccurate, this prompted an effort on the part of local team members to reach out and change it.
The shifts in context we witnessed in Colombia prompted an evolution in what we consider to be useful or “practical data,” and how it could be generated. …I think when people think of GBV data, they're thinking of surveillance data, like data on the forms of the violence. Where is this happening, how is it happening? And I think one of the things we saw…is how important data on the services is. Because … when services fail to respond to women's needs or don't have the resources available, it creates [a] culture of impunity and it normalizes violence…. there's constantly this dialogue of a lot of the service providers and international organizations, I'd say, of like, “oh, we need to change the culture, the normalization of violence that people have. Oh, it's these poor people who have normalized violence”. No, these services have normalized the violence as well, because they don't have the resources as well. Mostly service providers wanted to do a better job. And so I think that the service mapping part and the evidence we collected on that [was] a really important set of information. (CDM Staff #1 Interview)
While we were refusing to collect location data, we simultaneously shifted toward new data points. These new data, which revealed a largely erratic and unpredictable service offering landscape, shifted our gaze toward state and international actors. While not a perfect replacement, these new data were also useful, making their way into op-eds, webinars, and conversations with governments and donors.
Discussion: Negotiated Refusal as Critical Feminist Data Practice
Feminist critical data studies scholars have developed the concept of “critical refusal” to make sense of cases where data projects (or the individuals behind them) “talk back” to or turn away from hegemonic data regimes. Critical refusal is not (only or always) about saying “no” to an “elite” data imperative (Taylor, 2020). It is also deeply generative and “seeded with a vision of what can and should be” if data practices are to be geared toward human rights and social justice (Garcia et al., 2020). In both research and practice contexts, critical refusal unfolds by stopping harmful data collection or by “negotiating and developing alternative actions” (Garcia et al., 2020). 13 Critical refusal is also used to make sense of resistance “from below.” For example, where individuals or communities whose experiences with data regimes have historically been governed by logics of (non-consensual) extraction or surveillance say no—even where they have not been given the right to say no (Zong & Matias, 2024). 14 In their work mapping activists’ data practices in relation to femi(ni)cide, D’Ignazio et al. have also advanced the concept of “counterdata” to make sense of efforts to reject and contest state and other “official” data (or missing data) about GBV (D’Ignazio et al., 2022; C. McIlwaine et al., 2023). 15
Cosas de Mujeres maps in interesting ways onto the politics and practice of refusal as theorized in feminist critical data studies. In the case of this citizen-generated gender data project, we collected less data rather than more data—pushing back, negotiating, and charting alternative pathways around and through the measurement imperatives that govern the calls for data from different actors. In many ways, we refused the “intersectional” data call because we did not see the collection of multiple data points on women's identities as useful—or, even if useful, perhaps not reasonable given the circumstances of GBV in a humanitarian setting. Moreover, when the circumstances in which we were operating changed, we even stopped collecting a useful data point (location), pivoting toward data on service availability and quality.
Reflecting on the motivations and (shifting) constraints that ultimately shaped these decisions, we see Cosas de Mujeres as a case of negotiated refusal. As with critical refusal, negotiated refusal accounts for the deeply social (rather than merely technical) nature of feminist data practices. It builds on the concept of critical refusal in two ways. First, it carves out space for making sense of feminist data practices “from the middle”: in other words, from the perspectives of those who are tethered both to the communities who are the focus of a data collection project and to the broader international development ecosystem from which “top-down” agendas for data emerge. These varied positionalities meant that our team members understood (even if they did not always agree with) the logics and motivations governing requests for more gender data “from above,” including from donors and service providers. But because of our built-in collaborations with local grassroots organizations and women themselves, this also meant that we had access to and could be responsive to the data concerns and refusals emerging “bottom-up” vis-à-vis our own frontline team members and the communities Cosas de Mujeres sought to serve. Second, as a conceptualization of refusal “from the middle,” negotiated refusal accounts for the instances where data projects, or the people behind them, have some power to negotiate alternative actions. It is important to acknowledge that our ability to push back and to negotiate alternative ways forward with regard to data demands existed because we (the founders and designers of the project) did have some power. As other feminist critical data scholar-practitioners have noted, this option is not equally available to everyone: “[b]ecause refusal happens within these systems of power, people's options for refusal will depend on their relationship to power.” 16
As noted in the introduction, the gender data revolution is largely governed by an assumption that more data are better: that shedding light on inequalities will contribute to action to address those inequalities, including in the form of better policymaking and service provision. The expanding agenda around “intersectional data” hooks calls for more data points and disaggregation onto this theory of change. While we received requests to collect dozens of data points for Cosas de Mujeres, we decided not to accommodate them all, even if collecting these might have “closed a data gap.” Viewing our decision-making through the lens of “negotiated refusal” surfaces two important tensions associated with “intersectional data.”
First, our pushback against more data collection does not represent a rejection of the value of an analysis that accounts for gender and intersecting inequalities experienced by women in the three cities in which we operated. Rather, it reflects resistance to the assumption that “top-down” or formulaic data agendas will align with what is useful and reasonable to collect in a particular context. Due to our team's collective and varied vantage points as feminist scholar-practitioners and community social workers with experience in GBV advocacy as well as technology and digital data, we knew that immediate service provision, safety, and privacy had to take precedence over data demands, even if those were tied to medium- and long-term change goals. We said yes to some data demands, and no to others (without dismissing these outright), and instead sought to chart alternative pathways forward. We understood the motivations for, and were motivated to collect, several data points, including on women's identities (such as migration status), but these had to meet a threshold (which itself was iterative) of practicality/usefulness and reasonableness. In the case of the former, the question was whether the data could help inform better policies and service provision; in the case of the latter, the question was whether collecting the data imposed disproportionate burdens or risks on women themselves in this particular context.
This leads us to a second reflection. As we have noted elsewhere, there are important distinctions that are often glossed over between gender data and sex-disaggregated statistics, as well as between “gender data” and gender analysis. In her study of how “gender equality” has been taken up and implemented in international development, Ellerby argues that efforts at “women's inclusion” take a “technocratic shortcut” that ultimately fails to tackle the power and interests that create the conditions for gender and intersecting inequalities (Ellerby, 2018). An additive approach to data collection on an expanding number of identity categories falls short of qualifying as “intersectional data.” Rather, it risks amounting to a “shortcut” around the kind of power analysis for which critical and practice-oriented feminist scholars have long advocated, including using the lens of intersectionality (see Kabeer, 2014). While multiple data points about populations can certainly lend themselves to an intersectional analysis of inequality, collecting them at all costs would be deeply misguided and potentially harmful. So too would be the assumption that identity data—no matter how multiply disaggregated—is sufficient for moving beyond the description of inequalities to enabling the kinds of action citizen-generated data were designed to provoke. 17
Conclusion
Cosas de Mujeres shut down its operations in 2022. We had been unsuccessful in meeting, and in some cases had refused to meet, donor demands for ongoing innovation, which was often equated with automation and the use of chatbots. We believed that additional automation would pose unnecessary risks and would undercut the benefits of human-to-human interaction (Cookson & Fuentes, 2025).
A significant contribution of the work of D’Ignazio and her “Datos Contra el Feminicidio” colleagues (see for example Suárez Val, 2023) is that it demonstrates how, even where new technologies such as machine learning are introduced (D’Ignazio & Klein, 2020), data work is neither purely nor even primarily technical, but deeply emotional, affective, and social. In contrast to the landscape of technocratic gender data initiatives linked to the international development agenda, the starting point for these initiatives is acknowledging that “data are always political, and data, are always produced” (D’Ignazio, 2024) (12). So too are the logics behind the decision to accommodate, refuse, or negotiate a data collection imperative.
Our empirical case emphasizes the social and affective nature of data collection, advancing the concept of negotiated refusal from a vantage point that is neither purely “top-down” nor entirely “bottom-up.” For decades, women's rights researchers and advocates have been fighting to have alternative/subjective forms of knowledge accepted and validated by decision-makers. The recent moves toward concretizing citizen-generated data as a valid form of knowledge production in international development are a welcome step in that direction. But our study shows the tensions inherent to this. If the data had been collected as a matter of pure technicality, the project could have caused other harms. The social and affective elements of the project informed its limitations and its possibilities and power.
Practically speaking, negotiated refusal is rooted in and opens further pathways for a feminist data practice that is responsive to the social nature of CGD projects, particularly though not exclusively those focused on sensitive issues in rapidly evolving contexts. Technical frameworks should not supersede or override the urgency of local context and considerations. Rather, the complex character of gendered inequalities and contexts in which they manifest requires flexibility in what gender data—or intersectional data—should consist of. These determinations should not be made by elite institutions but rather on a case-by-case basis through deliberation among those collecting, providing, and using the data.
Footnotes
Acknowledgments
We would like to thank Lucía Mesa Vélez and Norma Patiño Sánchez for their research assistance on this project.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The research for this paper was funded by the Social Sciences and Humanities Research Council of Canada.
