Abstract
To fulfill their missions, humanitarian organizations are engaged in the production and dissemination of a specific form of knowledge about refugees—namely, evidence. Yet relatively little scholarly work has been done on evidence as a basis for humanitarian decision-making in refugee crises. In this article, we analyze the evidence produced on Syrian refugees in Jordan between 2012 and 2017 by humanitarian actors, its appropriateness and relevance, to assess (a) potential side-effects for refugees and (b) functions for humanitarian organizations. Drawing on 234 documents on Syrian refugees in Jordan collected on major humanitarian platforms and interviews with experts from international organizations, this article shows that, due to the type of evidence produced, Syrian refugees living outside camps are invisibilized and their needs might be less considered in aid projects. The article also shows that some evidence practices are related to the outsourcing of data collection and analysis to specialized actors, and to quantification and standardization of evidence that plays a part in consolidating the legitimacy of humanitarian organizations. Our results highlight the importance of considering this specific form of knowledge when studying humanitarian interventions in international migration and refugee governance.
Introduction
International and humanitarian organizations, both governmental and non-governmental, have become pivotal actors in the management and governance of refugee crises worldwide (Geiger and Pécoud 2014), in the sense that they contribute, to varying degrees, to “the delineation of the refugee category, the procedures that govern access to it, the food people receive, the shelter they are provided, and also the withdrawal of these services” (Feldman 2018, 4). To fulfill their missions, international organizations (IOs) rely on various forms of knowledge, and many of them are also engaged in knowledge production (Zapp 2018; Glasman 2019).
Migration scholars have investigated the role of knowledge production by IOs in migration governance and in shaping discourses about migrants (Geiger and Pécoud 2014; Geiger and Koch 2018; Bartels 2018; Heller and Pécoud 2020) and the functions and effects of knowledge for these organizations (Boswell 2009; Korneev 2018). These studies have analyzed traditional forms of knowledge such as official declarations, programmatic publications, expert knowledge, or count data, based on single case studies from specific IOs. Until now, much less attention has been paid to the use of “evidence,” a more recent form of knowledge increasingly produced by IOs and other humanitarian actors involved in forced migration governance. Evidence-based policy-making (EBPM) advocates for the use of scientifically produced sets of information, such as analyses, needs assessments, and impact evaluations, to move policy decisions toward “what works” (Dijkzeul, Hilhorst and Walker 2013; Knox Clarke and Darcy 2014; Newman 2017; Standring 2017). Very few studies have empirically analyzed the evidence about refugees 1 disseminated by various actors in the humanitarian sector in crisis contexts, and its function for migration governance.
This article starts from an analysis of the evidence itself. Through an in-depth characterization of the evidence produced by IOs and humanitarian organizations about refugees in a specific crisis context and an analysis of its appropriateness with respect to scientific standards of soundness, robustness, and relevance, both from a methodological standpoint and regarding the population of concern and the context, we question the role played by evidence in the system of refugee aid and its side-effects concerning refugees. Two questions motivate this article. First, how far does this type of practical-operational knowledge, as produced by humanitarian organizations, constitute a reliable basis for decision-making, one that considers the whole refugee population and its needs with appropriate and unbiased methodologies for data collection and analysis? Second, does evidence, as produced in practice by humanitarian organizations, fulfill functions for those organizations other than informing decisions on refugee aid?
These questions will guide our empirical analysis of the evidence produced by IOs and humanitarian organizations during the Syrian refugee crisis in Jordan. While Jordan is not a signatory to the 1951 Geneva Convention on refugees, it has a long tradition of hosting refugees (Palestinians, Iraqis) and has become a main host country for Syrian asylum-seekers in the Middle East (Kelberer 2017). Jordan also plays an active role in the Regional Refugee Response launched in 2012 by the United Nations High Commissioner for Refugees (UNHCR) (Kelberer 2017), and Amman coordinates the regional humanitarian response for the Syrian crisis, with a significant concentration of national and international actors involved in the refugee response. This article, therefore, resonates with the literature on humanitarian action and refugees in Jordan showing how practices of humanitarian organizations and the Jordanian government's encampment policy (Achilli 2015) participate in the disempowerment (Turner 2021) and control of refugees (Lenner 2020). To analyze the evidence produced in this case, we draw on two types of data: a quantitative database with 150 variables coded on 234 documents published by humanitarian actors and IOs between 2012 and 2017; and interviews with experts from IOs and non-governmental organizations (NGOs).
Our article offers three main contributions. First, we contribute to the literature on migration-related knowledge production by IOs and its entanglement with migration governance by providing an empirical analysis of an under-researched yet significant form of knowledge, namely evidence, involved in decision-making processes regarding refugee aid in crisis regions. Second, with our analysis of practices of knowledge production through evidence-making, we contribute to the literature on legitimation of organizations through knowledge production in the field of immigration policy (Boswell 2009). We highlight that evidence is produced by organizations so as to build their legitimacy in the humanitarian sector and enhance their credibility in decision-making (Boswell 2008, 471). Quantification practices by humanitarians are an essential part of this process (Glasman 2019; Lawson 2021). We also learn from our analysis the orientation of IOs and humanitarian organizations toward targets set by donors and supervising institutions (i.e., upward accountability), rather than toward refugees (i.e., downward accountability) (Burlin 2020). Finally, this article provides insights into the potential side-effects of evidence, which has become a means to select aid beneficiaries. Altogether, we show that evidence is inherently political.
The article is structured as follows. We first review the literature on knowledge and EBPM in humanitarian settings and migration governance to ground our analytical framework (first section). The second section details our data and methodology, before we turn to the presentation of our results (third section) and a discussion of the function of evidence for humanitarian organizations, and of the consequences it might have for the refugee population (fourth section). The last section concludes with implications of EBPM in the management of refugee crises and opens avenues for future research on this type of knowledge within the study of international migration.
Knowledge and Evidence in Forced Migration Governance
Since the emergence of the “new refugee regime complex” 2 (Betts 2010), humanitarian actors, especially IOs, have had a growing role in managing forced migration in hosting countries (Geiger and Pécoud 2014). Some international organizations are more involved than others in the coordination of the humanitarian response to crises or leading providers of humanitarian data (UNHCR and, to a lesser extent, the International Organization for Migration, IOM), but, together with NGOs, they work to provide humanitarian assistance to vulnerable populations (Glasman 2019). Knowledge production has increasingly become a central tool in international and humanitarian migration governance (Bartels 2018). To answer our research questions, we consider two streams in the literature that have hitherto been separated: operational studies, which address questions of what they call the “quality” of evidence in the humanitarian sector; and academic studies in sociology and political science that research the function of knowledge production in international migration governance. This last literature has not yet analyzed the growing role of evidence, as a distinct and specific form of knowledge.
Evidence-Based Policy-Making in a Humanitarian Setting
The idea of EBPM progressively entered the humanitarian field during the 1990s (Sutcliffe and Court 2005). It relates to the knowledge gap between science and policy (Cornelius, Martin and Hollifield 1994), which is especially relevant in migration and humanitarian issues (Baldwin-Edwards, Blitz and Crawley 2019) and according to which decision-makers make insufficient use of (scientific) knowledge. Figure 1 illustrates the “evidence movement” (Hansen and Rieper 2009) in the field of forced migration, with a marked increase in its production following the outbreak of war in Syria.

Figure 1. Documents posted on ReliefWeb about refugees (world).
The mostly operational literature dedicated to EBPM in the humanitarian field analyzes various kinds of evidence produced by humanitarian actors on refugees, including publications a priori understood by EBPM advocates as non-political, such as needs assessments or project evaluations, monitoring data, or desk/systematic reviews (Glasman 2019, 5). This body of literature discusses the production of evidence from three main angles. First, it analyzes EBPM's relevance on practical grounds for increasing efficacy, transparency, and accountability in managing refugee crises and allocating aid (Sutcliffe and Court 2005; Knox Clarke and Darcy 2014; Vayrynen 2001). Secondly, it identifies important constraints in EBPM implementation in humanitarian contexts. Political instability in crisis regions is one of them (Sutcliffe and Court 2005). Despite the humanitarian sector's professionalization (Kukkonen 2018) and the existence of collaborations between data-specialist agencies, donors, and scientists (Banatvala and Zwi 2000), connections remain loose: humanitarian agencies require fast results (Levine 2016) and sometimes dramatize situations to gain attention and funding (De Chaine 2002). Data collection in cases of forced migration faces ethical and methodological challenges: the vulnerability of the population involved, and difficulties such as sampling temporarily mobile persons (Harrell-Bond and Voutira 2007). All these constraints might hamper the production of appropriate evidence and partially explain why the robustness of the evidence produced in such contexts is limited.
Thirdly, a few studies empirically analyzing evidence highlight that the evidence provided in needs assessments in specific humanitarian cases is not valid (i.e., unsatisfactory measurement), precise (i.e., no probabilistic sampling or small sample sizes, no experimental sample), or comparative (i.e., no control group) (see Spiegel et al. 2004 and Blanchet et al. 2017). These findings from operational studies call into question the generalization of results and recommendations. Furthermore, short-term action is often privileged in refugee crises, focusing attention on specific emergency sectors at the expense of others (Thompson 2017). This operational literature assessing the implementation of EBPM in the humanitarian sector could be developed further: the analysis of evidence could be more systematized, exhaustive and quantified, involve more longitudinal approaches, and provide more discussions on the role of humanitarian organizations in designing evidence. This literature has also associated notions of “quality” of evidence in needs assessments with scientific criteria borrowed from experimental and quantitative methodologies—mainly causality and statistical robustness (Sackett et al. 1996). Yet critical arguments about EBPM point to the positivist framing of most evidence-based approaches, the valuing of measurements from larger samples, and the omission of “meanings” for individuals (Rigo 2018). Other authors propose to move away from the hierarchy of evidence to adopt the idea of “appropriateness” for decision-making (Parkhurst and Abeysinghe 2016; on hierarchies of evidence, see also Parkhurst 2017, p. 57–58). We follow this approach in this article.
Operational studies have rarely questioned the larger function of evidence production, its embeddedness in political and organizational processes, and its significance for migration governance. The humanitarian arena only exists because it has been discursively created by various stakeholders, reinforcing the need for humanitarian actors to show that they are “doing good work” (Hilhorst and Jansen 2010, 1122), through a multiplication of impact assessments in crisis regions since the 2000s (Watson 2008). We first need to provide an empirical analysis of the evidence humanitarian actors rely on when making decisions about refugee aid and to understand the processes behind its production. On that basis, we can then investigate the function that evidence on refugees fulfills for humanitarian organizations and the side-effects of this evidence for the refugee population.
Knowledge Production, International Organizations, and Migration Governance
Some of the academic literature in sociology and political science focuses on the link between expert knowledge and migration governance (Pécoud 2015; Boswell 2009; Rigo 2018; Massari 2021). These studies frequently refer to the Foucauldian relationship between knowledge production and the exercise of power (Gaventa and Cornwall 2008; Carmel and Kan 2018). By producing statistics on refugees and identifying their moves, origins, and hosting conditions, IOs sometimes play a role resembling that of government/state control in crisis and refugee-hosting regions: the introduction of iris scan technology for refugee registration provides a good example of how organizations such as the UNHCR participate in refugee control by framing migration in terms of security for host countries (Lenner 2020; Massari 2021). In distinguishing between those who deserve international protection and those who do not (and who should return to their country of origin), IOs exercise power through an “international migration narrative” (Pécoud 2015). In this process, the need for funding turns the production of knowledge into a logic of justification, advocacy, and communication, and IOs “contribute to construct ‘reality’, often ‘creating’ problems to be solved” (Korneev 2014, 891). The embeddedness of knowledge production in systems of power also leads to considering local knowledge in crisis regions only if it fits the “pre-established, ‘international’ framework” (Kluczewska 2020, 186).
Many studies also stress the crucial role played by count data in migration management (on the “datafication of migration management,” see Broeders and Dijstelbloem 2015). While count data are not this article's focus, research on this type of data gives insights into processes and functions of knowledge production by IOs that are useful for framing our research questions and, later, for interpreting our results. Despite the contestedness and incongruence of count data (on counting Syrian refugees in Jordan, see Lenner 2020), quantification practices have been shown to enable IOs to demonstrate their expertise (Glasman 2019) and produce a form of transparency illusion (Hansen 2015), and to secure their position (through funding) as key partners of Western governments in managing migration flows (Korneev 2018). In this competition, counting refugees sometimes becomes “convenient for financially hard-pressed NGOs” (Lenner 2020, 288). Scheel and Ustek-Spilda (2019, 675) even argue that some organizations, such as the IOM, produce ignorance about the limits of the data they use to quantify migration as a strategy to sell expertise.
Finally, in her work on the linkage between expert knowledge and migration policy in Europe, Boswell (2008; 2009) highlights three functions of knowledge for organizations: instrumental (“a concern to ensure decisions are based on sound reasoning and empirical knowledge”); legitimating (“a means of demonstrating the credibility of the organization or its decisions” and of gaining “epistemic authority” (Boswell 2008, 4)); and substantiating (“to lend support to a particular policy preference or proposed course of action” (Boswell 2009, 79)). Action organizations derive their legitimacy from their societal interventions and must ensure loyalty from their own members (internal legitimacy), as well as from beneficiaries or other organizations (external legitimacy, see Boswell 2008, 5). In this article, we build on Boswell's approach to legitimacy and understand the introduction of EBPM in this broader context: research on the refugee response in Jordan seems to indicate that the number of beneficiaries of aid programs counts more than the programs’ quality and that “upward accountability” to donors is increasingly favored over “downward accountability” to the populations concerned (Burlin 2020).
Because evidence is increasingly used in the humanitarian sector to respond to refugees’ needs in host countries, we propose, first, to assess empirically the appropriateness and relevance of evidence produced and assumed to be objective and non-politicized by humanitarian organizations on Syrian refugees in Jordan, in order, secondly, to show the role played by evidence production in legitimating humanitarian organizations in refugee governance as well as the potential consequences of the use of evidence for the refugee population.
Data and Methodology
Since the beginning of the war in Syria in mid-2012, Jordan has become one of the most important receiving countries for Syrian asylum-seekers in the region, alongside Lebanon and Turkey (Kelberer 2017): by the end of 2012, 115,558 Syrian refugees were registered by the UNHCR in Jordan, with over 600,000 persons recorded after July 2014. 3 While refugee camps were soon built in Zaatari and Azraq, in September 2017, 79 percent of the registered Syrian population in Jordan were living in (peri-)urban and rural areas (UNHCR 2017). According to estimates based on the 2015 Jordan census, 1,265,514 Syrian citizens 4 were living in Jordan in 2015, 95 percent of them in urban areas. 5 In 2015, the onset of the Jordan Response Plan (JRP) by the Jordanian government and the implementation of the Refugee and Resilience Plans, followed by the Jordan Compact in 2016, reoriented, at least on paper, humanitarian efforts from immediate relief, protection, and vulnerability to more developmental concerns (Thompson 2017). This shift has been fueled by Jordan's growing concerns over its capacity to manage a protracted crisis and pressure from the EU to contain migrants in the Middle East (Tsourapas 2019; Ali 2021). As a result, humanitarian funding has largely been allocated to resilience projects though with mixed results on refugees’ socio-economic empowerment (Burlin 2020).
To answer our research questions, we collected data from the major platforms that disseminate information and evidence about the regions where humanitarian work is deployed, focusing on four platforms that contain the most documents published online about refugees and vulnerable populations in the world: 6 the REACH Initiative Resource Centre (130), the UNHCR Data Portal Assessment Registry (69), ReliefWeb (41), and the Active Learning Network for Accountability and Performance (ALNAP) 7 (3) (see Online Appendix for more details). The analysis draws on a total of 234 reports referring to “Jordan” and published between 2012 and 2017 (see Online Appendix Table A). We chose this time frame because it allowed us to observe variations in the humanitarian response: from immediate urgent relief with its many logistical challenges at the beginning of the crisis in 2012 to a phase of anchoring and structuring the humanitarian response (2012–2015) and, finally, to a phase of protracted crisis that brought to light the need for resilience and development of local communities (2015–2017). More than 150 variables were coded for each report on seven themes: source of the document and actors involved, platform and publication, topics addressed, geographical zone, data, analysis, references, and lexical issues. 8 To ensure inter-coder reliability, we simultaneously coded a subset of reports and held in-depth ex-post discussions to make sure we shared a similar understanding of what was measured.
Empirically, we approach the type of evidence through the concept of appropriateness. Since the definition of evidence in the humanitarian sector is considered unclear (Kukkonen 2018), we propose a broad definition of evidence appropriateness based on a thorough investigation of the types of evidence collected, analyzed, and disseminated by IOs and humanitarian organizations. Evidence appropriateness consists of two dimensions: internal appropriateness, which relates to the characteristics of evidence and the general coherence of the information it is supposed to describe or analyze; and external appropriateness, which concerns the relevance of the information produced to actual situations and contexts. Internal appropriateness relates to the idea of soundness and robustness according to standards defined by the scientific community. In the context of EBPM, humanitarian organizations claim that evidence guarantees scientific standards in the production of the knowledge that grounds policies and humanitarian aid. Indeed, humanitarian actors rely on the strong assumption “that evidence is somehow apolitical, and the terminology of ‘evidence-based policy’ has a clear implication that there must be a single correct policy choice that the evidence legitimates or justifies” (Parkhurst 2017, 71). With the concept of appropriateness, we aim to track whether the evidence that humanitarian organizations produce themselves actually meets scientific standards of methodology and relevance. Based on the literature (Knox Clarke and Darcy 2014; Christoplos et al. 2017), we include as internal appropriateness the following characteristics: data and methodological approach, data analysis, triangulation, and transparency of methodologies (Table 1). These variables are used to perform a Multiple Correspondence Analysis (MCA).
MCA is an inductive form of principal component analysis applied to categorical data (Le Roux and Rouanet 2010), whose purpose is to explore and summarize data to identify the reports’ similar or opposed characteristics and the dimensions of appropriateness structuring the sample. Hierarchical clustering using Ward's method 9 was then performed on the MCA results to identify different types of reports.
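As an illustration of the technique described above, the following minimal sketch computes an MCA as a correspondence analysis of the indicator (one-hot) matrix of categorical variables. It is not the authors' code: the use of NumPy, the function names, and the six toy "reports" with three coded variables are assumptions made purely for illustration.

```python
import numpy as np

def indicator_matrix(rows):
    """One-hot encode categorical rows; returns (matrix, column labels)."""
    n_vars = len(rows[0])
    labels = [(j, v) for j in range(n_vars) for v in sorted({r[j] for r in rows})]
    index = {label: k for k, label in enumerate(labels)}
    Z = np.zeros((len(rows), len(labels)))
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            Z[i, index[(j, value)]] = 1.0
    return Z, labels

def mca(rows, n_axes=2):
    """Row coordinates and inertia of the first n_axes MCA dimensions."""
    Z, labels = indicator_matrix(rows)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    coords = (U * sing) / np.sqrt(r)[:, None]  # principal row coordinates
    return coords[:, :n_axes], (sing ** 2)[:n_axes], labels

# Hypothetical coding of six reports: (data type, sampling, methodology detail).
reports = [
    ("quantitative", "random", "detailed"),
    ("quantitative", "non-random", "sparse"),
    ("qualitative", "non-random", "sparse"),
    ("quantitative", "random", "detailed"),
    ("qualitative", "non-random", "detailed"),
    ("quantitative", "non-random", "sparse"),
]
coords, inertia, labels = mca(reports)
```

Reports with identical coding receive identical coordinates, and reports with opposed characteristics fall on opposite sides of an axis, which is what allows the axes to be read as dimensions structuring the sample.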
Table 1. Variables Included in MCA.
To explore the relevance of evidence, we look at aspects indispensable for effective and efficient aid delivery (Mazurana, Benelli and Walker 2013) such as the sectors addressed in the reports, the geographical dispersion of the data, and the disaggregation of results by gender and age.
To grasp the process of evidence production and practices of different actors, we also conducted interviews with six experts who worked for international humanitarian organizations: UN agencies (Interviews 1 and 2) and NGOs, two specialized in the production of evidence (Interviews 3, 4 and 5), and one involved in humanitarian service delivery (Interview 6). Interviews focused on cooperation between organizations, processes in data collection, and production of reports, as well as their use of evidence and its general impact on the sector and refugees. Interviewees were all involved in data collection, and their responses are situated and subjective regarding organizational contexts and interests. The interpretations of their responses are, therefore, subjective perceptions of the evidence production process from within the humanitarian sector.
Evidence Produced on Syrian Refugees in Jordan
A Multi-Actor Perspective on the Production of Evidence about Refugees
A glance at the number of reports and academic publications on Syrian refugees in Jordan (Figure 2) indicates that academia 10 and humanitarian organizations have different temporalities. One interviewee (4) identified a short “window of opportunity” in which humanitarian evidence can be used, while academic studies “are not bound by these small windows of decision-making.” It is not rare that large international NGOs cooperate with research institutes or individual researchers as mentioned by some interviewees, but all confirmed that academic output was rarely read by those responsible for writing reports or supervising needs assessments in IOs, due to time constraints but also because “academic works take years to be published” (I5). As Figure 2 shows, the number of academic publications has risen since 2015. Thus, humanitarian organizations are crucial actors in evidence production on refugees throughout the crisis. While academic research on refugees in Jordan is linked materially and logistically to the humanitarian infrastructures (Pascucci 2017), delays in academic publishing and the limited degree of effective collaboration between humanitarian actors and academia create a vacuum in evidence production, which is filled either by polyvalent actors that are involved as much in humanitarian action as in the attempt to objectively assess situations and needs or by profit or non-profit organizations specialized in evidence production, about which there has been to date little research or critical reviewing.

Figure 2. Actors in the evidence-making process on Syrian refugees in Jordan and academic publications.
We identified four types of actors in the documents retrieved from the humanitarian platforms: (1) IOs with a prevalent role in humanitarian aid (i.e., the UNHCR, but also the United Nations International Children's Emergency Fund (UNICEF), the World Food Programme (WFP), UN Women and the United Nations Population Fund, and sporadically other IOs, such as the World Health Organization (WHO), the World Bank, the IOM, or the International Labour Organization (ILO)); (2) international and more rarely local NGOs; and (3) governments, both Jordanian and foreign. In the first three years of the Syrian crisis, the role of NGOs, the “first-comers,” in the production of assessments about Syrian refugees in Jordan decreased significantly. Their share remained minor, compared with the strong and constant involvement of UN organizations over the period (2012–2017). The involvement of governments as donors in the production of evidence about Syrian refugees was moderate and mainly from the UK, Northern European countries, Australia, Canada, the United States, Switzerland, and Jordan. Local or national non-governmental humanitarian actors were also little involved in the production and dissemination of evidence and, if they were involved, local knowledge was instrumental (e.g., for data collection) and had a self-legitimating function for IOs (Kluczewska 2020).
The fourth actor (4) is new in the humanitarian sector and governance of forced migration: a set of organizations, either private or non-profit, specializing in data collection and the production and dissemination of information. The REACH initiative, for instance, proposed to “effectively respond to the needs of crisis-affected communities” 11 and progressively became a pivotal actor in the production of needs assessments and evaluations in Jordan: it contributed to 94 percent of the reports published in 2015, and in 2017 to 78 percent. Private actors such as polling organizations or consulting firms also emerged but remained marginal (involved in 14 percent of reports in 2017). Whereas the outsourcing of migration policy to the private sector is well-researched (López-Sala and Godenau 2022), the outsourcing of the production of evidence about refugees to expert or specialized organizations has not yet been documented. Finally, the UNHCR developed strong collaborations with the REACH initiative, with almost two-thirds of the reports published in the 2012–2017 period involving a collaboration between these actors. Collaborations between different actors were, in fact, usual (84% of the reports), with almost systematic participation by UN agencies, which hold the key to access registration data for sampling.
These results suggest two tendencies: first, a trend toward outsourcing data collection, data analysis, and report production to specialized actors, creating a market for evidence that might become profitable for a number of organizations, and secondly, a concentration of evidence production in the hands of one dominant, not-for-profit player with a leading position in the Jordanian “evidence market”. This situation might produce a standardization of processes and methodologies, leaving little room for opposing views or competing evidence, and allowing economies of scale. As we could observe in our data, REACH re-used datasets (especially large samples) to produce multiple assessments on different issues and for different funders. One question arising from our analysis of the actors involved in evidence production and addressed below is whether evidence appropriateness was associated with a certain type of actor.
Variability in the Appropriateness of Evidence
The results derived from the MCA indicate that the reports’ appropriateness was structured around a first “data and triangulation axis” (horizontal) (i.e., type of data and use of references) and a second axis understood as a transparency and validation axis (variance in the provision of information on methodology and references) (see Online Appendix Figure A). The third axis relates to the degree of robustness of analyses (see Online Appendix Figure B). The hierarchical cluster analysis based on MCA results enabled us to identify four types of reports (Online Appendix Figure C) 12 presented below, using further qualitative information found in the reports.
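The clustering step can be sketched as follows: starting from one cluster per report, Ward's method repeatedly merges the two clusters whose fusion increases the within-cluster sum of squares the least. The pure-Python version below is a deliberately naive illustration, not the authors' implementation; the function names and the five toy coordinate pairs are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion (naive search)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                na, nb = len(clusters[a]), len(clusters[b])
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * squared_distance(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)

# Hypothetical coordinates of five reports on two retained axes.
toy_coords = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
groups = ward_clusters(toy_coords, 3)
```

Because the criterion minimizes within-cluster variance, the resulting groups tend to be compact, which suits the goal of identifying report types with similar characteristics. The number of clusters to retain remains an analytical choice.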
Cluster 1: Poor Evidence (53 Documents)
The UN was involved in 83 percent of reports in this cluster, a relatively high share, as is the case for reports in Cluster 2 (93%). Cluster 1 largely comprises documents based on quantitative data but derived from non-random samples (43%, Table 2). A non-negligible proportion (15%) of the documents drew on data from unspecified sources. Only 6 percent of reports included econometrics or statistical tests, and only 9 percent considered a control population, such as Jordanians or non-Syrian refugees. Moreover, the methodology on which the results presented in the reports are based was rarely described in detail (lack of information on type of sampling/recruitment of participants, number of respondents, potential bias, etc.). A large share of reports contained little information on data and sampling. Nor was triangulation practiced in these reports: documents including references were rare (only 8%), and only one reference was included, on average.
Table 2. Description of the Clusters.
Cluster 2: Standardized Evidence from the UN/REACH Collaboration through Monitoring Data (58 Documents)
Documents in this cluster were the product of collaborations between REACH and the UN in 98 percent of the cases, a higher share than in the other clusters. Documents were largely based on quantitative data (98%), especially monitoring data. The use of random samples was more frequent than in Cluster 1 (26% vs. 6%), as was detailed information on the methodology. As in Cluster 1, documents in Cluster 2 rarely provided statistical tests or econometrics, and almost all focused on Syrian refugees, without a comparison group (in the sample, only 16% of documents included a control group). Interestingly, half the documents in Cluster 2 were only one page long, and almost two-thirds were mere “factsheets” for counting and monitoring purposes, with standard presentation (Figure 3) and little interpretation of results. The number of factsheets also explains why the number of references used in this cluster was low (4 on average).

Example of a factsheet. Source: UNICEF, 2015.
In this cluster, we also learn that humanitarian organizations frequently used a toolkit provided by a Web platform called “Raosoft” 13 as a standard of quality and robustness to determine sample sizes according to the base population. However, the use of this platform is embedded in a process of routinization that can ultimately obscure the importance of sampling design for representativeness. Such routinization leads to incoherent claims in several documents, such as “Findings are statistically representative with a 97 percent level of confidence and 4 percent margin of error.”
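To illustrate what online sample-size calculators of this kind typically compute, the sketch below implements the textbook formula for estimating a proportion, with a finite population correction. It is a minimal sketch under stated assumptions, not the platform's actual code: the function name `sample_size` and its default parameters are hypothetical and chosen for illustration only. The formula presupposes simple random sampling, which is why a sample size obtained this way says nothing about the representativeness of a purposive or convenience sample.

```python
import math
from statistics import NormalDist


def sample_size(population, margin=0.05, confidence=0.95, proportion=0.5):
    """Required sample size for estimating a proportion.

    Assumes simple random sampling from a finite population.
    All parameter names and defaults are illustrative assumptions.
    """
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Finite population correction
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(20_000))
```

With a base population of 20,000, a 5 percent margin of error, and a 95 percent confidence level, the function returns 377. The result depends only on the population size and the chosen precision, not on how respondents are actually selected.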
Cluster 3: Appropriate Evidence from NGOs (38 Documents)
Compared with the evidence found in the first two clusters, the evidence grouped in Cluster 3 was mainly based on qualitative data, and even if information on sampling and methodology was sparse, 58 percent of documents included interview extracts and a more in-depth analysis of individual and community strategies, decision-making, and experiences of settlement. NGOs were more strongly involved (in 63% of reports, followed by the UN with 45% and states with 26%), and this cluster had the lowest share of collaborations between different types of actors. NGOs’ apparent preference for qualitative methods is consistent with their budget constraints but also with their emphasis on individual stories or, as stated by two interviewees from a UN agency and from an NGO delivering aid, on “specific narratives” (I2) because “the human story is what makes it appealing” (I6). As shown in the next section, reports drawing on qualitative data focused more frequently on subgroups such as women or young boys or girls. As expected from the nature of the data, 60 percent of documents had no statistical results, but 37 percent provided descriptive statistics from qualitative data (from key informant interviews or focus group discussions, FGDs). One interviewee working for an NGO specialized in data collection and analysis viewed the statistical analysis of qualitative data critically: qualitative data are “super-useful, but on average, they are badly implemented,” and “we find that analysis and interpretation of qualitative data is done through percentage of FGDs reporting this or that, which does not make any sense because that is not the point of doing qualitative analysis…. To do it properly, it's complicated because it requires more time and actually more skills…, and very often, this is not implemented properly” (I4). The high valuation of hard data in the evidence industry might also explain this practice.
The use of FGDs by evidence producers in the humanitarian sector is widespread and justified by cost and time-efficiency constraints (Acocella 2012), but it sometimes raises concerns about ethics and the quality of results: participants might be reluctant to express themselves publicly (Smith 1995) on sensitive issues such as sexual violence. The same is true for the recurrent use of purposive sampling, a cheaper sampling method that can lead to larger bias and potential errors of judgment by data collectors in their choice of respondents. 14 Finally, this cluster had the highest share of reports including a bibliography (66%) and the highest number of references (18 on average). Reports in Cluster 3 frequently referred to other NGOs’ works (61%), as well as to works going beyond evaluation reports (47%). These practices of triangulation and information gathering, which allow situations to be analyzed better, also indicate more appropriate evidence than that found in Clusters 1 and 2.
Cluster 4: Evidence from Mixed-Methods Data (85 Documents)
Compared with Cluster 3, Cluster 4 had a large pool of documents for which REACH collected the data. NGO involvement was limited, but 83 percent of documents were products of collaborations that did not necessarily involve the typical REACH-UN combination. This cluster was characterized by the highest share of documents mixing quantitative and qualitative data (89%), drawing on random samples (33%), and including a control population (about a quarter). Comparison with Jordanians is not self-evident: as stated by Interviewee 2, who worked for a UN organization, comparing the situation of refugees with that of Jordanians is a politically sensitive issue and meets with reluctance from the Jordanian government. Half the documents provided detailed information on the methodology, and 13 percent included significance tests or econometrics, the highest share of all clusters. Representative of this cluster is a report based on a panel data survey, which is normally rare due to time constraints in humanitarian settings (I2), and combining several surveys, focus groups, observations, and key informant interviews, including refugees. Only 27 percent of documents addressed saturation in qualitative work, but triangulation through the use of references to other works was frequent, as in Cluster 3 (59% had a reference list).
The way refugees were designated in Cluster-4 reports reveals how such reports presented the population of concern. Several works have analyzed how refugees and particular groups among them are framed or labeled (e.g., Roger Zetter's works (1991; 2007) on how the refugee label is formed, transformed, and politicized). Through a simplified lexical analysis, we identified the terms used in each report to find the primary (most frequently occurring) and secondary terms. Cluster 3 and, to a lesser extent, Cluster 4 gave significance to sub-populations of refugees, referred to as “refugees” but also as “girls,” “boys,” “women,” “men,” and “children.” This lexical scope contrasts with the one used in reports from the UN/REACH collaboration (Cluster 2), in which methodological terms such as “respondents,” “key informants,” and “households” predominantly designated the refugee population. Those terms refer to the domain of fieldwork and sampling in data collection processes. They are framed for statisticians and neutralize social and political identities, resulting in a form of methodological reification. The use of those methodological categories indicates instrumentation and objectification of these populations. Those categories are also social constructions that serve the needs of evidence producers rather than those of the population studied, which might influence discourses on refugees and the design of aid and interventions.
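A simplified lexical analysis of this kind can be sketched as a term count per report. The example below is a hypothetical illustration rather than the procedure actually applied to the corpus: the function name `rank_terms`, the whole-word matching rule, and the sample text are assumptions, and only the list of designating terms is taken from the discussion above.

```python
import re
from collections import Counter

# Designating terms mentioned in the text; the list is illustrative.
TERMS = [
    "refugees", "girls", "boys", "women", "men", "children",
    "respondents", "key informants", "households",
]


def rank_terms(text, terms=TERMS):
    """Return the terms found in a report, most frequent first."""
    lowered = text.lower()
    counts = Counter()
    for term in terms:
        # Whole-word match, so that "men" is not counted inside "women"
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return [term for term, count in counts.most_common() if count > 0]


# Invented sample text, for illustration only
sample = (
    "Respondents were interviewed at home. Respondents reported needs. "
    "Households and respondents were sampled. Households vary in size. "
    "Women were interviewed separately."
)
ranked = rank_terms(sample)
primary, secondary = ranked[0], ranked[1]
print(primary, secondary)
```

In this invented sample, the primary term is “respondents” and the secondary term is “households,” the kind of methodological vocabulary described for Cluster 2.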
To summarize, through our inductive quantitative approach, we identified four classes of reports characterized by different levels of evidence appropriateness and involvement of actors. The qualitative cluster (3) and the mixed-methods cluster (4), which together represent half the reports, clearly stand out with a high internal appropriateness of evidence, while about a quarter of the reports (Cluster 1) reveal biased evidence and a lack of transparency on methodologies. Collaborations between actors had two opposing outcomes: either a higher appropriateness of evidence (Clusters 3 and 4) through pooling methodological skills and sharing costs, or a routinization of data collection and analysis, with limited interpretation, and an objectification and reification of populations through the use of a technical lexicon (households, respondents, etc.) for designating “refugees” (Cluster 2, UN-REACH collaboration). Standardization was, nonetheless, regarded by organizations specialized in data collection and analysis as a criterion of data reliability (“the more standardization, the better,” I5). Interestingly, despite being a central actor in evidence production, the UN stood out in neither of the two clusters displaying a high evidence appropriateness (Clusters 3 and 4), raising many questions regarding its leading role in managing the crisis in Jordan.
Relevance of the Evidence
How far does the evidence produced on Syrian refugees in Jordan enable humanitarian organizations to address the situation of this population and its specificities? To answer this question, we look by cluster at the issues or sectors addressed in the reports, the geographical dispersion of the population concerned, and the population groups considered.
Access to services and fundamental needs and children's education were the issues most frequently addressed across the four clusters, but to different degrees (Figure 4). The UNHCR-REACH collaboration cluster (Cluster 2) was especially concerned with housing, as well as with Water, Sanitation and Hygiene programs (WASH) (i.e., those corresponding to UNHCR budget efforts). Reports in this cluster clearly focused on the camp population's specific needs. This focus is unsurprising inasmuch as camps are directly managed by IOs, yet it is striking given that the majority of Syrian refugees live outside camps and in permanent housing (Tiltnes, Zhang and Pedersen 2019, 37). Contrary to official UNHCR priorities, solution orientations such as self-reliance and integration schemes received much less scrutiny in all the reports (UNHCR 2020; Hansen 2018) but gained in importance over the observation period. More generally, in the entire sample, work and wages, education, vulnerability, and resilience were increasingly addressed over the period (Online Appendix Figure D).

The first five issues addressed in each cluster.
Overall, 58 percent of documents dealt with camp populations, and 42 percent dealt only with these populations. The overrepresentation of camps was especially prevalent in reports emerging from collaborations between specialized agencies and the UN (Cluster 2), while Clusters 3 and 4 tended to focus more frequently on camps and/or non-camp areas (Figure 5). Nonetheless, in some interviews, especially with members of IOs or NGOs actively involved in the humanitarian response, respondents denied a bias toward camps in their own organization: “We sample regardless of where they [refugees] reside, or their nationality, we have a team for camp populations and out-of-camp” (I1), or “We work both in and outside the camp, access to the population is not hard here, in Jordan in particular” (I6).

Locations sampled in the reports, by cluster: camp vs. non-camp.
The emphasis on camps was linked to organizations’ capacity to respond to the needs of the whole population of concern, including those living outside camps, as this quotation from an NGO senior analyst shows: there was “overwhelming information on camp areas but less on other areas…, and since resources are limited, the question of how to allocate funds is important” (I5). This quote suggests that organizations’ choices to collect evidence in camps were also related to their organizational capacities, especially in a context where funding was scarce and donors required accountability. Thus, the focus on camps not only contributes to an invisibilization of out-of-camp refugees but also reduces the potential for differentiating aid between those living inside and outside camps. Ethical issues in aid distribution were raised by interviewees involved in data collection: “Refugees are asked questions and afterwards are not always provided with assistance” (I4), “refugees are having questions put to them every day. It is overwhelming for the people” (I3). For those living in camps, there was “assessment fatigue” (I3 and I4). These ethical issues are all the more troubling given that a substantial proportion of reports did not produce evidence that addresses the whole Syrian population. The frequent re-use of data by organizations could be a way to minimize assessment fatigue, but this practice might also suggest that organizations engaged in collecting and accumulating as much information as possible on several domains of life, implying long and burdensome interview sessions for refugees.
The Zaatari camp, built in 2012, was clearly a focal point for IOs: 39 percent of documents sampled it, particularly in Cluster 2 (Table 3). Interviewee 2 regretted this focus on Zaatari, which the research community shares: “All professors always go to Zaatari…, but the refugees in the host community are worse off.” 15 While 29 percent of registered Syrian refugees in Jordan live in the Azraq camp or the city of Amman, those locations attracted far less attention in the reports than Zaatari. Only 14 percent of documents dealt with Amman and 13 percent with Azraq. Other camps, such as Irbid or Mafraq, attracted much less attention, and coverage was even poorer for out-of-camp populations living in rural or urban informal settlements. Overall, it seems that evidence displayed by the UNHCR-REACH collaboration (Cluster 2) had low relevance for a majority of Syrian refugees in Jordan living outside camps with needs specific to their situation.
Locations Sampled in the Reports and Share by Cluster.
* Source: UNHCR Jordan, 2019
https://data2.unhcr.org/en/documents/download/65827
Furthermore, the overrepresentation of camp populations inevitably led to the lack of a control population in analyses. Interviewee 4 recognized this focus of evidence production on camp populations, explaining that needs assessments were conducted where aid was already delivered and sampling procedures were less complicated. Moreover, assessing refugees’ situation in informal tented settlements was also a political issue: “Nobody cares, they are invisible. The government doesn’t want to have research done there” (I6). This last point signals how evidence can sometimes be directly shaped and affected by political motives, with “substantiating” purposes (Boswell 2008).
Sex- and age-disaggregated data were seen by Interviewee 1 as an important guarantee of quality but also a precious tool for targeting aid delivery. Disaggregation by age and/or gender was not systematic in our sample (Table 4). Gender and age were addressed in 45 percent and 30 percent of the documents respectively. Gender was less frequently considered in reports on camps (Cluster 2) and more prevalent in Clusters 3 and 4. Age disaggregation was, by contrast, more prevalent when camps and other localities were studied and in reports drawing on sound evidence from qualitative data (Cluster 3). Reports drawing mainly on qualitative data (Cluster 3) were better able to look at specific subgroups.
Accounting for Gender and Age in the Reports, by Location and Cluster.
* Statistically significant at the 1% level.
A strong and significant correlation between gender and age, signaling frequent use of data disaggregation, was mainly found in reports focused on the camp population but not in those addressing the non-camp population (Table 4). This difference in data disaggregation between camp and non-camp populations means that Syrian women living outside camps, for example, were less likely to benefit from projects adapted to their needs. On the basis of insights from interviews, it seems that a vicious circle emerged in this relationship between humanitarian aid and evidence production, fostering the invisibilization of the most vulnerable groups within Jordan's refugee population. This result is somewhat inconsistent with the statement by an expert from a large IO, who said that vulnerability assessments aimed at targeting the most vulnerable, explaining later in the interview that targeting was important because of limited aid capacities (“one cannot go for one million refugees,” I1). This point resonates with Sözer's critical analysis of the use of the vulnerability notion, arguing that it “implies selective rather than additional assistance” (2019, 2), as well as with Turner's study on labeling “vulnerable Syrian men” in Jordan, which showed the centrality of the notion of vulnerability in aid distribution and how its use by humanitarians contributed to refugees’ disempowerment (Turner 2021).
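When two report characteristics are coded as binary indicators (the report addresses gender: yes/no; the report addresses age: yes/no), one common way to measure their association is the phi coefficient computed from a 2×2 cross-tabulation. The sketch below assumes this measure for illustration; the text does not state which coefficient underlies Table 4, and the function name `phi_coefficient` and the counts passed to it are hypothetical.

```python
import math


def phi_coefficient(both, gender_only, age_only, neither):
    """Phi coefficient for two binary indicators from 2x2 cell counts."""
    numerator = both * neither - gender_only * age_only
    # Product of the four marginal totals of the 2x2 table
    denominator = math.sqrt(
        (both + gender_only)
        * (age_only + neither)
        * (both + age_only)
        * (gender_only + neither)
    )
    if denominator == 0:
        # Undefined when one indicator never varies
        return float("nan")
    return numerator / denominator


# Invented counts: the two indicators always occur together
print(phi_coefficient(10, 0, 0, 10))
# Invented counts: the two indicators are unrelated
print(phi_coefficient(5, 5, 5, 5))
```

A value close to 1 would indicate that reports addressing gender almost always address age as well, which is how a strong correlation can be read as a sign of systematic disaggregation.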
A Few Insights on Legitimacy, Accountability, and the Side-effects of Evidence for Refugees
Our typology of reports shows that the evidence produced on Syrian refugees in Jordan was heterogeneous, did not always meet the standards of scientific best practices, and was not always relevant when considering the whole refugee population. Compared with other types of knowledge in the field of migration, evidence is explicitly expected to provide scientific rationales for the design of aid appropriate to the needs of those who flee conflicts. Yet, our results underline a political function of evidence that is not obvious at first glance given the primary function of evidence as supposedly “scientific” and “unbiased”, a situation that creates side-effects for the refugee population.
Downward and Upward: Conflicting Notions of Political Accountability and Legitimacy in the Humanitarian Sector
Boswell's argument (2008) on the function of expert knowledge on migration for governments and, more broadly, organizations can be applied to evidence. The fact that humanitarian organizations do not always produce appropriate evidence on refugees might signal that evidence is not only designed as a knowledge base to inform decisions (instrumental function) but also fulfills a legitimating function (Boswell 2009) for humanitarian organizations that must appear credible and be seen to act rationally. As indicated in the introduction, EBPM has engaged humanitarian organizations in a sort of exponential and self-sustaining dynamic of collecting and analyzing data and disseminating evidence. Part of this self-sustaining process results from performance-based management and funding issues: “The production of evidence is very important in order to have other projects” (I6), or, as stated by Interviewee 3 (in an NGO), evidence is used for “advocacy” and related to “– not a surprise – money!”.
Though we observed a trend toward more evidence from qualitative data on Syrian refugees over the analyzed period (see Online Appendix Figure E), our expert interviews indicate that funding issues inflect the type of evidence collected toward “quantitative information” or “hard data” that are preferred by donors (I6). According to two of our interviewees, this leads to “a lot of duplication [of data, surveys, etc.…], especially here in Jordan” (I6) and a real need for more analysis (I6), triangulation, and sense-making (I2). The constant production of evidence creates competition between humanitarian organizations, and the UN is the master in this competition. The UN holds key information to access and survey populations (I2 and I4), especially out-of-camp populations (although it makes little use of this information, as our analysis shows). In the words of one interviewee: “data is a new kind of gold” (I2). Actors specializing in data collection and analysis have thus entered the game and strengthened competition and the status of evidence in the race for funding. Showing skills in quantitative data collection and in the capacity to follow common standards validated by the humanitarian sector enhances the credibility, trustworthiness, and reputation of humanitarian organizations.
Quantification is coupled with a tendency toward the standardization of data, assessments and reports, which can be understood as a form of “pragmatic legitimacy” (Suchman 1995, 585) implemented by UN agencies and other IOs. In addition to the fact that evidence often lacks appropriateness and relevance, as we show, it also often has little to do with improved decision-making: “It's a myth that decisions are taken upon evidence” (I4). This particular type of knowledge can also play a part in conflicting views between humanitarian actors and donors on the most adequate way to deal with refugees’ needs: “Donor-driven programs are not the most relevant [for the needs of refugees], there is political pressure sometimes, and sometimes big private-sector donors are egotistic in their demands and don’t respond to displays of evidence” (I2). Evidence was, therefore, not sufficient to “empower” and insulate humanitarian actors from political or donor-driven motives that remained significant in shaping projects and programs for refugees in humanitarian settings. This shows that in some cases, evidence performs a substantiating role (Boswell 2008). It is thus produced to get funding for aid programs that match donors’ preferences more than refugees’ needs.
Our quantitative and qualitative results, therefore, enable us to uncover important processes in the humanitarian sector that are not neutral for refugee governance because they tend to favor quantifying refugees’ needs over identifying and differentiating needs. To satisfy donors, organizations need to show that their projects are based on evidence. The paradox is that humanitarian organizations are “avowedly non-political, neutral actors” (Feldman 2018, 4) but, being dependent on donors, they are sandwiched between their search for legitimation and the need to develop systems of accountability to the population they “govern”: refugees. Quantified evidence in particular has become the guarantee for “good work” and it is more important than doing the good work in itself and taking account of potential side-effects of evidence-based humanitarian aid on refugees.
The Side-Effects of Evidence for Refugee Populations
The type of evidence produced by humanitarian organizations in crisis situations is not without consequences for refugees. Because they are not neutral, methodologies and quantification processes (Desrosières 2014) can create new forms of inequalities and exclusion or new types of domination hidden behind an apparent image of rationalization and objectification. The emphasis put on camps in needs assessments was at odds with the large presence of Syrian refugees in host communities. Surveying specific groups of refugees (i.e., those living in camps) also became a means of selecting aid beneficiaries, as indicated by one interviewee. In parallel, being under-questioned in out-of-camp settings might exacerbate the feeling of exclusion of some groups among the refugee population (Omata 2019).
Evidence production also raises ethical questions about the direct impacts it can have on refugees. As already mentioned in the empirical part, over-surveying populations to obtain recurrent evidence might create “assessment fatigue” and raise false expectations of aid among refugees (Omata 2019). The trend toward quantification of needs (Glasman 2019) and the focus on specific subtopics of humanitarian interventions, such as basic service provision, fundamental needs, and children's education, as shown, mean not only that refugees’ strategies for building their lives were left in the shadows but also that the needs of a large part of the refugee population remained unaddressed. Some issues are not put under scrutiny by IOs due to political sensitivity: dissemination of unemployment rates, comparisons between Jordanians and Syrian refugees, or information on informal tented settlements that might destabilize social cohesion in the country (I2). In a way, a marginalization of some political and/or relevant issues and populations can thus be attributed to evidence production (Parkhurst 2017). Those “issue biases” in the production of evidence, through the invocation of evidence about particular outcomes, are important for migration governance as they can “serve to impose political priority in unrepresentative ways” (Parkhurst 2017, 27).
Finally, research has shown the diverse categories and labels used by humanitarian actors to qualify Syrians and how ordering practices based on those categories lead to different degrees of assistance (Janmyr and Mourad 2018). In the evidence on Syrians in Jordan analyzed here, we observed a frequent use of methodology-related terms, such as “households” or “respondents”. While the term “refugee” is highly bureaucratic but also political (Zetter 1991), the semantic shift toward intangible and methodological framing further obscures the complexity of refugees’ lives (Byrne et al. 2013, 100), rendering part of them invisible. This normalized language plays a part in the “international migration narrative” described by Pécoud (2015) and characterized by a loss of political meaning of the situation experienced by refugees and of humanitarian action when designing aid and interventions. The focus on camps in the reports might also have contributed to consolidating a narrative on refugees living on humanitarian aid. Altogether, our results show that the accumulation of evidence is not only a waste of energy and money but can also produce side-effects when applied in humanitarian refugee governance contexts driven by competition for funds and organizations’ search for legitimacy for their interventions. Evidence on refugees as produced by humanitarian organizations thus becomes counterproductive to some of the very ends of humanitarian action: the improvement of refugees’ living conditions and well-being.
Conclusion
Despite the official function of evidence of serving as a scientific rationale for humanitarian action, our results indicate that it serves as a booster of upward accountability and legitimation for humanitarian organizations involved in competition for funding and as a show of good work designed for donors. In that sense, evidence—like other forms of knowledge in the migration field—is highly political. Evidence-making and quantification practices thus have practical consequences for refugees and migration governance: they have become a means to select aid beneficiaries through the focus on those living in the camps that were the most accessible. These results thus call for a general discussion on the status and limits of evidence-based policy-making in the humanitarian governance of refugee crises: the sector should assess the risk of EBPM becoming a replacement mechanism for more democratic and participatory forms of downward accountability toward refugees.
Our research has implications for the wider study of international migration in crisis contexts: in-depth empirical studies should investigate the consequences of evidence production by the humanitarian sector on the actual lives and conditions of refugees by looking at how aid is allocated and whether this aid has changed since the development of EBPM. The way the refugee population perceives or appropriates this evidence production process could also be put under more scrutiny, as well as refugees’ involvement in designing the humanitarian support they can benefit from. More generally, one could ask whether the increasing use of evidence has changed the way migration crises are managed by humanitarian actors: are crises such as the Syrian refugee crisis in Jordan managed better, differently, or less inclusively as a result of using the evidence? These are questions that still need to be answered.
Supplemental Material
sj-docx-1-mrx-10.1177_01979183231154506 - Supplemental material for Evidence as a Specific Knowledge to Inform Humanitarian Decision-Making in Migration Crisis Contexts: The Case of Syrian Refugees in Jordan
Supplemental material, sj-docx-1-mrx-10.1177_01979183231154506 for Evidence as a Specific Knowledge to Inform Humanitarian Decision-Making in Migration Crisis Contexts: The Case of Syrian Refugees in Jordan by Gwendoline Promsopha and Ingrid Tucci in International Migration Review
Footnotes
Acknowledgements
We thank Thomas Faist for his comments on an earlier draft and Lea Macias for her collaboration and help in building the database at the beginning of this research. This article was written as part of the project “Time of conflicts / time of migration” (PI: Kamel Doraï) supported by the French National Research Agency (ANR).
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Agence Nationale de la Recherche
