Abstract
Strong and impartial verification mechanisms are indispensable for the success of civil war peace agreements. Verification establishes standards for evaluating compliance, prevents destabilization, and reduces the risk of renewed conflict. However, the lack of a shared understanding of implementation progress often leads to disagreements, requiring impartial and mutually accepted information to adjudicate disputes. Ambiguous language and diverse interpretations among stakeholders further complicate verification efforts. Unfortunately, robust and impartial verification has been rare in practice, and global verification efforts have lacked systematic, replicable methodologies, resulting in low external validity. This article introduces a novel, generalizable, and empirically rigorous verification approach developed through the 2016 Colombian Final Peace Agreement. We provide coding rules for different commitment types and implementation scores at various levels of aggregation. Our dataset includes monthly coding of 578 commitments. In a research application, we demonstrate a micro-level relationship between international accompaniment and the implementation levels of specific stipulations. This approach not only fulfills an official verification mandate but also sets a new benchmark for monitoring and evaluating peace agreements worldwide.
Introduction
Civil conflicts remain one of the most pressing global challenges of our time. While negotiated peace agreements are a common tool for ending such conflicts (Kreutz, 2010), their long-term success, and the prevention of future wars, depends on their implementation (Joshi & Quinn, 2017; Quinn et al., 2019). Implementation, in turn, benefits from monitoring and verification, as verification not only establishes criteria for evaluating compliance but also fosters a shared understanding of progress, which helps maintain the participation of the parties (Joshi et al., 2017a; Mattes & Savun, 2010; Stedman et al., 2002; Walter, 2002). Disagreements over implementation can escalate into dangerous deadlocks, where each party tethers its compliance to the other’s, a common path to reignited conflict.
We conceptualize the implementation of a peace agreement as the policy actions taken to enact the agreed-upon political, social, and economic reforms or changes, as mutually accepted by both the government and the rebel group. Despite the centrality of implementation to sustainable peace, the theorization of micro-implementation processes—and the role of verification within them—remains underdeveloped. Empirical studies show a correlation between low implementation and the recurrence of civil war (e.g., Joshi & Quinn, 2017), but the root cause often lies in disagreements over what constitutes implementation, how much has been achieved, and whether timelines for reciprocal actions are being met. While delays are frequently attributed to logistical and resource constraints, these challenges are compounded by the inherently political nature of peace agreements. Verification is not merely a technical exercise but a deeply contested process, where partisan approaches and intensity bias—the disproportionate weighting of certain provisions by stakeholders—can undermine legitimacy. Moreover, robust verification mechanisms risk fueling public disillusionment if gaps in implementation are revealed but subsequently ignored.
The Colombian context illustrates these challenges. More than eight years after the signing of the 2016 peace agreement, many provisions remain partially or fully unimplemented. Yet, partial implementation has sufficed to sustain peace between the signatories, even as other conflicts persist. This raises critical questions about the relationship between implementation and verification, and about whether perceptions of non-implementation hinder future negotiations with other armed groups, locking countries into cycles of continuous violence.
Historically, verification bodies have struggled to provide robust and impartial assessments. Partisan composition of joint verification bodies, ambiguous language in peace agreements, and the manipulation of events or records often distort public perceptions and fuel skepticism. Additionally, provisions addressing long-term structural issues—such as social, economic, and demographic reforms—are particularly challenging to implement and verify, often moving at a glacial pace. This article helps address these challenges by introducing a novel methodology for measuring the micro-implementation of peace agreements, applied to the 2016 Colombian peace agreement. Our dataset, unprecedented in scope and granularity, provides a time-series analysis of 578 stipulations coded monthly, amassing over 55,000 observations. The contributions of this work can be distilled into three key areas:
A Generalizable and Globally Replicable Methodology. First, the methodology introduced here is generalizable and offers a globally replicable tool for assessing the implementation of civil war peace agreements in as close to real time as possible. It provides a systematic framework for tracking implementation progress on an ongoing basis, ensuring consistency and comparability across different implementation settings.
Holistic Coverage of Accord Commitments. Second, the dataset provides holistic coverage of all accord commitments, going beyond the provision-level analyses that have dominated prior studies. While previous research has enriched our understanding of implementation at a broad level, this new dataset goes much deeper, examining every distinct stipulation and policy instrument employed to achieve provision-level objectives.
A Foundation for Future Policy Assessments. Third, the methodology and resulting dataset lay a solid groundwork for future policy assessments of whether the reform initiatives negotiated in peace agreements lead to the desired socio-political changes. Until now, such assessments have been hindered by the lack of micro-level data on specific policies and programs. This dataset addresses that gap by providing monthly time-series data on implementation at the stipulation level.
In the next section, we provide essential background on the project, detailing its origins, objectives, and significance. We then explain the methodology and implementation scoring system, and how stipulations are coded and tracked over time. Following this, we discuss evidence and source preservation, highlighting the use of MAXQDA, a qualitative software package, to ensure transparency and reproducibility. We also explore how implementation scores can be aggregated or disaggregated based on the accord’s structure, academic categories, and types of policy instruments. In a research application, we demonstrate the utility of the dataset by measuring and testing the effects of international accompaniment at the micro level of implementation—that is, the stipulation level. Finally, the conclusion reflects on the replicability of the methodology and outlines future research avenues.
Background and Verification Mandate
In 2012, researchers at the Peace Accords Matrix project at the University of Notre Dame began to provide research support to the negotiations between the Government of Colombia (GoC) and the Fuerzas Armadas Revolucionarias de Colombia-Ejército del Pueblo (FARC-EP). Our primary task was to evaluate the effectiveness of verification mechanisms in past comprehensive peace agreements and identify areas for improvement through comparative analysis. Over time, these efforts revealed a consistent pattern: Verification mechanisms have historically struggled to deliver robust, impartial, and transparent assessments of implementation.
The norm over the last three decades has been the establishment of joint political commissions, composed of equal representation from the former adversaries and occasionally including regional or international representatives. While these types of commissions have a role, their partisan nature and conflicting mandates often lead to serious dysfunction. Our analysis concluded that while joint political bodies have their place in a peace process, impartial verification should not be entrusted to partisan commissions.
Another critical shortcoming of past verification efforts is their tendency to exhibit intensity bias, disproportionately focusing on the early years of implementation and prioritizing provisions related to demobilization and disarmament. While these provisions are undeniably important, the narrow focus often comes at the expense of other commitments, particularly those addressing long-term social, economic and cultural issues. This imbalance weakens the overall implementation process and perpetuates the structural inequalities that fueled the conflict in the first place.
Finally, past verification bodies have often lacked transparency in their methodologies, failing to document and substantiate their judgments adequately. We do not know of any case where these bodies have made their source materials or raw data publicly available, leaving their conclusions open to questions. Given that the interpretation of what constitutes implementation is often socially contested, we argued for greater transparency in verification processes, particularly sources and evidence.
These insights formed the basis of our recommendations to the Colombian peace process. After extensive discussions, the signatory parties expressed interest in testing a new approach, and we were asked to design a rigorous verification and monitoring methodology to address the weaknesses of past processes. Now in its eighth year of continuous operation, this initiative has generated a unique dataset that meticulously tracks the monthly implementation of 578 stipulations within the Colombian Peace Agreement. An accompanying qualitative database, accessible through MAXQDA software, contains nearly 26,000 implementation events tagged to individual stipulations and 12,000 source documents.
Peace Agreement Data Trends
The new micro-implementation data on the Colombian peace agreement represents the latest evolution in a three-decade-long trend toward greater granularity in the textual analysis and measurement of peace agreement fulfilment. This progression can be understood as three distinct generations of specificity, each building on the limitations of the previous one.
First-generation peace agreement data, or the earliest efforts to systematically study peace agreements, focused on compiling a population of accords for comparative analysis against other modes of conflict termination. Key early works in this line include Licklider (1995), Mason and Fett (1996), Walter (1997), Hartzell (1999), and Doyle and Sambanis (2000). These studies laid the groundwork for a new subfield of research on civil war recurrence, suggesting that peace agreements could be effective tools for resolving civil wars, though much more evidence was needed to confirm this. However, the main limitation of first-generation data was its inability to distinguish between accords by type or quality, as these studies did not analyse the content of the agreements themselves.
Second-generation peace agreement data, or the next wave of research, shifted focus to the textual content of peace agreements, coding and categorizing the provisions within them. Key works here include Hartzell and Hoddie (2003), Fortna (2004), Walter (2004), Quinn et al. (2007), Mattes and Savun (2010), DeRouen et al. (2009), and Nilsson (2012). What began as individual researchers analysing a small number of provisions for specific studies eventually led to the creation of comprehensive datasets that expanded the scope and variety of provisions under examination. The UCDP Peace Agreement Dataset (Harbom et al., 2006) introduced 30 different types of provisions in more than 200 interstate peace agreements. Bell and O’Rourke (2009) greatly increased the number of topics and agreements coded on transitional justice. The Peace Accords Matrix project coded 51 types of provisions in 34 comprehensive peace agreements after 1989 (these 34 CPAs subsume over 100 partial agreements) (Joshi & Darby, 2013). Mallinder (2018) introduced a suite of variables on 289 post-conflict amnesties from 1990 to 2016. Bell and Badanjak (2019) introduced coding for 225+ topics in peace process related documents. Fontana et al. (2021a) code 40 provisions in 286 accords in five dimensions of power-sharing. These data allowed researchers to identify the basic functions that peace agreements must fulfill to manage the chaotic transition from war to peace. However, a key limitation was that second-generation data could not account for the significant variation in how, or whether, provisions were implemented across different agreements.
The third generation of research emphasized implementation after the signing of peace agreements. Hoddie and Hartzell (2003) offered cross-sectional implementation data on military reform in 33 peace agreements. Also, in the area of power-sharing, Jarstad and Nilsson (2008) included cross-sectional implementation data on military, political, and territorial power-sharing in 83 agreements after 1989. Ottmann and Vüllers (2015) introduced a suite of variables in 79 agreements on the implementation of political, military, economic and territorial power-sharing provisions (1989–2006). Joshi et al. (2015) offered time-series data on the implementation of 51 types of provisions in 34 comprehensive peace agreements after 1989. Prorok and Cil (2021) introduced cross-sectional implementation scores on 31 types of provisions in 68 agreements. This shift allowed scholars to examine the efficacy of peace agreements as a function of two interrelated factors: the content of the accord and the extent to which its provisions were put into practice. Third-generation data had its own limitations in that provisions, while useful as broad categories, often encompassed smaller, operational components that were critical to their success. Without examining these micro-level components, it was difficult to assess whether provisions were truly fulfilling their intended functions.
The dataset introduced in this article marks the potential beginning of a fourth generation of peace agreement data, one that focuses on the operational effectiveness of the component programs working beneath the provision level. Over the past three decades, research has significantly advanced our understanding of the essential functions that peace agreements must perform to effectively manage the transition out of armed conflict. To end civil wars and prevent recurrence, peace agreements must: (a) end direct hostilities and dissolve military rivalries, (b) restore some degree of normalized political relations, and (c) address the root causes of future rebel recruitment (Joshi & Quinn, 2017).
These functions are carried out by integrated sets of provisions. However, the success of these provisions depends on the micro-implementation processes within them, as well as the timing of implementation. To track these processes in real time, a new methodology was needed—one capable of capturing the granularity and complexity of implementation at the operational level. In the following sections, we detail this methodology and its application.
Dataset Description
Unit of Analysis: Stipulation
The parties to the Colombian Peace Agreement tasked the Peace Accords Matrix Project with developing a methodology for the systematic and impartial verification of every stipulation in the Colombian Agreement over a period of at least 10 years. Borrowing from contract law, the term “stipulation” refers to each specific obligation within the agreement. For an obligation to qualify as a stipulation, it must meet three criteria: discreteness, observability, and measurability.
Discreteness means that each stipulation’s implementation must be distinguishable from others, ensuring that it is only counted once. While stipulations may share interdependencies, they must allow for varying rates of implementation. For example, two stipulations might be sequentially interdependent, where the initiation of the second depends on the completion of the first. However, the success of the first does not guarantee the success of the second, ensuring that different aspects of the same process are not erroneously counted as multiple stipulations. Observability means that monitoring must detect actions by implementing agents without over-reliance on official testimony. High observability discourages misrepresentation by those responsible for implementation. Measurability means the process must generate quantifiable outputs to track progress toward fulfilling the mandate. After months of consultation and deliberation, this process resulted in the identification of 578 stipulations within the Colombian peace agreement. The final number of stipulations was shared with the signatory parties—a significant advance in peace process transparency and accountability in its own right.
General Implementation Assessment Framework
Our coding framework centers around the concept of implementation viability, which assesses whether an implementation process has a realistic potential to achieve its intended objectives. A process is deemed viable if it demonstrates operational competence and takes significant steps toward its goals. The evaluation of viability involves a comprehensive analysis of available resources, capacity-building initiatives, planning documents, stakeholder engagement, and progress toward well-defined objectives. Table 1 presents this general implementation assessment framework.
Table 1. General Implementation Assessment Framework.
Policy Instruments
This coding framework discerns between different types of policy instruments used to achieve implementation objectives. Below we provide coding guidelines for the largest categories of policy instruments: Laws and Regulations, Institution-building, Programs, and Appointing Personnel.
Coding Guidelines for Laws and Regulations
Many stipulations in the Colombian agreement require the creation or modification of laws and regulations. The legislative process typically follows well-defined steps, and the coding guidelines reflect these stages. Table 2 presents coding guidelines for laws and regulations.
Table 2. Coding Guidelines for Legislative Progress.
Example: Stipulation 510 mandates the National Government, in coordination with the judicial branch, to present a draft law promoting plea agreements with certain organizations. For the first 15 months, no implementation activity was observed (Score 0). In March 2018, the draft law was unveiled, shifting the score to 1. By June 2018, both the House and Senate approved the legislation, achieving full implementation (Score 3).
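The score trajectory in this example can be sketched as a monthly series in which the most recent score is carried forward until a new coding event changes it. This is a minimal illustration, not the project's actual tooling: the function names are hypothetical, and the December 2016 start month is an assumption taken from the observation window reported elsewhere in the article.

```python
from datetime import date


def month_range(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def monthly_scores(changes, start, end):
    """Carry the latest score forward month by month.

    `changes` maps the month in which a new score took effect to that score.
    Months before the first change are scored 0 (no activity observed).
    """
    series = {}
    current = 0
    for m in month_range(start, end):
        current = changes.get(m, current)
        series[m] = current
    return series


# Stipulation 510 as described in the text: draft law unveiled in March 2018
# (Score 1), approved by both chambers in June 2018 (Score 3).
# ASSUMPTION: coding starts in December 2016; the end month is arbitrary.
stip_510 = monthly_scores(
    {date(2018, 3, 1): 1, date(2018, 6, 1): 3},
    start=date(2016, 12, 1),
    end=date(2018, 12, 1),
)

months_at_zero = sum(1 for score in stip_510.values() if score == 0)
print(months_at_zero)  # 15, matching "the first 15 months" in the example
```

Under the assumed start month, the series holds Score 0 for exactly 15 months before the draft law appears, which is consistent with the narrative above.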
Coding Guidelines for Institution-building
Institution-building stipulations are categorized into two types based on their mandates. Type I Institutions have no predefined end date and are expected to operate indefinitely. Full implementation is achieved when the institution is fully operational and consistently produces outcomes aligned with its mandate. Table 3 presents coding guidelines for Type I institution-building.
Table 3. Coding Rules for Type I Institution-building.
Example: Stipulation 399 establishes the Special Unit for the Search for Missing Persons (UBPD). It moved to Score 1 in April 2017 when Decree Laws 588 and 589 created the UBPD. By February 2018, a budget was allocated and a director was in place, making it viable (Score 2). Full implementation was achieved in September 2018 when the UBPD had a director, budget, personnel, office space, operational plan, and had opened active missing persons cases.
Type II Institutions have a limited term and conclude when their objectives are achieved. These institutions are deemed viable upon becoming operational, indicating they are equipped to fulfill their responsibilities. Full implementation is not achieved until the institution successfully delivers the end product. Table 4 presents coding guidelines for Type II institution-building.
Table 4. Coding Rules for Type II Institution-building.
Example: Stipulation 394 establishes a gender working group within the Truth, Coexistence, and Non-Recurrence Commission. Implementation began in December 2017 (Score 1) with the selection of a commissioner and members. By November 2018, the group was operational (Score 2), and by June 2019, it had completed its mandate (Score 3).
Coding Guidelines for Programs
Programs are systematically categorized into two distinct types based on their respective mandates. Type I Programs are specifically designed to achieve operational status and produce measurable outputs within a broadly defined mandate. A critical distinction between programs and institutions lies in their operational requirements. While institutions often serve as the structural framework for governance and policy implementation, programs are typically more focused, goal-oriented initiatives that may or may not require independent operational structures. Table 5 presents coding guidelines for Type I programs.
Table 5. Coding Guidelines for Type I Programs.
It is essential to recognize that programs are frequently executed within pre-existing or newly established institutions. In many cases, multiple programs can be administered concurrently within a single institutional framework. As a result, coding rules for programs must account for the fact that programs do not invariably necessitate separate leadership structures, additional personnel, dedicated budgets, or distinct physical spaces. For example, when a new program is integrated into an existing institution—such as the Government of Colombia’s Post-conflict Management Unit (PMI)—it may not require an entirely new budget allocation. Instead, the institution may already possess an operational budget that can be reallocated to support the program within the relevant policy area.
In practice, the implementation of programs often involves the strategic reallocation of resources within existing institutional frameworks. This approach underscores the importance of flexibility and efficiency in program design and execution. Consequently, the coding rules for programs prioritize programmatic development over the creation of new institutional structures. This emphasis reflects the reality that programs are often embedded within broader institutional contexts and rely on the adaptive use of existing resources to achieve their objectives.
Example: Stipulation 3 establishes the Comprehensive Land Access Subsidy (SIAT) program. Initiated by Decree Law 902 of 2017, it remains non-viable (Score 1) due to unclear targeting criteria and the continued operation of the older SIRA program. Under the Rural Agricultural Planning Unit of the Ministry of Agriculture and Rural Development, SIAT is supposed to reform the existing Integral Subsidy for Agrarian Reform program (SIRA), which predates the peace accord. The Fifth Report of the Office of the Comptroller General concludes that SIAT lacks clear targeting criteria for defining the beneficiary population, and the old SIRA program remains operational.
In contrast to Type I Programs, which focus on operational outputs within a broad mandate, Type II Programs are designed to achieve specific, well-defined end goals, often within time-bound frameworks intended to conclude once their objectives are met. Table 6 presents coding guidelines for Type II Programs, which reflect this distinct purpose by emphasizing progress toward fulfilling their mandates and eventual closure. Type II Programs are characterized by their goal-oriented nature and finite timelines, unlike Type I Programs, which may continue indefinitely within institutional frameworks. This distinction has several implications: resource allocation for Type II Programs often requires dedicated resources to achieve goals within a defined period, though these resources are typically reallocated or repurposed after program closure; performance measurement relies on clear metrics and monitoring mechanisms to assess success against predefined targets; institutional flexibility is prioritized, as Type II Programs often leverage existing structures, minimizing the need for new bureaucratic frameworks; and sustainability considerations are less critical given their limited operational lifespan. These features highlight the unique challenges and opportunities associated with Type II Programs, which are tailored to address specific problems or deliver targeted outcomes efficiently and effectively.
Example: Stipulation 1 establishes the Land Fund, targeting the distribution of 3 million hectares over 12 years, benefiting landless campesinos. Initiated in May 2017, it remains non-viable (Score 1), having distributed only 243,888 hectares.
Table 6. Coding Guidelines for Type II Programs.
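For a Type II Program with a quantified end goal, such as the Land Fund in Stipulation 1, progress toward the target can be expressed as a simple share. The sketch below uses only the two figures quoted in the example; the function name is hypothetical, and the share is an illustrative indicator rather than the rule that determines the score, which follows the coding guidelines in Table 6.

```python
def progress_share(delivered, target):
    """Return the fraction of a quantified target delivered so far."""
    if target <= 0:
        raise ValueError("target must be positive")
    return delivered / target


# Figures quoted for Stipulation 1 (Land Fund).
TARGET_HECTARES = 3_000_000
DISTRIBUTED_HECTARES = 243_888

share = progress_share(DISTRIBUTED_HECTARES, TARGET_HECTARES)
print(f"{share:.1%}")  # 8.1%
```

Roughly 8% of the 3 million hectare target had been distributed, which is consistent with the stipulation remaining at Score 1.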
Coding Guidelines for Personnel Selection
In policy and governance, personnel are a critical policy instrument when deliberately deployed to execute policies, programs, and initiatives. Their qualifications, skills, and insights significantly influence the interpretation, formulation, and execution of policies. The Colombian accord places significant emphasis on the selection and appointment of individuals to serve on commissions, working groups, courts, and other organizational structures. This ensures that the right individuals are entrusted with advancing the accord’s objectives. The coding rules for the selection of personnel can be seen in Table 7:
Table 7. Coding Guidelines for Selecting Personnel.
Example: Stipulation 410 mandates the selection of a director for the Special Unit for the Search for Persons deemed as Missing (UBPD) to be chosen by the Mechanism for the Selection of Magistrates for the Special Jurisdiction for Peace based on the qualifications developed with the International Committee of the Red Cross and the International Commission on Disappeared Persons. In April 2017, the stipulation moved to Score 1 when the selection committee convened and established criteria. In August 2017, a list of applicants was published (Score 2), and in September 2017, Luz Marina Monzón Cifuentes was appointed as director, achieving full implementation (Score 3).
Implementation Reversals
The dataset also accounts for implementation reversals, which refer to the deliberate dismantling or regression of previously established implementation efforts. These reversals are critical to track, as they reflect shifts in political will, policy priorities, or institutional capacity that can undermine progress toward programmatic goals. For example, Stipulation 324 addresses legislative adjustments and stipulates: “The Government will make the necessary legislative and regulatory adjustments to temporarily waive criminal action proceedings and sanctions against small farmers who have been linked to illicit crop cultivation.” This stipulation aims to provide legal protections for small farmers, many of whom are economically vulnerable and have been coerced into illicit crop cultivation due to a lack of viable alternatives. Initially, this stipulation received a score of 1 in September 2019 when a draft bill was introduced in Congress, signaling the beginning of legislative efforts to address the issue. However, the process experienced a significant setback when the bill was withdrawn in June 2021, resulting in a regression to a score of 0. This reversal highlights the fragility of policy implementation in contexts where political or institutional support is inconsistent.
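Because scores are recorded monthly, a reversal appears in the data as a month-to-month drop in a stipulation's score. The sketch below shows one way to flag such drops; the function name is hypothetical, and the series for Stipulation 324 is stylized (the number of months at each score is illustrative, not taken from the dataset).

```python
def find_reversals(scores):
    """Return (position, previous_score, new_score) for each month-to-month drop.

    `position` is the index in `scores` of the month holding the lower score.
    """
    return [
        (i, prev, new)
        for i, (prev, new) in enumerate(zip(scores, scores[1:]), start=1)
        if new < prev
    ]


# Stylized series for Stipulation 324: Score 0, then Score 1 once the draft
# bill is introduced, then back to Score 0 after the bill is withdrawn.
# ASSUMPTION: month counts are illustrative only.
series_324 = [0, 0, 1, 1, 1, 0, 0]

print(find_reversals(series_324))  # [(5, 1, 0)]
```

A series that only rises or stays flat returns an empty list, so the same check can be run across all stipulations to list every reversal in the panel.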
Aggregating and Disaggregating Implementation Scores
The methodology of the Colombia Barometer Initiative allows for flexible analysis of implementation at various levels. Figure 1 illustrates the primary methods of organizing or deconstructing the 578 stipulations. The Colombian Accord can be analysed along two distinct categorization paths: one aligns with the natural sequencing of issues in the accord, while the other aligns with academic categorizations. As seen in Figure 1, the Colombian Accord is divided into six Chapters or Points, 18 Themes, and 70 Subthemes.
Figure 1. Ways of Aggregating and Disaggregating Stipulations.
Figure 2 presents monthly implementation progress for all 578 stipulations. Over half of the accord’s stipulations are either completed or on track to be completed, while the remainder face significant uncertainty. The graph reveals a rapid increase in fully implemented stipulations during the first year, followed by a gradual leveling off.
Figure 2. Monthly Implementation Progress of 578 Stipulations in the Colombian Agreement.
Figure 3 breaks down implementation progress across the six Points of the Colombian agreement. Notably, elements addressing short-term risks, such as Points 3 (“End of Conflict”) and 6 (“Implementation and Verification”), exhibit higher levels of implementation. In contrast, progress on structural and long-term peacebuilding objectives, such as addressing root causes of conflict, has been slower and more challenging.
Figure 3. Implementation of Stipulations by the Six Points of the Agreement.
Figure 4 provides a granular view of implementation across 70 subthemes, calculated as the mean score for stipulations within each subtheme. This analysis underscores the disparity between short-term and long-term peacebuilding objectives, with many subthemes related to structural reforms lagging behind.
Figure 4. Mean Implementation Score for 70 Accord Subthemes (December 2016 to October 2023).
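The subtheme-level means described above amount to grouping stipulation scores by a category label and averaging within each group. The following sketch illustrates the calculation on hypothetical records; the stipulation identifiers, labels, and scores are invented for illustration, and the same function applies to any grouping key (Point, Theme, Subtheme, policy domain, or policy instrument).

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL records for one month: identifiers, labels, and scores are
# placeholders, not values from the dataset.
records = [
    {"stipulation": 9001, "point": "Point X", "subtheme": "Subtheme A", "score": 1},
    {"stipulation": 9002, "point": "Point X", "subtheme": "Subtheme A", "score": 2},
    {"stipulation": 9003, "point": "Point Y", "subtheme": "Subtheme B", "score": 3},
    {"stipulation": 9004, "point": "Point Y", "subtheme": "Subtheme B", "score": 3},
    {"stipulation": 9005, "point": "Point Y", "subtheme": "Subtheme C", "score": 0},
]


def mean_score_by(rows, key):
    """Group rows by `key` and return the mean score of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["score"])
    return {label: mean(scores) for label, scores in groups.items()}


by_subtheme = mean_score_by(records, "subtheme")
by_point = mean_score_by(records, "point")
print(by_subtheme)
print(by_point)
```

Averaging treats the 0 to 3 scale as interval-level, which is the convention the figure itself adopts; a distribution of stipulations across score levels is an alternative summary when that assumption is a concern.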
Transitioning now to academic classifications, the 578 stipulations have been organized into 31 policy domains based on categories from the Peace Accords Matrix Project (Joshi et al., 2015; Joshi & Darby, 2013). Figure 5 illustrates these policy domains and the current state of implementation, revealing significant variation in progress across different policy areas. Highly implemented stipulations include areas such as ceasefire arrangements, verification mechanisms, constitutional reforms, and demobilization efforts. In contrast, policy domains like decentralization, civilian administration reform, reparations, minority rights, women’s rights, human rights, and economic and social development exhibit lower levels of implementation. Notably, decentralization emerges as the most under-implemented domain within Colombia’s peace process, which is particularly striking given the central role that the concept of territorial peace played in both the negotiations and the accord’s pedagogical framework (Cairo et al., 2018). Furthermore, a significant portion of Civilian Administration Reform stipulations also advocate for the devolution of centralized powers, underscoring the interconnectedness of these underperforming domains.
Within the provisions of the Colombian Peace Accord, a wide array of policy instruments come into play, as illustrated in Figure 6. Here, the term policy instrument refers to the tools or techniques employed by implementing authorities to produce social change (Acciai & Capano, 2021; Salamon, 2002). These instruments are cross-cutting and can be applied across various policy domains. For example, stipulations categorized under the policy instrument “increasing accountability” appear across multiple policy domains in the Colombian Accord, including Civil Administration Reform, Decentralization, Economic/Social Development, Electoral/Political Party Reform, Human Rights, Paramilitary Groups, Reintegration, Reparations, Verification, and Education Reform.
Figure 5. Policy Domains Sorted Lowest to Highest Mean Implementation Score.
Figure 6 reveals dynamic shifts in the types of instruments employed over time, reflecting the evolving nature of the implementation process. Approximately 80% of stipulations completed within the first year relied on just five instrument subtypes: The establishment of monitoring mechanisms, appointment processes, formation of commissions, law enactment, and procedural actions. These instruments are well-suited for the early stages of implementation, where quick wins and foundational structures are created. However, after the first year, the focus shifted significantly. Stipulations that extended beyond the initial phase primarily relied on more complex instruments, such as institution-building, planning, public awareness initiatives, investigative processes, and consultative processes.
Policy Instruments Sorted Lowest to Highest Mean Implementation Score.
Transversal Themes: Gender, Ethnicity, and Territory
The Colombian Peace Accord incorporates three transversal themes (gender, ethnicity, and territory), which serve as intersectional frameworks for implementation. These themes are not standalone stipulations but rather supplementary requirements that cut across multiple stipulations. The accord includes 130 stipulations that call for a gendered implementation approach. These stipulations aim to promote equitable representation and participation, gendered programming, and disaggregated statistics based on gender. The accord also includes 80 stipulations that require a consultative ethnic approach. These stipulations either focus on territories recognized as belonging to ethnic communities or explicitly mention Indigenous, Black, Afro-descendant, Raizal, Palenquero, or Roma communities. Finally, the accord features 100 stipulations that emphasize the territorial approach, necessitating citizen engagement from specific regions.
Transparency, Evidence, and Sources
Over the past eight years, the Colombia Barometer Initiative has gathered a wealth of information from a diverse range of sources, including government agencies, domestic and international organizations, media outlets, and through direct interactions with stakeholders involved in the implementation process across Colombia. A central objective of this project has been to establish a robust standard of transparency regarding evidence and sources.
To achieve this, we developed a comprehensive database dedicated to archiving implementation events and source documents. The database is managed using MAXQDA, a qualitative data archival and analysis tool. Within MAXQDA, implementation narratives are documented and segments of text in sources containing relevant evidence are tagged to specific stipulations. Currently, the database includes approximately 12,000 source documents and 26,000 tagged text segments linked to stipulations. For each of these segments, the project records the location of the implementation event at the departmental and municipal levels. This granularity enables detailed subnational analyses, allowing researchers and policymakers to identify regional variations in implementation progress.
Finding Information on a Specific Stipulation in MAXQDA
Figure 7 provides a detailed look at the MAXQDA interface when searching for implementation data at the stipulation level, showcasing its utility for qualitative researchers. For example, initiating a search with the term “185” (located at the top of the bottom left window) retrieves the entry “185: Create a Special Electoral Mission…” By selecting this option, users can access the “retrieved segments” tagged to stipulation 185, which are displayed in the bottom right window. These segments provide rich, context-specific data for in-depth qualitative analysis.
Searching MAXQDA for Stipulation 185.
A unique feature of MAXQDA is its ability to reveal connections between stipulations, which is particularly useful for understanding the complexity of policy implementation. In the top right window, vertical lines adjacent to the highlighted segment indicate that stipulations 186, 499, and 536 are also tagged to the same segment (#07_24_17e_ED). This interconnectedness allows researchers to explore how different stipulations overlap, interact, and influence one another within the implementation process. For qualitative researchers, this feature opens up opportunities to examine cross-cutting themes, synergies, and tensions between policy elements, providing a more holistic understanding of the peace process. Moreover, MAXQDA’s capacity to link related stipulations enables researchers to trace the evolution of implementation efforts over time and across different policy domains. This is particularly valuable for identifying causal pathways, unintended consequences, and contextual factors that shape implementation outcomes.
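To make the idea of shared segments concrete, the following is a minimal sketch of how co-tagged stipulations could be recovered from a flat export of segment tags. It does not use or describe the MAXQDA interface or file format. The `(segment_id, stipulation)` pair layout, the function name `co_tagged`, and the second segment in the sample data are assumptions introduced purely for illustration; only the first segment mirrors the example described above.

```python
from collections import defaultdict

# Hypothetical export of tags as (segment_id, stipulation_number) pairs.
# The first segment mirrors the example in the text; the second is invented.
tags = [
    ("#07_24_17e_ED", 185),
    ("#07_24_17e_ED", 186),
    ("#07_24_17e_ED", 499),
    ("#07_24_17e_ED", 536),
    ("hypothetical_segment_A", 185),
    ("hypothetical_segment_A", 12),
]


def co_tagged(tags, stipulation):
    """Return {other_stipulation: number_of_shared_segments} for one stipulation."""
    # Group stipulation numbers by the segment they are tagged to.
    by_segment = defaultdict(set)
    for segment_id, stip in tags:
        by_segment[segment_id].add(stip)

    # For every segment that carries the target stipulation,
    # credit each other stipulation on that segment with one shared segment.
    shared = defaultdict(int)
    for stips in by_segment.values():
        if stipulation in stips:
            for other in stips - {stipulation}:
                shared[other] += 1
    return dict(shared)


links = co_tagged(tags, 185)
print(sorted(links.items()))
```

In this toy data, stipulation 185 shares one segment each with stipulations 12, 186, 499, and 536. The same grouping logic would apply to any tabular export in which each row links one segment to one stipulation.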
Research Application: Evaluating a New International Accompaniment Approach
The Colombian Peace Agreement includes 12 distinct stipulations (535–546) that assign international organizations the task of supporting the implementation of specific commitments based on their thematic expertise. These stipulations represent an innovative approach intended to leverage the resources and expertise of international actors to enhance the implementation process. In MAXQDA, segments of text describing the actions taken by these organizations are tagged to both the relevant international accompaniment stipulation and the stipulation receiving implementation support.
For example, stipulation 185, which calls for the establishment of a Special Electoral Mission, is connected to stipulation 536 through shared segment #07_24_17e_ED. Stipulation 536 assigns responsibility to the Netherlands Institute for Multiparty Democracy (NIMD) for supporting aspects of point 2 of the agreement. NIMD accepted this role and was entrusted with establishing the Special Electoral Mission, effectively serving as its Technical Secretariat.
With thousands of shared segments in MAXQDA connecting international accompaniment actions to specific stipulations, we are able to address two critical questions: Do these innovative accompaniment approaches contribute to improved implementation outcomes? Should they be replicated in future negotiations and agreements? To answer these questions, we developed a research design to evaluate the impact of international accompaniment on stipulation-level implementation in the Colombian accord.
Research Design
To evaluate the impact of international accompaniment on stipulation-level implementation, we utilized MAXQDA to extract all shared segments between the 12 international accompaniment stipulations (535–546) and the remaining 566 stipulations in the agreement (578 total stipulations minus 12 accompaniment stipulations). Our primary independent variable is the cumulative number of times each of the 566 stipulations is tagged to an international accompaniment stipulation through shared segments. This variable, which we term international actor accompaniment, is continuous and ranges from 0 to 51 tagged segments. To account for the skewed distribution of tags, we applied a log transformation. The dependent variable consists of annual implementation scores (ranging from 0 to 3) for each of the 566 stipulations. For this analysis, implementation progress is assessed at the end of each calendar year for the first five years. While this study focuses on a limited application of the data, it lays the groundwork for more advanced time-series analyses. Future researchers will have the opportunity to build on this foundation, developing more comprehensive models to explore the long-term dynamics of Colombia’s implementation process.
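The construction of the independent variable can be sketched as follows. This is an illustrative reading of the description above, not the project's actual code. Two choices are assumptions: each shared segment contributes one count regardless of how many accompaniment stipulations appear on it, and the transformation is log(1 + count) so that stipulations with zero shared segments remain defined. The sample tags and the names `accompaniment_counts` and `log_transform` are hypothetical.

```python
import math
from collections import defaultdict

# Stipulations 535-546 are the international accompaniment stipulations (from the text).
ACCOMPANIMENT = frozenset(range(535, 547))


def accompaniment_counts(tags):
    """Count, per non-accompaniment stipulation, the segments it shares
    with at least one accompaniment stipulation (one count per segment)."""
    by_segment = defaultdict(set)
    for segment_id, stipulation in tags:
        by_segment[segment_id].add(stipulation)

    counts = defaultdict(int)
    for stipulations in by_segment.values():
        if stipulations & ACCOMPANIMENT:
            for stipulation in stipulations - ACCOMPANIMENT:
                counts[stipulation] += 1
    return dict(counts)


def log_transform(count):
    """log(1 + count); the +1 offset is an assumption so zero counts stay defined."""
    return math.log1p(count)


# Hypothetical tag export: (segment_id, stipulation_number) pairs.
example_tags = [
    ("seg_a", 185), ("seg_a", 536),
    ("seg_b", 185), ("seg_b", 536), ("seg_b", 186),
    ("seg_c", 186), ("seg_c", 12),  # no accompaniment stipulation on this segment
]

counts = accompaniment_counts(example_tags)
transformed = {s: log_transform(n) for s, n in counts.items()}
print(counts)
```

Stipulations that never share a segment with stipulations 535–546 simply do not appear in `counts` and would be assigned zero before transformation.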
Results
The findings, presented in Table 8, reveal that the international accompaniment variable demonstrates a positive and statistically significant impact across all years analysed (p < .001). These results provide strong empirical evidence for the effectiveness of international accompaniment as a peacebuilding strategy and highlight its potential as a model for future agreements.
Effects of International Accompaniment on the Implementation of Stipulations in Colombian Agreement.
Robust standard errors are in parentheses.
Two-tailed tests.
*p < .05, **p < .01, ***p < .001.
Figure 8 plots the coefficients for our primary independent variable (international accompaniment) and five control variables, with estimates derived from ordinal generalized linear models. Significance is indicated by coefficients that do not intersect the red zero line, providing a visual representation of the strength and direction of each variable’s impact. A clear and consistent pattern emerges from the analysis: Stipulations that received greater international accompaniment demonstrated significantly higher rates of implementation compared to those with limited or no international involvement. The number of international accompaniment connections exhibited high statistical significance in each year, underscoring the robustness of this relationship.
Coefficients Plot on Effects of International Accompaniment on the Implementation of Stipulations in Colombian Agreement.
Notably, the impact of international accompaniment was most pronounced during the first year of implementation. A one-unit increase in international engagement (equivalent to one shared segment with stipulations 535–546) increased the odds of achieving a higher implementation category by 0.927 (Model 1 in Table 8). This finding points to the important role of early international support in jumpstarting the implementation process and setting a positive trajectory for long-term success.
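Because coefficients from ordinal models are commonly reported on the log-odds scale, it can help to see how such a coefficient maps onto an odds ratio. The sketch below assumes, for illustration only, that the 0.927 figure quoted above is a log-odds coefficient; the text does not state the scale, and if the value is already an odds ratio the conversion does not apply. The function names are hypothetical.

```python
import math


def odds_ratio(log_odds_coef):
    """Convert a log-odds coefficient into a multiplicative odds ratio."""
    return math.exp(log_odds_coef)


def percent_change_in_odds(log_odds_coef):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(log_odds_coef) - 1.0) * 100.0


# Value quoted in the text; treated here as a log-odds coefficient (assumption).
coef = 0.927
print(round(odds_ratio(coef), 2))
print(round(percent_change_in_odds(coef), 1))
```

Under that assumption, a one-unit increase in the predictor would multiply the odds of being in a higher implementation category by roughly 2.5. Note that when the predictor has been log-transformed, one unit refers to the transformed scale rather than to a single additional shared segment.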
Figure 9 visually represents the marginal effects of international accompaniment on the likelihood of any given stipulation progressing from one implementation category to another, based on Model 1. The results show that as the level of international accompaniment increases for a given stipulation, the probability of achieving full implementation rises significantly, while the likelihood of remaining at a lower stage decreases. Specifically, the margin increases from 0.114 (for stipulations with zero international accompaniment shared segments) to 0.930 (for stipulations with five international accompaniment shared segments), a remarkable gain of 81.6 percentage points, or roughly an eightfold rise. This result suggests that sustained and substantial support from international actors can have a transformative impact on implementation outcomes for stipulations that receive it.
Marginal Effects of International Accompaniment on Stipulation Implementation.
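The two margins reported for Figure 9 (0.114 and 0.930) can be compared on an absolute or a relative scale, and the two give very different numbers. The short sketch below works through the arithmetic. The function names are illustrative, and the inputs are the two values quoted in the text.

```python
def absolute_change_pp(p_start, p_end):
    """Absolute change between two probabilities, in percentage points."""
    return (p_end - p_start) * 100.0


def relative_change_pct(p_start, p_end):
    """Relative change between two probabilities, as a percentage of the start value."""
    return (p_end - p_start) / p_start * 100.0


def fold_change(p_start, p_end):
    """Ratio of the end probability to the start probability."""
    return p_end / p_start


# Margins quoted in the text for zero and five shared segments.
p_zero, p_five = 0.114, 0.930

print(round(absolute_change_pp(p_zero, p_five), 1))   # percentage points
print(round(relative_change_pct(p_zero, p_five), 1))  # percent of the baseline
print(round(fold_change(p_zero, p_five), 2))          # multiplicative factor
```

The absolute change is 81.6 percentage points, while the relative change is about 716% of the baseline, which corresponds to a margin a little over eight times its starting value.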
Conclusion
The peace agreement signed between the GoC and the FARC-EP in November 2016 marked the beginning of an implementation process unprecedented in its scale. From the outset, the accord was recognized for its innovative approach to implementation planning and its emphasis on verification (Herbolzheimer, 2016). A key innovation was the introduction of impartial, academic verification aimed at providing timely, empirical, and comprehensive coverage of implementation. The dataset presented in this article represents a step toward fulfilling this mandate, offering a replicable methodology that has the potential to help standardize peace agreement monitoring and verification in the future.
Looking Ahead: Future Research
This dataset opens several promising avenues for future research and policy innovation, offering opportunities to deepen our understanding of peace accord implementation and its broader societal impacts. Each of these avenues addresses critical gaps in the literature and provides practical insights for policymakers and practitioners.
Understanding Micro-implementation
The dataset enables researchers to delve into the micro-level dynamics of implementation, identifying factors that contribute to variation in outcomes. These factors may include monitoring challenges, such as inadequate oversight mechanisms; resource limitations, such as budget constraints or insufficient personnel; and relational issues, such as coordination problems between government units or between state and non-state actors. By systematically analysing these barriers, researchers can develop tailored support systems to address specific implementation challenges. This line of research can inform the design of more adaptive and context-sensitive implementation frameworks, ultimately enhancing the effectiveness of peace agreements.
Post-accord Comparative Policy Studies
While many provisions in peace agreements are implemented, they often fail to deliver their intended benefits, raising questions about their long-term impact. This dataset provides a foundation for post-accord comparative policy studies, enabling researchers to evaluate the downstream sociological, political, and economic impacts of specific stipulations. By identifying successes and failures, researchers can contribute to improving policy design and implementation. If this methodology is replicated in other peace processes in the decades to come, comparative data will expand, allowing for cross-contextual analyses. These comparisons can reveal patterns and principles that are transferable across settings, contributing to a more robust understanding of policy design and implementation in post-conflict recovery.
Public Perception and Peacebuilding
Public perception plays a critical role in shaping the legitimacy and sustainability of peace processes. By combining implementation scores with survey and perception data, researchers can explore how public attitudes toward peace reforms evolve over time. For example, how do shifts in public opinion influence the political will to implement contentious reforms? This approach can provide insights into the consistency, stability, and influence of public attitudes on peacebuilding efforts. Understanding these dynamics is essential for designing inclusive and participatory peace processes that resonate with local populations.
Linguistic and Textual Analysis
Finally, the dataset’s integration of qualitative and quantitative data makes it particularly well-suited for linguistic and textual analyses of peace agreement texts and source materials. Researchers can examine how language choices, such as framing, terminology, and rhetorical strategies, shape the interpretation and implementation of peace agreements. For instance, how do different actors interpret ambiguous terms, such as “enhance,” “support,” “reconciliation,” or “territorial”? How do textual choices reflect power dynamics or negotiation strategies during the drafting process? Such analyses can shed light on the interplay between language and policy outcomes, revealing how discursive practices influence implementation strategies and stakeholder behavior. This line of inquiry can also inform the design of more precise and actionable peace agreement texts, reducing the risk of misinterpretation or manipulation during implementation. In sum, this new dataset can help researchers and policymakers design, implement, and evaluate remedial responses in post-accord settings, offering insights into the complex interplay between policy design, public perception, and societal impact.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Colombia Barometer Initiative was supported by the United States Department of State’s Bureau of Conflict and Stabilization Operations, Humanity United Foundation, the UN Multi-Party Trust Fund, and the European Union.
