Scott Chaplowe - Confronting Obsessive Accountability Disorder

Confronting Obsessive Accountability Disorder (OMD) in Evaluation

Scott Chaplowe. December, 2024

This piece was written for the Independent Philanthropy Association South Africa (IPASA), and can be found on their website here.

Introduction

The rhetoric of transformation and the concepts which underpin it speaks to our shared aspiration and hope that humanity is capable of BIG change in response to the inextricably complex mess it confronts. Transformation entails radical systems change, upending business-as-usual, and with that the status-que and our sense of stability. It invites us to embrace bold and courageous conversations across all aspects of our work, including ‘evaluation-as-usual.’ As a field that straddles both theory and practice, evaluation is uniquely positioned to contribute to transformational change, but this potential depends on its ability to transform from within.

The transformational agenda has ushered in an appreciation and respect for complexity and systems thinking in international development and its evaluation, stressing that the systems in which we operate – environmentally, socially, and economically – are dynamic, interconnected, and unpredictable. In other words, these systems, and the change we seek to leverage cannot be fully understood through the analysis of their components in isolation. Yet, despite the heightened focus on and acknowledgment of complex systems analysis in evaluation, bad habits are hard to change, especially when institutionalized in the bureaucracies and marketplace that shape and govern evaluation.

Confronting the Obsessive Accountability Disorder (OAD)

Accountability in itself is invaluable, helping to ensure responsible, transparent, and effective investment. However, an Obsessive Accountability Disorder (OAD) occurs when accountability to interventions and their metrics overshadows and detracts from evaluating broader and more consequential impacts and implications on human and natural systems.

What does OAD look like in practice? It arises when funders and implementers become so fixated on the outputs and outcomes captured in their Theory of Change (ToC) and monitoring and evaluation plan agreed upon when the project was conceptualized, that they overlook the broader impact of the intervention, which may or may not link up with what they thought was possible during project design. This mindset overlooks what really happens, preoccupied with finding evidence of ‘success’ (or failure) according to narrowly-focused, pre-defined indicators. OAD contributes to risk-adverse attitudes and a fear of failure that impede the experimentation essential for emergent learning and innovation necessary for transformational change work.

OAD leads a family of other bad habits that hamper evaluation’s transformational potential, summarized below. True to systems dynamics, these habits are interrelated and self-reinforcing, and have largely become evaluation-as-usual, (although there are exceptions – e.g., see the Ellen MacArthur Foundation’s Systems and the circular economy. If funders want to do transformational work, they need to be willing to critically examine their evaluation habits and their impact on grantees.

Project Box – Evaluation has been largely shaped by a project-centric mindset, confined to discrete interventions delivered by single agencies and funded by individual donors. This project mentality boxes in an intervention by fixating on linear intervention designs where there is a clear and simple line of causality. We find this in our familiar friends, the logic model, results frameworks, and theory of change, resulting in the monitoring and evaluation (M&E) of their pre-determined results, targets, and performance metrics (KPIs). However, such siloed thinking is reductionist rather than systemic, excluding unintended consequences (positive or negative) on the broader system in which interventions are delivered. While it is convenient to have neatly defined projects, change does not happen in closed systems – boxes – but in open systems that transcend time, place, and specialized interests. And the dynamics of change is not linear.

The diagram below (Chaplowe and Hejnowicz 2021) represents the linear results chain in an intervention’s design, from inputs to intended impact (colored green). This design typically becomes the focus of M&E to the exclusion of other influencing factors and unintended positive and negative consequences (colored red).

Temporal Trap – Reality is complex and does not behave according to preconceived timelines and payment schedules based on how an intervention’s design predicts change over time. Instead, a myriad of emergent variables can come into play, influencing the intended intervention logic of our carefully designed interventions, (poignantly exemplified by COVID-19). Consequently, evaluation needs to be more agile and responsive, providing immediate feedback that enables interventions to course correct. Such monitoring as evaluation is more iterative and ongoing compared to the conventional recipe of a baseline study, midterm review and final evaluation.

Moreover, the temporal trap extends beyond the project/program timeframe to longer-term impacts after funding and implementation for an intervention ends. Conventional evaluations that rely on summative (final) evaluations at the project/program end without ex-post evaluations conducted years later fail to assess longer-term impacts, neglecting potential unintended consequences over time.

Quantitative Box – Also dubbed Obsessive Measurement Disorder (Natsios 2010, and the inspiration of my use of OAD), occurs when there is a fixation on quantitative metrics and methods to establish objective certainty when such certainty does not exist. Quantitative measurement undeniably has immense value in evaluation, but a tyranny of metrics (Muller 2018) can reinforce a narrow scope of inquiry on pre-determined indicators. While it is often asserted that "What gets measured gets done," another witticism cautions that "Not everything that can be counted counts, and not everything that counts can be counted." Reality is not a binary concept that can be wrapped up neatly in numbers captured by KPIs and displayed on dashboards.

The quantitative box is characterized by experimental and quasi experimental designs, which employ statical analysis to determine the degree to which an intervention has had impact (attribution analysis). For instance, randomly controlled trials (RCTs) compare the effects of an intervention on a randomly selected sample of the target population against a group that did not receive it. However, by focusing on individual outcomes and controlling for observable and non-observable characteristics, RCTs can overlook the effects of multiple inputs on various outcomes typical of complex systems, potentially missing unintended consequences beyond the intervention logic. Their limitations remind us that the same complexities that make development hard to achieve make it hard to measure.

Historical origins and consequences

In summary, despite the increasing enthusiasm for complexity-responsive and systems-savvy methods, ‘evaluation-as-usual’ is largely commissioned and employed for accountability purposes, retarding its potential to promote the adaptive learning and innovation required for transformational change. OAD is characterized by a fixation on quantitative methodologies to assess the degree to which an intervention achieved its predetermined results. This may make the measurement of and accountability to these results more doable, but it is ultimately reductionist, restricting inquiry and hindering emergent learning and innovation essential for transformational change work.

OAD, along with the other bad habits it fuels, reminds us that evaluation is embedded in the political economy, influenced by the same market and power dynamics that shape the interventions under evaluation. Box 1 summarizes the historical roots of OAD. To a large extent, the current practice of international development evaluation matured in the 1990s with the ascendancy of results-based management (RBM) as the dominant paradigm. RBM extended the “principle-agent theory” in the private sector to the public and civic sectors, later known as the New Public Management (Vedung 2010 and Muller 2018). It is a business model premised on the belief that the interests of agents will diverge from the principals unless performance metrics are used to report to the principals to ensure the agents perform accountably. Historically, this trend coincided with the neoliberalism of Reaganomics and Thatcherism, and the belief that it was desirable to create marketlike conditions within the public and civic sectors so they could be run more like a business.

An especially troubling outcome is that OAD undermines the core value proposition of evaluation. The most widely accepted definition of evaluation, from evaluation pioneer Michael Scriven, is deceptively straightforward: to “judge the merit, worth, and significance of things.” However, when evaluation seeks to replace judgement with standardized measurement, it risks consigning itself to a descriptive, tick-box, accounting exercise that steers clear of judgment rather than providing it. It is paradoxical that Michael Scriven later coined the term ‘valuephobia’ to caution against this trend that erodes evaluation’s fundamental tenets. Perversely, “the snake of accountability eats its own tail,”(Muller 2018).

Conclusion

In closing, accountability is not inherently bad, but an Obsessive Accountability Disorder is, handicapping evaluation’s transformational potential. It reinforces a narrow focus on whether things are done right versus whether the right things are being done in the first place. Like any addiction, confronting the problem is the first step towards recovery, carrying the kernel of its solution. OAD and its associated bad habits point toward the inner transformation evaluation requires if it is to contribute to the broader transformational agenda.