This blog is about evaluation. Many of you have been subjected to an evaluation or will be in the future, and some of you will be part of an evaluation team in some setting.
Our guess is that many of you have felt frustrated by evaluations as they have been carried out, or by their frame of reference. However, we claim that this need not be the case: evaluations, well understood, can and should be a positive force on your path to future development.
To introduce a metaphor: we think of evaluation as a way station in partially unknown terrain, where you get a chance to reflect on your (and your colleagues’) journey and your path going forward. (In Norway, we have the concept of “betjente turisthytter”, staffed hiking cabins where we are invited to take a rest on our hike through the mountains – like the one in the featured image, Skogadalsbøen, in western Jotunheimen.)

We want to explain our notion of evaluation through some conceptual clarifications which we believe are often lost along the path, but are crucial to the positive functions of evaluation.

Evaluations that are actually evaluations (not other things masquerading as evaluation) answer questions such as “what does ‘good’ look like?”
Values are essential to answering these types of questions. Values are held by individuals; in evaluation they need to be shared. However, often what is shared are just the value words (e.g., effectiveness, efficiency, worth, significance), which lead people to think they agree about the underlying ideas when, in fact, they may not.
The evaluator has a unique role in pointing out the values and their meanings – or helping the group to articulate/generate/surface those: “hey you, stop a minute, let’s reflect on where you’ve come from or where you want to go!”
When evaluation practice is divorced from surfacing and discussing values, it can become a bureaucratic mechanism or a basic research effort. In either case, evaluation becomes less relevant, less powerful, and less valuable to those who care about whatever is being evaluated. Below we discuss what evaluation is and how it works, what values are, and how we can identify and use values in evaluation, so that our “rest stop” with evaluation can provide the most benefit.


What is evaluation and how does it work?
Evaluation is often about progress, which is essentially a value-based measure of how you have fared so far and how you may envision your further development. In practice, this means relating a given past distance or accomplishment to an established value or set of values.
We can conceive of progress in two ways: i) backward-looking or ii) forward-looking. The backward-looking conception is based on a fixed set of values (in science, e.g., the traditional notion of proximity to the truth).
Your progress so far will be assessed in relation to your distance from these values, which define the goal of your journey. The forward-looking version of progress is the dynamic one, where you continuously move away from the initial starting point (in science, e.g., a state of ignorance) towards relative improvements depending on current needs and purposes.
Philosophers also speak here of an evolutionary notion of progress (e.g., Kuhn, 1962). Thus, in the forward-looking conception the goal is not fixed beforehand but is shaped by emerging needs and sub-goals along the way. Your future path may divide into different directions, and you may progress on each of these paths (a further elaboration of this is found in Kaiser’s thesis, 1993).

At the point of evaluation, you can look at your progress from both backward and forward perspectives, which means the act of evaluation occurs in a liminal, in-between state. The process of evaluation needs to recognize where you have come from and what you thought you would gain along the way. But it also needs to prepare you for the alternative choices you face, to guide your next steps.
In this endeavour, values take on a crucial role. Let us explain this in a little more detail.

The task of evaluation is to provide a judgement about the goodness of something (or of different aspects of something): a given unit, a program, an institution, or a system.
The judgement comes from applying a values lens to facts. Where a research effort might report the strength of a causal claim – for instance, whether a program can really be credited with creating an impact, and how much of that impact there was – an evaluation will give a perspective on how valuable that impact is for different stakeholder groups. Was that impact any good for the program’s clients? Their families? The program staff? The environment?

Valuing illustration: developed by Mepham et al. (2006) and further developed by others (Kaiser et al., 2007; Kaiser & Forsberg, 2001).

Conducting evaluation systematically requires full description and full judgement of the thing that is the focus of the inquiry.
This process is captured in the four steps of the logic of evaluation: 1. define criteria, 2. set standards, 3. measure, 4. synthesise a judgement (Fournier, 1995; Scriven, 1991). (Or, if you’re an overachiever, nine steps; see Gullickson, 2020.) This logic underpins all inquiry that is actually evaluative. (Many inquiries say they are evaluation but are really just research – very valuable and important, but not evaluation.)
While the logic of evaluation is often used to look backwards, its nature is iterative – a look backwards informs what goodness looks like for the next version of the program, and shapes how that effort is designed.

Values come into the evaluation process in Steps 1, 2, and 4 of the logic. Values underpin the criteria (what does good look like for the thing being evaluated?), the standards (how good is the thing based on various aspects of each criterion?), and the synthesis (when we put the evidence together with the criteria and standards, what do we know about how good this thing really is?). However, the criteria must not be developed based only on what good looks like for the most powerful parties involved in the evaluation. Robust evaluation needs to consider various perspectives on good – underpinned by the values of the people who are involved with the thing being evaluated.
Evaluation that advances equity will prioritise the values of the most vulnerable stakeholders.
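
To make the four steps concrete, here is a minimal sketch in Python of how criteria, standards, measurement, and synthesis might fit together. This is our own toy illustration, not a published instrument: the criterion names, thresholds, and rating labels are invented for the example, and in practice Step 4 is a deliberative judgement rather than the arithmetic shortcut used here.

```python
from dataclasses import dataclass, field

@dataclass
class Criterion:
    """Step 1: a dimension of 'good' for the thing being evaluated."""
    name: str
    # Step 2: standards map rating labels to minimum measured levels.
    standards: dict = field(default_factory=dict)

def rate(criterion: Criterion, measurement: float) -> str:
    """Step 3 supplies a measurement; here we place it against the standards."""
    for label, minimum in sorted(criterion.standards.items(),
                                 key=lambda item: item[1], reverse=True):
        if measurement >= minimum:
            return label
    return "poor"

def synthesise(ratings: dict) -> str:
    """Step 4: combine per-criterion ratings into one overall judgement.
    A real synthesis weighs criteria deliberatively; averaging is a stand-in."""
    scale = ["poor", "adequate", "good", "excellent"]
    mean = sum(scale.index(r) for r in ratings.values()) / len(ratings)
    return scale[round(mean)]

# Invented example: two criteria for a hypothetical community programme.
reach = Criterion("reach", {"excellent": 0.9, "good": 0.7, "adequate": 0.5})
equity = Criterion("equity of access", {"excellent": 0.8, "good": 0.6, "adequate": 0.4})

ratings = {
    "reach": rate(reach, 0.82),              # -> "good"
    "equity of access": rate(equity, 0.45),  # -> "adequate"
}
print(synthesise(ratings))                   # -> "good"
```

Even in this toy form, the sketch shows where values enter: the choice of criteria (Step 1) and the thresholds (Step 2) encode what “good” means, and the synthesis rule (Step 4) encodes how much each criterion counts.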


What are values?
Values are here tentatively defined as: “reference points for evaluating something as positive or negative. Values are rationally and emotionally binding and they give long-term orientation and motivation for action.”[1]
The noun “value” names the general feature behind the active process of a “valuation” or an “evaluation”. This conception of values opens a path for empirical research, placing values between attitudes and preferences on one side and norms and principles on the other. There is no definite list describing all possible values held by people; all lists are in an important sense incomplete, but some lists are sufficient for a given context and set of people.
Values are only motivational in a restricted sense. Most actions will not be directly caused by conscious reflection on a certain value; often they can be rule-based or automatic, as implied by our cultural or group identity.
Yet, once an act is done, it can usually be described as the outcome of a set of beliefs and a value, such that the act is the most adequate way to reach the goal, that is, to realize the value.

Our values are important elements of personal identity. When values are ascribed to social groups or institutional / organizational structures, they define the identity of that structure by giving it a direction for what this structure aims to achieve or to provide to others.
In this sense, the set of values of an organization or program provides the benchmark against which this unit wants to be measured. It is important to note that the function of values, both on the individual and the social / organizational level, is typically that of a heuristic, not a fixed decision rule or algorithm. This “openness” is important, since we often need to deal with contexts where there is tension or conflict between different values.

There is an important difference between the “forward-looking” and the “backward-looking” function of values.
In the forward-looking setting, we need to deal with decision uncertainty and deliberate over trade-offs and assessments of relevance. Yet, once decisions are made and we switch to the backward-looking mode, the set of values typically serves a critical, justificatory function.
This basically means that if good decisions were made, we can justify them by the accepted values they promote. If we cannot relate a decision to the values of the individual or organization, then we might have to conclude that it was not good, not really justified by the given values.

One important feature of any set of values is that it also provides a bridge between me / us and the others, or between the internal and the external. The external is the unit, the social structure, the “people”, that our value-based actions are always embedded within. Typically, one aims at coherence in this respect. If the split between internal and external is too large, social cohesion is undermined, and the stage is set for serious conflict.


How can we identify and use values in evaluation?
In this section we introduce a values identification matrix (VIM), a tool designed to help the evaluator map the complex set of values at play. The VIM’s purpose is to support the development of defensible criteria, which are critical to reaching an evaluative judgement.
Such criteria are particularly important when the stakes are high. In this respect, the VIM has a “backward-looking” function, as the evidence collected is compared against the criteria (looking back at ‘what matters’). 
However, the VIM can also be used to shape future iterations of programs and interventions, to ensure they actually align with the values of stakeholders and reduce the likelihood of decisions that are not good (as above).

The matrix was informed by a framework developed by Mepham et al. (2006) and further developed by others (Kaiser et al., 2007; Kaiser & Forsberg, 2001) to support the analysis of ethical issues concerning emerging biotechnologies. The original framework applies three ethical principles that consider value from the perspectives of welfare, autonomy, and justice.
The VIM has five principles: one has a consequentialist (outcomes) focus; three are drawn from a deontological perspective (duty, rights, equity); and the fifth is focused on the ethics of care.

The rows of the matrix include all groups that have a legitimate interest in the thing that is being evaluated. We refer to these as interest groups. Typically, this would include the beneficiaries or end users of an initiative, the decision-makers who provide the resources that enable the initiative to exist or be sustained, and others who have a direct interest in the initiative. Depending on the context, key interest groups might also include non-human species such as animals or forests. 
The literature can also be treated as an interest group, to explore what is already known about the thing being evaluated and what values it has been judged against previously.

The intersection of interest groups and values perspectives in each cell of the matrix provides a systematic way to identify all relevant perspectives on what matters in relation to the programme being assessed.
The value perspectives are identified by asking interest groups, for example through key stakeholder interviews, and through other sources such as relevant literature and programme documentation. In developing defensible criteria, the value perspectives are taken at face value and then warranted, for example through reference to the expertise of stakeholders or to the literature. The resulting value perspectives then inform a set of criteria that fully describe the dimensions of merit.
In doing so, abstract terms such as ‘quality’, ‘value for money’ or ‘effectiveness’ are unpacked. For examples of the VIM in practice check out Roorda & Gullickson (2019) or Roorda, Gullickson & Renger (2020). You can download the VIM here https://tinyurl.com/CriteriaMatrixRego.
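
For readers who think in data structures, the VIM can be pictured as a grid whose rows are interest groups and whose columns are the five value principles, with each cell collecting the value perspectives surfaced for that group under that principle. The sketch below is only our illustration of that shape in Python; the interest groups and cell entries are invented, and the downloadable matrix linked above is the actual tool.

```python
# The five value principles named above: one consequentialist,
# three deontological, and one from the ethic of care.
PRINCIPLES = ["outcomes", "duty", "rights", "equity", "care"]

# Invented interest groups for a hypothetical initiative; the rows of the VIM.
INTEREST_GROUPS = ["beneficiaries", "programme staff", "funders", "the environment"]

# Each cell holds the value perspectives surfaced for that group under that
# principle, e.g. via stakeholder interviews, literature, or documentation.
vim = {(group, principle): []
       for group in INTEREST_GROUPS for principle in PRINCIPLES}

# Two invented example entries:
vim[("beneficiaries", "equity")].append("services are accessible regardless of income")
vim[("the environment", "care")].append("habitats are protected during delivery")

def empty_cells(matrix):
    """Cells with no surfaced perspectives yet: prompts for further inquiry,
    not boxes that must be force-filled."""
    return [cell for cell, perspectives in matrix.items() if not perspectives]

print(f"{len(empty_cells(vim))} of {len(vim)} cells still to explore")  # 18 of 20
```

Seen this way, the matrix’s main service is its empty cells: each one prompts the evaluator to ask what matters to that group from that perspective, which is how less powerful voices avoid being crowded out.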

So what is the benefit of the VIM?
Systematically identifying the value perspectives for all interest groups serves to avoid an initiative being evaluated through the lens of the most powerful stakeholders (usually government) or the lens of the evaluator. The VIM does not require the evaluator to defend one normative stance; rather it encourages value to be considered from multiple perspectives as relevant to a context.
In evaluations of initiatives that serve Indigenous peoples, or those with a focus on the environment, values such as respect, caring, collaboration, reciprocity, and trust may be brought to the fore. As such, the framework acts as a prompt for evaluators to consider relevant values that may not seem evident to them or to the commissioner.
Thus, evaluators inhabit the metaphorical way stations where we can refresh and update our incomplete maps of the road ahead, chart uncertainties, and refine our malleable value sets.


On the way
In sum, in spite of many unsatisfactory evaluation practices around us, we see a positive function for evaluations: helpful reflection on the past and guidance for future development.
A reflective stance on changing sets of values, and the explicit integration of value lenses into our evaluations, for instance through the ethical matrix approach, serves to prepare us for our path forward.
We just wish that this value-talk would eventually trickle through to commissioners who see the evaluation task as a quick fix to get people in line with their wishes and to impose fixed standards on a world in flux.
In a world in flux, we need to take a moment at the way station and truly evaluate, to see what we’ve accomplished and make a good decision about where we’d like to go next.

*****


[1] This definition is taken from the EU project “Value-Isobars” 2009-2011, coordinated by Kaiser; cf.: https://cordis.europa.eu/project/id/230557

References

Fournier, D. M. (1995). Establishing evaluative conclusions: A distinction between general and working logic. New Directions for Program Evaluation, 1995(68), 15–32. https://doi.org/10.1002/ev.1017.

Gullickson, A. M. (2020). The whole elephant: Defining evaluation. Evaluation and Program Planning, 79, 101787. https://doi.org/10.1016/j.evalprogplan.2020.101787

Kaiser, M. (1993). Aspekte des wissenschaftlichen Fortschritts. Europäische Hochschulschriften XX/398. Frankfurt a.M.: Peter Lang Verlag.

Kaiser, M., & Forsberg, E.-M. (2001). Assessing fisheries – using an ethical matrix in a participatory process. Journal of Agricultural and Environmental Ethics, 14, 191–200.

Kaiser, M., Millar, K., Thorstensen, E., & Tomkins, S. (2007). Developing the ethical matrix as a decision support framework: GM fish as a case study. Journal of Agricultural and Environmental Ethics, 20, 65–80.

Kuhn, T. (1962/1970). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Mepham, B., Kaiser, M., Thorstensen, E., Tomkins, S., & Millar, K. (2006). Ethical Matrix – Manual. The Hague: LEI, part of Wageningen UR.

Roorda, M., & Gullickson, A. M. (2019). Developing evaluation criteria using an ethical lens. Evaluation Journal of Australasia, 19(4). https://doi.org/10.1177/1035719X19891991

Roorda, M., Gullickson, A., & Renger, R. (2020). Whose values? Decision making in a COVID-19 emergency-management scenario. Evaluation Matters—He Take Tō Te Aromatawai, 6, 1–14. https://doi.org/10.18296/em.0057

Scriven, M. (1991). Evaluation Thesaurus (4th ed.). Newbury Park, CA: Sage.

Featured image: Den Norske Turistforening’s cabin, Skogadalsbøen in Jotunheimen. Photo: Sindre Thoresen Lønnes/DNT.

Matthias Kaiser
Matthias Kaiser is leader of AFINO's research group Quality and foresight in responsible research and innovation. Read more about Matthias on AFINO's webpage.
Amy Gullickson

Amy Gullickson is the Director of the Centre for Program Evaluation at the University of Melbourne in Melbourne, Australia, and the Chair of the International Society for Evaluation Education. She led development of the fully online Masters and Graduate Certificate programs in evaluation at Melbourne, and has been deeply engaged with the Australian Evaluation Society in the development and refinement of their competencies for evaluators (take an evaluator competencies self-assessment here). She spends her time conducting evaluations, teaching people to conduct evaluations, teaching people to teach others how to conduct evaluations, and teaching organisations how to integrate evaluation into their day-to-day operations - and doing research on all of the above. She has a Twitter account, but can’t remember her password or username…


Mathea Roorda

Mathea Roorda, PhD, is an evaluator at Allen + Clarke, a consultancy based in Wellington, New Zealand. She works on evaluations of public policies and programmes across Australasia. Her PhD research focused on identifying defensible criteria, a critical component of evaluative reasoning.