Panning for gold

Liam Maxwell reflects on the search for interventions that count

Whoever we are and whatever we are doing, we want our efforts to count. For those involved in education this pursuit of effectiveness is spurred on by the critical nature of their task. Parents, teachers, researchers, and policy makers are all united around the central aim of securing the best future for the next generation. What this looks like and how to get there, though, is a contentious issue. The search for effective methods and interventions is a bit like panning for gold. First, you need a method to sift through the dross and get to those elusive nuggets. Then, as these will be found in different shapes and sizes, you need to be able to weigh them and see how valuable they really are. Thankfully, educational researchers and practitioners are not alone in their search for the right tools and measures. Many insights may be gained from colleagues in social policy, health, and social work. Learning these lessons was a main aim of the ESRC seminar held in March 2015.

What we know
● Even the most well-designed studies can run into difficulties in the real world.
● Many studies need to be combined to properly evaluate what works.
● Analysing interventions in terms of their monetary costs and benefits is an important step forward. However, broadening our focus to include wellbeing could lead to measures that capture more of what matters to school communities.

The scientific method offers a way to systematically quantify and evaluate different educational practices and programmes. It can give us the “pan” we need to be able to separate and spot the gold. Yet, as we all know, all that glitters is not gold, and some research designs are better than others at determining the purity of the nugget. In this respect the randomised controlled trial (RCT) is seen as the “gold standard”. Borrowed from health, this design involves the random allocation of participants to a control and a treatment group. The randomisation reduces bias and improves balance in known and unknown participant characteristics, giving greater certainty that changes result from the treatment.
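The random allocation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in any study discussed here: the function name `allocate`, the pupil labels, and the seed are all hypothetical, and real trials typically use more elaborate schemes such as stratified or cluster randomisation.

```python
import random


def allocate(participants, seed=None):
    """Randomly split participants into control and treatment groups.

    Hypothetical helper for illustration: simple randomisation with
    (near-)equal group sizes.
    """
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(participants)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


# Twenty made-up participants, for illustration only.
pupils = [f"pupil_{i}" for i in range(1, 21)]
groups = allocate(pupils, seed=2015)

print(len(groups["control"]), len(groups["treatment"]))
```

Because chance alone decides who lands in each group, any characteristic of the participants, measured or not, should on average be spread evenly between the two groups.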

All that is gold does not glitter

At the seminar we heard from Colin Waterman of the National Implementation Service, who explained some of the potential and pitfalls of the RCT design when applied in a social work context. Effective intervention delivery and evaluation arise from the confluence of practitioner understanding, motivation, and skill in delivery. Colin explained how some of the social workers in his study either did not have a clear understanding of the research rationale or were opposed to the methodology, believing the RCT to be unethical and inappropriately applied in their context. As a result, recruitment was difficult, the drop-out rate was high, and the referral criteria were stretched to include those for whom the intervention was less appropriate.

Coupled with these issues, Colin explained his concerns about the speed with which assessments were carried out. In this, and other evaluations like it, timing and cost constraints require outcome measures to be taken almost immediately. This can mean that even the best and most motivated practitioners just haven’t had enough time to practise. Without this “bedding in” time, improvements are likely to be reduced. These and other concerns led Colin to suggest that individual RCTs do not always yield the golden certainty they promise. To properly evaluate the effectiveness “weight” of an intervention or approach one needs to combine results from many studies and measure their efficacy in a way that enables easy comparison.

Many nuggets provide food for thought

Just as unrefined gold panned from a stream varies in size and quality, so too do the results of evaluations. It is not enough simply to “weigh” the studies using their effect sizes; some account needs to be taken of the strength of the research design, the sample size, and the context of the study. Gretchen Bjornstad, from the Dartington Social Research Unit, described a system that provides consistent rules to account for these differences and a method of combination and comparison. Specifically, Gretchen presented her team’s work adapting for a British context the meta-analytic techniques pioneered by the Washington State Institute for Public Policy (WSIPP). These techniques were developed to analyse the costs and benefits of interventions for children: studies are found, mathematically combined, and evaluated against a variety of outcome measures in education, behaviour, and mental health. In addition, the longevity of intervention effects on these outcomes is estimated to give an assessment of impact across the life span. The effects on these measures are then monetised by estimating tax dollars saved as the result of positive outcomes achieved or negative outcomes avoided. Costs and benefits can then be weighed and commissioning decisions taken with financial justification.
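To make the idea of “mathematically combining” studies concrete, here is a minimal sketch of one standard technique, fixed-effect inverse-variance pooling, followed by a simple cost-benefit comparison. It is not the WSIPP or Dartington model, which applies further adjustments (for research design and context, for example) that are not reproduced here. The function names, effect sizes, standard errors, and monetary figures are all invented for illustration.

```python
import math


def pool_fixed_effect(studies):
    """Combine (effect_size, standard_error) pairs by inverse-variance weighting.

    More precise studies (smaller standard error, usually larger samples)
    receive more weight in the pooled estimate.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * es for w, (es, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def benefit_cost(benefit_per_child, cost_per_child):
    """Return (net benefit, benefit-to-cost ratio) for monetised outcomes."""
    return benefit_per_child - cost_per_child, benefit_per_child / cost_per_child


# Three hypothetical evaluations: (effect size, standard error).
studies = [(0.40, 0.20), (0.10, 0.10), (0.25, 0.05)]
pooled, pooled_se = pool_fixed_effect(studies)
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")

# Hypothetical monetised benefit and delivery cost per child.
net, ratio = benefit_cost(benefit_per_child=1500.0, cost_per_child=600.0)
print(f"net benefit = {net:.0f}, benefit-cost ratio = {ratio:.2f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining evaluations reduces uncertainty.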

The Dartington Social Research Unit now has a growing number of WSIPP-style meta-analyses that estimate how valuable certain British interventions are (see www. This is clearly an important contribution and demonstrates a method that “solves” many problems. Firstly, by analysing data from many studies it can reduce the uncertainty of effects from individual evaluations. Secondly, it allows an estimate of efficacy to be made beyond the end point of a research project. Finally, it allows interventions to be compared using a simple, easy-to-understand quantifier: money.

It’s not all about the money

If we have effective methods to find, refine, and weigh gold, then it might be easy to stop there, content with all that this gives us. However, to extend the metaphor just a little further, is the value of a golden nugget simply how much it may be sold for? This unsolved issue was tackled in Richard Cookson’s presentation. As a health economist he is used to investigating the costs and benefits of different interventions. He outlined his sector’s move to include measures of wellbeing, describing the increasing use of the Quality-Adjusted Life Year (QALY). The basic premise of the QALY is that not all life years are equal. Those spent in ill health may be considered less valuable than those spent in full health. For example, ten years lived in considerable discomfort may be perceived as worth less than five years lived pain-free.
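The arithmetic behind the QALY can be sketched briefly. In the usual formulation each period of life is multiplied by a quality weight between 0 and 1, where 1 represents full health. The weight of 0.4 used for “considerable discomfort” below is an assumption chosen purely to reproduce the example above, the function name `qalys` is hypothetical, and the sketch omits the discounting of future years that applied analyses often include.

```python
def qalys(periods):
    """Sum quality-adjusted life years over (years, quality_weight) pairs.

    quality_weight runs from 0.0 to 1.0 (full health).
    """
    total = 0.0
    for years, weight in periods:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("quality weight must lie between 0 and 1")
        total += years * weight
    return total


# Assumed weight of 0.4 for years lived in considerable discomfort.
discomfort = qalys([(10, 0.4)])
pain_free = qalys([(5, 1.0)])

print(discomfort, pain_free)
print(discomfort < pain_free)
```

Under that assumed weight, the ten years in discomfort count for fewer QALYs than the five years in full health. Any weight below 0.5 would give the same ordering.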

The concept could be readily adapted for other sectors, including education, where a wellbeing QALY or “WELBY” was proposed. Broadening the analysis beyond the pound sign, this would enable consideration of life enhancement: how an intervention contributes to people’s perception of how their life is going. How one captures wellbeing, however, could be a matter of some discussion. Richard gave two suggestions. The first is a life satisfaction index, in which participants could be asked to rate how satisfied they are with their life so far in different areas. The second is to combine health and economic measures. It’s easy to imagine how an educational intervention might affect a child’s economic future and their health. The relative merits of these approaches are beyond the scope of this article, but the take-home message is that a wellbeing measure may help us to capture more of what matters in our evaluations.


With the advent of the internet and the rise of open access journals we have a larger stream of information than ever before. Furthermore, with the rising number of academies in the English education system, the power of commissioning is being given to decision makers nearer the ground. This represents both an opportunity and a challenge. School leaders searching for “golden” interventions need to be able to compare and contrast based on what matters to their school community. Everyone involved, from delivering practitioners to partnering parents to commissioning leaders, therefore faces the challenge of being ready to engage in a debate about what really counts and how it should be counted.

About the author

Liam Maxwell is an educator and analyst who has worked in a wide range of roles with children and young people. With previous experience in secondary inclusion, youth justice, primary school teaching, and research he currently works as part of a local authority team forecasting and planning school places.

Further reading

Aos S et al (2004), Benefits and Costs of Prevention and Early Intervention Programs for Youth. Olympia: Washington State Institute for Public Policy.

Biehal N et al (2012), The Care Placements Evaluation (CaPE) of Multidimensional Treatment Foster Care for Adolescents (MTFC-A). Research Report DfE-RR194.

Dixon J et al (2014), Trials and Tribulations: Challenges and Prospects for Randomised Controlled Trials of Social Work with Children, British Journal of Social Work, 44, 1563–1581.


November 2015