View from an English education triallist

Ben Styles recounts why we still have some way to go before widespread acceptance of rigorous research methods within the English education system

I have recently been reminded of the difficulty we face when trying to communicate null or negative findings from research. In spring 2013, a team from Coventry University delivered the Chatterbooks programme as part of a randomised controlled trial (RCT) funded by the Education Endowment Foundation (EEF). Chatterbooks is an extracurricular reading initiative that aims to increase a child’s motivation to read by providing schools with tools and resources to encourage reading for pleasure. In May 2014, the NFER Education Trials Unit published the results of the trial. Chatterbooks had an estimated average effect of slowing progress in reading by two months, although we could not be confident this negative effect hadn’t arisen by chance. If it was a genuine effect, it could have been because control pupils were learning faster in their existing lessons so any improvements as a result of Chatterbooks were offset. Either way, we can be pretty certain that across the 12 secondary schools involved in this trial, Chatterbooks was of no help in improving average attainment in reading for the children involved when compared to “business as usual”. Fast forward just over a year and the Department for Education (DfE) in England is funding The Reading Agency to extend Chatterbooks to 200 more primary schools all over the country.

What we know
● It is difficult to disseminate null or negative research findings.
● The results of rigorous evaluation mean more if more schools are willing to participate in the research.
● We should still challenge whether the funding decisions of central government are evidence-based.

Is it possible that the DfE was unaware of the results of the trial? It received limited press coverage at the time of publication as it returned a null result, but the trial was part-funded by the DfE. The evaluation was carried out with 11- and 12-year-olds who were struggling with reading at the start of secondary school, whereas the new roll-out funding is for children in primary school between the ages of 7 and 11. It is just possible that Chatterbooks is effective when run in primary schools and of no effect in secondary schools. How different are 11-year-olds struggling with reading when they start secondary school from children with similar difficulties at the end of primary school?

One of the most consistent things we see from data is that the spread of ability within a year group far outstrips the average progress children make between years. Therefore, there will be children at the end of primary school who have similar reading skills to those who took part in the trial. In fact, there will also be children with similar reading skills in the year below, and the year below that.
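The overlap argument can be illustrated with a minimal sketch. All of the numbers below are hypothetical and chosen only for illustration: reading scores in each year group are assumed to be normally distributed with the same spread, and the average gain between years is assumed to be much smaller than that spread. None of these values comes from the trial.

```python
from statistics import NormalDist

# Hypothetical values chosen only to illustrate the argument;
# they are not estimates from the trial or from any real assessment.
WITHIN_YEAR_SD = 15.0    # assumed spread of reading scores inside one year group
YEARLY_PROGRESS = 5.0    # assumed average gain from one year to the next
OLDEST_MEAN = 100.0      # assumed mean score of the oldest year group


def year_group(years_younger: int) -> NormalDist:
    """Score distribution for a year group this many years below the oldest."""
    return NormalDist(OLDEST_MEAN - YEARLY_PROGRESS * years_younger, WITHIN_YEAR_SD)


def share_in_struggling_range(years_younger: int, bottom_share: float = 0.25) -> float:
    """Share of a younger year group scoring below the cut-off that marks
    the bottom `bottom_share` of the oldest year group."""
    cutoff = year_group(0).inv_cdf(bottom_share)
    return year_group(years_younger).cdf(cutoff)


if __name__ == "__main__":
    for years_younger in range(3):
        share = share_in_struggling_range(years_younger)
        print(f"{years_younger} year(s) younger: {share:.0%} in the struggling range")
```

Under these assumptions, a sizeable share of each younger year group scores in the same range as the struggling readers in the oldest group. How large that share is in practice depends on the real spread and the real rate of progress.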

Exact details of what the new funding is for are not yet available. It may be for attending extracurricular book clubs, in which case we might suggest that the results of the trial are less relevant. However, the question remains as to how we guarantee that the pupils who need the help attend the Chatterbooks book clubs. The trial targeted children who were struggling readers, and it was not successful. If the book clubs do not provide support for these children, success seems even less likely. I would maintain that the results of the trial should prompt serious consideration when deciding whether or not to fund the programme for any phase of schooling.

At the very least, a robust evaluation of the use of these funds is warranted, for example through randomisation of the Chatterbooks programme implemented in the way intended for this roll-out.

This story highlights two aspects of the June 2014 ESRC seminar on “Overcoming the challenges of commissioning and conducting RCTs”. This well-attended meeting unfolded in a way that reflected the delegate mix. The researchers delivering the talks had wanted to discuss, among other things, the challenge of disseminating null or negative research findings. The delegates, a healthy cross-section of researchers, policy makers and research commissioners, preferred to talk about the generalizability of RCT results.

The generalizability problem is one that researchers often avoid. From a strict statistical perspective, the confidence intervals that we produce are only valid for random samples of pupils or schools. For a large effectiveness trial across 100 schools, we draw a random sample but, of course, not all schools want to take part and this potentially leads to sampling bias. This is why I highlighted recruitment as the main risk to future trials during the seminar. With a non-random sample, any RCT results are open to the criticism “but will it work in my school?” In practice, it is not clear how important this worry is. Randomisation itself assures an unbiased comparison between intervention and control groups. Without going into statistical detail, provided that the schools in the trial are numerous and representative of the population of schools, I see it as unreasonable to reject the results of a trial on the grounds of generalizability.
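The distinction between an unbiased comparison and a generalizable one can be sketched with a small simulation. Everything in it is an assumption made for illustration: the schools are invented, the `engagement` variable is hypothetical, and the rule that more engaged schools both volunteer more often and benefit more is built in deliberately to create a gap. It says nothing about how large any such gap is in real trials.

```python
import random
from statistics import mean


def simulate(n_schools=2000, n_trials=400, seed=1):
    """Return (population effect, effect among volunteers, mean trial estimate)."""
    rng = random.Random(seed)

    # Invented population: each school has an engagement level, a baseline
    # score and a true effect. ASSUMPTION: the effect rises with engagement.
    schools = []
    for _ in range(n_schools):
        engagement = rng.random()
        baseline = rng.gauss(100, 10)
        effect = 4.0 * engagement
        schools.append((engagement, baseline, effect))
    population_effect = mean(effect for _, _, effect in schools)

    # ASSUMPTION: the chance of agreeing to take part rises with engagement,
    # so the volunteers are not a random sample of the population.
    volunteers = [s for s in schools if rng.random() < s[0]]
    volunteer_effect = mean(effect for _, _, effect in volunteers)

    # Re-randomise the same volunteers many times. Each trial estimate is the
    # difference in mean outcome between intervention and control schools.
    estimates = []
    for _ in range(n_trials):
        shuffled = volunteers[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        treated, control = shuffled[:half], shuffled[half:]
        treated_mean = mean(b + e + rng.gauss(0, 5) for _, b, e in treated)
        control_mean = mean(b + rng.gauss(0, 5) for _, b, _ in control)
        estimates.append(treated_mean - control_mean)

    return population_effect, volunteer_effect, mean(estimates)


if __name__ == "__main__":
    population, volunteers_only, estimate = simulate()
    print(f"True average effect, all schools:     {population:.2f}")
    print(f"True average effect, volunteers only: {volunteers_only:.2f}")
    print(f"Average estimate across trials:       {estimate:.2f}")
```

In this sketch, randomisation recovers the average effect among the schools that took part, while the difference from the population figure comes entirely from the assumed link between willingness to participate and benefit. If participating schools were representative, that difference would shrink towards zero.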

If every school we approach were willing to be part of a trial, we would be able to use genuinely random samples and generalizability would cease to be a problem. RCTs within the English education system have reached one in five schools. By now, most schools have probably been approached at least once to be part of a trial. Usually, it is not a lack of interest or respect for research that prevents participation. It is more likely to be lack of time. Before the EEF was founded in 2011, there was concern that schools would not want to participate in RCTs. This worry has proved unfounded. The next stage is to make trial participation the norm rather than the preserve of a subgroup of research-engaged schools. Perhaps this can be done using a pledge campaign for head teachers similar to Ben Goldacre’s AllTrials campaign, which aims to get all RCTs – past, present and future – registered: “I pledge that if my school is asked to participate in an RCT for an intervention that shows promise, we will sign up.” Or perhaps it is about teachers gradually embracing trials over time through reforms to teacher training. The intensive work carried out by the Institute for Effective Education, Durham University, NFER, the EEF, and others to promote evidence-informed education will surely help too.

We are a long way from academic journals, let alone the press, giving the same weight to null or negative findings as they do to those that demonstrate a positive effect. Null or negative findings are, of course, just as important as positive ones. A school spending its valuable pupil premium resources on an intervention that is demonstrated to be ineffective can quickly change tack to something that has greater weight of evidence behind it. Was there robust evidence for or against the other book club programmes the DfE was considering when awarding its funding? Probably not. It is still early days in terms of the percolation of rigorous evidence through English education research. The wider question is not so much whether this was the right book club for DfE to fund but whether it should be funding book clubs at all. In the department’s defence, it looks like a ministerial decision that is likely to be popular with the public, so the decision was probably outside the scope of evidence. I would argue, however, that these struggling readers could have been better served had the money been directed at the most cost-effective intervention as demonstrated through rigorous evaluation and meta-analysis. Thankfully, this kind of evidence is now accumulating at breathtaking speed in England just as it has done over decades in the US. Perhaps next time the English government is deciding how to spend its money on improving reading, it will have more to go on.

About the author

Ben Styles is head of the National Foundation for Educational Research’s Education Trials Unit and a Research Director in its Centre for Statistics. He directs five of the seven trials currently in progress at NFER, writing and presenting regularly on aspects of education RCT methodology.

Further reading

Styles B, Clarkson R, and Fowler K (2014), Chatterbooks – Evaluation Report and Executive Summary. London: Education Endowment Foundation. Available:

Department for Education Press Release:


November 2015