Randomised controlled trials (RCTs) are increasingly common in education research, and the debate now appears to have moved beyond the question of whether such trials are desirable to the question of whether it is possible to run robust trials in practice. The NFER Education Trials Unit’s evaluation of the Improving Numeracy and Literacy programme, conducted by Jack Worth, Juliet Sizmur, Rob Ager, and Ben Styles, demonstrates that it is indeed possible.*
The Improving Numeracy and Literacy programme consisted of two separate teacher-training programmes, developed by Professors Terezinha Nunes and Peter Bryant of the University of Oxford, which were designed to improve pupils’ numeracy and literacy. Previous small-scale studies by the developers indicated that both interventions had promising effects on measures closely aligned to the concepts being taught. The purpose of the RCT was to evaluate the efficacy of the interventions when delivered by teachers in a whole-class setting.
The interventions were aimed at teachers of Year 2 pupils (aged 6 to 7), and included a day of teacher training, resources for delivering ten lessons, and accompanying computer games for the pupils to use in class and/or at home. The Mathematics and Reasoning programme aimed to develop children’s understanding of the logical principles underlying mathematics, and the Literacy and Morphemes programme aimed to improve spelling and reading comprehension by teaching children about sentence structure and morphemes (units of language that convey meaning).
The Oxford team recruited 55 participating schools, and once all pupils had been tested in literacy and numeracy in the autumn, the NFER randomly allocated a third of schools to the literacy intervention, a third to the numeracy intervention, and the rest to the business-as-usual control group. For those randomised to one of the intervention groups, teacher training took place in December and the intervention was delivered in the spring term. After Easter, NFER test administrators tested all pupils again in numeracy and literacy.**
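To make the allocation step concrete, the following is a minimal sketch of cluster randomisation, in which whole schools rather than individual pupils are assigned to a group. It is an illustration only, not the procedure used in the trial: the school identifiers, the seed, and the simple shuffle-and-deal approach are all assumptions, and the actual allocation may have been stratified or carried out differently.

```python
import random

# Order chosen so that the control arm receives the one leftover school
# when the number of schools is not divisible by three (an assumption).
ARMS = ("control", "literacy", "numeracy")

def allocate_schools(school_ids, arms=ARMS, seed=2015):
    """Randomly allocate whole schools (clusters) to arms of near-equal size."""
    rng = random.Random(seed)      # fixed seed so the allocation is reproducible
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled schools out to the arms in turn, so that arm sizes
    # differ by at most one school.
    return {school: arms[i % len(arms)] for i, school in enumerate(shuffled)}

# Hypothetical identifiers for 55 schools.
schools = [f"school_{n:02d}" for n in range(1, 56)]
allocation = allocate_schools(schools)
counts = {arm: list(allocation.values()).count(arm) for arm in ARMS}
print(counts)
```

Because every pupil in a school shares the same allocation, the analysis of such a trial has to take the clustering of pupils within schools into account.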
The results showed that pupils in the numeracy intervention schools made 0.2 standard deviations more progress in numeracy than pupils in the control schools, which is roughly equivalent to three additional months’ progress (or 20 PISA points higher performance). The pupils who received the literacy programme made slightly less progress in literacy than the control group, but this difference was too small to confidently conclude that it was caused by the intervention.
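The conversion to PISA points is simple arithmetic, because PISA scales were constructed to have a standard deviation of about 100 points. The sketch below shows that conversion alongside the basic definition of a standardised mean difference. The function names are hypothetical, and the sketch is not the statistical model used in the evaluation, which would also need to account for pre-test scores and the clustering of pupils within schools.

```python
PISA_SD = 100  # PISA scales were constructed with a standard deviation of about 100

def standardised_mean_difference(mean_treated, mean_control, pooled_sd):
    """Difference between group means, expressed in standard deviation units."""
    return (mean_treated - mean_control) / pooled_sd

def sd_to_pisa_points(effect_size_sd, pisa_sd=PISA_SD):
    """Express an effect size given in standard deviations as PISA points."""
    return round(effect_size_sd * pisa_sd, 1)

# The reported numeracy effect of 0.2 standard deviations.
print(sd_to_pisa_points(0.2))
```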
Since pupils in all groups were tested for both numeracy and literacy, we were also able to estimate the ‘spill-over’ effects of the numeracy programme on literacy, and vice versa. However, our analysis shows that there were no statistically significant spill-over effects.
Additionally, we analysed the association between greater use of the accompanying computer games and the effects of the interventions. Interestingly, the results suggest that pupils who played more games also made more progress in numeracy. However, this analysis is susceptible to selection bias: pupils who played more games also had, on average, higher pre-test scores and were less likely to come from a low socioeconomic background. The positive association may thus be due to these underlying differences rather than to any causal impact of playing more games.
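The selection-bias concern can be illustrated with a small simulation. In the sketch below, every number is invented for illustration and none comes from the trial: playing many games is given no causal effect at all on outcomes, yet pupils with higher pre-test scores are assumed to be more likely to play many games. A naive comparison of heavy and light players still shows a positive gap.

```python
import random
from statistics import mean

def naive_gap(n_pupils=20000, true_game_effect=0.0, seed=1):
    """Mean outcome of heavy players minus mean outcome of light players."""
    rng = random.Random(seed)
    heavy, light = [], []
    for _ in range(n_pupils):
        pre_test = rng.gauss(0, 1)
        # Assumption: higher-attaining pupils are more likely to play many games.
        plays_many = rng.random() < (0.7 if pre_test > 0 else 0.3)
        # The outcome depends on prior attainment and noise; playing games
        # adds true_game_effect, which is zero by default.
        outcome = 0.5 * pre_test + true_game_effect * plays_many + rng.gauss(0, 1)
        (heavy if plays_many else light).append(outcome)
    return mean(heavy) - mean(light)

gap = naive_gap()
print(f"Naive gap with no causal effect of games: {gap:.2f} SD")
```

The gap arises purely from who chooses to play, which is why the randomised comparison between intervention and control schools carries causal weight while the games analysis does not.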
Importantly, because all 55 schools in this project completed the testing, the results cannot be biased by schools dropping out of the trial before completion. Also, test scores were missing for only about one in eight pupils, which is consistent with normal levels of daily absence and pupils leaving schools. In other words, we can be quite confident that the effects detected represent a causal effect of the numeracy intervention on learning.
While the intervention had a promising impact on test scores – and given its low cost also seems to demonstrate good value for money – some questions remain. For example, could the positive results merely be a statistical fluke? And are the results generalisable to all types of school? These are valid questions that cannot be answered by this study.
But this ‘efficacy’ trial is not intended to be the final word on the topic – it aimed to test whether the intervention works under ideal conditions, with intensive support from its creators. The next step would be to run an ‘effectiveness’ trial, which seeks to test whether the intervention can work at scale in a large number of schools, when others apart from the developers act as deliverers. Testing the intervention using a scalable model in another RCT would (if successful) add further credibility to the relevance of this study’s findings.
The Improving Numeracy and Literacy evaluation adds weight to the argument that it is both desirable and feasible to run robust educational RCTs in English schools – and that such evaluations can help identify programmes that can have positive effects on pupil attainment. However, while RCTs are often seen as the gold standard for credibly demonstrating intervention effects, the results of one trial are just the starting point for policy development; the most important contribution of education experiments will be the body of evidence that is generated from the running and replication of many trials over time and in different settings.
*The trial was one of many randomised trials sponsored by the Education Endowment Foundation, an independent charity set up by the Department for Education.
**Importantly, the test administrators did not know to which group schools had been allocated. The gold standard RCT in medicine is ‘double-blind’, which means that neither participant nor researcher knows whether the patient receives the drug or placebo. Double-blinding is impossible in most educational trials, but blinding on the part of researchers and administrators can be achieved in most cases.
Jack Worth is Research Manager at the National Foundation for Educational Research (NFER). As part of the NFER Education Trials Unit, he managed the evaluation of the University of Oxford Improving Numeracy and Literacy programme, a large cluster-randomised trial in primary schools. Previously, he was a Researcher at the Centre for Market and Public Organisation (CMPO).
This comment piece originally appeared in the CMRE Annual Research Digest 2015. The piece discusses an evaluation undertaken for the Education Endowment Foundation, an independent charity set up by the Department for Education, by Jack Worth, Juliet Sizmur, Rob Ager, and Ben Styles, of the NFER Education Trials Unit. The evaluation report (July 2015) may be downloaded free here.