The problem with Progress 8



I have never much liked the word ‘progress’ as a description of how successful a child has been at school.  To me it has a vague, annoyingly fussy and managerial feel which obfuscates the true purpose of schools and, wherever possible, I try to use ‘learning’ instead.  Whatever it is called, measuring how much students have improved over time has become increasingly influential on the way in which teachers and schools are judged.  Recently, progress as the primary measure of a school’s effectiveness has become formalised and embedded by the introduction of Progress 8.

On the face of it this appears fairer than judging schools on the attainment of exam cohorts.  This is because, the thinking goes, attainment is, at least partially, the result of intelligence; a school with lots of clever children is likely to get better grades than one with less intelligent ones and, because intakes are, theoretically at least, out of the control of schools, it would be unfair to judge a school’s effectiveness on these results.  Progress seems a fairer measure because children are assessed from the point at which they started.
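The mechanics behind this can be sketched in a few lines.  What follows is a minimal illustration with hypothetical benchmark values, not the real calculation: the actual measure compares each pupil's Attainment 8 score with the published national average for pupils with the same fine-graded KS2 starting point.

```python
# A simplified sketch of the Progress 8 calculation, using hypothetical
# pupils and a hypothetical benchmark table.  Each pupil's Attainment 8
# is compared with the national average for pupils with the same KS2
# starting point, and the school's P8 is the average of those gaps.

# Hypothetical national average Attainment 8 by KS2 prior-attainment
# band (real benchmarks are published per fine-graded KS2 level).
NATIONAL_AVG_A8 = {"low": 25.0, "mid": 45.0, "high": 62.0}

def pupil_p8(ks2_band: str, attainment8: float) -> float:
    """Pupil-level P8: the gap to the national average for pupils with
    the same starting point, divided across the 10 subject slots."""
    return (attainment8 - NATIONAL_AVG_A8[ks2_band]) / 10

def school_p8(pupils: list[tuple[str, float]]) -> float:
    """School-level P8: the mean of the pupil-level scores."""
    return sum(pupil_p8(band, a8) for band, a8 in pupils) / len(pupils)

cohort = [("low", 30.0), ("mid", 44.0), ("high", 65.0)]
print(round(school_p8(cohort), 2))  # → 0.23
```

A score above zero means the cohort did better, on average, than pupils nationally with the same starting points; below zero, worse.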

As well-meaning as P8 might be, it is riddled with problems and is actually a very unreliable measure of school effectiveness, one which may come to have a damaging effect on the education of some children.

The statistical problems with P8 have already been very robustly covered by others, as has the unpleasant way in which it pits schools and their pupils against each other in a drier, more academic version of Battle Royale.  I’d go even further though, and argue that the measure is actually conceptually flawed because it is based on three incorrect assumptions:

The KS2 data used as the benchmark against which progress is assessed is safe.

I have already written lengthily on this here.  Briefly, KS2 tests assess the attainment of children in discrete disciplines (English and maths).  They are not tests of a child’s capacity to learn.  The grades children achieve in these tests are the result of a very wide range of factors, of which their ‘intelligence’ (I use quotation marks here because of the lack of consensus as to what this is) is only one.  The grade a child achieved in two subjects at primary school is not a safe starting point against which to assess their attainment in other subjects five years later.  This is continually misunderstood, with most schools (and Ofsted) wrongly referring to students as high, mid or low ability on the basis of these attainment tests.

If we accept these arguments, and that KS2 data is not therefore a safe measure against which to assess learning, P8 loses credibility.

Even if we put aside concerns around the safety of KS2 data and blindly accept averaged scores in English and maths as reliable and appropriate benchmarks against which to assess the learning of pupils in other subjects, further issues remain.

Socio-economic context makes no difference to the speed at which children learn.

I do understand why socio-economic context is ignored in progress measures.  Recognising its influence and taking it into account can, of course, lead to lower expectations of certain groups of children, which may embed differences in attainment.  This is why Michael Gove got rid of the Contextual Value Added measure.  But we all know context makes a huge difference, and ignoring it won’t make it go away; children from poorer backgrounds lack significant advantages held by their wealthier peers and, as a direct result, are less likely to do as well at school.  Teachers and leaders know that it is harder to get good results in a school in a disadvantaged area than in an advantaged one, which means that, given the choice, they are more likely to work in more affluent schools.  When combined with performance-related pay, this creates an educational landscape in which recruitment and retention are much more difficult for some schools than for others.  Of course, some schools in very disadvantaged areas buck the trend and achieve outstanding outcomes (the school I attended was one of these), but using these as evidence that it is as easy to run a school in a poor area as in a rich one is to indulge in whataboutery and is, in my view, knowingly disingenuous.

Perhaps worst of all, ignoring context means ignoring the issues that most disadvantage some children.  For example, white English boys with poor prior attainment are the group least likely to make ‘good progress.’  If we know this, then we should accept that a school with a high proportion of these children is less likely to achieve a good P8 score than one with a smaller proportion, regardless of the quality of teaching.  This should be acknowledged, and a system-wide focus on the reasons for this underachievement would almost certainly be more productive than ignoring the issue.  Not doing so creates conditions in which schools may be incentivised to reduce their proportion of poorly-performing groups, which is clearly neither desirable nor the intention of the P8 measure.

The difference in value between the different GCSE grades is uniform.

An intended consequence of P8 is that the progress of all students now matters equally.  Previously, when schools were measured on the percentage of students achieving Cs or above, a lot of effort was put into supporting children struggling on the D/C border, with other children perhaps neglected as a result.  P8 should, at least theoretically, end this: the value difference between a new grade 1 and a 2 is the same, or close to the same, as between a 3 and a 4, or a 4 and a 5.  This should mean all children are pushed to improve, not just a few key groups.
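That uniformity is easy to see inside the measure itself: reformed 9-1 grades map one-to-one onto Attainment 8 points, so a one-grade improvement is worth the same wherever on the scale it happens.  A quick sketch:

```python
# Reformed 9-1 GCSE grades map one-to-one onto Attainment 8 points
# (grade 4 = 4 points, grade 5 = 5 points, and so on), so within the
# measure a one-grade improvement contributes the same amount for
# every child, whether it is a 1 becoming a 2 or an 8 becoming a 9.
points = {grade: float(grade) for grade in range(1, 10)}
steps = [points[g + 1] - points[g] for g in range(1, 9)]
print(steps)  # every step is worth the same: 1.0 point
```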

The problem with this is, outside of the measure itself, the different GCSE grades are of different value.  Colleges do not offer children places based on the progress they have made.  Nor do universities, or internships or employers.  Life differentiates by attainment, not achievement.  This means that schools may actually be quite right to care more about whether a child achieves a 3 or 4 than they do about whether or not a child achieves a 1 or a 2.  A 4 might get a child into a college when a 3 does not, whereas it is far less likely achieving a 2 instead of a 1 will make a difference to a pupil’s future opportunities.  I suspect that many schools and teachers will, either consciously or sub-consciously, recognise this, which will lead to tension between how a school is judged and what is best for the pupils who attend it.

I worry that the focus on progress for all, as well meaning as it is, may create a system in which what happens inside school becomes less and less relevant to the world outside.  Teachers, schools, Ofsted and the DfE may care very deeply about progress but outside education it is poorly understood and typically ignored.  I don’t think parents, as much as we might think they should, typically care very much about how much ‘progress’ children make at a school; they care about attainment because they know it is this that really determines their child’s future opportunities.

It seems to me that an important unintended consequence of our preoccupation with progress has been the creation of an accountability measure largely irrelevant to the world beyond the rarefied ecosystem in which it was conceived and incubated.   This is, of course, not to say that progress should be ignored; doing this could easily cause schools to teach to the middle and, by so doing, de-prioritise the learning of both lower and higher attaining pupils.

If we really need a system to judge schools against each other, it must be based on credible data, acknowledge the role of context and reflect both achievement and attainment.  While I’m happy to be put right on anything I’ve got wrong, as it stands, I don’t believe P8 does any of this.




7 thoughts on “The problem with Progress 8”

  1. Interesting piece. I think it’s wrong though. I don’t think the introduction of Progress 8 makes any of the above assumptions. It would do if it were claimed to be perfect, but it is just one measure. Like any measure, it has issues with validity – and hence I think this is worth exploring. But I think you make some errors.

    1. I don’t think it assumes the KS2 data is safe. I think it takes the KS2 data as a starting point because that is what we have, and the bigger the cohort, the more valid the aggregate becomes. Your point is valid for individuals, but Progress 8 doesn’t measure individuals – it measures an aggregate, and as the cohort increases the measure becomes increasingly valid.
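    The aggregation point can be illustrated with a toy simulation (made-up noise, not real KS2 data): if each pupil-level estimate is a true value plus independent noise, the error in the cohort average shrinks roughly with the square root of cohort size.

```python
# A toy simulation of the aggregation argument: treat each pupil-level
# score as a "true" value of zero plus independent noise.  The error in
# the cohort average shrinks as the cohort grows (roughly with the
# square root of its size), which is why a school-level P8 figure is
# more trustworthy than any pupil-level one.
import random

random.seed(0)

def cohort_p8_estimate(n: int, noise_sd: float = 1.0) -> float:
    """Mean of n noisy pupil-level scores whose true value is 0."""
    return sum(random.gauss(0, noise_sd) for _ in range(n)) / n

for n in (10, 100, 1000):
    print(n, round(abs(cohort_p8_estimate(n)), 3))
```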

    2. I don’t think that the introduction of Progress 8 assumes that socio-economic context makes no difference. I think this is a valid issue with the validity of comparisons made with the P8 measure, and is important to explore. Disadvantaged pupils nationally have a P8 score of less than -0.3, and so if one has a high proportion of disadvantaged pupils who are being compared to better-off peers, one’s score is likely to be lower.

    But as you say, we have to consider the lessons of the past here. I think you dismiss these too easily. As you say, a few years ago one of the main data points measuring schools was “Contextual Value Added”. This meant that a variety of factors were taken into account, and schools were judged on whether their pupils had made more progress than those in other schools in the same sort of area with similar pupils. It sounds very fair – until one considers that the education system in the UK ends up with what can only be described as criminal outcomes for the poorest pupils. And if those outcomes are criminal, how is it ‘OK’ to be average?

    If poor pupils do badly across the UK – far worse than I believe is possible – then to measure them against themselves, as if doing slightly better than average for that cohort is doing well, leads to the kind of race to the bottom I know that you are not in favour of. I’d therefore counsel against any move back to a purely contextual accountability figure. And I write as someone who worked in a school that benefited from ‘CVA’ being a key measure in the past. I recognise that having it might give a more accurate description of the relative strength of schools with such cohorts, but the unintended consequence is to have lower expectations of disadvantaged pupils – you recognise this but then appear to promote it anyway.

    3. The difference between GCSE grades isn’t uniform, I agree, but in a slightly different way. Getting from a G to an F in some subjects requires pupils to gather knowledge fairly quickly, but getting from an A to an A* is significantly more difficult (and requires going ‘past’ and ‘beating’ many more peers, so getting all pupils who may get lower grades up a grade is probably easier than at the top end) – and I think this is recognised in the new GCSEs. But attempting to recognise that the difference isn’t uniform this year, when there is a combination of legacy and new GCSEs (so the difference between an A* and an A is 1.5 points and between an F and a G is 0.5 points), has caused outrage on many blogs.
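    The interim points the comment refers to make those non-uniform steps explicit.  A quick sketch, assuming the published 2017 mapping for legacy A*-G grades:

```python
# The 2017 interim point scores for legacy A*-G grades (used alongside
# reformed 9-1 subjects), showing the deliberately non-uniform steps:
# moving from A to A* is worth 1.5 points, moving from G to F only 0.5.
LEGACY_POINTS = {"A*": 8.5, "A": 7.0, "B": 5.5, "C": 4.0,
                 "D": 3.0, "E": 2.0, "F": 1.5, "G": 1.0}

grades = ["G", "F", "E", "D", "C", "B", "A", "A*"]  # low to high
steps = {f"{lo}->{hi}": round(LEGACY_POINTS[hi] - LEGACY_POINTS[lo], 1)
         for lo, hi in zip(grades, grades[1:])}
print(steps)
```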

    Of course I agree that a 4 to a 5 (and a 5 to a 6) is life-changing in a way that a 1 to a 2 is not – you are right on this, though I think this stems from quite a technocratic view of education (but then we are discussing examination results, so!) – but I think this is one thing we have to put up with. Schools which do best will be those that teach every child at every level as well as they can.

    I’d add to your points that amending the figure for the comparative difficulty of subjects (eg having loads of kids take MFL is likely to give a lower figure) is also desirable, but I’m not sure the profession is ready for this, and there would be knee-jerk reactions if so – eg ‘how dare they say MFL is more valuable’. However, at the moment, having an accountability disincentive to enter pupils for MFL doesn’t make sense. And the introduction of decimals to the nth degree will make it far less usable. I think Education Datalab have done some great work on this (it does give positive A8 points for U grades in some subjects, fairly controversially).

    Finally, on the accountability measure not being useful for anything apart from one aspect of holding schools to account: that’s good, isn’t it? It needs to measure how well schools get their kids to learn in as valid a way as possible. What else do you want it to do?

    I think it’s as valid a figure as we’ve had.


    • Sorry! And another question..
      5. Could a lack of context lead to coasting in some schools? For example, an all-girls school in an affluent area may seem to be doing very well based on its data, but this could be misleading, as it doesn’t take into account the fact that, on average, girls make more progress than boys do.


      • Yes, I think that is a concern. Girls doing better than boys, EAL better than non-EAL, more affluent better than disadvantaged. But these concerns that reflect that P8 doesn’t tell us everything and is one piece of data shouldn’t be enough to throw it out. And if you think about where our education system is letting down groups – those who are poorer, those who are white working class and those who live in coastal areas, for example, P8 does a good job of enabling focus on those areas.


  2. Thank you! This is exactly the kind of critique I was after when I posted this.
    A few questions though..

    1. I’m hearing about lots of schools that are, in fact, moving P8 down to department and individual student level (i.e. child X’s P8 figure for your subject is 0.5 – why, and what should be done?). Do you think the data is being misused when this happens?
    2. Do you think that I am too worried that KS2 data may actually be performative? By this I mean that it may be self-fulfilling – low averaged KS2 data is too readily seen as a measure of intelligence which affects setting, expectations, differentiation etc.
    3. Does the crude P8 score for disadvantaged pupils mask important subtleties within it? Some groups of children within this group do much better or worse than others, and assuming the group is homogeneous makes the measure unfair on schools serving the most poorly performing groups.
    I do take your point about the lowering of expectations. I see this as a real danger, but I think the unintended consequence of not acknowledging context at all is that recruitment and retention difficulties are felt most acutely by the schools that need good teachers most.
    4. You are quite right to point out that my argument about the difference between grades is reductive, technocratic and depressing! My only defence is that I find the whole way schools are being compared to each other reductive, technocratic and depressing too!

    Thank you for taking the time to respond so thoroughly. My thinking on this is only really beginning to emerge and your comment has already been really helpful.


    • 1. I don’t know if it is being misused so much as it is a piece of data at subject level that has less validity than some might believe it to have – such data is helpful (especially if one uses the relative difficulty of the subject, as available in RAISE, alongside it), but it is one very small piece if taken at individual subject level. If schools are using it for individual pupils I’m not sure why, to be honest – maybe to give one or two a kick up the backside. It might be helpful for that but, in my opinion, is completely invalid at an individual pupil level.

      I should say here that I recognise the issue of outliers – of schools with three or four high-ability pupils who sit no exams, perhaps due to mental health issues – in one cohort. I think this is a weakness. I’m not sure that discounting outliers for all schools helps – it might encourage schools to ensure some weak performers become outliers, and of course it would increase the A8 base as outliers are discounted across the country – but it is an issue.

      2. I don’t know really. Maybe. I just think that if we keep P8 as a measure of schools across a cohort then the individual pupils’ starting point (or end point) doesn’t matter – and it’s a performance measure on a school.

      3. Yes we do risk that. I accept that, but I think you’re in danger of critiquing something for not being perfect when it’s one piece of data that across cohorts is more valid than I believe anything else we have.

      4. Yes, but humans need short-cuts. The controversy around P8 is because looking at one figure is easier than getting under the skin of data and experience (and curriculum). I’m hopeful that Ofsted will get much better at judging schools in the near future. Maybe.


