When I left the UK in 2006, there were no Target Grades. When I returned five years later they were everywhere, screaming out from SIMS and multi-coloured Excel spreadsheets, and shouting mindlessly on the front of exercise books.
They were all over lessons too. Outcomes, written carefully on every board, were often tagged with the different target grades of students in that particular class. For example:
Describe the Night of the Long Knives (C)
Explain the causes of the Night of the Long Knives (B)
Evaluate the significance of these causes (A).
It seemed that the expectation was that, in the very best lessons, students should have individual target grades and be given different tasks to do, which multiplied planning by three. This came as something of a shock. On a training day I heard a consultant ask “why should the higher ability start at the same point the lower ability do?” and saw people around me nod sagely. I was baffled. “How could a student explain what caused the Night of the Long Knives if they didn’t know what happened in the first place?” I thought. “Describing the events would only take a student with a C target a few minutes. Should I just get them to repeat that in different ways until the end of the lesson? Should I stop them moving on to explaining?”
Anxious to catch up on what I’d missed while away, I read lots and asked lots of questions, but didn’t find the answers I got particularly illuminating.
I learned that Target Grades were generated from tests children sat at the end of their primary education in English and Maths. From an average of these grades an acceptable level of progress had been decided on, and this was used to set a target for other subjects. The success of the child and, by association, their teachers and school, was assessed on whether they failed to meet, met or exceeded this target.
My mind boggled a bit. How could the average attainment of a child in two subjects be used to predict their attainment in completely different disciplines? Most children I taught in Year 7 had never really studied history meaningfully but, as I understood it, I was supposed to assume that they were already as good at it as they were at the subjects they had actually been taught. When I raised this I was told “just be glad you don’t teach music, art or PE. At least in history they do read and write.”
I asked more questions and got myself even more confused. I heard that the results of these KS2 tests were used as an assessment of the child’s intelligence and capacity to learn; if they had reached a certain standard in one subject it was assumed they could do so in others. To me this made no sense at all. The KS2 tests tested aptitude in discrete subjects and weren’t designed to test intelligence. The outcome could be the result of any one or combination of a huge number of variables including the ability of the teacher, the level of parental support and the degree to which a primary school taught directly to the test. The assessments weren’t intelligence tests and I didn’t think they should be used as a general indicator of a child’s capacity to learn.
When I brought this up I got a different answer. I was told that I’d misunderstood and that these tests were actually used to assess the level of different generic skills a child had reached. Seeing English and Maths as the trunk of a tree with the other subjects as branches initially seemed quite neat. In this conception, children had developed different generic competencies. If a child could describe well in English, as demonstrated on their KS2 test, they would be able to describe well in history too. This helped explain the heavily Bloom’s-influenced learning outcomes that had also mushroomed while I was away.
I allowed myself to be satisfied with this for a while but it didn’t sit quite right and nagged at me. The first problem was that the skills students were assumed to have looked very different in history from how they looked in other subjects; what I regarded as a good descriptive paragraph was different from the sort of paragraph their English teacher wanted them to write. Even more worrying was how knowledge, the most important part of my subject of all, barely got a look-in. It seemed to be assumed that a child who wrote good historical descriptions of one thing would be able to do the same for something they’d never learned about before.
This approach to differentiation, based on target grades, continues to cause big problems in history teaching. In a recently blogged-about lesson, judged highly by Ofsted, students were given one of three different versions of a worksheet according to their target grade. Each child was told they could choose a worksheet aimed higher than their target but not lower. This doesn’t make sense. If a student is capable of doing work they should do it, and allowing them to choose exposes an important flaw in lessons that use this approach. Even if this model were effective in supporting progress from different starting points, by its very nature it can never close gaps between students: higher-ability students do harder work and learn more; lower-ability students do easier work and learn less. In classrooms and schools that work like this the weak can never hope to catch up with the strong. Gaps between students are consolidated and never close.
This system casts a shadow half a decade long. In her book “Making Good Progress” Daisy Christodoulou points out that teachers often unwittingly under-mark disadvantaged children, and it is plausible that something similar happens as a result of averaged KS2 scores. Less is expected of a child who performs comparatively poorly in their KS2 tests, for whatever reason, than of one who performs well. They are given a lower target grade, are likely to be put in lower sets and are almost certainly given easier work to do. Teachers working with subjective mark schemes while suffering cognitive overload may unconsciously look for short-cuts when grading work. A target grade provides this short-cut, and means they are more likely to give lower grades to students with lower KS2 data even if the work is of the same standard as that produced by a child with a higher KS2 test score. If the child wasn’t really low ability to begin with, they can soon become so as they internalise the message that they aren’t one of the smart kids, drop further behind and become demoralised.
JL Austin’s work on the transformative effect of language gives a compelling explanation as to why this might happen. Austin showed that many of the things people say don’t just describe the world but have an impact on it. For example, in a Church of England marriage ceremony, when a vicar says “I now pronounce you man and wife” they aren’t simply describing something that exists but changing reality itself. We can extend this insight beyond speech to things like the KS2 tests. Those who advocate for these exams and scores might well say that they intend to change reality; they might say that this is the point. The issue, however, is the mismatch between the reality the test describes and the reality it creates, and it is here that we need to be especially careful. While it might be argued that these grades are appropriate for the subjects in which the child actually took tests (Maths and English), we are clearly on much shakier ground when we start using this non-specific data to guide the teaching of students in other subjects. After all, a vicar announces the banns of marriage before the ceremony and so can be confident in its legitimacy.
Good schools and caring teachers always stress that targets can be beaten, and some children do rise through the sets to dramatically exceed their target, but such instances are rare. Handing out high targets to some children and lower ones to others, in front of their peers, based on test data that might be four years old, can be a traumatic experience for some children. By giving a target we imply there is a ceiling on a child’s potential, and may well create the very low aspirations and low confidence we are trying to tackle by using targets. Some schools try to mitigate the disheartening effect of low KS2 scores by artificially inflating the target grades of weaker students, but this can be just as unkind. In history, a child with an old L3 at KS2 had only a 6% chance of achieving a ‘B’ grade target in 2015, which makes year-on-year failure, demoralisation and demotivation odds-on. For all students, from the most to the least able, it’s better to have high expectations and to focus on meaningful step-by-step improvements from the subject-specific point they’ve reached. Generic target grades are a distraction.
Target grades can wreak havoc at KS4, where, inappropriately combined with mark schemes, they can easily result in teachers focusing on the wrong things. A scheme for an old-style 8-mark question may say that students can get 6 marks for a one-sided argument. It may appear logical to translate this to a ‘B,’ and by doing so assume that a good target for a student with a ‘B’ grade target would be to “either agree or disagree with an interpretation and support with evidence.” But 6/8 does not mean a ‘B’. It means 6 out of 8 marks on a paper carrying over 80 marks in total, a paper which itself counts for only 50% of the qualification. A quick look through the past exam scripts of my students shows what a dangerous misconception this is. Of all my students who have got ‘Cs’, hardly any achieved the percentage equivalent of a ‘C’ consistently on every question. In almost every case, they got very high marks on some questions and less than half on others. The problem was not that they hadn’t grasped a ‘B’ grade ‘skill’, as the target system assumes, but that their knowledge was inconsistent across the breadth of the course. The target for these students should have been more focused revision, not teaching them to meet self-limiting criteria. A further implication of the skills-based model is that if a child is capable of getting full marks on one question, they should be capable of doing the same on all of them, which makes the idea of a generic or skills-based target grade even more absurd.
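The arithmetic here is worth making concrete. As a purely illustrative sketch (the paper structure and the numbers are hypothetical, not any real mark scheme), two students can reach exactly the same total in completely different ways:

```python
# Hypothetical paper: ten 8-mark questions, 80 marks in total.
# Student A scores the supposed "B-level" 6/8 on every question.
consistent = [6] * 10

# Student B scores full marks where their knowledge is secure
# and very little where it isn't.
uneven = [8, 8, 8, 8, 8, 8, 2, 2, 4, 4]

# Both students end up with the same paper total.
assert sum(consistent) == 60
assert sum(uneven) == 60
```

Any grade derived from the total of 60/80 therefore says nothing about whether any single answer met a supposed ‘B-grade’ criterion; the second, uneven profile is the pattern I actually see in scripts.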
These serious concerns got me wondering where this all began and why it became so widespread. I did some digging.
Widespread target setting in English schools began with the Fischer Family Trust. The Trust compiled a wide range of data, including prior attainment (not ability) and socio-economic variables, to make statistical links between these and the outcomes of individuals and groups of students. This information could be used by schools to assess how well their pupils were doing compared to children at other schools. This could be extremely useful and powerful information, allowing schools to see when their expectations were too low. Some schools began to use this information to set targets for their pupils, and some began sticking these targets to the front of their pupils’ exercise books to give them something to aim for. Schools were free to decide what these targets should be and most, as a result of a ‘my expectations are higher than yours’ arms race, set them at the very top end of what was statistically possible. Many schools set targets that, should all students achieve them, would place the school in the top 5% of those in the country.
The Fischer Family Trust never advocated this. Their original advice was that schools and students use the data as a starting point to begin discussions that would result in an agreed expected grade. To understand how it turned into what it did we need to look at wider political factors.
School league tables and ranking played a big role. Schools in disadvantaged areas realised that the raw attainment of their students would not compare well to that of more affluent schools, so they sought a measure that would demonstrate their pupils had made progress from lower starting points. FFT data offered this opportunity: a child who arrived in Year 7 with lower grades and from a more socially disadvantaged background was less likely to achieve as high a grade as one who arrived with higher grades from a more advantaged background. It seemed fairer to judge schools on the progress pupils had made since joining, measured against other pupils with similar contexts.
Target Grades and data tracking became inseparably linked, enshrined in the Teachers’ Standards, and encoded in the very DNA of English schools. After 2010 most schools stopped using FFT, which did try to recognise the effect of demographics on attainment, and began to form targets based purely on raw KS2 scores. This was partially because the DfE, under Michael Gove’s well-intentioned instruction, got rid of Contextual Value Added (CVA), believing that taking demographics into account meant accepting differentiated standards by advantage and the inevitable failure of the poor. Most schools now disregard context completely and simply add three or four levels to a child’s averaged KS2 data to make this their target. As well-meaning as this is, outside of English and Maths it is wrong-headed, because the point from which each child is supposed to be progressing has very little, if anything at all, to do with the subject they are studying. Transition models in most subjects support this, with few children at the lower end making the three levels of progress they are supposed to. In history, for example, in 2015 only 33% of children with a KS2 4c average reached or bettered the standard the DfE expected.
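For clarity, the crude arithmetic described above can be sketched in a few lines. This is purely illustrative: the function name and the use of plain numbers for levels are mine, not any school’s actual system.

```python
# Illustrative sketch of the crude target-setting arithmetic:
# average the two raw KS2 scores, then add a fixed number of levels.
def crude_target(ks2_english: float, ks2_maths: float, levels_added: int = 3) -> float:
    """Return a 'target level' from two KS2 scores plus a flat uplift."""
    return (ks2_english + ks2_maths) / 2 + levels_added

# A child averaging level 4 at KS2 is handed a target of level 7
# in every subject, including ones they have never studied.
print(crude_target(4, 4))                  # 7.0
print(crude_target(3, 5, levels_added=4))  # 8.0
```

Note that nothing in the calculation refers to the subject being targeted; that is precisely the problem.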
Ofsted inspections embedded this. While Ofsted have never officially required schools to share targets or put them on exercise books, schools that did so were praised and other schools, predictably, followed suit. Head Teachers working between 2008 and 2012 remember the practice spreading like a virus at local and national conferences as the idea that this was what Ofsted wanted took hold. As Alex Ford pointed out in his important post on how inspection regimes promoted extensive marking, praise or condemnation from Ofsted can very quickly become an important driver of school policy even when there is no evidence that the policy is effective.
A trawl through fifteen of the most recent Ofsted Reports for schools I’ve either worked at or know reasonably well suggests this is still fairly common. Of the fifteen I looked at, eight mentioned Target Grades explicitly. In all these instances comments were approving, either praising their use or recommending that targets be made more challenging. No Ofsted Reports questioned the use of Target Grades or the data on which they were based, which, of course, makes it seem logical for SLTs in struggling schools to insist on their use. Some reports included Target Grades in material on teacher appraisal and performance management. One team reported that at one school “teachers are aware they may not get a pay rise if students do not achieve their Target Grades.” The implications of this are worth thinking through. As I’ve already mentioned, targets in schools are no longer typically based on FFT data and are more commonly based on a simple numerical value (either three or four levels depending on the school) being added to each child’s mean KS2 score. This means that the targets of students in a class in a socially disadvantaged area may actually be grades achieved by only a very small percentage of children from similar demographics nationally. To successfully meet their appraisal targets, teachers at some schools have to achieve this with every one of the children in every one of their classes. Failure is inevitable, and such performance management systems have meant that employment at the most disadvantaged of England’s schools is perceived as a real career risk, which may well be making the recruitment and retention crisis more acute in the very schools where good teachers and leaders are most needed.
Despite all these issues, schools often continue to require that teachers know the target of every child in their classes and that children be able to parrot off these grades at the drop of a hat. Of course, students are also expected to know what they should do to reach their target but, because of the problems surrounding subject specificity outside Maths and English, such action points are often vague, non-specific and completely unhelpful. In history, I’ve seen ‘describe in more detail,’ ‘explain your points’ and ‘analyse the sources you use,’ which, while they may satisfy a school’s marking and reporting policy, are all pretty meaningless. Daisy Christodoulou has interesting things to say about other reasons for this in Making Good Progress.
So to summarise: the Fischer Family Trust gathered data on the grades students were statistically likely to get, and schools turned these into targets for subjects students hadn’t yet studied. Schools used FFT data to take context into account but in 2010 the DfE said they couldn’t, so instead schools started adding either three or four levels of progress to raw KS2 data. This has generated targets that some students are statistically highly unlikely ever to achieve. Ofsted didn’t tell anyone to put these target grades onto books or share them, but somewhere, some time around 2008, a school did. Inspectors reported approvingly on this policy and soon most schools were doing it. Some schools have tied this to Performance Related Pay, which has made meeting appraisal targets all but impossible for some teachers. If this sounds a confused mess, it’s because it is. Target Grades are an answer to a question nobody asked: a decade-long multi-vehicle wreck that only happened because nobody was driving. It is difficult to see any positive impact on the learning of children in English schools.
There will be those who seek to defend the policy, both within their own schools and across England as a whole. I anticipate the most common defence will be that Target Grades lead to faster progress and, of course, if there is convincing evidence that they result in better outcomes then the policy, for all its flaws, might be worth continuing with.
But there isn’t.
The evidence base on the impact of GCSE target grades based on KS2 data of any type is pretty much non-existent. To be blunt, to my knowledge, nobody at all has done any work on it. Given how widespread the practice is and the impact it has on the day-to-day working lives of both students and their teachers, this is quite staggering. Of course, this also makes it impossible to say it has no positive impact in individual contexts but, given such a confused birth and the many problems I hope I’ve demonstrated it causes, we must do better than that. Clearly, it would be very helpful if someone were to do a proper study of the impact of Target Grades on outcomes, and if I’ve missed a study that has been done I’d be grateful to anyone who could point me to it. In preparation for speaking on this at ResearchED Rugby, I’m going to be looking at examples of target setting in other domains to see if there is evidence of either positive or negative impact, while being acutely aware of the irony of this given the problems I’ve identified with non-specific target grades.
If we accept that target grades have a powerful effect on children (and here some research would be useful), then we need to be really, really careful about how we arrive at those grades. Nobody would expect a runner’s 100m PB to form their target for a marathon. Most of us would laugh at the absurdity of that, which is all the more reason to be careful about using non-specific test scores to generate targets in other subjects, even ones that seem similar. The differences between the disciplines may be more subtle than those between race distances but they are no less important.
Until there is evidence that Target Grades based on KS2 data do have a positive impact on student progress, I’d like Ofsted to tell inspectors not to ask students or teachers for them outside Maths and English. I’d like them to insist that inspection teams refrain from commenting on whether students are making progress based on either the presence or absence of Target Grades outside Maths and English. I think schools should stop using achievement of Target Grades, outside English and Maths, as a way of directly assessing the effectiveness of their teachers, and should not use them to make decisions around career and pay progression. Of course, schools actually don’t have much choice because of the nature of the accountability measures used to judge their effectiveness. The issues I’ve discussed have huge implications for the validity of the Progress 8 measure outside English and Maths because, when children arrive in Y7, there is no evidence whatsoever that they’ve reached any standard at all in any other subject. It is simply wrong to assume they are progressing from the same standard they reached in the subjects in which they did take tests.
Given their confused origins and self-limiting nature, I strongly suspect that any future research into the effect of Target Grades will not find they have a positive impact. The data they are based on has been misunderstood, misapplied and used inappropriately. Using it in the way we do just isn’t safe and has negative consequences. Of course, if positive research emerges, or it turns out I’ve got the wrong end of the stick, I’ll happily change my position, but my belief is that any serious study will cause the whole house of cards to come crashing down.
Nb. This is very much a work in progress. I’m aware I missed a lot while I was away and may have misunderstood some important things. I very much welcome critique and would be grateful to anyone pointing out factual mistakes, errors in my logic, relevant research or important things I’ve completely missed. Thanks to Lee Donaghy, Alex Ford, Jude Hunton and Tom Neumark for helping me with this.
Anything I’ve got wrong is, of course, my fault.