Month: August 2020

Is there an algorithm to explain U-turns?

If you’re a cynic you might think that the government’s decision to do something about its catastrophic cock-up over algorithms in public examinations was largely driven by Boris Johnson’s desire to start his holiday. However, the back-pedalling will still be welcomed by candidates and schools alike. The interesting question is how we have got into this mess and how both government and Ofqual got it so wrong. It will be fun over the next few weeks to see how each of them blames the other!

It will be news to some people that the government regularly intervenes in the business of examination standards. Since the first School Curriculum and Assessment Authority (SCAA) was set up just over thirty years ago, the government has always had the right to inspect examination standards before results are issued and to comment. These comments are not public but there’s no doubt that they have an impact and that Ofqual and its predecessors perceive them as instructions.

It is also the case that government has a terrible fear of grade creep, which can be summarised as young people doing better in examinations than they ought to. The fear is not of school improvement but of castigation by members of Parliament, vociferous right-wing commentators, the right-wing media and the elderly. Bizarrely, this overlooks the fact that those who comment are generally the ones who succeeded in school examinations and went on to positions of power. They are unlikely to see the process which got them there as fundamentally flawed.

In some ways, it is a baffling position to take up. Fear of grade creep got rid of some really good improvements in teaching and learning like General Vocational Qualifications, modular A-levels, practical skills tests and coursework. The final stage, overseen by Michael Gove, was the return to single terminal examination papers taken at the end of the course. He is the least likely to apologise but his decision left no evidence base for 2020 assessments.

Anyway, going back to interventions, the way that Ofqual manages grading gives some hint as to why its behaviour this year has appeared so weird to those outside the process. Each year, and for every subject, the examination boards submit tentative figures to Ofqual along with their proposed grade boundaries. Ofqual aggregates these into what is essentially a national order of merit so that it can see, in theory at least, if one examination board is being lenient with its clients or another one being severe and if the national standard being contemplated is sufficiently in line with previous years to be explainable. That is an odd expression because some people would argue that the huge investment in education in the past few years, the expectation that teachers are better qualified and trained, more on-the-job training and professional qualifications for leaders should be reflected in a rise in standards. It’s a balancing act but, in general, Conservative governments like to be tough on standards and then to be able to berate the Labour opposition for being soft when in power. The Murdoch press and the Daily Mail like this as well!

If Ofqual advises the government that everything is hunky-dory and the maintenance of standards is guaranteed, and the government agrees with this, then the examination boards get the nod to finalise the results. In some subjects they may be told to make minor changes – typically downwards. Although the examining boards have made decisions about grade boundaries on the basis of standards by inspecting genuine scripts there is sufficient leeway within the process for this to happen without the awarding meetings being recalled.

This year, therefore, and in line with practice, the examination boards sent Ofqual a series of teacher assessments, probably expressed in terms of marks and percentages as in any other year. The Government implied in public and, probably, also in private statements that it did not expect a free-for-all in awards and that standards should be the same this year as in any other. So, when Ofqual discovered that nationally this was not the case, the natural tendency was to reach for the algorithm!

As everyone knows, there is nothing fancy about the Ofqual algorithm and the old statement that if you put rubbish in you get rubbish out seems to have held true. What the algorithm tried to do was to compare standards in schools in subjects year by year and compare the 2020 pupil cohort taking examinations with those of other years based on key stage assessments. That sounds a bit fancy but it’s like taking your 11+ score and seeing if it predicts your geography A-level grade! To test the algorithm they ran it on the data from 2019 and discovered it would have been 60% accurate. It seems not to have occurred to them that it would have been 40% inaccurate which is about what has happened this year!

I think some of the above explains the sheer blindness at Ofqual (to some extent shared with the examination boards) in failing to spot the iceberg which was looming up in front of them. If they had only thought about the human level they might have done better because it is worth underlining here that predictions are bread-and-butter in schools. If you have a twelve-year-old in their second year in secondary school it is quite likely that their report tells you what grade is predicted for them at GCSE four years down the line. So, the A-level student who says they were predicted three A grades is talking about an achievement trail which goes back at least three years, not something conjured out of the air. This year’s teacher assessment, known as the Centre Assessed Grade, builds on these sorts of forecasts. It’s a reliable measure.

We just need to divert to scotch all this nonsense about teachers being overly generous. A teacher assessment indicates what the student will do against exactly the same criteria as the examination but not under examination conditions, stress, nerves and whatever. As an indicator of potential rather than performance it is probably a better guide than an examination. You can have a political position on this but, yes, teacher assessments do deliver higher outcomes.

In the end, Ofqual failed to think things through and then it fudged. It didn’t tell government because it was doing what the government wanted and one suspects that the agency thought it would get away with it. It’s a mess now but not that bad. Young people have had a horrific 2020. Imagine, if you can, being locked down with your parents for a whole spring and summer when you are eighteen! It’s not a happy thought. It’ll be very interesting as well to see whether the 2020 GCSE cohort does better or worse at A Level in 2022. My suspicion is that by the time Ofqual have run a few new algorithms the standard will be almost exactly the same…



An Entirely Avoidable Mess and how to make it worse!

The roots of this week’s A-level results fiasco can be traced back to Michael Gove’s obsession with one-hit terminal exams. His nostalgic notion that if it was good enough for me then it’s good enough for today’s young people was always destined to hit the buffers somewhere. The latest announcement about using mock examination results as the basis for an appeal is simply going to make things worse.

Anybody who understands the purpose of teacher assessments appreciates that they are going to involve the allegedly awful consequence of grade inflation. If an education system is any good and if children learn more as a consequence then grade inflation should be the outcome, but that’s a bit difficult for right-wing politicians to understand. Teachers aren’t corrupt either. The teacher assessment takes into account the work done by a student over two years and posits what their best performance would be given an obliging examination paper, hard work in the final run-up and no nerves. Inevitably, therefore, the assessments are bound to be higher than the final outcomes from an examination. It’s a good process where it is used, for example, to award a grade to a student who misses the final exam because of some personal or family disaster which is entirely out of their control, or to provide a post-examination analysis of where things went wrong.

There is one other element to the teacher assessment which is that a good proportion of the candidates are in schools where parents have paid considerable sums of money for them to succeed. That’s a significant pressure on the teachers who work in those schools who, for quite intelligible commercial reasons, will be required to inflate assessments by the school management and hierarchy. It means that these teacher assessments are not quite the same as the others.

Mock examinations are not the same as teacher assessments either, although the two are often confused. They are simply not designed to be externally valid. Some schools seeking a very high pass rate in examinations will use them as a gateway to entry. Some teachers will use them to challenge lazy pupils and some teachers will use them to encourage the nervous underachiever. These are commonsense approaches to student learning but such varied purposes make a mockery of using mocks as a basis for some kind of appeals process.

What Ofqual did wrong, perhaps for understandable reasons, was to try to maintain a similar standard in terms of past percentages between this year and last, taking into account the teacher estimates and the past performances of pupils. When examination authorities do this they like to respect the teachers’ and the school’s order of merit but move the grade borderlines up and down to get the outcome they prefer. This preferred outcome is influenced by the school’s previous results and can also include analysis of the assessments made at key stage 2 (at eleven years of age) and the impact of the result on national statistics.

You don’t have to be a statistician to realise that this process is going to disadvantage a very bright child in a neighbourhood comprehensive and produce bizarre outcomes in schools where an additional factor like gentrification, an influx of refugees, changes to the school’s catchment or simply considerable pupil mobility over five years undermines the mathematics.
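For anyone who likes to see the mechanics, here is a minimal sketch in Python of the general idea described above: keep the school’s order of merit but hand out grades in the proportions the school achieved in the past. It is an illustration only and not Ofqual’s actual model; the function name `standardise`, the grade list, the student labels and the historical shares are all made up for the example.

```python
# Illustrative only: NOT Ofqual's actual model. All names and figures
# below are hypothetical.
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]  # best to worst


def standardise(ranked_students, historical_share):
    """Give grades to students (listed best first) so that the school's
    grade distribution matches its historical shares (fractions of 1)."""
    n = len(ranked_students)
    results = {}
    cumulative = 0.0
    position = 0
    for grade in GRADES:
        cumulative += historical_share.get(grade, 0.0)
        # Number of students who should hold this grade or better.
        cutoff = min(n, round(cumulative * n))
        while position < cutoff:
            results[ranked_students[position]] = grade
            position += 1
    # Anyone left over through rounding gets the lowest grade.
    for student in ranked_students[position:]:
        results[student] = GRADES[-1]
    return results


# A hypothetical school that has never awarded anything above a B.
students = [f"s{i}" for i in range(1, 11)]  # s1 is ranked top
past = {"B": 0.2, "C": 0.5, "D": 0.3}
awarded = standardise(students, past)
print(awarded["s1"])  # prints "B", whatever the teacher predicted
```

The top-ranked pupil can never get more than the best grade the school has managed before, which is exactly the bright child in the neighbourhood comprehensive. The sketch also shows why a change in intake breaks the sums: the historical shares describe a different set of children.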

Worse still, in the current situation, the approach doesn’t benefit rapidly improving schools of the kind which the government likes to trumpet where, for all sorts of reasons, standards might be expected to rise faster than the average. In overall terms and for everyone, it is the students on grade borderlines or thereabouts who are likely to lose out but they are also likely, given that this is an A-level, to have concerned, middle-class voting parents who know how to complain!

Could things have been different? The answer is that without Michael Gove’s intervention the assessments would have been considerably more reliable. It was only a few years ago that there were AS Levels which assessed the first year of a two-year A Level course, there were modular exams where credit for the candidate was accumulated over the whole two years and there was coursework, distinct from teacher assessment because it was evidence presented by the candidate to meet external criteria. Additionally, coursework was often approached under test conditions. Had any of these approaches survived they would have provided a statistically reliable component to provide the starting point for a valid assessment.

Is anything to be done? The Scottish approach of giving people teacher-assessed grades and biting the bullet of significant single-year grade inflation may be justifiable given the challenges which the pandemic has created for young people. It will also encourage some interesting research options! If it turned out that the young people who get places at Scottish universities with these 2020 grades go on to perform at degree level just as well as their peers from any other year, that would be interesting. It might suggest that we have an examination system which is managed to depress the achievements of young people and to restrict access to what are perceived to be high-status professional careers while acting as a brake on the aspirations of the disadvantaged seeking a better future. Perhaps a little grade inflation is not such a bad thing after all!