August 12th and all ptarmigans and teaching teams run for cover

Today is August 12th and marks the start of the shooting season for Grouse, Ptarmigan and the Common Snipe.

It also, entirely uncoincidentally, marks the release day of the 2015 National Student Survey results in the UK. With much discussion around the introduction of new metrics and outcome criteria for the proposed Teaching Excellence Framework (TEF), and with HEFCE planning a review of NSS questions in 2016/17 to possibly include student engagement, it is certainly worth taking a step back to think about the mathematics of it all.

Are metrics losing the plot?

I, along with more notable others, have been concerned about the gamification of these metrics and the emphasis on strategies used to encourage students to participate. In my blog post last year, “NSS is the name of the game”, I looked at some of the satisfaction data, pondered its overall usefulness (“add the shoe sizes of VC’s into league tables! Would be just as accurate“) and questioned the ethics of some of the approaches to gathering the data (“we were pressed by tutors to answer certain questions in a particular way“). I have myself heard someone tell a student that if they don’t give a good NSS result, then employers won’t consider recruiting students from their university.

As I asked in my blog last year after applying some statistical tests: is what we see a genuine annual increase in students’ claimed level of satisfaction, or is it the result of a carefully honed process for gathering the data?

What about this year?

The 2015 NSS results were published by HEFCE this morning. There were no changes to the benchmarking this year, and the only minor change was that the cut-off for including small datasets was lowered from 23 to 10 respondents.

NSS Q22 2015

FIGURE 1: Q22 NSS results 2010-2015. CC BY Viv Rolfe


The data above arranges English HEIs alphabetically. What we see is that the 2015 data (orange) aligns very closely with 2014 (turquoise). In 2014 the benchmarking was altered and the brakes were put on the system, and this was the first year in which there was no significant increase in overall satisfaction across the English university sector (ANCOVA tests previously reported).

Looking at Q22 in terms of annual means and standard deviations (below), a yearly increase in overall satisfaction is apparent across the 127 English HEIs. What is interesting is the reduction in variation across institutions, and one has to question whether the NSS is becoming less discerning.

NSS Q22 Mean

FIGURE 2: Q22 Mean NSS results 2005-2015. CC BY Viv Rolfe
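As a rough sketch of how a summary like Figure 2 can be computed, the snippet below calculates the annual mean and standard deviation of Q22 scores across institutions. The scores and column names here are invented for illustration, not the actual HEFCE spreadsheet:

```python
import pandas as pd

# Toy Q22 "% satisfied" scores for three hypothetical HEIs over three years;
# the mean creeps up while the spread between institutions narrows
q22 = pd.DataFrame({
    "institution": ["HEI A", "HEI B", "HEI C"] * 3,
    "year":        [2013] * 3 + [2014] * 3 + [2015] * 3,
    "satisfied":   [78, 84, 90,   81, 85, 89,   84, 86, 88],
})

# One row per survey year: mean and standard deviation across institutions
summary = q22.groupby("year")["satisfied"].agg(["mean", "std"])
print(summary)
```

On these toy numbers the mean rises each year while the standard deviation falls, which is the pattern described above: overall satisfaction up, institutions harder to tell apart.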


Results by mission group?

It is interesting to split the analysis by mission group, separating the 19 Russell Group and 18 University Alliance institutions from the others. Extrapolating the data suggests that in 2021 there will be a big party as these universities outstrip the performance of the Russell Group, should the survey remain unchanged. But that is unlikely.

NSS Mission Group

FIGURE 3: Trended Data for Q22 By Mission Group. CC BY Viv Rolfe



Hello – David here. You may remember me from such blogs as “Followers of the Apocalypse” where I write primarily about UK HE policy making. When Viv showed me this data set I was fascinated to see trends in the NSS, and immediately started to think about implications for the proposed “Teaching Excellence Framework” (TEF). 

Figure 2, above, shows the variability in NSS scores between institutions decreasing with each NSS iteration. There could be a number of drivers for this; I would suggest that it perhaps shows institutions getting better at running the process, and at getting the message out to students that a good institutional NSS score is good for the perceived value of their degree from that institution. Manifest nonsense, obviously – but if metrics are good for one thing it is developing faith-based belief systems!

When the NSS was originally developed, the scores were primarily used at a course (or at worst, subject area) level. This allowed prospective students to compare the attitudes of students doing a similar course at different institutions. Anyone who works in an HEI will tell you that variability between departments and subject areas is huge, and indeed most of the pain experienced by academic staff on the “glorious twelfth” will concern this intra-institutional variation.

Johnson Minor’s TEF would (we are led to believe) operate at an institutional level, and Osborne announced that it would affect an institution’s ability to increase student fees to match inflation (as if inflation were some kind of optional extra rather than a reflection of rising costs). If NSS results at an institutional level are included in the proposed “basket” of metrics within the TEF, the decreasing inter-institutional variance shown in Figure 2 implies that it will become harder and harder to discriminate between institutions.

Of course, this may be what BIS want (so all institutions can increase fees with the figleaf of independent oversight justifying it – see also OFFA!), but in this case it seems like a very expensive way to pretend that you are not making HE more expensive for the taxpayer. But I suppose BIS are used to spending lots of money to do that kind of thing. Such is politics.

So that’s why I think this analysis is important.

[declaration of interest: I received 1 pint of beer for writing the above]


Thank you David for your wisdom and insight. The discipline variation and its impact on teaching teams also concern me, as we are often held to account for much of what goes on behind the scenes of successful teaching (timetables, IT, the efficiency of academic administration systems), which is not reflected in the survey.

It just remains to say, and I know all the readers are dying to know, that a ptarmigan is a slightly plump bird with beautiful plumage. I hope that, like most teaching teams today, it manages to dodge any bullets and experiences nothing but a mild ruffling of feathers.

Rock Ptarmigan

By Jan Frode Haugseth (Own work) [CC BY-SA 3.0 or GFDL], via Wikimedia Commons



  • Download summary data from HEFCE.
  • Data for English HEIs was cleaned – aligning university names with the 2014 recorded names (e.g. The University of Bath was listed as University of Bath in 2014). The data was then sorted.
  • Registered data was used – that is, the data represents the institution where the student was registered (as opposed to Taught, the institution where students do the majority of their year 1 study).
  • Data covers all full-time and part-time students.
  • In 2010 the benchmarking changed and the data was adjusted for ethnicity. Interestingly, the data is not adjusted for socio-economic background.

Other articles that week:

Chris Hanretty (2015). When communicating uncertainty goes wrong. Available:

Keith Burnett (2015). Available:




NSS is the name of the game…

…and I wanna play the game with you.

What do these items have in common?

Amazon gift vouchers.
Drinks tokens.
A free iPad.
Tickets to the ball.
Free entry to “Scouting for girls”
£10 printing credit.

You may be forgiven for thinking these items are about to feature in a 2015 version of the Generation Game (where a varied assortment of items was placed on a conveyor belt and contestants had to memorise them to win prizes). No. These items signify the arrival of that good old time of year again. Roll up. Roll up. It is the National Student Survey.



History of the survey 

The National Student Survey (NSS) in the UK started in 2005 and asks university students in their 3rd (mostly final) year of study a series of questions relating to their experience on their chosen course (NSS, HEFCE 2005). The survey was intended to measure the ‘quality of higher education’. The questions have largely remained unchanged, and the most acclaimed – and the most likely to bring vice chancellors out in an uneasy sweat – is Q22: “overall, how satisfied are you with your course?” The survey has been modified over the years and now includes additional questions for NHS-based courses. The data presented is subject to a number of benchmarks and adjustments.

The survey was launched to a fair amount of early resistance, questioning the poor style of analysis that bundles the data together and reduces the clarity of results, the reliance on broad and unspecific questions, and, quite worryingly on the back of this, the fact that universities were subsequently making important strategic decisions based on the outcomes of the survey (Jumping through hoops on a white elephant, 2008).


Is experience becoming more important than education?

The survey, despite these early robust discussions, has not gone away – quite the contrary. Whilst the survey itself remains largely unchanged, the resourcing and time invested by institutions to ensure an effective process for collecting responses is now a significant activity in every academic calendar. It is part of the ‘business’ of Higher Education and feeds the sector’s growing appetite for key performance indicators, data, league tables, frameworks and benchmarks, as we all set to the business of ‘measuring’ what education is. Couple this volume of work with poor data management systems and the increasing need to deliver and duplicate this data in a variety of forms to other places, and it is not surprising that the administration and teaching teams within universities are under huge pressure and experience unhealthily long working hours (UCU 2014 Survey). As with other areas of the public sector, I’m sure we’d much rather spend our time educating young people and doing the job we were originally appointed to do.

But what will happen next – some positive action, or collapse? I sense a rise in scepticism across the sector, as reported in a previous blog post (Guinea pigs in a maelstrom, 2014), where at the Society for Research into Higher Education annual conference Bob Burgess and Jurgen Enders asked, with regard to league tables:

Aren’t there bigger problems to solve?


Like an academic arms race.


Add the shoe sizes of VC’s into league tables! Would be just as accurate.

I am optimistic we are on the verge of one brave institutional leader saying enough is enough.


So does it provide a useful view of the quality of education? What do people think?

I don’t doubt at all that prospective students and their families should be better informed about the performance of an institution to which they might be making a considerable and hefty commitment. However, are league tables the best way of doing it? I question whether people read, or indeed understand, the increasing numbers of them. Having spent 12 years doing open days at three different universities, I can honestly say I cannot recall it being a subject of conversation once. University choice is about the gut feel of a place; it is about coming to an open day and meeting great existing students, academic and technical staff. If anyone makes a decision based on the position of a university in a table alone, they must be pretty mad.

But what do people really think, and going back to the cuddly toy, are incentives wrong? Incentives are an established means in market research of improving response rates and the quality of responses to questionnaires. You might think they naturally bias responses toward being more favourable? Research shows this is not always the case, but the optimum incentive point must be found, otherwise the opposite can happen and respondents start to get pretty naffed-off. One way to minimise any bias would be to have students complete the survey independently of their university – they already complete the ‘Destination of Leavers from Higher Education’ (graduate employment) survey 6 months after leaving, so why not the NSS? Why not manage the survey centrally via the student union to relieve the burden on the teaching teams?

So what do students think of the survey? We don’t really know for sure, but one media article about it attracted a colourful range of comments:

Doing ourselves a favour by reviewing the university positively


We were pressed by tutors to answer certain questions in a particular way


Taxpayers deserve more open, fuller accountability by this sector because of the huge amounts now spent and the financial burden put on our young people.


Still, I remember my uni days fondly and would encourage anyone to seek out a uni experience and screw the untrustworthy rankings.


You can read the conversation for yourself. (BBC, Universities face survey warning, 2008).


What about the meaning of the data?

So what do we really know about the meaning of the data? Do we ever really sit and question it? All the data is openly retrievable, with datasets going back to 2005 on the HEFCE website. This is the approach I took to looking at it in the first instance.

1) I downloaded all the Higher Education yearly datasets into an Excel spreadsheet.
2) I looked at the ‘registered’ data as opposed to the ‘taught’ data – that is, responses attributed to the institution at which the student was registered rather than the one where the majority of teaching took place.
3) I manually corrected the variation in university names over the years, using the latest name for those institutions that had been renamed.
4) I sorted the data by institution to allow for comparisons across each year.
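The cleaning steps above can be sketched with pandas. The column names, the name-fix mapping, and the example rows here are illustrative assumptions rather than the actual HEFCE file layout:

```python
import pandas as pd

# Toy stand-in for the downloaded yearly datasets (step 1)
raw = pd.DataFrame({
    "institution": ["University of Bath", "The University of Bath", "Aston University"],
    "year": [2014, 2015, 2015],
    "basis": ["Registered", "Registered", "Taught"],
    "q22_satisfied": [88, 89, 85],
})

# Step 2: keep only the 'Registered' rows
registered = raw[raw["basis"] == "Registered"].copy()

# Step 3: align renamed institutions with a single canonical name
name_fixes = {"The University of Bath": "University of Bath"}
registered["institution"] = registered["institution"].replace(name_fixes)

# Step 4: sort by institution, then year, so years line up for comparison
cleaned = registered.sort_values(["institution", "year"]).reset_index(drop=True)
print(cleaned)
```

After these steps each institution appears under one name, with its yearly scores adjacent, which is what the year-on-year comparisons below require.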

Work by Paula Surridge informed the use of benchmarking, which adjusts for subject, ethnicity, age, mode of study, gender and disability. It does not adjust for socio-economic group, which on the surface is rather surprising. In 2008 the benchmarking changed, so comparisons with data prior to that are not terribly useful.


NSS Q22 % Satisfaction

Figure 1. Overall % satisfaction across 30 HEIs from 2005 to 2013, illustrating the change in benchmarking in 2008 and the subsequent compaction of the data.



NSS % Satisfaction ALL HEI

Figure 2. Overall % satisfaction, HEIs in England, 2010-2014. (Arranged alphabetically)


When I first plotted this out, arranging the HEIs in England alphabetically, I thought it looked pretty and rather interesting. My partner thought it looked like the German World Cup football strip. My statistician, to whom I gave the data blinded, observed “clearly some pattern and cyclical event going on”. I enlisted the help of a second statistician to analyse the data.

Performing an ANCOVA to compare year-on-year differences, there were significant differences between each year group with the exception of 2013 to 2014. Each institution was incrementally better year upon year until 2013, when there was another benchmark change.
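The ANCOVA itself was run on the full dataset; as a minimal, self-contained illustration of a year-on-year comparison, the sketch below runs a paired t-test (the same institutions measured in two consecutive years) on invented scores. A large p-value, as here, would indicate no significant change – the kind of 2013-to-2014 result described above:

```python
from scipy import stats

# Invented Q22 scores for eight institutions in two consecutive years;
# the second-year values barely move, mimicking the 2013 -> 2014 finding
q22_2013 = [80.0, 82.0, 85.0, 87.0, 90.0, 78.0, 83.0, 86.0]
q22_2014 = [80.5, 82.0, 85.2, 86.8, 90.0, 78.1, 83.0, 86.2]

# Paired test: each institution acts as its own control across the two years
t, p = stats.ttest_rel(q22_2013, q22_2014)
print(f"t = {t:.2f}, p = {p:.3f}")
```

With p well above 0.05, we would not reject the hypothesis that satisfaction was unchanged between the two years.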

Conclusion? The data suggests students are more satisfied year on year with HEIs in England. Or are the processes used to gather the data improving year on year?



Data was sorted according to Russell Group, Alliance University or other.


Figure 3. % Overall satisfaction by group and extrapolation of data. (RG Russell Group, UA Universities Alliance, Oth Other)


By sorting universities in England into their commonly referred-to groupings and extrapolating the dataset, the Russell Group clearly achieves higher satisfaction rates than the Alliance Universities and all others. The data does, however, show their rates of satisfaction slowing down. Should the benchmarking remain unchanged, by 2018 the UA and others will match the performance of the Russell Group. By 2023 the ‘others’ nose past the winning post, being the first to reach 100%.
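The extrapolation amounts to fitting a straight line to each group’s annual satisfaction and solving for the year the trend lines cross. The yearly figures below are invented to mirror the trends in Figure 3 (Russell Group high but rising slowly, Alliance lower but rising faster) rather than the real group means:

```python
import numpy as np

years    = np.array([2010, 2011, 2012, 2013, 2014])
russell  = np.array([88.0, 88.5, 89.0, 89.5, 90.0])  # high, rising slowly
alliance = np.array([80.0, 81.5, 83.0, 84.5, 86.0])  # lower, rising faster

# Least-squares linear fits: satisfaction = slope * year + intercept
m_rg, c_rg = np.polyfit(years, russell, 1)
m_ua, c_ua = np.polyfit(years, alliance, 1)

# The lines cross where m_rg * y + c_rg == m_ua * y + c_ua
crossing_year = (c_ua - c_rg) / (m_rg - m_ua)
print(round(crossing_year))
```

On these toy numbers the gap (4 points in 2014) closes at 1 point per year, so the trends meet in 2018 – the kind of crossing-point calculation behind the projection above. The obvious caveat, as noted, is that extending a straight line to 100% satisfaction takes the model well outside the range of the data.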

Improvement in satisfaction or processes?

It is not clear what the nature of these observations really is, and from all my discussions of the data, some interesting hypotheses have emerged. I hope this article prompts some serious data analysts to interrogate the datasets more fully. I have done the same analysis on other questions – and there are interesting differences with those also.

But do we see this elsewhere? In the 2014 REF the sector seemed to improve incrementally, with suggestions that the evaluation is flawed (The Guardian, REF 2014), and whilst we cannot doubt the amazing and outstanding work that goes on in UK universities, we could say that this too was partly due to improvements in the system.

One thought that has come up a few times is that we are dealing with a system that is corrupt – by human nature, if you set us targets and measure our performance, we will work to comply with those targets. We all know that is exactly what happens – how else do we spend much of our academic time?

Donald Campbell was a social scientist who observed this in his writings in 1976:

The more any quantitative social indicator (or even some qualitative indicator) is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

Charles Goodhart, an economist, made similar observations in 1975, in what is now known as ‘Goodhart’s Law’:

When a measure becomes a target, it ceases to be a good measure.


Where do we go from here?

I do think the HE sector needs to take a good look at itself and understand fully the series of measures and targets against which our performance is increasingly evaluated (research, student satisfaction, teaching performance). If we do persist in having monitoring systems, they have to be run effectively. We are distracting academic staff from doing their jobs, and the pressure on teams to get good results can, as we’ve seen with the REF, develop a sinister side (The Guardian, REF 2014). Higher education is not a theme park delivering a jolly experience. It should be a nurturing and at times challenging one, enabling learners to develop and achieve meaningful goals, and I would therefore be quite happy if my students were at times left a little unsatisfied because I had chosen to stretch their thinking and their approaches.

But meanwhile.

“Good game. Good game”.