Q. What would a six-year-old make of how academics are prone to examine each other’s research metrics?
A. One big game of Top Trumps.
I’ve been playing Top Trumps a lot with my son lately, and it got me thinking about the assessment of research, particularly the REF. Here I sketch out some ideas about how an Academic Top Trumps game would work.
I have chosen six categories for the game (below). What would you choose?
Peer Reviewed Publications
I’ve spent a lot of time in my life playing Top Trumps; first as a child myself (footballers; sports cars; Turbos (it was the 1980s); steam trains); then on honeymoon in Scotland without a TV (Lord of the Rings, the Original Star Wars Trilogy) and now with my son (Dinosaurs; Star Wars – The Clone Wars). For the uninitiated, the idea is that the cards contain data (this is sounding good for researchers already) and two players pit their data against each other. The data are arranged as a number of different variables. All cards share the same variables. A player forms a hypothesis: an a priori prediction of which variable on his card has the best value. He reports this datum to his opponent, who compares it to the value on their own card, and the winner is the one with the highest value; they get to take the losing card, and move it to the back of the pack, in a counterbalanced fashion*. The winner ends up with the whole pack of cards.
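For the programmers among the uninitiated, a single round as described above might be sketched like this. It is only a rough sketch: the function name `play_round`, the card fields and the numbers are all made up for illustration, and a tie is simply left unresolved rather than settled by any particular house rule.

```python
from collections import deque


def play_round(leader, follower, category):
    """Play one round of Top Trumps.

    `leader` and `follower` are deques of cards (dicts of variable -> value);
    the leader has already chosen `category`. Returns the winning hand,
    or None on a tie (assumption: ties are left to local rules).
    """
    lead_card, follow_card = leader[0], follower[0]
    if lead_card[category] == follow_card[category]:
        return None  # nobody wins; both hands are left untouched
    if lead_card[category] > follow_card[category]:
        winner, loser = leader, follower
    else:
        winner, loser = follower, leader
    # The winner's own card and the captured card go to the back of the pack.
    winner.append(winner.popleft())
    winner.append(loser.popleft())
    return winner


# Hypothetical cards with invented values.
mine = deque([
    {"name": "Card A", "publications": 12},
    {"name": "Card B", "publications": 3},
])
yours = deque([{"name": "Card C", "publications": 40}])

play_round(mine, yours, "publications")
print([card["name"] for card in yours])  # ['Card C', 'Card A']
```

Repeating the round until one hand is empty gives the whole game, with the winner of each round choosing the next category.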
What would academic Top Trumps look like? I’ve noticed that there is always a very glamorous, obvious category on the cards – for race cars it is Top Speed. One finds oneself playing this category a lot. For steam trains it is Power, for Star Wars characters it is The Force. For academics, it would have to be Publications. Probably more specifically, Number of Peer Reviewed Publications. I can already imagine the joy of playing it now, the magnanimous splendour of secretly holding Professor Arthur L Suchabody in one’s hand, “Publications, 568”. I’ve always found the game more enjoyable if, to announce the category and value, one uses a jeering tone half-way between football scores announcer and House of Commons Speaker.
One hardly needs to spend much time around researchers to know that the number of publications is a big deal, and this is where Top Trumps gets serious, and points at the malaise in the system. I was honestly told that I could apply for a Senior Lectureship after I had 10 publications. It’s an ugly rule of thumb for promotion, a grotesque abbreviation of a whole career; as if this sole criterion could judge the multitude of skills I had developed and enthusiasms I had squandered. Plus this rule of thumb seemed to change according to the size of the thumb, and now there has clearly been an inflation of this modest target.
In the last few years, though, the h-index has replaced publications, as if it is now more important to receive than to give. I’ve heard academics say things like, “He may well have an h-index of 34, but his talk was rubbish.” Perhaps h-index and number of publications will be too closely related? This doesn’t worry me; correlated variables figure on most packs: height and weight (dinosaurs); top speed and acceleration from 0 to 60mph for cars. For the third category we need a Top Trumps classic, something more esoteric, qualitative even. The Star Wars packs have a Force Rating and the dinosaurs have a Killing Quotient. Such numbers are not factual; they are Likert scales in reverse, summarizing a character with a made-up number. The latest Clone Wars pack has ‘Honour’ and ‘Wisdom’, which I like. At first, I thought these might make nice categories for my academic cards. On reflection, however, I have only rarely encountered them amongst my peers who score best on the publication metrics.
We could do better as psychologists. We do actually use numbers to quantify characteristics in a way that means it is possible to compare individuals. So, in a parallel universe, I think we could use real personality test data in the cards. Yes, this does mean I am suggesting that academics may as well not stop at assessing colleagues as if they were just a sum of their outputs, but extend the metric to their very personality and intelligence. A fun variable for Top Trumps might be Egocentrism, as measured on the EPI.
But I am getting carried away. The idea is to measure output and success, not personality, despite what it may feel like climbing the greasy pole in a British University, so we must drop the idea of Egocentrism from our REF Top Trumps and look for a more reflective index of something important. So, I haven’t got ‘Ego’ or ‘Wisdom’ or ‘Honour’ on my cards. I thought for a while about ‘Reputation’, but of course, in ‘Impact’ we have the perfect Top Trumps rating of something which sounds important and measurable, but likely isn’t. We know that Luke Skywalker’s ability with a lightsaber is key; we know that Darth Sidious can eject lightning from his fingertips, but who has the greater Force, and what does it mean for society? Luke may well have made his own lightsaber, but has he patented it and started a spin-out company? Clearly it is important that our work has impact, but how can you put a number on that? And if my impact is in oranges and my colleague’s is in apples, what does that mean? Nonetheless, Top Trumps got to be as popular as it is now just by putting numbers to romantic characteristics, and thus Impact is the key ingredient for my cards. Just don’t ask how the rating is actually achieved.
The fourth category is similarly opaque: REF rating. The REF rating is a lot like ‘Jedi Power’ and ‘Magic’. That is, it offers only a limited range of values (0 to 5 Jedi Power; 0 to 4 REF Power), plus nobody really knows either how it is calculated or what it actually means, except that a higher number is good. Do practise saying to yourself, ‘The REF is strong with this one’ in a Darth Vader voice. It’ll cheer you up on a low-REF day.
We can’t all be leaders of our field, with a long list of publications and a citation count the size of an undergraduate’s overdraft. And this is where the quirky category comes in, something like ‘Deception’ (in the Clone Wars pack), and ‘Resistance to Ring’ or ‘Resilience’ (in the Lord of the Rings packs). It acts as a category which is often a trade-off with another, like top speed and fuel economy. It means that in your pack, everything or everyone has something to give. It also tends to be a bit of a funny number too, because it is often a low number when all the others are high. For instance, Yoda has a deception of 5, where some unpronounceable baddie has a deception of 37. As an experimental psychologist, I happen to think that Deception is a bad thing, so we play it in our family with a low score being better, but the way the cards are, and the way in which some characters have nothing to offer but being highly deceptive, I think the card makers might actually think deception is a good thing.
In my pack, to offset all those research numbers, I thought I would have ‘Contact hours’ (teaching). Have that, emeritus professor; eat my evaluations, research fellow in a non-teaching institution. Doubtless, there will be those who, when they are shuffling about their department or adding to their pack, will play the teaching category like a low number is best (Ace is high) and those who think a high number is best (Ace is very busy doing repeat teaching to level 1 stats classes).
To this ambiguous category which makes a virtue of teaching, I will add ‘Book Chapters’, mostly just because I like reading and writing academic book chapters, and aside from the lack of peer-reviewed rigour, I often find them a far more useful medium than they are given credit for. More than one senior academic has chastised me for writing a book chapter, “Why on earth would you write a book chapter?” and this is reason enough for me to enjoy writing them. So, whereas the player who holds all the cards in her hand may well look at a high number of book chapters as proof that the character being played has not been spending his time writing research papers, she may well be pleased that they have published four book chapters, when the young lecturer from New University down the road has none. The same argument may be made for conference presentations, books, journal editing, or reviewing work. All of these feel like good scholarly work, but all too often they are viewed as negative by those on high, looking at all the numbers; those who can see the trade-offs but not the individual at the centre of them.
There are other categories that didn’t make the cut. For pure fantasy, and just to underscore a frequent worry I have about academia, I could put ‘Carbon Footprint’ as one of my categories. Academia, in the age of the internet and decent audio-visual communication, seems to involve a disproportionate amount of jet travel. I should also like to add ‘Public Money Spent’, just to see how the experts and non-experts deal with this variable. Picture a VC and a taxpayer playing Top Trumps. The VC proudly declares ‘Professor James, publications, 346’. Because the game is set up like that, the taxpayer also has a top professor, with the same number of publications, 346; it is a tie. The rules are such that now, the VC can choose another category from the same card and try that one**. The VC smiles and chooses Public Money Spent (i.e. Research Grants from Government), £6.7 million. That’s a lot of money and the VC is confident of having won. Hearing this, the taxpayer is puzzled, and thinks his card, someone who has only spent £150,000 of the public purse for the same gain, must be the winning card, like the supercar with a better fuel economy. A row ensues. I haven’t added ‘Salary’ either, but naturally, that would be a saving variable for many a senior academic. I suggest that this could be expressed in multiples of the Prime Minister’s salary, just to make the numbers more manageable on the modestly proportioned cards.
Finally, there’s the issue of who goes in the pack, in order to make it enjoyable but realistic. No good having a set of cards all distributed with similar scores on the same variable, is it? A bit of variety is best. Also, it’s no good having only one domain, gender or age group represented. But one could have different categories, different sets – Russell Group Top Trumps, New University Top Trumps, Early Career Researcher Top Trumps, Open Access Champion Top Trumps, etc. Endless fun.
*not actually true, but I just like the sound of ‘in a counterbalanced fashion’.
** other resolutions of a tie exist, please see local rules.