Claims of prejudice in student evaluation of teaching surveys (SETs) are greatly exaggerated, with biases producing minimal distortions that often favour marginalised academics, Australian research suggests.
A Monash University study of almost 400,000 student evaluations has contradicted a widely held conviction that SETs are skewed against academics from minority backgrounds and tarnished by students’ “bigotry” and “revenge reviews”.
Rather, assessments tend to favour female academics and those from non-English-speaking backgrounds, with the harshest reviews produced by male students for their male teachers.
And while students’ citizenship affects the results slightly, its influence tends to be positive, with international students of both sexes generally offering more upbeat reviews than their domestic peers. They are also more likely to fill out the surveys.
The study, published in Assessment & Evaluation in Higher Education, also dismantles a theory that SET scores are distorted by vengeful students aggrieved at their marks. Only around one in five students who had failed their subjects ended up completing the surveys, with successful students exhibiting far higher response rates.
Almost half of the surveys awarded teachers the highest possible rating, with the 3,700-plus assessed academics earning a mean score of 13.2 out of 16. Study author Richard O’Donovan said it would be rare for student cohorts to achieve such “astonishingly high” grades.
“This is a good news story,” he said. “Students on average are actually very happy with the teaching that they’re experiencing.”
His study found that while student and course characteristics had “statistically significant” impacts on the ratings, most of these effects were “negligible”.
“If you have the luxury of a large sample, it’s like having an electron microscope. You can find all sorts of fascinating things that aren’t visible to the naked eye. But you’ve got to zoom back out [to appreciate] the real-world implications. When you do that, you start to see that the effect sizes are typically small or trivial.”
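The distinction between statistical significance and real-world effect size can be made concrete with a short worked example. The sketch below is purely illustrative and not drawn from the study's data: it simulates two hypothetical groups of ratings out of 16 whose true means differ by a negligible amount, and shows that a very large sample still yields a tiny p-value even though the standardised effect size (Cohen's d) is trivial.

```python
# Illustrative sketch only (hypothetical data, not the Monash dataset):
# with a very large sample, a tiny difference in mean ratings is
# "statistically significant" yet negligible in effect size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 200_000  # a large-sample scenario, broadly comparable in scale

# Two groups whose true mean ratings (out of 16) differ by only 0.05
group_a = rng.normal(loc=13.20, scale=2.5, size=n)
group_b = rng.normal(loc=13.25, scale=2.5, size=n)

t, p = stats.ttest_ind(group_a, group_b)

# Cohen's d: mean difference scaled by the pooled standard deviation
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
d = (group_b.mean() - group_a.mean()) / pooled_sd

print(f"p-value  = {p:.2e}")   # tiny p-value -> statistically significant
print(f"Cohen's d = {d:.3f}")  # around 0.02 -> negligible practical effect
```

In this hypothetical setup the test comfortably clears conventional significance thresholds, yet the effect size falls well below the usual benchmark for even a "small" effect, which is the pattern the study describes.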
Dr O’Donovan, a senior lecturer at Monash’s School of Curriculum, Teaching and Inclusive Education, found no evidence that SET scores were strongly influenced by the size of classes or the proximity of surveys to exams. “Average sentiment about teaching remains very consistent regardless of when the responses occurred, and whether a unit had examinations or not,” the paper says.
The analysis found that fewer than one in 1,000 survey responses had been flagged by survey software as containing “rude and hostile” comments. “As we enter into the world of generative AI, it’s getting easier…to detect and intercept those sorts of things that can clearly be stressful for some people,” Dr O’Donovan said.