
GenAI has not broken assessment. It has exposed it

Recently, conversations across the sector have been circling the same uncomfortable territory: generative artificial intelligence, assessment and whether higher education is responding to the right problem. Most of these conversations focus on the disruption: GenAI as an external force acting upon a stable system. But that framing misdiagnoses what is actually happening.
Assessment in higher education was already under strain. GenAI has not created a crisis – it has made an existing one impossible to ignore.
For decades, university education has operated on a quiet assumption: that what a student submits reflects what a student understands. That assumption has never been entirely reliable. The gap between output and understanding has always existed.
What GenAI has done is widen that gap dramatically, and flood it with light. Students can now produce high-quality academic work with unprecedented ease. But the more significant shift is not technological – it is behavioural. Where assessment rewards polished outputs, it is entirely logical to prioritise producing those outputs as efficiently as possible.
This is not a failure of character. It is a predictable response to incentive structures we designed.
What these conversations keep returning to is a deeper misalignment between what higher education claims to develop and what its assessment systems actually reward. Critical thinking, intellectual independence, judgement: these are the stated goals.
But judgement, in particular, is rarely what our assessments are explicitly designed to reveal. Judgement is what allows a student to decide what matters, to weigh evidence, to navigate ambiguity, to apply knowledge in context. It is also what determines how GenAI gets used. Whether a student uses GenAI to think more deeply or to bypass thinking altogether is not a technical question. It is a question of evaluative judgement.
Across the sector, GenAI use is now near-universal, but far from uniform. Students are not simply “using AI”. They are navigating it: moving between using it as a tutor, a thinking partner, a structural aid and, at times, a production tool. The same student may shift between these modes within a single assignment, depending on time pressure, clarity of expectations and perceived risk. Institutional responses that frame GenAI use simply as permitted or prohibited miss this complexity entirely.
What the evidence keeps pointing to is this: GenAI itself is not the primary driver of student behaviour. Assessment design is.
Where assessment includes a genuine moment of accountability – one where students must explain, apply or defend their thinking – GenAI tends to be used in ways that support learning. Students question, test and refine their understanding.
Where assessments don’t include these moments, the same students may use GenAI to complete tasks efficiently, with minimal intellectual engagement. The issue is not whether students use GenAI. It is whether the system requires them to understand what they produce. If understanding is not required, GenAI makes it rational not to learn.
This is not a story of disengaged students. In sector discussions, students consistently describe their most meaningful learning as occurring in moments of genuine accountability, where they had to think, apply and defend their ideas, often in dialogue with others. What they encounter too often instead are assessments reducible to aligning outputs with marking criteria.
The growing fairness problem deserves more attention than it is getting. Many institutions now have GenAI policies, but guidance varies significantly across modules, programmes and individual lecturers. The result is not simply confusion; it is unevenness. More cautious students may avoid GenAI altogether and disadvantage themselves relative to peers operating within fewer constraints. It’s as much a question of equity as it is of integrity.
So where does this point? Three things feel particularly clear from where the sector conversation currently sits.
Assessment needs to move beyond the evaluation of outputs towards the evidencing of understanding. Tasks must be designed so that students are required to account for what they know: explaining, applying and defending their thinking in ways that cannot be outsourced. These are not minor technical adjustments – they are the crucial moments where judgement becomes visible.
Assessment also needs to become more programmatic and developmental. Judgement does not emerge in a single high-stakes performance. It develops over time, through iteration, feedback and reflection. Structures that allow students to revisit ideas and demonstrate how their thinking has evolved are not a softening of standards; they are instead a more authentic expression of them.
In practice, this could mean replacing essays with short, structured explanation checkpoints where students must articulate and justify their thinking, requiring annotated portfolios that show how ideas evolved (and where GenAI was used), or designing scenario-based tasks that force decisions under ambiguity rather than polished reproduction. It might involve structured debates, where students must respond in real time, rather than submit prepared work. These approaches do not eliminate GenAI; they make it irrelevant unless understanding is present.
And institutions need to move beyond the binary of permitted versus prohibited. The more productive question is what role GenAI should play in a given task, and where judgement must remain with the student. Making this explicit is both fairer and more educationally honest.
None of this requires abandoning rigour. It requires a more genuine form of it, one that centres not just what students produce, but how they think.
GenAI has not broken assessment. It has made visible a system already under strain, and opened a window to redesign it around what higher education has always claimed to value: the development of judgement. That window will not stay open indefinitely.
Lucy Gill-Simmen is vice-dean for education and student experience in the School of Business and Management at Royal Holloway, University of London.
If you would like advice and insight from academics and university staff delivered direct to your inbox each week, sign up for the Campus newsletter.