While writing a recent paper, we found ourselves continually worrying that certain terms might cause our text to be deemed AI-generated. And when we mentioned this to colleagues and students, it became apparent that they had had similar concerns when writing their papers and essays. A new wave of paranoia appears to be sweeping across higher education as everyone becomes an amateur expert in AI detection.
During discussions about students’ and colleagues’ writing, it is now common to hear phrases such as “I can tell it is written by AI”. A recent conversation that gained traction on social media revolved around specific vocabulary deemed to be overly common in content generated by ChatGPT, such as “foster”, “delve”, “in the realm”, “endeavour”, “thrilled” and even “delighted”. Other adjectives identified as AI favourites include “commendable”, “innovative”, “meticulous”, “intricate”, “notable” and “versatile”.
However, we should be very careful before appointing ourselves judge and jury over whether our colleagues and students have really written the content attributed to them. One problem is that human AI sleuths – just like their human-programmed mechanical cousins – have limitations and biases, particularly with regard to non-native English. This could make non-native speakers paranoid about engaging fully in academic discourse for fear that their work will be deemed AI-generated. It could also stifle their creativity and hamper their development of the genuine voice that is an essential component of effective writing and critical thinking.
Second, the presumption that certain words or phrases are indicative of AI use excludes a wide range of expressive possibilities and overlooks the diversity of native English usage worldwide. For example, “delve” is commonly used in former British colonies such as Nigeria. Inevitably, that means that Nigerian students and writers are more likely to use it in their academic discourse than other English speakers are, making them disproportionately likely to be accused of using AI.
A third problem stems from the fact that, in reality, detecting AI-generated content isn’t nearly as simple as counting the frequency of certain telltale words and phrases. This isn’t how AI works. The likes of ChatGPT are trained on vast datasets derived from human writing across genres and contexts. Hence, AI models do not develop their own dialects; they simply regurgitate the language they have been fed. So if they use words such as “delve”, it is because words such as “delve” appear relatively frequently in the existing literature.
This is another reason why we are foolish to believe ourselves to be experts in AI detection. Those supposedly telltale words are all out in the wild anyway, and the list of them is likely to differ from “expert” to “expert”. From the writer’s perspective, that means the goalposts will constantly move in terms of what otherwise useful phrases they should avoid if they don’t want to risk their writing being flagged as AI-generated.
Then there is the issue of who feels entitled to judge whom. There is a whole world of power dynamics involved in determining who assumes the authority to detect and whose work is subject to detection. But however entitled someone feels to judge, the shortcomings of human judgement in this area mean that, in reality, no one has the epistemic authority to judge whether anyone else has used AI.
Policing without the requisite expertise can lead to false accusations against students and writers. Those who lack the ability to challenge such accusations might suffer disciplinary and reputational repercussions as a result. The paranoia such policing induces can also compel students and writers to stop using familiar terms to avoid accusations that their work is AI-generated even when those terms are the most expressive of what they want to convey.
Given the legitimate scholarly potential of generative AI, we should also be wary of creating an atmosphere of universal censure and suspicion that drives its use underground. While some journals do not permit any use of it, the more forward-looking ones (some of which are very prestigious) already permit the use of AI assistants for grammar and spelling checking, and even data analysis.
Academic integrity needs to be redefined in this era of AI, and the purpose of education re-evaluated. Perhaps we should focus on teaching students and staff to use generative AI responsibly, rather than trying to catch them out for using it regardless of the ethics of their intent.
Ultimately, scouring papers and essays for words dubiously associated with generative AI will not stop people using it. And if they are doing so transparently and responsibly, should we even be trying to stop them?
Lilian N. Schofield is a senior lecturer in non-profit management and Xue Zhou is reader in entrepreneurship and innovation, both at Queen Mary University of London.