Policing AI use by counting ‘telltale’ words is flawed and damaging

Making people paranoid about employing familiar and useful words is not the way to encourage responsible AI use, say Lilian Schofield and Xue Zhou 

May 3, 2024
A drawing of a Hollywood private investigator
Source: iStock/breakermaximus

While writing a recent paper, we found ourselves continually worrying that certain terms might cause our text to be deemed AI-generated. And when we mentioned this to colleagues and students, it became apparent that they had had similar concerns when writing their papers and essays. A new wave of paranoia appears to be sweeping across higher education as everyone becomes an amateur expert in AI detection.

During discussions about students’ and colleagues’ writing, it is now common to hear phrases such as “I can tell it is written by AI”. A recent conversation that gained traction on social media revolved around specific vocabulary deemed to be overly common in content generated by ChatGPT, such as “foster”, “delve”, “in the realm”, “endeavour”, “thrilled” and even “delighted”. Other adjectives identified as AI favourites include “commendable”, “innovative”, “meticulous”, “intricate”, “notable” and “versatile”.

However, we should be very careful before installing ourselves as judge and jury about whether our colleagues and students have really written the content attributed to them. One problem is that human AI sleuths – just like their human-programmed mechanical cousins – have limitations and biases, particularly with regard to non-native English. This could have the effect of making non-native speakers paranoid about engaging fully in academic discourse for fear that their work will be deemed AI-generated. This could stifle their creativity and hamper their development of the genuine voice that is an essential component of effective writing and critical thinking.

Second, the presumption that certain words or phrases are indicative of AI use excludes a wide range of expressive possibilities and overlooks the diversity of native English usage worldwide. For example, “delve” is commonly used in former British colonies such as Nigeria. Inevitably, that means that Nigerian students and writers are more likely to use it in their academic discourse than other English speakers are, making them disproportionately likely to be accused of using AI.

A third problem stems from the fact that, in reality, detecting AI-generated content isn’t nearly as simple as counting the frequency of certain telltale words and phrases. This isn’t how AI works. The likes of ChatGPT are trained on vast datasets derived from human writing across genres and contexts. Hence, AI models do not develop their own dialects; they simply regurgitate the language they have been fed. So if they use words such as “delve”, it is because words such as “delve” appear relatively frequently in the existing literature.

This is another reason why we are foolish to believe ourselves to be experts in AI detection. Those supposedly telltale words are all out in the wild anyway, and the list of them is likely to differ from “expert” to “expert”. From the writer’s perspective, that means the goalposts will constantly move in terms of what otherwise useful phrases they should avoid if they don’t want to risk their writing being flagged as AI-generated.

Then there is the issue of who feels entitled to judge whom. There is a whole world of power dynamics involved in determining who assumes the authority to detect and whose work is subject to detection. But however entitled someone feels to judge, the shortcomings of human judgement in this area mean that, in reality, no one has the epistemic authority to judge whether anyone else has used AI.

Policing without the requisite expertise can lead to false accusations against students and writers. Those who lack the ability to challenge such accusations might suffer disciplinary and reputational repercussions as a result. The paranoia such policing induces can also compel students and writers to stop using familiar terms to avoid accusations that their work is AI-generated even when those terms are the most expressive of what they want to convey.

Given the legitimate scholarly potential of generative AI, we should also be wary of creating an atmosphere of universal censure and suspicion that drives its use underground. While some journals do not permit any use of it, the more forward-looking ones (some of which are very prestigious) already permit the use of AI assistants for grammar and spelling checking, and even data analysis.

Academic integrity needs to be redefined in this era of AI and the purpose of education re-evaluated. Perhaps we should be focused on teaching students and staff to responsibly use generative AI, rather than trying to catch them out for using it regardless of the ethics of their intent.

Ultimately, scouring papers and essays for words dubiously associated with generative AI will not stop people using it. And if they are doing so transparently and responsibly, should we even be trying to stop them?

Lilian N. Schofield is a senior lecturer in non-profit management and Xue Zhou is reader in entrepreneurship and innovation, both at Queen Mary University of London.


Reader's comments (5)

This is exactly my paranoia in writing. I use words such as "thrilled" and now they are linked to AI-generated text.
Great article and thought-provoking. It really opens up the debate on why humans are now taking on the role of AI detector. We also need to think about the consequences of this for those whose work is under scrutiny.
Thank you for reinforcing my arguments. Your experiences and observations contribute significantly to the ongoing dialogue about the use of AI in academic writing. I think it is important that we continue to explore these issues collectively. This will ensure that AI tools are used to enhance, not hinder, our creative and intellectual efforts.
Great article. This shows we still have a long way to go with diversity and inclusion before making such accusations. We also have to start rethinking the role of AI and how this might totally change the way we teach and do research. Exciting days ahead and we have no choice as the world is changing.
"Counting 'telltale' words to police AI use is flawed and damaging because it perpetuates a reactive rather than a proactive approach to addressing potential harms of AI. Relying solely on specific keywords to gauge the ethical implications of AI systems oversimplifies complex ethical considerations and may lead to false positives or negatives. This approach also risks stifling innovation and creativity by instilling fear of censorship or punishment based on arbitrary word choices rather than encouraging thoughtful design and responsible implementation of AI technologies. Additionally, focusing on word counts can distract from more meaningful measures of accountability, such as transparency, oversight, and impact assessments, which are essential for fostering trust in AI systems and promoting their ethical use. Instead of fixating on words, efforts should be directed towards developing comprehensive frameworks that consider the broader societal, ethical, and legal implications of AI technologies." Just sharing answer generated by ChatGPT.