How do students respond to AI-generated questions?

Sponsored by VitalSource

The use of artificial intelligence (AI) tools in higher education has skyrocketed in the last year, but what does research tell us about how students are responding to the technology?

VitalSource’s team of learning scientists recently conducted the largest known empirical evaluation of AI-generated questions to date, drawing on nearly 1 million unique questions, more than 300,000 students and more than 7 million question attempts collected between January 2022 and May 2023. VitalSource, a leading education technology solutions provider, set out to examine how automatically generated (AG) questions perform at scale and to identify emerging patterns in student behaviour, laying the groundwork for the future of AG formative questions.

This paper, which was peer-reviewed and presented in 2023 at the 24th International Conference on Artificial Intelligence in Education, evaluated the performance of five types of AG questions to learn more about how students interact with different question types on tests and homework. The five question types – fill-in-the-blank, matching, multiple choice, free response and self-graded submit and compare – were incorporated into digital textbooks using VitalSource’s free, AI-powered learning tool, Bookshelf CoachMe.

Bookshelf CoachMe incorporates all five question types to give students varied ways to practise and process new content knowledge. This process, known as formative practice, provides students with immediate feedback and unlimited answer attempts, and has been shown to increase learning gains when incorporated into the primary learning material.[1][2]

Key findings

Key findings from VitalSource’s study:

1. The type of question is related to difficulty. Recognition-type questions are generally easier than recall-type questions.

2. Only about 12 per cent of students input “non-genuine” responses to fill-in-the-blank questions, and nearly half of those students persist in answering until they input the correct response.

3. In a classroom environment, the difficulty index for all question types increases compared with the aggregated data, and all persistence rates exceed 90 per cent, indicating that students behave differently when formative practice is incorporated into course expectations (see the illustrative sketch after this list).
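For readers who want a concrete sense of the metrics referenced above, the following is a minimal sketch in Python of how a difficulty index and a persistence rate could be computed from a log of question attempts. The sample records, the function names and the exact metric definitions are illustrative assumptions for this sketch, not the paper’s published methodology.

```python
from collections import defaultdict

# Hypothetical attempt log: (student_id, question_id, attempt_number, correct).
# Both the data and the metric definitions below are assumptions for illustration.
attempts = [
    ("s1", "q1", 1, False), ("s1", "q1", 2, True),
    ("s2", "q1", 1, True),
    ("s3", "q1", 1, False),  # this student never reaches a correct answer
]

def difficulty_index(attempts, question_id):
    """Proportion of first attempts answered correctly (classical test theory)."""
    firsts = [a for a in attempts if a[1] == question_id and a[2] == 1]
    return sum(a[3] for a in firsts) / len(firsts) if firsts else None

def persistence_rate(attempts, question_id):
    """Of students whose first attempt was incorrect, the share who kept
    answering until they eventually reached a correct response."""
    by_student = defaultdict(list)
    for student, question, attempt_no, correct in attempts:
        if question == question_id:
            by_student[student].append((attempt_no, correct))
    struggled, persisted = 0, 0
    for records in by_student.values():
        records.sort()
        if not records[0][1]:  # first attempt was incorrect
            struggled += 1
            persisted += any(correct for _, correct in records)
    return persisted / struggled if struggled else None

print(difficulty_index(attempts, "q1"))  # 1 of 3 first attempts correct -> ~0.33
print(persistence_rate(attempts, "q1"))  # 1 of 2 struggling students persisted -> 0.5
```

In this toy log, one of three first attempts on "q1" is correct (difficulty index of roughly 0.33), and one of the two students who started with an incorrect answer went on to answer correctly (persistence rate of 0.5).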

“This research helps set benchmarks for performance metrics of automatically generated questions using student data and provides valuable insight into student learning behaviour. These analyses help us to continually improve our learning tools for students.”
Rachel Van Campenhout, senior research scientist at VitalSource and author of the paper

VitalSource’s award-winning, AI-powered study coach, Bookshelf CoachMe, launched in 2021. Based on the proven learning science principle known as the doer effect, its AI-generated questions help students study more effectively, build confidence and deepen subject matter expertise. Its content improvement service monitors and evaluates response data and automatically replaces underperforming questions – the first system of its kind to continually improve formative practice in real time.