AI and assessment: The industry insider’s view

“We need to embrace the uncertainty, experiment with what AI can do for us, and ensure that we as humans drive the technology.” 

—John Kleeman, EVP Learnosity and Questionmark

For almost a year now, the talk of the town has been AI and its impact on humans. From people clamoring to know if it heralds the end of creativity as we know it, to LinkedIn users speculating as to how the professional landscape might shift in the coming years, it’s hard to miss the story of AI as it bounces around the media with ever more sensationalist headlines.

Today, we want to tackle one aspect of AI that has also come under constant scrutiny: its role in the learning and assessment world. To gain some real insight, we sat down with Learnosity EVP John Kleeman, who has a long history in developing assessment software and standards, to ask him about his thoughts on AI and why there’s cause for caution and excitement in equal measure.

For those who don’t know, can you tell us a little about your background and experience in the assessment industry to date?

I studied computer science and mathematics at Trinity College, Cambridge, and after a few years in the software industry, I wrote the original version of the Questionmark software—literally—in my spare bedroom. I then launched the Questionmark company to further develop and market it in 1988. Questionmark was one of the pioneers in computerized assessments. We produced the world’s first web-based assessment software in the mid-1990s and were one of the companies who made online assessments widely used around the world.

Today, I serve as EVP at Learnosity, following its acquisition of Questionmark in 2021. Over the years, I’ve also been involved in many standards projects, including being on the team that first developed IMS Question and Test Interoperability and working on standards with the Association of Test Publishers (ATP), the British Standards Institution (BSI), the International Organization for Standardization (ISO), and many others. I was also Chair of the ATP in 2021 and remain an ATP Director.

Computers and digital technologies have changed the landscape of learning and added new capabilities to assessment. Do you think AI will have a similar impact?

One of the huge things that digital technologies have brought to learning and assessment is the ability to offer higher quality learning and assessment to a very wide audience. Not just to privileged people in wealthy countries but to almost everyone in almost every country. The internet has been a huge force for disintermediation and for leveling up on a global basis. It is literally becoming the case that no one is too small or too remote to make a difference. 

I hope and think AI will democratize and widen learning significantly more. People will find ways of using AI to learn skills faster, deeper and better; and it will be possible for everyone to find their own learning path. The end result, I believe, will be a smarter, better connected humanity that at least has the potential to make a better world.   

Our vision at Learnosity is that education is a human right. AI has the potential to make this much more real, by giving everyone on the planet access to better education than any of us have had before.

When large language models (LLMs) like ChatGPT entered the public consciousness, what were your initial thoughts on it?

I’m a keen science fiction reader and over the decades this has very much influenced my desire to keep up with the development of AI. 

From a career standpoint, I’ve also been involved in regulatory work within the ATP trying to ensure that government bodies like the EU treat AI in relation to assessment practices fairly and sensibly in new laws and regulations.

Despite this level of interest and interaction with AI, I was still taken by surprise by the advance in LLMs, especially ChatGPT and DALL·E.

Much of the AI that gets deployed improves gradually. For example, we’ve been used to email spam filters for ages, and AI has steadily made them better. Similarly, music platforms have long been able to recommend similar artists and songs, but Spotify’s AI has taken this to the next level with entirely tailored playlists and the ability to surface agreeable music that a user would not ordinarily find organically.

Of course, the same was true with language models, but I’d not been tracking them. And I was stunned by the quality of their output. I actually got ChatGPT to write a poem about the Association of Test Publishers. It’s not going to challenge Shakespeare, but it still amazed me. The way that generative AI can create reasonably well-written text on so many subjects so quickly is a dramatic change, and potentially very empowering.

A “poem” about the ATP, generated by ChatGPT
How might AI help foster diversity, equity, and inclusion (DEI) within the field of assessments?

Well, that’s a good question, because AI is a bit of a double-edged sword: it’s both useful and challenging for DEI.

The way a lot of AI systems work is by being trained on existing material, for example, on the internet. Given that there are biases in existing material, there is a risk that AI will ‘learn’ these biases and thus be less inclusive. In assessment terms, questions created with AI could then have biases in them and any resulting analysis or scoring done by AI would be influenced by the biased training data.

For example, I just asked Google Bard: 

“A doctor and a nurse eat at a restaurant. She paid because she is more senior. Who paid?”

Bard’s reply began:

“The nurse paid. The word ‘she’ in the sentence refers to the nurse, who is more senior than the doctor. Therefore, the nurse paid for the meal.”

We know that nurses can be any gender, so Bard’s answer may not be correct as it’s based on the outdated assumption that the profession of nursing is limited to a specific gender. 

Careful human review, along with improvements in how AI is trained, is needed to improve DEI when AI is used in assessments. However, there is a lot of work going on in this space and it’s likely that, in time, AI will be a helpful tool for supporting DEI.

AI is undeniably a disruptive technology. Do you think the rewards it offers outweigh any potential challenges when it comes to improving education?

We need to make sure that we work through the very real challenges of AI in order to get the benefits. With AI-enabled tech still in its infancy, we can’t even identify all the potential challenges yet. Bias and test fraud are just two of the known challenges commonly discussed, and it is certain that other pitfalls will emerge as the technology progresses, so it’s our duty to tread carefully.

If we aren’t mindful of the challenges and don’t make efforts to put safeguards in place (like human reviewers), then we could end up with clever technology that doesn’t meet real-world needs.

In a nutshell, technology needs to support human needs, not be the driver of them. 

AI is providing a new backdrop to the topic of test fraud. How concerned do you think people should be?

The use of generative AI like ChatGPT is certainly another kind of test fraud threat. 

For example, people taking non-proctored exams may use generative AI to answer many kinds of questions. The threat is most severe with essays, especially take-home essays and questions which contain material that the models have been trained on. But the threat is also present with many kinds of tests and exams.

To put it in context, prior to generative AI, there were already lots of threats to test security. For example, content theft, whereby exam content is harvested or stolen and shared with people to help them take exams, is a common challenge. Proxy testing, whereby someone takes an exam for someone else, is unfortunately also fairly common. In terms of test fraud in an AI-dominated world, I think part of the challenge is in how we can deter, prevent or detect AI-assisted test fraud, and part is in asking how we should adapt assessments to take account of this new technological advancement.

Efforts should be concentrated on identifying the construct you are trying to measure with the assessment and on designing questions that measure that construct but are difficult to answer directly with generative AI. For example, using observational or performance assessments, or asking to see people’s workings over time, can both make it much harder to cheat with AI.

Alternatively, allowing test takers to use generative AI and getting them to critique or improve its output would be another way of adapting to the technology without compromising the reliability of assessment results. The ATP Security Summit in August addresses this issue and has Neil McGough, Learnosity’s Chief Product Officer, as one of the speakers.

Learnosity has recently launched its AI-assisted authoring tool. How do you think this will change how content authors work?

“It’s very early days, but I think there is a huge promise of AI-assisted authoring making test development much more efficient and making question authors much more productive.”

—John Kleeman, EVP Learnosity and Questionmark

It’s very early days, but I think there is huge promise of AI-assisted authoring making test development much more efficient and making content authors more productive. 

With tools like Author Aide, for example, test authors will have the ability to build item banks (that is, banks of questions that can be organized by topic and used within various assessment formats) much more quickly. The knock-on benefit of increasing the size of item banks in tests and exams, of course, is that it also has the potential of improving security by making it much harder to steal content and pass it to other test takers. 

Beyond just authoring, tools like Author Aide also have the potential to improve feedback to test takers and pave the way for a more customized learning experience that pivots and grows with the learner. Lastly, generative AI has the potential benefit of helping identify possible issues or biases with questions, but as mentioned before, this will only be possible with caution and human monitoring.

What gives you confidence that Learnosity is ethically well-placed to bring AI into the assessment industry?

The Learnosity values say that fairness is the bedrock of our business and that we should do the right thing, be open and honest, and take full responsibility.

We have a good track record in ethical behavior and our founders and leaders are keen that we should be ethical—after all, good values need to start at the top.

Over the years, Learnosity has been an industry leader in making learning and assessment software more accessible and inclusive globally. We have also been privacy leaders, with a very strong approach to pseudonymity in holding learner data. Our internal counsel and I are heavily involved with the ATP International Privacy Subcommittee, writing and speaking about privacy issues in assessment.

So, yes, I think we are well-placed to launch AI-assisted tools ethically, but it’s down to us to make sure that we continue working ethically and provide not just the tools but also the right guidance to those who use our software. Our responsibility doesn’t just end with software development. We believe it’s just as important to inform our customers about best practice so that they can successfully apply tools to their learning programs with ethics in mind.

Over the longer term, in what ways do you anticipate that AI will shape the future of assessment?

The obvious answer is greater personalization—that each person will get learning and assessment that’s suited to them, which will help them learn faster and better.

Another answer is that AI is good at classifying things into different categories. I think in time, AI will become very good at classifying people’s skills into competent and not competent, so it’s very possible that AI will in time replace traditional assessment. There are many challenges here, not least in the trustworthiness of any AI decisions. But in time I think it will be a well-managed practice that brings an overall benefit to the people it serves.

In the short term, we need to embrace the uncertainty, experiment with what AI can do for us, and ensure that we as humans drive the technology. Then let’s see where we can take it.

Nakita Mason

Content Marketing Manager