A key historical tenet for large-scale educational testing was that tests had to be standardized so that they were the same for everyone. This arose from scientific principles that suggested that by making the test identical for all, differences in performance between test takers resulted from genuine individual differences (e.g. different knowledge and skill) rather than testing variance and error.
Large-scale educational tests are critical for measuring the progress of learning, providing data on how schools and learning are progressing across different regions. Post-Covid concern about learning loss has added urgency to this. But advances in technology have made it possible to get the same (or better) kind of data from tests without the unpopular, enforced standardization that was previously required.
Here are some of the technology trends that I believe are driving this change.
Adaptive testing adjusts question difficulty according to an individual learner’s ability or competence. As learners answer questions correctly, they are given more difficult questions; as they answer incorrectly, they are given easier ones. The goal is to keep learners challenged and engaged.
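The adjustment loop described above can be sketched in a few lines. This is a deliberately minimal illustration, not how any real adaptive engine (which would typically use item response theory) works: the question bank, difficulty scale, and step-up/step-down rule are all assumptions made for the example.

```python
import random

# Hypothetical question bank: difficulty levels 1 (easiest) to 5 (hardest),
# with several question IDs at each level.
BANK = {level: [f"q{level}-{i}" for i in range(10)] for level in range(1, 6)}

def next_question(level):
    """Pick a random question at the current difficulty level."""
    return random.choice(BANK[level])

def adjust_level(level, answered_correctly):
    """Step difficulty up after a correct answer, down after an incorrect one."""
    if answered_correctly:
        return min(level + 1, 5)
    return max(level - 1, 1)

# Simulate a short test for a learner who answers three questions
# correctly and then misses one.
level = 3
for correct in [True, True, True, False]:
    question = next_question(level)
    level = adjust_level(level, correct)

print(level)  # 4: difficulty rose to the ceiling of 5, then stepped back down
```

Even this crude staircase rule shows the key property: the sequence of questions each learner sees depends on their own answers, so strong and struggling learners both spend most of the test near the edge of their ability.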
Adaptive testing has always been possible, but it’s now more sophisticated and easier to implement. Adaptive tests require less learner time to measure capability and can usually measure a wider range of abilities. For example, in Australia, the PAT Adaptive test can be used to test maths and reading abilities and can give a fair assessment for children at different stages and levels in both.
Children in the same year of school are often at very different points in their learning. Adaptive tests are essentially personalized for each child, and can help identify and respond to individual learning needs in ways that traditional group-based assessments cannot.
If everyone gets the same questions, they can (and often do) leak—especially if tests are available at different times or on demand. Learners will share questions and answers with peers, criminals will harvest questions and sell them, and some teachers may stretch boundaries to help their learners better prepare for tests.
Technology can circumvent such risks by randomizing tests. Question or choice order can be shuffled; questions can be chosen at random from test banks; or it’s possible to create dynamic questions that change for every test taker.
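The first two randomization techniques above (shuffling choice order and sampling from a bank) can be sketched as follows. The bank contents and field names are illustrative, not drawn from any real assessment platform; a production system would also need to equate the difficulty of the resulting forms.

```python
import random

def randomized_form(bank, num_questions, seed=None):
    """Build a unique test form: sample questions from the bank and
    shuffle the answer choices within each question."""
    rng = random.Random(seed)
    questions = rng.sample(bank, num_questions)  # random subset, no repeats
    form = []
    for q in questions:
        choices = q["choices"][:]
        rng.shuffle(choices)  # choice order differs per test taker
        form.append({"stem": q["stem"], "choices": choices, "answer": q["answer"]})
    return form

# Illustrative bank of four items; a real bank would hold hundreds.
bank = [
    {"stem": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": "4"},
    {"stem": "3 x 3 = ?", "choices": ["6", "9", "12"], "answer": "9"},
    {"stem": "10 / 2 = ?", "choices": ["4", "5", "6"], "answer": "5"},
    {"stem": "7 - 4 = ?", "choices": ["2", "3", "4"], "answer": "3"},
]

# Two test takers get differently seeded, and hence different, forms.
form_a = randomized_form(bank, 2, seed=1)
form_b = randomized_form(bank, 2, seed=2)
print(len(form_a), len(form_b))  # 2 2
```

Because each form is drawn independently, a leaked copy of one learner’s test reveals only a small, shuffled slice of the bank.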
Automated item generation, increasingly powered by AI, can create large numbers of items, which makes it much easier to build a large test bank to select from. Meanwhile, automated psychometrics and reporting make it practical to draw fair, meaningful conclusions from the results of randomized tests.
Because randomizing test content reduces test fraud and cheating, it also makes test results more valid and useful. With the huge explosion in generative AI, which makes it easier to create a wider set of questions, we can expect to see a lot more tests populated with randomized content.
Performance testing is when learners are measured doing a practical task.
Many professional IT certification exams have test takers perform a practical task on a virtual machine. This is also becoming increasingly practical (and cost effective) in education.
For example, a student could create or modify a spreadsheet or show their competence at some other digital task entirely online. Algorithms can be used to reliably score such performance tasks.
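As a toy illustration of how such algorithmic scoring can work, the sketch below scores a spreadsheet-style task by checking a rule rather than a fixed answer. The task, field names, and scoring rule are all invented for this example; real scoring engines are far more elaborate.

```python
# Hypothetical task: the learner must put the sum of a data column into
# a "total" cell. We score the rule, not one fixed number, so the check
# still works whatever data the learner was given (or entered).

def score_total_task(submission):
    """Award 1 point if the 'total' cell equals the sum of the data column,
    regardless of how the learner computed it; otherwise 0."""
    return 1 if submission["total"] == sum(submission["column_a"]) else 0

correct_work = {"column_a": [10, 20, 30], "total": 60}
flawed_work = {"column_a": [1, 2], "total": 4}
print(score_total_task(correct_work), score_total_task(flawed_work))  # 1 0
```

Scoring against a rule like this means dynamically generated variants of the task can all be marked by the same code, which pairs naturally with the randomized content discussed earlier.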
Another kind of performance test is when a student is observed doing a practical task. For example, a learner could be observed giving a presentation to peers, a group of learners could be observed and assessed doing a collaborative task, or a learner could be observed carrying out a science experiment.
Assessment can either be by a teacher/instructor or sometimes by a peer—in both cases it’s common to use a tablet or other mobile device to fill in a checklist. The more that standard rubrics can be used, the more reliable such assessments become.
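A checklist of this kind is easy to represent in software. The sketch below assumes a simple observation rubric with point-capped criteria; the criteria themselves are made up for illustration.

```python
# A minimal observation rubric, as might back a tablet checklist.
# Criteria and point values are illustrative only.
PRESENTATION_RUBRIC = [
    {"criterion": "Spoke clearly at an audible volume", "max_points": 2},
    {"criterion": "Structured the talk: beginning, middle, end", "max_points": 2},
    {"criterion": "Responded to questions from peers", "max_points": 1},
]

def score_rubric(rubric, awarded):
    """Sum the observer's awarded points, clamped to each criterion's maximum,
    so the same rubric yields comparable totals across different observers."""
    return sum(min(points, item["max_points"])
               for item, points in zip(rubric, awarded))

print(score_rubric(PRESENTATION_RUBRIC, [2, 1, 1]))  # 4 out of a possible 5
```

Fixing the criteria and their maximum points in data, rather than leaving them to each observer’s judgment, is precisely what makes such assessments more reliable across classrooms.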
The rapid increase in virtualization capability is making performance testing more practical, while the widespread use of mobile devices and the desire to make assessments more inclusive and valid are increasing interest in this kind of assessment.
To test all learners fairly, tests need to be more inclusive. Technology offers a significant upgrade on paper-based testing here with far stronger accessibility support.
Neurodivergent learners may also prefer online exams, as Laura McConnell, a campaigner in this space, suggests in the TES: “Switching exams fully online would be a fantastic shift towards a more neurodiversity-friendly system.”
There are other inclusivity issues where technology can help. For example, a significant number of learners in many countries may not be native speakers of the language the test is given in. Technology is increasingly providing solutions to this: computerized translation can be used to translate question wording and choices for the learner, either in advance or while the test is being taken. There are also emerging ways in which learners can answer tests in more than one language (sometimes referred to as “translanguaging”).
Device proliferation is not a new trend in itself, but the growing use of devices in educational settings has made it increasingly practical to give tests to students in schools in more countries worldwide. Additionally, edtech and assessment software increasingly works on less expensive devices like Chromebooks and tablets.
With paper testing and some older digital systems, it could take weeks or months to score assessments and get results back to learners and teachers. That works for measuring learning across schools or systems but means there is little direct feedback to learners or educators; by the time scores and feedback arrive, they are often no longer relevant.
Digital assessment allows for faster, often immediate scoring. This rapid feedback to learners and their teachers can inform and improve teacher or learner strategies—so not only do the tests help inform education leaders and bureaucrats, they also directly benefit those who count most: learners.
Slightly paradoxically, standardized tests are becoming less standardized while still providing good measurement. Psychometricians used to require every learner to take a test in exactly the same way to ensure a level playing field. But academics like Professor Steve Sireci of the University of Massachusetts Amherst encourage more flexibility in testing, based on understanding learner diversity and working with them to measure their learning rather than to enforce standardization.
Other ideas are likewise gaining currency, such as allowing learners to use their own devices to take tests, removing time limits, allowing any learner to take a test in a separate room if they want, or even allowing learners to choose which questions to answer. These are all appropriate ways of dealing with diversity. The ideas derive from pedagogy but often need technology to implement.
In some countries, there has been a feeling of too much testing. In other countries, there is a huge appetite for better information to help improve the quality of schools and education. In all countries, the better we can measure learning, the more effectively we can improve it. And technology can support this goal in a growing number of ways.