Scoring a Computer Adaptive Test
by Chanan Braunstein, Owner
What is a Computer Adaptive Test (CAT)?
Computer adaptive testing (CAT) is a form of computer-based test that adapts to the examinee's ability level. For this reason, it has also been called tailored testing. CAT uses a statistical model, called item response theory (IRT), to estimate the examinee's ability based on their responses to the test items. CAT also uses an algorithm to select the optimal item for the examinee at each stage of the test, based on the current ability estimate and the difficulty and information of the items. CAT aims to provide a valid and reliable measurement of the examinee's ability with fewer items and shorter testing time than a traditional fixed-length test.
Some of the benefits of CAT include:
- Reduces the number of "easy" items that high-ability candidates receive; "easy" items tell little about a high-performing candidate's ability,
- Reduces the number of "difficult" items that low-ability candidates receive; candidates tend to guess on items that are too difficult, which can skew results,
- Reduces item exposure and subsequent security risks.
Some of the challenges of CAT include:
- Requires a large and high-quality item pool that covers a wide range of difficulty levels and content domains,
- Requires sophisticated software and hardware systems to administer and score the test,
- Requires careful calibration and validation of the items and the IRT model parameters,
- Requires clear communication and explanation of the test design and scoring method to the stakeholders, such as test takers, educators, and employers.
Calculating the Probability of Answering an Item
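We calculate the probability that an examinee answers an item correctly using the Rasch model, in which that probability depends only on the difference between the examinee's ability and the item's difficulty.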
// Define a function to calculate the probability of answering an item correctly using the Rasch model
function raschModel(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  return exp / (1 + exp) // Calculate the probability term
}
For example, suppose there are three items in the item pool, with difficulty levels of -1, 0, and 1, respectively. Assume that the examinee's ability estimate is 0.5 and that the IRT model is the Rasch (one-parameter logistic) model. Then, using the Rasch function above, we can calculate the probability of answering each item correctly as follows:
- Item 1: P_1(0.5) = e^(0.5 - (-1)) / (1 + e^(0.5 - (-1))) ≈ 0.82
- Item 2: P_2(0.5) = e^(0.5 - 0) / (1 + e^(0.5 - 0)) ≈ 0.62
- Item 3: P_3(0.5) = e^(0.5 - 1) / (1 + e^(0.5 - 1)) ≈ 0.38
This means that an examinee with an ability of 0.5 has an 82% probability of correctly answering an item with a difficulty of -1.
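The three probabilities in this example can be reproduced with a few lines of JavaScript. This is a minimal sketch: `raschProbability` is a name chosen here for illustration, and it uses an algebraically equivalent form of the Rasch function shown earlier.

```javascript
// Rasch (one-parameter logistic) probability of a correct response.
// Hypothetical helper name; equivalent to exp(x) / (1 + exp(x)) with x = ability - difficulty.
function raschProbability(ability, difficulty) {
  return 1 / (1 + Math.exp(-(ability - difficulty)))
}

const ability = 0.5 // Ability estimate from the worked example
const difficulties = [-1, 0, 1] // Difficulties of Items 1, 2 and 3

// Probability of a correct answer for each item at this ability
const probabilities = difficulties.map((b) => raschProbability(ability, b))

probabilities.forEach((p, i) => {
  console.log(`Item ${i + 1}: P = ${p.toFixed(2)}`)
})
// Prints 0.82, 0.62 and 0.38 for Items 1, 2 and 3
```

Note how the probability falls as the item difficulty rises past the examinee's ability: an item whose difficulty equals the ability would give exactly 0.5.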
Maximum Information Method
The maximum information method is a way of selecting the best item to present to an examinee in a computerized adaptive test. The idea is to choose the item that provides the most information about the examinee's ability level, based on the item response theory (IRT) model. The information value of an item is calculated as the product of the probability of answering the item correctly and its complement, which depends on the ability estimate and the item difficulty. The higher the information value, the more precise the ability estimate will be after answering the item. The maximum information method selects the item that maximizes this information value among the available items in the item pool.
// Define a function to calculate the information value using the Fisher information method
function getInfo(ability, difficulty) {
  let prob = raschModel(ability, difficulty); // Calculate the probability term
  let info = prob * (1 - prob); // Calculate the information value
  return info;
}
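Item information also determines how precise the ability estimate is: information adds up across the administered items, and the standard error of measurement (SEM) is approximately the reciprocal square root of that total. Because a Rasch item never contributes more than 0.25, this gives a quick lower bound on test length for a target SEM. The sketch below illustrates both ideas; `itemInfo`, `standardError`, and `minItemsForSem` are hypothetical helper names, not part of the program later in this article.

```javascript
// Rasch item information: I = P * (1 - P)
function itemInfo(ability, difficulty) {
  const p = 1 / (1 + Math.exp(-(ability - difficulty)))
  return p * (1 - p)
}

// SEM is approximately 1 / sqrt(total information of the administered items)
function standardError(ability, difficulties) {
  const total = difficulties.reduce((sum, b) => sum + itemInfo(ability, b), 0)
  return 1 / Math.sqrt(total)
}

// Each item contributes at most 0.25, so at least 4 / sem^2 items are needed
function minItemsForSem(targetSem) {
  return Math.ceil(4 / (targetSem * targetSem))
}

// SEM after three items with difficulties -1, 0 and 1 at ability 0.5
console.log(standardError(0.5, [-1, 0, 1]).toFixed(2))

// Lower bound on test length for an SEM target of 0.3
console.log(minItemsForSem(0.3))
```

Under these assumptions, an SEM target of 0.3 needs at least 45 perfectly targeted items, while a target of 0.01 would need on the order of 40,000, so a very strict target is rarely what ends a test when the number of questions is also capped.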
Continuing the example from above, we can calculate the information value of each item as follows:
- Item 1: I_1(0.5) = P_1(0.5)(1 - P_1(0.5)) ≈ 0.15
- Item 2: I_2(0.5) = P_2(0.5)(1 - P_2(0.5)) ≈ 0.24
- Item 3: I_3(0.5) = P_3(0.5)(1 - P_3(0.5)) ≈ 0.24
The maximum information method would select Item 2 or Item 3 as the next item to present to the examinee. Their difficulties are both 0.5 away from the ability estimate, so they tie for the highest information value among the three items.
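The selection step for this three-item example can be sketched as follows. The names `itemInformation`, `examplePool`, and `pickMostInformative` are illustrative assumptions. Because Items 2 and 3 tie mathematically, which of the two is returned depends on the tie-breaking rule and on floating-point rounding, so the sketch does not rely on a particular winner.

```javascript
// Rasch item information: I = P * (1 - P)
function itemInformation(ability, difficulty) {
  const p = 1 / (1 + Math.exp(-(ability - difficulty)))
  return p * (1 - p)
}

// Illustrative pool matching the worked example
const examplePool = [
  { id: 1, difficulty: -1 },
  { id: 2, difficulty: 0 },
  { id: 3, difficulty: 1 },
]

// Return the item with the highest information at the given ability.
// On an exact tie the earlier item in the pool is kept.
function pickMostInformative(ability, pool) {
  let best = null
  let bestInfo = -Infinity
  for (const item of pool) {
    const info = itemInformation(ability, item.difficulty)
    if (info > bestInfo) {
      best = item
      bestInfo = info
    }
  }
  return { item: best, info: bestInfo }
}

const choice = pickMostInformative(0.5, examplePool)
console.log(`Selected item ${choice.item.id} with information ${choice.info.toFixed(3)}`)
```

For the Rasch model, maximizing information is the same as choosing the item whose difficulty is closest to the current ability estimate, which is why Item 1 (1.5 away) loses to Items 2 and 3 (0.5 away).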
Example JavaScript Program to Score CAT
The full program is below:
// Define some constants for the test parameters
const NUM_ITEMS = 100000 // The number of items in the item pool
const MIN_DIFFICULTY = -3 // The minimum difficulty level of the items
const MAX_DIFFICULTY = 3 // The maximum difficulty level of the items
const MIN_ABILITY = -4 // The minimum ability level of the examinees
const MAX_ABILITY = 4 // The maximum ability level of the examinees
// With at most 200 questions the smallest reachable standard error is about 0.14, so the question cap ends the test before this criterion does
const TERMINATION_CRITERION = 0.01 // Stop once the standard error of measurement falls below this value
const MAX_TEST_QUESTIONS = 200 // The maximum number of items presented to one examinee
// Define a function to generate random numbers from a normal distribution
function randn_bm() {
  let u = 0,
    v = 0
  while (u === 0) u = Math.random() //Converting [0,1) to (0,1)
  while (v === 0) v = Math.random()
  return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * v)
}
// Define a function to generate random numbers from a uniform distribution
function randu(a, b) {
  return a + Math.random() * (b - a)
}
// Define a function to generate an item pool with difficulty levels
function generateItemPool(n, min, max) {
  let itemPool = []
  for (let i = 0; i < n; i++) {
    let difficulty = randu(min, max) // Generate a random difficulty level from a uniform distribution
    itemPool.push({ id: i + 1, difficulty: difficulty })
  }
  return itemPool
}
// Define a function to calculate the probability of answering an item correctly using the Rasch model
function raschModel(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  return exp / (1 + exp) // Calculate the probability term
}
// Define a function to simulate answering an item based on the probability of correctness
function answerItem(prob) {
  let rand = Math.random() // Generate a random number from [0, 1)
  if (rand < prob) {
    return true // Answer correctly with probability prob
  } else {
    return false // Answer incorrectly with probability 1 - prob
  }
}
// Define a function to update the ability estimate with one Newton-Raphson step toward the maximum likelihood estimate
function updateAbility(ability, responses, itemPool) {
  let sumNum = 0 // Initialize the numerator of the summation term
  let sumDen = 0 // Initialize the denominator of the summation term
  for (let i = 0; i < responses.length; i++) {
    let item = itemPool[responses[i].itemId - 1] // Get the item object from the item pool by id
    let prob = raschModel(ability, item.difficulty) // Calculate the probability term
    sumNum += (responses[i].correct ? 1 : 0) - prob // Add the difference between the scored response (1 or 0) and the probability
    sumDen += prob * (1 - prob) // Update the denominator by adding the product of probability and its complement
  }
  let newAbility = ability + sumNum / sumDen // Add the quotient of the summation terms to the old ability estimate
  // Keep the estimate inside the allowed ability range; without a bound it can run away when every response so far is correct (or incorrect)
  return Math.min(MAX_ABILITY, Math.max(MIN_ABILITY, newAbility))
}
// Define a function to calculate the standard error of measurement using the Fisher information method
function calculateSEM(ability, responses, itemPool) {
  let sum = 0 // Initialize the summation term
  for (let i = 0; i < responses.length; i++) {
    let item = itemPool[responses[i].itemId - 1] // Get the item object from the item pool by id
    let exp = Math.exp(ability - item.difficulty) // Calculate the exponent term
    let prob = exp / (1 + exp) // Calculate the probability term
    sum += prob * (1 - prob) // Update the summation by adding the product of probability and its complement
  }
  let sem = Math.sqrt(1 / sum) // Calculate the standard error of measurement by taking the square root of the inverse of the summation
  return sem
}
// Define a function to calculate the information value using the Fisher information method
function getInfo(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  let prob = exp / (1 + exp) // Calculate the probability term
  let info = prob * (1 - prob) // Calculate the information value
  return info
}
// Define a function to select the next item using the maximum information method
function selectItem(ability, responses, itemPool) {
  let maxInfo = 0 // Initialize the maximum information value
  let nextItem = null // Initialize the next item object
  for (let i = 0; i < itemPool.length; i++) {
    let item = itemPool[i] // Get the item object from the item pool
    let info = getInfo(ability, item.difficulty) // Calculate the information value
    if (info > maxInfo) {
      // If the information value is greater than the current maximum information value
      let answered = false // Initialize a flag to indicate if the item has been answered
      for (let j = 0; j < responses.length; j++) {
        if (responses[j].itemId === item.id) {
          // If the item id matches any of the response item ids
          answered = true // Set the flag to true
          break // Break out of the loop
        }
      }
      if (!answered) {
        // If the flag is false, meaning the item has not been answered
        maxInfo = info // Update the maximum information value
        nextItem = item // Update the next item object
      }
    }
  }
  return nextItem
}
// Define a function to check if the termination criterion is met
function checkTermination(sem, criterion) {
  if (sem < criterion) {
    // If the standard error of measurement is less than the criterion value
    return true // Return true to indicate that the test should be terminated
  } else {
    return false // Return false to indicate that the test should continue
  }
}
// Define a function to score a computer adaptive test given an examinee and an item pool
function scoreCAT(examinee, itemPool) {
  let ability = examinee.ability // Initialize the ability estimate with the examinee's ability level
  let responses = [] // Initialize an empty array to store the responses
  let terminated = false // Initialize a flag to indicate if the test is terminated
  // Loop through each item index
  for (let i = 0; i < NUM_ITEMS; i++) {
    if (i >= MAX_TEST_QUESTIONS) {
      // Break out of the loop if the maximum number of questions is reached
      terminated = true
      break
    }
    let item = selectItem(ability, responses, itemPool) // Select the next item using the maximum information method
    if (item === null) {
      // If no item is selected, meaning all items have been answered or there is no suitable item in the pool
      terminated = true // Set the flag to true
      break // Break out of the loop
    }
    let prob = raschModel(ability, item.difficulty) // Calculate the probability of answering the item correctly using the Rasch model
    let correct = answerItem(prob) // Simulate answering the item based on the probability of correctness
    responses.push({ itemId: item.id, correct: correct }) // Add the response object to the array of responses
    ability = updateAbility(ability, responses, itemPool) // Update the ability estimate using the maximum likelihood method
    let sem = calculateSEM(ability, responses, itemPool) // Calculate the standard error of measurement using the Fisher information method
    terminated = checkTermination(sem, TERMINATION_CRITERION) // Check if the termination criterion is met
    if (terminated) {
      // If the flag is true
      break // Break out of the loop
    }
  }
  return { ability: ability, responses: responses, terminated: terminated } // Return an object containing the final ability estimate, responses, and termination status
}
// Generate an item pool with difficulty levels from a uniform distribution between -3 and 3
let itemPool = generateItemPool(NUM_ITEMS, MIN_DIFFICULTY, MAX_DIFFICULTY)
let examinee = { ability: 0 } // Start with a mid point ability level
// Score a computer adaptive test given an examinee and an item pool
let score = scoreCAT(examinee, itemPool)
console.log(
  "Examinee's estimated new ability level: " + score.ability.toFixed(2),
)
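In the program above, each simulated response is drawn using the current ability estimate. When testing a CAT engine it can also be useful to simulate an examinee whose true ability is fixed and hidden from the algorithm, so you can see whether the estimate converges toward it. The following is one possible sketch of that variation, not a drop-in replacement: `simulateCat`, its option names, the SEM target of 0.3, and the 60-item cap are all assumptions made for illustration.

```javascript
// Rasch probability of a correct response
const rasch = (theta, b) => 1 / (1 + Math.exp(-(theta - b)))

// Total Fisher information of the administered items at a given ability
function totalInformation(theta, administered) {
  return administered.reduce((sum, r) => {
    const p = rasch(theta, r.difficulty)
    return sum + p * (1 - p)
  }, 0)
}

// Simulate one adaptive test for an examinee with a hidden true ability.
// All option names and defaults are illustrative assumptions.
function simulateCat(trueAbility, difficulties, options = {}) {
  const { startAbility = 0, semTarget = 0.3, maxItems = 60, rng = Math.random } = options
  const remaining = difficulties.slice() // Difficulties of items not yet used
  const administered = [] // { difficulty, correct } for each item given
  let estimate = startAbility
  let sem = Infinity

  while (administered.length < maxItems && remaining.length > 0 && sem > semTarget) {
    // For the Rasch model, maximum information means the closest difficulty
    let bestIdx = 0
    for (let i = 1; i < remaining.length; i++) {
      if (Math.abs(remaining[i] - estimate) < Math.abs(remaining[bestIdx] - estimate)) {
        bestIdx = i
      }
    }
    const [difficulty] = remaining.splice(bestIdx, 1)

    // The response depends on the TRUE ability, not on the estimate
    const correct = rng() < rasch(trueAbility, difficulty) ? 1 : 0
    administered.push({ difficulty, correct })

    // One Newton-Raphson step, clamped to [-4, 4] to avoid runaway estimates
    const score = administered.reduce(
      (sum, r) => sum + r.correct - rasch(estimate, r.difficulty),
      0,
    )
    const step = score / totalInformation(estimate, administered)
    estimate = Math.min(4, Math.max(-4, estimate + step))

    // Standard error at the updated estimate
    sem = 1 / Math.sqrt(totalInformation(estimate, administered))
  }
  return { estimate, sem, itemsUsed: administered.length }
}

// Usage: 500 items with difficulties uniform in [-3, 3], true ability 1.0
const pool = Array.from({ length: 500 }, () => -3 + 6 * Math.random())
const result = simulateCat(1.0, pool)
console.log(
  `Estimate ${result.estimate.toFixed(2)}, SEM ${result.sem.toFixed(2)}, items ${result.itemsUsed}`,
)
```

Because the responses are random, the estimate differs from run to run; repeating the simulation across many true abilities is a common way to check how closely and how quickly the estimates track them.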
Conclusion
The above code can be used to score a Computer Adaptive Test. Contact us to find out how BK Software Development can help you implement such an algorithm at your TechEd company.