Scoring a Computer Adaptive Test

by Chanan Braunstein, Owner

What is a Computer Adaptive Test (CAT)?

Computer adaptive testing (CAT) is a form of computer-based test that adapts to the examinee's ability level. For this reason, it has also been called tailored testing. CAT uses a statistical model, called item response theory (IRT), to estimate the examinee's ability based on their responses to the test items. CAT also uses an algorithm to select the optimal item for the examinee at each stage of the test, based on the current ability estimate and the difficulty and information of the items. CAT aims to provide a valid and reliable measurement of the examinee's ability with fewer items and shorter testing time than a traditional fixed-length test.

Some of the benefits of CAT include:

  • Reduces the number of "easy" items that high-ability candidates receive; "easy" items tell little about a high performing candidate's ability,
  • Reduces the number of "difficult" items low-ability candidates receive; candidates tend to guess on items that are too difficult which can skew results,
  • Reduces item exposure and subsequent security risks.

Some of the challenges of CAT include:

  • Requires a large and high-quality item pool that covers a wide range of difficulty levels and content domains,
  • Requires sophisticated software and hardware systems to administer and score the test,
  • Requires careful calibration and validation of the items and the IRT model parameters,
  • Requires clear communication and explanation of the test design and scoring method to the stakeholders, such as test takers, educators, and employers.

Calculating the Probability of Answering an Item

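We calculate the probability of an examinee answering an item correctly using the Rasch probability function, which depends only on the difference between the examinee's ability and the item's difficulty.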

// Define a function to calculate the probability of answering an item correctly using the Rasch model
function raschModel(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  return exp / (1 + exp) // Calculate the probability term
}

For example, suppose there are three items in the item pool, with difficulty levels of -1, 0, and 1, respectively. Assume that the examinee's ability estimate is 0.5, and that the IRT model is the Rasch (one-parameter logistic) model. Then, using the Rasch function above, we can calculate the probability of answering each item correctly as follows:

  • Item 1: P_1(0.5) = e^(0.5 - (-1)) / (1 + e^(0.5 - (-1))) ≈ 0.82
  • Item 2: P_2(0.5) = e^(0.5 - 0) / (1 + e^(0.5 - 0)) ≈ 0.62
  • Item 3: P_3(0.5) = e^(0.5 - 1) / (1 + e^(0.5 - 1)) ≈ 0.38

This means that an examinee with an ability of 0.5 has an 82% probability of correctly answering an item with a difficulty of -1, but only a 38% probability for an item with a difficulty of 1.
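The three probabilities above can be reproduced with a few lines of JavaScript. This is a minimal sketch that repeats the raschModel function so it runs on its own; the names exampleAbility, exampleDifficulties, and exampleProbs are introduced here only for illustration.

```javascript
// Rasch probability of a correct response (same formula as raschModel above)
function raschModel(ability, difficulty) {
  const exp = Math.exp(ability - difficulty)
  return exp / (1 + exp)
}

// Hypothetical names for the worked example: one examinee, three items
const exampleAbility = 0.5
const exampleDifficulties = [-1, 0, 1]

// Probability of a correct answer on each item
const exampleProbs = exampleDifficulties.map((difficulty) =>
  raschModel(exampleAbility, difficulty),
)

exampleProbs.forEach((prob, index) => {
  console.log("Item " + (index + 1) + ": " + prob.toFixed(2))
})
// prints:
// Item 1: 0.82
// Item 2: 0.62
// Item 3: 0.38
```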

Maximum Information Method

The maximum information method is a way of selecting the best item to present to an examinee in a computer adaptive test. The idea is to choose the item that provides the most information about the examinee's ability level under the item response theory (IRT) model. Under the Rasch model, the information value of an item is the product of the probability of answering the item correctly and its complement, so it depends on the current ability estimate and the item difficulty. It peaks at 0.25 when the item difficulty equals the ability estimate (a 50% chance of a correct answer) and falls off as the two move apart. The higher the information value, the more precise the ability estimate will be after the item is answered. The maximum information method selects the item that maximizes this value among the available items in the item pool.

// Define a function to calculate the information value using the Fisher information method
function getInfo(ability, difficulty) {
  let prob = raschModel(ability, difficulty); // Calculate the probability term
  let info = prob * (1 - prob); // Calculate the information value
  return info;
}

Continuing the example from above, we can calculate the information value of each item as follows:

  • Item 1: I_1(0.5) = P_1(0.5)(1 - P_1(0.5)) ≈ 0.15
  • Item 2: I_2(0.5) = P_2(0.5)(1 - P_2(0.5)) ≈ 0.24
  • Item 3: I_3(0.5) = P_3(0.5)(1 - P_3(0.5)) ≈ 0.24

The maximum information method would select Item 2 or Item 3 as the next item to present to the examinee. Both are 0.5 away from the ability estimate of 0.5, so they tie for the highest information value among the three items.
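As a rough illustration of the selection step, the sketch below computes the information value for the three example items and picks the largest. It repeats raschModel and getInfo so it runs on its own; threeItems, currentAbility, infoById, and mostInformative are hypothetical names used only here.

```javascript
function raschModel(ability, difficulty) {
  const exp = Math.exp(ability - difficulty)
  return exp / (1 + exp)
}

// Rasch item information: P * (1 - P)
function getInfo(ability, difficulty) {
  const prob = raschModel(ability, difficulty)
  return prob * (1 - prob)
}

// Hypothetical three-item pool from the worked example
const threeItems = [
  { id: 1, difficulty: -1 },
  { id: 2, difficulty: 0 },
  { id: 3, difficulty: 1 },
]
const currentAbility = 0.5

// Information value of every item at the current ability estimate
const infoById = threeItems.map((item) => ({
  id: item.id,
  info: getInfo(currentAbility, item.difficulty),
}))

// Keep the item with the largest information value.
// Items 2 and 3 tie mathematically, so which one wins here depends on
// floating-point rounding and on the strict ">" comparison.
const mostInformative = infoById.reduce((best, current) =>
  current.info > best.info ? current : best,
)

infoById.forEach((entry) => {
  console.log("Item " + entry.id + ": " + entry.info.toFixed(2))
})
console.log("Selected item: " + mostInformative.id)
```

A production item selector would normally break such ties deliberately, for example at random, so that the same items are not always exposed first.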

Example JavaScript Program to Score CAT

The full program is below:

// Define some constants for the test parameters
const NUM_ITEMS = 100000 // The number of items in the item pool
const MIN_DIFFICULTY = -3 // The minimum difficulty level of the items
const MAX_DIFFICULTY = 3 // The maximum difficulty level of the items
const MIN_ABILITY = -4 // The minimum ability level of the examinees
const MAX_ABILITY = 4 // The maximum ability level of the examinees
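// The termination criterion for the standard error of measurement (SEM).
// Note: each Rasch item adds at most 0.25 information, so with the 200-question cap below
// the SEM cannot drop under about 0.14 and the cap, not this criterion, ends the test.
const TERMINATION_CRITERION = 0.01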
const MAX_TEST_QUESTIONS = 200

// Define a function to generate random numbers from a normal distribution
function randn_bm() {
  let u = 0,
    v = 0
  while (u === 0) u = Math.random() //Converting [0,1) to (0,1)
  while (v === 0) v = Math.random()
  return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * v)
}

// Define a function to generate random numbers from a uniform distribution
function randu(a, b) {
  return a + Math.random() * (b - a)
}

// Define a function to generate an item pool with difficulty levels
function generateItemPool(n, min, max) {
  let itemPool = []
  for (let i = 0; i < n; i++) {
    let difficulty = randu(min, max) // Generate a random difficulty level from a uniform distribution
    itemPool.push({ id: i + 1, difficulty: difficulty })
  }
  return itemPool
}

// Define a function to calculate the probability of answering an item correctly using the Rasch model
function raschModel(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  return exp / (1 + exp) // Calculate the probability term
}

// Define a function to simulate answering an item based on the probability of correctness
function answerItem(prob) {
  let rand = Math.random() // Generate a random number from [0, 1)
  if (rand < prob) {
    return true // Answer correctly with probability prob
  } else {
    return false // Answer incorrectly with probability 1 - prob
  }
}

// Define a function to update the ability estimate with one Newton-Raphson step toward the maximum likelihood estimate
function updateAbility(ability, responses, itemPool) {
  let sumNum = 0 // Initialize the numerator of the summation term
  let sumDen = 0 // Initialize the denominator of the summation term
  for (let i = 0; i < responses.length; i++) {
    let item = itemPool[responses[i].itemId - 1] // Get the item object from the item pool by id
    let prob = raschModel(ability, item.difficulty) // Calculate the probability term
    sumNum += responses[i].correct - prob // Update the numerator by adding the difference between response and probability
    sumDen += prob * (1 - prob) // Update the denominator by adding the product of probability and its complement
  }
  let newAbility = ability + sumNum / sumDen // Calculate the new ability estimate by adding the quotient of the summation terms to the old ability estimate
  return newAbility
}

// Define a function to calculate the standard error of measurement using the Fisher information method
function calculateSEM(ability, responses, itemPool) {
  let sum = 0 // Initialize the summation term
  for (let i = 0; i < responses.length; i++) {
    let item = itemPool[responses[i].itemId - 1] // Get the item object from the item pool by id
    let exp = Math.exp(ability - item.difficulty) // Calculate the exponent term
    let prob = exp / (1 + exp) // Calculate the probability term
    sum += prob * (1 - prob) // Update the summation by adding the product of probability and its complement
  }
  let sem = Math.sqrt(1 / sum) // Calculate the standard error of measurement by taking the square root of the inverse of the summation
  return sem
}

// Define a function to calculate the information value using the Fisher information method
function getInfo(ability, difficulty) {
  let exp = Math.exp(ability - difficulty) // Calculate the exponent term
  let prob = exp / (1 + exp) // Calculate the probability term
  let info = prob * (1 - prob) // Calculate the information value
  return info
}

// Define a function to select the next item using the maximum information method
function selectItem(ability, responses, itemPool) {
  let maxInfo = 0 // Initialize the maximum information value
  let nextItem = null // Initialize the next item object
  for (let i = 0; i < itemPool.length; i++) {
    let item = itemPool[i] // Get the item object from the item pool
    let info = getInfo(ability, item.difficulty) // Calculate the information value
    if (info > maxInfo) {
      // If the information value is greater than the current maximum information value
      let answered = false // Initialize a flag to indicate if the item has been answered
      for (let j = 0; j < responses.length; j++) {
        if (responses[j].itemId === item.id) {
          // If the item id matches any of the response item ids
          answered = true // Set the flag to true
          break // Break out of the loop
        }
      }
      if (!answered) {
        // If the flag is false, meaning the item has not been answered
        maxInfo = info // Update the maximum information value
        nextItem = item // Update the next item object
      }
    }
  }
  return nextItem
}

// Define a function to check if the termination criterion is met
function checkTermination(sem, criterion) {
  if (sem < criterion) {
    // If the standard error of measurement is less than the criterion value
    return true // Return true to indicate that the test should be terminated
  } else {
    return false // Return false to indicate that the test should continue
  }
}

// Define a function to score a computer adaptive test given an examinee and an item pool
function scoreCAT(examinee, itemPool) {
  let ability = examinee.ability // Initialize the ability estimate with the examinee's ability level
  let responses = [] // Initialize an empty array to store the responses
  let terminated = false // Initialize a flag to indicate if the test is terminated

  for (let i = 0; i < NUM_ITEMS; i++) {
    if (i >= MAX_TEST_QUESTIONS) {
      terminated = true
      break
    } // Break out of the loop if the maximum number of questions is reached
    // Loop through each item index
    let item = selectItem(ability, responses, itemPool) // Select the next item using the maximum information method
    if (item === null) {
      // If no item is selected, meaning all items have been answered or there is no suitable item in the pool
      terminated = true // Set the flag to true
      break // Break out of the loop
    }
    let prob = raschModel(ability, item.difficulty) // Calculate the probability of answering the item correctly using the Rasch model
    let correct = answerItem(prob) // Simulate answering the item based on the probability of correctness
    responses.push({ itemId: item.id, correct: correct }) // Add the response object to the array of responses
    ability = updateAbility(ability, responses, itemPool) // Update the ability estimate using the maximum likelihood method
    let sem = calculateSEM(ability, responses, itemPool) // Calculate the standard error of measurement using the Fisher information method
    terminated = checkTermination(sem, TERMINATION_CRITERION) // Check if the termination criterion is met
    if (terminated) {
      // If the flag is true
      break // Break out of the loop
    }
  }

  return { ability: ability, responses: responses, terminated: terminated } // Return an object containing the final ability estimate, responses, and termination status
}

// Generate an item pool with difficulty levels from a uniform distribution between -3 and 3
let itemPool = generateItemPool(NUM_ITEMS, MIN_DIFFICULTY, MAX_DIFFICULTY)

let examinee = { ability: 0 } // Start with a mid point ability level
// Score a computer adaptive test given an examinee and an item pool
let score = scoreCAT(examinee, itemPool)

console.log(
  "Examinee's estimated new ability level: " + score.ability.toFixed(2),
)
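One consequence of the constants chosen above is worth checking by hand. Under the Rasch model each item contributes at most 0.25 to the total information, so the SEM after n items can never be smaller than 1 / sqrt(0.25 * n). The sketch below works through that arithmetic; minSEM and minItemsForSEM are hypothetical helpers, not part of the program above.

```javascript
// Smallest SEM reachable after numItems Rasch items,
// assuming every item is maximally informative (information 0.25 each)
function minSEM(numItems) {
  return 1 / Math.sqrt(0.25 * numItems)
}

// Fewest maximally informative items needed to reach a target SEM
function minItemsForSEM(targetSEM) {
  return Math.ceil(4 / (targetSEM * targetSEM))
}

console.log(minSEM(200).toFixed(3)) // prints 0.141
console.log(minSEM(40000).toFixed(3)) // prints 0.010
console.log(minItemsForSEM(0.3)) // prints 45
```

With MAX_TEST_QUESTIONS set to 200 the SEM stays at or above roughly 0.14, so a TERMINATION_CRITERION of 0.01 would need on the order of 40,000 items and the question cap always ends the test first. A criterion somewhere around 0.3 would let the SEM rule stop the test after a few dozen well-targeted items; the exact value is a design choice for the testing program.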

Conclusion

The above code can be used to score a Computer Adaptive Test. Contact us to find out how BK Software Development can help you implement such an algorithm at your TechEd company.
