Item usage in a multidimensional computerized adaptive test (MCAT) measuring health-related quality of life

Quality of Life Research, Jun 2017

Purpose: Examining item usage is an important step in evaluating the performance of a computerized adaptive test (CAT). We study item usage for a newly developed multidimensional CAT which draws items from three PROMIS domains, as well as a disease-specific one.

Methods: The multidimensional item bank used in the current study contained 194 items from four domains: the PROMIS domains fatigue, physical function, and ability to participate in social roles and activities, and a disease-specific domain (the COPD-SIB). The item bank was calibrated using the multidimensional graded response model and data of 795 patients with chronic obstructive pulmonary disease. To evaluate the item usage rates of all individual items in our item bank, CAT simulations were performed on responses generated based on a multivariate uniform distribution. The outcome variables included active bank size and item overuse (usage rate larger than the expected item usage rate).

Results: For average θ-values, the overall active bank size was 9–10%; this number quickly increased as θ-values became more extreme. For values of −2 and +2, the overall active bank size equaled 39–40%. There was 78% overlap between overused items and active bank size for average θ-values. For more extreme θ-values, the overused items made up a much smaller part of the active bank size: here the overlap was only 35%.

Conclusions: Our results strengthen the claim that relatively short item banks may suffice when using polytomous items (and no content constraints/exposure control mechanisms), especially when using MCAT.
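The outcome variables named in the abstract can be illustrated with a small sketch. It rests on assumptions that are not stated in this preview: the active bank is taken to be the set of items administered at least once, the expected usage rate is taken to be the mean test length divided by the bank size (the rate under purely random selection), and the function name `usage_summary` and the toy administration logs are hypothetical.

```python
from collections import Counter


def usage_summary(administered, bank_size):
    """Summarize item usage over a set of simulated CAT administrations.

    administered: list of lists; each inner list holds the item indices
                  shown to one simulated respondent.
    bank_size:    total number of items in the bank.
    """
    n_tests = len(administered)
    # Count each item at most once per respondent.
    counts = Counter(item for test in administered for item in set(test))
    usage_rate = [counts.get(i, 0) / n_tests for i in range(bank_size)]

    # Assumption: expected usage rate = mean test length / bank size.
    mean_length = sum(len(set(test)) for test in administered) / n_tests
    expected_rate = mean_length / bank_size

    # Assumption: an item is "active" if it was used at least once.
    active = [i for i, rate in enumerate(usage_rate) if rate > 0]
    overused = [i for i, rate in enumerate(usage_rate) if rate > expected_rate]

    share = 100 * len(overused) / len(active) if active else 0.0
    return {
        "usage_rate": usage_rate,
        "expected_rate": expected_rate,
        "active_items": active,
        "active_bank_pct": 100 * len(active) / bank_size,
        "overused_items": overused,
        "overused_share_of_active_pct": share,
    }


# Toy logs: three simulated respondents, a five-item bank.
logs = [[0, 1], [0, 2], [0, 1]]
summary = usage_summary(logs, bank_size=5)
print(summary["active_bank_pct"], summary["overused_items"])
```

Repeating such a summary separately for each simulated θ-value would give figures comparable in kind to the percentages reported above, although the study's exact definitions may differ from the ones assumed here.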


Muirne C. S. Paap, Karel A. Kroeze, Caroline B. Terwee, Job van der Palen, Bernard P. Veldkamp

Department of Research Methodology, Measurement, and Data-Analysis, Faculty of Behavioural, Management and Social Sciences, University of Twente, Enschede, The Netherlands
Centre for Educational Measurement at the University of Oslo (CEMO), Faculty of Educational Sciences, University of Oslo, Oslo, Norway
Department of Special Needs, Education, and Youth Care, Faculty of Behavioural and Social Sciences, University of Groningen, Grote Rozenstraat 38, 9712 TJ Groningen, The Netherlands
Medical School Twente, Medisch Spectrum Twente, Enschede, The Netherlands
Department of Epidemiology and Biostatistics and the EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, The Netherlands

Keywords: Item exposure; HRQL; IRT; Item response theory; MCAT; CAT; MAT; Computerized adaptive test

Introduction

In the last decade, computerized adaptive tests (CATs) [1] based on item response theory (IRT) [2] have become increasingly popular in health measurement. A CAT can be seen as a questionnaire that is tailored to the test-taker on the fly: it continuously updates the estimate(s) of the position on the construct of interest (latent trait) based on answers given by the test-taker to the questions (items) posed. The underlying algorithm then selects the item that is most informative at that particular moment, given the current estimate of the latent trait value. It is clear why CATs appeal to healthcare professionals (HCPs): by selecting only those items that contribute most to the reliable measurement of a patient’s latent trait value, measurement efficiency is increased, which results in a substantial decrease in response burden [3]. Furthermore, CAT estimates can be used to generate automatic reports instantly, providing the HCP with all necessary information (latent trait estimate, standard error, norms, and graphic display) to facilitate communication with the patient. These properties make CATs excellent candidates for monitoring patients’ physical and mental health routinely, be it on a monthly or daily basis.
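The selection step described above (pick the item that is most informative at the current latent trait estimate) can be sketched as follows. This is a simplified, unidimensional illustration using the graded response model, not the multidimensional procedure used in the study; the item parameters and the names `grm_information` and `select_next_item` are hypothetical.

```python
import math


def grm_information(theta, a, thresholds):
    """Fisher information of one graded-response-model item at theta.

    a:          discrimination parameter
    thresholds: ordered category thresholds b_1 < ... < b_{K-1}
    """
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]

    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]  # probability of category k
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p > 1e-12:  # skip numerically empty categories
            info += dp * dp / p
    return info


def select_next_item(theta_hat, bank, administered):
    """Return the index of the unused item with maximum information."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: grm_information(theta_hat, *bank[i]))


# Hypothetical three-item bank: (discrimination, thresholds).
bank = [
    (1.0, [-1.0, 0.0, 1.0]),
    (2.0, [-1.0, 0.0, 1.0]),
    (2.0, [2.0, 3.0, 4.0]),
]
first = select_next_item(0.0, bank, administered=set())
second = select_next_item(0.0, bank, administered={first})
print(first, second)
```

In a full CAT this step alternates with re-estimating the latent trait after each response until a stopping rule is met; both of those parts are omitted here.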
CATs draw their items from item banks: large collections of items that have been calibrated with an IRT model using a large sample representative of the target population. The quality of the CAT and the latent trait estimate it generates depend to a large degree on the quality of the item bank. A psychometrically sound item bank contains items with location parameters that cover the whole range of relevant latent trait values, while having adequate to high discrimination parameters. A CAT drawing items from such an item bank will result in efficient measurement for all patients (irrespective of their latent trait score). Most CATs currently used for health measurement are based on item banks that were calibrated using unidimensional IRT models (e.g., [4–7]). Although less frequently used, multidimensional IRT models are available as well, and can be used to support multidimensional CAT (MCAT) (e.g., [8–10]). It has been shown that test length can be further reduced by taking the correlation among constructs into account d (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs11136-017-1624-3.pdf

Muirne C. S. Paap, Karel A. Kroeze, Caroline B. Terwee, Job van der Palen, Bernard P. Veldkamp. Item usage in a multidimensional computerized adaptive test (MCAT) measuring health-related quality of life, Quality of Life Research, 2017, pp. 1-10, DOI: 10.1007/s11136-017-1624-3