Digital ITEMS Modules

Check Out the Most Recent ITEMS modules!


Reliability in Classical Test Theory

In this digital ITEMS module, Dr. Charlie Lewis and Dr. Michael Chajewski provide a two-part introduction to the topic of reliability from the perspective of classical test theory (CTT).

Scale Reliability in Structural Equation Modeling

In this digital ITEMS module, Dr. Greg Hancock and Dr. Ji An provide an overview of scale reliability from the perspective of structural equation modeling (SEM) and address some of the limitations of Cronbach’s α.

Nonparametric Item Response Theory

In this digital ITEMS module, Dr. Stefanie Wind introduces the framework of nonparametric item response theory (IRT), in particular Mokken scaling, which can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models.

Diagnostic Measurement Checklists

In this digital ITEMS module, Dr. Natacha Carragher, Dr. Jonathan Templin, and colleagues provide a didactic overview of the specification, estimation, evaluation, and interpretation steps for diagnostic measurement / classification models (DCMs) centered around checklists for practitioners. A library of macros and supporting files for Excel, SAS, and Mplus is provided along with video tutorials for key practices.

The G-DINA Framework

In this digital ITEMS module, Dr. Wenchao Ma and Dr. Jimmy de la Torre introduce the G-DINA model, which is a general framework for specifying, estimating, and evaluating a wide variety of cognitive diagnosis models for the purpose of diagnostic measurement.

Posterior Predictive Model Checking

In this digital ITEMS module, Dr. Allison Ames and Aaron Myers discuss the most common Bayesian approach to model-data fit evaluation, which is called Posterior Predictive Model Checking (PPMC), for simple linear regression and item response theory models.

Subscore Evaluation & Reporting

In this digital ITEMS module, Dr. Sandip Sinharay reviews the status quo on the reporting of subscores, which includes how they are used in operational reporting, what kinds of professional standards they need to meet, and how their psychometric properties can be evaluated.

Foundations of Operational Item Analysis

In this digital ITEMS module, Dr. Hanwook Yoo and Dr. Ronald K. Hambleton provide an accessible overview of operational item analysis approaches for dichotomously scored items within the frameworks of classical test theory and item response theory.

Sociocognitive Assessment for Diverse Populations

In this digital ITEMS module, Dr. Robert Mislevy and Dr. Maria Elena Oliveri introduce and illustrate a sociocognitive perspective on educational measurement, which focuses on a variety of design and implementation considerations for creating fair and valid assessments for learners from diverse populations with diverse sociocultural experiences.

Rasch Measurement Theory

In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales and demonstrate the estimation of core models with the Shiny_ERMA and Winsteps programs.

Bayesian Psychometrics

In this digital ITEMS module, Dr. Roy Levy discusses how Bayesian inference is a mechanism for reasoning in a probability-modeling framework, describes how this plays out in a normal distribution model and unidimensional item response theory (IRT) models, and illustrates these steps using the JAGS software and R.

Think-aloud Interviews and Cognitive Labs

In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review differences between think-aloud interviews to measure problem-solving processes and cognitive labs to measure comprehension processes and illustrate both traditional and modern data-collection methods.

Simulation Studies in IRT

In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of Monte Carlo simulation studies (MCSS) in item response theory (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because they allow researchers to specify and manipulate an array of parameter values and experimental conditions (e.g., sample size, test length, and test characteristics).

Planning and Conducting Standard Setting

In this digital ITEMS module, Dr. Michael B. Bunch provides an in-depth, step-by-step look at how standard setting is done. It does not focus on any specific procedure or methodology (e.g., modified Angoff, bookmark, body of work) but on the practical tasks that must be completed for any standard setting activity.

Accessibility of Educational Assessments

In this digital ITEMS module, Dr. Ketterlin Geller and her colleagues provide an introduction to accessibility of educational assessments. They discuss the legal basis for accessibility in K-12 and higher education organizations and describe how test and item design features as well as examinee characteristics affect the role that accessibility plays in evaluating test validity during test development and operational deployment.

Longitudinal Data Analysis

In this digital ITEMS module, Dr. Jeffrey Harring and Ms. Tessa Johnson introduce the linear mixed effects (LME) model as a flexible general framework for simultaneously modeling continuous repeated measures data with a scientifically defensible function that adequately summarizes both individual change and the average response.

Data Visualizations

In this digital ITEMS module, Nikole Gregg and Dr. Brian Leventhal discuss strategies to ensure data visualizations achieve graphical excellence. The instructors review key literature, discuss strategies for enhancing graphical presentation, and provide an introduction to the Graph Template Language (GTL) in SAS to illustrate how elementary components can be used to make efficient, effective, and accurate graphics for a variety of audiences.

Automated Scoring

In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. They discuss automated scoring from a number of perspectives and provide two data examples, one focused on training and evaluating an automated scoring engine and one focused on the impact of rater error on predicted scores.

Foundations of IRT Estimation

In this digital ITEMS module, Dr. Zhuoran Wang and Dr. Nathan Thompson introduce the basic item response theory (IRT) item calibration and examinee scoring procedures as well as strategies to improve estimation accuracy.

Classroom Assessment Standards

In this digital ITEMS module, Dr. Caroline Wylie reviews the Classroom Assessment Standards with their three sets of standards: (1) Foundations (these six standards provide the basis for developing and implementing sound and fair classroom assessment); (2) Use (these five standards follow a logical progression from the selection and development of classroom assessments to the communication of the assessment results); and (3) Quality (these five standards guide teachers in providing accurate, reliable, and fair classroom assessment results for all students).

Results Reporting for Large-scale Assessments

In this digital ITEMS module, Dr. Francis O’Donnell and Dr. April Zenisky provide a firm grounding in the conceptual and operational considerations around results reporting for summative large-scale assessment and, throughout the module, highlight research-grounded good practices, concluding with some principles and ideas around conducting reporting research.

Supporting Decisions with Assessment

In this digital ITEMS module, Dr. Chad Gotch walks through different forms of assessment, from everyday actions that are almost invisible to high-profile, annual, large-scale tests, with an eye towards educational decision-making.

Multidimensional Item Response Theory Graphics

In this digital ITEMS module, Dr. Terry Ackerman and Dr. Qing Xie cover the underlying theory and application of multidimensional item response theory models from a visual perspective.

Assessment Literacy

In this digital ITEMS module, Dr. Jade Caines Lee provides an opportunity for learners to gain introductory-level knowledge of educational assessment. The module’s framework will allow K-12 teachers, school building leaders, and district-level administrators to build “literacy” in three key assessment areas: measurement, testing, and data.

Testlet Models

In this digital ITEMS module, Dr. Hong Jiao and Dr. Manqian Liao describe testlet response theory and associated measurement models for the construction and evaluation of new measures and scales when local item dependence is present.

Content Alignment in Standards-based Educational Assessment

In this digital ITEMS module, Dr. Katherine Reynolds and Dr. Sebastian Moncaleano discuss content alignment, its role in standards-based educational assessment, and popular methods for conducting alignment studies.

Hierarchical Rater Models

In this digital ITEMS module, Dr. Jodi M. Casabianca provides a primer on the hierarchical rater model (HRM) and the recent expansions to the model for analyzing raters and ratings of constructed responses.

Unusual Things that Usually Occur in a Credentialing Testing Program

In this digital ITEMS module, Drs. Richard Feinberg, Carol Morrison, and Mark R. Raymond provide an overview of how credentialing testing programs operate and discuss special considerations that need to be made when unusual things occur.

Multidimensional Item Response Theory Equating

In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating.

Validity and Educational Testing: Purposes and Uses of Educational Tests

In this digital ITEMS module, Jennifer Lewis and Dr. Stephen G. Sireci discuss the benefits and limitations of educational tests, the concept of validity and why it is important, and the types of validity evidence that should be used to support the use of a test for a particular purpose.


Testing Accommodations for Students with Disabilities

In this digital ITEMS module, Dr. Benjamin J. Lovett discusses a psychometric framework for thinking about accommodations, and then explicates an accommodations decision-making framework that includes a variety of considerations.


Understanding and Mitigating the Impact of Low Effort on Common Uses of Test and Survey Scores

In this digital ITEMS module, Dr. James Soland discusses how frequently behaviors associated with low effort occur, some of the ways they can distort inferences based on test scores, and some of the most common approaches for identifying and correcting for low effort when examining test scores.


Fairness in Classroom Assessment: Dimensions and Tensions

In this digital ITEMS module, Dr. Amirhossein Rasooli shares the findings drawn from theoretical and empirical research from various countries to provide a space for further critical reflection on best practices in enhancing fairness in classroom assessment contexts.     


Introduction to Multilevel Measurement Modeling

In this digital ITEMS module, Mairead Shaw and Dr. Jessica Flake review two different frameworks for multilevel measurement modelling: (1) multilevel modelling and (2) structural equation modelling; and demonstrate the entire process in R with working code and available data, from preparing the dataset, through writing and running code, to interpreting and comparing output for the two approaches.


Through-Year Assessment

In this digital ITEMS module, Nathan Dadey, Brian Gong, Yun-Kyung Kim, and Edynn Sato present information about through-year assessment, including major test design elements and considerations, key challenges that pose a threat to assessment validity and utility, recommended methods to address these challenges, and considerations for implementation.


Applying Intersectionality Theory to Educational Measurement

In this digital ITEMS module, Dr. Michael Russell examines key concepts that form the foundation of Intersectionality Theory and considers challenges and opportunities these concepts present for quantitative methods.