2007 Annual Meeting
National Council on Measurement in Education
2007 Annual Meeting & Training Sessions
April 10 – 12, 2007
Chicago, Illinois
Pre-Conference Training Sessions: April 8-9, 2007
April 8
Title: Basic Concepts in Exploratory Factor Analysis
Presenter(s): Thompson, Bruce
Cost: $35
Date: April 8
Time: 8:00 – 12:00
Abstract
The purpose of this introductory training session is to present the rationale for three uses of factor analysis (especially evaluating the validity of scores) and to present the basic concepts of exploratory factor analysis (EFA) applications. Topics include basic concepts of exploratory factor analysis; rotation, factor score, and higher-order methods; and EFA printout interpretation.
Audience
This workshop is for faculty and graduate students who have some familiarity with factor analysis, but who would like a refresher, and for others who recognize the important applications of factor analytic methods, but have not yet studied them and wish an introduction to the basic concepts.
Title: Generalizability Theory
Presenter(s): Brennan, Robert; Gao, Xiaohong
Cost: $110
Date: April 8
Time: 8:00 – 5:00
Abstract
Generalizability theory liberalizes and extends classical test theory. In particular, generalizability theory enables an investigator to disentangle multiple sources of error through the application of analysis of variance procedures to assess the dependability of measurements. The primary goals of this training session are to enable participants to understand the basic principles of generalizability theory, to conduct relatively straightforward generalizability analyses, and to interpret and use the results of such analyses. Mathematical and statistical foundations will be treated only minimally. Major emphasis will be placed upon quickly enabling participants to conduct and interpret relatively straightforward generalizability analyses, then more complicated ones. Examples will include various types of performance assessments. Computer programs for performing generalizability analyses will be discussed and illustrated. The book entitled Generalizability Theory (Brennan) will be distributed to participants and used as a principal reference in the training session.
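To make the idea of disentangling sources of error concrete, here is a minimal sketch of a generalizability analysis for the simplest case, a fully crossed persons-by-items (p x i) design with one score per cell. It is not the software discussed in the session; the function names (g_study_p_by_i, d_study) and the toy data are hypothetical, and negative variance estimates are simply set to zero, which is one common convention.

```python
def g_study_p_by_i(scores):
    """Estimate variance components for a crossed p x i design.

    scores: list of rows, one row per person, one column per item.
    Returns a dict with components for persons ("p"), items ("i"),
    and the residual ("pi,e"), using the usual ANOVA mean squares.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Mean squares for persons, items, and the residual.
    ms_p = n_i * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in i_means) / (n_i - 1)
    ss_res = sum(
        (scores[p][i] - p_means[p] - i_means[i] + grand) ** 2
        for p in range(n_p)
        for i in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations; clamp negatives to zero
    # (an assumption of this sketch, not the only possible choice).
    return {
        "p": max((ms_p - ms_res) / n_i, 0.0),
        "i": max((ms_i - ms_res) / n_p, 0.0),
        "pi,e": ms_res,
    }


def d_study(components, n_items):
    """Return (generalizability coefficient, dependability index Phi)
    for a measurement based on the mean over n_items items."""
    rel_error = components["pi,e"] / n_items
    abs_error = (components["i"] + components["pi,e"]) / n_items
    g_coef = components["p"] / (components["p"] + rel_error)
    phi = components["p"] / (components["p"] + abs_error)
    return g_coef, phi


# Toy data: 3 persons by 2 items (hypothetical scores).
toy_scores = [[1, 2], [2, 3], [3, 4]]
toy_components = g_study_p_by_i(toy_scores)
toy_g, toy_phi = d_study(toy_components, n_items=2)
print(toy_components, toy_g, toy_phi)
```

The dependability index counts the item main effect as error while the generalizability coefficient does not, so the former can never exceed the latter.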
Audience
The targeted audience is principally upper-level graduate students and new Ph.D.s with an interest in learning about and applying generalizability theory in practical contexts. Such persons are often new faculty members or members of (or persons who plan to pursue careers in) testing organizations, organizations concerned with small- and large-scale evaluations, or state and federal agencies concentrating on assessment procedures. As minimal prerequisites for attendance, participants should have one course in measurement and some familiarity with analysis of variance, at least at the level treated in introductory graduate statistics courses in education and psychology.
Title: Bayesian Networks in Educational Assessment
Presenter(s): Almond, Russell; Mislevy, Robert; Williamson, David; Yan, Duanli
Cost: $65
Date: April 8
Time: 8:00 – 5:00
Abstract
This session will provide background information on Bayesian networks, graphical models, and related inference and representation methods, and will provide examples of their use in educational assessment. It will review and provide intuition about the major methods for manipulating graphical models. It will concentrate on reviewing the existing body of literature on graphical models from other disciplines (in particular, the Uncertainty in Artificial Intelligence literature). Although the course will review the Evidence Centered Design framework for representing measurement models in educational assessments using graphs, the primary goal is to review the work done in other communities for psychometricians and psychologists.
Audience
This session is intended for people who have a good knowledge of probability and statistics (at the level of a college course in statistics with mathematics), but little experience with graphical models (Bayes nets) and related technologies. The audience should have some exposure to Bayesian ideas of inference, but extensive experience is not necessary. For the most part, models will be discussed in terms of mathematics, not equations. Although key theorems in the area will be discussed, the goal will be to provide intuition rather than rigorous proof.
Title: Skills Diagnosis with Latent Variable Models
Presenter(s): Douglas, Jeff; Chang, Hua-Hua; de la Torre, Jimmy; Templin, Jonathan
Cost: $80
Date: April 8
Time: 8:00 – 5:00
Abstract
The primary aim of skills diagnosis is to develop and analyze tests in ways that reveal information with more diagnostic value than traditional approaches provide. In the methods for skills diagnosis that we consider, mastery of particular skills or states of knowledge can be represented by a list of binary latent variables, indicating mastery of each of a finite set of skills under diagnosis. The main objective of skills diagnosis is to classify examinees according to this list of skills. In this training session, several popular modeling and classification approaches will be discussed. Three conjunctive latent class models known as the DINA, NIDA, and Fusion models will be introduced, and software for fitting these models with Mplus will be demonstrated. Because of the multidimensional nature of these models, estimation benefits greatly if it can adapt to previous responses. To address this, computerized adaptive testing (CAT) is considered. Because Fisher information does not apply to discrete latent variables, alternative and computationally simple item selection rules are introduced. For CAT settings in which both traditional and diagnostic models are being used, CAT algorithms are introduced for ensuring reliable information for these dual objectives. In addition to sequential methods of test construction, indices for use in fixed-length test construction are also given. The training session is meant to provide practical guidelines for implementing skills diagnosis, and considers the essential topics of identifying the attributes measured by items as well as test equating. Participants will be given access to a website to download software that can be used with Mplus for fitting latent variable models for skills diagnosis.
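As an illustration of classifying examinees by a list of binary skills, the following minimal sketch implements the DINA response rule and a posterior over all skill patterns. It is not the Mplus-based software mentioned above. It assumes that the Q-matrix and the slip and guess parameters are already known (in practice they are specified or estimated), and it assumes a uniform prior over skill patterns. The names dina_prob_correct and classify are hypothetical.

```python
from itertools import product


def dina_prob_correct(alpha, q_row, slip, guess):
    """DINA rule: an examinee who has mastered every skill the item
    requires answers correctly with probability 1 - slip; anyone
    missing at least one required skill succeeds with probability guess.

    alpha: tuple of 0/1 mastery indicators, one per skill.
    q_row: tuple of 0/1 flags marking which skills the item requires.
    """
    has_all_required = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if has_all_required else guess


def classify(responses, q_matrix, slips, guesses):
    """Posterior probability of each skill pattern given 0/1 responses,
    under a uniform prior over all 2^K patterns (an assumption here)."""
    n_skills = len(q_matrix[0])
    likelihoods = {}
    for alpha in product((0, 1), repeat=n_skills):
        likelihood = 1.0
        for x, q_row, s, g in zip(responses, q_matrix, slips, guesses):
            p = dina_prob_correct(alpha, q_row, s, g)
            likelihood *= p if x == 1 else 1.0 - p
        likelihoods[alpha] = likelihood
    total = sum(likelihoods.values())
    return {alpha: lik / total for alpha, lik in likelihoods.items()}


# Hypothetical 3-item, 2-skill example.
example_q = [(1, 0), (0, 1), (1, 1)]
example_posterior = classify(
    responses=[1, 0, 0],
    q_matrix=example_q,
    slips=[0.1, 0.1, 0.1],
    guesses=[0.2, 0.2, 0.2],
)
most_likely_pattern = max(example_posterior, key=example_posterior.get)
print(most_likely_pattern, example_posterior[most_likely_pattern])
```

Enumerating all 2^K patterns is only practical for a small number of skills, which is fine for a sketch but is one reason dedicated estimation software is used in real applications.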
Audience
The intended audience for this training session includes anyone interested in cognitive or skills diagnosis who has some familiarity with item response theory or classical test theory. No previous knowledge of latent class models or cognitive diagnosis is required. The material will be useful for faculty and students specializing in educational testing, as well as testing professionals working in government or private testing organizations.
Title: Considerations in Setting Performance Standards
Presenter(s): Pitoniak, Mary; Zieky, Michael; Perie, Marianne
Cost: $65
Date: April 8
Time: 8:00 – 5:00
Abstract
This training session intends to answer questions regarding how to choose a standard-setting method, which methods are currently being used, and how to know if the cut scores set for an assessment yield valid interpretations within the context of a particular testing program. Vertically moderated standards and the adjustment of committee-recommended cut scores will also be discussed. Beginning with a historical overview, the session will provide a context regarding how decisions about standard setting are made today. Methodologies currently being used by the states in setting performance standards will be reviewed. Hands-on practice time will be given to allow participants to thoroughly understand the cognitive tasks involved in making the judgments for two of the most commonly used methods, Bookmark (Lewis, Mitzel, & Green, 1996) and modified Angoff (Angoff, 1971). This exercise will also prepare participants to plan and run modified Angoff and Bookmark standard-setting workshops. Finally, significant time will be devoted to studying the validity of standard-setting procedures and the resulting cut scores. Using Kane’s (1994, 2001) framework, the session will explore three sources of evidence: procedural, internal, and external. This session is intended for anyone who needs to understand how to run a standard-setting session and the complexities involved.
Audience
The intended audience includes anyone currently involved in setting standards, from state assessment directors who need to make decisions regarding how to set standards in their state to the vendors who actually conduct the standard setting. Anyone who wants to learn the steps for conducting a sound standard setting workshop is welcome.
April 9
Title: Teaching Educational Measurement
Presenter(s): Bandalos, Deborah; Ferster, Amanda
Cost: $35
Date: April 9
Time: 8:00 – 12:00
Abstract
This training session has been designed to provide ideas, materials, and other resources for those who have teaching responsibilities for an introductory course in educational measurement. The training session will include discussions of topics to include in an educational measurement course, sequencing of topics, responding to the needs of students from different content areas, teaching strategies and materials, assessment methods, and recommendations for books, articles, websites, and other materials. Handouts will include examples of assignments, exercises, and assessments from a variety of experienced teachers of such courses.
Audience
This training session is intended for those who have teaching responsibilities for an introductory course in educational measurement or assessment. Courses such as this are typically offered in Colleges of Education and may be targeted toward many different audiences including (but not limited to): undergraduate teaching or psychology majors, graduate students in cognitive science, psychology, education, and educational measurement. Although this course is primarily intended for those who are new to teaching educational measurement, it will also offer new ideas and resources to those who are currently teaching such a course.
Title: Student Involvement and Formative Feedback in Classroom Assessment: Measurement Concepts and Issues
Presenter(s): Beaudry, Jeff; Lukin, Leslie; Nebelsick-Gullet, Lori
Cost: $65
Date: April 9
Time: 8:00 – 12:00
Abstract
The purpose of this training session is to examine current theory and best practice regarding classroom assessment and grading, to show how this knowledge can be used to promote student learning, and to explain how students benefit from direct involvement in assessment and grading. A key element of this discussion will be the development and use of formative assessment and feedback as an important part of the learning process. Learning activities will center on issues of assessment quality and utility. Through the discussion of in-depth case studies of practitioners, participants will explore the following topics:
- Development of a shared language for classroom assessment literacy,
- Development of an understanding of the similarities and differences between assessments that are used for system accountability versus assessments used in classrooms to support the learning process,
- Development and implementation of interpretable and useable formative feedback,
- Development of a fair and equitable learning environment,
- How to create an environment at the systems level that supports the implementation of best practice in the areas of assessment and grading in classrooms, and
- Use of data for student learning, teacher planning, and system improvement.
Audience
The intended audience for this session is test and measurement specialists and/or practitioners, including teacher educators, who may or may not have direct experience in P-16 classrooms. These specialists need to understand the standards of quality for classroom assessment and the benefits of student involvement in assessment and grading.
Title: The Kernel Method of Observed Score Test Equating
Presenter(s): Von Davier, Alina; Holland, Paul; Chen, Henry
Cost: $110
Date: April 9
Time: 8:00 – 5:00
Abstract
Test equating methods are used to produce scores that are comparable across different test forms. The Kernel Method of Test Equating (KE) is a unified approach to test equating based on a flexible family of equipercentile-like equating functions that contains the linear equating function as a special case. Observed-score test equating is viewed as having five steps or parts, each of which involves distinct ideas. They are: 1) pre-smoothing; 2) estimation of the score probabilities on the target population; 3) continuization; 4) computing the equating function; 5) computing the standard error of equating and related accuracy measures. KE brings together these steps into an organized whole rather than treating them as disparate problems. KE exploits pre-smoothing by fitting log-linear models to score data, and incorporates it into step 5) above. KE provides new tools for comparing two or more equating functions and for choosing rationally between them. In this session, theoretical issues will be considered along with numerical examples and a software demonstration using real data. The book with the same title is the basis of this training session. The theory behind KE will be covered as well as its application to the Equivalent Groups (EG) Design, Single Group (SG) Design, Counterbalanced (CB) Design, and Non-Equivalent Groups Anchor Test (NEAT) Design. KE allows us to give a unified discussion of Chain Equating and Post-Stratification Equating (frequency estimation and Tucker equating). A demo of the KE-software v. 2.0 (ETS, 2006) will be provided. Participants will receive a copy of the KE-software v. 2.0 (ETS, 2006) contingent upon signing a license agreement. Participants will receive a ticket that will provide a discount price for purchasing the book “The Kernel Method of Test Equating.”
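Steps 3) and 4) above can be sketched in a few lines. The following is a minimal illustration, not the KE-software: it assumes the score probabilities for both forms are already available (steps 1 and 2), uses a Gaussian kernel for continuization, and inverts the continuized distribution by simple bisection. The function names, the bandwidth values, and the example score distributions are all hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def moments(scores, probs):
    """Mean and variance of a discrete score distribution."""
    mean = sum(p * x for x, p in zip(scores, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(scores, probs))
    return mean, var


def continuized_cdf(scores, probs, h):
    """Gaussian-kernel continuization with bandwidth h. The shrinkage
    factor a keeps the mean and variance of the discrete scores."""
    mean, var = moments(scores, probs)
    a = math.sqrt(var / (var + h * h))

    def cdf(x):
        return sum(
            p * normal_cdf((x - a * xj - (1.0 - a) * mean) / (a * h))
            for xj, p in zip(scores, probs)
        )

    return cdf


def kernel_equate(x, x_scores, x_probs, y_scores, y_probs, h_x, h_y):
    """Equate score x on form X to the scale of form Y by matching
    continuized percentiles: find y with G(y) = F(x)."""
    cdf_x = continuized_cdf(x_scores, x_probs, h_x)
    cdf_y = continuized_cdf(y_scores, y_probs, h_y)
    target = cdf_x(x)
    _, var_y = moments(y_scores, y_probs)
    # Bracket wide enough to hold essentially all of the Y distribution.
    lo = min(y_scores) - 12.0 * math.sqrt(var_y)
    hi = max(y_scores) + 12.0 * math.sqrt(var_y)
    for _ in range(200):  # bisection on the increasing function cdf_y
        mid = 0.5 * (lo + hi)
        if cdf_y(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical score probabilities on two five-point forms.
form_scores = [0, 1, 2, 3, 4]
probs_x = [0.1, 0.2, 0.4, 0.2, 0.1]
probs_y = [0.2, 0.3, 0.3, 0.1, 0.1]
print(kernel_equate(3.0, form_scores, probs_x, form_scores, probs_y, 0.6, 0.6))
```

With very large bandwidths the continuized distributions become nearly normal, and the result approaches the linear equating function, which is the special case mentioned in the abstract.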
Audience
This session is intended for statisticians and other research workers interested in the theory behind such work and in the use of model-based statistical methods of data smoothing in applied work. It is also intended for practitioners who need to equate tests, including those with these responsibilities in testing companies, state testing agencies, and school districts. Real examples will be presented in enough detail that the session will be useful for advanced graduate students in psychometric and measurement programs.
Title: Applying Hierarchical Models to Causal Inference
Presenter(s): Raudenbush, Stephen; Hong, Guanglei
Cost: $65
Date: April 9
Time: 8:00 – 5:00
Abstract
The purpose of this training session is to introduce recent developments in causal inference concepts and methods for evaluating educational policy and program effects in multi-level settings when randomized experiments are infeasible. Hierarchical linear and nonlinear models are taught in combination with propensity score-based methods for causal effect estimation. Education examples will be used throughout in lecture, discussion, and hands-on practice. Participants need to bring a laptop computer with SPSS 14.0 installed. Participants are expected to download and install ahead of time the free 15-day trial edition of the HLM 6 software available at http://www.ssicentral.com/hlm/downloads.html
Audience
The session is intended for university faculty, graduate students, and educational researchers interested in investigating the effectiveness of educational policies, intervention programs, and various educational practices.
Title: Multidimensional Item Response Theory
Presenter(s): Habing, Brian; Froelich, Amy
Cost: $90
Date: April 9
Time: 8:00 – 5:00
Abstract
One common problem in educational measurement is determining if an exam or scale satisfies the twin assumptions of unidimensionality and local independence. When these assumptions fail, it is necessary to examine the underlying multidimensional or locally dependent structure and either model that structure or refine the original scale. This training session is designed for those who have been exposed to the standard 1PL and 3PL IRT models and deal with (potentially) multidimensional educational assessments or surveys. It focuses on developing an intuitive understanding of the concepts and methods as opposed to rigorously developing the mathematics. The session begins with a brief review of the assumptions of local independence and unidimensionality, an overview of multidimensional IRT models (including NOHARM and testlet models), and a survey of the common procedures for testing unidimensionality. Mokken scaling and the conditional covariance methods (DIMTEST, DETECT, HCA-CCPROX) are then examined in detail, with hands-on opportunities to try the procedures on real data sets. Participants will be provided copies of the software used and are encouraged to bring a laptop running Windows 95 or later. Sijtsma and Molenaar’s (2002) Introduction to Nonparametric Item Response Theory will also be provided to the participants.
Audience
Measurement professionals and students with exposure to the basics of unidimensional IRT.
Title: Tips for Graduate Students: Advice for Finishing School, Obtaining a Job, and Starting a Career
Presenter(s): Harris, Deborah; Sanclemente, Julio; Ho, Andrew
Cost: $10
Date: April 9
Time: 1:00 – 5:00
Abstract
The training session has three main components:
- Finishing up the Ph.D. including finding a dissertation topic and how to maximize experiences while still a student (classes, internships, work experiences, networking, professional associations),
- Obtaining a job including how to locate where jobs are available (universities, testing companies, school districts, state departments, professional/licensing organizations, etc.), how to apply for jobs (including targeting cover letters, references, and resumes) and the interview process,
- Beginning a career including job politics, adjusting to the environment, career path, publishing, professional service, being a mentor/finding a mentor, balancing work and life, and what if I hate my job.
Audience
The training session is targeted towards graduate students in measurement who have questions in such areas as: where jobs are available (e.g., school districts, state departments); what types of things employers look for in application materials; what types of questions might be asked of an interviewee; what types of questions should an interviewee ask; what are possible dissertation topics; etc. Graduate students in other areas might gain some benefit, but the materials will be specifically geared to graduate students in measurement.
Title: Vertical Scaling
Presenter(s): Kolen, Michael; Tong, Ye
Cost: $45
Date: April 9
Time: 1:00 – 5:00
Abstract
The potential need for constructing a vertical scale arises whenever a testing program has multiple grade levels and wishes to have a common scale to compare test scores across these grade levels. Vertical scaling uses a statistical process to place test scores that measure similar content domains, but at different educational levels, onto a common scale. The goals of the session are for attendees to be able to understand the principles of vertical scaling, to conduct vertical scaling, and to interpret the results of vertical scaling in reasonable ways. Vertical scaling will be contrasted with related equating and linking processes. Traditional and IRT vertical linking methodologies will be described and practical issues will be discussed. The focus is on developing a conceptual understanding of vertical scaling through numerical examples and discussion of practical issues. The importance of vertical scaling and the challenges it poses will also be covered. The text for the session is a chapter in Kolen and Brennan’s (2004) Test Equating, Scaling, and Linking: Methods and Practices (Second Edition).
Audience
The targeted audience is upper-level graduate students, new Ph.D.s, testing professionals with operational or oversight responsibility for vertical scaling, and others with an interest in learning about vertical scaling methods and practice. Participants should have at least two graduate courses in measurement and two graduate courses in statistics.

