Natural Language Processing for Improving Student Learning Outcomes

Organized by the NCME Artificial Intelligence in Measurement & Education (AIME) SIGIMIE

Presenter: Rose E. Wang, Stanford University

Language is central to educational interactions. However, there has been little work on measuring, generating, and intervening on language at scale in education. My work takes an empirical, language-based approach that uses natural language processing to model educational language and conduct interventions on it at scale. This approach allows us not only to identify opportunities and risks before large-scale deployment, but also to answer questions about language use in education settings in a data-driven way [1-3].

I will first discuss Bridge, a project that investigates how to scale high-quality teaching in tutoring [2]. Scaling high-quality teaching is challenging. Due to growing demand, many tutoring platforms employ novice tutors who, unlike experienced educators, struggle to address student mistakes and thus miss prime learning opportunities. To address this, I developed Bridge, a method that uses cognitive task analysis to translate an expert's latent thought process into a decision-making model for remediation. In a blind rating test, responses generated by Bridge-guided large language models (LLMs) were rated as higher quality than responses written by novice tutors alone or generated by LLMs alone.

Next, I will discuss Tutor CoPilot, a real-time decision aid that leverages Bridge to provide tutors with suggestions on how to remediate student mistakes. I will show preliminary findings from an ongoing randomized controlled trial of Tutor CoPilot in virtual tutoring sessions [3]: students working with tutors who have access to Tutor CoPilot have significantly more positive perceptions of their tutor and tutoring experience than students whose tutors do not have access. This preliminary analysis provides promising evidence that real-time decision aids like Tutor CoPilot can enhance tutors’ instruction and thus potentially increase the efficacy of virtual tutoring programs.

References:

[1] Rose E. Wang, Dorottya Demszky. “Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction”. BEA 2024.

[2] Rose E. Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky. “Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes”. NAACL 2024.

[3] Rose E. Wang, Ana Ribeiro, Carly Robinson, Dorottya Demszky, Susanna Loeb. “The Effect of Tutor CoPilot for Virtual Tutoring Sessions: Testing an Intervention to Improve Tutor Instruction with Expert-Guided LLM-generated Remediation Language”. 2024.

When: May 8, 2024 from 04:00 PM to 05:00 PM (ET)

Location

Online Instructions:
URL: http://us02web.zoom.us/j/81728443502?pwd=aGtPb0x0NGhTYVFPYzl2L2xaZEdGUT09
Login: Meeting ID: 817 2844 3502 Passcode: 649132