Complementary Minds: Quantifying Human–AI Strengths and Robustness for Question Answering

This event is being organized by the NCME Artificial Intelligence in Measurement and Education (AIME) SIGIMIE.

As LLMs sprint ahead in fact retrieval and pattern matching, the questions that remain hard for them are the ones humans still ace: abductive puzzles, conceptual leaps, and cross-contextual riddles. But benchmarks age fast; today’s “hard” question can become tomorrow’s trivia. In this talk, you’ll discover two new tools that tackle both problems: CAIMIRA, which uses item-response theory to chart human and AI proficiencies at scale, and AdvScore, a human-anchored metric that flags when adversarial datasets stop being challenging. We’ll reveal surprising gaps, such as GPT-4’s dominance on lookup tasks versus human intuition on abductive reasoning, and show how to design next-generation QA challenges that push both mind and machine. Finally, you’ll learn a roadmap for pairing humans and AI agents so that their complementary “superpowers” deliver truly robust, real-world question-answering systems.
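For a feel for the item-response-theory machinery behind CAIMIRA, here is a minimal sketch of a standard two-parameter logistic (2PL) IRT model in Python. The subject names, skill values, and item parameters below are made up for illustration and are not drawn from CAIMIRA itself; the talk covers how this family of models is applied at scale to compare human and AI proficiencies.

    import math

    def p_correct(skill: float, difficulty: float, discrimination: float = 1.0) -> float:
        """2PL IRT: probability that a subject with the given latent skill
        answers an item of the given difficulty correctly."""
        return 1.0 / (1.0 + math.exp(-discrimination * (skill - difficulty)))

    # Illustrative (made-up) latent parameters: one human, one LLM, two item types.
    subjects = {"human": 0.8, "llm": 1.5}            # latent skill
    items = {"lookup": (-0.5, 1.2),                  # (difficulty, discrimination)
             "abductive": (1.0, 1.2)}

    for name, skill in subjects.items():
        for item, (b, a) in items.items():
            print(f"{name:5s} on {item:9s}: P(correct) = {p_correct(skill, b, a):.2f}")

The core idea the sketch shows: the probability of a correct answer grows with the gap between a subject’s latent skill and an item’s difficulty, so fitting these latent parameters to response data lets you place humans and models on the same proficiency scale.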

Presenter:

  • Maharshi Gor, University of Maryland

When: Jul 10, 2025 from 04:00 PM to 05:00 PM (ET)

Location

Online Instructions:
Url: https://acui-org.zoom.us/j/85091619144?pwd=Ty3UcBqjnZuL8mKjiuS8Pnzo3i9wKN.1