Using large language models to assess students and code qualitative data

Monday, February 26, 2024
12:00 PM - 1:30 PM

This event will be held on Zoom, with an in-person viewing party in the iSchool Terrace Lab. Please register below to receive the Zoom link to the event.

Contemporary large language models (LLMs), though the culmination of years of effort, have exploded into much greater use in education in the last year. LLMs offer a wide range of opportunities, as seen in tools like Khanmigo, but also pose challenges that previous-generation technologies did not, such as hallucinations, irreproducibility, and perspective-switching.

In this talk, Dr. Ryan Baker will discuss ongoing efforts at the Penn Center for Learning Analytics to leverage large language models for educational research and development, focusing on the center's efforts to use LLMs to identify meaningful categories in texts. He will discuss three projects. In the first project, ChatGPT is used to try to build better models that assess multiple dimensions of students' scientific inquiry skill from their explanations. In the second project, mixed-initiative qualitative coding partners humans with ChatGPT in different ways, and the team studies which combinations are most complementary. In the third project, LLMs are used to provide feedback on student errors in introductory computer programming. Dr. Baker will discuss the successes and failures of these approaches and what lessons we can draw from these projects for the use of ChatGPT for assessing text responses.

Ryan Baker is a Professor at the University of Pennsylvania and Director of the Penn Center for Learning Analytics. Baker has developed models that can automatically detect student engagement in over a dozen online learning environments, and led the development of an observational protocol and app for field observation of student engagement that has been used by over 150 researchers in 7 countries. Predictive analytics models he helped develop have been used to benefit over two million students, over a hundred thousand people have taken MOOCs he ran, and he has coordinated longitudinal studies that spanned over a decade. Baker was the founding president of the International Educational Data Mining Society, is currently serving as Editor of the journal Computer-Based Learning in Context, is Associate Editor of the Journal of Educational Data Mining, was the first technical director of the Pittsburgh Science of Learning Center DataShop, and currently serves as Co-Director of the MOOC Replication Framework (MORF). Baker has co-authored papers with over 400 colleagues and has been cited over 25,000 times.