
Curricular analytics (CA), the systematic analysis of curriculum data to inform program and course refinement, is an increasingly valuable tool for helping institutions align their academic offerings with evolving societal and economic demands. Large language models (LLMs) show promise for handling large-scale, unstructured curriculum data, but it remains uncertain how reliably they can perform CA tasks.
In this AIED conference paper, the authors evaluate four LLM-based text alignment strategies for skill extraction, a core task in CA. Using a stratified sample of 400 curriculum documents spanning several document types and a human-LLM collaborative evaluation framework, they find that retrieval-augmented generation (RAG) is the top-performing strategy across all document types. The findings highlight the promise of LLMs for analyzing brief, abstract curriculum documents, but also show that performance varies substantially with model selection and prompting strategy, underscoring the importance of carefully evaluating LLM-based strategies before large-scale deployment.
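To make the RAG strategy concrete, the sketch below shows one way a retrieval-augmented skill-extraction step could be set up: candidate skills from a taxonomy are retrieved by embedding similarity and then supplied to an LLM prompt alongside the course description. The skill list, the sentence-transformers embedding model, and the prompt wording are illustrative assumptions, not the paper's actual pipeline.

```python
# Illustrative sketch of a retrieval-augmented skill-extraction step.
# The skill taxonomy, embedding model, and prompt are assumptions for
# illustration only; the paper's actual setup may differ.
from sentence_transformers import SentenceTransformer, util

# Hypothetical skill taxonomy entries (a real pipeline might draw on a
# framework such as ESCO or O*NET).
SKILLS = [
    "data analysis: interpreting quantitative data to support decisions",
    "written communication: producing clear, structured documents",
    "statistical modeling: building and validating regression models",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
skill_embeddings = embedder.encode(SKILLS, convert_to_tensor=True)


def retrieve_skills(course_text: str, top_k: int = 2) -> list[str]:
    """Return the top-k taxonomy skills most similar to the course description."""
    query_embedding = embedder.encode(course_text, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, skill_embeddings, top_k=top_k)[0]
    return [SKILLS[hit["corpus_id"]] for hit in hits]


def build_extraction_prompt(course_text: str) -> str:
    """Assemble an extraction prompt grounded in the retrieved candidate skills."""
    retrieved = "\n".join(f"- {s}" for s in retrieve_skills(course_text))
    return (
        "Identify which of the following skills this course develops. "
        "Answer only with skills from the list.\n"
        f"Candidate skills:\n{retrieved}\n\n"
        f"Course description:\n{course_text}"
    )


# The assembled prompt would then be sent to an LLM of choice for extraction.
print(build_extraction_prompt(
    "Students learn to fit and interpret linear regression models on survey data."
))
```

Constraining the prompt to retrieved taxonomy entries is what distinguishes this strategy from unconstrained extraction: the LLM aligns course text to a fixed skill vocabulary rather than generating free-form skill labels.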