Community college students' transcripts are highly varied, which makes it difficult for faculty, administrators, and researchers to discern course-taking patterns and determine which programs of study students are pursuing.
This working paper demonstrates how a clustering algorithm, which allows researchers to group similar items into clusters relying only on a measure of the similarity of those items, can be used to make sense of transcript data.
Drawing on transcript data for first-time college students who entered Washington State community and technical colleges during fall 2005–06, the authors applied the clustering algorithm separately to liberal arts and career-technical students. The resulting clusters roughly corresponded to programs of study, allowing the authors to estimate how many students were undertaking each program and which subjects they were studying.
The authors were also able to examine the demographics and the completion and transfer rates of students within each cluster, showing what types of students pursued each program of study and how successful they were.
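To illustrate clustering that relies only on a similarity measure, the following minimal Python sketch groups students by the overlap in their course sets. It is not the authors' algorithm: the Jaccard similarity, the single-linkage grouping rule, the 0.5 threshold, and the toy transcripts and course codes are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two course sets: shared courses / courses taken by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_similarity(transcripts, threshold=0.5):
    """Group students linked by pairwise similarity >= threshold.

    This is single-linkage clustering implemented with union-find; it is an
    illustrative stand-in, not the method used in the working paper.
    """
    parent = {student: student for student in transcripts}

    def find(student):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[student] != student:
            parent[student] = parent[parent[student]]
            student = parent[student]
        return student

    for s1, s2 in combinations(transcripts, 2):
        if jaccard(transcripts[s1], transcripts[s2]) >= threshold:
            parent[find(s1)] = find(s2)  # merge the two clusters

    clusters = {}
    for student in transcripts:
        clusters.setdefault(find(student), []).append(student)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy transcripts (student id -> set of course codes).
transcripts = {
    "s1": {"ENGL101", "HIST110", "PSYC100"},
    "s2": {"ENGL101", "HIST110", "SOC101"},
    "s3": {"WELD101", "WELD102", "MATH090"},
    "s4": {"WELD101", "WELD102", "WELD110"},
}

clusters = cluster_by_similarity(transcripts, threshold=0.5)
print(clusters)
```

In this toy data the two liberal arts transcripts share two of four distinct courses, as do the two welding transcripts, so each pair forms its own cluster. With real transcripts, the choice of similarity measure and threshold would strongly affect how closely clusters track programs of study.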