🤖 AI Summary
Facing surging application volumes and heightened demands for fairness, this paper proposes an interpretable, fairness-aware university admissions prediction framework. Methodologically, it integrates structured academic and demographic records with unstructured personal statement text, uses GPT-4 to simulate evaluations of that text as additional features, and benchmarks logistic regression, Naive Bayes, random forests, deep neural networks, and a stacked ensemble. Fairness audits and feature-importance visualizations are included so that predictive performance is reported alongside decision transparency. Evaluated on more than 2,000 student records, the framework reaches a peak accuracy of 91.0% and quantifies disparities associated with applicant gender and parental education level. It thus offers auditable fairness monitoring to support equitable admissions decisions.
📝 Abstract
Universities face surging applications and heightened expectations for fairness, making accurate admission prediction increasingly vital. This work presents a comprehensive framework that fuses machine learning, deep learning, and large language model techniques to combine structured academic and demographic variables with unstructured text signals. Drawing on more than 2,000 student records, the study benchmarks logistic regression, Naive Bayes, random forests, deep neural networks, and a stacked ensemble. Logistic regression offers a strong, interpretable baseline at 89.5% accuracy, while the stacked ensemble achieves the best performance at 91.0%, with Naive Bayes and random forests close behind. To probe text integration, GPT-4-simulated evaluations of personal statements are added as features, yielding modest gains but demonstrating feasibility for authentic essays and recommendation letters. Transparency is supported through feature-importance visualizations and fairness audits. The audits reveal a 9-percentage-point gender gap (67% male vs. 76% female) and an 11-percentage-point gap by parental education, underscoring the need for continued monitoring. The framework is interpretable, fairness-aware, and deployable.
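The gender gap reported above can be illustrated with a minimal sketch of a group-rate audit. It assumes the 67% and 76% figures are per-group rates of a positive outcome (the abstract does not say which outcome is measured), and it uses synthetic records built only to reproduce those two rates. The function names `selection_rates` and `parity_gap` and the field names `gender` and `admitted` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def selection_rates(records, group_key, outcome_key="admitted"):
    """Return the share of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        positives[group] += int(bool(row[outcome_key]))
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group rate."""
    return max(rates.values()) - min(rates.values())


# Synthetic toy data (NOT the study's data): 100 records per group,
# constructed so the rates match the 67% and 76% quoted in the abstract.
records = (
    [{"gender": "male", "admitted": 1}] * 67
    + [{"gender": "male", "admitted": 0}] * 33
    + [{"gender": "female", "admitted": 1}] * 76
    + [{"gender": "female", "admitted": 0}] * 24
)

rates = selection_rates(records, "gender")
gap = parity_gap(rates)
print(rates, round(gap, 2))
```

Expressing the gap as a difference of rates makes clear that it is measured in percentage points (76 minus 67 gives 9), not as a relative percentage. The same two functions would apply to parental education by passing a different `group_key`.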