Manually reading, evaluating, and scoring motivation letters as part of the admissions process is a time-consuming and tedious task for Dalarna University's program managers. An automated scoring system would relieve them of this work and allow much faster decisions when selecting applicants for admission. The aim of this thesis was to analyse current human judgment and attempt to emulate it using machine learning techniques. We used various topic modelling methods, such as Latent Dirichlet Allocation and Non-Negative Matrix Factorization, to find the most interpretable topics and to build a bridge between topics and human-defined factors. We then evaluated model performance by predicting score values and measuring accuracy with logistic regression, discriminant analysis, and other classification algorithms. Although we were able to discover the meaning of almost all human factors on our own, the topic models' accuracy in predicting the overall score was unexpectedly low. Setting a threshold on the overall score to select applicants for admission yielded good overall accuracy, but precision and recall were not consistently good. During our investigation, we attempted to determine the possible causes of these unexpected results and found that not only the limitations of topic modelling are to blame, but human bias also plays a role.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:du-37469 |
Date | January 2021 |
Creators | Mercado Salazar, Jorge Anibal, Rana, S M Masud |
Publisher | Högskolan Dalarna, Institutionen för information och teknik |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |