Exploring State-of-the-Art Natural Language Processing Models with Regards to Matching Job Adverts and Resumes

Automating the comparison and matching of resumes with job adverts is a growing research field. This can be done with Natural Language Processing (NLP), an area of machine learning that enables a model to learn human language. This thesis explores and evaluates the application of the state-of-the-art NLP model SBERT to the task of comparing, and calculating a measure of similarity between, text extracted from resumes and adverts. It also investigates what type of data produces the best-performing model on this task. The results show that SBERT can be trained quickly on unlabeled data from the HR domain using a Triplet network, and that it performs well when tested on various tasks. The models are shown to be bilingual, to handle unseen vocabulary, and to capture the concept and descriptive context of entire sentences rather than only single words. The conclusion is therefore that the models have a good grasp of semantic similarity and relatedness. However, in some cases the models also become binary in their similarity calculations between inputs. Moreover, it is hard to tune a model that comprehensively covers a domain as diverse as HR. A model fine-tuned on clean and generic data extracted from adverts shows the best overall performance in terms of loss and consistency.
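
The triplet objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance function, the margin, or how triplets were formed, so the use of cosine distance, the `margin=0.5` default, the function names, and the toy vectors below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss on cosine distance (1 - cosine similarity).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (an assumed value, not from the thesis).
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Made-up 3-dimensional vectors standing in for sentence embeddings.
advert = [1.0, 0.0, 0.0]
matching_resume = [0.9, 0.1, 0.0]
unrelated_resume = [0.0, 1.0, 0.0]

# Well-ordered triplet: the matching resume is already much closer.
loss_good = triplet_loss(advert, matching_resume, unrelated_resume)
# Swapped triplet: the "positive" is far away, so the loss is large.
loss_bad = triplet_loss(advert, unrelated_resume, matching_resume)
```

In an actual SBERT setup the vectors would be sentence embeddings produced by the model, and minimizing this kind of loss during fine-tuning pulls related texts together while pushing unrelated ones apart.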

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-477468
Date: January 2022
Creators: Rückert, Lise; Sjögren, Henry
Publisher: Uppsala universitet, Avdelningen för systemteknik
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: UPTEC STS, 1650-8319 ; 22008
