The Viability of Machine Learning Models Based on Levenstein Distance and Cosine Similarity for Plagiarism Detection in Digital Exams

This paper investigates the viability of two machine learning models for detecting cheating in digital examinations: one based on similarities in text structure and one based on statistical properties of the text. The text-structure model used Levenshtein distance, and the statistical model used the cosine distance between word vectors. The paper also investigates whether security has been a driving force in the industrial dynamics of the digitalization of examinations in Sweden. This is done by applying the multi-level perspective framework and interviewing users of a digital examination platform. The results show that the model based on statistical text properties has higher accuracy, recall, precision and F-score. No conclusion is drawn from this, however, because the validity of the results from the text-structure model is open to question. The analysis of the industrial dynamics shows that security has been a driving force towards digitalization.
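
For readers unfamiliar with the two measures named in the abstract, the following is a minimal sketch of each. It is not the thesis's implementation: the function names are hypothetical, and the "word vectors" here are assumed to be plain word-count vectors, which may differ from the representation the author used. The sketch covers only the two similarity measures, not the machine learning models built on them.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two texts.
    Assumption: whitespace tokenization and lowercasing."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(cosine_similarity("the exam answer", "the answer to the exam"))
```

Cosine distance, the quantity the abstract refers to, is one minus the similarity returned here. Levenshtein distance is sensitive to character order, while the word-count cosine measure ignores word order entirely, which is one way to read the abstract's contrast between text structure and statistical properties.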

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-240398
Date: January 2018
Creators: Anzén, Elizabeth
Publisher: KTH, Skolan för elektroteknik och datavetenskap (EECS)
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: TRITA-EECS-EX ; 2018:441