Return to search

Using Machine Learning to Categorize Documents in a Construction Project

Automation of document handling in the construction industries could save large amounts of time, effort and money and classifying a document is an important step in that automation. In the field of machine learning, lots of research have been done on perfecting the algorithms and techniques, but there are many areas where those techniques could be used that has not yet been studied. In this study I looked at how effectively the machine learning algorithm multinomial Naïve-Bayes would be able to classify 1427 documents split up into 19 different categories from a construction project. The experiment achieved an accuracy of 92.7% and the paper discusses some of the ways that accuracy can be improved. However, data extraction proved to be a bottleneck and only 66% of the original documents could be used for testing the classifier.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-90091
Date January 2019
CreatorsBjörkendal, Nicklas
PublisherLinnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM)
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0024 seconds