
AI Approaches for Classification and Attribute Extraction in Text

As the amount of data online grows, so does the urge to use this data for different applications. Machine learning can be used to reconstruct and validate the data of interest. Although the problem is very domain specific, this report attempts to shed some light on what we call strategies for classification, which, in broad terms, are sets of steps in a process whose end goal is to classify some part of the original data. In doing so, we hope to clarify the classification process both in detail and from a broader perspective. The report investigates two classification objectives: one that depends on many variables found in the input data, and one that is more literal and depends on only one or two variables. Specifically, the data we classify are sales-objects. Each sales-object has a text describing the object and a related image. We attempt to place these sales-objects into the correct product category. We also try to derive each object's year of creation and its dimensions, such as height and width. The strategies present different approaches for classifying these attributes. The results showed that for broader attributes such as product category, supervised learning is indeed an appropriate approach, while the same cannot be said for narrower attributes, which instead had to rely on entity recognition. Experiments combining image analytics with supervised learning showed image analytics to be a good addition when a higher precision score is required.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-67882
Date January 2017
Creators Magnusson, Ludvig; Rovala, Johan
Publisher Linnéuniversitetet, Institutionen för datavetenskap (DV)
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
