
Förbehandling och Hantering av Användarmärkningar på E-handelsartiklar / Preprocessing and Treatment of User Tags on E-commerce Articles

Plick is an online marketplace where users buy and sell second-hand fashion. The platform caters to younger users, and as such borrows many ideas from well-known social network platforms, such as putting more focus on user profiles and expression rather than just on the products themselves. One of these ideas is to give users free rein over tagging their items, rather than having them select from a constrained, pre-approved set of categories, styles, sizes, et cetera. A problem with letting users tag products however they see fit is that a subset of users will inevitably try to 'game' the system by knowingly tagging their products with incorrect labels, which makes search results inaccurate for many of these misused tags.

The aim of this project is first to develop a pre-processing algorithm that normalizes the user-generated tagging data, handling situations such as a tag having multiple different (albeit possibly all correct) spellings, capitalizations, typos, languages, etc. The processed data will then be used to develop two different approaches to the problem of incorrect tagging. The first approach uses the normalized data to create a graph representation of the tags and their relations to each other. Each node in the graph represents an individual tag, and each edge between two nodes describes how closely related those two tags are. An algorithm will then be developed that uses the tag relation graph to describe the relatedness of an arbitrary group of tags. The algorithm should also be able to identify any tags that are outliers within the group. The second approach entails developing a Gaussian naive Bayes classifier, with the goal of identifying whether an article is anomalous or not, given the group of tags it has been assigned.
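The first approach can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the normalization step only lowercases and strips whitespace and a leading '#', the edge weight is assumed to be the Jaccard similarity of tag co-occurrence, and the function names, the 0.1 threshold and the sample articles are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def normalize(tag):
    """Toy normalization (assumption): trim, drop a leading '#', lowercase."""
    return tag.strip().lstrip("#").lower()


def build_graph(articles):
    """Count how often each tag, and each pair of tags, appears on an article."""
    tag_counts, pair_counts = Counter(), Counter()
    for tags in articles:
        unique = sorted({normalize(t) for t in tags if normalize(t)})
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts


def weight(a, b, tag_counts, pair_counts):
    """Edge weight (assumed Jaccard): articles with both tags / with either."""
    if a == b:
        return 1.0
    both = pair_counts[tuple(sorted((a, b)))]
    either = tag_counts[a] + tag_counts[b] - both
    return both / either if either else 0.0


def relatedness(tags, tag_counts, pair_counts):
    """Mean edge weight from each tag to the other tags in the group."""
    group = sorted({normalize(t) for t in tags})
    scores = {}
    for t in group:
        others = [u for u in group if u != t]
        total = sum(weight(t, u, tag_counts, pair_counts) for u in others)
        scores[t] = total / len(others) if others else 0.0
    return scores


def outliers(tags, tag_counts, pair_counts, threshold=0.1):
    """Tags whose mean relatedness to the group is below the threshold."""
    scores = relatedness(tags, tag_counts, pair_counts)
    return [t for t, s in scores.items() if s < threshold]


# Hypothetical sample data: each inner list is the tags on one article.
articles = [
    ["Nike", "sneakers", "shoes"],
    ["#nike", "Sneakers "],
    ["nike", "shoes", "sneakers"],
    ["dress", "vintage", "floral"],
    ["Dress", "vintage"],
    ["vintage", "floral", "dress"],
]
tag_counts, pair_counts = build_graph(articles)
print(outliers(["nike", "sneakers", "shoes", "vintage"], tag_counts, pair_counts))
```

In this toy data "vintage" never co-occurs with the shoe-related tags, so it is the only tag flagged as an outlier for that group. A real pipeline would need a far more careful normalization step (typos, languages) and a threshold chosen from data.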

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-503435
Date: January 2023
Creators: Johansson, Viktor
Publisher: Uppsala universitet, Avdelningen för systemteknik
Source Sets: DiVA Archive at Uppsala University
Language: Swedish
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: UPTEC F, 1401-5757 ; 23024
