Plick is an online platform intended as a marketplace where users can buy and sell second-hand fashion. The platform caters to younger users and therefore borrows many ideas from well-known social network platforms, such as putting more focus on user profiles and self-expression than on the products alone. One of these ideas is to give users free rein over tagging their items, rather than having them select from a constrained, pre-approved set of categories, styles, sizes, et cetera. A problem with letting users tag products however they see fit is that a subset of users will inevitably try to 'game' the system by knowingly tagging their products with incorrect labels, which results in inaccurate search results for many of these tags.

The aim of this project is first to develop a pre-processing algorithm that normalizes the user-generated tagging data, handling situations such as a tag having multiple different (albeit possibly all correct) spellings, capitalizations, typos, languages, etc. The processed data will then be used to develop two different approaches to the problem of incorrect tagging. The first approach uses the normalized data to create a graph representation of the tags and their relations to each other. Each node in the graph represents an individual tag, and each edge between two nodes describes how closely related those two tags are. An algorithm will then be developed that uses the tag relation graph to describe the relatedness of an arbitrary group of tags. The algorithm should also be able to identify any tags that are outliers within the group. The second approach entails the development of a Gaussian naive Bayes classifier, with the goal of identifying whether an article is anomalous, given the group of tags it has been assigned.
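The first approach can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not specify how tags are normalized or how edges are weighted, so the sketch assumes a very simple normalization (lowercasing and whitespace cleanup), assumes Jaccard similarity of tag co-occurrence as the edge weight, and uses hypothetical names (`normalize`, `build_graph`, `weight`, `relatedness`, `outliers`) and a hypothetical threshold of 0.1.

```python
from collections import Counter
from itertools import combinations


def normalize(tag):
    """Assumed minimal normalization: lowercase, trim, collapse whitespace."""
    return " ".join(tag.lower().split())


def build_graph(items):
    """Count how often each tag, and each unordered tag pair, occurs on an item."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in items:
        unique = sorted({normalize(t) for t in tags})
        tag_counts.update(unique)
        # Pairs are stored in sorted order so (a, b) and (b, a) share one key.
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts


def weight(a, b, tag_counts, pair_counts):
    """Assumed edge weight: Jaccard similarity of the item sets carrying a and b."""
    if a == b:
        return 1.0
    key = (a, b) if a < b else (b, a)
    both = pair_counts[key]
    either = tag_counts[a] + tag_counts[b] - both
    return both / either if either else 0.0


def relatedness(tags, tag_counts, pair_counts):
    """Score each tag by its mean edge weight to the other tags in the group."""
    group = sorted({normalize(t) for t in tags})
    scores = {}
    for tag in group:
        others = [o for o in group if o != tag]
        total = sum(weight(tag, o, tag_counts, pair_counts) for o in others)
        scores[tag] = total / len(others) if others else 0.0
    return scores


def outliers(tags, tag_counts, pair_counts, threshold=0.1):
    """Return tags whose mean relatedness to the group is below the threshold."""
    scores = relatedness(tags, tag_counts, pair_counts)
    return [tag for tag, score in scores.items() if score < threshold]


# Made-up example listings, used only to exercise the sketch.
items = [
    ["Nike", "sneakers", "shoes"],
    ["nike ", "Sneakers", "shoes"],
    ["dress", "summer", "floral"],
    ["Dress", "floral"],
    ["nike", "shoes"],
]
tag_counts, pair_counts = build_graph(items)
suspicious = outliers(["Nike", "shoes", "floral"], tag_counts, pair_counts)
```

In this toy data, "nike" and "shoes" always appear together while "floral" never appears with either, so "floral" is the only tag in the queried group that falls below the threshold. A real data set would likely need a more robust normalization step and a threshold tuned on labelled examples.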
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-503435 |
Date | January 2023 |
Creators | Johansson, Viktor |
Publisher | Uppsala universitet, Avdelningen för systemteknik |
Source Sets | DiVA Archive at Uppsala University |
Language | Swedish |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UPTEC F, 1401-5757 ; 23024 |