Recommender systems are used in online sales and e-commerce for recommending potential items/products for customers to buy based on their previous buying preferences and related behaviours. Collaborative filtering is a popular computational technique that has been used worldwide for such personalized recommendations. Among two forms of collaborative filtering, neighbourhood and model-based, the neighbourhood-based collaborative filtering is more popular yet relatively simple. It relies on the concept that a certain item might be of interest to a given customer (active user) if, either he appreciated similar items in the buying space, or if the item is appreciated by similar users (neighbours). To implement this concept different kinds of similarity measures are used. This thesis is set to compare different user-based similarity measures along with defining meaningful measures based on Hellinger distance that is a metric in the space of probability distributions. Data from a popular database MovieLens will be used to show the effectiveness of dierent Hellinger distance-based measures compared to other popular measures such as Pearson correlation (PC), cosine similarity, constrained PC and JMSD. The performance of dierent similarity measures will then be evaluated with the help of mean absolute error, root mean squared error and F-score. From the results, no evidence were found to claim that Hellinger distance-based measures performed better than more popular similarity measures for the given dataset.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-172385 |
Date | January 2020 |
Creators | Goussakov, Roma |
Publisher | Umeå universitet, Statistik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0023 seconds