Scalable Collaborative Filtering Recommendation Algorithms on Apache Spark

Collaborative filtering based recommender systems use information about a user's preferences to make personalized predictions about content, such as topics, people, or products, that the user might find relevant. As the volume of accessible information and the number of active users on the Internet continue to grow, it becomes increasingly difficult to compute recommendations quickly and accurately over a large dataset. In this study, we introduce an algorithmic framework built on top of Apache Spark for parallel computation of the neighborhood-based collaborative filtering problem, which allows the algorithm to scale linearly with a growing number of users. We also investigate several variants of this technique, including user-based and item-based recommendation approaches, correlation-based and vector-based similarity calculations, and selective down-sampling of user interactions. Finally, we provide an experimental comparison of these techniques on the MovieLens dataset consisting of 10 million movie ratings.
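As a rough illustration of the neighborhood-based approach the abstract describes, the following single-machine Python sketch shows a user-based prediction with either a vector-based (cosine) or a correlation-based (Pearson) similarity. It is not the thesis's Spark implementation. The function names, the toy `ratings` data, and the choice of `k` are assumptions made here for illustration only.

```python
from math import sqrt

# Toy data (hypothetical): user -> {item -> rating}
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3, "c": 4},
    "u3": {"a": 1, "b": 5, "c": 2},
}

def cosine_similarity(a, b):
    """Vector-based similarity: dot product over co-rated items,
    normalized by the full rating-vector norms."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def pearson_similarity(a, b):
    """Correlation-based similarity computed over co-rated items only."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(ratings, user, item, similarity, k=2):
    """Similarity-weighted average of the ratings given to `item`
    by the k most similar users with positive similarity."""
    neighbors = [
        (similarity(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    top = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in top) / total
```

Calling `predict(ratings, "u1", "c", cosine_similarity)` estimates how user `u1` would rate item `c`. An item-based variant would apply the same similarity functions to item rating vectors instead of user rating vectors.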

Collaborative filtering

recommendation engine

Artificial Intelligence and Robotics

Software Engineering

Statistical Models

Identifier	oai:union.ndltd.org:CLAREMONT/oai:scholarship.claremont.edu:cmc_theses-1914
Date	01 January 2014
Creators	Casey, Walker Evan
Publisher	Scholarship @ Claremont
Source Sets	Claremont Colleges
Detected Language	English
Type	text
Format	application/pdf
Source	CMC Senior Theses
Rights	© 2014 Walker Evan Casey
