Global ETD Search

1	Scalable Collaborative Filtering Recommendation Algorithms on Apache Spark
Casey, Walker Evan, 01 January 2014
Collaborative filtering based recommender systems use information about a user's preferences to make personalized predictions about content, such as topics, people, or products, that they might find relevant. As the volume of accessible information and active users on the Internet continues to grow, it becomes increasingly difficult to compute recommendations quickly and accurately over a large dataset. In this study, we introduce an algorithmic framework built on top of Apache Spark for parallel computation of the neighborhood-based collaborative filtering problem, which allows the algorithm to scale linearly with a growing number of users. We also investigate several variants of this technique, including user-based and item-based recommendation approaches, correlation and vector-based similarity calculations, and selective down-sampling of user interactions. Finally, we provide an experimental comparison of these techniques on the MovieLens dataset consisting of 10 million movie ratings.
Collaborative filtering; recommendation engine; spark; hadoop; scalability; distributed systems; Artificial Intelligence and Robotics; Software Engineering; Statistical Models
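As an illustration of the neighborhood-based approach named in the abstract above, the following is a minimal single-machine sketch of item-based prediction with cosine (vector-based) similarity. It is not the thesis's Spark framework: the toy `ratings` data, the function names, and the neighbourhood size `k` are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Not taken from MovieLens.
ratings = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0},
    "u3": {"B": 5.0, "C": 1.0},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating} vectors."""
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors[item][user] = rating
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def predict(ratings, user, item, k=2):
    """Similarity-weighted mean of the user's ratings on the k items most similar to `item`."""
    vectors = item_vectors(ratings)
    if item not in vectors:
        return None
    neighbours = sorted(
        (
            (cosine(vectors[item], vectors[other]), rating)
            for other, rating in ratings[user].items()
            if other != item
        ),
        reverse=True,
    )[:k]
    total = sum(similarity for similarity, _ in neighbours)
    if total == 0:
        return None
    return sum(similarity * rating for similarity, rating in neighbours) / total


# Predict how user "u2" might rate item "C", which they have not rated.
print(predict(ratings, "u2", "C"))
```

In a distributed setting the pairwise similarity step is the part that would typically be parallelised; this sketch computes it on demand purely for readability.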
2	Purchase behaviour analysis in the retail industry using Generalized Linear Models / Analys av köpbeteende inom detaljhandeln med hjälp av generaliserade linjära modeller
Karlsson, Sofia, January 2018
This master's thesis uses applied mathematical statistics to analyse purchase behaviour based on customer data from the Swedish brand Indiska. The aim of the study is to build a model that can help predict the sales quantities of different product classes, to identify which factors are the most significant in the different models, and to create an algorithm that suggests product combinations during the purchasing process. Generalized linear models with a negative binomial distribution are applied to obtain the predicted sales quantity. In addition, conditional probability is used in the algorithm, resulting in a product recommendation engine based on the calculated conditional probability that the suggested combinations are purchased. From the findings, it can be concluded that all variables considered in the models (original price, purchase month, colour, cluster, purchase country and channel) are significant for the predicted sales quantity of each product class. Furthermore, by using conditional probability and historical sales data, an algorithm can be constructed that recommends combinations of one or two products that can be bought together with an initial product in which a customer shows interest.
Generalized linear models; Algorithm; Historical transaction; Retail; Fashion; Recommendation engine; Computational Mathematics
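The conditional-probability idea in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's algorithm: the `baskets` data and the `recommend` function are hypothetical, and only single companion products are ranked, whereas the thesis also considers pairs.

```python
from collections import Counter

# Hypothetical historical transactions, each a set of purchased products.
baskets = [
    {"scarf", "dress"},
    {"scarf", "dress", "bag"},
    {"scarf", "bag"},
    {"dress"},
]


def recommend(baskets, product, top_n=2):
    """Rank companion products by the estimated P(companion | product)."""
    with_product = [basket for basket in baskets if product in basket]
    if not with_product:
        return []
    # Count how often each other product appears alongside `product`.
    counts = Counter(
        item for basket in with_product for item in basket if item != product
    )
    n = len(with_product)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [(item, count / n) for item, count in ranked[:top_n]]


# Products most often bought together with a scarf, with estimated probabilities.
print(recommend(baskets, "scarf"))
```

Each probability is the share of baskets containing the initial product that also contain the companion, so the estimate is only as reliable as the number of such baskets.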
3	Rekommendationsmotor: med fokus inom E-lärande / Recommendation engine: focus within E-learning
Jakobsson, Lennart; Nilsson, Thires, January 2018
Studies of recommendation engines have gained importance in an increasingly digital world. As the amount of digital information continues to grow, it has become harder for individuals to find the information that matters to them. Some application domains for recommendation engines are better studied than others; domains that sell or distribute services or items usually fall into this category. Other domains that need recommendation engines but are less well explored include businesses that enable learning over the internet. One such business is Nomp, which provides a mathematics learning tool for children and young teenagers. The goal of this study is therefore to implement a recommendation engine for a business within this less explored domain, and to explore the benefits that a recommendation engine would provide for the application's users. The study is based on a design science research framework, which included various kinds of experiments and a survey. The results of these activities formed the empirical basis for the analysis that followed. The results give some support for the feasibility of implementing a recommendation engine for this domain. However, they do not clearly show how valuable it is for the end user. The objectives of the study were partly met, although the benefit to users could have been explored in greater depth. The hope is that this study will have practical consequences, in that users spend less time searching for useful information.
This study differs from earlier similar studies in that the recommendation engine is implemented to fit the needs of a real business. It is also based on data collected directly from the business's end users. Some similar artefacts have been implemented, but they are often more general or have used data that is not relevant to the domain. It is also more common for similar recommendation engines to use direct user feedback to make recommendations, which is not done in this study.
Recommendation engine; implicit data; collaborative filtering; E-learning; recommendations; Information Systems
4	Sentiment-Driven Topic Analysis Of Song Lyrics
Sharma, Govind, 08 1900
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further sub-divided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons. For an unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs, which are nothing but probability distributions over the vocabulary of words. Some of the topics seem sentiment-based, motivating us to continue with this approach. We evaluate our clusters using a gold dataset collected from an apt website and get positive results. This approach would be useful in the absence of a supervisor dataset. In another part of our work, we argue the inescapable existence of supervision in terms of having to manually analyse the topics returned. Further, we have also used explicit supervision in terms of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We get excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that the results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
Song Lyrics; Non-negative Matrix Factorization (NMF); Music Information Retrieval; Music Recommendation Engine; Support Vector Machine (SVM); Naive Bayes Classifier (NBC); Sentiment Analysis; Emotion Analysis; Latent Dirichlet Allocation (LDA); Sentiment Clustering; Sentiment Classification; k-Nearest Neighbour Classifier (k-NNC); Computer Science
