Global ETD Search

51	Matrix factorization in recommender systems : How sensitive are matrix factorization models to sparsity? Strömqvist, Zakris January 2018 Matrix factorization (MF) models are among the most popular methods in recommender systems. In this paper, the sensitivity of these models to sparsity is investigated using a simulation study. Using the MovieLens dataset as a base, several dense matrices are created. These dense matrices are then made sparse in two different ways to simulate different kinds of data. The accuracy of MF is then measured on each of the simulated sparse matrices. The results show that matrix factorization models are sensitive to the amount of information available. At high levels of sparsity MF performs badly, but as the information level increases the accuracy of the models improves, for both kinds of simulated data. Recommender systems Collaborative filtering Matrix factorization Probability Theory and Statistics
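The simulation design described in entry 51 can be sketched roughly as follows. This is an illustrative sketch, not the thesis's code: it assumes a synthetic rank-2 matrix in place of the MovieLens-derived dense matrices, uniform random removal of entries as the only sparsification scheme, and a small regularized alternating-least-squares factorizer. The names `factorize` and `heldout_rmse` are hypothetical.

```python
import numpy as np

def factorize(R, mask, rank=2, reg=0.1, iters=30, seed=0):
    """Fit R ~ U @ V.T on the observed entries (mask == True) with regularized ALS."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Update each user vector from the items that user has observed.
        for u in range(n_users):
            idx = mask[u]
            if idx.any():
                Vu = V[idx]
                U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ R[u, idx])
        # Update each item vector from the users who observed that item.
        for i in range(n_items):
            idx = mask[:, i]
            if idx.any():
                Ui = U[idx]
                V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ R[idx, i])
    return U, V

def heldout_rmse(R, mask, **kwargs):
    """RMSE of the fitted model on the entries that were removed (mask == False)."""
    U, V = factorize(R, mask, **kwargs)
    err = (U @ V.T - R)[~mask]
    return float(np.sqrt(np.mean(err ** 2)))

# Assumed stand-in for a dense rating matrix: exact rank 2, 40 users x 30 items.
rng = np.random.default_rng(42)
dense = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))

# Keep a fraction of the entries uniformly at random and measure accuracy on the rest.
for keep in (0.1, 0.5, 0.9):
    mask = rng.random(dense.shape) < keep
    print(f"kept {keep:.0%} of entries -> held-out RMSE {heldout_rmse(dense, mask):.3f}")
```

Under these assumptions, the held-out error should shrink as more entries are kept, which mirrors the qualitative finding reported in the abstract.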
52	EFFICIENT LEARNING-BASED RECOMMENDATION ALGORITHMS FOR TOP-N TASKS AND TOP-N WORKERS IN LARGE-SCALE CROWDSOURCING SYSTEMS Safran, Mejdl Sultan 01 May 2018 A pressing need for efficient personalized recommendations has emerged in crowdsourcing systems. On the one hand, workers confront a flood of tasks and often spend too much time finding tasks that match their skills and interests. Thus, workers want effective recommendation of the most suitable tasks with regard to their skills and preferences. On the other hand, requesters sometimes receive low-quality results because a less qualified worker may start working on a task before a better-skilled worker gets to it. Thus, requesters want reliable recommendation of the best workers for their tasks in terms of workers' qualifications and accountability. The task and worker recommendation problems in crowdsourcing systems have unique characteristics that are not present in traditional recommendation scenarios, i.e., the huge flow of tasks with short lifespans, the importance of workers' capabilities, and the quality of the completed tasks. These unique features make traditional recommendation approaches (mostly developed for e-commerce markets) no longer satisfactory for task and worker recommendation in crowdsourcing systems. In this research, we present our insight into the essential difference between the tasks in crowdsourcing systems and the products/items in e-commerce markets, and the difference between buyers' interests in products/items and workers' interests in tasks. This insight leads us to introduce categories as a key mediation mechanism between workers and tasks. We propose a two-tier data representation scheme (defining a worker-category suitability score and a worker-task attractiveness score) to support personalized task and worker recommendation. 
We also extend two optimization methods, namely least mean square error (LMS) and Bayesian personalized ranking (BPR), to better fit the characteristics of task/worker recommendation in crowdsourcing systems. We then integrate the proposed representation scheme and the extended optimization methods with two adapted popular learning models, i.e., matrix factorization and kNN, which results in two lines of top-N recommendation algorithms for crowdsourcing systems: (1) Top-N-Tasks (TNT) recommendation algorithms for discovering the top-N most suitable tasks for a given worker, and (2) Top-N-Workers (TNW) recommendation algorithms for identifying the top-N best workers for a task requester. An extensive experimental study validates the effectiveness and efficiency of a broad spectrum of algorithms, accompanied by our analysis and the insights gained. Collaborative filtering Crowdsourcing systems Item ranking Machine learning Personalization Recommendation algorithms
53	Collaborative filtering approaches for single-domain and cross-domain recommender systems Parimi, Rohit January 1900 Doctor of Philosophy / Computing and Information Sciences / Doina Caragea / Increasing amounts of content on the Web mean that users can select from a wide variety of items (i.e., items that concur with their tastes and requirements). The generation of personalized item suggestions for users has become a crucial functionality for many web applications, as users benefit from being shown only items of potential interest to them. Recommender systems are one popular solution for creating personalized item suggestions. Recommender systems can address the item recommendation task by utilizing past user preferences for items, captured as either explicit or implicit user feedback. Numerous collaborative filtering (CF) approaches have been proposed in the literature to address the recommendation problem in the single-domain setting (user preferences from only one domain are used to recommend items). However, increasingly large datasets often prevent experimentation with every approach in order to choose the one that best fits an application domain. The work in this dissertation on the single-domain setting studies two CF algorithms, Adsorption and Matrix Factorization (MF), considered to be state-of-the-art approaches for implicit feedback, and suggests that characteristics of a domain (e.g., close connections versus loose connections among users) or characteristics of the data available (e.g., density of the feedback matrix) can be useful in selecting the most suitable CF approach for a particular recommendation problem. Furthermore, for Adsorption, a neighborhood-based approach, this work studies several ways to construct user neighborhoods based on similarity functions and on community detection approaches, and suggests that domain and data characteristics can also be useful in selecting the neighborhood approach to use for Adsorption. 
Finally, motivated by the need to decrease the computational costs of recommendation algorithms, this work studies the effectiveness of using short user histories and suggests that short user histories can successfully replace long user histories for recommendation tasks. Although most approaches for recommender systems use user preferences from only one domain, in many applications user interests span items of various types (e.g., artists and tags). Each recommendation problem (e.g., recommending artists to users or recommending tags to users) can be considered a unique domain, and user preferences from several domains can be used to improve accuracy in one domain, an area of research known as cross-domain recommender systems. The work in this dissertation on cross-domain recommender systems investigates several limitations of existing approaches and proposes three novel approaches (two Adsorption-based and one MF-based) to improve recommendation accuracy in one domain by leveraging knowledge from multiple domains with implicit feedback. The first approach performs aggregation of neighborhoods (WAN) from the source and target domains, and the neighborhoods are used with Adsorption to recommend target items. The second approach performs aggregation of target recommendations (WAR) from Adsorption computed using neighborhoods from the source and target domains. The third approach integrates latent user factors from source domains into the target through a regularized latent factor model (CIMF). Experimental results on six target recommendation tasks from two real-world applications suggest that the proposed approaches effectively improve target recommendation accuracy compared to single-domain CF approaches and successfully utilize varying amounts of user overlap between source and target domains. 
Furthermore, under the assumption that tuning may not be possible for large recommendation problems, this work proposes an approach to calculate knowledge aggregation weights based on network alignment for the WAN and WAR approaches, and the results show the usefulness of the proposed solution. The results also suggest that the WAN and WAR approaches effectively address the cold-start user problem in the target domain. Recommender systems Collaborative filtering Implicit feedback Cross-domain Adsorption Matrix factorization Computer Science (0984)
54	Leerec : A scalable product recommendation engine suitable for transaction data. Flodin, Anton January 2018 We are currently living in the Internet of Things (IoT) era, which involves devices that are connected to the Internet and communicate with each other. Each year the number of devices increases rapidly, which results in rapid growth of the data that is generated. This large amount of data is sometimes referred to as Big Data, and it is generated from different sources, such as log data of user behavior. These log files can be collected and analyzed in different ways, for example to create product recommendations. Product recommendations have been around since the late 90s, when the amount of data collected was not at the same level as it is today. The aim of this thesis has been to investigate methods to process data and create product recommendations, to see how well they are adapted for Big Data. This has been accomplished through three theoretical studies: how to process user events, how to make the product recommendation algorithm called collaborative filtering scalable, and how to convert implicit feedback to explicit feedback (ratings). This resulted in a recommendation engine with Apache Spark as the data processing system, which had three functions: reading multiple log files and concatenating the log files for each month, parsing the log files of user events to create explicit ratings from the transactions, and creating four types of recommendations. The NoSQL database MongoDB was chosen to store the different types of product recommendations that were created. To retrieve the recommendations from the recommendation engine and the database, a REST API was implemented which can be used by any third party. What can be concluded from the results of this thesis work is that the implemented system is partially scalable. 
This means that Apache Spark was scalable for concatenating files, parsing them to create ratings, and creating the recommendations using the ALS method. However, MongoDB was shown not to scale when handling more than 100 concurrent requests. Future work involves distributing the recommendation engine over a multi-node cluster to utilize the parallelization of Apache Spark. Other recommendations include considering other NoSQL databases that might be more scalable than MongoDB. Collaborative filtering log processing event Alternating Least Squares Computer Systems
55	Metodologia de segmentação de mídia social / Methodology of social media segmentation Luiz Wanderley Tavares 06 October 2017 The first Internet social media emerged just over two decades ago, in the second half of the 1990s. Compared to human evolution, this would be something like a millisecond of its existence. In this period, several studies have tried to understand the behavior and grouping of human beings in this new form of communication. Theories about ways of analyzing people in this environment, and about how they group themselves and create new ways of communicating and propagating their ideas, flourish and illuminate this unknown pathway to be created and traveled. Methods of identifying human behavior created before social media gain a new way of being used. Studies on the "self" (Belk, 1988), tribalism (Cova, B., 1997), ethnography (Danzig, 1985), netnography (Kozinets, 1998) and collaborative filtering (Goldberg, Nichols, Oki and Terry, 1992) come on the scene to shed light on the study of human relations in the digital world. The Internet has revolutionized the way people interact, and the constant evolution of technology keeps generating profound implications for marketing. The worldwide network has become a global channel through which companies can advertise and sell their products. However, while offering tremendous potential to businesses, the Internet has increased the complexity of identifying customers. Users present in social media are less interested in products and place more value on the identities and social ties generated around their subjects of interest. These electronic tribes transcend geographical borders and are independent of the race, sex and cultural aspects of their members. This work presents a method to identify tribes in social media. The method was applied to identify the MMA (Mixed Martial Arts) tribe on Twitter. The validation was done using the Twitter ads platform, running an advertisement for 72 hours to more than 600 thousand users, divided into a control group, the segmentation created by Twitter, and the segmentation produced by the proposed method, DNA. The study compared the results obtained by the proposed DNA method with those of the control group and of the segmentation created by Twitter. The results showed an increase in interactions among the users identified as belonging to the MMA tribe, validating the method. Collaborative filtering Netnography Social media Tribes Twitter
56	The Use of Items Personality Profiles in Recommender Systems Alharthi, Haifa January 2015 Due to the growth of online shopping and services, various types of products can be recommended to an individual. After reviewing the current methods for cross-domain recommendations, we believe that there is a need to make different types of recommendations by relying on a common base, and that it is better to depend on a target customer’s information when building the base, because the customer is the one common element in all the purchases. Therefore, we suggest a recommender system (RS) that develops a personality profile for each product and represents items by an aggregated vector of the personality features of the people who have liked the items. We investigate two ways to build personality profiles for items (IPPs). The first, average-based IPPs, represents each item with five attributes that reflect the average Big Five Personality values of the users who like it. The second, proportion-based IPPs, consists of 15 attributes that aggregate the number of fans who have high, average and low Big Five values. The system functions like an item-based collaborative filtering recommender; that is, it recommends items similar to those the user liked. Our system demonstrates the highest recommendation quality in providing cross-domain recommendations, compared to traditional item-based collaborative filtering systems and content-based recommenders. Cross-domain recommendations Big Five Personality Traits Recommender Systems Collaborative filtering recommender
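The two item personality profile (IPP) constructions in entry 56 can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: Big Five scores on a 1 to 5 scale, fixed cut-offs of 2.5 and 3.5 for "low" and "high", fan counts normalized to proportions, and cosine similarity for the item-to-item comparison. All function names are hypothetical.

```python
import numpy as np

def average_ipp(user_traits, fans):
    """Average-based IPP: mean Big Five vector (5 values) of the users who like the item."""
    return user_traits[fans].mean(axis=0)

def proportion_ipp(user_traits, fans, low=2.5, high=3.5):
    """Proportion-based IPP: share of fans with high, average and low values per trait (15 values).

    The cut-offs `low` and `high` are assumed for illustration only.
    """
    traits = user_traits[fans]
    n_fans = len(fans)
    high_share = (traits > high).sum(axis=0) / n_fans
    low_share = (traits < low).sum(axis=0) / n_fans
    avg_share = 1.0 - high_share - low_share
    return np.concatenate([high_share, avg_share, low_share])

def cosine(a, b):
    """Cosine similarity between two item profiles, as in item-based filtering."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rows are users, columns are the five traits (toy data).
user_traits = np.array([
    [5.0, 3.0, 1.0, 4.0, 2.0],
    [3.0, 3.0, 3.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
])
book_fans = [0, 1]    # indices of users who liked a book
movie_fans = [1, 2]   # indices of users who liked a movie

book_profile = average_ipp(user_traits, book_fans)
movie_profile = average_ipp(user_traits, movie_fans)
print(book_profile, proportion_ipp(user_traits, book_fans))
print(cosine(book_profile, movie_profile))
```

Because both profiles live in the same personality space, items from different domains (here a book and a movie) can be compared directly, which is the property the abstract relies on for cross-domain recommendation.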
57	Algorithmes d'apprentissage pour les grandes masses de données : Application à la classification multi-classes et à l'optimisation distribuée asynchrone / Scalable algorithms for large-scale machine learning problems : Application to multiclass classification and asynchronous distributed optimization Joshi, Bikash 26 September 2017 This thesis focuses on developing scalable algorithms for large-scale machine learning. In this work, we present two perspectives on handling large data. First, we consider the problem of large-scale multiclass classification. We introduce the task of multiclass classification and the challenge of classifying with a large number of classes. To alleviate these challenges, we propose an algorithm which reduces the original multiclass problem to an equivalent binary one, which we then subsample drastically. Based on this reduction technique, we introduce a scalable method to tackle the multiclass classification problem for a very large number of classes and perform detailed theoretical and empirical analyses. In the second part, we discuss the problem of distributed machine learning. 
In this domain, we introduce an asynchronous framework for performing distributed optimization. We apply the proposed asynchronous framework to two popular domains: matrix factorization for large-scale recommender systems and large-scale binary classification. For matrix factorization, we perform Stochastic Gradient Descent (SGD) in an asynchronous distributed manner. For large-scale binary classification, we use SVRG, a variant of SGD that applies a variance reduction technique, as our optimization algorithm. Machine learning Collaborative filtering Distributed Framework 004
58	Hybrid Recommender Systems via Spectral Learning and a Random Forest Williams, Alyssa 01 December 2019 We demonstrate that spectral learning can be combined with a random forest classifier to produce a hybrid recommender system capable of incorporating meta information. Spectral learning is supervised learning in which the data is in the form of one or more networks. Responses are predicted from features obtained from the eigenvector decomposition of matrix representations of the networks. Spectral learning is based on the highest-weight eigenvectors of natural Markov chain representations. A random forest is an ensemble technique for supervised learning whose internal predictive model can be interpreted as a nearest-neighbor network. A hybrid recommender can be constructed by first deriving a network model from a recommender's similarity matrix, then applying spectral learning techniques to produce a new network model. The response learned by the new version of the recommender can be meta information. This leads to a system capable of incorporating metadata into recommendations. similarity learning collaborative filtering nearest neighbors Databases and Information Systems Other Mathematics Theory and Algorithms
59	Implementation and Evaluation of a Recommender System Based on the Slope One and the Weighted Slope One Algorithm Ye, Brian, Tieu, Benny January 2015 Recommender systems are used on many websites today; they are mechanisms intended to give accurate, personalized recommendations of items to a set of different users. An item can, for example, be a movie on Netflix. The purpose of this paper is to implement an algorithm that fulfills five stated goals. The goals are as follows: the algorithm should be easy to implement, be efficient at query time, give accurate recommendations, place few demands on users, and not require comprehensive changes when it is modified. Slope One is a simplified version of linear regression and can be used to recommend items. Using the Netflix Prize data set from 2009 and the Root-Mean-Square Error (RMSE) as the evaluation metric, Slope One achieves an RMSE of 1.007. Weighted Slope One, which takes the relevance of items into account, achieves an RMSE of 0.990. Weighted Slope One can be added to the Slope One implementation without changing the fundamentals of the Slope One algorithm. Generating a movie recommendation is nearly instantaneous with both regular Slope One and Weighted Slope One; however, the mechanism requires a precomputation stage. To receive a recommendation from the implementation in this paper, the user must have rated at least two items. Slope One Recommender System Collaborative-filtering Netflix Prize Computer Sciences
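For readers unfamiliar with the two algorithms named in entry 59, the following is a minimal sketch of the textbook Slope One and Weighted Slope One predictors, including the precomputation stage the abstract mentions. It is not the authors' implementation; the toy ratings and the function names `precompute` and `predict` are illustrative assumptions.

```python
from collections import defaultdict

def precompute(ratings):
    """Precompute average rating deviations dev[(j, i)] and co-rating counts for item pairs."""
    diff = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for j, rating_j in user_ratings.items():
            for i, rating_i in user_ratings.items():
                if i != j:
                    diff[(j, i)] += rating_j - rating_i
                    count[(j, i)] += 1
    dev = {pair: diff[pair] / count[pair] for pair in diff}
    return dev, count

def predict(user_ratings, target, dev, count, weighted=False):
    """Predict the user's rating for `target` from the items the user has already rated.

    With weighted=True, each item's contribution is weighted by the number of
    users who rated both that item and the target (Weighted Slope One).
    """
    numerator = denominator = 0.0
    for i, rating_i in user_ratings.items():
        if i == target or (target, i) not in dev:
            continue
        weight = count[(target, i)] if weighted else 1
        numerator += (dev[(target, i)] + rating_i) * weight
        denominator += weight
    return numerator / denominator if denominator else None

# Toy data: three users, three items.
ratings = {
    "John": {"A": 5, "B": 3, "C": 2},
    "Mark": {"A": 3, "B": 4},
    "Lucy": {"B": 2, "C": 5},
}
dev, count = precompute(ratings)
plain = predict(ratings["Lucy"], "A", dev, count)
weighted = predict(ratings["Lucy"], "A", dev, count, weighted=True)
print(plain, weighted)
```

The only difference between the two predictors is the `weight` term, which is consistent with the abstract's remark that the weighted variant does not require changing the fundamentals of Slope One.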
60	Recommending new items to customers : A comparison between Collaborative Filtering and Association Rule Mining / Rekommendera nya produkter till kunder : En jämförelsestudie mellan Collaborative Filtering och Association Rule Mining Sohlberg, Henrik January 2015 E-commerce is an ever-growing industry as the internet infrastructure continues to evolve. A recommendation system offers several benefits to any online retail store: it can help customers find what they need, and it can increase sales by enabling accurately targeted promotions. Among the many techniques that can form recommendation systems, this thesis compares Collaborative Filtering against Association Rule Mining, both implemented in combination with clustering. The suggested implementations are designed with the cold start problem in mind and are evaluated with a data set from an online retail store which sells clothing. The results indicate that Collaborative Filtering is the preferable technique, while association rules may still offer business value to stakeholders. However, the strength of the results is undermined by the fact that only a single data set was used. Recommendation system Association rule mining Collaborative filtering Cold start Computer Sciences
