Global ETD Search

161	Papyres : un système de gestion et de recommandation d’articles de recherche Naak, Amine 07 1900 Les étudiants gradués et les professeurs (les chercheurs, en général), accèdent, passent en revue et utilisent régulièrement un grand nombre d’articles, cependant aucun des outils et solutions existants ne fournit la vaste gamme de fonctionnalités exigées pour gérer correctement ces ressources. En effet, les systèmes de gestion de bibliographie gèrent les références et les citations, mais ne parviennent pas à aider les chercheurs à manipuler et à localiser des ressources. D'autre part, les systèmes de recommandation d’articles de recherche et les moteurs de recherche spécialisés aident les chercheurs à localiser de nouvelles ressources, mais là encore échouent dans l’aide à les gérer. Finalement, les systèmes de gestion de contenu d'entreprise offrent les fonctionnalités de gestion de documents et des connaissances, mais ne sont pas conçus pour les articles de recherche. Dans ce mémoire, nous présentons une nouvelle classe de systèmes de gestion : système de gestion et de recommandation d’articles de recherche. Papyres (Naak, Hage, & Aïmeur, 2008, 2009) est un prototype qui l’illustre. Il combine des fonctionnalités de bibliographie avec des techniques de recommandation d’articles et des outils de gestion de contenu, afin de fournir un ensemble de fonctionnalités pour localiser les articles de recherche, manipuler et maintenir les bibliographies. De plus, il permet de gérer et partager les connaissances relatives à la littérature. La technique de recommandation utilisée dans Papyres est originale. Sa particularité réside dans l'aspect multicritère introduit dans le processus de filtrage collaboratif, permettant ainsi aux chercheurs d'indiquer leur intérêt pour des parties spécifiques des articles. 
De plus, nous proposons de tester et de comparer plusieurs approches afin de déterminer le voisinage dans le processus de Filtrage Collaboratif Multicritère, de telle sorte à accroître la précision de la recommandation. Enfin, nous ferons un rapport global sur la mise en œuvre et la validation de Papyres. / Graduate students and professors (researchers, in general) regularly access, review, and use large amounts of research papers, yet none of the existing tools and solutions provides the wide range of functionalities required to properly manage these resources. Indeed, bibliography management systems manage the references and citations but fail to help researchers in handling and locating resources. On the other hand, research paper recommendation systems and specialized search engines help researchers to locate new resources, but again fail to help researchers in managing the resources. Finally, Enterprise Content Management systems offer the required functionalities to manage resources and knowledge, but are not designed for research literature. Consequently, we suggest a new class of management systems: Research Paper Management and Recommendation System. Through our system Papyres (Naak, Hage, & Aïmeur, 2008, 2009) we illustrate our approach, which combines bibliography functionalities along with recommendation techniques and content management tools, in order to provide a set of functionalities to locate research papers, handle and maintain the bibliographies, and to manage and share knowledge related to the research literature. Additionally, we propose a novel research paper recommendation technique, used within Papyres. Its uniqueness lies in the multicriteria aspect introduced in the process of collaborative filtering, allowing researchers to indicate their interest in specific parts of articles. 
Moreover, we propose to test and compare several approaches for determining the neighbourhood in the Multicriteria Collaborative Filtering process, so as to increase the accuracy of the recommendations. Finally, we report on the implementation and validation of Papyres.
Gestion d’articles de recherche; Gestion de Références; Gestion de Contenu d’Entreprise; Filtrage Collaboratif Multicritère; Research Paper Management; Reference Management; Enterprise Content Management; Research Paper Recommendation; Multicriteria Collaborative Filtering; Recommendation Systems’ classification
162	Probabilistic and Bayesian nonparametric approaches for recommender systems and networks / Approches probabilistes et bayésiennes non paramétriques pour les systemes de recommandation et les réseaux Todeschini, Adrien 10 November 2016 Nous proposons deux nouvelles approches pour les systèmes de recommandation et les réseaux. Dans la première partie, nous donnons d’abord un aperçu sur les systèmes de recommandation avant de nous concentrer sur les approches de rang faible pour la complétion de matrice. En nous appuyant sur une approche probabiliste, nous proposons de nouvelles fonctions de pénalité sur les valeurs singulières de la matrice de rang faible. En exploitant une représentation de modèle de mélange de cette pénalité, nous montrons qu’un ensemble de variables latentes convenablement choisi permet de développer un algorithme espérance-maximisation afin d’obtenir un maximum a posteriori de la matrice de rang faible complétée. L’algorithme résultant est un algorithme à seuillage doux itératif qui adapte de manière itérative les coefficients de réduction associés aux valeurs singulières. L’algorithme est simple à mettre en œuvre et peut s’adapter à de grandes matrices. Nous fournissons des comparaisons numériques entre notre approche et de récentes alternatives montrant l’intérêt de l’approche proposée pour la complétion de matrice à rang faible. Dans la deuxième partie, nous présentons d’abord quelques prérequis sur l’approche bayésienne non paramétrique et en particulier sur les mesures complètement aléatoires et leur extension multivariée, les mesures complètement aléatoires composées. Nous proposons ensuite un nouveau modèle statistique pour les réseaux creux qui se structurent en communautés avec chevauchement. Le modèle est basé sur la représentation du graphe comme un processus ponctuel échangeable, et généralise naturellement des modèles probabilistes existants à structure en blocs avec chevauchement au régime creux. 
Notre construction s’appuie sur des vecteurs de mesures complètement aléatoires, et possède des paramètres interprétables, chaque nœud étant associé à un vecteur représentant son niveau d’affiliation à certaines communautés latentes. Nous développons des méthodes pour simuler cette classe de graphes aléatoires, ainsi que pour effectuer l’inférence a posteriori. Nous montrons que l’approche proposée peut récupérer une structure interprétable à partir de deux réseaux du monde réel et peut gérer des graphes avec des milliers de nœuds et des dizaines de milliers de connexions. / We propose two novel approaches for recommender systems and networks. In the first part, we give an overview of recommender systems and then concentrate on low-rank approaches to matrix completion. Building on a probabilistic approach, we propose novel penalty functions on the singular values of the low-rank matrix. By exploiting a mixture model representation of this penalty, we show that a suitably chosen set of latent variables makes it possible to derive an expectation-maximization algorithm that yields a maximum a posteriori estimate of the completed low-rank matrix. The resulting algorithm is an iterative soft-thresholding algorithm that iteratively adapts the shrinkage coefficients associated with the singular values. The algorithm is simple to implement and can scale to large matrices. We provide numerical comparisons between our approach and recent alternatives, demonstrating the value of the proposed approach for low-rank matrix completion. In the second part, we first introduce some background on Bayesian nonparametrics, in particular on completely random measures (CRMs) and their multivariate extension, compound CRMs. We then propose a novel statistical model for sparse networks with overlapping community structure. 
The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block structure to the sparse regime. Our construction builds on vectors of CRMs and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities. We develop methods for simulating this class of random graphs, as well as for performing posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
Systèmes de recommandation; Filtrage collaboratif; Complétion de matrice de rang faible; Modèles probabilistes; Espérance-maximisation; Réseaux; Parcimonie; Comportement en loi de puissance; Structure en communautés; Mesures complètement aléatoires; Monte Carlo par chaîne de Markov; Graphes; Recommender systems; Collaborative filtering; Low-rank matrix completion; Probabilistic models; Expectation maximization; Networks; Graphs; Sparsity; Power-law behavior; Community structure; Bayesian nonparametrics; Completely random measures; Markov chain Monte Carlo
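The soft-thresholding step mentioned in the abstract of record 162 can be illustrated in isolation. The sketch below only shows the shrinkage operator applied to a list of singular values, once with a single shared threshold and once with one threshold per value; the function name and the numbers are assumptions made for illustration, and the rule by which the thesis updates its shrinkage coefficients is not reproduced here.

```python
def soft_threshold(singular_values, thresholds):
    """Shrink each singular value toward zero by its own threshold.

    Values that fall below their threshold are set to zero, which is
    what lowers the rank of the reconstructed matrix.
    """
    if len(singular_values) != len(thresholds):
        raise ValueError("one threshold per singular value is required")
    return [max(s - t, 0.0) for s, t in zip(singular_values, thresholds)]


# Hypothetical singular values of a partially observed matrix.
values = [5.0, 2.0, 0.5]

# Same threshold for every value (plain soft-thresholding).
uniform = soft_threshold(values, [1.0, 1.0, 1.0])  # -> [4.0, 1.0, 0.0]

# Illustrative per-value thresholds: large values are shrunk less,
# small ones more. These weights are made up, not taken from the thesis.
adaptive = soft_threshold(values, [0.2, 0.5, 2.0])
```

In a full matrix-completion loop this operator would be applied to the singular values of the current filled-in matrix at every iteration; that outer loop is omitted to keep the sketch short.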
163	A Hybrid Approach to Music Recommendation: Exploiting Collaborative Music Tags and Acoustic Features Kaufman, Jaime C. 01 January 2014 Recommendation systems make it easier for an individual to navigate large datasets by recommending information relevant to the user. Companies such as Facebook, LinkedIn, Twitter, Netflix, Amazon, Pandora, and others use such systems to increase revenue by providing personalized recommendations. Recommendation systems generally use one of two techniques: collaborative filtering (i.e., collective intelligence) or content-based filtering. Systems using collaborative filtering recommend items based on a community of users, their preferences, and their browsing or shopping behavior. Examples include Netflix, Amazon shopping, and Last.fm. This approach has proven effective and has grown in popularity, and its accuracy improves as its pool of users expands. However, its weakness is the Cold Start problem: it is difficult to recommend items that are either brand new or have no user activity. Systems that use content-based filtering recommend items based on information extracted from the actual content. A popular example of this approach is Pandora Internet Radio. This approach overcomes the Cold Start problem. However, its main drawback is its heavy demand on computational power. Also, the semantic meaning of an item may not be taken into account when producing recommendations. In this thesis, a hybrid approach is proposed that combines the strengths of both collaborative and content-based filtering techniques. As a proof of concept, a hybrid music recommendation system was developed and evaluated by users. The results show that this system effectively tackles the Cold Start problem and provides more variety in what is recommended. 
Thesis; University of North Florida; UNF; music; collaborative filtering; collective intelligence; content-based filtering; MFCC; recommend; information retrieval; database; Pandora; Last.fm; music recommendation system; UNF; Databases and Information Systems; Software Engineering; Theory and Algorithms
164	Data-based Therapy Recommender Systems Gräßer, Felix Magnus 10 November 2021 Für viele Krankheitsbilder und Indikationen ist ein breites Spektrum an Arzneimitteln und Arzneimittelkombinationen verfügbar. Darüber hinaus stellen Therapieziele oft Kompromisse zwischen medizinischen Zielstellungen und Präferenzen und Erwartungen von Patienten dar, um Zufriedenheit und Adhärenz zu gewährleisten. Die Auswahl der optimalen Therapieoption kann daher eine große Herausforderung für den behandelnden Arzt darstellen. Klinische Entscheidungsunterstützungssysteme, die Wirksamkeit oder Risiken unerwünschter Arzneimittelwirkung für Behandlungsoptionen vorhersagen, können diesen Entscheidungsprozess unterstützen und Leitlinien-basierte Empfehlungen ergänzen, wenn Leitlinien oder wissenschaftliche Literatur fehlen oder ungeeignet sind. Bis heute sind keine derartigen Systeme verfügbar. Im Rahmen dieser Arbeit wird die Anwendung von Methoden aus der Domäne der Recommender Systems (RS) und des Maschinellen Lernens (ML) in solchen Unterstützungssystemen untersucht. Aufgrund ihres erfolgreichen Einsatzes in anderen Empfehlungssystemen und der einfachen Interpretierbarkeit werden zum einen Nachbarschafts-basierte Collaborative Filter (CF) an die besonderen Anforderungen und Herausforderungen der Therapieempfehlung angepasst. Zum anderen werden ein Modell-basierter CF-Ansatz (SLIM) und ein ML-Algorithmus (GBM) erprobt. Alle genannten Ansätze werden anhand eines exemplarischen Therapieempfehlungssystems evaluiert, das auf die Behandlung der Autoimmunkrankheit Psoriasis abzielt. Um das Risiko der Empfehlung kontraindizierter oder gar gesundheitsgefährdender Medikamente zu reduzieren, werden Regeln aus evidenzbasierten Leitlinien und Expertenempfehlungen implementiert, um solche Therapieoptionen aus den Empfehlungslisten herauszufiltern. 
Insbesondere die Nachbarschafts-basierten CF-Algorithmen zeigen insgesamt kleine durchschnittliche Abweichungen zwischen geschätztem und tatsächlichem Therapie-Outcome. Auch die aus den Outcome-Schätzungen abgeleiteten Empfehlungen zeigen eine hohe Übereinstimmung mit der tatsächlich angewandten Behandlung. Die Modell-basierten Ansätze sind den Nachbarschafts-basierten Ansätzen insgesamt unterlegen, was auf den begrenzten Umfang der verfügbaren Trainingsdaten zurückzuführen ist und die Generalisierungsfähigkeit der Modelle erschwert. Im Vergleich mit menschlichen Experten sind alle untersuchten Algorithmen jedoch hinsichtlich Übereinstimmung mit der tatsächlich angewandten Therapie unterlegen. Eine objektive und effiziente Bewertung des Behandlungserfolgs kann als Voraussetzung für ein erfolgreiches „Krankheitsmanagement“ angesehen werden. Daher wird in weiteren Untersuchungen für ausgewählte klinische Anwendungen der Einsatz von ML-Methoden zur automatischen Quantifizierung von Gesundheitszustand und Therapie-Outcome erprobt. Zusätzlich, als weitere Quelle für Informationen über Therapiewirksamkeiten, wird der Einsatz von Sentiment-Analysis-Methoden zur Extraktion solcher Informationen aus Medikamenten-Bewertungen untersucht. / For most medical conditions and indications, a great variety of pharmaceutical drugs and drug combinations is available. Beyond that, trade-offs need to be found between the medical requirements and the patients' preferences and expectations in order to support patients’ satisfaction and adherence to treatments. As a consequence, the selection of an optimal therapy option for an individual patient poses a challenging task to prescribers. Clinical Decision Support Systems (CDSSs), which predict outcomes such as effectiveness and the risk of adverse effects for available treatment options, can support this decision-making process and complement guideline-based decision-making where evidence from scientific literature is missing or inappropriate. 
To date, no such systems are available. Within this work, the application of methods from the Recommender Systems (RS) domain and Machine Learning (ML) in such decision support systems is studied. Due to their successful application in other recommender systems and good interpretability, neighborhood-based CF algorithms are transferred to the medical domain and are adapted to meet the requirements and challenges of the therapy recommendation task. Moreover, a model-based CF method (SLIM) and a state-of-the-art ML algorithm (GBM) are employed. All algorithms are evaluated in an exemplary therapy recommender system, targeting the treatment of the autoimmune skin disease Psoriasis. In order to reduce the risk of recommending contraindicated or even health-endangering drugs, rules derived from evidence-based guidelines and expert recommendations are implemented to filter such options from the recommendation lists. The neighborhood-based CF algorithms in particular show small average errors between estimated and observed outcome. Also, the recommendations derived from outcome estimates show high agreement with the ground truth. The performance of both model-based approaches is inferior to the neighborhood-based recommender. This is primarily assumed to be due to the limited training data sizes, which make it difficult for the learned models to generalize. Compared with recommendations provided by various experts, all proposed approaches are, however, inferior in terms of agreement with the ground truth. An objective and efficient assessment of treatment response can be regarded as a prerequisite for successful "disease management". Therefore, the use of ML methods for the automatic quantification of health status and therapy outcome for selected clinical applications is investigated in further experiments. 
Moreover, as an additional source of information about drug effectiveness, the use of Sentiment Analysis to extract such information from drug reviews is investigated.
info:eu-repo/classification/ddc/610 ddc:610 info:eu-repo/classification/ddc/615 ddc:615 info:eu-repo/classification/ddc/006 ddc:006
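As a rough illustration of the neighborhood-based idea described in record 164, the sketch below estimates the outcome of a therapy for a new patient as a similarity-weighted average over the most similar previous patients, and then drops contraindicated options before ranking. All names, similarity scores, outcome values and the contraindication set are hypothetical; the thesis's actual similarity measure, outcome scale and rule base are not shown here.

```python
def predict_outcome(neighbors, therapy, k=2):
    """Similarity-weighted mean outcome over the k most similar patients.

    neighbors: list of (similarity, {therapy: observed_outcome}) pairs.
    Returns None when no similar patient has received the therapy.
    """
    rated = [(sim, outcomes[therapy]) for sim, outcomes in neighbors
             if sim > 0 and therapy in outcomes]
    top = sorted(rated, key=lambda pair: pair[0], reverse=True)[:k]
    if not top:
        return None
    weight_sum = sum(sim for sim, _ in top)
    return sum(sim * outcome for sim, outcome in top) / weight_sum


def recommend(neighbors, therapies, contraindicated, k=2):
    """Rank therapies by estimated outcome, skipping contraindicated ones."""
    scored = []
    for therapy in therapies:
        if therapy in contraindicated:
            continue  # rule-based filter applied before ranking
        estimate = predict_outcome(neighbors, therapy, k)
        if estimate is not None:
            scored.append((therapy, estimate))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: similarity to the new patient and past outcomes in [0, 1].
neighbors = [
    (0.9, {"therapy_a": 0.8, "therapy_b": 0.4}),
    (0.6, {"therapy_a": 0.5, "therapy_c": 0.9}),
    (0.1, {"therapy_b": 0.9}),
]
ranking = recommend(neighbors, ["therapy_a", "therapy_b", "therapy_c"],
                    contraindicated={"therapy_c"})
```

Here `therapy_c` would have the highest estimate but is removed by the filter, so the ranking contains only `therapy_a` and `therapy_b`.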
165	Recommender System for Gym Customers Sundaramurthy, Roshni January 2020 Recommender systems provide new opportunities for retrieving personalized information on the Internet. Due to the availability of big data, the fitness industry is now focusing on building efficient recommender systems for its end users. This thesis investigates the possibilities of building an efficient recommender system for gym users. BRP Systems AB provided the gym data for evaluation, which consists of approximately 896,000 customer interactions with 8 features. Four matrix factorization methods based on implicit feedback are applied to the data: latent semantic analysis using singular value decomposition, alternating least squares, Bayesian personalized ranking, and logistic matrix factorization. These methods decompose the implicit data matrix of user-gym group activity interactions into the product of two lower-dimensional matrices. They are used to calculate similarities between users and activities, and the top-k recommendations are provided based on the resulting scores. The methods are evaluated with ranking metrics such as Precision@k, mean average precision (MAP)@k, area under the curve (AUC) score, and normalized discounted cumulative gain (NDCG)@k. A qualitative analysis is also performed to evaluate the recommendations. For this specific dataset, the best method is found to be alternating least squares, which achieved around 90% AUC for the overall system and managed to give personalized recommendations to the users. 
Recommender system; collaborative filtering; matrix factorization; sparse matrix; latent semantic analysis; singular value decomposition; alternating least square; Bayesian personalized ranking; logistic matrix factorization; stochastic gradient descent; AUC metric; mean average precision; normalized discounted cumulative gain; Rekommendationssystem; Computer Sciences; Datavetenskap (datalogi); Probability Theory and Statistics; Sannolikhetsteori och statistik
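Two of the ranking metrics named in record 165, Precision@k and NDCG@k, can be sketched for a single user with binary relevance as follows. The activity names are invented for illustration, and the thesis may use a different relevance definition or averaging scheme across users.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:k])
              if item in relevant)
    # The ideal ranking places all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical top-4 list for one gym user and the activities they attended.
recommended = ["yoga", "spinning", "boxing", "pilates"]
attended = {"yoga", "pilates"}

p_at_3 = precision_at_k(recommended, attended, 3)  # one hit in the top 3
ndcg_at_4 = ndcg_at_k(recommended, attended, 4)    # penalizes the late hit
```

Precision@k ignores where in the top-k a hit occurs, whereas NDCG@k rewards hits near the top, which is why both are commonly reported together.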
166	Von Mises-Fisher based (co-)clustering for high-dimensional sparse data : application to text and collaborative filtering data / Modèles de mélange de von Mises-Fisher pour la classification simple et croisée de données éparses de grande dimension Salah, Aghiles 21 November 2016 La classification automatique, qui consiste à regrouper des objets similaires au sein de groupes, également appelés classes ou clusters, est sans aucun doute l’une des méthodes d’apprentissage non-supervisé les plus utiles dans le contexte du Big Data. En effet, avec l’expansion des volumes de données disponibles, notamment sur le web, la classification ne cesse de gagner en importance dans le domaine de la science des données pour la réalisation de différentes tâches, telles que le résumé automatique, la réduction de dimension, la visualisation, la détection d’anomalies, l’accélération des moteurs de recherche, l’organisation d’énormes ensembles de données, etc. De nombreuses méthodes de classification ont été développées à ce jour, ces dernières sont cependant fortement mises en difficulté par les caractéristiques complexes des ensembles de données que l’on rencontre dans certains domaines d’actualité tels que le Filtrage Collaboratif (FC) et la fouille de textes. Ces données, souvent représentées sous forme de matrices, sont de très grande dimension (des milliers de variables) et extrêmement creuses (ou sparses, avec plus de 95% de zéros). En plus d’être de grande dimension et sparses, les données rencontrées dans les domaines mentionnés ci-dessus sont également de nature directionnelle. En effet, plusieurs études antérieures ont démontré empiriquement que les mesures directionnelles, telles que la similarité cosinus, sont supérieures à d’autres mesures, telles que la distance euclidienne, pour la classification des documents textuels ou pour mesurer les similitudes entre les utilisateurs/items dans le FC. 
Cela suggère que, dans un tel contexte, c’est la direction d’un vecteur de données (e.g., représentant un document texte) qui est pertinente, et non pas sa longueur. Il est intéressant de noter que la similarité cosinus est exactement le produit scalaire entre des vecteurs unitaires (de norme 1). Ainsi, d’un point de vue probabiliste l’utilisation de la similarité cosinus revient à supposer que les données sont directionnelles et réparties sur la surface d’une hypersphère unité. En dépit des nombreuses preuves empiriques suggérant que certains ensembles de données sparses et de grande dimension sont mieux modélisés sur une hypersphère unité, la plupart des modèles existants dans le contexte de la fouille de textes et du FC s’appuient sur des hypothèses populaires : distributions Gaussiennes ou Multinomiales, qui sont malheureusement inadéquates pour des données directionnelles. Dans cette thèse, nous nous focalisons sur deux challenges d’actualité, à savoir la classification des documents textuels et la recommandation d’items, qui ne cessent d’attirer l’attention dans les domaines de la fouille de textes et du filtrage collaboratif, respectivement. Afin de répondre aux limitations ci-dessus, nous proposons une série de nouveaux modèles et algorithmes qui s’appuient sur la distribution de von Mises-Fisher (vMF) qui est plus appropriée aux données directionnelles distribuées sur une hypersphère unité. / Cluster analysis or clustering, which aims to group together similar objects, is undoubtedly a very powerful unsupervised learning technique. With the growing amount of available data, clustering is steadily gaining importance in various areas of data science for tasks such as automatic summarization, dimensionality reduction, visualization, outlier detection, speeding up search engines, organization of huge data sets, etc. 
Existing clustering approaches are, however, severely challenged by the high dimensionality and extreme sparsity of the data sets arising in some current areas of interest, such as Collaborative Filtering (CF) and text mining. Such data often consist of thousands of features and more than 95% zero entries. In addition to being high dimensional and sparse, the data sets encountered in the aforementioned domains are also directional in nature. In fact, several previous studies have empirically demonstrated that directional measures (which measure the distance between objects in terms of the angle between them), such as the cosine similarity, are substantially superior to other measures, such as the Euclidean distance, for clustering text documents or assessing the similarities between users/items in CF. This suggests that in such a context only the direction of a data vector (e.g., text document) is relevant, not its magnitude. It is worth noting that the cosine similarity is exactly the scalar product between unit-length data vectors, i.e., L2-normalized vectors. Thus, from a probabilistic perspective, using the cosine similarity is equivalent to assuming that the data are directional data distributed on the surface of a unit hypersphere. Despite the substantial empirical evidence that certain high dimensional sparse data sets, such as those encountered in the above domains, are better modeled as directional data, most existing models in text mining and CF are based on popular assumptions such as Gaussian, Multinomial or Bernoulli distributions, which are inadequate for L2-normalized data. In this thesis, we focus on the two challenging tasks of text document clustering and item recommendation, which are still attracting a lot of attention in the domains of text mining and CF, respectively. 
In order to address the above limitations, we propose a suite of new models and algorithms which rely on the von Mises-Fisher (vMF) assumption that arises naturally for directional data lying on a unit hypersphere.
Apprentissage statistique; Classification; Classification croisée; Modèles de mélanges; Statistiques directionnelles; Distribution de von Mises-Fisher; Fouille de textes; Systèmes de recommandation; Filtrage collaboratif; Matrices creuses; Grande dimension; Machine learning; Clustering; Co-clustering; Mixture models; Directional statistics; Von Mises-Fisher distribution; Text mining; Recommender systems; Collaborative filtering; Sparse data; High dimensional data; 003.3
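The observation in record 166 that cosine similarity equals the scalar product of L2-normalized vectors can be checked numerically. The sketch below uses two made-up term-count vectors that point in the same direction but differ in length; the helper names are assumptions for illustration only.

```python
import math


def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Project a non-zero vector onto the unit hypersphere."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical sparse term counts: doc_b is doc_a with every count doubled.
doc_a = [3.0, 0.0, 4.0, 0.0]
doc_b = [6.0, 0.0, 8.0, 0.0]

cos_ab = cosine_similarity(doc_a, doc_b)
dot_of_units = dot(l2_normalize(doc_a), l2_normalize(doc_b))
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(doc_a, doc_b)))
```

Both `cos_ab` and `dot_of_units` come out as 1 (up to rounding) because the two documents share a direction, while the Euclidean distance between the raw vectors is 5, which reflects only the difference in length.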
