• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 40
  • 7
  • 4
  • Tagged with
  • 58
  • 58
  • 36
  • 32
  • 28
  • 17
  • 17
  • 13
  • 13
  • 11
  • 11
  • 11
  • 10
  • 9
  • 9
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Contribution to learning and decision making under uncertainty for Cognitive Radio. / Contribution à l’apprentissage et à la prise de décision, dans des contextes d’incertitude, pour la radio intelligente

Jouini, Wassim 15 June 2012 (has links)
L’allocation des ressources spectrales à des services de communications sans fil, sans cesse plus nombreux et plus gourmands, a récemment mené la communauté radio à vouloir remettre en question la stratégie de répartition des bandes de fréquences imposée depuis plus d’un siècle. En effet une étude rendue publique en 2002 par la commission fédérale des communications aux Etats-Unis (Federal Communications Commission - FCC) mit en évidence une pénurie des ressources spectrales dans une large bande de fréquences comprise entre quelques mégahertz à plusieurs gigahertz. Cependant, cette même étude expliqua cette pénurie par une allocation statique des ressources aux différents services demandeurs plutôt que par une saturation des bandes de fréquences. Cette explication fut par la suite corroborée par de nombreuses mesures d’occupation spectrale, réalisées dans plusieurs pays, qui montrèrent une forte sous-utilisation des bandes de fréquences en fonction du temps et de l’espace, représentant par conséquent autant d’opportunité spectrale inexploitée. Ces constations donnèrent naissance à un domaine en plein effervescence connu sous le nom d’Accès Opportuniste au Spectre (Opportunistic Spectrum Access). Nos travaux suggèrent l’étude de mécanismes d’apprentissage pour la radio intelligente (Cognitive Radio) dans le cadre de l’Accès Opportuniste au Spectre (AOS) afin de permettre à des équipements radio d’exploiter ces opportunités de manière autonome. Pour cela, nous montrons que les problématiques d’AOS peuvent être fidèlement représentées par des modèles d’apprentissage par renforcement. Ainsi, l’équipement radio est modélisé par un agent intelligent capable d’interagir avec son environnement afin d’en collecter des informations. Ces dernières servent à reconnaître, au fur et à mesure des expériences, les meilleurs choix (bandes de fréquences, configurations, etc.) qui s’offrent au système de communication. Nous nous intéressons au modèle particulier des bandits manchots (Multi-Armed Bandit appliqué à l’AOS). Nous discutons, lors d’une phase préliminaire, différentes solutions empruntées au domaine de l’apprentissage machine (Machine Learning). Ensuite, nous élargissons ces résultats à des cadres adaptés à la radio intelligente. Notamment, nous évaluons les performances de ces algorithmes dans le cas de réseaux d’équipements qui collaborent en prenant en compte, dans le modèle suggéré, les erreurs d’observations. On montre de plus que ces algorithmes n’ont pas besoin de connaître la fréquence des erreurs d’observation afin de converger. La vitesse de convergence dépend néanmoins de ces fréquences. Dans un second temps nous concevons un nouvel algorithme d’apprentissage destiné à répondre à des problèmes d’exploitation des ressources spectrales dans des conditions dites de fading. Tous ces travaux présupposent néanmoins la capacité de l’équipement intelligent à détecter efficacement l’activité d’autres utilisateurs sur la bande (utilisateurs prioritaires dits utilisateurs primaires). La principale difficulté réside dans le fait que l’équipement intelligent ne suppose aucune connaissance a priori sur son environnement (niveau du bruit notamment) ou sur les utilisateurs primaires. Afin de lever le doute sur l’efficacité de l’approche suggérée, nous analysons l’impact de ces incertitudes sur le détecteur d’énergie. Ce dernier prend donc le rôle d’observateur et envoie ses observations aux algorithmes d’apprentissage. Nous montrons ainsi qu’il est possible de quantifier les performances de ce détecteur dans des conditions d’incertitude sur le niveau du bruit ce qui le rend utilisable dans le contexte de la radio intelligente. Par conséquent, les algorithmes d’apprentissage utilisés pourront exploiter les résultats du détecteur malgré l’incertitude inhérente liée à l’environnement considéré et aux hypothèses (sévères) d’incertitude liées au problème analysé. / During the last century, most of the meaningful frequency bands were licensed to emerging wireless applications. Because of the static model of frequency allocation, the growing number of spectrum demanding services led to a spectrum scarcity. However, recently, series of measurements on the spectrum utilization showed that the different frequency bands were underutilized (sometimes even unoccupied) and thus that the scarcity of the spectrum resource is virtual and only due to the static allocation of the different bands to specific wireless services. Moreover, the underutilization of the spectrum resource varies on different scales in time and space offering many opportunities to an unlicensed user or network to access the spectrum. Cognitive Radio (CR) and Opportunistic Spectrum Access (OSA) were introduced as possible solutions to alleviate the spectrum scarcity issue.In this dissertation, we aim at enabling CR equipments to exploit autonomously communication opportunities found in their vicinity. For that purpose, we suggest decision making mechanisms designed and/or adapted to answer CR related problems in general, and more specifically, OSA related scenarios. Thus, we argue that OSA scenarios can be modeled as Multi-Armed Bandit (MAB) problems. As a matter of fact, within OSA contexts, CR equipments are assumed to have no prior knowledge on their environment. Acquiring the necessary information relies on a sequential interaction between the CR equipment and its environment. Finally, the CR equipment is modeled as a cognitive agent whose purpose is to learn while providing an improving service to its user. Thus, firstly we analyze the performance of UCB1 algorithm when dealing with OSA problems with imperfect sensing. More specifically, we show that UCB1 can efficiently cope with sensing errors. We prove its convergence to the optimal channel and quantify its loss of performance compared to the case with perfect sensing. Secondly, we combine UCB1 algorithm with collaborative and coordination mechanism to model a secondary network (i.e. several SUs). We show that within this complex scenario, a coordinated learning mechanism can lead to efficient secondary networks. These scenarios assume that a SU can efficiently detect incumbent users’ activity while having no prior knowledge on their characteristics. Usually, energy detection is suggested as a possible approach to handle such task. Unfortunately, energy detection in known to perform poorly when dealing with uncertainty. Consequently, we ventured in this Ph.D. to revisit the problem of energy detection limits under uncertainty. We present new results on its performances as well as its limits when the noise level is uncertain and the uncertainty is modeled by a log-normal distribution (as suggested by Alexander Sonnenschein and Philip M. Fishman in 1992). Within OSA contexts, we address a final problem where a sensor aims at quantifying the quality of a channel in fading environments. In such contexts, UCB1 algorithms seem to fail. Consequently, we designed a new algorithm called Multiplicative UCB (UCB) and prove its convergence. Moreover, we prove that MUCB algorithms are order optimal (i.e., the order of their learning rate is optimal). This last work provides a contribution that goes beyond CR and OSA. As a matter of fact, MUCB algorithms are introduced and solved within a general MAB framework.
42

Summary Statistic Selection with Reinforcement Learning

Barkino, Iliam January 2019 (has links)
Multi-armed bandit (MAB) algorithms could be used to select a subset of the k most informative summary statistics, from a pool of m possible summary statistics, by reformulating the subset selection problem as a MAB problem. This is suggested by experiments that tested five MAB algorithms (Direct, Halving, SAR, OCBA-m, and Racing) on the reformulated problem and comparing the results to two established subset selection algorithms (Minimizing Entropy and Approximate Sufficiency). The MAB algorithms yielded errors at par with the established methods, but in only a fraction of the time. Establishing MAB algorithms as a new standard for summary statistics subset selection could therefore save numerous scientists substantial amounts of time when selecting summary statistics for approximate bayesian computation.
43

Monte Carlo Tree Search for Continuous and Stochastic Sequential Decision Making Problems / Monte Carlo Tree Search pour les problèmes de décision séquentielle en milieu continus et stochastiques

Couetoux, Adrien 30 September 2013 (has links)
Dans cette thèse, nous avons étudié les problèmes de décisions séquentielles, avec comme application la gestion de stocks d'énergie. Traditionnellement, ces problèmes sont résolus par programmation dynamique stochastique. Mais la grande dimension, et la non convexité du problème, amènent à faire des simplifications sur le modèle pour pouvoir faire fonctionner ces méthodes.Nous avons donc étudié une méthode alternative, qui ne requiert pas de simplifications du modèle: Monte Carlo Tree Search (MCTS). Nous avons commencé par étendre le MCTS classique (qui s’applique aux domaines finis et déterministes) aux domaines continus et stochastiques. Pour cela, nous avons utilisé la méthode de Double Progressive Widening (DPW), qui permet de gérer le ratio entre largeur et profondeur de l’arbre, à l’aide de deux méta paramètres. Nous avons aussi proposé une heuristique nommée Blind Value (BV) pour améliorer la recherche de nouvelles actions, en utilisant l’information donnée par les simulations passées. D’autre part, nous avons étendu l’heuristique RAVE aux domaines continus. Enfin, nous avons proposé deux nouvelles méthodes pour faire remonter l’information dans l’arbre, qui ont beaucoup amélioré la vitesse de convergence sur deux cas tests.Une part importante de notre travail a été de proposer une façon de mêler MCTS avec des heuristiques rapides pré-existantes. C’est une idée particulièrement intéressante dans le cas de la gestion d’énergie, car ces problèmes sont pour le moment résolus de manière approchée. Nous avons montré comment utiliser Direct Policy Search (DPS) pour rechercher une politique par défaut efficace, qui est ensuite utilisée à l’intérieur de MCTS. Les résultats expérimentaux sont très encourageants.Nous avons aussi appliqué MCTS à des processus markoviens partiellement observables (POMDP), avec comme exemple le jeu de démineur. Dans ce cas, les algorithmes actuels ne sont pas optimaux, et notre approche l’est, en transformant le POMDP en MDP, par un changement de vecteur d’état.Enfin, nous avons utilisé MCTS dans un cadre de méta-bandit, pour résoudre des problèmes d’investissement. Le choix d’investissement est fait par des algorithmes de bandits à bras multiples, tandis que l’évaluation de chaque bras est faite par MCTS.Une des conclusions importantes de ces travaux est que MCTS en continu a besoin de très peu d’hypothèses (uniquement un modèle génératif du problème), converge vers l’optimum, et peut facilement améliorer des méthodes suboptimales existantes. / In this thesis, we study sequential decision making problems, with a focus on the unit commitment problem. Traditionally solved by dynamic programming methods, this problem is still a challenge, due to its high dimension and to the sacrifices made on the accuracy of the model to apply state of the art methods. We investigate on the applicability of Monte Carlo Tree Search methods for this problem, and other problems that are single player, stochastic and continuous sequential decision making problems. We started by extending the traditional finite state MCTS to continuous domains, with a method called Double Progressive Widening (DPW). This method relies on two hyper parameters, and determines the ratio between width and depth in the nodes of the tree. We developed a heuristic called Blind Value (BV) to improve the exploration of new actions, using the information from past simulations. We also extended the RAVE heuristic to continuous domain. Finally, we proposed two new ways of backing up information through the tree, that improved the convergence speed considerably on two test cases.An important part of our work was to propose a way to mix MCTS with existing powerful heuristics, with the application to energy management in mind. We did so by proposing a framework that allows to learn a good default policy by Direct Policy Search (DPS), and to include it in MCTS. The experimental results are very positive.To extend the reach of MCTS, we showed how it could be used to solve Partially Observable Markovian Decision Processes, with an application to game of Mine Sweeper, for which no consistent method had been proposed before.Finally, we used MCTS in a meta-bandit framework to solve energy investment problems: the investment decision was handled by classical bandit algorithms, while the evaluation of each investment was done by MCTS.The most important take away is that continuous MCTS has almost no assumption (besides the need for a generative model), is consistent, and can easily improve existing suboptimal solvers by using a method similar to what we proposed with DPS.
44

On recommendation systems in a sequential context / Des Systèmes de Recommandation dans un Contexte Séquentiel

Guillou, Frédéric 02 December 2016 (has links)
Cette thèse porte sur l'étude des Systèmes de Recommandation dans un cadre séquentiel, où les retours des utilisateurs sur des articles arrivent dans le système l'un après l'autre. Après chaque retour utilisateur, le système doit le prendre en compte afin d'améliorer les recommandations futures. De nombreuses techniques de recommandation ou méthodologies d'évaluation ont été proposées par le passé pour les problèmes de recommandation. Malgré cela, l'évaluation séquentielle, qui est pourtant plus réaliste et se rapproche davantage du cadre d'évaluation d'un vrai système de recommandation, a été laissée de côté. Le contexte séquentiel nécessite de prendre en considération différents aspects non visibles dans un contexte fixe. Le premier de ces aspects est le dilemme dit d'exploration vs. exploitation: le modèle effectuant les recommandations doit trouver le bon compromis entre recueillir de l'information sur les goûts des utilisateurs à travers des étapes d'exploration, et exploiter la connaissance qu'il a à l'heure actuelle pour maximiser le feedback reçu. L'importance de ce premier point est mise en avant à travers une première évaluation, et nous proposons une approche à la fois simple et efficace, basée sur la Factorisation de Matrice et un algorithme de Bandit Manchot, pour produire des recommandations appropriées. Le second aspect pouvant apparaître dans le cadre séquentiel surgit dans le cas où une liste ordonnée d'articles est recommandée au lieu d'un seul article. Dans cette situation, le feedback donné par l'utilisateur est multiple: la partie explicite concerne la note donnée par l'utilisateur concernant l'article choisi, tandis que la partie implicite concerne les articles cliqués (ou non cliqués) parmi les articles de la liste. En intégrant les deux parties du feedback dans un modèle d'apprentissage, nous proposons une approche basée sur la Factorisation de Matrice, qui peut recommander de meilleures listes ordonnées d'articles, et nous évaluons cette approche dans un contexte séquentiel particulier pour montrer son efficacité. / This thesis is dedicated to the study of Recommendation Systems under a sequential setting, where the feedback given by users on items arrive one after another in the system. After each feedback, the system has to integrate it and try to improve future recommendations. Many techniques or evaluation methods have already been proposed to study the recommendation problem. Despite that, such sequential setting, which is more realistic and represent a closer framework to a real Recommendation System evaluation, has surprisingly been left aside. Under a sequential context, recommendation techniques need to take into consideration several aspects which are not visible for a fixed setting. The first one is the exploration-exploitation dilemma: the model making recommendations needs to find a good balance between gathering information about users' tastes or items through exploratory recommendation steps, and exploiting its current knowledge of the users and items to try to maximize the feedback received. We highlight the importance of this point through the first evaluation study and propose a simple yet efficient approach to make effective recommendation, based on Matrix Factorization and Multi-Armed Bandit algorithms. The second aspect emphasized by the sequential context appears when a list of items is recommended to the user instead of a single item. In such a case, the feedback given by the user includes two parts: the explicit feedback as the rating, but also the implicit feedback given by clicking (or not clicking) on other items of the list. By integrating both feedback into a Matrix Factorization model, we propose an approach which can suggest better ranked list of items, and we evaluate it in a particular setting.
45

Adaptive Personalization of Pedagogical Sequences using Machine Learning / Personalisation Adaptative de Séquences Pédagogique à l'aide d'Apprentissage Automatique

Clement, Benjamin 12 December 2018 (has links)
Les ordinateurs peuvent-ils enseigner ? Pour répondre à cette question, la recherche dans les Systèmes Tuteurs Intelligents est en pleine expansion parmi la communauté travaillant sur les Technologies de l'Information et de la Communication pour l'Enseignement (TICE). C'est un domaine qui rassemble différentes problématiques et réunit des chercheurs venant de domaines variés, tels que la psychologie, la didactique, les neurosciences et, plus particulièrement, le machine learning. Les technologies numériques deviennent de plus en plus présentes dans la vie quotidienne avec le développement des tablettes et des smartphones. Il semble naturel d'utiliser ces technologies dans un but éducatif. Cela amène de nombreuses problématiques, telles que comment faire des interfaces accessibles à tous, comment rendre des contenus pédagogiques motivants ou encore comment personnaliser les activités afin d'adapter le contenu à chacun. Au cours de cette thèse, nous avons développé des méthodes, regroupées dans un framework nommé HMABITS, afin d'adapter des séquences d'activités pédagogiques en fonction des performances et des préférences des apprenants, dans le but de maximiser leur vitesse d'apprentissage et leur motivation. Ces méthodes utilisent des modèles computationnels de motivation intrinsèque pour identifier les activités offrant les plus grands progrès d'apprentissage, et utilisent des algorithmes de Bandits Multi-Bras pour gérer le compromis exploration/exploitation à l'intérieur de l'espace d'activité. Les activités présentant un intérêt optimal sont ainsi privilégiées afin de maintenir l'apprenant dans un état de Flow ou dans sa Zone de Développement Proximal. De plus, certaines de nos méthodes permettent à l'apprenant de faire des choix sur des caractéristiques contextuelles ou le contenu pédagogique de l'application, ce qui est un vecteur d'autodétermination et de motivation. Afin d'évaluer l'efficacité et la pertinence de nos algorithmes, nous avons mené plusieurs types d'expérimentation. Nos méthodes ont d'abord été testées en simulation afin d'évaluer leur fonctionnement avant de les utiliser dans d'actuelles applications d'apprentissage. Pour ce faire, nous avons développé différents modèles d'apprenants, afin de pouvoir éprouver nos méthodes selon différentes approches, un modèle d'apprenant virtuel ne reflétant jamais le comportement d'un apprenant réel. Les résultats des simulations montrent que le framework HMABITS permet d'obtenir des résultats d'apprentissage comparables et, dans certains cas, meilleurs qu'une solution optimale ou qu'une séquence experte. Nous avons ensuite développé notre propre scénario pédagogique et notre propre serious game afin de tester nos algorithmes en situation réelle avec de vrais élèves. Nous avons donc développé un jeu sur la thématique de la décomposition des nombres, au travers de la manipulation de la monnaie, pour les enfants de 6 à 8 ans. Nous avons ensuite travaillé avec le rectorat et différentes écoles de l'académie de bordeaux. Sur l'ensemble des expérimentations, environ 1000 élèves ont travaillé sur l'application sur tablette. Les résultats des études en situation réelle montrent que le framework HMABITS permet aux élèves d'accéder à des activités plus diverses et plus difficiles, d'avoir un meilleure apprentissage et d'être plus motivés qu'avec une séquence experte. Les résultats montrent même que ces effets sont encore plus marqués lorsque les élèves ont la possibilité de faire des choix. / Can computers teach people? To answer this question, Intelligent Tutoring Systems are a rapidly expanding field of research among the Information and Communication Technologies for the Education community. This subject brings together different issues and researchers from various fields, such as psychology, didactics, neurosciences and, particularly, machine learning. Digital technologies are becoming more and more a part of everyday life with the development of tablets and smartphones. It seems natural to consider using these technologies for educational purposes. This raises several questions, such as how to make user interfaces accessible to everyone, how to make educational content motivating and how to customize it to individual learners. In this PhD, we developed methods, grouped in the aptly-named HMABITS framework, to adapt pedagogical activity sequences based on learners' performances and preferences to maximize their learning speed and motivation. These methods use computational models of intrinsic motivation and curiosity-driven learning to identify the activities providing the highest learning progress and use Multi-Armed Bandit algorithms to manage the exploration/exploitation trade-off inside the activity space. Activities of optimal interest are thus privileged with the target to keep the learner in a state of Flow or in his or her Zone of Proximal Development. Moreover, some of our methods allow the student to make choices about contextual features or pedagogical content, which is a vector of self-determination and motivation. To evaluate the effectiveness and relevance of our algorithms, we carried out several types of experiments. We first evaluated these methods with numerical simulations before applying them to real teaching conditions. To do this, we developed multiple models of learners, since a single model never exactly replicates the behavior of a real learner. The simulation results show the HMABITS framework achieves comparable, and in some cases better, learning results than an optimal solution or an expert sequence. We then developed our own pedagogical scenario and serious game to test our algorithms in classrooms with real students. We developed a game on the theme of number decomposition, through the manipulation of money, for children aged 6 to 8. We then worked with the educational institutions and several schools in the Bordeaux school district. Overall, about 1000 students participated in trial lessons using the tablet application. The results of the real-world studies show that the HMABITS framework allows the students to do more diverse and difficult activities, to achieve better learning and to be more motivated than with an Expert Sequence. The results show that this effect is even greater when the students have the possibility to make choices.
46

Assessing and improving recommender systems to deal with user cold-start problem

Paixão, Crícia Zilda Felício 06 March 2017 (has links)
Sistemas de recomendação fazem parte do nosso dia-a-dia. Os métodos usados nesses sistemas tem como objetivo principal predizer as preferências por novos itens baseado no perĄl do usuário. As pesquisas relacionadas a esse tópico procuram entre outras coisas tratar o problema do cold-start do usuário, que é o desaĄo de recomendar itens para usuários que possuem poucos ou nenhum registro de preferências no sistema. Uma forma de tratar o cold-start do usuário é buscar inferir as preferências dos usuários a partir de informações adicionais. Dessa forma, informações adicionais de diferentes tipos podem ser exploradas nas pesquisas. Alguns estudos usam informação social combinada com preferências dos usuários, outros se baseiam nos clicks ao navegar por sites Web, informação de localização geográĄca, percepção visual, informação de contexto, etc. A abordagem típica desses sistemas é usar informação adicional para construir um modelo de predição para cada usuário. Além desse processo ser mais complexo, para usuários full cold-start (sem preferências identiĄcadas pelo sistema) em particular, a maioria dos sistemas de recomendação apresentam um baixo desempenho. O trabalho aqui apresentado, por outro lado, propõe que novos usuários receberão recomendações mais acuradas de modelos de predição que já existem no sistema. Nesta tese foram propostas 4 abordagens para lidar com o problema de cold-start do usuário usando modelos existentes nos sistemas de recomendação. As abordagens apresentadas trataram os seguintes aspectos: o Inclusão de informação social em sistemas de recomendação tradicional: foram investigados os papéis de várias métricas sociais em um sistema de recomendação de preferências pairwise fornecendo subsidíos para a deĄnição de um framework geral para incluir informação social em abordagens tradicionais. o Uso de similaridade por percepção visual: usando a similaridade por percepção visual foram inferidas redes, conectando usuários similares, para serem usadas na seleção de modelos de predição para novos usuários. o Análise dos benefícios de um framework geral para incluir informação de redes de usuários em sistemas de recomendação: representando diferentes tipos de informação adicional como uma rede de usuários, foi investigado como as redes de usuários podem ser incluídas nos sistemas de recomendação de maneira a beneĄciar a recomendação para usuários cold-start. o Análise do impacto da seleção de modelos de predição para usuários cold-start: a última abordagem proposta considerou que sem a informação adicional o sistema poderia recomendar para novos usuários fazendo a troca entre os modelos já existentes no sistema e procurando aprender qual seria o mais adequado para a recomendação. As abordagens propostas foram avaliadas em termos da qualidade da predição e da qualidade do ranking em banco de dados reais e de diferentes domínios. Os resultados obtidos demonstraram que as abordagens propostas atingiram melhores resultados que os métodos do estado da arte. / Recommender systems are in our everyday life. The recommendation methods have as main purpose to predict preferences for new items based on userŠs past preferences. The research related to this topic seeks among other things to discuss user cold-start problem, which is the challenge of recommending to users with few or no preferences records. One way to address cold-start issues is to infer the missing data relying on side information. Side information of different types has been explored in researches. Some studies use social information combined with usersŠ preferences, others user click behavior, location-based information, userŠs visual perception, contextual information, etc. The typical approach is to use side information to build one prediction model for each cold user. Due to the inherent complexity of this prediction process, for full cold-start user in particular, the performance of most recommender systems falls a great deal. We, rather, propose that cold users are best served by models already built in system. In this thesis we propose 4 approaches to deal with user cold-start problem using existing models available for analysis in the recommender systems. We cover the follow aspects: o Embedding social information into traditional recommender systems: We investigate the role of several social metrics on pairwise preference recommendations and provide the Ąrst steps towards a general framework to incorporate social information in traditional approaches. o Improving recommendation with visual perception similarities: We extract networks connecting users with similar visual perception and use them to come up with prediction models that maximize the information gained from cold users. o Analyzing the beneĄts of general framework to incorporate networked information into recommender systems: Representing different types of side information as a user network, we investigated how to incorporate networked information into recommender systems to understand the beneĄts of it in the context of cold user recommendation. o Analyzing the impact of prediction model selection for cold users: The last proposal consider that without side information the system will recommend to cold users based on the switch of models already built in system. We evaluated the proposed approaches in terms of prediction quality and ranking quality in real-world datasets under different recommendation domains. The experiments showed that our approaches achieve better results than the comparison methods. / Tese (Doutorado)
47

Novel Mechanisms For Allocation Of Heterogeneous Items In Strategic Settings

Prakash, Gujar Sujit 10 1900 (has links) (PDF)
Allocation of objects or resources to competing agents is a ubiquitous problem in the real world. For example, a federal government may wish to allocate different types of spectrum licenses to telecom service providers; a search engine has to assign different sponsored slots to the ads of advertisers; etc. The agents involved in such situations have private preferences over the allocations. The agents, being strategic, may manipulate the allocation procedure to get a favourable allocation. If the objects to be allocated are heterogeneous (rather than homogeneous), the problem becomes quite complex. The allocation problem becomes even more formidable in the presence of a dynamic supply and/or demand. This doctoral work is motivated by such problems involving strategic agents, heterogeneous objects, and dynamic supply and/or demand. In this thesis, we model such problems in a standard game theoretic setting and use mechanism design to propose novel solutions to the problems. We extend the current state-of-the-art in a non-trivial way by solving the following problems: Optimal combinatorial auctions with single minded bidders, generalizing the existing methods to take into account multiple units of heterogeneous objects Multi-armed bandit mechanisms for sponsored search auctions with multiple slots, generalizing the current methods that only consider a single slot. Strategyproof redistribution mechanisms for heterogeneous objects, expanding the scope of the current state of practice beyond homogeneous objects Online allocation mechanisms without money for one-sided and two-sided matching markets, extending the existing methods for static settings.
48

Learning-based Attack and Defense on Recommender Systems

Agnideven Palanisamy Sundar (11190282) 06 August 2021 (has links)
The internet is the home for massive volumes of valuable data constantly being created, making it difficult for users to find information relevant to them. In recent times, online users have been relying on the recommendations made by websites to narrow down the options. Online reviews have also become an increasingly important factor in the final choice of a customer. Unfortunately, attackers have found ways to manipulate both reviews and recommendations to mislead users. A Recommendation System is a special type of information filtering system adapted by online vendors to provide suggestions to their customers based on their requirements. Collaborative filtering is one of the most widely used recommendation systems; unfortunately, it is prone to shilling/profile injection attacks. Such attacks alter the recommendation process to promote or demote a particular product. On the other hand, many spammers write deceptive reviews to change the credibility of a product/service. This work aims to address these issues by treating the review manipulation and shilling attack scenarios independently. For the shilling attacks, we build an efficient Reinforcement Learning-based shilling attack method. This method reduces the uncertainty associated with the item selection process and finds the most optimal items to enhance attack reach while treating the recommender system as a black box. Such practical online attacks open new avenues for research in building more robust recommender systems. When it comes to review manipulations, we introduce a method to use a deep structure embedding approach that preserves highly nonlinear structural information and the dynamic aspects of user reviews to identify and cluster the spam users. It is worth mentioning that, in the experiment with real datasets, our method captures about 92\% of all spam reviewers using an unsupervised learning approach.<br>
49

Machine Learning Algorithms for Influence Maximization on Social Networks

Abhishek Kumar Umrawal (16787802) 08 August 2023 (has links)
<p>With an increasing number of users spending time on social media platforms and engaging with family, friends, and influencers within communities of interest (such as in fashion, cooking, gaming, etc.), there are significant opportunities for marketing firms to leverage word-of-mouth advertising on these platforms. In particular, marketing firms can select sets of influencers within relevant communities to sponsor, namely by providing free product samples to those influencers so that so they will discuss and promote the product on their social media accounts.</p><p>The question of which set of influencers to sponsor is known as <b>influence maximization</b> (IM) formally defined as follows: "if we can try to convince a subset of individuals in a social network to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target?'' Under standard diffusion models, this optimization problem is known to be NP-hard. This problem has been widely studied in the literature and several approaches for solving it have been proposed. Some approaches provide near-optimal solutions but are costly in terms of runtime. On the other hand, some approaches are faster but heuristics, i.e., do not have approximation guarantees.</p><p>In this dissertation, we study the influence maximization problem extensively. We provide efficient algorithms for solving the original problem and its important generalizations. Furthermore, we provide theoretical guarantees and experimental evaluations to support the claims made in this dissertation.</p><p>We first study the original IM problem referred to as the discrete influence maximization (DIM) problem where the marketer can either provide a free sample to an influencer or not, i.e., they cannot give fractional discounts like 10% off, etc. As already mentioned the existing solution methods (for instance, the simulation-based greedy algorithm) provide near-optimal solutions that are costly in terms of runtime and the approaches that are faster do not have approximation guarantees. Motivated by the idea of addressing this trade-off between accuracy and runtime, we propose a community-aware divide-and-conquer framework to provide a time-efficient solution to the DIM problem. The proposed framework outperforms the standard methods in terms of runtime and the heuristic methods in terms of influence.</p><p>We next study a natural extension of the DIM problem referred to as the fractional influence maximization (FIM) problem where the marketer may offer fractional discounts (as opposed to either providing a free sample to an influencer or not in the DIM problem) to the influencers. Clearly, the FIM problem provides more flexibility to the marketer in allocating the available budget among different influencers. The existing solution methods propose to use a continuous extension of the simulation-based greedy approximation algorithm for solving the DIM problem. This continuous extension suggests greedily building the solution for the given fractional budget by taking small steps through the interior of the feasible region. On the contrary, we first characterize the solution to the FIM problem in terms of the solution to the DIM problem. We then use this characterization to propose an efficient greedy approximation algorithm that only iterates through the corners of the feasible region. This leads to huge savings in terms of runtime compared to the existing methods that suggest iterating through the interior of the feasible region. Furthermore, we provide an approximation guarantee for the proposed greedy algorithm to solve the FIM problem.</p><p>Finally, we study another extension of the DIM problem referred to as the online discrete influence maximization (ODIM) problem, where the marketer provides free samples not just once but repeatedly over a given time horizon and the goal is to maximize the cumulative influence over time while receiving instantaneous feedback. The existing solution methods are based on semi-bandit instantaneous feedback where the knowledge of some intermediate aspects of how the influence propagates in the social network is assumed or observed. For instance, which specific individuals became influenced at the intermediate steps during the propagation? However, for social networks with user privacy, this information is not available. Hence, we consider the ODIM problem with full-bandit feedback where no knowledge of the underlying social network or diffusion process is assumed. We note that the ODIM problem is an instance of the stochastic combinatorial multi-armed bandit (CMAB) problem with submodular rewards. To solve the ODIM problem, we provide an efficient algorithm that outperforms the existing methods in terms of influence, and time and space complexities.</p><p>Furthermore, we point out the connections of influence maximization with a related problem of disease outbreak prevention and a more general problem of submodular maximization. The methods proposed in this dissertation can also be used to solve those problems.</p>
50

Opportunistic Scheduling Using Channel Memory in Markov-modeled Wireless Networks

Murugesan, Sugumar 26 October 2010 (has links)
No description available.

Page generated in 0.0603 seconds