
Regularized Greedy Gradient Q-Learning with Mobile Health Applications

Lu, Xiaoqi January 2021 (has links)
Recent advances in health and technology have made mobile apps a viable approach to delivering behavioral interventions in areas including physical activity encouragement, smoking cessation, substance abuse prevention, and mental health management. Due to the chronic nature of most of these disorders and the heterogeneity among mobile users, delivery of the interventions needs to be sequential and tailored to individual needs. We operationalize this sequential decision making via a policy that takes a mobile user's past usage pattern and health status as input and outputs an app/intervention recommendation, with the goal of optimizing a cumulative reward of interest in an indefinite-horizon setting. A plethora of reinforcement learning methods exist for deriving optimal policies in this setting. However, the vast majority of the literature, rooted in the computer science domain, focuses on the convergence of these algorithms given an unlimited amount of data. Their performance in health applications, with limited data and high noise, is yet to be explored. Technically, the nature of sequential decision making yields an objective function that is non-smooth (not even Lipschitz continuous) and non-convex in the model parameters. This poses theoretical challenges for characterizing the asymptotic properties of the optimizer of the objective function, as well as computational challenges for optimization. The problem is further exacerbated by the high-dimensional data present in mobile health applications. In this dissertation we propose a regularized greedy gradient Q-learning (RGGQ) method to tackle this estimation problem. The optimal policy is estimated via an algorithm which synthesizes the PGM and GGQ algorithms in the presence of an L₁ regularization, and its asymptotic properties are established. The theoretical framework initiated in this work can be applied to tackle other non-smooth high-dimensional problems in reinforcement learning.
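As a rough, hypothetical illustration of the kind of update involved (not the dissertation's actual PGM/GGQ synthesis, which is more involved), a linear Q-learning TD step combined with an L1 proximal (soft-thresholding) step can be sketched as:

```python
import numpy as np

def soft_threshold(w, lam):
    # Proximal operator of the L1 norm: shrinks each weight toward zero.
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def q_values(w, phi_actions):
    # phi_actions: (n_actions, d) feature matrix, one row per candidate action.
    return phi_actions @ w

def regularized_q_step(w, phi, r, phi_next, alpha=0.1, gamma=0.9, lam=0.01):
    """One illustrative regularized greedy Q-learning step: a semi-gradient
    TD update on linear weights, followed by an L1 proximal shrinkage step."""
    td_target = r + gamma * np.max(q_values(w, phi_next))
    td_error = td_target - phi @ w
    w = w + alpha * td_error * phi          # TD update toward greedy target
    return soft_threshold(w, alpha * lam)   # sparsity-inducing proximal step

# Toy run on random transitions (all quantities invented for illustration).
rng = np.random.default_rng(0)
w = np.zeros(5)
for _ in range(100):
    phi = rng.normal(size=5)                # current state-action features
    phi_next = rng.normal(size=(3, 5))      # features of next-state actions
    w = regularized_q_step(w, phi, rng.normal(), phi_next)
```

The soft-thresholding step is what induces sparsity in the estimated policy parameters, which is the role the L₁ regularization plays in the high-dimensional setting described above.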

[pt] COORDENAÇÃO INTELIGENTE PARA MULTIAGENTES BASEADOS EM MODELOS NEURO-FUZZY HIERÁRQUICOS COM APRENDIZADO POR REFORÇO / [en] INTELLIGENT COORDINATION FOR MULTIAGENT BASED MODELS HIERARCHICAL NEURO-FUZZY WITH REINFORCEMENT LEARNING

08 November 2018 (has links)
This thesis investigates and develops intelligent coordination strategies that can be integrated into hierarchical neuro-fuzzy models for multi-agent systems in complex environments. In dynamic or complex environments, the organization of the agents must adapt to changes in the system's objectives, in the availability of resources, in the relationships among agents, and so on. This flexibility is a key problem in multi-agent systems. The main objective of the proposed models is to make multiple agents interact intelligently with one another in complex systems. Two new intelligent hierarchical neuro-fuzzy models with coordination mechanisms for multi-agent systems were developed: the Reinforcement-Learning Hierarchical Neuro-Fuzzy model with a Market-Driven coordination mechanism (RL-NFHP-MA-MD), and the Reinforcement-Learning Hierarchical Neuro-Fuzzy model with coordination-graph-based coordination (RL-NFHP-MA-CG). The addition of coordination models to the Reinforcement-Learning Hierarchical Neuro-Fuzzy model (RL-NFHP-MA) was motivated chiefly by the importance of optimizing how the agents work together, improving the model's results and targeting more complex applications. The models were conceived from a study of the limitations of existing models and of the desirable characteristics of RL-based learning systems, particularly when applied to continuous and/or high-dimensional environments. The developed models were tested in two main case studies: the pursuit-game (predator-prey) benchmark and robot soccer (both simulated and with robotic agents). The results obtained in both case studies with the new RL-NFHP-MA-MD and RL-NFHP-MA-CG models for multiple agents proved quite promising. The tests showed that the new system can coordinate the actions of agents with a convergence speed nearly 30 percent higher than the original version. In robot soccer, both models performed well in full matches as well as in specific plays, beating teams developed with other, similar models.

A Reward-based Algorithm for Hyperparameter Optimization of Neural Networks / En Belöningsbaserad Algoritm för Hyperparameteroptimering av Neurala Nätverk

Larsson, Olov January 2020 (has links)
Machine learning and its wide range of applications is becoming increasingly prevalent in both academia and industry. This thesis focuses on two machine learning methods: convolutional neural networks and reinforcement learning. Convolutional neural networks have seen great success in classification and regression applications across a diverse range of fields, e.g. vision for self-driving cars or facial recognition. These networks are built on a set of trainable weights optimized on data, and a set of hyperparameters set by the designer of the network that remain constant during training. For the network to perform well, the hyperparameters have to be optimized separately. The goal of this thesis is to investigate the use of reinforcement learning as a method for optimizing hyperparameters in convolutional neural networks built for classification problems. The reinforcement learning methods used are tabular Q-learning and a new Q-learning-inspired algorithm denominated max-table. These algorithms have been tested with different exploration policies based on each hyperparameter value's covariance, precision, or relevance to the performance metric. The reinforcement learning algorithms were mostly tested on the CIFAR10 and Fashion-MNIST datasets against a baseline set by random search. While the Q-learning algorithm was not able to perform better than random search, max-table performed better than random search about 50% of the time on both datasets. Hyperparameter-based exploration policies using covariance or relevance were shown to decrease the optimizers' performance. No significant difference was found between a hyperparameter-based exploration policy using precision and an equally distributed exploration policy.
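To make the tabular Q-learning formulation concrete, here is a minimal, hypothetical sketch (the thesis's actual state/action design, exploration policies, and max-table variant differ): hyperparameters are chosen one at a time, each partial assignment is a state, and validation accuracy — faked here by a toy `evaluate` function — is the terminal reward backed up along the visited path.

```python
import random

# Invented toy search space: one choice per hyperparameter, made in sequence.
GRID = {"lr": [1e-3, 1e-2, 1e-1], "batch": [32, 64, 128]}

def evaluate(config):
    # Stand-in for training a CNN and returning validation accuracy;
    # peaks at lr=1e-2, batch=64.
    return 1.0 - abs(config["lr"] - 1e-2) - abs(config["batch"] - 64) / 640

def q_learn(episodes=200, alpha=0.5, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # (state, action) -> value; state = tuple of values chosen so far
    names = list(GRID)
    best = (None, float("-inf"))
    for _ in range(episodes):
        state, config = (), {}
        for name in names:
            acts = GRID[name]
            if rng.random() < eps:                 # epsilon-greedy exploration
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda v: q.get((state, v), 0.0))
            config[name] = a
            state = state + (a,)
        r = evaluate(config)                       # terminal reward only
        if r > best[1]:
            best = (dict(config), r)
        s = ()                                     # back up reward along path
        for name in names:
            a = config[name]
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r - old)
            s = s + (a,)
    return best[0]

best = q_learn()
```

Random search, the baseline mentioned above, would simply sample configurations uniformly and keep the best one seen.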

Analysis of Design Artifacts in Platform-Based Markets

Vandith Pamuru Subramanya Rama (9180506) 31 July 2020 (has links)
Digitization has led to the emergence of many platform-based markets. In this dissertation I focus on three different design problems in these markets. The first essay relates to augmented-reality platforms. Pokémon Go, an augmented-reality game, garnered tremendous public interest upon release, with an average of 20 million daily active users. The game combines geo-spatial elements with gamification practices to incentivize user movement in the physical world. This work examines the potential externalities that such incentives may have on associated businesses. In particular, we study the impact of Pokémon Go on local restaurants, using online reviews as a proxy for consumer engagement and perception. We treat the release of Pokémon Go as a natural experiment and study the post-release impact on the associated restaurants. We find that restaurants located near an in-game artifact do indeed observe a higher level of consumer engagement and a more positive consumer perception compared with those that have no in-game artifacts nearby. In addition, we find that the heterogeneous characteristics of the restaurants moderate the effect significantly. To the best of our knowledge, this study is the first to examine the economic implications of augmented-reality applications. Our research thereby lays the foundations for how augmented-reality games affect consumer economic behavior, and builds insights into the potential value of such associations for business owners and policymakers.

The second essay focuses on the platform design problem in sponsored search ad markets. Recent advances in technology have reduced frictions in various markets. In this research, we specifically investigate the role of frictions in determining efficiency and bidding behavior in a generalized second price auction (GSP), the most preferred mechanism for sponsored search advertisements. First, we simulate computational agents in the GSP setting and obtain predictions for the metrics of interest. Second, we test these predictions by conducting a human-subject experiment. We find that, contrary to the theoretical prediction, the lower-valued advertisers (who do not win the auction) substantially overbid. Moreover, we find that the presence of market frictions moderates this phenomenon and results in higher allocative efficiency. These results have implications for policymakers and auction platform managers in designing incentives for more efficient auctions.

The third essay is about user-generated content platforms. These platforms utilize various gamification strategies to incentivize user contributions. One of the most popular strategies is to offer platform sponsorship, such as a special elite status. Previous literature has extensively studied the impact of holding such sponsorships on user contributions. We focus specifically on the impact of losing elite status, which happens once a user's contributions to the platform decline in volume. Using a unique empirical strategy, we show that users continue to contribute high-quality reviews even after losing their status. We use natural language processing to extract various review characteristics, including sentiment and topics, and find that losing status does not, on average, change the topics of the reviews users write.
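The generalized second-price rule studied in the second essay is simple to state: advertisers are ranked by bid, and each winner pays the bid of the advertiser ranked just below them. A minimal sketch (bidder names and values invented for illustration):

```python
def gsp(bids, slots):
    """Generalized second-price auction: rank advertisers by bid and charge
    each winner the next-highest bid. bids: {advertiser: bid}."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(slots, len(ranked))):
        bidder, _ = ranked[i]
        # Payment is the bid one rank below, or 0 if no lower bid exists.
        price = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        results.append((bidder, price))
    return results

print(gsp({"a": 3.0, "b": 5.0, "c": 1.0}, slots=2))
# → [('b', 3.0), ('a', 1.0)]: b wins slot 1 paying a's bid, a wins slot 2 paying c's bid
```

Note that a truthful bid is not generally a dominant strategy under GSP, which is why the bidding behavior of lower-valued advertisers is an empirical question.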

Multi-modal Simulation and Calibration for OSU Campus Mobility

Kalra, Vikhyat January 2021 (has links)
No description available.

Radio Resource Allocation and Beam Management under Location Uncertainty in 5G mmWave Networks

Yao, Yujie 16 June 2022 (has links)
Millimeter wave (mmWave) plays a critical role in Fifth-generation (5G) New Radio due to the rich bandwidth it provides. However, one shortcoming of mmWave is the substantial path loss caused by poor diffraction at high frequencies; consequently, highly directional beams are applied to mitigate this problem. A typical approach to beam management is to cluster users based on their locations. However, localization uncertainty is unavoidable due to limited measurement accuracy, system performance fluctuations, and so on. Meanwhile, traffic demand may change dynamically in wireless environments, which increases the complexity of network management. Therefore, a scheme that can handle both localization uncertainty and dynamic radio resource allocation is required. Moreover, since localization uncertainty influences network performance, more advanced localization methods, such as vision-aided localization, are needed to reduce the localization error. In this thesis, we propose two algorithms for joint radio resource allocation and beam management in 5G mmWave networks: UK-means-based Clustering and Deep Reinforcement Learning-based resource allocation (UK-DRL) and UK-medoids-based Clustering and Deep Reinforcement Learning-based resource allocation (UKM-DRL). Specifically, we deploy the UK-means and UK-medoids clustering methods in UK-DRL and UKM-DRL, respectively, which are designed to handle clustering under location uncertainty. Meanwhile, we apply Deep Reinforcement Learning (DRL) for intra-beam radio resource allocation in both algorithms. Moreover, to improve localization accuracy, we develop a vision-aided localization scheme in which pixel-characteristic-based features are extracted from satellite images as additional input features for location prediction. Simulations show that UK-DRL and UKM-DRL improve network performance, in both data rate and delay, over baseline algorithms. When the traffic load is 4 Mbps, UK-DRL achieves a 172.4% higher sum rate and 64.1% lower latency than K-means-based Clustering and Deep Reinforcement Learning-based resource allocation (K-DRL). UKM-DRL has 17.2% higher throughput and 7.7% lower latency than UK-DRL, and 231% higher throughput and 55.8% lower latency than K-DRL. In addition, the vision-aided localization scheme significantly reduces the localization error, from 17.11 meters to 3.6 meters.
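The idea of clustering users under location uncertainty can be sketched with a toy Monte-Carlo variant of k-means (all details here are invented for illustration; the thesis's UK-means/UK-medoids formulations and the DRL resource-allocation stage are more involved): each user's uncertain position is represented by samples, and users are assigned to the centroid minimizing the expected squared distance.

```python
import numpy as np

def uk_means(samples, k, iters=20, seed=0):
    """Toy uncertain-k-means sketch. samples: (n_users, n_draws, d) array of
    Monte Carlo draws from each user's location distribution."""
    rng = np.random.default_rng(seed)
    means = samples.mean(axis=1)                       # (n, d) expected positions
    centers = means[rng.choice(len(means), size=k, replace=False)]
    for _ in range(iters):
        diff = samples[:, :, None, :] - centers[None, None, :, :]
        cost = (diff ** 2).sum(-1).mean(1)             # (n, k) expected sq. dist.
        labels = cost.argmin(1)                        # assign to cheapest center
        for j in range(k):
            if (labels == j).any():
                centers[j] = means[labels == j].mean(0)
    return labels, centers

# Two well-separated user groups with Gaussian location noise.
rng = np.random.default_rng(1)
true_pos = np.repeat(np.array([[0.0, 0.0], [10.0, 10.0]]), 25, axis=0)
samples = true_pos[:, None, :] + rng.normal(0.0, 0.5, size=(50, 8, 2))
labels, centers = uk_means(samples, k=2)
```

With isotropic per-user noise the expected-distance assignment reduces to clustering the mean positions; the formulation matters when uncertainty differs across users or dimensions.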

Navigation of Mobile Robots in Human Environments with Deep Reinforcement Learning / Navigering av mobila robotar i mänskliga miljöer med deep reinforcement learning

Coors, Benjamin January 2016 (has links)
For mobile robots which operate in human environments it is not sufficient to simply travel to their target destination as quickly as possible. Instead, mobile robots in human environments need to travel to their destination safely, keeping a comfortable distance from humans and not colliding with any obstacles along the way. As the number of possible human-robot interactions is very large, defining a rule-based navigation approach is difficult in such highly dynamic environments. Current approaches solve this task by predicting the trajectories of humans in the scene and then planning a collision-free path. However, this requires separate components for detecting and predicting human motion and does not scale well to densely populated environments. Therefore, this work investigates the use of deep reinforcement learning for the navigation of mobile robots in human environments. The approach is based on recent research on utilizing deep neural networks in reinforcement learning to successfully play Atari 2600 video games at human level. A deep convolutional neural network is trained end-to-end, mapping one-dimensional laser scan data to command velocities. Discrete and continuous action space implementations are evaluated in simulation and are shown to outperform a Social Force Model baseline on the navigation problem for mobile robots in human environments.
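A heavily simplified sketch of the discrete-action setup (a linear Q-function standing in for the thesis's convolutional network, with an invented action set of velocity commands):

```python
import numpy as np

# Hypothetical discrete action set: (linear velocity, angular velocity) pairs.
ACTIONS = [(0.5, 0.0), (0.3, 0.5), (0.3, -0.5), (0.0, 1.0), (0.0, -1.0)]

class LinearQ:
    """Minimal stand-in for a Q-network: a linear Q-function over raw 1-D
    laser ranges, trained with the standard DQN-style TD target."""
    def __init__(self, scan_dim, n_actions, lr=0.01, gamma=0.99, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.01, (n_actions, scan_dim))
        self.lr, self.gamma = lr, gamma

    def q(self, scan):
        return self.W @ scan                     # one Q-value per action

    def act(self, scan, eps=0.1, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        if rng.random() < eps:                   # epsilon-greedy exploration
            return int(rng.integers(len(ACTIONS)))
        return int(np.argmax(self.q(scan)))

    def update(self, scan, a, reward, next_scan, done):
        target = reward if done else reward + self.gamma * np.max(self.q(next_scan))
        td_error = target - self.q(scan)[a]
        self.W[a] += self.lr * td_error * scan   # semi-gradient TD step
        return td_error

agent = LinearQ(scan_dim=10, n_actions=len(ACTIONS))
scan = np.ones(10)                               # fake laser scan
action = agent.act(scan, eps=0.0)
```

The continuous-action variant evaluated in the thesis would instead output velocity commands directly rather than selecting from a discrete set.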

Offline Reinforcement Learning for Scheduling Live Video Events in Large Enterprises

Franzén, Jonathan January 2022 (has links)
In modern times, live video streaming events in companies have become an increasingly relevant method for communications. As a platform provider for these events, being able to deliver relevant recommendations for event scheduling times to users is an important feature. A system providing relevant recommendations to users can be described as a recommender system. Recommender systems usually face issues such as having to be trained purely offline, as training the system online can be costly or time-consuming, requiring manual user feedback. While many solutions and advancements have been made in recommender systems over the years, such as contributions in the Netflix Prize, it continues to be an active research topic. This work aims at designing a recommender system which observes users' past sequential scheduling behavior to provide relevant recommendations for scheduling upcoming live video events. The developed recommender system uses reinforcement learning as a model, with components such as a generative model to help it learn from offline data.

Information and Self-Organization in Complex Networks

Culbreth, Garland 12 1900 (has links)
Networks that self-organize in response to information are among the central objects of study in complex systems theory. A new time series analysis tool for studying self-organizing systems is developed and demonstrated. The method is applied to interacting complex swarms to explore the connection between information transport and group size, providing evidence that Dunbar's numbers have a foundation in network dynamics. A complex network model of information spread is also developed. This network infodemic model uses reinforcement learning to simulate connection and opinion adaptation resulting from interactions between units. The model is applied to study polarized populations and echo chamber formation, exploring strategies for network resilience and weakening. The model is straightforward to extend to multilayer networks and to networks generated from real-world data. By unifying explanation and prediction, the network infodemic model offers a timely step toward understanding global collective behavior.

Hyperparameter Tuning for Reinforcement Learning with Bandits and Off-Policy Sampling

Hauser, Kristen 21 June 2021 (has links)
No description available.
