  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Soluções de coexistência LTE/Wi-Fi em banda não licenciada / LTE/Wi-Fi coexistence solutions in unlicensed spectrum

Santana, Pedro Maia de 07 December 2017 (has links)
Submitted by Automação e Estatística (sst@bczm.ufrn.br) on 2018-04-02T12:36:23Z No. of bitstreams: 1 PedroMaiaDeSantana_DISSERT.pdf: 2069422 bytes, checksum: 1a983f5131609d3cf7fe3bd96b0ef27d (MD5) / Approved for entry into archive by Arlan Eloi Leite Silva (eloihistoriador@yahoo.com.br) on 2018-04-04T13:34:35Z (GMT) / Made available in DSpace on 2018-04-04T13:34:36Z (GMT). Previous issue date: 2017-12-07 / Este trabalho tem como objetivo realizar um estudo sobre a aplicação de redes LTE no espectro ISM (Industrial, Scientific and Medical) e seu consequente impacto sobre tecnologias comumente coexistentes na mesma faixa de frequência. Inicialmente, é realizada uma elucidação teórica sobre as regulamentações que envolvem o uso de espectro não licenciado. Na sequência, são apresentadas as principais soluções de coexistência do LTE nesse meio, destacando-se o mecanismo recentemente padronizado pelo 3GPP, o LTE-LBT, e tecnologias específicas de empresas pioneiras na área, tais como a solução LTE-DC. Como elemento prático complementar à investigação teórica inicial, são desenvolvidas análises de desempenho das respectivas soluções utilizando o simulador ns-3. A novidade do trabalho é materializada pela apresentação de uma proposta de solução para o mecanismo Carrier-Sensing Adaptive Transmission (CSAT). Essa solução, baseada em aprendizado de máquina, visa melhorar o desempenho conjunto dos sistemas que coexistem na faixa ISM. Este trabalho também propõe uma solução de coexistência do LTE-DC consigo próprio a partir de uma abordagem utilizando teoria dos jogos. Essas soluções são comparadas com as soluções clássicas e os seus ganhos são evidenciados em cenários definidos por órgãos de padronização mundial.
/ This work presents a study of the application of LTE networks in the ISM (Industrial, Scientific and Medical) spectrum and its impact on technologies that commonly coexist in the same frequency range. Initially, a theoretical overview is given of the regulations governing unlicensed spectrum usage. Next, the main LTE coexistence solutions in this field are presented, highlighting the mechanism recently standardized by 3GPP, LTE-LBT, as well as technologies specific to pioneering companies in this domain, such as the LTE-DC solution. As a practical complement to the initial theoretical investigation, performance analyses of the respective solutions are carried out using the ns-3 simulator. The novelty of the work lies in a proposed solution for the Carrier-Sensing Adaptive Transmission (CSAT) mechanism. This solution, based on machine learning, aims to improve the joint performance of the systems coexisting in the ISM band. This work also proposes a solution for LTE-DC self-coexistence through a game-theoretic approach. These solutions are compared to the classical ones and their gains are demonstrated in scenarios defined by global standardization bodies.
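For context, CSAT-style coexistence adapts an LTE-U ON/OFF duty cycle to the Wi-Fi activity sensed on the channel. The sketch below is a deliberately simple proportional rule standing in for the learning-based policy the thesis proposes; `target`, `gain`, and the duty-cycle bounds are illustrative assumptions, not values from the work.

```python
def csat_update(duty_cycle, wifi_busy_fraction,
                target=0.5, gain=0.1, floor=0.1, ceil=0.95):
    """Adjust the LTE-U ON fraction from sensed Wi-Fi channel occupancy.

    A plain proportional controller: back off when the sensed Wi-Fi
    occupancy exceeds `target`, reclaim airtime when the channel is
    mostly idle, and keep the duty cycle inside [floor, ceil].
    """
    duty_cycle += gain * (target - wifi_busy_fraction)
    return max(floor, min(ceil, duty_cycle))
```

A learning-based CSAT, as proposed in the thesis, would replace the fixed `gain`/`target` rule with a policy trained against the joint LTE/Wi-Fi throughput.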
62

Conception et commande d'une structure de locomotion compliante pour le franchissement d'obstacle / Design and control of a compliant locomotion structure for obstacle crossing

Bouton, Arthur 16 November 2017 (has links)
La recherche d’une locomotion performante sur des terrains accidentés constitue encore à l’heure actuelle un défi pour les systèmes robotisés de toutes sortes s’y attelant. Les robots hybrides de type “roues-pattes”, qui tentent d’allier l’efficacité énergétique des roues à l’agilité des pattes, en sont un exemple aux capacités potentiellement très prometteuses. Malheureusement, le contrôle de telles structures s’avère rapidement problématique du fait des redondances cinématiques, mais aussi et surtout de la difficulté que pose la connaissance exacte de la géométrie du sol à mesure que le robot avance. Cette thèse propose alors une réponse à la complexité des systèmes roulants reconfigurables par une approche synergique entre compliance et actionnement. Pour cela, nous proposons d’exploiter une décomposition idéalement orthogonale entre les différentes formes de compliances qui réalisent la suspension du robot. Ainsi, l’actionnement au sein de la structure est ici dédié à un contrôle des efforts verticaux s’exerçant sur les roues, tandis que les déplacements horizontaux de ces dernières sont le fait d’une raideur passive combinée à une modulation locale des vitesses d’entraînement. La posture du robot est maîtrisée via l’asservissement des forces verticales fournies par un actionnement de type série-élastique. Ceci permet de garantir une adaptation spontanée de la hauteur des roues tout en conservant l’ascendant sur la distribution de la charge. La faisabilité d’un tel système de locomotion est validée à travers un prototype reposant sur quatre “roues-pattes” compliantes. Celui-ci, entièrement conçu dans le cadre de cette étude, approche la décomposition fonctionnelle proposée tout en répondant aux contraintes de réalisation et de robustesse. 
Tirant parti de la décomposition fonctionnelle proposée pour la structure, deux procédés de commande sont présentés afin de réaliser le franchissement des obstacles : le premier vise à exploiter l’inertie du châssis pour réaliser une modification locale des forces verticales appliquées aux roues, tandis que le second est basé sur la sélection d’un mode de répartition des efforts adaptés à la poursuite d’une évolution quasi-statique en toutes circonstances. Pour cette dernière commande, deux méthodes de synthèse sont abordées : l’une via un algorithme d’apprentissage de type “Q-learning” et l’autre par détermination de règles expertes paramétrées. Ces commandes, validées par des simulations dynamiques dans des situations variées, se basent exclusivement sur des données proprioceptives accessibles immédiatement par la mesure des variables articulaires de la structure. De cette manière, le robot réagit directement au contact des obstacles, sans avoir besoin de connaître à l’avance la géométrie du sol. / Performing efficient locomotion on rough terrain is still a challenge for robotic systems of all kinds. “Wheel-on-leg” robots, which try to combine the energy efficiency of wheels with the agility of legs, are an example with potentially very promising capabilities. Unfortunately, the control of such structures turns out to be problematic because of kinematic redundancies and, above all, the difficulty of precisely evaluating the ground geometry as the robot advances. This thesis proposes a solution to the complexity of reconfigurable rolling systems through a synergetic approach combining compliance and actuation. To this purpose, we propose to exploit an ideally orthogonal decomposition between the different movements enabled by the compliant elements of the robot suspension.
The structure's actuation is then dedicated to controlling the vertical forces applied to the wheels, while the horizontal wheel displacements result from a passive stiffness combined with a local modulation of wheel speeds. The robot posture is controlled by servoing the vertical forces provided by series-elastic actuation. This ensures a spontaneous adaptation of wheel heights while retaining control over the load distribution. The feasibility of such a locomotion system is validated through a prototype based on four compliant “wheel-legs”. Entirely designed as part of this study, it approximates the proposed functional decomposition while meeting realization and robustness constraints. We also present two control methods that take advantage of the functional decomposition proposed for the structure in order to cross obstacles. The first aims to exploit the chassis inertia to locally modify the vertical forces applied to the wheels, while the second is based on selecting a force-distribution mode suited to pursuing a quasi-static advance in all circumstances. Two approaches are given for synthesizing the latter controller: one via a “Q-learning” algorithm and the other by determining parameterized expert rules. Validated by dynamic simulations in various situations, these controllers rely only on proprioceptive data immediately available from measurements of the structure's joint variables. This way, the robot reacts directly upon contact with obstacles, without needing to know the ground geometry in advance.
63

Solution Of Delayed Reinforcement Learning Problems Having Continuous Action Spaces

Ravindran, B 03 1900 (has links) (PDF)
No description available.
64

Plánování cesty robotu pomocí posilovaného učení / Robot path planning by means of reinforcement learning

Veselovský, Michal January 2013 (has links)
This thesis deals with path planning for an autonomous robot in an environment with static obstacles. It includes an analysis of different approaches to path planning, a description of methods utilizing reinforcement learning, and experiments with them. The main outputs of the thesis are working path-planning algorithms based on Q-learning, verification of their functionality, and a mutual comparison.
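To make the Q-learning approach named in this abstract concrete, here is a minimal, self-contained sketch of tabular Q-learning for path planning on a small grid with static obstacles. Everything specific (grid size, obstacle layout, rewards, hyperparameters) is an illustrative assumption, not taken from the thesis.

```python
import random

random.seed(0)

GRID = 4
OBSTACLES = {(1, 1), (2, 1)}                   # static obstacles, assumed layout
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, action):
    """Apply a move; bumping a wall or an obstacle leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < GRID and 0 <= c < GRID) or (r, c) in OBSTACLES:
        return state, -1.0                     # bump penalty, no movement
    return ((r, c), 10.0) if (r, c) == GOAL else ((r, c), -1.0)

Q = {((r, c), a): 0.0
     for r in range(GRID) for c in range(GRID) for a in range(4)}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(1000):                          # training episodes
    s = START
    for _ in range(100):
        a = (random.randrange(4) if random.random() < eps
             else max(range(4), key=lambda b: Q[(s, b)]))
        s2, reward = step(s, ACTIONS[a])
        # One-step Q-learning update:
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s2, b)] for b in range(4))
                              - Q[(s, a)])
        s = s2
        if s == GOAL:
            break

# Greedy rollout of the learned policy:
path, s = [START], START
while s != GOAL and len(path) < 20:
    s, _ = step(s, ACTIONS[max(range(4), key=lambda b: Q[(s, b)])])
    path.append(s)
```

After training, `path` traces a collision-free route from `START` to `GOAL`. The same update rule applies unchanged to larger maps, which is also why the tabular form scales poorly and motivates the function-approximation variants compared elsewhere in this listing.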
65

A Modified Q-Learning Approach for Predicting Mortality in Patients Diagnosed with Sepsis

Dunn, Noah M. 15 April 2021 (has links)
No description available.
66

Where not what: the role of spatial-motor processing in decision-making

Banks, Parker January 2021 (has links)
Decision-making comprises an incredibly varied set of behaviours. However, all vertebrates tend to repeat previously rewarding actions and avoid those that have led to loss, behaviours known collectively as the win-stay, lose-shift strategy. This response strategy is supported by the sensorimotor striatum and nucleus accumbens, structures also implicated in spatial processing and the integration of sensory information in order to guide motor action. Therefore, choices may be represented as spatial-motor actions whose value is determined by the rewards and punishments associated with that action. In this dissertation I demonstrate that the location of choices relative to previous rewards and punishments, rather than their identities, determines their value. Chapters 2 and 4 demonstrate that the location of rewards and punishments drives future decisions to win-stay or lose-shift towards that location. Even when choices differ in colour or shape, choice value is determined by location, not visual identity. Chapter 3 compares decision-making when two, six, twelve, or eighteen choices are present, finding that the value of a win or loss is not tied to a single location, but is distributed throughout the choice environment. Finally, Chapter 5 provides anatomical support for the spatial-motor basis of choice. Specifically, win-stay responses are associated with greater oscillatory activity than win-shift responses in the motor cortex corresponding to the hand used to make a choice, whereas lose-shift responses are accompanied by greater activation of frontal systems compared to lose-stay responses. The win-stay and lose-shift behaviours activate structures known to project to different regions of the striatum. Overall, this dissertation provides behavioural evidence that choice location, not visual identity, determines choice value. / Thesis / Doctor of Philosophy (PhD)
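The win-stay, lose-shift rule this dissertation builds on is compact enough to state as code. In this sketch the rule is keyed to response location, mirroring the dissertation's claim that location rather than visual identity carries value; the six-location setup and the random shift target are illustrative assumptions.

```python
import random

def wsls_choice(prev_location, prev_outcome, n_locations):
    """Win-stay, lose-shift keyed to response *location*, not stimulus identity."""
    if prev_outcome > 0:
        return prev_location                         # win: stay at that location
    others = [loc for loc in range(n_locations) if loc != prev_location]
    return random.choice(others)                     # loss: shift away from it

# Tiny session against a bandit where location 3 always pays off:
random.seed(0)
loc, outcome = 0, -1
for _ in range(20):
    loc = wsls_choice(loc, outcome, n_locations=6)
    outcome = 1 if loc == 3 else -1
# Once location 3 is first chosen, the strategy stays there for the
# remainder of the session, since every visit to it is a win.
```

Note that the rule conditions only on where the last response was made and whether it was rewarded, which is what makes it a useful behavioural probe of spatial-motor (rather than stimulus-identity) value coding.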
67

Deep Q Learning with a Multi-Level Vehicle Perception for Cooperative Automated Highway Driving

Hamilton, Richard January 2021 (has links)
Autonomous vehicles (AVs), commonly known as “self-driving cars”, are of increasing interest to researchers due to their potential to mitigate traffic accidents and congestion. Using reinforcement learning, previous research has demonstrated that a DQN agent can be trained to effectively navigate a simulated two-lane environment via cooperative driving, in which a model of V2X technology allows an AV to receive information from surrounding vehicles (termed Primary Perceived Vehicles) to make driving decisions. Results have demonstrated that the DQN agent can learn to navigate longitudinally and laterally, but with a prohibitively high collision rate of 1.5% - 4.8% and an average speed of 13.4 m/s. In this research, the impact of including information from traffic vehicles outside of those that immediately surround the AV (termed Secondary Perceived Vehicles, or SPVs) as inputs to a DQN agent is investigated. Results indicate that while including velocity and distance information from SPVs does not improve the collision rate and average speed of the driving algorithm, it does yield a lower standard deviation of speed during episodes, indicating lower acceleration. This effect, however, is lost when the agent is tested under constant traffic-flow scenarios (as opposed to fluctuating driving conditions). Taken together, it is concluded that while SPV inclusion does not affect collision rate and average speed, its ability to achieve the same performance with lower acceleration can significantly improve fuel economy and drive quality. These findings give a better understanding of how additional vehicle information during cooperative driving affects automated driving. / Thesis / Master of Applied Science (MASc)
68

Deep Reinforcement Learning for Card Games

Tegnér Mohringe, Oscar, Cali, Rayan January 2022 (has links)
This project aims to investigate how reinforcement learning (RL) techniques can be applied to the card game Limit Texas Hold’em. RL is a type of machine learning that can learn to optimally solve problems that can be formulated as a Markov decision process. We considered two RL algorithms: Deep Q-Learning (DQN), chosen for its popularity within the RL community, and Deep Monte-Carlo (DMC), chosen for its success in other card games. The goal was to investigate how different parameters affect their performance and, if possible, to achieve human-level performance. To achieve this, a subset of the parameters used by these methods was varied and their impact on overall learning performance was investigated. With both DQN and DMC we were able to isolate parameters that had a significant impact on performance. While both methods failed to reach human-level performance, both showed clear signs of learning. The DQN algorithm's biggest flaw was that it tended to fall into simplified strategies where it would stick to using only one action. The pitfall for DMC was the algorithm's high variance, which means it needs many samples to train. Despite this shortcoming, the algorithm seemingly developed a primitive strategy. We believe that with some modifications to the methods, better results could be achieved. / Detta projekt strävar efter att undersöka hur olika förstärkningsinlärningstekniker (RL) kan implementeras för kortspelet Limit Texas Hold’em. RL är en typ av maskininlärning som kan lära sig att optimalt lösa problem som kan formuleras enligt en markovbeslutsprocess. Vi betraktade två olika algoritmer: Deep Q-Learning (DQN), som valdes för sin popularitet, och Deep Monte-Carlo (DMC), som valdes för dess tidigare framgång i andra kortspel. Med målet att undersöka hur olika parametrar påverkar inlärningsprocessen och om möjligt uppnå mänsklig prestanda. För att uppnå detta så valdes en delmängd av de parametrar som används av dessa metoder.
Dessa ändrades successivt för att sedan mäta dess påverkan på den övergripande inlärningsprestandan. Med både DQN och DMC så lyckades vi isolera parametrar som hade en signifikant påverkan på prestandan. Trots att båda metoderna misslyckades med att uppnå mänsklig prestanda så visade båda tecken på upplärning. Det största problemet med DQN var att metoden tenderade att fastna i enkla strategier där den enbart valde ett drag. För DMC så låg problemet i att metoden har en hög varians vilket innebär att metoden behöver mycket tid för att tränas upp. Dock så lyckades ändå metoden utveckla en primitiv strategi. Vi tror att metoder med ett par modifikationer skulle kunna nå ett bättre resultat. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm
69

Complementary Layered Learning

Mondesire, Sean 01 January 2014 (has links)
Layered learning is a machine learning paradigm used to develop autonomous robotic agents by decomposing a complex task into simpler subtasks and learning each sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show that poor performance is due to the learner forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a suboptimal learning technique, monolithic learning, which does not use decomposition. Through the resulting analyses, we have identified factors that can lead to imbalance and their negative effects, providing a deeper understanding of stability and plasticity in decomposition-based approaches such as layered learning. To combat the negative effects of the imbalance, a complementary learning system is applied to layered learning. The new technique augments the original learning approach with dual storage region policies to prevent useful information from being removed from an agent’s policy prematurely. Through multi-agent experiments, a 28% task performance increase is obtained with the proposed augmentations over the original technique.
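The dual-storage-region idea can be caricatured in a few lines. The following is a toy sketch of the concept only: the class, its dict-based stores, and the consolidation step are assumptions made for illustration, and the dissertation's actual mechanism operates on learned policies rather than lookup tables.

```python
class DualStorePolicy:
    """Toy complementary-learning store: knowledge consolidated from earlier
    subtasks is protected from being overwritten while a new subtask is learned."""

    def __init__(self):
        self.protected = {}        # frozen mappings from earlier subtasks
        self.plastic = {}          # mappings for the subtask being learned

    def act(self, state):
        # Consolidated knowledge takes precedence over in-progress learning.
        return self.protected.get(state, self.plastic.get(state))

    def learn(self, state, action):
        if state not in self.protected:    # stability: old skills survive
            self.plastic[state] = action   # plasticity: new subtask still learns

    def consolidate(self):
        """Freeze what was learned before moving on to the next subtask."""
        self.protected.update(self.plastic)
        self.plastic.clear()
```

A monolithic learner with a single store would overwrite the subtask-A mapping whenever subtask B reuses a state, which is exactly the plasticity-dominated forgetting the dissertation analyzes.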
70

Control of an Inverted Pendulum Using Reinforcement Learning Methods

Kärn, Joel January 2021 (has links)
In this paper the two reinforcement learning algorithms Q-learning and deep Q-learning (DQN) are used to balance an inverted pendulum. In order to compare the two, both algorithms are optimized to some extent by evaluating different values for some of their parameters. Since the difference between Q-learning and DQN is a deep neural network (DNN), some benefits of a DNN are then discussed. The conclusion is that this particular problem is simple enough for the Q-learning algorithm to work well and be preferable, even though the DQN algorithm solves the problem in fewer episodes. This is due to the stability of the Q-learning algorithm and because more time is required to find a suitable DNN and evaluate appropriate parameters for the DQN algorithm than to find the proper parameters for the Q-learning algorithm. / I denna rapport används två algoritmer inom förstärkningsinlärning, Q-inlärning och djup Q-inlärning (DQN), för att balansera en omvänd pendel. För att jämföra dem optimeras algoritmerna i viss utsträckning genom att testa olika värden för vissa av deras parametrar. Eftersom skillnaden mellan Q-inlärning och DQN är ett djupt neuralt nätverk (DNN) diskuteras fördelen med ett DNN. Slutsatsen är att för ett så pass enkelt problem fungerar Q-inlärningsalgoritmen bra och är att föredra, trots att DQN-algoritmen löser problemet på färre episoder. Detta är på grund av Q-inlärningsalgoritmens stabilitet och att mer tid krävs för att hitta ett passande DNN och lämpliga parametrar för DQN-algoritmen än vad som krävs för att hitta bra parametrar för Q-inlärningsalgoritmen. / Kandidatexjobb i elektroteknik 2021, KTH, Stockholm
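Applying tabular Q-learning to an inverted pendulum requires discretizing the continuous state, a step the abstract leaves implicit. A common binning sketch follows; the bin counts and value ranges are illustrative assumptions, not taken from the paper.

```python
import math

def discretize(theta, theta_dot, theta_bins=9, vel_bins=7,
               theta_max=math.pi / 6, vel_max=2.0):
    """Bucket the continuous pendulum state (angle, angular velocity) into a
    discrete cell so that a tabular Q-learning agent can index its Q-table."""
    def to_bin(x, lo, hi, n):
        x = min(max(x, lo), hi)                 # clamp to the modelled range
        return min(int((x - lo) / (hi - lo) * n), n - 1)
    return (to_bin(theta, -theta_max, theta_max, theta_bins),
            to_bin(theta_dot, -vel_max, vel_max, vel_bins))
```

The DQN variant compared in the paper sidesteps this step entirely: the network consumes the raw continuous state, which is the main structural difference the comparison turns on.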
