1

A Deep Reinforcement Learning Approach for Dynamic Traffic Light Control with Transit Signal Priority

Nousch, Tobias; Zhou, Runhao; Adam, Django; Hirrle, Angelika; Wang, Meng. 23 June 2023
Traffic light control (TLC) with transit signal priority (TSP) is an effective way to deal with urban congestion and travel delay. The growing amount of available connected-vehicle data offers opportunities for signal control with transit priority, but conventional control algorithms fall short in fully exploiting these datasets. This paper proposes a novel approach for dynamic TLC with TSP at an urban intersection. We propose a deep reinforcement learning based framework, JenaRL, to deal with complex real-world intersections. The optimisation focuses on TSP while balancing the delay of all vehicles. A two-layer state space is defined to capture real-time traffic information, i.e. vehicle position, type and incoming lane. The discrete action space comprises the optimal phase and phase duration given the real-time traffic situation. An intersection in the inner city of Jena is constructed in the open-source microscopic traffic simulator SUMO. A time-varying traffic demand of motorised individual traffic (MIT), the city's current TLC controller, and the original timetables of the public transport (PT) are implemented in the simulation to construct a realistic traffic environment. The simulation results with the proposed framework indicate a significant enhancement in traffic light controller performance: the delay of all vehicles is reduced, and the loss time of PT in particular is minimised.
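The abstract specifies the JenaRL interface only at this level of detail. As a rough, hypothetical sketch of how a two-layer state and a joint (phase, duration) action space could be encoded (all names, dimensions and values below are assumptions, not taken from the paper):

```python
import itertools
import numpy as np

# Assumed discretisation: 4 signal phases x 3 candidate durations in seconds.
PHASES = [0, 1, 2, 3]
DURATIONS = [10, 20, 30]
ACTIONS = list(itertools.product(PHASES, DURATIONS))  # 12 discrete actions

def encode_state(vehicles, n_lanes=8, n_cells=20):
    """Two-layer grid state: layer 0 marks cell occupancy per incoming
    lane, layer 1 marks vehicle type (0 = car, 1 = public transport)."""
    state = np.zeros((2, n_lanes, n_cells), dtype=np.float32)
    for lane, cell, is_pt in vehicles:  # one (lane, cell, type) per vehicle
        state[0, lane, cell] = 1.0
        state[1, lane, cell] = float(is_pt)
    return state

def select_action(q_values, epsilon=0.1):
    """Epsilon-greedy choice over the joint (phase, duration) space,
    given the Q-values predicted for the current state."""
    if np.random.rand() < epsilon:
        return ACTIONS[np.random.randint(len(ACTIONS))]
    return ACTIONS[int(np.argmax(q_values))]
```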
2

Learning medical triage by using a reinforcement learning approach

Sundqvist, Niklas. January 2022
Many emergency departments today suffer from overcrowding by people seeking care. The first stage of seeking care is medical triage, in which a doctor or nurse prioritises patients in different orders depending on their symptoms. This is a cumbersome process that could be a candidate for automation. This master thesis investigates the possibility of using reinforcement learning to perform medical triage of patients. A deep Q-learning approach is taken for designing the agent, together with the two extensions of double Q-learning and a duelling network architecture. The agent is trained in two different environments. In the first environment, the goal of the agent is to ask a patient questions and then decide, once enough information has been collected, how the patient should be prioritised. In the second environment, the agent decides which questions should be asked, and a separate classifier then uses the gathered information to make the actual triage decision. The training and testing of the agent in the two environments reveal difficulties in exploring the environment efficiently and thoroughly. It was also shown that defining a reward function that guides the agent into asking valuable questions, and learning a stopping condition for asking questions, is a complicated task. Future work is discussed that would, in combination with the work performed in this thesis, create a better reinforcement learning model that could potentially show more promising results on the task of medical triage of patients.
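The abstract names double Q-learning and a duelling network architecture without further detail; the sketch below shows what these two extensions typically look like in TensorFlow. Layer sizes, names and the observation encoding are assumptions, not taken from the thesis.

```python
import tensorflow as tf

class DuelingQNet(tf.keras.Model):
    """Duelling architecture: a shared trunk feeds separate value and
    advantage streams, recombined as Q = V + (A - mean(A))."""
    def __init__(self, n_actions):
        super().__init__()
        self.trunk = tf.keras.Sequential([
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(128, activation="relu"),
        ])
        self.value = tf.keras.layers.Dense(1)              # V(s)
        self.advantage = tf.keras.layers.Dense(n_actions)  # A(s, a)

    def call(self, obs):
        h = self.trunk(obs)
        v, a = self.value(h), self.advantage(h)
        return v + (a - tf.reduce_mean(a, axis=1, keepdims=True))

def double_q_target(online_net, target_net, reward, next_obs, done, gamma=0.99):
    """Double Q-learning target: the online net selects the greedy action,
    the target net evaluates it, decoupling selection from evaluation."""
    best = tf.argmax(online_net(next_obs), axis=1)
    q_next = tf.gather(target_net(next_obs), best, axis=1, batch_dims=1)
    return reward + gamma * (1.0 - done) * q_next
```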
3

Deep Reinforcement Learning for the Popular Game Tag

Söderlund, August; von Knorring, Gustav. January 2021
Reinforcement learning can be compared to how humans learn – by interaction, which is the fundamental concept of this project. This paper aims to compare three different learning methods by creating two adversarial reinforcement learning models and simulating them in the game tag. The three fundamental learning methods are ordinary Q-learning, Deep Q-learning (DQN), and Double Deep Q-learning (DDQN). The models for ordinary Q-learning are built using a table, and the models for both DQN and DDQN are constructed using the Python module TensorFlow. The environment is composed of a bounded square with two obstacles and two agents with adversarial objectives. The rewards are given primarily based on the distance between the agents. By comparing the trained models, it was established that only DDQN could solve the task well and generalize, whilst both the Q-model and DQN had more serious flaws. A comparison of the DDQN model against its average reward trends established that the model was still improving despite the constant average reward. Conclusively, DDQN is the appropriate choice for this adversarial problem, whilst Q-learning and DQN should be avoided. Finally, a constant average reward can be caused by both agents improving at a similar rate rather than by a stagnation in performance. / Bachelor's thesis in electrical engineering 2021, KTH, Stockholm
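For contrast with the deep variants, the tabular Q-learning baseline the thesis starts from reduces to a one-line update rule. A minimal sketch, assuming integer state and action indices (the actual state encoding and reward shaping in the thesis are not specified beyond being distance-based):

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Tabular Q-learning update on a (n_states, n_actions) array:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_error = r + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * td_error
    return Q

# In the adversarial tag setting, each agent would hold its own table and
# receive opposite-signed, distance-based rewards: e.g. the chaser is
# rewarded as the distance shrinks and the evader as it grows.
```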
