Global ETD Search

11	Regret minimisation and system-efficiency in route choice / Minimização de Regret e eficiência do sistema em escala de rotas Ramos, Gabriel de Oliveira January 2018 Multiagent reinforcement learning (MARL) is a challenging task in which self-interested agents concurrently learn policies that maximise their utilities. Learning here is difficult because agents must adapt to each other, which makes their objective a moving target. As a consequence, no convergence guarantees exist for the general MARL setting. This thesis exploits a particular MARL problem, namely route choice (where selfish drivers aim at choosing routes that minimise their travel costs), to deliver convergence guarantees. We are particularly interested in guaranteeing convergence to two fundamental solution concepts: the user equilibrium (UE, when no agent benefits from unilaterally changing its route) and the system optimum (SO, when average travel time is minimal). The main goal of this thesis is to show that, in the context of route choice, MARL can be guaranteed to converge to the UE as well as to the SO under certain conditions. Firstly, we introduce a regret-minimising Q-learning algorithm, which we prove converges to the UE. Our algorithm works by estimating the regret associated with agents' actions and using this information as the reinforcement signal for updating the corresponding Q-values. We also establish a bound on the agents' regret. We then extend this algorithm to deal with non-local information provided by a navigation service. Using such information, agents can improve their regret estimates and thus perform empirically better.
Finally, in order to mitigate the effects of selfishness, we also present a generalised marginal-cost tolling scheme in which drivers are charged in proportion to the cost they impose on others. We then devise a toll-based Q-learning algorithm, which we prove converges to the SO and is fairer than existing tolling schemes. Multiagent systems Informatics: Transport Multiagent reinforcement learning Route choice User equilibrium System optimal Regret minimisation Action regret Travel information Marginal-cost tolling
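As a rough illustration of the marginal-cost tolling idea mentioned in the abstract above, the following sketch uses a textbook two-route network; it is not the thesis's generalised scheme or its Q-learning algorithm. The network, the cost functions and all function names are assumptions made for this example: route A always costs 1, route B costs as much as the share x of drivers using it, and the toll on route B is the flow multiplied by the derivative of its cost.

```python
def average_cost(x):
    """Average travel cost when a share x of drivers uses route B.

    Assumed network: route A has constant cost 1, route B has cost x.
    """
    return (1.0 - x) * 1.0 + x * x


def equilibrium_share(toll_fn, steps=60):
    """Bisection for the share on route B at which no driver gains by switching."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        perceived_b = mid + toll_fn(mid)  # travel cost plus toll on route B
        if perceived_b < 1.0:             # route B still cheaper, so more drivers move to it
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def no_toll(x):
    return 0.0


def marginal_cost_toll(x):
    # Flow times the derivative of the cost c(x) = x, i.e. x * 1.
    return x * 1.0


x_ue = equilibrium_share(no_toll)             # user equilibrium without tolls
x_so = equilibrium_share(marginal_cost_toll)  # equilibrium under the toll

print(x_ue, average_cost(x_ue))
print(x_so, average_cost(x_so))
```

In this toy setting the untolled equilibrium puts essentially all drivers on route B, while the tolled equilibrium splits them evenly, which is also the split that minimises the average cost 1 - x + x^2.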
13	Modelling and controlling risk in energy systems Gonzalez, Jhonny January 2015 The Autonomic Power System (APS) grand challenge was a multi-disciplinary EPSRC-funded research project that examined novel techniques to enable the transition from today's energy network to the highly uncertain and complex network of 2050. As part of the APS, this thesis reports on the sub-project 'RR2: Avoiding High-Impact Low Probability events'. The goal of RR2 is to develop new algorithms for controlling risk exposure to high-impact low probability (Hi-Lo) events through the provision of appropriate risk-sensitive control strategies. Additionally, RR2 is concerned with new techniques for identifying and modelling risk in future energy networks, in particular the risk of Hi-Lo events. In this context, this thesis investigates two distinct problems arising from energy risk management. On the one hand, we examine the problem of finding managerial strategies for exercising the operational flexibility of energy assets. We look at this problem from a risk perspective, taking into account the non-linear risk preferences of energy asset managers. Our main contribution is the development of a risk-sensitive approach to the class of optimal switching problems. By recasting the problem as an iterative optimal stopping problem, we are able to characterise the optimal risk-sensitive switching strategies. As a by-product, we obtain a multiplicative dynamic programming equation for the value function, upon which we propose a numerical algorithm based on least squares Monte Carlo regression. On the other hand, we develop tools to identify and model the risk factors faced by energy asset managers. For this, we consider a class of models consisting of superpositions of Gaussian and non-Gaussian Ornstein-Uhlenbeck processes. Our main contribution is the development of a Bayesian methodology based on Markov chain Monte Carlo (MCMC) algorithms to perform inference for this class of models.
In extensive simulations, we demonstrate the robustness and efficiency of the algorithms across different data features. Furthermore, we construct a diagnostic tool based on Bayesian p-values to check the goodness-of-fit of the models in a Bayesian framework. We apply this tool to MCMC results from fitting historical electricity and gas spot price data sets from the UK and German energy markets. Our analysis demonstrates that the MCMC-estimated models are able to capture not only long- and short-lived positive price spikes, but also short-lived negative price spikes, which are typical of UK gas prices and German electricity prices. Combining the solutions to the two problems above, we strive to capture the interplay between risk, uncertainty, flexibility and performance in various applications to energy systems. In these applications, which include power stations, energy storage and district energy systems, we consistently show that our risk management methodology offers a tradeoff between maximising average performance and minimising risk, while accounting for the jump dynamics of energy prices. Moreover, the tradeoff is achieved in such a way that the benefits in terms of risk reduction outweigh the loss in average performance. 510
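To make the model class named in the abstract above more concrete, here is a minimal simulation sketch of one Gaussian Ornstein-Uhlenbeck factor superposed with one jump-driven (non-Gaussian) Ornstein-Uhlenbeck factor. It is not the thesis's model or its MCMC inference: the function name, every parameter value and the exponential jump sizes are assumptions, and the jump factor is simulated with a simple approximation that allows at most one jump per time step.

```python
import math
import random


def simulate_spot_factors(n_steps, dt=1.0, seed=0):
    """Simulate a Gaussian OU factor plus a jump-driven OU factor (illustrative)."""
    rng = random.Random(seed)

    # Hypothetical parameters, chosen only for illustration.
    speed_gauss, vol_gauss = 0.1, 0.2                  # slow, diffusive factor
    speed_jump, jump_rate, jump_mean = 0.8, 0.05, 2.0  # fast-decaying spike factor

    # Exact one-step transition of the Gaussian OU process.
    decay_gauss = math.exp(-speed_gauss * dt)
    std_gauss = vol_gauss * math.sqrt((1.0 - decay_gauss ** 2) / (2.0 * speed_gauss))
    decay_jump = math.exp(-speed_jump * dt)

    gauss, spike = 0.0, 0.0
    path = []
    for _ in range(n_steps):
        gauss = gauss * decay_gauss + std_gauss * rng.gauss(0.0, 1.0)
        # Approximation: at most one jump per step, added at the end of the step.
        spike *= decay_jump
        if rng.random() < jump_rate * dt:
            spike += rng.expovariate(1.0 / jump_mean)  # exponential jump with mean jump_mean
        path.append((gauss, spike, gauss + spike))
    return path


path = simulate_spot_factors(500, seed=1)
print(path[:3])
```

This sketch only produces positive spikes; negative spikes of the kind the abstract mentions would need an analogous component entering with a negative sign.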
14	Traffic Safety Assessment of Different Toll Collection Systems on Expressways Using Multiple Analytical Techniques Abuzwidah, Muamer 01 January 2014 Traffic safety is considered one of the most important issues in the transportation field. Crashes cause extensive human and economic losses. With the objective of reducing crash occurrence and alleviating crash injury severity, major efforts have been dedicated to revealing the hazardous factors that affect crash occurrence. As a result of these sustained efforts, both fatalities and fatality rates from road traffic crashes in many countries have declined steadily over the last ten years. Nevertheless, according to the World Health Organization, the world still lost 1.24 million lives to road traffic crashes in 2013. Without action, traffic crashes on the road network are predicted to result in the deaths of around 1.9 million people and non-fatal injuries to up to 50 million more people annually by the year 2020, with many of the injured incurring a disability. To meet transportation needs, the use of expressways (toll roads) has risen dramatically in many countries in the past decade. Freeways and expressways are considered an important part of any successful transportation system, and these facilities carry the majority of daily trips on the transportation network. Although expressways offer a high level of service and are considered the safest type of road, traditional toll collection systems may pose both safety and operational challenges. Traditional toll plazas still experience many crashes, many of which are severe. It is therefore important to evaluate the traffic safety impacts of using different tolling systems.
The main focus of the research in this dissertation is to provide an up-to-date assessment of the safety impact of using different toll collection systems, and to provide safety guidelines for these facilities in order to promote safety and enhance mobility on expressways. The study involved an extensive data collection covering one hundred mainline toll plazas located on approximately 750 miles of expressways in Florida. Multiple online data sources maintained by the Florida Department of Transportation were used to identify the traffic, geometric and geographic characteristics of the locations and to determine the most complete and accurate data. Several observational before-after and cross-sectional techniques were used to evaluate the safety effectiveness of applying different treatments on expressways. The before-after methods include Naive Before-After, Before-After with Comparison Group, and Before-After with Empirical Bayes (EB). A set of Safety Performance Functions (SPFs), which predict crash frequency as a function of explanatory variables, was developed at the aggregate level using crash data and the corresponding exposure and risk factors. Results of the aggregate traffic safety analysis can be used to identify hazardous locations (hot spots) such as traditional toll plazas, to predict crash frequency for untreated sites in the after period in the Before-After with EB method, or to derive Crash Modification Factors (CMFs) for the treatment using the cross-sectional method. This type of analysis is usually used to improve geometric characteristics and mainly focuses on discovering the risk factors that are related to total crash frequency, specific crash types, and/or different crash severity levels.
Both simple SPFs (with traffic volume as the only explanatory variable) and full SPFs (with traffic volume and additional explanatory variables) were used to estimate the CMFs, and only the CMFs with the lower standard error were recommended. The results of this study showed that safety improved significantly across all locations that were upgraded from Traditional Mainline Toll Plazas (TMTP) to the Hybrid Mainline Toll Plaza (HMTP) system. This treatment significantly reduced total, Fatal-and-Injury (F+I), and rear-end crashes by 47, 46 and 65 percent, respectively. Moreover, this study examined the traffic safety impact of different designs and of the diverge-and-merge areas of the HMTP. The HMTP design combines either express Open Road Tolling (ORT) lanes on the mainline with separate traditional toll collection to the side (design-1), or traditional toll collection on the mainline with separate ORT lanes to the side (design-2). The study found a significant difference between these designs, with indications that design-1 is safer and that the majority of crashes occurred in the diverge-and-merge areas before and after these facilities. However, design-2 could be a good temporary design at locations with a low share of prepaid transponder (Electronic Toll Collection, ETC) users. In other words, its suitability depends on the percentage of ETC users: as this percentage increases, more traffic needs to diverge and merge, and the design becomes riskier. In addition, the results indicated significant relationships between crash frequency and toll plaza type, annual average daily traffic, and drivers' age.
The analysis showed that the conversion from TMTP to the All-Electronic Toll Collection (AETC) system resulted in an average reduction of 77, 76, and 67 percent for total, F+I, and Property Damage Only (PDO) crashes, respectively; for rear-end and Lane Change Related (LCR) crashes the average reductions were 81 and 75 percent, respectively. The conversion from HMTP to the AETC system enhanced traffic safety by reducing total, F+I, and PDO crashes by an average of 23, 29 and 19 percent, respectively; for rear-end and LCR crashes, the average reductions were 15 and 21 percent, respectively. Based on these results, the use of the AETC system changed toll plazas from the highest-risk sections on expressways to sections similar to regular segments. It can therefore be concluded that the AETC system is an excellent solution to several traffic operations problems as well as environmental and economic problems. For agencies that cannot adopt the HMTP and AETC systems, improving traffic safety at traditional toll plazas should take priority. This study also evaluates the safety effectiveness of implementing High-Occupancy Toll lanes (HOT lanes) and of adding roadway lighting to expressways. The results showed that the implementation of HOT lanes had no significant impact on the roadway segment as a whole (HOT and regular lanes combined). There was, however, a significant difference between the regular lanes and the HOT lanes on the same roadway segment: the crash count increased on the regular lanes and decreased on the HOT lanes. Total and F+I crashes on the HOT lanes were reduced by an average of 25 and 45 percent, respectively. This may be attributable to the fact that the HOT lanes became a highway within a highway. Moreover, adding roadway lighting significantly improved traffic safety on the expressways, reducing night crashes by approximately 35 percent.
Overall, the proposed analyses of the safety effectiveness of different toll collection systems are useful in providing expressway authorities with detailed information on where countermeasures must be implemented. This study provides, for the first time, an up-to-date assessment of the safety impact of using different toll collection systems, and it develops safety guidelines for these systems that should be useful for practitioners and road users. Civil Engineering Engineering
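The percent reductions quoted in the abstract above relate to Crash Modification Factors in a simple way: a CMF of 0.53 corresponds to a 47 percent reduction. The sketch below illustrates this with the Naive Before-After method only. The function names and the crash counts are made up for the example and are not data from the dissertation.

```python
def naive_before_after_cmf(crashes_before, years_before, crashes_after, years_after):
    """Naive Before-After CMF: observed after-period crashes divided by the
    crashes expected had the before-period crash rate simply continued."""
    expected_after = crashes_before * (years_after / years_before)
    return crashes_after / expected_after


def percent_reduction(cmf):
    """Convert a CMF into a percent crash reduction."""
    return 100.0 * (1.0 - cmf)


# Hypothetical counts: 200 crashes in 3 years before, 106 in 3 years after.
cmf = naive_before_after_cmf(200, 3, 106, 3)
print(cmf, percent_reduction(cmf))
```

The naive method ignores changes in traffic volume and regression to the mean, which is why the abstract also lists the Comparison Group and Empirical Bayes variants.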
