Motivated by the imbalance between demand (i.e., passenger requests) and supply (i.e., available vehicles) in the ride-hailing market and by the severe traffic congestion faced by modern cities, this dissertation aims to improve the efficiency of the sharing economy by building an agent-based methodological framework for the optimal decision-making of distributed agents (e.g., autonomous shared vehicles), including passenger-seeking and route choice. Furthermore, noticing that city planners can influence the behavior of these agents through operational measures such as congestion pricing and signal control, this dissertation investigates the overall bilevel problem that involves the decision-making of both distributed agents (the lower level) and central city planners (the upper level).
First, for the task of passenger-seeking, this dissertation proposes a model-based Markov decision process (MDP) approach that incorporates distinct features of e-hailing drivers. The modified MDP approach is found to outperform the baseline (i.e., the local hotspot strategy) in terms of both the rate of return and the utilization rate. Although the modified MDP approach is set up in a single-agent setting, we extend its applicability to multi-agent scenarios through a dynamic adjustment strategy for the order-matching probability, which partially captures the competition among agents. Furthermore, noticing that the reward function is commonly assumed to be known a priori, this dissertation recovers the underlying reward function of the overall e-hailing driver population (i.e., 44,000 Didi drivers in Beijing) through an inverse reinforcement learning method, which paves the way for future research on uncovering the reward mechanism of a complex and dynamic ride-hailing market.
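As a rough illustration of such a model-based MDP formulation, the sketch below performs backward-induction value iteration over (zone, time) states: an idle driver cruises toward a candidate zone and is either matched there (earning a fare and continuing from the trip destination) or keeps seeking. This is a minimal sketch under our own assumptions, not the dissertation's implementation; all names and data structures (p_match, trip_reward, trip_dest, neighbors, travel_time, relocation_cost) are illustrative.

```python
import numpy as np

def solve_seeking_mdp(n_zones, horizon, p_match, trip_reward, trip_dest,
                      neighbors, travel_time, relocation_cost):
    """Backward-induction value iteration over (zone, time) states.
    Time-indexed arrays are assumed to cover steps 0..horizon."""
    V = np.zeros((n_zones, horizon + 1))            # V[z, t]: expected future earnings
    policy = np.zeros((n_zones, horizon), dtype=int)

    for t in range(horizon - 1, -1, -1):
        for z in range(n_zones):
            best_q, best_a = -np.inf, z
            for nz in neighbors[z]:                  # zone to cruise toward (may be z itself)
                arrive = min(t + travel_time[z][nz], horizon)
                # In a multi-agent setting, p_match[nz][arrive] could be scaled down by the
                # number of competing idle drivers, mimicking the dynamic adjustment strategy.
                q = (p_match[nz][arrive]
                     * (trip_reward[nz][arrive]
                        + V[trip_dest[nz][arrive], min(arrive + 1, horizon)])
                     + (1.0 - p_match[nz][arrive]) * V[nz, min(arrive + 1, horizon)]
                     - relocation_cost[z][nz])
                if q > best_q:
                    best_q, best_a = q, nz
            V[z, t], policy[z, t] = best_q, best_a
    return V, policy

# tiny synthetic example: 4 zones, 10 time steps, fully connected
rng = np.random.default_rng(0)
Z, H = 4, 10
V, policy = solve_seeking_mdp(
    n_zones=Z, horizon=H,
    p_match=rng.uniform(0.1, 0.6, (Z, H + 1)),
    trip_reward=rng.uniform(5, 20, (Z, H + 1)),
    trip_dest=rng.integers(0, Z, (Z, H + 1)),
    neighbors=[list(range(Z)) for _ in range(Z)],
    travel_time=np.ones((Z, Z), dtype=int),
    relocation_cost=np.full((Z, Z), 1.0),
)
```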
To better capture the competition among agents, this dissertation develops a model-free mean-field multi-agent actor-critic algorithm for multi-driver passenger-seeking. A bilevel optimization model is then formulated, with the upper level as a reward design mechanism and the lower level as a multi-agent system. We use the developed mean-field multi-agent actor-critic algorithm to solve for the optimal passenger-seeking policies of distributed agents at the lower level and Bayesian optimization to solve for the optimal control of city planners at the upper level. The bilevel optimization model is applied to a real-world, large-scale, multi-class taxi driver repositioning task with congestion pricing as the upper-level control. The results show that the derived optimal toll charge effectively improves the city planners' objective.
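A minimal sketch of this bilevel structure, under assumptions of our own, is shown below: Bayesian optimization (here scikit-optimize's gp_minimize, used only as a stand-in for the dissertation's upper-level solver) searches over a hypothetical zone-based toll, and each evaluation calls a lower-level routine that would train the mean-field multi-agent actor-critic policies under that toll and report the planner's objective. The lower level below is a synthetic placeholder just so the sketch runs.

```python
import numpy as np
from skopt import gp_minimize

N_TOLLED_ZONES = 3          # hypothetical number of zones eligible for a toll

def planner_objective(toll):
    """Lower level (placeholder): in the real bilevel model this would train the
    mean-field multi-agent actor-critic policies under `toll` until they stabilize,
    simulate the resulting repositioning behavior, and return the city planner's
    objective (negated if it is to be maximized, since gp_minimize minimizes).
    Here we return a synthetic smooth function only to make the sketch executable."""
    toll = np.asarray(toll)
    return float(np.sum((toll - 2.0) ** 2) + 0.1 * np.sin(toll).sum())

result = gp_minimize(
    planner_objective,
    dimensions=[(0.0, 5.0)] * N_TOLLED_ZONES,   # toll bounds per tolled zone
    n_calls=30,
    random_state=0,
)
print("toll per zone:", result.x, "objective:", result.fun)
```

The appeal of Bayesian optimization at the upper level is that each lower-level evaluation (training a multi-agent system to convergence) is expensive, so a sample-efficient black-box search is a natural fit.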
With agents knowing where to go (i.e., passenger-seeking), this dissertation then applies the bilevel optimization model to the question of how to get there (i.e., route choice). Unlike passenger-seeking, where the action space is always fixed-dimensional, route choice involves a variable action set, since the set of outgoing links differs from node to node. Therefore, a flow-dependent deep Q-learning algorithm is proposed to efficiently derive the optimal policies for multi-commodity, multi-class agents. We demonstrate the effect of two countermeasures, namely tolling and signal control, on the behavior of travelers and show that the system-level objective of city planners can be optimized by proper control.
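One common way to accommodate a variable action set, sketched below in PyTorch under illustrative assumptions (the state and link features are ours, not the dissertation's exact design), is to score each candidate outgoing link with a shared Q-network that takes the state together with flow-dependent link features, so the number of Q-values automatically matches however many links are available at the current node.

```python
import torch
import torch.nn as nn

class FlowDependentQNet(nn.Module):
    def __init__(self, state_dim, action_feat_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),       # one scalar Q-value per (state, action) pair
        )

    def forward(self, state, action_feats):
        # state:        (state_dim,)           e.g., current node, destination, time
        # action_feats: (n_actions, feat_dim)  one row per available outgoing link,
        #                                      including its current flow (flow-dependent)
        n = action_feats.shape[0]
        x = torch.cat([state.expand(n, -1), action_feats], dim=1)
        return self.net(x).squeeze(-1)  # (n_actions,) Q-values, however many links exist

# greedy choice over whatever links happen to be available at this node
qnet = FlowDependentQNet(state_dim=4, action_feat_dim=3)
state = torch.tensor([0.2, 0.7, 0.1, 0.5])
links = torch.tensor([[0.3, 1.2, 0.0],   # [flow, free-flow time, toll] per outgoing link
                      [0.8, 0.9, 0.5]])
best_link = torch.argmax(qnet(state, links)).item()
```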
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/d8-qn07-w207
Date | January 2021
Creators | Shou, Zhenyu
Source Sets | Columbia University
Language | English
Detected Language | English
Type | Theses