1 |
Counter piracy: a repeated game with asymmetric information / Marsh, Christopher D. January 2009 (has links) (PDF)
Thesis (M.S. in Operations Research)--Naval Postgraduate School, September 2009. / Thesis Advisor(s): Lin, Kyle Y. "September 2009." Description based on title screen as viewed on 5 November 2009. Author(s) subject terms: Piracy, game theory, Bayesian update. Includes bibliographical references (p. 37-38). Also available in print.
|
2 |
Game-theoretic coordination and configuration of multi-level supply chains / Huang, Yun, 黄赟 January 2010 (has links)
published_or_final_version / Industrial and Manufacturing Systems Engineering / Doctoral / Doctor of Philosophy
|
3 |
Essays on Dynamic Optimization for Markets and Networks / Gan, Yuanling January 2023 (has links)
We study dynamic decision-making problems in networks and markets under uncertainty about future payoffs. This problem is difficult in general for two reasons: 1) although the current decision (potentially) affects future decisions, the decision-maker does not have exact information on future payoffs when committing to the current decision; 2) a decision made at one part of the network usually interacts with decisions made at other parts of the network, which makes the computation scale rapidly with the network size and brings computational challenges in practice. In this thesis, we propose computationally efficient methods to solve dynamic optimization problems on markets and networks, specify a general set of conditions under which the proposed methods give theoretical guarantees on global near-optimality, and provide numerical studies to verify the performance empirically. The proposed methods share a common theme of "local algorithms", meaning that the decision at each node/agent on the network uses only partial information about the network.
In the first part of this thesis, we consider a network model with stochastic uncertainty about future payoffs. The network has bounded degree, and each node takes a discrete decision at each period, leading to a per-period payoff that is a sum of three parts: node rewards for individual node decisions, temporal interactions between individual node decisions from the current and previous periods, and spatial interactions between decisions from pairs of neighboring nodes. The objective is to maximize the expected total payoffs over a finite horizon. We study a natural decentralized algorithm (whose computational requirement is linear in the network size and planning horizon) and prove that it achieves global near-optimality when temporal and spatial interactions are not dominant compared to the randomness in node rewards. Decentralized algorithms are parameterized by the locality parameter L: an L-local algorithm makes its decision at each node v based on current and (simulated) future payoffs only up to L periods ahead, and only in an L-radius neighborhood around v. Given any permitted error ε > 0, we show that our proposed L-local algorithm with L = O(log(1/ε)) has an average per-node-per-period optimality gap bounded above by ε in networks where temporal and spatial interactions are not dominant. This constitutes the first theoretical result establishing the global near-optimality of a local algorithm for network dynamic optimization.
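(As an illustration of the locality idea only: a minimal, self-contained sketch of an L-local rule on a line network with binary decisions. The payoff functions, parameter values, and names such as `local_decision` are illustrative assumptions, not the thesis's actual model or algorithm.)

```python
import itertools
import random

random.seed(0)
n, T, L = 10, 5, 1            # nodes on a line, planning horizon, locality radius
theta_T, theta_S = 0.1, 0.1   # temporal / spatial interaction strengths (non-dominant)

# Random per-node, per-period rewards for each binary decision -- the dominant term.
reward = {(v, t, a): random.gauss(0, 1)
          for v in range(n) for t in range(T) for a in (0, 1)}

def local_decision(v, t, past):
    """Decide node v at period t by enumerating all decisions of the L-radius
    neighborhood over the next L+1 periods, then committing only v's choice."""
    nbhd = [u for u in range(n) if abs(u - v) <= L]
    periods = range(t, min(t + L + 1, T))
    cells = [(u, s) for s in periods for u in nbhd]
    best_val, best_act = -float("inf"), 0
    for assign in itertools.product((0, 1), repeat=len(cells)):
        x = dict(zip(cells, assign))
        val = sum(reward[u, s, x[u, s]] for (u, s) in cells)
        for (u, s) in cells:
            prev = x.get((u, s - 1), past.get(u) if s == t else None)
            if prev is not None:                       # temporal interaction
                val += theta_T * (1.0 if x[u, s] == prev else -1.0)
            if (u + 1, s) in x:                        # spatial interaction
                val += theta_S * (1.0 if x[u, s] == x[u + 1, s] else -1.0)
        if val > best_val:
            best_val, best_act = val, x[v, t]
    return best_act

past = {}
for t in range(T):
    past = {v: local_decision(v, t, past) for v in range(n)}  # one pass per period
print("final-period decisions:", past)
```

Each node's decision looks only at rewards within distance L and at most L periods ahead, so the per-node work is independent of the network size, which is the sense in which the algorithm is decentralized.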
In the second part of this thesis, we consider the same three types of payoff functions under adversarial uncertainty about the future. In general, there are no performance guarantees for arbitrary payoff functions, so we impose an additional convexity structure on the individual node payoffs and interaction functions, which lets us leverage tools from the broad Online Convex Optimization literature. We study the setting where there is a trade-off between developing future predictions over a longer lookahead horizon, denoted k, versus increasing the spatial radius for decentralized computation, denoted r. When making its decision at each time step, each node has access to predictions of local cost functions for the next k time steps in an r-hop neighborhood. We propose a novel online algorithm, Localized Predictive Control (LPC), which generalizes predictive control to multi-agent systems. We show that, in an adversarial setting, LPC achieves a competitive ratio approaching 1 exponentially fast in k and r, at rates governed by constants ρT and ρS in (0, 1) that increase with the relative strength of temporal and spatial interaction costs, respectively. This is the first competitive ratio bound on decentralized predictive control for networked online convex optimization. Further, we show that the dependence on k and r in our results is near-optimal by lower bounding the competitive ratio of any decentralized online algorithm.
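(For intuition on the k-step lookahead mechanic that LPC builds on: a single-agent receding-horizon toy with quadratic hitting and switching costs. This is standard predictive control, not LPC itself; the multi-agent, r-hop aspect is omitted, and all names and parameter values are illustrative.)

```python
import numpy as np

rng = np.random.default_rng(6)
T, k, lam = 60, 5, 1.0                    # horizon, lookahead window, switching weight
theta = np.cumsum(rng.normal(0, 1, T))    # drifting per-step cost minimizers

def predictive_step(x_prev, preds):
    """Minimize sum_i (x_i - preds[i])^2 + lam*(x_i - x_{i-1})^2 over the
    lookahead window (a small tridiagonal linear system), commit only x_0."""
    m = len(preds)
    H = np.zeros((m, m))
    b = np.array(preds, dtype=float)
    for i in range(m):
        H[i, i] = 1 + lam + (lam if i + 1 < m else 0)
        if i + 1 < m:
            H[i, i + 1] = H[i + 1, i] = -lam
    b[0] += lam * x_prev                  # coupling to the committed past decision
    return np.linalg.solve(H, b)[0]

x, total = 0.0, 0.0
for t in range(T):
    x_new = predictive_step(x, theta[t:t + k])   # exact predictions, k steps ahead
    total += (x_new - theta[t]) ** 2 + lam * (x_new - x) ** 2
    x = x_new
print("receding-horizon cost with k=%d lookahead: %.1f" % (k, total))
```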
In the third part of this work, we consider a general dynamic matching model for online competitive gaming platforms. Players arrive stochastically with a skill attribute, the Elo rating; ratings are i.i.d. across players with a known distribution, but an individual's rating is only observed upon arrival. Matching two players with different skills incurs a match cost. The goal is to minimize a weighted combination of waiting costs and matching costs in the system. We investigate a popular heuristic used in industry to trade off between these two costs, the Bubble algorithm. The algorithm places arriving players on the Elo line with a growing bubble around them; when two bubbles touch, the two players are matched. We show that, with the optimal bubble expansion rate, the Bubble algorithm achieves a constant-factor ratio against the offline optimal cost when the match cost (resp. waiting cost) is a power of the Elo difference (resp. waiting time). We use players' activity-log data from a gaming start-up to validate our approach and provide guidance on how to tune the bubble expansion rate in practice.
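(A simulation sketch of the bubble mechanic as described above: bubbles grow linearly at rate GAMMA and a pair is matched when their bubbles touch. The arrival process, Elo distribution, discrete time step, and adjacent-pair scan are illustrative choices, not the thesis's model or its optimal expansion rate.)

```python
import random

random.seed(1)
GAMMA = 5.0          # bubble expansion rate (Elo points per unit of waiting time)

def simulate_bubble(arrivals, dt=0.1):
    """Discrete-time sketch of the Bubble heuristic: every waiting player's
    bubble radius grows at rate GAMMA, and two players are matched once their
    bubbles touch, i.e. |elo_i - elo_j| <= GAMMA * (wait_i + wait_j)."""
    waiting, matches, i = [], [], 0
    t, t_end = 0.0, arrivals[-1][0] + 60.0
    while t < t_end:
        while i < len(arrivals) and arrivals[i][0] <= t:
            waiting.append(arrivals[i])
            i += 1
        waiting.sort(key=lambda p: p[1])          # scan adjacent-in-Elo pairs
        j = 0
        while j + 1 < len(waiting):
            (ta, ea), (tb, eb) = waiting[j], waiting[j + 1]
            if abs(ea - eb) <= GAMMA * ((t - ta) + (t - tb)):
                matches.append((abs(ea - eb), max(t - ta, t - tb)))
                del waiting[j:j + 2]
            else:
                j += 1
        t += dt
    return matches

# Poisson arrivals (rate 1) with Normal(1500, 200) Elo ratings -- toy numbers.
t, arrivals = 0.0, []
for _ in range(200):
    t += random.expovariate(1.0)
    arrivals.append((t, random.gauss(1500, 200)))

matches = simulate_bubble(arrivals)
gaps, waits = zip(*matches)
print("%d matches | mean Elo gap %.1f | mean max-wait %.1f"
      % (len(matches), sum(gaps) / len(gaps), sum(waits) / len(waits)))
```

Raising GAMMA trades match quality for waiting time: bubbles touch sooner, so waits shrink while Elo gaps grow, which is exactly the trade-off the expansion rate tunes.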
|
4 |
Learning in Partially Observable Markov Decision Processes / Sachan, Mohit 21 August 2013 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Learning in Partially Observable Markov Decision Processes (POMDPs) is motivated by the essential need to address a number of realistic problems. A number of methods exist for learning in POMDPs, but learning with a limited amount of information about the model remains a highly desirable capability. Learning with minimal information is attractive in complex systems because methods that require complete information among decision makers become impractical as the problem dimensionality grows.
In this thesis we address the problem of decentralized control of POMDPs with unknown transition probabilities and rewards. We suggest learning in POMDPs using a tree-based approach: states of the POMDP are guessed using this tree, and each node in the tree contains an automaton and acts as a decentralized decision maker for the POMDP. The start state of the POMDP is known as the landmark state. Each automaton in the tree uses a simple learning scheme to update its action choice and requires minimal information. The principal result is that, without prior knowledge of transition probabilities and rewards, the automata tree of decision makers converges to a set of actions that maximizes the long-term expected reward per unit time obtained by the system. The analysis is based on learning in sequential stochastic games and properties of ergodic Markov chains. Simulation results are presented to compare the long-term rewards of the system under different decision control algorithms.
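(A minimal sketch of one decision maker of the kind the tree is built from, using a standard linear reward-inaction (L_R-I) automaton update; the two-action toy environment and its reward probabilities are invented for illustration and are not the thesis's POMDP setup.)

```python
import random

random.seed(2)

class LRIAutomaton:
    """Linear reward-inaction (L_R-I) learning automaton: on a favorable
    response, shift probability mass toward the chosen action; on an
    unfavorable response, leave the probabilities unchanged."""
    def __init__(self, n_actions, rate=0.05):
        self.p = [1.0 / n_actions] * n_actions
        self.rate = rate

    def choose(self):
        r, acc = random.random(), 0.0
        for a, pa in enumerate(self.p):
            acc += pa
            if r <= acc:
                return a
        return len(self.p) - 1

    def update(self, action, rewarded):
        if rewarded:  # favorable response: reinforce the chosen action
            for a in range(len(self.p)):
                if a == action:
                    self.p[a] += self.rate * (1.0 - self.p[a])
                else:
                    self.p[a] *= (1.0 - self.rate)

# Toy environment: action 1 is rewarded with prob 0.8, action 0 with prob 0.4.
auto = LRIAutomaton(n_actions=2)
for _ in range(5000):
    a = auto.choose()
    auto.update(a, random.random() < (0.8 if a == 1 else 0.4))
print("learned action probabilities:", [round(x, 3) for x in auto.p])
```

Because the update touches only the automaton's own probability vector and a binary environment response, each decision maker needs no knowledge of transition probabilities, rewards, or the other automata, which is the "minimal information" property the abstract emphasizes.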
|
5 |
Optimization and Decision-Making in Decentralized Finance, Scheduling, and Graphical Game Theory / Patange, Utkarsh January 2024 (has links)
We consider the problem of optimization and decision-making in various settings involving complex systems. In particular, we consider specific problems in decentralized finance, which we address using insights from mathematical finance; in course-mode selection, which we solve by applying mixed-integer programming; and in social networks, which we approach using tools from graphical game theory.
In the first part of the thesis, we model and analyze fixed spread liquidation lending in DeFi as implemented by popular pooled lending protocols such as AAVE, JustLend, and Compound.
Empirically, we observe that over 70% of liquidations occur in the absence of any downward price jumps. Then, assuming that borrowers monitor their loans at exponentially distributed horizons, we compute the expected liquidation cost incurred by borrowers in closed form as a function of the monitoring frequency. We compare this cost against liquidation data obtained from AAVE protocol V2 and observe a match with our model when borrowers monitor their loans five to six times more often than they interact with the pool. Such borrowers must balance the financing cost against the likelihood of liquidation. We compute the optimal health factor in this situation assuming a financing rate for the collateral. Empirically, we observe that borrowers are often more conservative than the model predicts, though on average, model predictions match empirical observations.
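(The closed-form result is not reproduced here; instead, a Monte Carlo toy of the underlying trade-off, under assumptions that are ours: driftless geometric Brownian collateral value, instantaneous liquidators, a fixed-spread penalty, and illustrative parameter values.)

```python
import math
import random

random.seed(3)

def expected_liq_cost(monitor_rate, sigma=0.8, spread=0.05, hf0=1.5,
                      horizon=1.0, dt=1.0 / 365, n_paths=2000):
    """Monte Carlo toy: collateral value follows a driftless GBM; liquidators
    act instantly once the health factor drops below 1 (cost = fixed spread,
    per unit debt); the borrower monitors at Exponential(monitor_rate)
    intervals and resets the health factor to hf0 whenever it checks."""
    total = 0.0
    for _ in range(n_paths):
        hf, cost, t = hf0, 0.0, 0.0
        next_check = random.expovariate(monitor_rate)
        while t < horizon:
            shock = -0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * random.gauss(0, 1)
            hf *= math.exp(shock)
            t += dt
            if hf < 1.0:              # liquidated with a fixed-spread penalty
                cost += spread
                hf = hf0
            if t >= next_check:       # borrower monitors and tops up collateral
                hf = hf0
                next_check += random.expovariate(monitor_rate)
        total += cost
    return total / n_paths

for rate in (12, 52, 365):            # ~monthly, ~weekly, ~daily monitoring
    print("monitoring rate %3d/yr -> expected liquidation cost %.4f"
          % (rate, expected_liq_cost(rate)))
```

As the monitoring rate grows, the expected liquidation cost falls, capturing the trade-off the abstract describes between attention paid to the loan and losses to liquidators.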
In the second part of the thesis, we consider the problem of hybrid scheduling that was faced by Columbia Business School during the Covid-19 pandemic and describe the system that we implemented to address it. The system allows some students to attend in-person classes with social distancing, while their peers attend online, and schedules vary by day. We consider two variations of this problem: one where students have unique, individualized class enrollments, and one where they are grouped in teams that are enrolled in identical classes. We formulate both problems as mixed-integer programs.
In the first setting, students who are scheduled to attend all classes in person on a given day may, at times, be required to attend a particular class on that day online due to social distancing constraints. We count these instances as “excess.” We minimize excess and related objectives, and analyze and solve the relaxed linear program. In the second setting, we schedule the teams so that each team’s in-person attendance is balanced over days of week and spread out over the entire term. Our objective is to maximize interaction between different teams. Our program was used to schedule over 2,500 students in student-level scheduling and about 790 students in team-level scheduling from the Fall 2020 through Summer 2021 terms at Columbia Business School.
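(A toy version of the student-level formulation as a mixed-integer program, sketched with the PuLP modeling library; the data, the two-days-on-campus rule, and all variable names are illustrative, not the actual Columbia Business School model.)

```python
import random
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

random.seed(4)
students = ["s%02d" % i for i in range(12)]
days = ["Mon", "Wed", "Fri"]
# (class id, meeting day, socially distanced room capacity)
classes = [("c1", "Mon", 4), ("c2", "Mon", 3), ("c3", "Wed", 5), ("c4", "Fri", 4)]
enroll = {c: random.sample(students, 8) for (c, _, _) in classes}

prob = LpProblem("hybrid_scheduling", LpMinimize)
y = LpVariable.dicts("oncampus", (students, days), cat=LpBinary)   # on campus that day
z = {(s, c): LpVariable("inperson_%s_%s" % (s, c), cat=LpBinary)   # attends c in person
     for (c, _, _) in classes for s in enroll[c]}

for (c, d, cap) in classes:
    prob += lpSum(z[s, c] for s in enroll[c]) <= cap          # distancing capacity
    for s in enroll[c]:
        prob += z[s, c] <= y[s][d]    # in-person attendance requires being on campus

for s in students:
    prob += lpSum(y[s][d] for d in days) == 2                 # two on-campus days each

# "excess": on campus on the day class c meets, yet forced to attend c online
prob += lpSum(y[s][d] - z[s, c] for (c, d, _) in classes for s in enroll[c])
prob.solve()
print("total excess:", value(prob.objective))
```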
In the third part of the thesis, we consider a social network where individuals choose actions that optimize a utility function depending on their neighbors' actions. We assume that a central authority aiming to maximize social welfare at equilibrium can intervene by paying some cost to shift individual incentives, and that the cost is bounded above by a budget. The intervention that maximizes social welfare can be computed using the spectral decomposition of the graph's adjacency matrix, yet this is infeasible in practice if the adjacency matrix is unknown.
We study the question of designing intervention strategies for graphs where the adjacency matrix is unknown and is drawn from some distribution. For several commonly studied random graph models, we show that the competitive ratio of an intervention proportional to the first eigenvector of the expected adjacency matrix approaches 1 in probability as the graph size increases. We also provide several efficient sampling-based approaches for approximately recovering the first eigenvector when the distribution is unknown.
On the whole, our analysis compares three categories of interventions: those which use no data about the network, those which use some data (such as distributional knowledge or queries to the graph), and those which are fully optimal. We evaluate these intervention strategies on synthetic and real-world network data, and our results suggest that analysis of random graph models can be useful for determining when certain heuristics may perform well in practice.
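(A numerical sketch of the comparison on an Erdős–Rényi graph: intervening along the realized adjacency matrix's first eigenvector versus along the expected adjacency matrix's first eigenvector, scored by a quadratic welfare proxy. The proxy and all parameters are illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(5)

def first_eigvec(M):
    """Leading eigenvector of a symmetric matrix, oriented to be nonnegative."""
    _, vecs = np.linalg.eigh(M)
    v = vecs[:, -1]
    return v if v.sum() >= 0 else -v

n, p, budget = 200, 0.05, 10.0
A = (rng.random((n, n)) < p).astype(float)   # one realization of G(n, p)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric, zero diagonal

v_opt = first_eigvec(A)                                   # knows the realized graph
v_exp = first_eigvec(p * (np.ones((n, n)) - np.eye(n)))   # knows only E[A]

for name, v in [("realized A", v_opt), ("expected A", v_exp)]:
    x = np.sqrt(budget) * v          # spend the whole budget along direction v
    print("%-10s -> welfare proxy %.2f" % (name, x @ A @ x))
```

For dense enough random graphs the two scores come out close, which mirrors the competitive-ratio-approaching-1 phenomenon described above: distributional knowledge of the graph is nearly as good as knowing the realized network.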
|