• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 4
  • 4
  • 4
  • 4
  • 3
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Counter piracy a repeated game with asymmetric information /

Marsh, Christopher D. January 2009 (has links) (PDF)
Thesis (M.S. in Operations Research)--Naval Postgraduate School, September 2009. / Thesis Advisor(s): Lin, Kyle Y. "September 2009." Description based on title screen as viewed on 5 November 2009. Author(s) subject terms: Piracy, game theory, Bayesian update. Includes bibliographical references (p. 37-38). Also available in print.
2

Game-theoretic coordination and configuration of multi-level supply chains

Huang, Yun, 黄赟 January 2010 (has links)
published_or_final_version / Industrial and Manufacturing Systems Engineering / Doctoral / Doctor of Philosophy
3

Essays on Dynamic Optimization for Markets and Networks

Gan, Yuanling January 2023 (has links)
We study dynamic decision-making problems in networks and markets under uncertainty about future payoffs. This problem is difficult in general since 1) Although the current decision (potentially) affects future decisions, the decision-maker does not have exact information on the future payoffs when he/she commits to the current decision; 2) The decision made at one part of the network usually interacts with the decisions made at the other parts of the network, which makes the computation scales very fast with the network size and brings computational challenges in practice. In this thesis, we propose computationally efficient methods to solve dynamic optimization problems on markets and networks, specify a general set of conditions under which the proposed methods give theoretical guarantees on global near-optimality, and further provide numerical studies to verify the performance empirically. The proposed methods/algorithms have a general theme as “local algorithms”, meaning that the decision at each node/agent on the network uses only partial information on the network. In the first part of this thesis, we consider a network model with stochastic uncertainty about future payoffs. The network has a bounded degree, and each node takes a discrete decision at each period, leading to a per-period payoff which is a sum of three parts: node rewards for individual node decisions, temporal interactions between individual node decisions from the current and previous periods, and spatial interactions between decisions from pairs of neighboring nodes. The objective is to maximize the expected total payoffs over a finite horizon. We study a natural decentralized algorithm (whose computational requirement is linear in the network size and planning horizon) and prove that our decentralized algorithm achieves global near-optimality when temporal and spatial interactions are not dominant compared to the randomness in node rewards. Decentralized algorithms are parameterized by the locality parameter L: An L-local algorithm makes its decision at each node v based on current and (simulated) future payoffs only up to L periods ahead, and only in an L-radius neighborhood around v. Given any permitted error ε > 0, we show that our proposed L-local algorithm with L = O(log(1/ε)) has an average per-node-per- period optimality gap bounded above by ε, in networks where temporal and spatial interactions are not dominant. This constitutes the first theoretical result establishing the global near-optimality of a local algorithm for network dynamic optimization. In the second part of this thesis, we consider the previous three types of payoff functions under adversarial uncertainty about the future. In general, there are no performance guarantees for arbitrary payoff functions. We consider an additional convexity structure in the individual node payoffs and interaction functions, which helps us leverage the tools in the broad Online Convex Optimization literature. In this work, we study the setting where there is a trade-off between developing future predictions for a longer lookahead horizon, denoted as k versus increasing spatial radius for decentralized computation, denoted as r. When deciding individual node decisions at each time, each node has access to predictions of local cost functions for the next k time steps in an r-hop neighborhood. Our work proposes a novel online algorithm, Localized Predictive Control (LPC), which generalizes predictive control to multi-agent systems. We show that LPC achieves a competitive ratio approaching to 1 exponentially fast in ρT and ρS in an adversarial setting, where ρT and ρS are constants in (0, 1) that increase with the relative strength of temporal and spatial interaction costs, respectively. This is the first competitive ratio bound on decentralized predictive control for networked online convex optimization. Further, we show that the dependence on k and r in our results is near-optimal by lower bounding the competitive ratio of any decentralized online algorithm. In the third part of this work, we consider a general dynamic matching model for online competitive gaming platforms. Players arrive stochastically with a skill attribute, the Elo rating. The distribution of Elo is known and i.i.d across players. However, the individual’s rating is only observed upon arrival. Matching two players with different skills incurs a match cost. The goal is tominimize a weighted combination of waiting costs and matching costs in the system. We investigate a popular heuristic used in industry to trade-off between these two costs, the Bubble algorithm. The algorithm places arriving players on the Elo line with a growing bubble around them. When two bubbles touch, the two players get matched. We show that, with the optimal bubble expansion rate, the Bubble algorithm achieves a constant factor ratio against the offline optimal cost when the match cost (resp. waiting cost) is a power of Elo difference (resp. waiting time). We use players’ activity logs data from a gaming start-up to validate our approach and further provide guidance on how to tune the Bubble expansion rate in practice.
4

Learning in Partially Observable Markov Decision Processes

Sachan, Mohit 21 August 2013 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Learning in Partially Observable Markov Decision process (POMDP) is motivated by the essential need to address a number of realistic problems. A number of methods exist for learning in POMDPs, but learning with limited amount of information about the model of POMDP remains a highly anticipated feature. Learning with minimal information is desirable in complex systems as methods requiring complete information among decision makers are impractical in complex systems due to increase of problem dimensionality. In this thesis we address the problem of decentralized control of POMDPs with unknown transition probabilities and reward. We suggest learning in POMDP using a tree based approach. States of the POMDP are guessed using this tree. Each node in the tree has an automaton in it and acts as a decentralized decision maker for the POMDP. The start state of POMDP is known as the landmark state. Each automaton in the tree uses a simple learning scheme to update its action choice and requires minimal information. The principal result derived is that, without proper knowledge of transition probabilities and rewards, the automata tree of decision makers will converge to a set of actions that maximizes the long term expected reward per unit time obtained by the system. The analysis is based on learning in sequential stochastic games and properties of ergodic Markov chains. Simulation results are presented to compare the long term rewards of the system under different decision control algorithms.

Page generated in 0.1028 seconds