• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 3
  • 1
  • Tagged with
  • 5
  • 5
  • 3
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Agent Contribution in Multi-Agent Reinforcement Learning : A Case Study in Remote Electrical Tilt

Emanuelsson, William January 2024 (has links)
As multi-agent reinforcement learning (MARL) continues to evolve and find applications in complex real-world systems, the imperative for explainability in these systems becomes increasingly critical. Central to enhancing this explainability is tackling the credit assignment problem, a key challenge in MARL that involves quantifying the individual contributions of agents toward a common goal. In addressing this challenge, this thesis introduces and explores the application of Local and Global Shapley Values (LSV and GSV) within MARL contexts. These novel adaptations of the traditional Shapley value from cooperative game theory are investigated particularly in the context of optimizing remote electrical tilt in telecommunications antennas. Using both predator-prey and remote electrical tilt environments, the study delves into local and global explanations, examining how the Shapley value can illuminate changes in agent contributions over time and across different states, as well as aggregate these insights over multiple episodes. The research findings demonstrate that the use of Shapley values enhances the understanding of individual agent behaviors, offers insights into policy suboptimalities and environmental nuances, and aids in identifying agent redundancies—a feature with potential applications in energy savings in real-world systems. Altogether, this thesis highlights the considerable potential of employing the Shapley value as a tool in explainable MARL. / I takt med utvecklingen och tillämpningen av multi-agent förstärkningsinlärning (MARL) i komplexa verkliga system, blir behovet av förklarbarhet i dessa system allt mer väsentligt. För att förbättra denna förklarbarhet är det viktigt att lösa problemet med belöningstilldelning, en nyckelutmaning i MARL som innefattar att kvantifiera de enskilda bidragen från agenter mot ett gemensamt mål. I denna uppsats introduceras och utforskas tillämpningen av lokala och globala Shapley-värden (LSV och GSV) inom MARL-sammanhang. Dessa nya anpassningar av det traditionella Shapley-värdet från samarbetsbaserad spelteori undersöks särskilt i sammanhanget av att optimera fjärrstyrda elektriska lutningar i telekommunikationsantenner. Genom att använda både rovdjur-byte och fjärrstyrda elektriska lutningsmiljöer fördjupar studien sig i lokala och globala förklaringar, och undersöker hur Shapley-värdet kan belysa förändringar i agenters bidrag över tid och över olika tillstånd, samt sammanfatta dessa insikter över flera episoder. Resultaten visar att användningen av Shapley-värden förbättrar förståelsen för individuella agentbeteenden, erbjuder insikter i policybrister och miljönyanser, och hjälper till att identifiera agentredundanser – en egenskap med potentiella tillämpningar för energibesparingar i verkliga system. Sammanfattningsvis belyser denna uppsats den betydande potentialen av att använda Shapley-värdet som ett verktyg i förklaringsbar MARL.
2

Game Theoretic and Analytical Approaches to International Cooperation and Investment Problems

Li, Qing 12 May 2001 (has links)
International cooperation and foreign investment issues are two important components of an international economy. The various aspects of research related to such international cooperation and foreign investment decisions are fraught with various complex factors. In this thesis, we consider two specific issues in the arena of international technological cooperation and foreign investments, by using established Operations Research techniques of game theory and multiple criteria decision making. We first analyze regional technological cooperation mechanisms using classical game theory. A concept of regional technological cooperation is developed based on a cooperative game theoretic model, in which a plan of payoff distributions induces an agreement that is acceptable to each participant. Under certain conditions, the underlying game is shown to be convex, and hence to have a nonempty core with the Shapley value allocations belonging to the core. A compensation scheme is devised based on the Shapley value allocations, whereby participants who enjoy a greater payoff with respect to the technological cooperation compensate the participants who receive a relatively lesser payoff via cooperation. In this manner, regional technological cooperation can bring overall benefits to all the involved players in the game. Some insightful examples are provided to illustrate the methodological concept. Next, we discuss a model for analyzing foreign direct investment opportunities and for evaluating related projects based on the International Investment Attracting Force Theory and the technology of fuzzy evaluation. This model is applied to assess the industrial investment projects that were proposed in the â â 95 China's Tumen River Area International Investment and Business Forumâ funded by the United Nations Industrial Development Organization. Accordingly, the projects are classified into groups based on their potential to attract foreign investors. Furthermore, we simulate the actual forming process whereby projects are sequenced and selected for funding by foreign investors based on a sequential update of their effect on the local economy. The results provide a scientific basis for formulating related decisions and policy recommendations regarding the various proposed projects. / Master of Science
3

Survivability Prediction and Analysis using Interpretable Machine Learning : A Study on Protecting Ships in Naval Electronic Warfare

Rydström, Sidney January 2022 (has links)
Computer simulation is a commonly applied technique for studying electronic warfare duels. This thesis aims to apply machine learning techniques to convert simulation output data into knowledge and insights regarding defensive actions for a ship facing multiple hostile missiles. The analysis may support tactical decision-making, hence the interpretability aspect of predictions is necessary to allow for human evaluation and understanding of impacts from the explanatory variables. The final distance for the threats to the target and the probability of the threats hitting the target was modeled using a multi-layer perceptron model with a multi-task approach, including custom loss functions. The results generated in this study show that the selected methodology is more successful than a baseline using regression models. Modeling the outcome with artificial neural networks results in a black box for decision making. Therefore the concept of interpretable machine learning was applied using a post-hoc approach. Given the learned model, the features considered, and the multiple threats, the feature contributions to the model were interpreted using Kernel SHapley Additive exPlanations (SHAP). The method consists of local linear surrogate models for approximating Shapley values. The analysis primarily showed that an increased seeker activation distance was important, and the increased time for defensive actions improved the outcomes. Further, predicting the final distance to the ship at the beginning of a simulation is important and, in general, a guidance of the actual outcome. The action of firing chaff grenades in the tracking gate also had importance. More chaff grenades influenced the missiles' tracking and provided a preferable outcome from the defended ship's point of view.
4

Resource Allocation on Networks: Nested Event Tree Optimization, Network Interdiction, and Game Theoretic Methods

Lunday, Brian Joseph 08 April 2010 (has links)
This dissertation addresses five fundamental resource allocation problems on networks, all of which have applications to support Homeland Security or industry challenges. In the first application, we model and solve the strategic problem of minimizing the expected loss inflicted by a hostile terrorist organization. An appropriate allocation of certain capability-related, intent-related, vulnerability-related, and consequence-related resources is used to reduce the probabilities of success in the respective attack-related actions, and to ameliorate losses in case of a successful attack. Given the disparate nature of prioritizing capital and material investments by federal, state, local, and private agencies to combat terrorism, our model and accompanying solution procedure represent an innovative, comprehensive, and quantitative approach to coordinate resource allocations from various agencies across the breadth of domains that deal with preventing attacks and mitigating their consequences. Adopting a nested event tree optimization framework, we present a novel formulation for the problem as a specially structured nonconvex factorable program, and develop two branch-and-bound schemes based respectively on utilizing a convex nonlinear relaxation and a linear outer-approximation, both of which are proven to converge to a global optimal solution. We also investigate a fundamental special-case variant for each of these schemes, and design an alternative direct mixed-integer programming model representation for this scenario. Several range reduction, partitioning, and branching strategies are proposed, and extensive computational results are presented to study the efficacy of different compositions of these algorithmic ingredients, including comparisons with the commercial software BARON. The developed set of algorithmic implementation strategies and enhancements are shown to outperform BARON over a set of simulated test instances, where the best proposed methodology produces an average optimality gap of 0.35% (compared to 4.29% for BARON) and reduces the required computational effort by a factor of 33. A sensitivity analysis is also conducted to explore the effect of certain key model parameters, whereupon we demonstrate that the prescribed algorithm can attain significantly tighter optimality gaps with only a near-linear corresponding increase in computational effort. In addition to enabling effective comprehensive resource allocations, this research permits coordinating agencies to conduct quantitative what-if studies on the impact of alternative resourcing priorities. The second application is motivated by the author's experience with the U.S. Army during a tour in Iraq, during which combined operations involving U.S. Army, Iraqi Army, and Iraqi Police forces sought to interdict the transport of selected materials used for the manufacture of specialized types of Improvised Explosive Devices, as well as to interdict the distribution of assembled devices to operatives in the field. In this application, we model and solve the problem of minimizing the maximum flow through a network from a given source node to a terminus node, integrating different forms of superadditive synergy with respect to the effect of resources applied to the arcs in the network. Herein, the superadditive synergy reflects the additional effectiveness of forces conducting combined operations, vis-à-vis unilateral efforts. We examine linear, concave, and general nonconcave superadditive synergistic relationships between resources, and accordingly develop and test effective solution procedures for the underlying nonlinear programs. For the linear case, we formulate an alternative model representation via Fourier-Motzkin elimination that reduces average computational effort by over 40% on a set of randomly generated test instances. This test is followed by extensive analyses of instance parameters to determine their effect on the levels of synergy attained using different specified metrics. For the case of concave synergy relationships, which yields a convex program, we design an inner-linearization procedure that attains solutions on average within 3% of optimality with a reduction in computational effort by a factor of 18 in comparison with the commercial codes SBB and BARON for small- and medium-sized problems; and outperforms these softwares on large-sized problems, where both solvers failed to attain an optimal solution (and often failed to detect a feasible solution) within 1800 CPU seconds. Examining a general nonlinear synergy relationship, we develop solution methods based on outer-linearizations, inner-linearizations, and mixed-integer approximations, and compare these against the commercial software BARON. Considering increased granularities for the outer-linearization and mixed-integer approximations, as well as different implementation variants for both these approaches, we conduct extensive computational experiments to reveal that, whereas both these techniques perform comparably with respect to BARON on small-sized problems, they significantly improve upon the performance for medium- and large-sized problems. Our superlative procedure reduces the computational effort by a factor of 461 for the subset of test problems for which the commercial global optimization software BARON could identify a feasible solution, while also achieving solutions of objective value 0.20% better than BARON. The third application is likewise motivated by the author's military experience in Iraq, both from several instances involving coalition forces attempting to interdict the transport of a kidnapping victim by a sectarian militia as well as, from the opposite perspective, instances involving coalition forces transporting detainees between interment facilities. For this application, we examine the network interdiction problem of minimizing the maximum probability of evasion by an entity traversing a network from a given source to a designated terminus, while incorporating novel forms of superadditive synergy between resources applied to arcs in the network. Our formulations examine either linear or concave (nonlinear) synergy relationships. Conformant with military strategies that frequently involve a combination of overt and covert operations to achieve an operational objective, we also propose an alternative model for sequential overt and covert deployment of subsets of interdiction resources, and conduct theoretical as well as empirical comparative analyses between models for purely overt (with or without synergy) and composite overt-covert strategies to provide insights into absolute and relative threshold criteria for recommended resource utilization. In contrast to existing static models, in a fourth application, we present a novel dynamic network interdiction model that improves realism by accounting for interactions between an interdictor deploying resources on arcs in a digraph and an evader traversing the network from a designated source to a known terminus, wherein the agents may modify strategies in selected subsequent periods according to respective decision and implementation cycles. We further enhance the realism of our model by considering a multi-component objective function, wherein the interdictor seeks to minimize the maximum value of a regret function that consists of the evader's net flow from the source to the terminus; the interdictor's procurement, deployment, and redeployment costs; and penalties incurred by the evader for misperceptions as to the interdicted state of the network. For the resulting minimax model, we use duality to develop a reformulation that facilitates a direct solution procedure using the commercial software BARON, and examine certain related stability and convergence issues. We demonstrate cases for convergence to a stable equilibrium of strategies for problem structures having a unique solution to minimize the maximum evader flow, as well as convergence to a region of bounded oscillation for structures yielding alternative interdictor strategies that minimize the maximum evader flow. We also provide insights into the computational performance of BARON for these two problem structures, yielding useful guidelines for other research involving similar non-convex optimization problems. For the fifth application, we examine the problem of apportioning railcars to car manufacturers and railroads participating in a pooling agreement for shipping automobiles, given a dynamically determined total fleet size. This study is motivated by the existence of such a consortium of automobile manufacturers and railroads, for which the collaborative fleet sizing and efforts to equitably allocate railcars amongst the participants are currently orchestrated by the \textit{TTX Company} in Chicago, Illinois. In our study, we first demonstrate potential inequities in the industry standard resulting either from failing to address disconnected transportation network components separately, or from utilizing the current manufacturer allocation technique that is based on average nodal empty transit time estimates. We next propose and illustrate four alternative schemes to apportion railcars to manufacturers, respectively based on total transit time that accounts for queuing; two marginal cost-induced methods; and a Shapley value approach. We also provide a game-theoretic insight into the existing procedure for apportioning railcars to railroads, and develop an alternative railroad allocation scheme based on capital plus operating costs. Extensive computational results are presented for the ten combinations of current and proposed allocation techniques for automobile manufacturers and railroads, using realistic instances derived from representative data of the current business environment. We conclude with recommendations for adopting an appropriate apportionment methodology for implementation by the industry. / Ph. D.
5

Duplicate detection of multimodal and domain-specific trouble reports when having few samples : An evaluation of models using natural language processing, machine learning, and Siamese networks pre-trained on automatically labeled data / Dublettdetektering av multimodala och domänspecifika buggrapporter med få träningsexempel : En utvärdering av modeller med naturlig språkbehandling, maskininlärning, och siamesiska nätverk förtränade på automatiskt märkt data

Karlstrand, Viktor January 2022 (has links)
Trouble and bug reports are essential in software maintenance and for identifying faults—a challenging and time-consuming task. In cases when the fault and reports are similar or identical to previous and already resolved ones, the effort can be reduced significantly making the prospect of automatically detecting duplicates very compelling. In this work, common methods and techniques in the literature are evaluated and compared on domain-specific and multimodal trouble reports from Ericsson software. The number of samples is few, which is a case not so well-studied in the area. On this basis, both traditional and more recent techniques based on deep learning are considered with the goal of accurately detecting duplicates. Firstly, the more traditional approach based on natural language processing and machine learning is evaluated using different vectorization techniques and similarity measures adapted and customized to the domain-specific trouble reports. The multimodality and many fields of the trouble reports call for a wide range of techniques, including term frequency-inverse document frequency, BM25, and latent semantic analysis. A pipeline processing each data field of the trouble reports independently and automatically weighing the importance of each data field is proposed. The best performing model achieves a recall rate of 89% for a duplicate candidate list size of 10. Further, obtaining knowledge on which types of data are most important for duplicate detection is explored through what is known as Shapley values. Results indicate that utilizing all types of data indeed improve performance, and that date and code parameters are strong indicators. Secondly, a Siamese network based on Transformer-encoders is evaluated on data fields believed to have some underlying representation of the semantic meaning or sequentially important information, which a deep model can capture. To alleviate the issues when having few samples, pre-training through automatic data labeling is studied. Results show an increase in performance compared to not pre-training the Siamese network. However, compared to the more traditional model it performs on par, indicating that traditional models may perform equally well when having few samples besides also being simpler, more robust, and faster. / Buggrapporter är kritiska för underhåll av mjukvara och för att identifiera fel — en utmanande och tidskrävande uppgift. I de fall då felet och rapporterna liknar eller är identiska med tidigare och redan lösta ärenden, kan tiden som krävs minskas avsevärt, vilket gör automatiskt detektering av dubbletter mycket önskvärd. I detta arbete utvärderas och jämförs vanliga metoder och tekniker i litteraturen på domänspecifika och multimodala buggrapporter från Ericssons mjukvara. Antalet tillgängliga träningsexempel är få, vilket inte är ett så välstuderat fall. Utifrån detta utvärderas både traditionella samt nyare tekniker baserade på djupinlärning med målet att detektera dubbletter så bra som möjligt. Först utvärderas det mer traditionella tillvägagångssättet baserat på naturlig språkbearbetning och maskininlärning med hjälp av olika vektoriseringstekniker och likhetsmått specialanpassade till buggrapporterna. Multimodaliteten och de många datafälten i buggrapporterna kräver en rad av tekniker, så som termfrekvens-invers dokumentfrekvens, BM25 och latent semantisk analys. I detta arbete föreslås en modell som behandlar varje datafält i buggrapporterna separat och automatiskt sammanväger varje datafälts betydelse. Den bäst presterande modellen uppnår en återkallningsfrekvens på 89% för en lista med 10 dubblettkandidater. Vidare undersöks vilka datafält som är mest viktiga för dubblettdetektering genom Shapley-värden. Resultaten tyder på att utnyttja alla tillgängliga datafält förbättrar prestandan, och att datum och kodparametrar är starka indikatorer. Sedan utvärderas ett siamesiskt nätverk baserat på Transformator-kodare på datafält som tros ha en underliggande representation av semantisk betydelse eller sekventiellt viktig information, vilket en djup modell kan utnyttja. För att lindra de problem som uppstår med få träningssexempel, studeras det hur den djupa modellen kan förtränas genom automatisk datamärkning. Resultaten visar på en ökning i prestanda jämfört med att inte förträna det siamesiska nätverket. Men jämfört med den mer traditionella modellen presterar den likvärdigt, vilket indikerar att mer traditionella modeller kan prestera lika bra när antalet träningsexempel är få, förutom att också vara enklare, mer robusta, och snabbare.

Page generated in 0.0312 seconds