Global ETD Search

21	Investigation on stability of Knowledge Based Subset Construction in Multi-Agent Games / Undersökning av stabiliteten för en Kunskapsbaserad Delmängdskonstruktion i Fleragentsspel Johansson, Gustaf, Bergmark, Gustaf January 2022 Many real-life problems can be modelled using multi-agent games played on finite graphs. When an agent cannot differentiate between game states, for example when a robot operates with a broken sensor, the game is classified as a game of imperfect information. This report focuses on non-deterministic multi-agent games of imperfect information, or Multi-Agent Games of Imperfect Information Against Nature (MAGIIAN). Finding optimal strategies for these games is very hard, both because of the imperfect information and because multiple cooperating agents must be taken into account. A generalised version of the known Knowledge Based Subset Construction (KBSC) algorithm for multi-agent games may solve the problem of strategy synthesis for MAGIIAN. While the KBSC transforms the game into a game with perfect information, the multi-agent variant (MKBSC) instead yields another MAGIIAN. When the algorithm is applied iteratively, some games stop expanding while others expand seemingly boundlessly. This is referred to as stability and divergence, respectively. Our research focuses on different patterns in the MAGIIAN, called structural conditions, and how they affect stability. By using an existing implementation of the MKBSC along with some newly developed algorithms, we were able to iterate over different games and analyse different structural conditions. We have identified several structural conditions which affect stability. By reducing divergent games to only their core components with respect to divergence, a more careful examination of what causes divergence could be done. It reaffirmed earlier research that cycles are necessary in order for games to diverge. 
Observation overlap was found not to be a necessary condition for divergence, as counterexamples were found. Games containing well-formed observations were found to stabilise within one iteration. Our research has also led us to believe that it is impossible for structural conditions to properly classify divergence. / Olika typer av autonoma problem kan modelleras med hjälp av fleragentsspel spelade på ändliga grafer. Spel där en agent ej kan urskilja mellan två tillstånd, till exempel när en robot arbetar med trasiga sensorer, klassas som spel med ofullständig information. Vår rapport fokuserar på ickedeterministiska fleragentsspel med ofullständig information, även kallat Multi-Agent Games of Imperfect Information Against Nature (MAGIIAN). Att hitta optimala strategier för dessa spel är mycket svårt både på grund av den ofullständiga informationen och på grund av flertalet agenter som ska samarbeta. Användandet av en generaliserad variant för fleragentspel av den kända Knowledge Based Subset Construction (KBSC) algoritmen kan hjälpa att hitta vinnande strategier för MAGIIAN. Medan KBSC algoritmen transformerar spelet till ett spel med fullständig information, så ger fleragentsvarianten istället ännu ett MAGIIAN. Om man applicerar algoritmen många gånger kommer vissa spel sluta att växa medan andra fortsätter växa gränslöst. Detta kallas att spelen är stabila eller divergenta. Vår rapport fokuserar på olika strukturer i dessa spel och hur dessa påverkar stabiliteten. Genom att använda en implementation av MKBSC tillsammans med nya algoritmer har vi itererat över många olika spel och analyserat olika strukturer. Vi har hittat flertalet strukturer som påverkar stabiliteten. Genom att reducera divergenta spel så att alla kvarvarande komponenter krävs för divergens, kunde divergensens orsaker noggrant undersökas. Detta bekräftade tidigare påståenden om att cykler krävs för divergens. 
Därefter motbevisades att överlappande observationer krävdes för divergens med hjälp av motexempel. Spel innehållandes välformade observationer visades stabilisera efter en iteration av MKBSC:n. Multi-Agent games Imperfect information Strategy synthesis Structural conditions Fleragentsspel Ofullständig information Strategisyntes Strukturella villkor Computer Sciences Datavetenskap (datalogi)
22	Power to the people : electricity demand and household behavior Vesterberg, Mattias January 2017 Paper [I] Using a unique and highly detailed data set on energy consumption at the appliance-level for 200 Swedish households, seemingly unrelated regression (SUR)-based end-use specific load curves are estimated. The estimated load curves are then used to explore possible restrictions on load shifting (e.g. the office hours schedule) as well as the cost implications of different load shift patterns. The cost implications of shifting load from "expensive" to "cheap" hours, using the Nord Pool spot prices as a proxy for a dynamic price, are computed to be very small; roughly 2-4% reduction in total daily costs from shifting load up to five hours ahead, indicating small incentives for households (and retailers) to adopt dynamic pricing of electricity. Paper [II] Using a detailed data set on appliance-level electricity consumption at the hourly level, we provide the first estimates of hourly and end-use-specific income elasticities for electricity. Such estimates are informative about how consumption patterns in general, and peak demand in particular, will develop as households’ income changes. We find that the income elasticities are highest during peak hours for kitchen and lighting, with point estimates of roughly 0.4, but insignificant for space heating. Paper [III] In this paper, I estimate the price elasticity of electricity as a function of the choice between fixed-price and variable-price contracts. Further, assuming that households have imperfect information about electricity prices and usage, I explore how media coverage of electricity prices affects electricity demand, both by augmenting price responsiveness and as a direct effect of media coverage on electricity demand, independent of prices. I also address the endogeneity of the choice of electricity contract. 
The parameters in the model are estimated using unique and detailed Swedish panel data on monthly household-level electricity consumption. I find that price elasticities range between −0.025 and −0.07 at the mean level of media coverage, depending on contract choice, and that households with monthly variation in electricity prices respond more to prices when there is extensive media coverage of electricity prices. When media coverage is high, for example 840 news articles per month (which corresponds to the mean plus two standard deviations), the price elasticity is −0.12, or 1.7 times the elasticity at the mean media coverage. Similarly, media coverage is also found to have a direct effect on electricity demand. Paper [IV] I explore how households switch between fixed-price and variable-price electricity contracts in response to variations in price and temperature, conditional on previous contract choice. Using panel data with roughly 54000 Swedish households, a dynamic probit model is estimated. The results suggest that the choice of contract exhibits substantial state dependence, with an estimated marginal effect of previous contract choice of 0.96, and that the effect of variation in prices and temperature on the choice of electricity contract is small. Further, the state dependence and price responsiveness are similar across housing types, income levels and other dimensions. A plausible explanation of these results is that transaction costs are larger than the relatively small cost savings from switching between contracts. electricity demand real-time pricing demand flexibility elasticity appliance-level data end-use media contract choice de-regulated market household behavior intermittent electricity production efficiency imperfect information
23	Choice Under Uncertainty: Violations of Optimality in Decision Making Rodenburg, Kathleen 11 June 2013 This thesis is an investigation of how subjects behave in an individual binary choice decision task with the option to purchase or observe for free additional information before reaching a decision. In Part 1 of this thesis, an investigative study is conducted with the intent to sharpen the view of the literature concerning corresponding psychology and economics experiments designed to test decision tasks that involve purchasing and observing information from an imperfect message prior to taking a terminal action choice. This investigative study identifies areas of research that warrant further investigation and provides enhancements for execution in the subsequent experiment conducted in Parts 2 and 3 of this thesis. In Parts 2 and 3, I conduct an experiment to test how subjects behave in an individual binary choice decision task with the option to purchase or observe for free additional information before reaching a final decision. I find that subjects’ behaviour over time converges toward optimal decisions prior to observing an imperfect information signal. However, when subjects observe an imperfect information signal prior to their terminal choice, there is greater deviation from optimal behaviour. In addition to behaviour that is reflective of a risk-neutral BEU maximizer, I find status quo bias, over-weighting of the informational value of the message received, and past statistically independent outcomes influencing future choices. The subjects’ willingness to pay (WTP) to use the additional information gathered from an imperfect message service when making a final decision was on average less than the risk-neutral BEU willingness-to-pay benchmark. Moreover, as the informative value of the message increased, causing the BEU valuation to increase, subjects under-estimated the value of the message signal to a greater degree. 
Although risk attitudes may have influenced the subjects’ WTP decisions, they do not account for the increasingly conservative WTP behaviour when information became more valuable. Additionally, the findings from this study suggest that individuals adopt different decision rules depending on both personal attributes (i.e. skillset, gender, experience) and on the context and environment in which the decision task is conducted. / SSHRC grant: Social Sciences and Humanities Research Council via Dr. Bram Cadsby Professor Department of Economics, University of Guelph
24	Information and politics Frisell, Lars January 2001 This thesis consists of four independent essays, which consider different topics in information economics and political economy. The first two papers are variants of the same idea. An uninformed principal, e.g., a government, will make a decision. In order to gain more information it may consult two experts; however, these experts have a private interest in certain policies being implemented. The question is, to gain as much information as possible, should the principal consult experts who are biased in the same direction, or experts who prefer different decisions? The main result is that, as long as collusion between experts can be prevented, homogeneous panels are superior to heterogeneous ones, and this advantage increases with the experts’ informational precision. In the third paper, two firms consider entry in a new product market and must decide when to enter the market and how to design their product. Firms do not know for certain what the best design is, so both firms want to outwait the other’s decision in order to gain more information. The focus of the paper is on which firm will make the first decision. The main result is that if products are strong (strategic) substitutes, the worst informed firm makes the first decision in equilibrium. The analysis should apply to a range of other contexts, such as investors’ trading decisions or the policy choices of political candidates. The final paper asks the following question: Could it be that parties in a two-party system may benefit from using several candidates in the same election? To promote the use of multiple candidates, I assume that a party never runs the risk of having its votes split up among its candidates. Despite this, it turns out that parties have a strong incentive to restrict their number of nominees. Paradoxically, it seems that the more uncertain parties are about voter opinion, the fewer candidates they want to use. 
In particular, with a uniform voter distribution the optimal number of candidates is one. / Diss. Stockholm : Handelshögsk., 2001 S. v-vii: sammanfattning, s. 1-72: 4 uppsatser Homogeneous vs. heterogeneous committees Imperfect information Informational efficiency External forces Endogenous decision order Informational spillovers Payoff externalities Two-party system Multiple candidates Economics Nationalekonomi
25	Řešení koncovek ve velkých hrách s neúplnou informací jako je např. Poker / Solving Endgames in Large Imperfect-Information Games such as Poker Ha, Karel January 2016 Title: Solving Endgames in Large Imperfect-Information Games such as Poker Author: Bc. Karel Ha Department: Department of Applied Mathematics Supervisor: doc. Mgr. Milan Hladík, Ph.D., Department of Applied Mathematics Abstract: Endgames have a distinctive role for players. At the late stage of games, many aspects are finally clearly defined, rendering exhaustive analysis tractable. Specialised endgame handling is rewarding for games with perfect information (e.g., Chess databases pre-computed for entire classes of endings, or dividing the Go board into separate independent subgames). An appealing idea would be to extend this approach to imperfect-information games such as the famous Poker: play the early parts of the game, and once the subgame becomes feasible, calculate an ending solution. However, the problem is much more complex for imperfect information. Subgames need to be generalized to account for information sets. Unfortunately, such a generalization cannot be solved straightaway, as it does not generally preserve optimality. As a consequence, we may end up with a far more exploitable strategy. There are currently three techniques to deal with this challenge: (a) disregard the problem entirely; (b) use a decomposition technique, which sadly retains only the same quality; (c) or formalize improvements of...
26	AI for an Imperfect-Information Wargame with Self-Play Reinforcement Learning / AI med självspelande förstärkningsinlärning för ett krigsspel med imperfekt information Ryblad, Filip January 2021 The task of training AIs for imperfect-information games has long been difficult. However, recently the algorithm ReBeL, a general framework for self-play reinforcement learning, has been shown to excel at heads-up no-limit Texas hold 'em, among other imperfect-information games. In this report the ability to adapt ReBeL to a downscaled version of the strategy wargame "Game of the Generals" is explored. It is shown that an implementation of ReBeL that uses no domain-specific knowledge is able to beat all benchmark bots, which indicates that ReBeL can be a useful framework when training AIs for imperfect-information wargames. / Det har länge varit en utmaning att träna AI:n för spel med imperfekt information. Nyligen har dock algoritmen ReBeL, ett generellt ramverk för självspelande förstärkningsinlärning, visat lovande prestanda i heads-up no-limit Texas hold 'em och andra spel med imperfekt information. I denna rapport undersöks ReBeLs förmåga att anpassas till en nedskalad version av spelet "Game of the Generals", vilket är ett strategiskt krigsspel. Det visas att en implementation av ReBeL som inte använder någon domänspecifik kunskap klarar av att besegra alla bottar som användes vid jämförelse, vilket indikerar att ReBeL kan vara ett användbart ramverk för att träna AI:n för krigsspel med imperfekt information. ReBeL deep reinforcement learning self-play game theory imperfect-information ReBeL djup förstärkningsinlärning självspelande spelteori imperfekt information Other Mathematics Annan matematik
27	Grid-based Pursuit Evasion Games of Imperfect Information: Theory and Higher Order Knowledge-based Strategies Granqvist, Jacob, Haker, Jonas January 2022 One group of games studied within game theory is grid-based pursuit evasion games of imperfect information. A pursuit evasion game is in essence a game in which a set of pursuers has as its objective to capture a set of evaders. This thesis aims to develop a formalisation of this type of game and to describe and integrate vital game-theoretical concepts, such as order of knowledge, into the game. With the developed formalism at hand, the concept of knowledge-based strategies is then introduced, which is essential when searching for the most efficient way to play the game. The formalisation of the game is then followed by a simulation measuring the performance of some older and some newly developed knowledge-based strategies. The thesis concludes that the formalisation is applicable to a more general class of pursuit evasion games and enables a wider study of the game. The simulation results indicate that knowledge-based strategies of higher order do not always perform better than simpler strategies of lower order of knowledge. Furthermore, strategies which allow for communication between agents are found to be superior to communication-less strategies. / En typ av spel som studeras inom spelteori är rutnätsbaserade jakt-flykt-spel med ofullständig information. Ett jakt-flykt-spel går ut på att det existerar en samling jagande aktörer som försöker fånga en samling flyende aktörer. Denna uppsats söker utveckla en formalism för denna typ av spel såväl som att beskriva och integrera ett antal nyckelkoncept inom spelteori såsom kunskapsordning. Med hjälp av den utvecklade formalismen, framställs så kallade kunskapsbaserade strategier, vilka är av fundamental vikt i sökandet efter sätt att spela spelet på det effektivaste sättet. 
Kapitlet om formalismen följs sedan av simuleringar där några äldre och några nyare kunskapsbaserade strategier prövas. Slutsatsen dras att den nya formalismen kan vara applicerbar på en bredare samling jakt-flykt-spel än den initialt påtänkta. Vidare underlättar formalismen en generalisering till andra sätt att beskriva spel. Simulationsresultaten indikerar att kunskapsbaserade strategier av högre ordning inte alltid presterar bättre än enklare strategier av lägre ordning. Till yttermera visso visar sig kommunikationslösa strategier vara underlägsna strategier som tillåter kommunikation. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm Pursuit Evasion Games Knowledge representation Imperfect Information Higher Order Knowledge Knowledge-based Strategies Communication-based Strategies Game Theory Elektroteknik och elektronik
28	Reinforcement Learning for Multi-Agent Strategy Synthesis Using Higher-Order Knowledge Forsell, Gustav, Gergi, Shamoun January 2023 Imagine for a moment we are living in the distant future where autonomous robots are patrolling the streets as police officers. Two such robots are chasing a robber through the city streets. Fearing the thief might listen in to any potential transmission, both robots remain radio silent and are thus limited to a strictly visual pursuit. Since the robots cannot see the robber the entire time, they have to deduce the potential location of the robber. What would the best strategy be for these robots to achieve their objective? This bachelor's thesis investigated the above example by creating strategies through reinforcement learning. The thesis also investigated the performance of the players when they have different abilities of deduction. This was tested by creating a suitable game and corresponding reinforcement learning algorithm and running the simulations for different degrees of knowledge. The study proved that reinforcement learning is a viable method for strategy construction, reaching nearly guaranteed victory for cases when the agent knows everything about the environment and a slightly lower win ratio when there is uncertainty introduced. The implementation yielded only a small gain in win ratio when the agents could deduce even more about each other. / Föreställ dig för ett ögonblick att vi lever i en avlägsen framtid där autonoma robotar patrullerar på gatorna som poliser. Två sådana robotar jagar en rånare genom stadens gator. Eftersom de är rädda för att tjuven kan lyssna på alla möjliga sändningar, förblir båda robotarna radiotysta och är därför begränsade till en strikt visuell strävan. Eftersom robotarna inte kan se rånaren hela tiden, måste de härleda den potentiella platsen för rånaren. Vilken skulle den bästa strategin vara för dessa robotar för att uppnå sitt mål? 
Denna kandidatuppsats undersökte ovanstående exempel genom att skapa strategier genom förstärkningsinlärning. Avhandlingen undersökte också spelarnas prestationer när de har olika avdragsförmåga. Detta testades genom att skapa ett lämpligt spel och motsvarande förstärkningsinlärningsalgoritm och köra simuleringarna för olika kunskapsgrader. Studien visade att förstärkningsinlärning är en användbar metod för strategikonstruktion, och når nästan garanterad seger i fall då agenten vet allt om miljön och en något lägre vinstkvot när det finns osäkerhet. Implementeringen gav bara en liten vinst i vinstförhållandet när agenterna kunde härleda ännu mer om varandra. / Kandidatexjobb i elektroteknik 2023, KTH, Stockholm Higher Order Knowledge Imperfect Information Reinforcement Learning Deep Q-networks Knowledge Representation Pursuit Evasion Games Elektroteknik och elektronik
29	Multi-Agent Games of Imperfect Information: Algorithms for Strategy Synthesis Åkerblom Jonsson, Viktor, Berisha, David January 2021 The aim of this project was to improve upon a tool for strategy synthesis for multi-agent games of imperfect information against nature. Another objective was to compare the tool with the original tool we improved upon and the Strategic Model Checker (SMC). For the strategy synthesis, an existing extension for expanding the games called the Multi-Agent Knowledge-Based Subset Construction was used. The construction creates a new knowledge-based game where strategies can be tested. The strategies were synthesized for the individual agents and then joint profiles of the individual strategies were tested to see if they were winning. Four different algorithms for going through the game graphs were tested against the other existing tools. The new and improved tool was faster at synthesizing a strategy than both the old tool and the SMC for almost all games tested. For the games where the new tool is outperformed, the results indicate that this is due to a combination of chance and how the games are perceived by the tools. No algorithm or tool proved to be the best performing for all games. / Syftet med detta projekt var att förbättra ett existerande verktyg för att syntetisera strategier för fleragentspel av imperfect information mot naturen. Därefter också jämföra verktyget med original verktyget och med ett verktyg som heter the strategic model checker (SMC). För syntetiseringen av strategier användes ett existerande verktyg för att expandera spel, som kallas Multi-Agent Knowledge-Based Subset Construction. Konstruktionen skapar ett kunskapsbaserat spel där strategierna kan bli testade. Strategierna syntetiserades för de enskilda agenterna och därefter skapades en sammansatt profil av strategier, som då testades för att se om det var en vinnande strategi. Fyra olika algoritmer för att gå igenom spelgrafen testades och jämfördes med de andra verktygen. 
Det nya och förbättrade verktyget var snabbare att syntetisera en strategi än både det gamla verktyget och SMC verktyget för nästan alla spel som testades. Fast, för spelen då nya verktyget inte var snabbast så indikerar resultaten på att detta är p.g.a. en kombination av slump och hur spelen ses på av verktygen. Ingen algoritm eller verktyg visade sig vara det snabbaste för samtliga spel. / Kandidatexjobb i elektroteknik 2021, KTH, Stockholm Strategy Synthesis Strategic Model Checker Multi-Agent Games Imperfect Information Elektroteknik och elektronik
30	Dynamic opponent modelling in two-player games Mealing, Richard Andrew January 2015 This thesis investigates decision-making in two-player imperfect information games against opponents whose actions can affect our rewards, and whose strategies may be based on memories of interaction, or may be changing, or both. The focus is on modelling these dynamic opponents, and using the models to learn high-reward strategies. The main contributions of this work are: 1. An approach to learn high-reward strategies in small simultaneous-move games against these opponents. This is done by using a model of the opponent learnt from sequence prediction, with (possibly discounted) rewards learnt from reinforcement learning, to look ahead using explicit tree search. Empirical results show that this gains higher average rewards per game than state-of-the-art reinforcement learning agents in three simultaneous-move games. They also show that several sequence prediction methods model these opponents effectively, supporting the idea of borrowing such methods from areas such as data compression and string matching; 2. An online expectation-maximisation algorithm that infers an agent's hidden information based on its behaviour in imperfect information games; 3. An approach to learn high-reward strategies in medium-size sequential-move poker games against these opponents. This is done by using a model of the opponent learnt from sequence prediction, which needs its hidden information (inferred by the online expectation-maximisation algorithm), to train a state-of-the-art no-regret learning algorithm by simulating games between the algorithm and the model. Empirical results show that this improves the no-regret learning algorithm's rewards when playing against popular and state-of-the-art algorithms in two simplified poker games; 4. 
Demonstrating that several change detection methods can effectively model changing categorical distributions with experimental results comparing their accuracies to empirical distributions. These results also show that their models can be used to outperform state-of-the-art reinforcement learning agents in two simultaneous-move games. This supports the idea of modelling changing opponent strategies with change detection methods; 5. Experimental results for the self-play convergence to mixed strategy Nash equilibria of the empirical distributions of plays of sequence prediction and change detection methods. The results show that they converge faster, and in more cases for change detection, than fictitious play. 006.3
