Global ETD Search

251	MAS-based affective state analysis for user guiding in on-line social environments Aguado Sarrió, Guillem 07 April 2021 (has links) [ES] Recientemente, hay una fuerte y creciente influencia de aplicaciones en línea en nuestro día a día. Más concretamente las redes sociales se cuentan entre las plataformas en línea más usadas, que permiten a usuarios comunicarse e interactuar desde diferentes partes del mundo todos los días. Dado que estas interacciones conllevan diferentes riesgos, y además los adolescentes tienen características que los hacen más vulnerables a ciertos riesgos, es deseable que el sistema pueda guiar a los usuarios cuando se encuentren interactuando en línea, para intentar mitigar la probabilidad de que caigan en uno de estos riesgos. Esto conduce a una experiencia en línea más segura y satisfactoria para usuarios de este tipo de plataformas. El interés en aplicaciones de inteligencia artificial capaces de realizar análisis de sentimientos ha crecido recientemente. Los usos de la detección automática de sentimiento de usuarios en plataformas en línea son variados y útiles. Se pueden usar polaridades de sentimiento para realizar minería de opiniones en personas o productos, y así descubrir las inclinaciones y opiniones de usuarios acerca de ciertos productos (o ciertas características de ellos), para ayudar en campañas de marketing, y también opiniones acerca de personas como políticos, para descubrir la intención de voto en un periodo electoral, por ejemplo. 
En esta tesis, se presenta un Sistema Multi-Agente (SMA), el cual integra agentes que realizan diferentes análisis de sentimientos y de estrés usando texto y dinámicas de escritura (usando análisis unimodal y multimodal), y utiliza la respuesta de los analizadores para generar retroalimentación para los usuarios y potencialmente evitar que caigan en riesgos y difundan comentarios en plataformas sociales en línea que pudieran difundir polaridades de sentimiento negativas o niveles altos de estrés. El SMA implementa un análisis en paralelo de diferentes tipos de datos y generación de retroalimentación a través del uso de dos mecanismos diferentes. El primer mecanismo se trata de un agente que realiza generación de retroalimentación y guiado de usuarios basándose en un conjunto de reglas y la salida de los analizadores. El segundo mecanismo es un módulo de Razonamiento Basado en Casos (CBR) que usa no solo la salida de los analizadores en los mensajes del usuario interactuando para predecir si su interacción puede generar una futura repercusión negativa, sino también información de contexto de interacciones de usuarios como son los tópicos sobre los que hablan o información sobre predicciones previas en mensajes escritos por la gente que conforma la audiencia del usuario. Se han llevado a cabo experimentos con datos de una red social privada generada en laboratorio con gente real usando el sistema en tiempo real, y también con datos de Twitter.com para descubrir cuál es la eficacia de los diferentes analizadores implementados y del módulo CBR al detectar estados del usuario que se propagan más en la red social. Esto conlleva descubrir cuál de las técnicas puede prevenir mejor riesgos potenciales que los usuarios pueden sufrir cuando interactúan, y en qué casos. Se han encontrado diferencias estadísticamente significativas y la versión final del SMA incorpora los analizadores que mejores resultados obtuvieron, un agente asesor o guía basado en reglas y un módulo CBR. 
El trabajo de esta tesis pretende ayudar a futuros desarrolladores de sistemas inteligentes a crear sistemas que puedan detectar el estado de los usuarios interactuando en sitios en línea y prevenir riesgos que los usuarios pudiesen enfrentar. Esto propiciaría una experiencia de usuario más segura y satisfactoria. / [CA] Recentment, hi ha una forta i creixent influència d'aplicacions en línia en el nostre dia a dia, i concretament les xarxes socials es compten entre les plataformes en línia més utilitzades, que permeten a usuaris comunicar-se i interactuar des de diferents parts del món cada dia. Donat que aquestes interaccions comporten diferents riscos, i a més els adolescents tenen característiques que els fan més vulnerables a certs riscos, seria desitjable que el sistema poguera guiar als usuaris mentre es troben interactuant en línia, per així poder mitigar la probabilitat de caure en un d'aquests riscos. Açò comporta una experiència en línia més segura i satisfactòria per a usuaris d'aquest tipus de plataformes. L'interés en aplicacions d'intel·ligència artificial capaces de realitzar anàlisi de sentiments ha crescut recentment. Els usos de la detecció automàtica de sentiments en usuaris en plataformes en línia són variats i útils. Es poden utilitzar polaritats de sentiment per a realitzar mineria d'opinions en persones o productes, i així descobrir les inclinacions i opinions d'usuaris sobre certs productes (o certes característiques d'ells), per a ajudar en campanyes de màrqueting, i també opinions sobre persones com polítics, per a descobrir la intenció de vot en un període electoral, per exemple. 
En aquesta tesi, es presenta un Sistema Multi-Agent (SMA), que integra agents que implementen diferents anàlisis de sentiments i d'estrés utilitzant text i dinàmica d'escriptura (utilitzant anàlisi unimodal i multimodal), i utilitza la resposta dels analitzadors per a generar retroalimentació per als usuaris i potencialment evitar que caiguen en riscos i difonguen comentaris en plataformes socials en línia que pogueren difondre polaritats de sentiment negatives o nivells alts d'estrés. El SMA implementa una anàlisi en paral·lel de diferents tipus de dades i generació de retroalimentació a través de l'ús de dos mecanismes diferents. El primer mecanisme es tracta d'un agent que realitza generació de retroalimentació i guia d'usuaris basant-se en un conjunt de regles i l'eixida dels analitzadors. El segon mecanisme és un mòdul de Raonament Basat en Casos (CBR) que utilitza no solament l'eixida dels analitzadors en els missatges de l'usuari per a predir si la seua interacció pot generar una futura repercussió negativa, sinó també informació de context d'interaccions d'usuaris, com són els tòpics sobre els quals es parla o informació sobre prediccions prèvies en missatges escrits per la gent que forma part de l'audiència de l'usuari. S'han realitzat experiments amb dades d'una xarxa social privada generada al laboratori amb gent real utilitzant el sistema implementat en temps real, i també amb dades de Twitter.com per a descobrir quina és l'eficàcia dels diferents analitzadors implementats i del mòdul CBR en detectar estats de l'usuari que es propaguen més a la xarxa social. Açò comporta descobrir quina de les tècniques millor pot prevenir riscos potencials que els usuaris poden sofrir quan interactuen, i en quins casos. S'han trobat diferències estadísticament significatives i la versió final del SMA incorpora els analitzadors que millors resultats obtingueren, un agent assessor o guia basat en regles i un mòdul CBR. 
El treball d'aquesta tesi pretén ajudar a futurs dissenyadors de sistemes intel·ligents a crear sistemes que puguen detectar l'estat dels usuaris interactuant en llocs en línia i prevenir riscos que els usuaris poguessen enfrontar. Açò propiciaria una experiència d'usuari més segura i satisfactòria. / [EN] On-line applications have a strong and growing influence on our daily lives. In particular, Social Network Sites (SNSs) are among the most widely used on-line social platforms, allowing users to communicate and interact from different parts of the world every day. Since this interaction poses several risks, and teenagers have characteristics that make them more vulnerable to certain risks, it is desirable for the system to guide users when they interact on-line, in order to reduce the probability of incurring one of those risks. This would ultimately lead to a more satisfactory and safe experience for the users of such on-line platforms. Recently, interest in artificial intelligence applications that can perform sentiment analysis has risen. Detecting the sentiment of users in on-line platforms or sites has varied and useful applications. Sentiment polarities can be used to perform opinion mining on people or products, to discover the inclinations and opinions of users on certain products (or certain features of them) in support of marketing campaigns, and also on people such as politicians, for example to discover voting intention in electoral periods. In this thesis, a Multi-Agent System (MAS) is presented, which integrates agents that perform different sentiment and stress analyses using text and keystroke dynamics data (using both unimodal and multi-modal analysis). The MAS uses the output of the analyzers to generate feedback for users and potentially prevent them from incurring risks and from spreading comments in on-line social platforms that could lead to the spread of negative sentiment or high stress levels. 
Moreover, the MAS incorporates parallelized analyses of different data types and feedback generation via two different mechanisms. The first is a rule-based advisor agent that generates feedback or guidance for users based on the output of the analyzers and a set of rules. The second is a Case-Based Reasoning (CBR) module that predicts whether an interaction may have negative repercussions in the future, using not only the output of the different analyzers on the messages of the interacting user, but also context information from user interactions, such as the topics being discussed or the states previously detected in messages written by people in the user's audience. Experiments were performed with data from a private SNS generated in a laboratory, with real people using the system in real time, and also with data from Twitter.com, to ascertain the efficacy of the different analyzers and of the CBR module in detecting user states that propagate more in the network. This in turn reveals which techniques are better able to prevent potential risks that users could face when interacting, and in which cases. Statistically significant differences were found, and the final version of the MAS incorporates the best-performing analyzer agents, a rule-based advisor agent, and a CBR module. Ultimately, this thesis aims to help developers of intelligent systems build systems that can detect the state of users interacting in on-line sites and prevent risks that they could face, leading to a more satisfactory and safe user experience. / This thesis was funded by the following research projects: Privacy in Social Educational Environments during Childhood and Adolescence (PESEDIA), Ministerio de Economia y Empresa (TIN2014-55206-R) and Intelligent Agents for Privacy Advice in Social Networks (AI4PRI), Ministerio de Economia y Empresa (TIN2017-89156-R) / Aguado Sarrió, G. (2021). 
MAS-based affective state analysis for user guiding in on-line social environments [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/164902 / TESIS Guía del usuario Razonamiento basado en casos Dinámica de escritura Dinámica de pulsaciones de teclas Análisis combinado Análisis de estrés Análisis de sentimiento Redes sociales Sistema multiagente Multi-Agent System Social Networks Sentiment Analysis Stress Analysis Combined Analysis Keystroke dynamics Case-based reasoning User advice LENGUAJES Y SISTEMAS INFORMATICOS
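The rule-based advisor agent described in the abstract above maps analyzer outputs to feedback through a set of rules. A minimal sketch of that idea follows; the labels, the rule set, the message texts, and the names `AnalyzerOutput` and `advise` are all hypothetical illustrations and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalyzerOutput:
    # Hypothetical labels; the thesis does not specify this representation.
    sentiment: str  # "negative", "neutral" or "positive"
    stress: str     # "low" or "high"


def advise(output: AnalyzerOutput) -> Optional[str]:
    """Return a feedback message for the user, or None when no rule fires."""
    if output.sentiment == "negative" and output.stress == "high":
        return "This message reads as negative and stressed; consider revising it."
    if output.sentiment == "negative":
        return "This message reads as negative; consider rephrasing it."
    if output.stress == "high":
        return "You seem stressed; consider taking a moment before posting."
    return None


if __name__ == "__main__":
    print(advise(AnalyzerOutput(sentiment="negative", stress="low")))
```

Rules are checked from most to least specific, so the combined case is reported before either single-signal case.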
252	Contribution à l'étude de la stabilité et à la stabilisation des réseaux DC à récupération d'énergie / Contribution to the stability analysis and stabilization of DC microgrid with energy storage capability Magne, Pierre 30 April 2012 (has links) Ce mémoire est consacré à l'étude du phénomène d'instabilité pouvant apparaître sur les bus continus des réseaux DC. En effet, l'interaction entre les différents sous-systèmes électriques (source, charge, filtre) composant le réseau DC peut conduire, sous certaines conditions, à l'instabilité du système. A partir de la modélisation des charges sous forme de "Charge à Puissance Constante" (notée CPL), des méthodes d'études permettant l'analyse de la stabilité "petit-signal" et "grand-signal" des systèmes électriques sont présentées. Celles-ci permettent de mettre en évidence le fait qu'un réseau DC ne peut pas fournir n'importe quelle puissance à ses charges sans devenir instable. Ces puissances limites dépendent à la fois de la structure du réseau et des valeurs de ses éléments passifs et de sa tension de bus. Afin de pouvoir augmenter l'amortissement/les marges de stabilité du système, des méthodes de stabilisation sont présentées dans ce mémoire. Elles proposent d'adapter les commandes des charges de manière à assurer sa stabilité. Ceci se fait grâce à l'addition d'un signal stabilisant sur la référence de chaque charge. Ce signal n'est visible que durant les régimes transitoires de la charge afin de ne pas modifier le point de fonctionnement demandé. Néanmoins, plus on voudra stabiliser une charge et plus son signal stabilisant sera important. Un bon compromis doit donc être trouvé afin d'assurer la stabilité du système sans altérer les performances dynamiques des charges. Deux approches différentes sont proposées afin de générer ces commandes stabilisantes. La première se base sur la mise en place d'un stabilisateur centralisé. 
Deux méthodes centralisées sont alors proposées : la première s'appuie sur la théorie des multimodèles de Takagi-Sugeno alors que la seconde s'appuie sur la théorie de Lyapunov. Cette dernière permettra d'orienter les efforts de stabilisation sur les charges souhaitées pour par exemple, les diriger principalement vers les organes de récupération d'énergie. La seconde approche se base sur la mise en place d'un système de stabilisation multi-agent. Celui-ci présente une structure décentralisée où chaque agent correspond à un bloc de stabilisation. Ceux-ci vont compenser localement les impacts déstabilisants de leur charge respective et leurs actions combinées permettront d'assurer la stabilité du système. De plus, on propose d'utiliser un algorithme d'optimisation sous contraintes qui permettra de donner un dimensionnement du système minimisant les efforts de stabilisation tout en considérant des cas de défaut tels que la perte de l'un des agents ou la reconfiguration du réseau / This thesis is devoted to the analysis of the instability phenomenon that may appear on the DC bus of DC microgrids. Indeed, interaction between the different electrical subsystems of the grid (source, load, filters) can lead, under certain conditions, to instability of the system. Modeling the loads as "Constant Power Loads" (CPLs), this thesis presents methods for analyzing the "small-signal" and "large-signal" stability of electrical systems. These methods show that a DC microgrid cannot supply its loads beyond a maximum power without becoming unstable. This power limit depends on the structure of the grid, the values of its passive components, and its bus voltage. In order to improve the stability of the microgrid, stabilization methods are presented in this thesis. They adapt the control of the loads to ensure the stability of the system. This is achieved by adding a stabilizing signal to the reference of each load. 
This signal is only visible during load power transients, so that the requested operating point is not changed. However, the more a load is to be stabilized, the larger its stabilizing signal becomes, so a good trade-off must be found to ensure system stability without affecting the dynamic performance of the loads. Two approaches are investigated to generate the stabilizing commands. The first one is based on a centralized stabilization block. Two centralized methods have been developed: the first is based on Takagi-Sugeno multimodel theory, while the second is based on Lyapunov theory. The latter makes it possible to direct the stabilizing effort toward chosen loads; for example, the stabilizing effort can be directed mainly toward the energy storage device. The second approach is based on a multi-agent stabilizing system. It has a decentralized structure in which each agent corresponds to a stabilization block. The agents locally compensate for the destabilizing impact of their respective loads on the microgrid, and their combined actions ensure the stability of the system. To design the system, the use of a constrained optimization algorithm is proposed. This makes it possible to minimize the stabilization effort while accounting for fault cases such as the failure of one of the agents or a reconfiguration of the microgrid. Réseau DC Source de stockage électrique Charge à puissance constante Stabilité "petit-signal" Stabilité "grand-signal" Multimodèles de Takagi-Sugeno Stabilisation Fonction de Lyapunov Système multi-agent DC network Energy storage device Constant power load "small-signal" stability "large-signal" stability Takagi-Sugeno model Stabilization Active damping Lyapunov function Multi-agent system 621.31
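The claim above that a DC network cannot feed a constant power load beyond some limit can be illustrated on the simplest textbook case. The sketch below assumes a single ideal source feeding one CPL through a series R-L branch and a shunt capacitor; this topology, the parameter values, and the function names are illustrative assumptions, not the networks studied in the thesis. Linearizing the CPL around the load voltage v0 gives the characteristic polynomial s^2 + (r/L - P/(C v0^2)) s + (1 - r P/v0^2)/(L C), which is stable only when both coefficients are positive.

```python
def cpl_power_limit(r: float, L: float, C: float, v0: float) -> float:
    """Largest CPL power (W) keeping the linearized R-L-C filter stable.

    r: series resistance (ohm), L: inductance (H), C: capacitance (F),
    v0: load-side DC voltage at the operating point (V).
    """
    damping_limit = r * C * v0**2 / L   # keeps the s^1 coefficient positive
    equilibrium_limit = v0**2 / r       # keeps the s^0 coefficient positive
    return min(damping_limit, equilibrium_limit)


def is_small_signal_stable(P: float, r: float, L: float, C: float, v0: float) -> bool:
    """Check the second-order characteristic polynomial s^2 + a1*s + a0."""
    a1 = r / L - P / (C * v0**2)
    a0 = (1.0 - r * P / v0**2) / (L * C)
    return a1 > 0 and a0 > 0


if __name__ == "__main__":
    # Illustrative values only: 0.1 ohm, 1 mH, 1 mF, 100 V bus.
    print(cpl_power_limit(0.1, 1e-3, 1e-3, 100.0))
    print(is_small_signal_stable(900.0, 0.1, 1e-3, 1e-3, 100.0))
```

With these values the limit is set by damping rather than by the existence of an equilibrium, which matches the abstract's remark that the limit depends on the passive components and the bus voltage.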
253	Approche multi-agents pour la gestion des fermes éoliennes offshore / A multi-agent approach for offshore wind farms management Paniah, Crédo 21 May 2015 (has links) La raréfaction des sources de production conventionnelles et leurs émissions nocives ont favorisé l’essor notable de la production renouvelable, plus durable et mieux répartie géographiquement. Toutefois, son intégration au système électrique est problématique. En effet, la production renouvelable est peu prédictible et issue de sources majoritairement incontrôlables, ce qui compromet la stabilité du réseau, la viabilité économique des producteurs et rend nécessaire la définition de solutions adaptées pour leur participation au marché de l’électricité. Dans ce contexte, le projet scientifique Winpower propose de relier par un réseau à courant continu les ressources de plusieurs acteurs possédant respectivement des fermes éoliennes offshore (acteurs EnR) et des centrales de stockage de masse (acteurs CSM). Cette configuration impose aux acteurs d’assurer conjointement la gestion du réseau électrique.Nous supposons que les acteurs participent au marché comme une entité unique : cette hypothèse permet aux acteurs EnR de tirer profit de la flexibilité des ressources contrôlables pour minimiser le risque de pénalités sur le marché de l’électricité, aux acteurs CSM de valoriser leurs ressources auprès des acteurs EnR et/ou auprès du marché et à la coalition de faciliter la gestion des déséquilibres sur le réseau électrique, en agrégeant les ressources disponibles. Dans ce cadre, notre travail s’attaque à la problématique de la participation au marché EPEX SPOT Day-Ahead de la coalition comme une centrale électrique virtuelle ou CVPP (Cooperative Virtual Power Plant). 
Nous proposons une architecture de pilotage multi-acteurs basée sur les systèmes multi-agents (SMA) : elle permet d’allier les objectifs et contraintes locaux des acteurs et les objectifs globaux de la coalition. Nous formalisons alors l’agrégation et la planification de l’utilisation des ressources comme un processus décisionnel de Markov (MDP), un modèle formel adapté à la décision séquentielle en environnement incertain, pour déterminer la séquence d’actions sur les ressources contrôlables qui maximise l’espérance des revenus effectifs de la coalition. Toutefois, au moment de la planification des ressources de la coalition, l’état de la production renouvelable n’est pas connu et le MDP n’est pas résoluble en l’état : on parle de MDP partiellement observable (POMDP). Nous décomposons le POMDP en un MDP classique et un état d’information (la distribution de probabilités des erreurs de prévision de la production renouvelable) ; en extrayant cet état d’information de l’expression du POMDP, nous obtenons un MDP à état d’information (IS-MDP), pour la résolution duquel nous proposons une adaptation d’un algorithme de résolution classique des MDP, le Backwards Induction. Nous décrivons alors un cadre de simulation commun pour comparer dans les mêmes conditions nos propositions et quelques autres stratégies de participation au marché dont l’état de l’art dans la gestion des ressources renouvelables et contrôlables. Les résultats obtenus confortent l’hypothèse de la minimisation du risque associé à la production renouvelable, grâce à l’agrégation des ressources et confirment l’intérêt de la coopération des acteurs EnR et CSM dans leur participation au marché de l’électricité. Enfin, l’architecture proposée offre la possibilité de distribuer le processus de décision optimale entre les différents acteurs de la coalition : nous proposons quelques pistes de solution dans cette direction. / Generation from Renewable Energy Sources (RES) has grown remarkably in the last few decades. 
Compared to conventional energy sources, renewable generation is more widely available, sustainable and environment-friendly; for example, no greenhouse gases are emitted during energy generation. However, electrical network stability requires production to equal consumption, and the electricity market requires producers to contract future production in advance and to honor their supply commitments or pay substantial penalties, whereas RES are mainly uncontrollable and their behavior is difficult to forecast accurately. As a result, they jeopardize the stability of the physical network and the competitiveness of renewable producers in the market. The Winpower project aims to design realistic, robust and stable control strategies for offshore networks that connect renewable sources and controllable storage devices, owned by different autonomous actors, to the main electricity system. Each actor must embed its own local control strategy for its physical devices, but a global network management mechanism, jointly decided by the connected actors, should be designed as well. We assume that the actors participate in the market as a single entity (the coalition of actors connected by the Winpower network). This allows the coalition to facilitate network management through resource aggregation, renewable producers to take advantage of the flexibility of controllable sources to handle the risk of market penalties, and storage device owners to leverage their resources on the market and/or in the management of renewable imbalances. This work tackles the market participation of the coalition as a Cooperative Virtual Power Plant. 
For this purpose, we describe a multi-agent architecture through the definition of intelligent agents that manage and operate the actors' resources, and the description of the interactions between these agents; it reconciles local constraints and objectives with the global network management objective. We formalize the aggregation and planning of resource utilization as a Markov Decision Process (MDP), a formal model suited to sequential decision making in uncertain environments. Its aim is to define the sequence of actions that maximizes the expected actual income from market participation, while decisions over controllable resources have uncertain outcomes. However, the market participation decision is made before actual operation, when renewable generation is still uncertain. Thus, the Markov Decision Process cannot be solved as is, because its state in each decision time-slot is not fully observable. To solve such a Partially Observable MDP (POMDP), we decompose it into a classical MDP and an information state (a probability distribution over renewable generation forecast errors). The resulting Information State MDP (IS-MDP) is solved with an adaptation of Backwards Induction, a classical MDP resolution algorithm. Then, we describe a common simulation framework to compare our proposed methodology to some other strategies, including the state of the art in renewable generation market participation. Simulation results validate the resource aggregation strategy and confirm that cooperation is beneficial to renewable producers and storage device owners when they participate in the electricity market. The proposed architecture is designed to allow the decision making to be distributed among the coalition’s actors, through the implementation of a suitable coordination mechanism. To this end, we propose some distribution methodologies. 
Sources d’Énergie Renouvelables (EnR) Centrales de Stockage Agrégation Marché de l’Électricité EPEX SPOT Système Multi-Agents (SMA Processus décisionnel de Markov État d’Information Renewable Energy Sources Storage Aggregation Electricity Market EPEX SPOT Cooperative Virtual Power Plant Multi-Agent System Markov Decision Process Information State
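The abstract above solves its planning problem with an adaptation of backward induction. As a point of reference, plain finite-horizon backward induction on a fully observable MDP can be sketched as follows. This is not the IS-MDP adaptation from the thesis; the toy storage model (three storage levels, two hypothetical prices, a 0.9 discharge success probability) and all names are assumptions made for illustration.

```python
def backward_induction(states, actions, transition, reward, horizon):
    """Finite-horizon backward induction.

    actions(s) -> iterable of actions available in state s
    transition(s, a) -> list of (probability, next_state)
    reward(t, s, a, s2) -> immediate reward at time slot t
    Returns (values at t=0, list of per-slot decision rules).
    """
    value = {s: 0.0 for s in states}  # terminal values V_T(s) = 0
    policy = []
    for t in reversed(range(horizon)):
        new_value, decision = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions(s):
                q = sum(p * (reward(t, s, a, s2) + value[s2])
                        for p, s2 in transition(s, a))
                if q > best_q:
                    best_a, best_q = a, q
            new_value[s], decision[s] = best_q, best_a
        value = new_value
        policy.insert(0, decision)
    return value, policy


# Toy storage example (all numbers hypothetical).
PRICES = [10.0, 30.0]   # price per unit in each time slot
LEVELS = [0, 1, 2]      # units of energy in storage


def storage_actions(s):
    acts = ["hold"]
    if s < 2:
        acts.append("charge")
    if s > 0:
        acts.append("discharge")
    return acts


def storage_transition(s, a):
    if a == "charge":
        return [(1.0, s + 1)]
    if a == "discharge":
        return [(0.9, s - 1), (0.1, s)]  # discharge fails 10% of the time
    return [(1.0, s)]


def storage_reward(t, s, a, s2):
    if a == "charge":
        return -PRICES[t]
    if a == "discharge" and s2 == s - 1:
        return PRICES[t]
    return 0.0


if __name__ == "__main__":
    values, plan = backward_induction(
        LEVELS, storage_actions, storage_transition, storage_reward, len(PRICES))
    print(values, plan[0])
```

The recursion works from the last slot back to the first, so each slot's decision rule already accounts for the best achievable expected income in all later slots.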
254	Simulační model řízení obchodní jednotky / Simulation model of a retail store BRYCHCÍN, Karel January 2013 (has links) This work summarizes the theoretical basis of retail and of simulation models usable as decision-making support in the management of retail units. It describes the specifics of retail and of retail units in terms of their classification, and the basic theoretical foundations for creating simulation models. The work also describes the default multi-agent simulation model created by the supervisor of this work, Ing. Viktor Vojtko, Ph.D., on which this work builds. It then describes the creation of case studies using the multi-agent simulation model, including the calibration of the models for these case studies. The general methodology for creating case studies is described in the next part of the work. The methodology is then verified using scenarios, and the last part presents proposals for further modification of the simulation model.
255	GAME-THEORETIC MODELING OF MULTI-AGENT SYSTEMS: APPLICATIONS IN SYSTEMS ENGINEERING AND ACQUISITION PROCESSES Salar Safarkhani (9165011) 24 July 2020 (has links) The acquisition of large-scale complex systems is usually characterized by cost and schedule overruns. To investigate the causes of this problem, we may view the acquisition of a complex system at several different time scales. At finer time scales, one may study different stages of the acquisition process, from the intricate details of the entire systems engineering process to communication between design teams to how individual designers solve problems. At the largest time scale, one may consider the acquisition process as a series of actions (request for bids, bidding and auctioning, contracting, and finally building and deploying the system) without resolving the fine details that occur within each step. In this work, we study acquisition processes at multiple scales. First, we develop a game-theoretic model for the engineering of systems in the building and deploying stage. We model the interactions among the systems and subsystem engineers as a principal-agent problem. We develop a one-shot shallow systems engineering process and obtain the optimum transfer functions that best incentivize the subsystem engineers to maximize the expected system-level utility. The core of the principal-agent model is the quality function, which maps the effort of the agent to the performance (quality) of the system. Therefore, we build the stochastic quality function by modeling the design process as a sequential decision-making problem. Second, we develop and evaluate a model of the acquisition process that accounts for the strategic behavior of different parties. We cast our model in terms of government-funded projects and assume the following steps. First, the government publishes a request for bids. 
Then, private firms offer their proposals in a bidding process, and the winning bidder enters into a contract with the government. The contract describes the system requirements and the corresponding monetary transfers for meeting them. The winning firm devotes effort to delivering a system that fulfills the requirements. This can be viewed as a game that the government plays with the bidding firms. We study how different parameters in the acquisition procedure affect the bidders’ behavior and, therefore, the utility of the government. Using reinforcement learning, we seek to learn the optimal policies of the actors involved in this game. In particular, we study how the requirements, contract types such as cost-plus and incentive-based contracts, number of bidders, problem complexity, etc., affect the acquisition procedure. Furthermore, we study the bidding strategy of the private firms and how the contract types affect their strategic behavior. Applied Computer Science Mechanical Engineering Operations Research Engineering Systems Design Multi agent system (MAS) deep reinforcement learning machine Learning game theory deep learning gaussian process systems science and theory bi-level optimization principal-agent model systems engineering process system acquisition process contracts strategic behavior auction bidding
256	Prediction of Protein-Protein Interactions Using Deep Learning Techniques Soleymani, Farzan 24 April 2023 (has links) Proteins are considered the primary actors in living organisms. Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions underpin various biological activities such as metabolic cycles, signal transduction, and immune response. PPI identification has been addressed by various experimental methods such as the yeast two-hybrid, mass spectrometry, and protein microarrays, to mention a few. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. Therefore a sequence-based framework called ProtInteract is developed to predict protein-protein interaction. ProtInteract comprises two components: first, a novel autoencoder architecture that encodes each protein's primary structure to a lower-dimensional vector while preserving its underlying sequential pattern by extracting uncorrelated attributes and more expressive descriptors. This leads to faster training of the second network, a deep convolutional neural network (CNN) that receives encoded proteins and predicts their interaction. Three different scenarios formulate the prediction task. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class indicates different ranges of confidence scores corresponding to the probability of whether a predicted interaction occurs or not. The proposed framework features significantly low computational complexity and relatively fast response. The present study makes two significant contributions to the field of protein-protein interaction (PPI) prediction. Firstly, it addresses the computational challenges posed by the high dimensionality of protein datasets through the use of dimensionality reduction techniques, which extract highly informative sequence attributes. 
Secondly, the proposed framework, ProtInteract, utilises this information to identify the interaction characteristics of a protein based on its amino acid configuration. ProtInteract encodes the protein's primary structure into a lower-dimensional vector space, thereby reducing the computational complexity of PPI prediction. Our results provide evidence of the proposed framework's accuracy and efficiency in predicting protein-protein interactions. Long-short Term Memory Recurrent Neural Networks Protein-Protein Interaction Temporal Convolutional Network Convolutional Neural Network Autoencoder Reinforcement learning actor-critic portfolio management stock market prediction coverage control multi-agent system SARSA Q-learning Graph convolutional neural network GCN state-action-reward-state-action
257	Collaboration in Multi-agent Games : Synthesis of Finite-state Strategies in Games of Imperfect Information / Samarbete i multiagent-spel : Syntes av ändliga strategier i spel med ofullständig information Lundberg, Edvin January 2017 (has links) We study games where a team of agents needs to collaborate against an adversary to achieve a common goal. The agents make their moves simultaneously, and they have different perceptions of the system state after each move, due to different sensing capabilities. Each agent can only act based on its own experiences, since no communication is assumed during the game. However, before the game begins, the agents can agree on some strategy. A strategy is winning if it guarantees that the agents achieve their goal regardless of how the opponent acts. Identifying a winning strategy, or determining that none exists, is known as the strategy synthesis problem. In this thesis, we only consider a simple objective where the agents must force the game into a given state. Much of the literature is focused on strategies that assume that the agents either (a) can remember everything that they have perceived or (b) can only remember the last thing that they have perceived. The strategy synthesis problem is (in the general case) undecidable in (a) and takes exponential time in (b). We are interested in the middle ground, where agents can have finite memory. Specifically, they should be able to keep a finite-state machine, which they update when they make new observations. In our case, the internal state of each agent represents its knowledge about the state of the game. In other words, an agent is able to update its knowledge, and act based on it. We propose an algorithm for constructing the finite-state machine for each agent, and assigning actions to the internal states before the game begins. Not every winning strategy can be found by the algorithm, but we are convinced that the ones found are valid. 
An important building block for the algorithm is the knowledge-based subset construction (KBSC) used in the literature, which we generalise to games with multiple agents. With our construction, the game can be reduced to another game, still with uncertain state information, but with less or equal uncertainty. The construction can be applied arbitrarily many times, but it appears as if it stabilises (so that no new knowledge is gained) after only a few steps. We discuss this and other interesting properties of our algorithm in the final chapters of this thesis. / Vi studerar spel där ett lag agenter behöver samarbeta mot en motståndare för att uppnå ett mål. Agenterna agerar samtidigt, och vid varje steg av spelet så har de olika uppfattning om spelets tillstånd. De antas inte kunna kommunicera under spelets gång, så agenterna kan bara agera utifrån sina egna erfarenheter. Innan spelet börjar kan agenterna dock komma överens om en strategi. En sådan strategi är vinnande om den garanterar att agenterna når sitt mål oavsett hur motståndaren beter sig. Att hitta en vinnande strategi är känt som syntesproblemet. I den här avhandlingen behandlar vi endast ett enkelt mål där agenterna måste tvinga in spelet i ett givet tillstånd. Mycket av litteraturen handlar om strategier där agenterna antingen antas (a) kunna minnas allt som de upplevt eller (b) bara kunna minnas det senaste de upplevt. Syntesproblemet är (i det generella fallet) oavgörbart i (a) och tar exponentiell tid i (b). Vi är intresserade av fallet där agenter kan ha ändligt minne. De ska kunna ha en ändlig automat, som de kan uppdatera när de får nya observationer. I vårt fall så representerar det interna tillståndet agentens kunskap om spelets tillstånd. En agent kan då uppdatera sin kunskap och agera utifrån den. Vi föreslår en algoritm som konstruerar en ändlig automat åt varje agent, samt instruktioner för vad agenten ska göra i varje internt tillstånd. 
Varje vinnande strategi kan inte hittas av algoritmen, men vi är övertygade om att de som hittas är giltiga. En viktig byggsten är den kunskapsbaserade delmängskonstruktionen (KBSC), som vi generaliserar till spel med flera agenter. Med vår konstruktion kan spelet reduceras till ett annat spel som har mindre eller lika mycket osäkerhet. Detta kan göras godtyckligt många gånger, men det verkar som om att ingen ny kunskap tillkommer efter bara några gånger. Vi diskuterar detta vidare tillsammans med andra intressanta egenskaper hos algoritmen i de sista kapitlen i avhandlingen. Multi-agent games multiagent games multi-agent system imperfect information imperfect recall collaboration imperfect communication finite-state strategy concurrent games concurrent system strategy synthesis strategy construction automated programming automated problem solving automated collaboration verification knowledge-based subset construction knowledge tracking Computer Sciences Datavetenskap (datalogi)
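The knowledge update at the heart of a knowledge-based subset construction can be sketched for a single agent as follows. This is only an illustrative sketch of the standard single-agent idea, not the multi-agent generalisation proposed in the thesis; the names `update_knowledge`, `delta` and `obs`, and the toy game, are assumptions introduced here.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = str
Action = str
Obs = str


def update_knowledge(
    knowledge: FrozenSet[State],
    action: Action,
    observation: Obs,
    delta: Dict[Tuple[State, Action], Set[State]],
    obs: Dict[State, Obs],
) -> FrozenSet[State]:
    """Return the states the agent still considers possible.

    Step 1: collect every successor of a currently possible state
            under the chosen action (hypothetical transition map `delta`).
    Step 2: keep only successors consistent with what was observed.
    """
    successors: Set[State] = set()
    for s in knowledge:
        successors |= delta.get((s, action), set())
    return frozenset(t for t in successors if obs[t] == observation)


# Toy game (made up for illustration): from s0 the opponent may push
# the game to s1 or s2, which look identical to the agent.
delta = {
    ("s0", "a"): {"s1", "s2"},
    ("s1", "a"): {"s3"},
    ("s2", "a"): {"s3"},
}
obs = {"s0": "x", "s1": "y", "s2": "y", "s3": "z"}

k0 = frozenset({"s0"})
k1 = update_knowledge(k0, "a", "y", delta, obs)  # uncertain: s1 or s2
k2 = update_knowledge(k1, "a", "z", delta, obs)  # uncertainty resolved
```

Each knowledge set plays the role of one internal state of the agent's finite-state machine; the set of reachable knowledge sets is finite because it is bounded by the subsets of the game's state space.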
258	A multi-agent nudge-based approach for disclosure mitigation online Ben Salem, Rim 08 1900 (has links) En 1993, alors qu’Internet faisait ses premiers pas, le New York Times publie un dessin de presse désormais célèbre avec la légende "Sur Internet, personne ne sait que tu es un chien". C’était une façon amusante de montrer qu’Internet offre à ses usagers un espace sûr à l’abri de tout préjugé, sarcasme, ou poursuites judiciaires. C’était aussi une annonce aux internautes qu’ils sont libres de ne montrer de leurs vies privées que ce qu’ils veulent laisser voir. Les années se succèdent pour faire de cette légende une promesse caduque qui n’a pu survivre aux attraits irrésistibles d’aller en ligne. Les principales tentations sont l’anonymat et la possibilité de se créer une identité imaginée, distincte de celle de la réalité. Hélas, la propagation exponentielle des réseaux sociaux a fait chevaucher les identités réelles et fictives des gens. Les usagers ressentent un besoin d’engagement de plus en plus compulsif. L’auto-divulgation bat alors son plein à cause de l’ignorance du public des conséquences de certains comportements. Pour s’attirer l’attention, les gens recourent au partage d’informations personnelles, d’appartenance de tous genres, de vœux, de désirs, etc. Par ailleurs, l’espoir et l’angoisse les incitent aussi à communiquer leurs inquiétudes concernant leurs états de santé et leurs expériences parfois traumatisantes au détriment de la confidentialité de leurs vies privées. L’ambition et l’envie de se distinguer incitent les gens à rendre publics leurs rituels, pratiques ou évènements festifs engageant souvent d’autres individus qui n’ont pas consenti explicitement à la publication du contenu. Des adolescents qui ont grandi à l’ère numérique ont exprimé leurs désapprobations quant à la façon dont leurs parents géraient leurs vies privées lorsqu’ils étaient enfants. Leurs réactions allaient d’une légère gêne à une action de poursuite en justice. 
La divulgation multipartite pose problème. Les professionnels, les artistes ainsi que les activistes de tout horizon ont trouvé aux réseaux sociaux un outil incontournable et efficace pour promouvoir leurs secteurs. Le télétravail qui se propage très rapidement ces dernières années a offert aux employés le confort de travailler dans un environnement familier, ils ont alors tendance à négliger la vigilance "du bureau" exposant ainsi les intérêts de leurs employeurs au danger. Ils peuvent aussi exprimer des opinions personnelles parfois inappropriées leur causant des répercussions néfastes. L’accroissement de l’insécurité liée au manque de vigilance en ligne et à l’ignorance des usagers a mené les chercheurs à puiser dans les domaines de sociologie, des sciences de comportement et de l’économie de la vie privée pour étudier les raisons et les motivations de la divulgation. Le "nudge", comme approche d’intervention pour améliorer le bien-être d’un individu ou d’un groupe de personnes, fut une solution largement adoptée pour la préservation de la vie privée. Deux concepts ont émergé. Le premier a adopté une solution "one-size-fits-all" qui est commune à tous les utilisateurs. Quoique relativement simple à mettre en œuvre et d’une protection satisfaisante de la vie privée, elle était rigide et peu attentive aux conditions individuelles des utilisateurs. Le second a plutôt privilégié les préférences des usagers pour résoudre, même en partie, la question de personnalisation des "nudges". Ce qui a été motivant pour les utilisateurs mais nuisible à leurs confidentialités. Dans cette thèse, l’idée principale est de profiter des mérites des deux concepts en les fusionnant. J’ai procédé à l’exploration de l’économie de la vie privée. Les acteurs de ce secteur sont, autres que le propriétaire de données lui-même, le courtier qui sert d’intermédiaire et l’utilisateur de ces données. 
Le mécanisme d’interaction entre eux est constitué par les échanges de données comme actifs et les compensations monétaires en retour. L’équilibre de cette relation est atteint par la satisfaction de ses parties prenantes. Pour faire de bons choix, l’équitabilité exige que le propriétaire de données ait les connaissances minimales nécessaires dans le domaine et qu’il soit conscient des contraintes qu’il subit éventuellement lors de la prise de décision. A la recherche d’un utilisateur éclairé, j’ai conçu un cadre que j’ai nommé Multipriv. Il englobe les facteurs d’influence sur la perception des gens de la vie privée. J’ai ensuite proposé un système multi-agents basé sur le "nudge" pour l’atténuation de la divulgation en ligne. Son principal composant comprend trois agents. Le premier est l’agent objectif Aegis qui se réfère aux solutions généralisées axées sur la protection des données personnelles. Le second est un agent personnel qui considère le contexte dans lequel se trouve le propriétaire de données. Le dernier est un agent multipartite qui représente les personnes impliquées dans le contenu en copropriété. Pour évaluer le système, une plateforme appelée Cognicy est implémentée et déployée. Elle imite de véritables plateformes de réseaux sociaux par l’offre de la possibilité de créer un profil, publier des statuts, joindre des photos, établir des liens avec d’autres, etc. Sur une population de 150 utilisateurs, ma proposition s’est classée meilleure que l’approche de base non spécifique au contexte en termes de taux d’acceptation des "nudges". Les retours des participants à la fin de leurs sessions expriment une appréciation des explications fournies dans les "nudges" et des outils mis à leur disposition sur la plateforme. / When the internet was in its infancy in 1993, the New York Times published a now-famous cartoon with the caption “On the Internet, nobody knows you’re a dog.”. 
It was an amusing way to denote that the internet offers a safe space and a shelter for people to be free of assumptions and to only disclose what they want to be shown of their personal lives. The major appeal to go online was anonymity and the ability to create a whole new persona separate from real life. However, the rising popularity of social media made people’s digital and physical existences collide. Social Networking Sites (SNS) feed the need for compulsive engagement and attention-seeking behaviour. This results in self-disclosure, which is the act of sharing personal information such as hopes, aspirations, fears, thoughts, etc. These platforms are fertile grounds for oversharing health information, traumatic experiences, casual partying habits, and co-owned posts that show or mention individuals other than the sharer. The latter practice is called multiparty disclosure and it is an issue especially when the other people involved do not explicitly consent to the shared content. Adolescents who grew up in the digital age expressed disapproval of how their parents handled their privacy as children. Their reactions ranged from slight embarrassment to pursuing legal action to regain a sense of control. The repercussions of privacy disclosure extend to professional lives since many people work from home nowadays and tend to be more complacent about privacy in their familiar environment. This can be damaging to employees who lose the trust of their employers, which can result in the termination of their contracts. Even when individuals do not disclose information related to their company, their professional lives can suffer the consequences of sharing unseemly posts that should have remained private. For the purpose of addressing the issue of oversharing, many researchers have studied and investigated the reasons and motivations behind it using multiple perspectives such as economics, behavioural science, and sociology. 
After the popularization of nudging as an intervention approach to improve the well-being of an individual or a group of people, there was an emerging interest in applying the concept to privacy preservation. After the initial wave of non-user-specific one-size-fits-all propositions, the scope of research extended to personalized solutions that consider individual preferences. The former are privacy-focused and more straightforward to implement than their personalized counterparts but they tend to be more rigid and less considerate of individual situations. On the other hand, the latter have the potential to understand users but can end up reinforcing biases and underperforming in their privacy protection objective. The main idea of my proposition is to merge the concepts introduced by the two waves to benefit from the merits of each. Because people exist within a larger ecosystem that governs their personal information, I start by exploring the economics of privacy in which the actors are presented as the data owner (individual), broker, and data user. I explain how they interact with one another through exchanges of data as assets and monetary compensation in return. An equilibrium can be achieved where the user is satisfied with the level of anonymity they are afforded. However, in order to achieve this, the person whose information is used as a commodity needs to be aware and make the best choices for themselves. This is not always the case because users can lack the knowledge to do so or they can be susceptible to contextual biases that warp their decision-making faculty. For this reason, my next objective was to design a framework called Multipriv, which encompasses the factors that influence people’s perception of privacy. Then, I propose a multi-agent nudge-based approach for disclosure mitigation online. Its core component includes an objective agent Aegis that is inspired by privacy-focused one-size-fits-all solutions. 
Furthermore, a personal agent represents the user’s context-specific perception, which is different from simply relying on preferences. Finally, a multiparty agent serves to give the other people involved in the co-owned content a voice. To evaluate the system, a platform called Cognicy is implemented and deployed. It mimics real social media platforms by offering the option of creating a profile, posting status updates, attaching photos, making connections with others, etc. Based on an evaluation using 150 users, my proposition proved superior to the baseline non-context-specific approach in terms of the nudge acceptance rate. Moreover, the feedback submitted by the participants at the end of their session expressed an appreciation of the explanations provided in the nudges, the visual charts, and the tools at their disposition on the platform. Divulgation Économie de la vie privée Informations personnelles Multipriv Système multi-agent Aegis Agent personnel Agent multipartite Copropriété Disclosure Self-disclosure Multiparty disclosure Economics of privacy Personal information Decision-making Multipriv Nudge-based multi-agent system Personal agent Multiparty agent Nudge
259	Deep Reinforcement Learning for Multi-Agent Path Planning in 2D Cost Map Environments : using Unity Machine Learning Agents toolkit Persson, Hannes January 2024 (has links) Multi-agent path planning is applied in a wide range of applications in robotics and autonomous vehicles, including aerial vehicles such as drones and other unmanned aerial vehicles (UAVs), to solve tasks in areas like surveillance, search and rescue, and transportation. As technology in the fields of automation and artificial intelligence evolves rapidly, multi-agent path planning is becoming increasingly relevant. The main problems encountered in multi-agent path planning are collision avoidance with other agents, obstacle evasion, and pathfinding from a starting point to an endpoint. The objective of this project was to create intelligent agents capable of navigating through two-dimensional eight-agent cost map environments to a static target, while avoiding collisions with other agents and minimizing the path cost. Reinforcement learning was applied using the development platform Unity and the open-source ML-Agents toolkit, which enables the development of intelligent agents with reinforcement learning inside Unity. Perlin Noise was used to generate the cost maps. The reinforcement learning algorithm Proximal Policy Optimization was used to train the agents. The training was structured as a curriculum with two lessons. The first lesson was designed to teach the agents to reach the target without colliding with other agents or moving out of bounds. The second lesson was designed to teach the agents to minimize the path cost. The project achieved its objectives, as determined by visual inspection and by comparing the final model with a baseline model. The baseline model was trained only to reach the target while avoiding collisions, without minimizing the path cost. 
A comparison of the models showed that the final model outperformed the baseline model, reaching an average of 27.6% lower path cost. / Multi-agent-vägsökning används inom en rad olika tillämpningar inom robotik och autonoma fordon, inklusive flygfarkoster såsom drönare och andra obemannade flygfarkoster (UAV), för att lösa uppgifter inom områden som övervakning, sök- och räddningsinsatser samt transport. I dagens snabbt utvecklande teknik inom automation och artificiell intelligens blir multi-agent-vägsökning allt mer relevant. De huvudsakliga problemen som stöts på inom multi-agent-vägsökning är kollisioner med andra agenter, undvikande av hinder och vägsökning från en startpunkt till en slutpunkt. I detta projekt var målen att skapa intelligenta agenter som kan navigera genom tvådimensionella åtta-agents kostnadskartmiljöer till ett statiskt mål, samtidigt som de undviker kollisioner med andra agenter och minimerar vägkostnaden. Metoden förstärkningsinlärning användes genom att utnyttja utvecklingsplattformen Unity och Unitys open-source ML-Agents toolkit, som möjliggör utveckling av intelligenta agenter med förstärkningsinlärning inuti Unity. Perlin Brus användes för att generera kostnadskartorna. Förstärkningsinlärningsalgoritmen Proximal Policy Optimization användes för att träna agenterna. Träningen strukturerades som en läroplan med två lektioner, den första lektionen var utformad för att lära agenterna att nå målet, utan att kollidera med andra agenter eller röra sig utanför gränserna. Den andra lektionen var utformad för att lära agenterna att minimera vägkostnaden. Projektet uppnådde framgångsrikt sina mål, vilket kunde fastställas genom visuell inspektion och genom att jämföra den slutliga modellen med en basmodell. Basmodellen tränades endast för att nå målet och undvika kollisioner, utan att minimera vägkostnaden. 
En jämförelse av modellerna visade att den slutliga modellen överträffade baslinjemodellen, och uppnådde en genomsnittlig 27,6% lägre vägkostnad. deep reinforcement learning reinforcement learning machine learning path planning cost map ML-agents unity artificial neural networks collision avoidance PPO multi agent multi-agent multi-agent system förstärkningsinlärning djup förstärkningsinlärning fleragentssystem kostnadkarta kostnadskartor artificiella neurala nätverk maskininlärning proximal policy optimization PPO svärmintelligens
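The path-cost comparison described in the abstract above can be illustrated with a small sketch. The grid, the two paths and the helper names `path_cost` and `relative_reduction` are made up for illustration; they are not the thesis's cost maps (which were generated with Perlin noise) or its evaluation code, and the numbers below are unrelated to the reported result.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]  # (row, column) index into the cost map


def path_cost(cost_map: Sequence[Sequence[float]], path: Sequence[Cell]) -> float:
    """Sum the cost of every cell visited along the path."""
    return sum(cost_map[r][c] for r, c in path)


def relative_reduction(baseline: float, final: float) -> float:
    """Fraction by which `final` is lower than `baseline`."""
    return (baseline - final) / baseline


# Toy 3x3 cost map: the middle column of the top two rows is expensive.
cost_map: List[List[float]] = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]

# A "baseline-like" path goes straight through the expensive cell.
direct = [(0, 0), (0, 1), (0, 2)]
# A "cost-aware" path takes the longer but cheaper detour.
detour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2)]

direct_cost = path_cost(cost_map, direct)
detour_cost = path_cost(cost_map, detour)
reduction = relative_reduction(direct_cost, detour_cost)
```

The toy example shows why a policy trained only to reach the target can incur a higher path cost than one that also learns to avoid expensive regions, even when the cheaper route is longer.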
260	Scalable Reinforcement Learning for Formation Control with Collision Avoidance : Localized policy gradient algorithm with continuous state and action space / Skalbar Förstärkande Inlärning för Formationskontroll med Kollisionsundvikande : Lokaliserad policygradientalgoritm med kontinuerligt tillstånds och handlingsutrymme Matoses Gimenez, Andreu January 2023 (has links) In the last decades, significant theoretical advances have been made in the field of distributed multi-agent control theory. One of the most common systems that can be modelled as multi-agent systems are the so-called formation control problems, in which a network of mobile agents is controlled to move towards a desired final formation. These problems additionally pose practical challenges, namely limited access to information about the global state of the system, which justify the use of distributed and localized approaches for solving the control problem. The problem is further complicated if partial or no information is known about the dynamic model of the system. A widely used approach for model-free control is reinforcement learning (RL). A fundamental challenge of this approach in this setting is that the state-action space size scales exponentially with the number of agents, rendering the problem intractable for large networks. This thesis presents a scalable and localized reinforcement learning approach to a traditional multi-agent formation control problem, with collision avoidance. A scalable reinforcement learning advantage actor-critic algorithm is presented, based on previous work in the literature. Sub-optimal bounds are calculated for the accumulated reward and policy gradient localized approximations. The algorithm is tested in a two-dimensional setting, with a network of mobile agents following simple integrator dynamics and stochastic localized policies. Neural networks are used to approximate the continuous value functions and policies. 
The formation control with collision avoidance formulation and the algorithm presented show good scalability properties, with a polynomial increase in the number of function approximation parameters with the number of agents. The reduced number of parameters decreases learning time for bigger networks, although the efficiency of computation is decreased compared to state-of-the-art machine learning implementations. The policies obtained achieve probably safe trajectories, although the lack of a dynamic model makes it impossible to guarantee safety. / Under de senaste decennierna har betydande framsteg gjorts inom området för distribuerad multi-agent reglerteori. Ett av de vanligaste systemen som kan modelleras som multiagentsystem är de så kallade formationskontrollproblemen, där ett nätverk av mobila agenter styrs för att röra sig mot en önskad slutlig formation. Dessa problem medför dessutom praktiska utmaningar, nämligen begränsad tillgång till information om systemets globala tillstånd, vilket motiverar användningen av distribuerade och lokaliserade tillvägagångssätt för att lösa det reglertekniska problemet. Problemet kompliceras ytterligare om delvis eller ingen information är känd om systemets dynamiska modell. Ett allmänt använt tillvägagångssätt för modellfri kontroll är reinforcement learning (RL). En grundläggande utmaning med detta tillvägagångssätt i den här miljön är att storleken på state-action utrymmet skalas exponentiellt med antalet agenter, vilket gör problemet svårlöst för ett stort nätverk. Detta examensarbete presenterar en skalbar och lokaliserad reinforcement learning metod på ett traditionellt reglertekniskt problem med flera agenter, med kollisionsundvikande. En reinforcement learning advantage actor critic algoritm presenteras, baserad på tidigare arbete i litteraturen. Suboptimala gränser beräknas för den ackumulerade belönings- och policygradientens lokaliserade approximationer. Algoritmen testas i en tvådimensionell miljö, med ett nätverk av mobila agenter som följer enkel integratordynamik och stokastiska lokaliserade policyer. 
Neurala nätverk används för att approximera de kontinuerliga värdefunktionerna och policyerna. Den presenterade formationsstyrningen med kollisionsundvikande formulering och algoritmen visar goda skalbarhetsegenskaper, med en polynomisk ökning av antalet funktionsapproximationsparametrar med antalet agenter. Det minskade antalet parametrar minskar inlärningstiden för större nätverk, även om effektiviteten i beräkningen minskar jämfört med avancerade maskininlärningsimplementeringar. De erhållna policyerna uppnår troligen säkra banor även om avsaknaden av dynamisk modell gör det omöjligt att garantera säkerheten. / En las últimas décadas, se han realizado importantes avances teóricos en el campo de la teoría del control multiagente distribuido. Uno de los sistemas más comunes que se pueden modelar como sistemas multiagente son los llamados problemas de control de formación, en los que se controla una red de agentes móviles para alcanzar una formación final deseada. Estos problemas plantean desafíos prácticos como el acceso limitado a la información del estado global del sistema, que justifican el uso de algoritmos distribuidos y locales para resolver el problema de control. El problema se complica aún más si solo se conoce información parcial o nada sobre el modelo dinámico del sistema. Un enfoque ampliamente utilizado para el control sin conocimiento del modelo dinámico es el reinforcement learning (RL). Un desafío fundamental de este método en este entorno es que el tamaño de la acción y el estado aumenta exponencialmente con la cantidad de agentes, lo que hace que el problema sea intratable para una red grande. Esta tesis presenta un algoritmo de RL escalable y local para un problema tradicional de control de formación con múltiples agentes, con prevención de colisiones. Se presenta un algoritmo “advantage actor-critic”, basado en trabajos previos en la literatura. 
Los límites subóptimos se calculan para las aproximaciones locales de la función Q y gradiente de la política. El algoritmo se prueba en un entorno bidimensional, con una red de agentes móviles que siguen una dinámica de integrador simple y políticas estocásticas localizadas. Redes neuronales se utilizan para aproximar las funciones y políticas de valor continuo. La formulación del problema de formación con prevención de colisiones y el algoritmo presentado muestran buenas propiedades de escalabilidad, con un aumento polinómico en el número de parámetros con el número de agentes. El número reducido de parámetros disminuye el tiempo de aprendizaje para redes más grandes, aunque la eficiencia de la computación disminuye en comparación con las implementaciones de ML de última generación. Las políticas obtenidas alcanzan trayectorias probablemente seguras, aunque la falta de un modelo dinámico hace imposible garantizar la completa prevención de colisiones. / A les darreres dècades, s'han realitzat importants avenços teòrics en el camp de la teoria del control multiagent distribuït. Un dels sistemes més comuns que es poden modelar com a sistemes multiagent són els anomenats problemes de control de formació, en els què es controla una xarxa d'agents mòbils per assolir una formació final desitjada. Aquests problemes plantegen reptes pràctics com l'accés limitat a la informació de l'estat global del sistema, que justifiquen l'ús d'algorismes distribuïts i locals per resoldre el problema de control. El problema es complica encara més si només es coneix informació parcial sobre el model dinàmic del sistema. Un mètode àmpliament utilitzat per al control sense coneixement del model dinàmic és el reinforcement learning (RL). Un repte fonamental d'aquest mètode en aquest entorn és que la mida de l'acció i l'estat augmenta exponencialment amb la quantitat d'agents, cosa que fa que el problema sigui intractable per a una xarxa gran. 
Aquesta tesi presenta un algorisme de RL escalable i local per a un problema tradicional de control de formació amb múltiples agents, amb prevenció de col·lisions. Es presenta un algorisme “advantage actor-critic”, basat en treballs previs a la literatura. Els límits subòptims es calculen per a les aproximacions locals de la funció Q i gradient de la política. L'algoritme es prova en un entorn bidimensional, amb una xarxa d'agents mòbils que segueixen una dinàmica d'integrador simple i polítiques estocàstiques localitzades. Xarxes neuronals s'utilitzen per aproximar les funcions i les polítiques de valor continu. La formulació del problema de formació amb prevenció de col·lisions i l'algorisme presentat mostren bones propietats d'escalabilitat, amb un augment polinòmic en el nombre de paràmetres amb el nombre d'agents. El nombre reduït de paràmetres disminueix el temps d'aprenentatge per a les xarxes més grans, encara que l'eficiència de la computació disminueix en comparació amb les implementacions de ML d'última generació. Les polítiques obtingudes aconsegueixen trajectòries probablement segures, tot i que la manca d'un model dinàmic fa impossible garantir la prevenció completa de col·lisions. Control theory Multi-agent systems Distributed systems Formation control Collision avoidance Reinforcement learning Teoria de control Sistemes multiagent Sistemes distribuïts Control de formació Prevenció de col·lisions Reinforcement Learning Reglerteknik Multi-agent system Distribuerade system formationskontroll Kollisionsundvikande Reinforcement learning Teoría de control Sistemas multiagente Sistemas distribuidos Control de formación Prevención de colisiones Reinforcement Learning Control Engineering Reglerteknik Elektroteknik och elektronik
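The scaling argument in the abstract above (exponential growth of the joint state-action space versus polynomial growth for localized approximations) can be made concrete with a rough count. This is only a sketch under a simplifying assumption: each agent's state and action spaces are discretized into a fixed number of values, whereas the thesis works with continuous spaces and neural-network approximators. The function names and the neighbourhood size are hypothetical.

```python
def joint_table_size(n_agents: int, n_states: int, n_actions: int) -> int:
    """Entries in one centralized table over all agents' states and actions.

    Every agent contributes a factor of n_states * n_actions, so the
    size grows exponentially with the number of agents.
    """
    return (n_states * n_actions) ** n_agents


def localized_table_size(
    n_agents: int, n_states: int, n_actions: int, neighbourhood: int
) -> int:
    """Total entries when each agent only models a fixed-size neighbourhood.

    Each of the n_agents keeps its own table over `neighbourhood` agents
    (itself included), so the total grows linearly in n_agents for a
    fixed neighbourhood size.
    """
    return n_agents * (n_states * n_actions) ** neighbourhood


# Assumed toy discretization: 4 local states, 3 local actions per agent.
joint = joint_table_size(n_agents=10, n_states=4, n_actions=3)
local = localized_table_size(n_agents=10, n_states=4, n_actions=3, neighbourhood=3)
```

With these toy numbers the centralized table has 12 to the power 10 entries while the localized representation has 10 times 12 cubed, which illustrates why restricting each agent to local information keeps the problem tractable as the network grows.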

Search results