11 |
Optimisation des Systèmes Partiellement Observables dans les Réseaux Sans-fil : Théorie des jeux, Auto-adaptation et Apprentissage / Optimization of Partially Observable Systems in Wireless Networks: Game Theory, Self-adaptivity and Learning
Habachi, Oussama, 28 September 2012
Since delay-sensitive and bandwidth-intensive multimedia applications emerged on the Internet, the demand for network resources has increased steadily over the last decade, and users expect better quality of service. Wireless networks in particular have become pervasive and densely populated, and much work has gone into improving the use of the wireless spectrum. These observations motivate the problems considered in this dissertation. The thesis applies game theory, queueing theory and learning techniques to wireless networks under QoS constraints, especially in partially observable environments. We consider different layers of the protocol stack (OSI model): we study Opportunistic Spectrum Access (OSA) at the Medium Access Control (MAC) layer through Cognitive Radio (CR) approaches, and then congestion control at the transport layer, where we develop congestion control mechanisms for the TCP protocol.

The roadmap of the research is as follows. First, at the MAC layer, we seek optimal OSA strategies in CR networks. Secondary Users (SUs) take advantage of opportunities in licensed channels while ensuring a minimum level of QoS. An SU can either sense and access licensed channels or transmit its packets over a dedicated access (such as 3G). An SU therefore faces two conflicting options: seeking opportunities in licensed channels, at the cost of the energy spent sensing them, or transmitting over the dedicated channel without sensing, at the cost of a higher transmission delay. We model the slotted and the non-slotted systems using a queueing framework. We then analyze the non-cooperative behavior of SUs and prove the existence of a Nash equilibrium (NE) strategy. Moreover, we measure the performance gap between the centralized and the decentralized systems using the Price of Anarchy (PoA).

Although OSA at the MAC layer has been investigated in depth over the last decade, the performance of SUs, in terms of energy consumption or Quality of Service (QoS) guarantees, has been largely ignored. We therefore study OSA taking energy consumption and delay into account. We first consider a single SU that either accesses licensed channels opportunistically or transmits its packets through a dedicated channel. Because spectrum sensing is partial, the state of the spectrum is only partially observable. We therefore use the Partially Observable Markov Decision Process (POMDP) framework to design an optimal OSA policy for SUs. Specifically, we derive structural properties of the value function and prove that the optimal OSA policy has a threshold structure. We then extend the model to multiple SUs, study their non-cooperative behavior and prove the existence of an NE. Moreover, we highlight a paradox in this setting: more opportunities in the licensed spectrum may lead to worse performance for SUs. Next, we turn to spectrum management: we introduce a spectrum manager into the model and analyze the hierarchical game between the network manager and the SUs.

Finally, we focus on the transport layer and study congestion control for wireless networks under QoS and Quality of Experience (QoE) constraints. First, we propose a congestion control algorithm that takes application parameters and multimedia quality into account. We consider that network users maximize their expected multimedia quality by choosing the congestion control strategy. Since users do not know the congestion status at bottleneck links, we use a POMDP framework to determine the optimal congestion control strategy. We then consider a subjective measure of multimedia quality and propose a QoE-based congestion control algorithm, which relies on QoE feedback from receivers to adapt the congestion window size. The proposed algorithms rely on learning methods to cope with the complexity of solving POMDP problems.
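The partially observable channel state described in this abstract can be illustrated with a small sketch. It is not taken from the thesis: it assumes a single licensed channel modeled as a two-state (idle/busy) Markov chain, an imperfect sensor, and a threshold applied to the belief that the channel is idle. All function names, parameters and the form of the threshold are hypothetical.

```python
def predict(belief_idle, p_stay_idle, p_become_idle):
    """Propagate P(channel idle) one slot ahead through the Markov chain.

    p_stay_idle   = P(idle next slot | idle now)   (assumed known)
    p_become_idle = P(idle next slot | busy now)   (assumed known)
    """
    return belief_idle * p_stay_idle + (1.0 - belief_idle) * p_become_idle


def correct(belief_idle, sensed_idle, p_detect_idle, p_false_idle):
    """Bayes update of P(channel idle) after an imperfect sensing result.

    p_detect_idle = P(sensor says idle | channel idle)
    p_false_idle  = P(sensor says idle | channel busy)
    """
    if sensed_idle:
        like_idle, like_busy = p_detect_idle, p_false_idle
    else:
        like_idle, like_busy = 1.0 - p_detect_idle, 1.0 - p_false_idle
    numerator = like_idle * belief_idle
    return numerator / (numerator + like_busy * (1.0 - belief_idle))


def threshold_policy(belief_idle, threshold):
    """Hypothetical threshold rule: try the licensed channel only when the
    belief that it is idle is high enough, otherwise use the dedicated one."""
    return "sense_licensed" if belief_idle >= threshold else "use_dedicated"


if __name__ == "__main__":
    belief = 0.5
    for sensed_idle in (True, True, False):
        belief = correct(predict(belief, 0.9, 0.2), sensed_idle, 0.9, 0.1)
        print(round(belief, 3), threshold_policy(belief, 0.6))
```

The threshold value itself would come from solving the POMDP; here it is simply passed in as a parameter.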
|
12 |
Apprentissage par renforcement pour la généralisation des approches automatiques dans la conception des systèmes de dialogue oral / Statistical methods for a oral human-machine dialog system
Pinault, Florian, 24 November 2011
The human-machine dialogue systems currently used in industry are severely limited by a very rigid form of communication that forces the user to follow the logic of the system designer. This limitation is partly due to their representation of the dialogue state as predefined forms. To address this difficulty, we propose a richer and more flexible structured semantic representation that lets the user formulate a request freely.

A second difficulty that greatly handicaps dialogue systems is the high error rate of the speech recognizer. To handle these errors quantitatively, the desire to plan dialogue strategies under uncertainty has led to the use of reinforcement learning methods such as partially observable Markov decision processes (POMDPs). A drawback of the POMDP paradigm, however, is its excessive algorithmic complexity. Some recent proposals reduce the complexity of the model, but they use a form-based representation and cannot be applied directly to the rich semantic representation we propose.

To apply the POMDP model in a system whose semantic model is complex, we propose a new way of controlling its complexity by introducing a new paradigm: the summary POMDP with double belief tracking. In our proposal, the complex master POMDP is transformed into a simpler summary POMDP. A first belief update is performed in the master space (integrating probabilistic observations in the form of n-best lists), and a second belief update is performed in the summary space, so that the resulting strategies are optimized on a genuine POMDP.

We propose two methods for defining the projection of the master POMDP onto a summary POMDP: hand-written rules and automatic clustering by k nearest neighbors. For the latter, we propose to use the edit distance between graphs, which we generalize to obtain a distance between n-best lists.

In addition, coupling a summary system, based on a statistical POMDP model, with an expert system, based on ad hoc rules, provides better control over the final strategy. The lack of such control is one of the weaknesses preventing the adoption of POMDPs for dialogue in industry.

In the domain of tourist information and hotel room booking, results on simulated dialogues show the effectiveness of the reinforcement approach combined with a rule system in adapting to a noisy environment. Real tests with human users show, however, that a system optimized by reinforcement achieves better performance on the criterion for which it was optimized.

/

Dialog managers (DM) in spoken dialogue systems make decisions under highly uncertain conditions, due to errors from the speech recognition and spoken language understanding (SLU) modules. This work describes and investigates a framework for interfacing efficient probabilistic modeling for both the SLU and the DM modules. A thorough representation of the user semantics is inferred by the SLU in the form of a graph of frames and, complemented with some contextual information, is mapped to a summary space in which a stochastic POMDP dialogue manager can plan actions while taking into account the uncertainty about the current dialogue state. Tractability is ensured by the use of this intermediate summary space. To reduce the development cost of spoken dialogue systems (SDS), a clustering-based approach is also proposed to derive the master-summary mapping function automatically. An implementation is presented in the domain of the Media corpus (tourist information and hotel booking) and tested with a simulated user.
|
13 |
Vývoj trénovatelných strategií řízení pro dialogové systémy / Development of trainable policies for spoken dialogue systems
Le, Thanh Cong, January 2016
In human-human interaction, speech is the most natural and effective means of communication. Spoken Dialogue Systems (SDS) try to bring that high-level interaction to computer systems, so that with an SDS you can talk to a machine rather than learn to use a mouse and keyboard to perform a task. However, because of inaccuracy in speech recognition and the inherent ambiguity of spoken language, the dialogue state (the user's desire) can never be known with certainty, and building such an SDS is therefore not trivial. Statistical approaches have been proposed to deal with these uncertainties by maintaining a probability distribution over every possible dialogue state. Based on these distributions, the system learns how to interact with users so as to achieve the final goal as effectively as possible. In Reinforcement Learning (RL), the learning process is understood as optimizing a policy that chooses an action conditioned on the current belief state. Since the space of dialogue...
|
14 |
Implementace aproximativních Bayesovských metod pro odhad stavu v dialogových systémech / Approximative Bayes methods for belief monitoring in spoken dialogue systems
Marek, David, January 2013
The most important component of virtually any dialog system is the dialog manager. The aim of the dialog manager is to propose an action (a continuation of the dialog) given the last dialog state. The dialog state summarises all past user input and system input, and ideally it includes all the information necessary for the dialog to progress naturally. For the dialog manager to work efficiently, it is important to model the probability distribution over all dialog states as precisely as possible. Because the set of dialog states can be very large, approximate methods usually must be used. In this thesis we discuss an implementation of approximate Bayesian methods for belief state monitoring. The result is a library for dialog state monitoring in real dialog systems.
|
15 |
Interactions in Decentralized Environments
Allen, Martin William, 01 February 2009
The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful formal model for studying multiagent problems where cooperative, coordinated action is optimal, but each agent acts based on local data alone. Unfortunately, Dec-POMDPs are known to be fundamentally intractable: they are NEXP-complete in the worst case, and have been empirically observed to be beyond feasible optimal solution.

To get around these obstacles, researchers have focused on special classes of the general Dec-POMDP problem, restricting the degree to which agent actions can interact with one another. In some cases, it has been proven that such structured forms of interaction do reduce worst-case complexity. Where formal proofs are lacking, empirical observations suggest that this may also be true in other cases, although less is known precisely.

This thesis unifies a range of this existing work, extending the analysis to establish novel complexity results for some popular restricted-interaction models. We also establish new results concerning cases for which reduced complexity has been proven, showing correspondences between basic structural features and the potential for dimensionality reduction when employing mathematical programming techniques.

As our new complexity results establish that worst-case intractability is more widespread than previously known, we look to new ways of analyzing the potential average-case difficulty of Dec-POMDP instances. As this would be extremely difficult using the tools of traditional complexity theory, we take a more empirical approach. In so doing, we identify new analytical measures that apply to all Dec-POMDPs, whatever their structure. These measures allow us to identify problems that are potentially easier to solve on average, and we validate this claim empirically. As we show, the performance of well-known optimal dynamic programming methods correlates with our new measure of difficulty. Finally, we explore the approximate case, showing that our measure works well as a predictor of difficulty there, too, and provides a means of setting algorithm parameters to achieve far more efficient performance.
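To give a feel for why optimal Dec-POMDP solution is out of reach in practice, the following sketch counts the deterministic finite-horizon policy trees that a brute-force search would have to consider. It is an illustration only, not one of the thesis's measures, and it assumes every agent has the same number of actions and observations; the function names are hypothetical.

```python
def policy_tree_nodes(num_obs, horizon):
    """Number of decision nodes in one agent's policy tree: the tree has
    one action node per observation history of length 0..horizon-1."""
    return sum(num_obs ** t for t in range(horizon))


def num_local_policies(num_actions, num_obs, horizon):
    """Each node of the tree independently picks one of the actions."""
    return num_actions ** policy_tree_nodes(num_obs, horizon)


def num_joint_policies(num_agents, num_actions, num_obs, horizon):
    """A joint policy is one local policy tree per agent."""
    return num_local_policies(num_actions, num_obs, horizon) ** num_agents


if __name__ == "__main__":
    # Two agents, two actions and two observations each.
    for horizon in range(1, 6):
        print(horizon, num_joint_policies(2, 2, 2, horizon))
```

The count grows doubly exponentially in the horizon, which is the practical face of the worst-case hardness mentioned in the abstract.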
|
16 |
Optimisation des Systèmes Partiellement Observables dans les Réseaux Sans-fil : Théorie des jeux, Auto-adaptation et Apprentissage
Habachi, Oussama, 28 September 2012
The last decade has seen the emergence of the Internet and of multimedia applications that require ever more bandwidth, along with users who demand better quality of service. With this in mind, much work has been done to improve the use of the wireless spectrum. This doctoral thesis applies game theory, queueing theory and learning to wireless networks, in particular in partially observable environments. We consider different layers of the OSI model. Specifically, we study opportunistic access to the wireless spectrum at the MAC layer using cognitive radio (CR) technology. We then focus on congestion control at the transport layer and develop congestion control mechanisms for the TCP protocol.
|
17 |
Opportunistic spectrum usage and optimal control in heterogeneous wireless networks
Raiss El Fenni, Mohammed, 12 December 2012
This dissertation deals with how to use the precious wireless resources that are usually wasted through under-utilization of networks. We are particularly interested in resources that can be used opportunistically across different technologies, and we design new schemes for better and more efficient use of wireless systems, supported by mathematical frameworks. In the first part, we consider cognitive radio networks, in which a cellular service provider can lease part of its resources to secondary users or virtual providers. In the second part, we turn to delay-tolerant networks as a way to reduce the pressure on cellular traffic, with mobile users using the available resources effectively and at a lower cost; here we focus on the optimal strategy for smartphones in hybrid wireless networks. In the last part, we consider ad hoc networks as an alternative to delay-tolerant networks, especially in regions that are not covered by the cellular network, where they can be used to extend the coverage area. We develop a new analytical model of the IEEE 802.11e DCF/EDCF, and we investigate the intricate interactions among layers by building a general cross-layer framework to represent multi-hop ad hoc networks with asymmetric topology and traffic.
|
18 |
Active Sensing for Partially Observable Markov Decision Processes
Koltunova, Veronika, 10 January 2013
Context information on a smart phone can be used to tailor applications for specific situations (e.g. provide tailored routing advice based on location, gas prices and traffic). However, typical context-aware smart phone applications use very limited context information such as user identity, location and time. In the future, smart phones will need to decide which of a wide range of sensors to gather information from in order to best accommodate user needs and preferences in a given context.
In this thesis, we present a model for active sensor selection within decision-making processes, in which observational features are selected based on their longer-term impact on the decisions made by the smart phone. The thesis formulates the problem as a partially observable Markov decision process (POMDP) and proposes a non-myopic solution using a state-of-the-art approximate planning algorithm, Symbolic Perseus. We tested our method on three small example domains, comparing different policy types, discount factors and cost settings. The experimental results show that the proposed approach delivers a better policy when sensors are costly, while also computing the policy faster and with less memory.
|
20 |
[en] THE USE OF UAVS IN HUMANITARIAN RELIEF: A POMDP BASED METHODOLOGY FOR FINDING VICTIMS / [pt] O USO DE VANTS EM AJUDA HUMANITÁRIA: UMA METODOLOGIA BASEADA EM POMDP PARA ENCONTRAR VÍTIMAS
RAISSA ZURLI BITTENCOURT BRAVO, 23 June 2017
[pt, translated] The use of Unmanned Aerial Vehicles (UAVs) in humanitarian relief has been proposed by researchers to locate victims in disaster-affected areas. The urgency in this type of operation is to find affected people as quickly as possible, which means that determining the optimal routing for the UAVs is very important for saving lives. Since the UAVs have to travel over the entire affected area to find victims, the routing operation becomes equivalent to a coverage problem. In this work, a methodology for solving the coverage problem is proposed, based on a Partially Observable Markov Decision Process (POMDP) heuristic in which the observations made by the UAVs are taken into account. This heuristic chooses actions based on the available information, namely the previous actions and observations. The formulation of UAV routing is based on the idea of giving higher priorities to the areas most likely to contain victims.

To apply this technique to real cases, a four-step methodology was created. First, the problem is modeled with respect to the affected area, the type of drone to be used, the camera resolution, the average flight altitude, the starting or take-off point, and the size and priority of the states. Next, in order to test the efficiency of the algorithm through simulations, groups of victims are distributed over the area to be flown over. Then the algorithm is started and, at each iteration, the drone changes state according to the POMDP heuristic until it has covered the entire affected area. Finally, the efficiency of the algorithm is assessed through four statistics: distance traveled, operation time, coverage percentage and time to find groups of victims.

This methodology was applied to two illustrative examples: a tornado in Xanxerê, Brazil, a sudden-onset disaster in April 2015, and a refugee camp in South Sudan, a slow-onset disaster that began in 2013. Simulations demonstrated that the solution covers the entire disaster-affected area in a reasonable period of time. The distance traveled by the UAV and the duration of the operation, which depend on the number of states, did not show a significant standard deviation across simulations, which means that, even though several paths are possible because of tied priorities, the algorithm produces homogeneous results. The time to find groups of victims, and therefore the success of the rescue operation, depends on the state priorities, which are set by a specialist. If the priorities are poorly defined, the UAV will start by flying over areas without victims, which leads to the failure of the rescue operation, since the algorithm will not be saving lives as quickly as possible.

The proposed algorithm was also compared with the greedy method. Initially, that method did not cover 100 percent of the affected area, which made the comparison unfair. To work around this, the greedy algorithm was forced to cover 100 percent of the affected area, and the results show that the POMDP performs better in terms of time to save victims. In terms of distance traveled and operation time, the results are equal or better for the POMDP. This happens because the greedy algorithm is biased toward optimizing distance traveled and hence operation time, whereas the POMDP in this dissertation aims to save lives and does so dynamically, updating its probability distribution after each observation.

The novelty of this methodology is highlighted in chapter 3, where more than 139 works were read and classified in order to show what the applications of drones in humanitarian logistics are, how POMDPs are used with drones, and how simulation is used in humanitarian logistics. Only one article proposes the u...

[en] The use of Unmanned Aerial Vehicles (UAVs) in humanitarian relief has been proposed by researchers for searching for victims in disaster-affected areas. The urgency of this type of operation is to find the affected people as soon as possible, which means that determining the optimal flight path for UAVs is very important for saving lives. Since the UAVs have to search through the entire affected area to find victims, the path planning operation becomes equivalent to an area coverage problem. In this study, a methodology to solve the coverage problem is proposed, based on a Partially Observable Markov Decision Process (POMDP) heuristic, which considers the observations made by the UAVs. The formulation of the UAV path planning is based on the idea of assigning higher priorities to the areas that are more likely to contain victims. The methodology was applied to two illustrative examples: a tornado in Xanxerê, Brazil, which was a rapid-onset disaster in April 2015, and a refugee camp in South Sudan, a slow-onset disaster that started in 2013. Simulations demonstrate that this solution achieves full coverage of disaster-affected areas in a reasonable time span. The traveled distance and the duration of the operation, which depend on the number of states, did not show a significant standard deviation across simulations. This means that even though many paths are possible because of tied priorities, the algorithm produces homogeneous results. The time to find groups of victims, and so the success of the search and rescue operation, depends on the specialist's definition of the state priorities. A comparison with a greedy algorithm showed that the POMDP finds victims faster, while the greedy algorithm focuses on minimizing the traveled distance. Future research should address a practical application of the proposed methodology.
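The priority-driven coverage idea in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the author's POMDP heuristic: it omits the observation-based update of the probability distribution, and it assumes a rectangular grid of cells with specialist-assigned priorities, Manhattan flight distance, and distance-based tie-breaking. All names are hypothetical.

```python
def priority_coverage(priorities, start):
    """Visit every grid cell, always flying next to the unvisited cell with
    the highest priority; ties are broken by Manhattan distance from the
    drone's current position (an assumption of this sketch).

    Returns the visiting order and the total Manhattan distance flown.
    """
    rows, cols = len(priorities), len(priorities[0])
    unvisited = {(r, c) for r in range(rows) for c in range(cols)}
    unvisited.discard(start)
    position, path, distance = start, [start], 0

    while unvisited:
        def rank(cell):
            gap = abs(cell[0] - position[0]) + abs(cell[1] - position[1])
            # Highest priority first, then nearest, then a fixed cell order.
            return (-priorities[cell[0]][cell[1]], gap, cell)

        target = min(unvisited, key=rank)
        distance += abs(target[0] - position[0]) + abs(target[1] - position[1])
        position = target
        path.append(target)
        unvisited.discard(target)

    return path, distance


if __name__ == "__main__":
    grid = [[1, 3],
            [2, 3]]  # hypothetical priorities: higher = more likely victims
    print(priority_coverage(grid, (0, 0)))
```

Because every cell is eventually visited, coverage is complete by construction; what the priorities change is how early the likely victim areas are reached, which is the statistic the abstract identifies as depending on the specialist's choices.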
|