Global ETD Search

21	Learning the Parameters of Reinforcement Learning from Data for Adaptive Spoken Dialogue Systems / Apprentissage automatique des paramètres de l'apprentissage par renforcement pour les systèmes de dialogues adaptatifs Asri, Layla El 21 January 2016 (has links) Cette thèse s’inscrit dans le cadre de la recherche sur les systèmes de dialogue. Ce document propose d’apprendre le comportement d’un système à partir d’un ensemble de dialogues annotés. Le système apprend un comportement optimal via l’apprentissage par renforcement. Nous montrons qu’il n’est pas nécessaire de définir une représentation de l’espace d’état ni une fonction de récompense. En effet, ces deux paramètres peuvent être appris à partir du corpus de dialogues annotés. Nous montrons qu’il est possible pour un développeur de systèmes de dialogue d’optimiser la gestion du dialogue en définissant seulement la logique du dialogue ainsi qu’un critère à maximiser (par exemple, la satisfaction utilisateur). La première étape de la méthodologie que nous proposons consiste à prendre en compte un certain nombre de paramètres de dialogue afin de construire une représentation de l’espace d’état permettant d’optimiser le critère spécifié par le développeur. Par exemple, si le critère choisi est la satisfaction utilisateur, il est alors important d’inclure dans la représentation des paramètres tels que la durée du dialogue et le score de confiance de la reconnaissance vocale. L’espace d’état est modélisé par une mémoire sparse distribuée. Notre modèle, Genetic Sparse Distributed Memory for Reinforcement Learning (GSDMRL), permet de prendre en compte de nombreux paramètres de dialogue et de sélectionner ceux qui sont importants pour l’apprentissage par évolution génétique. L’espace d’état résultant ainsi que le comportement appris par le système sont aisément interprétables. Dans un second temps, les dialogues annotés servent à apprendre une fonction de récompense qui apprend au système à optimiser le critère donné par le développeur. A cet effet, nous proposons deux algorithmes, reward shaping et distance minimisation. Ces deux méthodes interprètent le critère à optimiser comme étant la récompense globale pour chaque dialogue. Nous comparons ces deux fonctions sur un ensemble de dialogues simulés et nous montrons que l’apprentissage est plus rapide avec ces fonctions qu’en utilisant directement le critère comme récompense finale. Nous avons développé un système de dialogue dédié à la prise de rendez-vous et nous avons collecté un corpus de dialogues annotés avec ce système. Ce corpus permet d’illustrer la capacité de mise à l’échelle de la représentation de l’espace d’état GSDMRL et constitue un bon exemple de système industriel sur lequel la méthodologie que nous proposons pourrait être appliquée / This document proposes to learn the behaviour of the dialogue manager of a spoken dialogue system from a set of rated dialogues. This learning is performed through reinforcement learning. Our method does not require the definition of a representation of the state space nor a reward function. These two high-level parameters are learnt from the corpus of rated dialogues. It is shown that the spoken dialogue designer can optimise dialogue management by simply defining the dialogue logic and a criterion to maximise (e.g user satisfaction). The methodology suggested in this thesis first considers the dialogue parameters that are necessary to compute a representation of the state space relevant for the criterion to be maximized. For instance, if the chosen criterion is user satisfaction then it is important to account for parameters such as dialogue duration and the average speech recognition confidence score. The state space is represented as a sparse distributed memory. The Genetic Sparse Distributed Memory for Reinforcement Learning (GSDMRL) accommodates many dialogue parameters and selects the parameters which are the most important for learning through genetic evolution. The resulting state space and the policy learnt on it are easily interpretable by the system designer. Secondly, the rated dialogues are used to learn a reward function which teaches the system to optimise the criterion. Two algorithms, reward shaping and distance minimisation are proposed to learn the reward function. These two algorithms consider the criterion to be the return for the entire dialogue. These functions are discussed and compared on simulated dialogues and it is shown that the resulting functions enable faster learning than using the criterion directly as the final reward. A spoken dialogue system for appointment scheduling was designed during this thesis, based on previous systems, and a corpus of rated dialogues with this system were collected. This corpus illustrates the scaling capability of the state space representation and is a good example of an industrial spoken dialogue system upon which the methodology could be applied Systèmes de dialogue Apprentissage par renforcement Évaluation Fonctions de récompense Spoken dialogue systems Reinforcement learning Evaluation Reward functions 006.31
22	Dialogue Behavior Management in Conversational Recommender Systems Wärnestål, Pontus January 2007 (has links) This thesis examines recommendation dialogue, in the context of dialogue strategy design for conversational recommender systems. The purpose of a recommender system is to produce personalized recommendations of potentially useful items from a large space of possible options. In a conversational recommender system, this task is approached by utilizing natural language recommendation dialogue for detecting user preferences, as well as for providing recommendations. The fundamental idea of a conversational recommender system is that it relies on dialogue sessions to detect, continuously update, and utilize the user's preferences in order to predict potential interest in domain items modeled in a system. Designing the dialogue strategy management is thus one of the most important tasks for such systems. Based on empirical studies as well as design and implementation of conversational recommender systems, a behavior-based dialogue model called bcorn is presented. bcorn is based on three constructs, which are presented in the thesis. It utilizes a user preference modeling framework (preflets) that supports and utilizes natural language dialogue, and allows for descriptive, comparative, and superlative preference statements, in various situations. Another component of bcorn is its message-passing formalism, pcql, which is a notation used when describing preferential and factual statements and requests. bcorn is designed to be a generic recommendation dialogue strategy with conventional, information-providing, and recommendation capabilities, that each describes a natural chunk of a recommender agent's dialogue strategy, modeled in dialogue behavior diagrams that are run in parallel to give rise to coherent, flexible, and effective dialogue in conversational recommender systems. Three empirical studies have been carried out in order to explore the problem space of recommendation dialogue, and to verify the solutions put forward in this work. Study I is a corpus study in the domain of movie recommendations. The result of the study is a characterization of recommendation dialogue, and forms a base for a first prototype implementation of a human-computer recommendation dialogue control strategy. Study II is an end-user evaluation of the acorn system that implements the dialogue control strategy and results in a verification of the effectiveness and usability of the dialogue strategy. There are also implications that influence the refinement of the model that are used in the bcorn dialogue strategy model. Study III is an overhearer evaluation of a functional conversational recommender system called CoreSong, which implements the bcorn model. The result of the study is indicative of the soundness of the behavior-based approach to conversational recommender system design, as well as the informativeness, naturalness, and coherence of the individual bcorn dialogue behaviors. / I denna avhandling undersöks rekommendationsdialog med avseende på utformningen av dialogstrategier f¨or konverserande rekommendationssystem. Syftet med ett rekommendationssystem är att generera personaliserade rekommendationer utifrån potentiellt användbara domänobjekt i stora informationsrymder. I ett konverserande rekommendationssystem angrips detta problem genom att utnyttja naturligt språkk och dialog för att modellera användarpreferenser, liksom för att ge rekommendationer. Grundidén med konverserande rekommendationssystem är att utnyttja dialogsessioner för att upptäcka, uppdatera och utnyttja en användares preferenser för att förutsäga användarens intresse för domänobjekten som modelleras i ett system. Utformningen av dialogstrategihantering är därför en av de viktigaste uppgifterna för sådana system. Baserat på empiriska studier, liksom på utformning och implementering av konverserande rekommendationssystem, presenteras en beteendebaserad dialogmodell som kallas bcorn. bcorns bas utgörs av tre konstruktioner, vilka alla presenteras i denna avhandling. bcorn utnyttjar ett preferensmodelleringsramverk (preflets) som stöder och anv¨ander sig av naturligt språk i dialog och tillåter deskriptiva, komparativa och superlativa preferensuttryck i olika situationer. Den andra komponenten i bcorn är dess interna meddelande-formalism pcql, som är en notation som kan beskriva preferens- och faktiska påståenden och frågor. bcorn är utformat som en generell rekommendationshanteringsstrategi med konventionella, informationsgivande och rekommenderande förmågor, som var och en beskriver naturliga delar av en rekommendationsagents dialogstrategi. Dessa delar modelleras i dialogbeteendediagram som exekveras parallellt för att ge upphov till koherent, flexibel och effektiv dialog i konverserande rekommendationssystem. Tre empiriska studier har utförts för att utforska problemkomplexet som utgör rekommendationsdialog och för att verifiera de lösningar som tagits fram inom ramen för detta arbete. Studie I är en korpusstudie i filmrekommendationsdomänen. Studien resulterar i en karakteristik av rekommendationsdialog, och utgör basen för en första prototyp av dialoghanteringsstrategi för rekommendationsdialog mellan människa och dator. Studie II är en slutanvändarutvärdering av systemet acorn som implementerar denna dialoghanteringsstrategi och resulterar i en verifiering av effektivitet och användbarhet av strategin. Studien resulterar också i implikationer som påverkar utformningen av den modell som används i bcorn. Studie III är en medhörningsutvärdering av det funktionella konverserande rekommendationssystemet CoreSong, som implementerar bcorn-modellen. Resultatet av studien indikerar att det beteendebaserade angreppssättet är funktionellt och att de olika dialogbeteendena i bcorn ger upphov till h¨og informationskvalitet, naturlighet och koherens i rekommendationsdialog. Natural Language Interaction Dialogue Systems Recommender Systems Dialogue Strategy Management User Modeling Preference Management Computational linguistics Datorlingvistik
23	Nové metody generování promluv v dialogových systémech / Novel Methods for Natural Language Generation in Spoken Dialogue Systems Dušek, Ondřej January 2017 (has links) Title: Novel Methods for Natural Language Generation in Spoken Dialogue Systems Author: Ondřej Dušek Department: Institute of Formal and Applied Linguistics Supervisor: Ing. Mgr. Filip Jurčíček, Ph.D., Institute of Formal and Applied Linguistics Abstract: This thesis explores novel approaches to natural language generation (NLG) in spoken dialogue systems (i.e., generating system responses to be presented the user), aiming at simplifying adaptivity of NLG in three respects: domain portability, language portability, and user-adaptive outputs. Our generators improve over state-of-the-art in all of them: First, our gen- erators, which are based on statistical methods (A* search with perceptron ranking and sequence-to-sequence recurrent neural network architectures), can be trained on data without fine-grained semantic alignments, thus simplifying the process of retraining the generator for a new domain in comparison to previous approaches. Second, we enhance the neural-network-based gener- ator so that it takes preceding dialogue context into account (i.e., user's way of speaking), thus producing user-adaptive outputs. Third, we evaluate sev- eral extensions to the neural-network-based generator designed for producing output in morphologically rich languages, showing improvements in Czech generation. In...
24	Turn-taking enhancement in spoken dialogue systems with reinforcement learning / Amélioration de la Prise de Parole dans les Systèmes de Dialogue Vocaux avec Apprentissage par Renforcement Khouzaimi, Hatim 06 June 2016 (has links) Les systèmes de dialogue incrémentaux sont capables d’entamer le traitement des paroles de l’utilisateur au moment même où il les prononce (sans attendre de signal de fin de phrase tel un long silence par exemple). Ils peuvent ainsi prendre la parole à n’importe quel moment et l’utilisateur peut faire de même (et interrompre le système). De ce fait, ces systèmes permettent d’effectuer une plus large palette de comportements de prise de parole en comparaison avec les systèmes de dialogue traditionnels. Cette thèse s’articule autour de la problématique suivante : est-il possible pour un système de dialogue incrémental d’apprendre une stratégie optimale de prise de parole de façon autonome? Tout d’abord, une analyse des mécanismes sous-jacents à la dynamique de prise de parole dans une conversation homme-homme a permis d’établir une taxonomie de ces phénomènes. Ensuite, une nouvelle architecture permettant de doter les systèmes de dialogues conventionnels de capacités de traitement incrémentales de la parole, à moindre coût, a été proposée. Dans un premier temps, un simulateur de dialogue destiné à répliquer les comportements incrémentaux de l’utilisateur et de la reconnaissance vocale a été développé puis utilisé pour effectuer les premier tests de stratégies de dialogue incrémentales. Ces dernières ont été développées à base de règles issues de l’analyse effectuée lors de l’établissement de la taxonomie des phénomènes de prise de parole. Les résultats de la simulation montrent que le caractère incrémental permet d’obtenir des interactions plus efficaces. La meilleure stratégie à base de règles a été retenue comme référence pour la suite. Dans un second temps, une stratégie basée sur l’apprentissage par renforcement a été implémentée. Elle est capable d’apprendre à optimiser ses décisions de prise de parole de façon totalement autonome étant donnée une fonction de récompense. Une première comparaison, en simulation, a montré que cette stratégie engendre des résultats encore meilleurs par rapport à la stratégie à base de règles. En guise de validation, une expérience avec des utilisateurs réels a été menée (interactions avec une maison intelligente). Une amélioration significative du taux de complétion de tâche a été constatée dans le cas de la stratégie apprise par renforcement et ce, sans dégradation de l’appréciation globale par les utilisateurs de la qualité du dialogue (en réalité, une légère amélioration a été constatée). / Incremental dialogue systems are able to process the user’s speech as it is spoken (without waiting for the end of a sentence before starting to process it). This makes them able to take the floor whenever they decide to (the user can also speak whenever she wants, even if the system is still holding the floor). As a consequence, they are able to perform a richer set of turn-taking behaviours compared to traditional systems. Several contributions are described in this thesis with the aim of showing that dialogue systems’ turn-taking capabilities can be automatically improved from data. First, human-human dialogue is analysed and a new taxonomy of turn-taking phenomena in human conversation is established. Based on this work, the different phenomena are analysed and some of them are selected for replication in a human-machine context (the ones that are more likely to improve a dialogue system’s efficiency). Then, a new architecture for incremental dialogue systems is introduced with the aim of transforming a traditional dialogue system into an incremental one at a low cost (also separating the turn-taking manager from the dialogue manager). To be able to perform the first tests, a simulated environment has been designed and implemented. It is able to replicate user and ASR behaviour that are specific to incremental processing, unlike existing simulators. Combined together, these contributions led to the establishement of a rule-based incremental dialogue strategy that is shown to improve the dialogue efficiency in a task-oriented situation and in simulation. A new reinforcement learning strategy has also been proposed. It is able to autonomously learn optimal turn-taking behavious throughout the interactions. The simulated environment has been used for training and for a first evaluation, where the new data-driven strategy is shown to outperform both the non-incremental and rule-based incremental strategies. In order to validate these results in real dialogue conditions, a prototype through which the users can interact in order to control their smart home has been developed. At the beginning of each interaction, the turn-taking strategy is randomly chosen among the non-incremental, the rule-based incremental and the reinforcement learning strategy (learned in simulation). A corpus of 206 dialogues has been collected. The results show that the reinforcement learning strategy significantly improves the dialogue efficiency without hurting the user experience (slightly improving it, in fact). Apprentissage par renforcement Dialogue incrémental Systèmes de dialogue Phénomène de prise de parole Reinforcement learning Incremental dialogue Dialogue systems Turn-taking phenomena 006.35
25	Recommandation conversationnelle : écoutez avant de parlez Vachon, Nicholas 12 1900 (has links) In a world of globalization, where offers continues to grow, the ability to direct people to their specific need is essential. After being key differentiating factors for Netflix and Amazon, Recommender Systems in general are no where near a downfall. Still, one downside of the basic recommender systems is that they are mainly based on indirect feedback (our behaviour, mainly form the past) as opposed to explicit demand at a specific time. Recent development in machine learning brings us closer to the possibility for a user to express it’s specific needs in natural language and get a machine generated reply. This is what Conversational Recommendation is about. Conversational recommendation encapsulates several machine learning sub-tasks. In this work, we focus our study on methods for the task of item (in our case, movie) recommendation from conversation. To explore this setting, we use, adapt and extend state of the art transformer based neural language modeling techniques to the task of recommendation from dialogue. We study the performance of different methods using the ReDial dataset [24], a conversational- recommendation dataset for movies. We also make use of a knowledge base of movies and measure their ability to improve performance for cold-start users, items, and/or both. This master thesis is divided as follows. First, we review all the basics concepts and the previous work necessary to to this lecture. When then dive deep into the specifics our data management, the different models we tested, the set-up of our experiments and the results we got. Follows the original a paper we submitted at RecSys 2020 Conference. Note that their is a minor inconsistency since throughout the thesis, we use v to represent items but in the paper, we used i. Overall, we find that pre-trained transformer models outperform baselines even if the baselines have access to the user preferences manually extracted from their utterances. / Dans un monde de mondialisation, où les offres continuent de croître, la capacité de référer les gens vers leurs besoins spécifiques est essentiel. Après avoir été un facteur de différenciation clé pour Netflix et Amazon, les systèmes de recommandation en général ne sont pas près de disparaître. Néanmoins, l’un des leurs inconvénients est qu’ils sont principalement basés sur des informations indirects (notre comportement, principalement du passé) par opposition à une demande explicite à un moment donné. Le développement récent de l’apprentissage automatique nous rapproche de la possibilité d’exprimer nos besoins spécifiques en langage naturel et d’obtenir une réponse générée par la machine. C’est ce en quoi consiste la recommandation conversationnelle. La recommandation conversationnelle englobe plusieurs sous-tâches d’apprentissage automatique. Dans ce travail, nous concentrons notre étude sur les méthodes entourant la tâche de recommandation d’item (dans notre cas, un film) à partir d’un dialogue. Pour explorer cette avenue, nous adaptons et étendons les techniques de modélisation du langage basées sur les transformeurs à la tâche de recommandation à partir du dialogue. Nous étudions les performances de différentes méthodes à l’aide de l’ensemble de données ReDial [24], un ensemble de données de recommandation conversationnelle pour les films. Nous utilisons également une base de connaissances de films et mesurons sa capacité à améliorer les performances lorsque peu d’information sur les utilisateurs/éléments est disponible. Ce mémoire par article est divisé comme suit. Tout d’abord, nous passons en revue tous les concepts de base et les travaux antérieurs nécessaires à cette lecture. Ensuite, nous élaborons les spécificités de notre gestion des données, les différents modèles que nous avons testés, la mise en place de nos expériences et les résultats que nous avons obtenus. Suit l’article original que nous avons soumis à la conférence RecSys 2020. Notez qu’il y a une incohérence mineure puisque tout au long du mémoire, nous utilisons v pour représenter les éléments mais dans l’article, nous avons utilisé i. Dans l’ensemble, nous constatons que les modèles de transformeurs pré-entraînés surpassent les modèles de bases même si les modèles de base ont accès aux préférences utilisateur extraites manuellement des dialogues. Recommandation Systèmes de dialogue Recommendation Dialogue Systems Transformers Transformeurs
26	End-to-end dialogové systémy s předtrénovanými jazykovými modely / End-to-end dialogue systems with pretrained language models Kulhánek, Jonáš January 2021 (has links) Current dialogue systems typically consist of separate components, which are manu- ally engineered to a large part and need extensive annotation. End-to-end trainable sys- tems exist but produce lower-quality, unreliable outputs. The recent transformer-based pre-trained language models such as GPT-2 brought considerable progress to language modelling, but they rely on huge amounts of textual data, which are not available for common dialogue domains. Therefore, training these models runs a high risk of overfit- ting. To overcome these obstacles, we propose a novel end-to-end dialogue system called AuGPT. We add auxiliary training objectives to use training data more efficiently, and we use massive data augmentation via back-translation and pretraining on multiple datasets to increase data volume and diversity. We evaluate our system using automatic methods (corpus-based metrics, user simulation), human evaluation as part of the DSTC 9 shared task challenge (where our system placed 3rd out of 10), as well as extensive manual error analysis. Our method substantially outperforms the baseline on the MultiWOZ bench- mark and shows competitive results with state-of-the-art end-to-end dialogue systems. 1
27	Hierarchical reinforcement learning for spoken dialogue systems Cuayáhuitl, Heriberto January 2009 (has links) This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs), and proposes two hierarchical reinforcement learning methods to optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, where every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first one by constraining every SMDP in the hierarchy with prior expert knowledge. The latter method proposes a learning algorithm called 'HAM+HSMQ-Learning', which combines two existing algorithms in the literature of hierarchical reinforcement learning. Whilst the first method generates fully-learnt behaviour, the second one generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed on simulated and real environments based on a travel planning spoken dialogue system. Experimental results provided evidence to support the following claims: First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative (because of their higher overall performance) than hand-coded or fully-learnt dialogue behaviours. Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi) automatic design of adaptive behaviours in larger-scale spoken dialogue systems. This research makes the following contributions to spoken dialogue systems which learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for integrating simultaneously hand-coded and learnt spoken dialogue behaviours into a single learning framework. Third, an evaluation with real users of hierarchical reinforcement learning dialogue agents was essential to validate their effectiveness in a realistic environment. 020
28	Apprentissage automatique et compréhension dans le cadre d’un dialogue homme-machine téléphonique à initiative mixte / Corpus-based spoken language understanding for mixed initiative spoken dialog systems Servan, Christophe 10 December 2008 (has links) Les systèmes de dialogues oraux Homme-Machine sont des interfaces entre un utilisateur et des services. Ces services sont présents sous plusieurs formes : services bancaires, systèmes de réservations (de billets de train, d’avion), etc. Les systèmes de dialogues intègrent de nombreux modules notamment ceux de reconnaissance de la parole, de compréhension, de gestion du dialogue et de synthèse de la parole. Le module qui concerne la problématique de cette thèse est celui de compréhension de la parole. Le processus de compréhension de la parole est généralement séparé du processus de transcription. Il s’agit, d’abord, de trouver la meilleure hypothèse de reconnaissance puis d’appliquer un processus de compréhension. L’approche proposée dans cette thèse est de conserver l’espace de recherche probabiliste tout au long du processus de compréhension en l’enrichissant à chaque étape. Cette approche a été appliquée lors de la campagne d’évaluation MEDIA. Nous montrons l’intérêt de notre approche par rapport à l’approche classique. En utilisant différentes sorties du module de RAP sous forme de graphe de mots, nous montrons que les performances du décodage conceptuel se dégradent linéairement en fonction du taux d’erreurs sur les mots (WER). Cependant nous montrons qu’une approche intégrée, cherchant conjointement la meilleure séquence de mots et de concepts, donne de meilleurs résultats qu’une approche séquentielle. Dans le souci de valider notre approche, nous menons des expériences sur le corpus MEDIA dans les mêmes conditions d’évaluation que lors de la campagne MEDIA. Il s’agit de produire des interprétations sémantiques à partir des transcriptions sans erreur. Les résultats montrent que les performances atteintes par notre modèle sont au niveau des performances des systèmes ayant participé à la campagne d’évaluation. L’étude détaillée des résultats obtenus lors de la campagne MEDIA nous permet de montrer la corrélation entre, d’une part, le taux d’erreur d’interprétation et, d’autre part, le taux d’erreur mots de la reconnaissance de la parole, la taille du corpus d’apprentissage, ainsi que l’ajout de connaissance a priori aux modèles de compréhension. Une analyse d’erreurs montre l’intérêt de modifier les probabilités des treillis de mots avec des triggers, un modèle cache ou d’utiliser des règles arbitraires obligeant le passage dans une partie du graphe et s’appliquant sur la présence d’éléments déclencheurs (mots ou concepts) en fonction de l’historique. On présente les méthodes à base de d’apprentissage automatique comme nécessairement plus gourmandes en terme de corpus d’apprentissage. En modifiant la taille du corpus d’apprentissage, on peut mesurer le nombre minimal ainsi que le nombre optimal de dialogues nécessaires à l’apprentissage des modèles de langages conceptuels du système de compréhension. Des travaux de recherche menés dans cette thèse visent à déterminer quel est la quantité de corpus nécessaire à l’apprentissage des modèles de langages conceptuels à partir de laquelle les scores d’évaluation sémantiques stagnent. Une corrélation est établie entre la taille de corpus nécessaire pour l’apprentissage et la taille de corpus afin de valider le guide d’annotations. En effet, il semble, dans notre cas de l’évaluation MEDIA, qu’il ait fallu sensiblement le même nombre d’exemple pour, d’une part, valider l’annotation sémantique et, d’autre part, obtenir un modèle stochastique « de qualité » appris sur corpus. De plus, en ajoutant des données a priori à nos modèles stochastiques, nous réduisons de manière significative la taille du corpus d’apprentissage nécessaire pour atteindre les même scores du système entièrement stochastique (près de deux fois moins de corpus à score égal). Cela nous permet de confirmer que l’ajout de règles élémentaires et intuitives (chiffres, nombres, codes postaux, dates) donne des résultats très encourageants. Ce constat a mené à la réalisation d’un système hybride mêlant des modèles à base de corpus et des modèles à base de connaissance. Dans un second temps, nous nous appliquons à adapter notre système de compréhension à une application de dialogue simple : un système de routage d’appel. La problématique de cette tâche est le manque de données d’apprentissage spécifiques au domaine. Nous la résolvons en partie en utilisant divers corpus déjà à notre disposition. Lors de ce processus, nous conservons les données génériques acquises lors de la campagne MEDIA et nous y intégrons les données spécifiques au domaine. Nous montrons l’intérêt d’intégrer une tâche de classification d’appel dans un processus de compréhension de la parole spontanée. Malheureusement, nous disposons de très peu de données d’apprentissage relatives au domaine de la tâche. En utilisant notre approche intégrée de décodage conceptuel, conjointement à un processus de filtrage, nous proposons une approche sous forme de sac de mots et de concepts. Cette approche exploitée par un classifieur permet d’obtenir des taux de classification d’appels encourageants sur le corpus de test, alors que le WER est assez élevé. L’application des méthodes développées lors de la campagne MEDIA nous permet d’améliorer la robustesse du processus de routage d’appels. / Spoken dialogues systems are interfaces between users and services. Simple examples of services for which theses dialogue systems can be used include : banking, booking (hotels, trains, flights), etc. Dialogue systems are composed of a number of modules. The main modules include Automatic Speech Recognition (ASR), Spoken Language Understanding (SLU), Dialogue Management and Speech Generation. In this thesis, we concentrate on the Spoken Language Understanding component of dialogue systems. In the past, it has usual to separate the Spoken Language Understanding process from that of Automatic Speech Recognition. First, the Automatic Speech Recognition process finds the best word hypothesis. Given this hypothesis, we then find the best semantic interpretation. This thesis presents a method for the robust extraction of basic conceptual constituents (or concepts) from an audio message. The conceptual decoding model proposed follows a stochastic paradigm and is directly integrated into the Automatic Speech Recognition process. This approach allows us to keep the probabilistic search space on sequences of words produced by the Automatic Speech Recognition module, and to project it to a probabilistic search space of sequences of concepts. The experiments carried out on the French spoken dialogue corpus MEDIA, available through ELDA, show that the performance reached by our new approach is better than the traditional sequential approach. As a starting point for evaluation, the effect that deterioration of word error rate (WER) has on SLU systems is examined though use of different ASR outputs. The SLU performance appears to decrease lineary as a function of ASR word error rate.We show, however, that the proposed integrated method of searching for both words and concets, gives better results to that of a traditionnanl sequential approach. In order to validate our approach, we conduct experiments on the MEDIA corpus in the same assessment conditions used during the MEDIA campaign. The goal is toproduce error-free semantic interpretations from transcripts. The results show that the performance achieved by our model is as good as the systems involved in the evaluation campaign. Studies made on the MEDIA corpus show the concept error rate is related to the word error rate, the size of the training corpus and a priori knwoledge added to conceptual model languages. Error analyses show the interest of modifying the probabilities of word lattice with triggers, a template cache or by using arbitrary rules requiring passage through a portion of the graph and applying the presence of triggers (words or concepts) based on history. Methods based on machine learning are generally quite demanding in terms of amount of training data required. By changing the size of the training corpus, the minimum and the optimal number of dialogues needed for training conceptual language models can be measured. Research conducted in this thesis aims to determine the size of corpus necessary for training conceptual language models from which the semantic evaluation scores stagnated. A correlation is established between the necessary corpus size for learning and the corpus size necessary to validate the manual annotations. In the case of the MEDIA evaluation campaign, it took roughly the same number of examples, first to validate the semantic annotations and, secondly, to obtain a "quality" corpus-trained stochastic model. The addition of a priori knowledge to our stochastic models reduce significantly the size of the training corpus needed to achieve the same scores as a fully stochastic system (nearly half the size for the same score). It allows us to confirm that the addition of basic intuitive rules (numbers, zip codes, dates) gives very encouraging results. It leeds us to create a hybrid system combining corpus-based and knowledge-based models. The second part of the thesis examines the application of the understanding module to another simple dialogue system task, a callrouting system. A problem with this specific task is a lack of data available for training the requiered language models. We attempt to resolve this issue by supplementing he in-domain data with various other generic corpora already available, and data from the MEDIA campaing. We show the benefits of integrating a call classification task in a SLU process. Unfortunately, we have very little training corpus in the field under consideration. By using our integrated approach to decode concepts, along with an integrated process, we propose a bag of words and concepts approach. This approach used by a classifier achieved encouraging call classification rates on the test corpus, while the WER was relativelyhigh. The methods developed are shown to improve the call routing system process robustness. Compréhension de la parole Traitement automatique de la parole Apprentissage automatique Systèmes de dialogue Speech language understanding Speech processing Natural language processing Machine learning Dialogue systems
29	Revisiting user simulation in dialogue systems : do we still need them ? : will imitation play the role of simulation ? / Revisiter la simulation d'utilisateurs dans les systèmes de dialogue parlé : est-elle encore nécessaire ? : est-ce que l'imitation peut jouer le rôle de la simulation ? Chandramohan, Senthilkumar 25 September 2012 (has links) Les récents progrès dans le domaine du traitement du langage ont apporté un intérêt significatif à la mise en oeuvre de systèmes de dialogue parlé. Ces derniers sont des interfaces utilisant le langage naturel comme medium d'interaction entre le système et l'utilisateur. Le module de gestion de dialogue choisit le moment auquel l'information qu'il choisit doit être échangée avec l'utilisateur. Ces dernières années, l'optimisation de dialogue parlé en utilisant l'apprentissage par renforcement est devenue la référence. Cependant, une grande partie des algorithmes utilisés nécessite une importante quantité de données pour être efficace. Pour gérer ce problème, des simulations d'utilisateurs ont été introduites. Cependant, ces modèles introduisent des erreurs. Par un choix judicieux d'algorithmes, la quantité de données d'entraînement peut être réduite et ainsi la modélisation de l'utilisateur évitée. Ces travaux concernent une partie des contributions présentées. L'autre partie des travaux consiste à proposer une modélisation à partir de données réelles des utilisateurs au moyen de l'apprentissage par renforcement inverse / Recent advancements in the area of spoken language processing and the wide acceptance of portable devices, have attracted signicant interest in spoken dialogue systems.These conversational systems are man-machine interfaces which use natural language (speech) as the medium of interaction.In order to conduct dialogues, computers must have the ability to decide when and what information has to be exchanged with the users. The dialogue management module is responsible to make these decisions so that the intended task (such as ticket booking or appointment scheduling) can be achieved.Thus learning a good strategy for dialogue management is a critical task.In recent years reinforcement learning-based dialogue management optimization has evolved to be the state-of-the-art. A majority of the algorithms used for this purpose needs vast amounts of training data.However, data generation in the dialogue domain is an expensive and time consuming process. In order to cope with this and also to evaluatethe learnt dialogue strategies, user modelling in dialogue systems was introduced. These models simulate real users in order to generate synthetic data.Being computational models, they introduce some degree of modelling errors. In spite of this, system designers are forced to employ user models due to the data requirement of conventional reinforcement learning algorithms can learn optimal dialogue strategies from limited amount of training data when compared to the conventional algorithms. As a consequence of this, user models are no longer required for the purpose of optimization, yet they continue to provide a fast and easy means for quantifying the quality of dialogue strategies. Since existing methods for user modelling are relatively less realistic compared to real user behaviors, the focus is shifted towards user modelling by means of inverse reinforcement learning. Using experimental results, the proposed method's ability to learn a computational models with real user like qualities is showcased as part of this work. Simulation d'utilisateurs Systèmes de dialogue parlé Apprentissage par renforcement Apprentissage par renforcement inverse Gestion de dialogue User simulation Spoken dialogue systems Reinforcement learning Inverse reinforcement learning Dialogue management
30	Reinforcement learning and reward estimation for dialogue policy optimisation Su, Pei-Hao January 2018 (has links) Modelling dialogue management as a reinforcement learning task enables a system to learn to act optimally by maximising a reward function. This reward function is designed to induce the system behaviour required for goal-oriented applications, which usually means fulfilling the user’s goal as efficiently as possible. However, in real-world spoken dialogue systems, the reward is hard to measure, because the goal of the conversation is often known only to the user. Certainly, the system can ask the user if the goal has been satisfied, but this can be intrusive. Furthermore, in practice, the reliability of the user’s response has been found to be highly variable. In addition, due to the sparsity of the reward signal and the large search space, reinforcement learning-based dialogue policy optimisation is often slow. This thesis presents several approaches to address these problems. To better evaluate a dialogue for policy optimisation, two methods are proposed. First, a recurrent neural network-based predictor pre-trained from off-line data is proposed to estimate task success during subsequent on-line dialogue policy learning to avoid noisy user ratings and problems related to not knowing the user’s goal. Second, an on-line learning framework is described where a dialogue policy is jointly trained alongside a reward function modelled as a Gaussian process with active learning. This mitigates the noisiness of user ratings and minimises user intrusion. It is shown that both off-line and on-line methods achieve practical policy learning in real-world applications, while the latter provides a more general joint learning system directly from users. To enhance the policy learning speed, the use of reward shaping is explored and shown to be effective and complementary to the core policy learning algorithm. Furthermore, as deep reinforcement learning methods have the potential to scale to very large tasks, this thesis also investigates the application to dialogue systems. Two sample-efficient algorithms, trust region actor-critic with experience replay (TRACER) and episodic natural actor-critic with experience replay (eNACER), are introduced. In addition, a corpus of demonstration data is utilised to pre-train the models prior to on-line reinforcement learning to handle the cold start problem. Combining these two methods, a practical approach is demonstrated to effectively learn deep reinforcement learning-based dialogue policies in a task-oriented information seeking domain. Overall, this thesis provides solutions which allow truly on-line and continuous policy learning in spoken dialogue systems.

Search results