Global ETD Search

81	Classification sur données médicales à l'aide de méthodes d'optimisation et de datamining, appliquée au pré-screening dans les essais cliniques / Classification on medical data using combinatorial optimization and data mining, applicated to patient screening in clinical trials Jacques, Julie 02 December 2013 (has links) Les données médicales souffrent de problèmes d'uniformisation ou d'incertitude, ce qui les rend difficilement utilisables directement par des logiciels médicaux, en particulier dans le cas du recrutement pour les essais cliniques. Dans cette thèse, nous proposons une approche permettant de palier la mauvaise qualité de ces données à l'aide de méthodes de classification supervisée. Nous nous intéresserons en particulier à 3 caractéristiques de ces données : asymétrie, incertitude et volumétrie. Nous proposons l'algorithme MOCA-I qui aborde ce problème combinatoire de classification partielle sur données asymétriques sous la forme d'un problème de recherche locale multi-objectif. Après avoir confirmé les apports de la modélisation multiobjectif dans ce contexte, nous calibrons MOCA-I et le comparons aux meilleurs algorithmes de classification de la littérature, sur des jeux de données réels et asymétriques de la littérature. Les ensembles de règles obtenus par MOCA-I sont statistiquement plus performants que ceux de la littérature, et 2 à 6 fois plus compacts. Pour les données ne présentant pas d'asymétrie, nous proposons l'algorithme MOCA, statistiquement équivalent à ceux de la littérature. Nous analysons ensuite l'impact de l'asymétrie sur le comportement de MOCA et MOCA-I, de manière théorique et expérimentale. Puis, nous proposons et évaluons différentes méthodes pour traiter les nombreuses solutions Pareto générées par MOCA-I, afin d'assister l'utilisateur dans le choix de la solution finale et réduire le phénomène de sur-apprentissage. Enfin, nous montrons comment le travail réalisé peut s'intégrer dans une solution logicielle. 
/ Medical data suffer from uncertainty and a lack of standardization, which makes them hard to use directly in medical software, especially for patient screening in clinical trials. In this PhD work, we propose to compensate for the poor quality of these data using supervised classification methods. We focus on three properties of these data: imbalance, uncertainty and volume. We propose the MOCA-I algorithm, which treats this combinatorial problem of partial classification on imbalanced data as a multi-objective local search problem. After confirming the benefits of multi-objective modelling in this context, we calibrate MOCA-I and compare it to the best classification algorithms of the literature on real, imbalanced data sets from the literature. The rule sets generated by MOCA-I are statistically better than the models obtained by the best algorithms of the literature, and 2 to 6 times more compact. For balanced data, we propose the MOCA algorithm, which is statistically equivalent to the best algorithms of the literature. We then analyze, both theoretically and experimentally, how imbalance affects the behavior of MOCA and MOCA-I. To help the decision maker choose a final solution and to reduce overfitting, we propose and evaluate different methods for handling the many Pareto solutions generated by MOCA-I. Finally, we show how this work can be integrated into a software application. Optimisation multi-objectif 006.31
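The abstract above mentions handling the Pareto solutions produced by a multi-objective search. As a minimal sketch of what a Pareto set is, the snippet below filters a list of candidate rule sets down to the non-dominated ones. The two objectives (named here sensitivity and confidence), the scores and the function names are illustrative assumptions, not MOCA-I's actual objectives or code; both objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (sensitivity, confidence) scores of four candidate rule sets.
CANDIDATES = [(0.9, 0.5), (0.8, 0.8), (0.7, 0.7), (0.5, 0.9)]
FRONT = pareto_front(CANDIDATES)
```

Here (0.7, 0.7) is dropped because (0.8, 0.8) is better on both objectives; the three remaining candidates are trade-offs among which a decision maker would still have to choose.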
82	Novel learning and exploration-exploitation methods for effective recommender systems / Nouveaux algorithmes et méthodes d’exploration-exploitation pour des systèmes de recommandations efficaces Warlop, Romain 19 October 2018 (has links) Cette thèse, réalisée en entreprise en tant que thèse CIFRE dans l'entreprise fifty-five, étudie les algorithmes des systèmes de recommandation. Nous avons proposé trois nouveaux algorithmes améliorant l'état de l'art que ce soit en termes de performance ou de prise en compte des contraintes industrielles. Pour cela nous avons proposé un premier algorithme basé sur la factorisation de tenseur, généralisation de la factorisation de matrice couramment appliquée en filtrage collaboratif.Nous avons ensuite proposé un algorithme permettant d'améliorer l'état de l'art des solutions de complétion de paniers. L'objectif des algorithmes de complétion de paniers est de proposer à l'utilisateur un nouveau produit à ajouter au panier qu'il/elle est en train d'acheter permettant ainsi d'augmenter la valeur d'un utilisateur. Pour cela nous nous sommes appuyés sur les processus ponctuels déterminantal. Nous avons généralisé l'approche de la complétion de paniers par DPP en utilisant une approche tensorielle. Enfin nous avons proposé un algorithme d'apprentissage par renforcement permettant d'alterner entre différents algorithmes de recommandation. En effet, utiliser toujours le même algorithme peut avoir tendance à ennuyer l'utilisateur pendant un certain temps, ou à l'inverse lui donner de plus en plus confiance en l'algorithme. Ainsi la performance d'un algorithme donné n'est pas stationnaire et dépend de quand et à quelle fréquence celui-ci a été utilisé. Notre algorithme d'apprentissage par renforcement apprend en temps réel à alterner entre divers algorithmes de recommandations dans le but de maximiser les performances sur le long terme. 
/ This thesis, carried out as a CIFRE industrial thesis at the company fifty-five, studies recommender system algorithms. We propose three new algorithms that improve on state-of-the-art solutions, either in performance or in how well they meet industrial constraints. First, we propose an algorithm based on tensor factorization, a generalization of the matrix factorization commonly used in collaborative filtering. We then propose a new algorithm that improves on state-of-the-art basket completion algorithms. The goal of basket completion is to recommend a new product to a user based on the products she is about to purchase, in order to increase the value of that user. To that end we leverage Determinantal Point Processes (DPPs), i.e., probability measures in which the probability of observing a given set is proportional to the determinant of a kernel matrix. We generalize DPP approaches to basket completion using a tensor point of view coupled with logistic regression. Finally, we propose a reinforcement learning algorithm that alternates between several recommender system algorithms. Indeed, always using the same algorithm may either bore the user for a while or, conversely, reinforce her trust in the system. The performance of a given algorithm is therefore not stationary: it depends on when and how often the algorithm has been used in the past. Our reinforcement learning algorithm learns in real time how to alternate between several recommender system algorithms so as to maximize long-term performance, that is, to keep the user interested in the system for as long as possible. Processus ponctuel déterminantal Factorisation de tenseur Complétion de panier 006.31
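To illustrate the statement that a determinantal point process assigns each set a probability proportional to the determinant of a kernel submatrix, here is a small sketch using the standard L-ensemble form, P(S) = det(L_S) / det(L + I). The 3-item similarity kernel and all function names are made up for illustration; this is not the tensor-based model of the thesis.

```python
from itertools import combinations


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0  # the determinant of the empty matrix is 1
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def submatrix(kernel, items):
    """Rows and columns of the kernel restricted to the given items."""
    return [[kernel[i][j] for j in items] for i in items]


def dpp_probability(kernel, items):
    """L-ensemble probability of observing exactly the set `items`."""
    n = len(kernel)
    shifted = [[kernel[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    return determinant(submatrix(kernel, items)) / determinant(shifted)


# Hypothetical kernel: items 0 and 1 are very similar, item 2 is different.
KERNEL = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
```

With this kernel, the diverse pair {0, 2} receives a higher probability than the redundant pair {0, 1}, which is the property that makes DPPs attractive for suggesting items that complement a basket.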
83	Optimisation combinatoire et extraction de connaissances sur données hétérogènes et temporelles : application à l’identification de parcours patients / Combinatorial optimization and knowledge extraction on heterogeneous and temporal data : application to patients profiles discovery Vandromme, Maxence 30 May 2017 (has links) Les données hospitalières présentent de nombreuses spécificités qui rendent difficilement applicables les méthodes de fouille de données traditionnelles. Dans cette thèse, nous nous intéressons à l'hétérogénéité de ces données ainsi qu'à leur aspect temporel. Dans le cadre du projet ANR ClinMine et d'une convention CIFRE avec la société Alicante, nous proposons deux nouvelles méthodes d'extraction de connaissances adaptées à ces types de données. Dans la première partie, nous développons l'algorithme MOSC (Multi-Objective Sequence Classification) pour la classification supervisée sur données hétérogènes, numériques et temporelles. Cette méthode accepte, en plus des termes binaires ou symboliques, des termes numériques et des séquences d'événements temporels pour former des ensembles de règles de classification. MOSC est le premier algorithme de classification supportant simultanément ces types de données. Dans la seconde partie, nous proposons une méthode de biclustering pour données hétérogènes, un problème qui n'a à notre connaissance jamais été exploré. Cette méthode, HBC (Heterogeneous BiClustering), est étendue pour supporter les données temporelles de différents types : événements temporels et séries temporelles irrégulières. HBC est utilisée pour un cas d'étude sur un ensemble de données hospitalières, dont l'objectif est d'identifier des groupes de patients ayant des profils similaires. Les résultats obtenus sont cohérents et intéressants d'un point de vue médical ; et amènent à la définition de cas d'étude plus précis. 
L'intégration dans une solution logicielle est également engagée, avec une version parallèle de HBC et un outil de visualisation des résultats. / Hospital data exhibit numerous specificities that make traditional data mining tools hard to apply. In this thesis, we focus on the heterogeneity of hospital data and on their temporal aspect. This work was carried out within the ANR ClinMine research project and a CIFRE partnership with the company Alicante. We propose two new knowledge discovery methods suited to hospital data, each able to perform a variety of tasks: classification, prediction, discovery of patient profiles, etc. In the first part, we introduce MOSC (Multi-Objective Sequence Classification), an algorithm for supervised classification on heterogeneous, numeric and temporal data. In addition to binary and symbolic terms, this method uses numeric terms and sequences of temporal events to form sets of classification rules. MOSC is the first classification algorithm able to handle these types of data simultaneously. In the second part, we introduce HBC (Heterogeneous BiClustering), a biclustering algorithm for heterogeneous data, a problem that, to our knowledge, has never been studied before. This algorithm is extended to support temporal data of various types: temporal events and unevenly sampled time series. HBC is used in a case study on a set of hospital data, whose goal is to identify groups of patients sharing a similar profile. The results make sense from a medical viewpoint; they indicate that relevant, and sometimes new, knowledge is extracted from the data. These results also lead to further, more precise case studies. The integration of HBC into a software application is also under way, with the implementation of a parallel version and a visualization tool for biclustering results. Biclustering Classification double Données hétérogènes Données temporelles 006.31
84	Efficient sequential learning in structured and constrained environments / Apprentissage séquentiel efficace dans des environnements structurés avec contraintes Calandriello, Daniele 18 December 2017 (has links) L'avantage principal des méthodes d'apprentissage non-paramétriques réside dans le fait que la nombre de degrés de libertés du modèle appris s'adapte automatiquement au nombre d'échantillons. Ces méthodes sont cependant limitées par le "fléau de la kernelisation": apprendre le modèle requière dans un premier temps de construire une matrice de similitude entre tous les échantillons. La complexité est alors quadratique en temps et espace, ce qui s'avère rapidement trop coûteux pour les jeux de données de grande dimension. Cependant, la dimension "effective" d'un jeu de donnée est bien souvent beaucoup plus petite que le nombre d'échantillons lui-même. Il est alors possible de substituer le jeu de donnée réel par un jeu de données de taille réduite (appelé "dictionnaire") composé exclusivement d'échantillons informatifs. Malheureusement, les méthodes avec garanties théoriques utilisant des dictionnaires comme "Ridge Leverage Score" (RLS) ont aussi une complexité quadratique. Dans cette thèse nous présentons une nouvelle méthode d'échantillonage RLS qui met à jour le dictionnaire séquentiellement en ne comparant chaque nouvel échantillon qu'avec le dictionnaire actuel, et non avec l'ensemble des échantillons passés. Nous montrons que la taille de tous les dictionnaires ainsi construits est de l'ordre de la dimension effective du jeu de données final, garantissant ainsi une complexité en temps et espace à chaque étape indépendante du nombre total d'échantillons. Cette méthode présente l’avantage de pouvoir être parallélisée. Enfin, nous montrons que de nombreux problèmes d'apprentissage non-paramétriques peuvent être résolus de manière approchée grâce à notre méthode. 
/ The main advantage of non-parametric models is that the accuracy of the model (its degrees of freedom) adapts to the number of samples. The main drawback is the so-called "curse of kernelization": to learn the model we must first compute a similarity matrix among all samples, which requires quadratic space and time and is infeasible for large datasets. Nonetheless, the underlying effective dimension (effective d.o.f.) of the dataset is often much smaller than its size, and we can replace the dataset with a subset (dictionary) of highly informative samples. Unfortunately, fast data-oblivious selection methods (e.g., uniform sampling) almost always discard useful information, while data-adaptive methods that provably construct an accurate dictionary, such as ridge leverage score (RLS) sampling, have a quadratic time/space cost. In this thesis we introduce a new single-pass streaming RLS sampling approach that constructs the dictionary sequentially, where each step compares a new sample only with the current intermediate dictionary and not with all past samples. We prove that the size of all intermediate dictionaries scales only with the effective dimension of the dataset, which guarantees a per-step time and space complexity independent of the number of samples. This reduces the overall time required to construct provably accurate dictionaries from quadratic to near-linear, or even logarithmic when parallelized. Finally, for many non-parametric learning problems (e.g., K-PCA, graph SSL, online kernel learning) we show that the generated dictionaries can be used to compute approximate solutions in near-linear time that are both provably accurate and empirically competitive. Méthode de Nyström Échantillonnage préférentiel 006.31
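For readers unfamiliar with ridge leverage scores, the sketch below computes them exactly from a full kernel matrix K, using the standard definition tau_i = [K (K + gamma I)^-1]_ii; their sum is the effective dimension. This is the quadratic-cost baseline that the abstract contrasts with, not the streaming method of the thesis. The function names, the toy kernels and the choice gamma = 1 are assumptions made for illustration.

```python
def solve(a, b):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [a[i][:] + b[i][:] for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def ridge_leverage_scores(kernel, gamma):
    """Exact RLS: the diagonal of (K + gamma I)^-1 K, which equals the
    diagonal of K (K + gamma I)^-1 because the two factors commute."""
    n = len(kernel)
    regularized = [[kernel[i][j] + (gamma if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
    x = solve(regularized, kernel)
    return [x[i][i] for i in range(n)]


def effective_dimension(kernel, gamma):
    """Sum of the ridge leverage scores."""
    return sum(ridge_leverage_scores(kernel, gamma))


# Two toy kernels: two unrelated samples versus two identical samples.
DISTINCT = [[1.0, 0.0], [0.0, 1.0]]
DUPLICATED = [[1.0, 1.0], [1.0, 1.0]]
```

With gamma = 1, each of the two unrelated samples has score 1/2, while each of the two identical samples has score 1/3: redundant samples carry less leverage, which is why a dictionary sampled according to these scores can stay close to the effective dimension in size.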
85	Programmation logique inductive pour la classification et la transformation de documents semi-structurés / Inductive logic programing for tree classification and transformation Decoster, Jean 17 July 2014 (has links) L’échange d’informations entre périphériques variés et sur internet soulève de nombreux problèmes par le volume et l’hétéroclisme des données échangées. La plupart de ces échanges utilisent le format XML. Afin de les faciliter, des traitements intelligents, comme la classification et la transformation automatiques, ont été développés. Le but de cette thèse est double : proposer un framework d'apprentissage pour la classification de documents XML et étudier l'apprentissage de transformations de documents XML. Le choix d’utiliser la Programmation Logique Inductive a été fait. Même si les méthodes d'apprentissage ont alors un surcoût algorithmique non négligeable (certaines opérations deviennent NP-dures), la représentation relationnelle semble adaptée aux documents XML de par son expressivité. Notre framework pour la classification fait suite à l'étude de familles de clauses pour la représentation de structures arborescentes. Il repose sur une réécriture des opérations de base de la PLI que sont la theta-subsomption et le moindre généralisé [Plotkin1971]. Nos algorithmes sont polynomiaux en temps dans la taille de leur entrée là où ceux standards sont exponentiels. Ils permettent une identification à la limite [Gold1967] de nos familles de clauses. Notre seconde contribution débute par la modélisation d’une famille de clauses dans la lignée des programmes fonctionnels [Paulson91]. Ces clauses sont une adaptation à la PLI des scripts d'édition et prennent en compte un contexte. Elles permettent la représentation de transformations de documents XML. Leurs apprentissages sont possibles grâce à deux algorithmes de type A, approche courante en PLI (HOC-Learner [Santos2009]). 
/ The recent proliferation of XML documents in databases and web applications raises issues due to the volume and diversity of the data exchanged. To ease their use, intelligent processing such as automatic classification and transformation has been developed. This thesis has two goals: (1) to propose a framework for the classification of XML documents, and (2) to study the learning of XML document transformations. We have chosen to use Inductive Logic Programming (ILP). The expressiveness of logic programs grants flexibility in specifying the learning task and makes the induced theories understandable. This flexibility implies a high computational cost, which constrains the applicability of ILP systems. However, since XML documents are trees, a good compromise can be found. For our first contribution, we define clause languages that allow XML trees to be encoded. The definition of our classification framework follows from the study of these languages. It relies on a rewriting of the standard ILP operations, namely theta-subsumption and least general generalization [Plotkin1971]. Our algorithms are polynomial in time in the size of their input, whereas the standard ones are exponential. They allow identification in the limit [Gold1967] of our languages. Our second contribution is the construction of methods to learn XML document transformations. It begins with the definition of a class of clauses in the spirit of functional programs [Paulson91]. These clauses are an ILP adaptation of edit scripts and take a context into account. They can be learned thanks to two A-like algorithms, a common ILP approach (HOC-Learner [Santos2009]). Programmation logique inductive Transformation de données Thêta-subsomption Moindre généralisé 006.31
86	Μηχανική μάθηση : Bayesian δίκτυα και εφαρμογές Χριστακοπούλου, Κωνσταντίνα 13 October 2013 (has links) Στην παρούσα διπλωματική εργασία πραγματευόμαστε το θέμα της χρήσης των Bayesian Δικτύων -και γενικότερα των Πιθανοτικών Γραφικών Μοντέλων - στη Μηχανική Μάθηση. Στα πρώτα κεφάλαια της εργασίας αυτής παρουσιάζουμε συνοπτικά τη θεωρητική θεμελίωση αυτών των δομημένων πιθανοτικών μοντέλων, η οποία απαρτίζεται από τις βασικές φάσεις της αναπαράστασης, επαγωγής συμπερασμάτων, λήψης αποφάσεων και εκμάθησης από τα διαθέσιμα δεδομένα. Στα επόμενα κεφάλαια, εξετάζουμε ένα ευρύ φάσμα εφαρμογών των πιθανοτικών γραφικών μοντέλων και παρουσιάζουμε τα αποτελέσματα των εξομοιώσεων που υλοποιήσαμε. Συγκεκριμένα, αρχικά με χρήση γράφων ορίζονται τα Bayesian δίκτυα, Markov δίκτυα και Factor Graphs. Έπειτα, παρουσιάζονται οι αλγόριθμοι επαγωγής συμπερασμάτων που επιτρέπουν τον απευθείας υπολογισμό πιθανοτικών κατανομών από τους γράφους. Διευκολύνεται η λήψη αποφάσεων υπό αβεβαιότητα με τα δέντρα αποφάσεων και τα Influence διαγράμματα. Ακολούθως, μελετάται η εκμάθηση της δομής και των παραμέτρων των πιθανοτικών γραφικών μοντέλων σε παρουσία πλήρους ή μερικού συνόλου δεδομένων. Τέλος, παρουσιάζονται εκτενώς σενάρια τα οποία καταδεικνύουν την εκφραστική δύναμη, την ευελιξία και τη χρηστικότητα των Πιθανοτικών Γραφικών Μοντέλων σε εφαρμογές του πραγματικού κόσμου. / The main subject of this diploma thesis is how probabilistic graphical models can be used in a wide range of real-world scenarios. In the first chapters, we present concisely the theoretical foundations of graphical models, which consist of the closely related phases of representation, inference, decision making and learning from data. In the following chapters, we work on many applications, from Optical Character Recognition to action recognition, and present the results of the simulations we implemented. 
Μηχανική μάθηση Bayesian δίκτυα 006.31 Machine learning Bayesian networks Probabilistic graphical models
87	Machine learning in embedded systems Swere, Erick A. R. January 2008 (has links) This thesis describes novel machine learning techniques specifically designed for use in real-time embedded systems. The techniques directly address three major requirements of such learning systems. Firstly, learning must be capable of being achieved incrementally, since many applications do not have a representative training set available at the outset. Secondly, to guarantee real-time performance, the techniques must be able to operate within a deterministic and limited time bound. Thirdly, the memory requirement must be limited and known a priori to ensure the limited memory available to hold data in embedded systems will not be exceeded. The work described here has three principal contributions. The frequency table is a data structure specifically designed to reduce the memory requirements of incremental learning in embedded systems. The frequency table facilitates a compact representation of received data that is sufficient for decision tree generation. The frequency table decision tree (FTDT) learning method provides classification performance similar to existing decision tree approaches, but extends these to incremental learning while substantially reducing memory usage for practical problems. The incremental decision path (IDP) method is able to efficiently induce, from the frequency table of observations, the path through a decision tree that is necessary for the classification of a single instance. The classification performance of IDP is equivalent to that of existing decision tree algorithms, but since IDP allows the maximum number of partial decision tree nodes to be determined prior to the generation of the path, both the memory requirement and the execution time are deterministic. In this work, the viability of the techniques is demonstrated through application to realtime mobile robot navigation. 006.31
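The idea of a frequency table that summarizes a stream of observations compactly yet still supports decision tree induction can be sketched as follows. The table layout (one counter per distinct combination of attribute values and class), the class and function names, and the toy data are assumptions made for illustration; the thesis's actual FTDT and IDP data structures may differ.

```python
from collections import Counter
from math import log2


class FrequencyTable:
    """Counts how often each (attribute values, class) pair was observed."""

    def __init__(self):
        self.counts = Counter()

    def update(self, values, label):
        """Incremental learning step: one observation, one counter bump."""
        self.counts[(tuple(values), label)] += 1

    def class_counts(self, attribute=None):
        """Class counts grouped by the value of one attribute
        (a single group keyed by None if no attribute is given)."""
        groups = {}
        for (values, label), n in self.counts.items():
            key = values[attribute] if attribute is not None else None
            groups.setdefault(key, Counter())[label] += n
        return groups


def entropy(label_counts):
    total = sum(label_counts.values())
    return -sum((n / total) * log2(n / total) for n in label_counts.values() if n)


def information_gain(table, attribute):
    """Gain of splitting on `attribute`, computed from the counts alone."""
    overall = table.class_counts()[None]
    total = sum(overall.values())
    remainder = sum(sum(c.values()) / total * entropy(c)
                    for c in table.class_counts(attribute).values())
    return entropy(overall) - remainder


# Hypothetical stream of (weather, temperature) -> class observations,
# seen twice: the table still holds only four entries.
STREAM = [(("sunny", "hot"), "no"), (("sunny", "cool"), "no"),
          (("rainy", "hot"), "yes"), (("rainy", "cool"), "yes")] * 2
TABLE = FrequencyTable()
for observed_values, observed_label in STREAM:
    TABLE.update(observed_values, observed_label)
```

In this sketch, memory grows with the number of distinct value combinations rather than with the number of observations, so for discrete attributes with known value ranges an upper bound can be computed in advance, in the spirit of the bounded-memory requirement the abstract describes. The split criterion is evaluated directly from the counts, without storing the raw observations.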
88	Detecting anomalies in multivariate time series from automotive systems Theissler, Andreas January 2013 (has links) In the automotive industry test drives are conducted during the development of new vehicle models or as a part of quality assurance for series vehicles. During the test drives, data is recorded for the use of fault analysis resulting in millions of data points. Since multiple vehicles are tested in parallel, the amount of data that is to be analysed is tremendous. Hence, manually analysing each recording is not feasible. Furthermore the complexity of vehicles is ever-increasing leading to an increase of the data volume and complexity of the recordings. Only by effective means of analysing the recordings, one can make sure that the effort put in the conducting of test drives pays off. Consequently, effective means of test drive analysis can become a competitive advantage. This Thesis researches ways to detect unknown or unmodelled faults in recordings from test drives with the following two aims: (1) in a data base of recordings, the expert shall be pointed to potential errors by reporting anomalies, and (2) the time required for the manual analysis of one recording shall be shortened. The idea to achieve the first aim is to learn the normal behaviour from a training set of recordings and then to autonomously detect anomalies. The one-class classifier “support vector data description” (SVDD) is identified to be most suitable, though it suffers from the need to specify parameters beforehand. One main contribution of this Thesis is a new autonomous parameter tuning approach, making SVDD applicable to the problem at hand. Another vital contribution is a novel approach enhancing SVDD to work with multivariate time series. The outcome is the classifier “SVDDsubseq” that is directly applicable to test drive data, without the need for expert knowledge to configure or tune the classifier. 
The second aim is achieved by adapting visual data mining techniques to make the manual analysis of test drives more efficient. The methods of “parallel coordinates” and “scatter plot matrices” are enhanced with sophisticated filter and query operations, combined with a query tool that lets the user formulate search patterns graphically. By combining the autonomous classifier “SVDDsubseq” with user-driven visual data mining techniques, a novel, data-driven, semi-autonomous approach to detecting unmodelled faults in recordings from test drives is proposed and successfully validated on such recordings. The methodologies in this thesis can serve as a guideline for setting up an anomaly detection system for one's own vehicle data. 006.31
89	Gestion de l'incertitude pour l'optimisation de systèmes interactifs / Dealing with uncertainty to optimise interactive systems Daubigney, Lucie 01 October 2013 (has links) Le sujet des travaux concerne l'amélioration du comportement des machines dites « intelligentes », c'est-à-dire capables de s'adapter à leur environnement, même lorsque celui-ci évolue. Un des domaines concernés est celui des interactions homme-machine. La machine doit alors gérer différents types d'incertitude pour agir de façon appropriée. D'abord, elle doit pouvoir prendre en compte les variations de comportements entre les utilisateurs et le fait que le comportement peut varier d'une utilisation à l'autre en fonction de l'habitude à interagir avec le système. De plus, la machine doit s'adapter à l'utilisateur même si les moyens de communication entre lui et la machine sont bruités. L'objectif est alors de gérer ces incertitudes pour exhiber un comportement cohérent. Ce dernier se définit comme la suite de décisions successives que la machine doit effectuer afin de parvenir à l'objectif fixé. Une manière habituelle pour gérer les incertitudes passe par l'introduction de modèles : modèles de l'utilisateur, de la tâche, ou encore de la décision. Un inconvénient de cette méthode réside dans le fait qu'une connaissance experte liée au domaine concerné est nécessaire à la définition des modèles. Si l'introduction d'une méthode d'apprentissage automatique, l'apprentissage par renforcement a permis d'éviter une modélisation de la décision ad hoc au problème concerné, des connaissances expertes restent toutefois nécessaires. La thèse défendue par ces travaux est que certaines contraintes liées à l'expertise humaine peuvent être relaxées tout en limitant la perte de généricité liée à l'introduction de modèles. / The behaviour of machines is difficult to define, especially when machines have to adapt to a changing environment. 
For example, this is the case in human-machine interaction. Indeed, the machine has to deal with several sources of uncertainty in order to exhibit a consistent behaviour to the user. First, it has to deal with the differences in behaviour between users, and also with the change in a user's behaviour as he gets used to the machine. Secondly, the communication between the user and the machine can be noisy, which makes the transfer of information more complicated. The objective is thus to deal with these different sources of uncertainty so as to show a consistent behaviour. Usually, uncertainty is dealt with by introducing models: models of the user, of the task concerned, or of the decision. However, the accuracy of the solution depends on the accuracy of the expert knowledge used to build the models. Although machine learning, through reinforcement learning, has made it possible to avoid an ad hoc model of the decision, expert knowledge is still necessary. The thesis defended in this work is that some constraints related to human expertise can be relaxed while limiting the loss of generality caused by the introduction of models. Apprentissage par renforcement (PO)MDP Système de dialogue parlé 006.31
90	Sur le rôle de l’être humain dans le dialogue humain/machine / On the role of the human being in human/machine dialogue Barlier, Merwan 14 December 2018 (has links) Cette thèse s'inscrit dans le cadre de l'apprentissage par renforcement pour les systèmes de dialogue. Ce document propose différentes manières de considérer l'être humain, interlocuteur du système de dialogue. Après un aperçu des limites du cadre agent/environnement traditionnel, nous proposons de modéliser dans un premier temps le dialogue comme un jeu stochastique. Dans ce cadre, l'être humain n'est plus vu comme une distribution de probabilité stationnaire mais comme un agent cherchant à optimiser ses préférences. Nous montrons que ce cadre permet une prise en compte de phénomènes de co-adaptation intrinsèques au dialogue humain/machine et nous montrons que ce cadre étend le champ d'application des systèmes de dialogue, par exemple aux dialogues de négociations. Dans un second temps, nous présentons une méthode permettant à l'être humain d'accélérer et de sécuriser la phase d'apprentissage de son système de dialogue par le biais de conseils encodés sous la forme d'une fonction de récompense. Nous montrons que cette prise en compte de conseils permet de significativement améliorer les performances d'un agent apprenant par renforcement. Finalement, une troisième situation est considérée. Ici, un système écoute une conversation entre humains et agit de manière à influer sur le cours de la conversation. Une fonction de récompense originale permettant de maximiser le résultat de la conversation tout en minimisant l'intrusivité du système est proposé. Nous montrons que notre approche permet de significativement améliorer les conversations. Pour implémenter cette approche, un modèle de la conversation est requis. C'est pourquoi nous proposons dans une quatrième contribution d'apprendre ce modèle à partir d'un algorithme d'apprentissage d'automates à multiplicité. 
/ This thesis is set in the context of Reinforcement Learning for Spoken Dialogue Systems. It proposes several ways to consider the role of the human interlocutor. After an overview of the limits of the traditional Agent/Environment framework, we first propose to model human/machine dialogue as a Stochastic Game. Within this framework, the human being is no longer seen as a stationary probability distribution but as a rational agent acting to optimize his preferences. We show that this framework makes it possible to take co-adaptation phenomena into account and extends the applications of human/machine dialogue, e.g. to negotiation dialogues. Second, we address the incorporation of human expertise in order to speed up and secure the learning phase of a reinforcement-learning-based spoken dialogue system. We provide an algorithm that takes advantage of human advice, encoded as a reward function, and shows a significant improvement over the performance of traditional reinforcement learning algorithms. Finally, we consider a third situation in which a system listens to a conversation between two human beings and talks when it estimates that its intervention could help to maximize the preferences of its user. We introduce an original reward function that balances the outcome of the conversation against the intrusiveness of the system. Our results, obtained by simulation, suggest that such an approach is suitable for computer-aided human-human dialogue. However, implementing this method requires a model of the human/human conversation. In a final contribution, we propose to learn this model with an algorithm based on multiplicity automata. Jeux stochastiques Automates à multiplicité 006.31
