Global ETD Search

1	A dynamic logistic model for combining classifier outputs
Tomas, Amber Nede
January 2008
Many classification algorithms are designed on the assumption that the population of interest is stationary, i.e. it does not change over time. However, there are many real-world problems where this assumption is not appropriate. In this thesis, we develop a classifier for non-stationary populations which is based on a multiple logistic model for the conditional class probabilities and incorporates a linear combination of the outputs of a number of pre-determined component classifiers. The final classifier is able to adjust to changes in the population by sequential updating of the coefficients of the linear combination, which are the parameters of the model. The model we use is motivated by the relatively good classification performance which has been achieved by classification rules based on combining classifier outputs. However, in some cases such classifiers can also perform relatively poorly, and in general the mechanisms behind such results are little understood. For the model we propose, which is a generalisation of several existing models for stationary classification problems, we show there exists a simple relationship between the component classifiers which are used, the sign of the parameters and the decision boundaries of the final classifier. This relationship can be used to guide the choice of component classifiers, and helps with understanding the conditions necessary for the classifier to perform well. We compare several "on-line" algorithms for implementing the classification model, where the classifier is updated as new labelled observations become available. The predictive approach to classification is adopted, so each algorithm is based on updating the posterior distribution of the parameters as new information is received. Specifically, we compare a method which assumes the posterior distribution is Gaussian, a more general method developed for the class of Dynamic Generalised Linear Models, and a method based on a sequential Monte Carlo approximation of the posterior. The relationship between the model used for parameter evolution, the bias of the parameter estimates and the error of the classifier is explored.
005.3
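As a rough illustration of the abstract above, the following Python sketch combines component classifier outputs through a logistic model and updates the combination coefficients as each labelled observation arrives. It is a simplified stand-in, not the thesis's method: it handles only two classes, and it replaces the posterior updating named in the abstract (Gaussian approximation, Dynamic Generalised Linear Models, sequential Monte Carlo) with a plain stochastic gradient step on the log-loss. The names `OnlineLogisticCombiner`, `learning_rate` and `run_drift_demo`, and the toy data stream, are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogisticCombiner:
    """Two-class logistic model over component classifier scores (illustrative)."""

    def __init__(self, n_components, learning_rate=0.1):
        self.beta = [0.0] * n_components  # coefficients of the linear combination
        self.learning_rate = learning_rate

    def predict_proba(self, scores):
        """Estimated probability of class 1 given the component scores."""
        z = sum(b * s for b, s in zip(self.beta, scores))
        return sigmoid(z)

    def update(self, scores, label):
        """One gradient step on the log-loss for a newly labelled observation.

        Assumption: this stands in for the posterior updates named in the abstract.
        """
        error = label - self.predict_proba(scores)
        self.beta = [b + self.learning_rate * error * s
                     for b, s in zip(self.beta, scores)]


def run_drift_demo(steps=200):
    """Toy stream: component 0 agrees with the label, then the population flips."""
    model = OnlineLogisticCombiner(n_components=2)
    snapshots = []
    for agreement in (+1.0, -1.0):  # +1: component 0 is right, -1: it is wrong
        for t in range(steps):
            label = t % 2
            signed = 1.0 if label == 1 else -1.0
            scores = [agreement * signed, 1.0]  # component 1 is uninformative
            model.update(scores, label)
        snapshots.append(list(model.beta))
    return snapshots


before_drift, after_drift = run_drift_demo()
print(before_drift[0] > 0, after_drift[0] < 0)
```

In this toy stream the coefficient of component 0 changes sign once the population flips, which loosely mirrors the link the abstract draws between the sign of the parameters and the decision boundaries of the final classifier.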
2	Dynamic machine learning for supervised and unsupervised classification / Apprentissage automatique dynamique pour la classification supervisée et non supervisée
Sîrbu, Adela-Maria
06 June 2016
The research direction of this thesis is the application of dynamic machine learning models to solve supervised and unsupervised classification problems. We live in a dynamic environment, where data changes continuously and obtaining fast and accurate solutions to our problems has become a real necessity. The particular problems that we have decided to approach in the thesis are pedestrian recognition (a supervised classification problem) and clustering of gene expression data (an unsupervised classification problem). These problems are representative of the two main types of classification and are very challenging, with great importance in real life. The first research direction that we approach in the field of dynamic unsupervised classification is the dynamic clustering of gene expression data. Gene expression is the process by which the information from a gene is converted into functional gene products: proteins or RNA having different roles in the life of a cell. Modern microarray technology is used to experimentally detect the expression levels of thousands of genes, across different conditions and over time. Once the gene expression data has been gathered, the next step is to analyze it and extract useful biological information. One of the most popular techniques for analyzing gene expression data is clustering, which involves partitioning a data set into groups whose members are similar to each other. In gene expression data sets, each gene is represented by its expression values (features) at distinct points in time, under the monitored conditions. Gene clustering is at the foundation of genomic studies that aim to analyze the functions of genes, because genes that are similar in their expression levels are assumed to be relatively similar in biological function as well. The problem that we address within the dynamic unsupervised classification research direction is the dynamic clustering of gene expression data. In our case, the term dynamic indicates that the data set is not static but subject to change. In contrast to the incremental approaches in the literature, where the data set is enriched with new genes (instances) during the clustering process, our approaches tackle the case where new features (expression levels for new points in time) are added to the genes already in the data set. To the best of our knowledge, no approaches in the literature deal with the dynamic clustering of gene expression data as defined above. In this context we introduced three dynamic clustering algorithms that handle newly collected gene expression levels by starting from a previously obtained partition, without the need to re-run the algorithm from scratch. Experimental evaluation shows that our method is faster and more accurate than applying the clustering algorithm from scratch on the feature-extended data set...
Supervised classification; Unsupervised classification; Gene expression; Dynamic classification; Pedestrian recognition; Gene clustering; Dynamic fusion
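The idea of starting from a previously obtained partition when new features arrive can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not describe the three algorithms of the thesis, so the sketch uses a k-means-style reassignment loop that is warm-started from the old partition. The function names and the tiny data set are hypothetical, and an empty cluster simply keeps a zero centroid here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids_from_partition(data, labels, k):
    """Mean vector of each cluster; an empty cluster keeps a zero centroid."""
    dims = len(data[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for row, lab in zip(data, labels):
        counts[lab] += 1
        for j, value in enumerate(row):
            sums[lab][j] += value
    return [[s / c for s in total] if c else total
            for total, c in zip(sums, counts)]


def recluster_from_partition(data, labels, k, max_iter=100):
    """Refine an existing partition on feature-extended data (warm start).

    Returns the new labels and the number of passes that changed them.
    """
    labels = list(labels)
    for passes in range(max_iter):
        centroids = centroids_from_partition(data, labels, k)
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in data
        ]
        if new_labels == labels:
            return labels, passes
        labels = new_labels
    return labels, max_iter


# Hypothetical example: four genes, two old time points, one new time point.
old_features = [[1.0, 1.1], [1.2, 0.9], [3.0, 3.2], [3.1, 2.9]]
previous_labels = [0, 0, 1, 1]          # partition found on the old features
new_time_point = [1.0, 9.0, 9.1, 8.9]   # newly collected expression levels
extended = [row + [v] for row, v in zip(old_features, new_time_point)]

new_labels, passes = recluster_from_partition(extended, previous_labels, k=2)
print(new_labels, passes)
```

With these made-up values, the second gene moves to the other cluster once the new time point is taken into account, and the loop settles after a single reassignment pass instead of a full run from a random initialisation.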
3	Extraction des paramètres et classification dynamique dans le cadre de la détection et du suivi de défaut de roulements / Extraction of new features and integration of dynamic classification to improve bearing fault monitoring
Kerroumi, Sanaa
21 October 2016
Various techniques can be used in condition-based maintenance of rotating machines. Among them, vibration analysis remains the most popular and most effective tool for monitoring the internal state of an operating machine. Through vibration analysis, the state of each component of the machine can be characterized by one or more fault indicators. Monitoring these indicators makes it possible to detect the presence of a defect and even to locate it. However, the evolution of these indicators can be influenced by parameters other than the defect, such as variation in load or speed, or the replacement of a component. Relying solely on the evolution of these fault indicators to diagnose a machine can therefore cause false alarms and call the reliability of the diagnosis into question. In this thesis, we combined vibration analysis tools with pattern recognition methods, first to improve the reliability of fault detection in components such as bearings, second to assess the severity of degradation by closely monitoring defect growth, and finally to estimate their remaining useful life. To this end, we designed a pattern recognition process capable of identifying defects even in machines running under non-stationary conditions, processing the evolving data of an evolving system, and learning online. This process has to decide the internal state of the machine using only fault indicators or linear combinations of fault indicators. Dynamic pattern recognition consists of extracting and selecting useful information, continuously classifying these observations into the right classes, and then deciding on an action according to each observation's class. Three dynamic classification methods were developed during this thesis: Dynamic DBSCAN (DDBSCAN), developed to capitalize on the time evolution of the data and their classes; Evolving Scalable DBSCAN (ESDBSCAN), created to overcome the shortcomings of DDBSCAN in online processing; and Dynamic Fuzzy Scalable DBSCAN (DFSDBSCAN), a dynamic, fuzzy and semi-supervised version of ESDBSCAN. These methods detect the evolution of the observations and identify the nature of the change causing it: either a change in the operating mode of the machine (variation in speed or load) or a change related to a defect. With these techniques we were able to enhance the reliability of fault detection by identifying the origin of the evolution of the fault indicators. A change in operating mode and a defect produce two different types of class evolution: a defect leads to the appearance of a new class, which we named 'defected', whereas a change in operating mode leads to a drift. These techniques also allowed earlier fault detection and helped estimate the remaining useful life of the monitored component by analyzing the distance separating the 'healthy' and 'defected' classes. Applying the designed process to real data confirmed the validity of the proposed techniques in identifying the different states of bearings over time (healthy or normal, defective) and the origin of the observations' evolution, with a low error rate, a reliable diagnosis and low memory occupation. To characterize how early the developed methods reach a diagnosis, we compared these results with those obtained by classical detection methods; the comparison showed that the proposed methods allow an earlier and more reliable diagnosis.
Keywords: Diagnosis and monitoring, bearings, pattern recognition, online learning, dynamic classification, vibration analysis, DFSDBSCAN, ESDBSCAN, DDBSCAN
Diagnosis and monitoring; Bearings; Pattern recognition; Vibratory analysis; Online learning; Dynamic classification
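The abstract mentions analyzing the distance separating the 'healthy' and 'defected' classes. A minimal Python sketch of one way to measure such a distance follows. It assumes the separation is taken as the Euclidean distance between class centroids, which the abstract does not specify, and it is not the DBSCAN-based procedure of the thesis. The function names and the two-dimensional fault-indicator values are made up for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]


def class_separation(healthy, defected):
    """Euclidean distance between the centroids of two classes (assumed measure)."""
    return math.dist(centroid(healthy), centroid(defected))


# Hypothetical fault-indicator vectors (two indicators per observation).
healthy = [[0.0, 0.0], [2.0, 0.0]]
defected_early = [[4.0, 4.0]]   # shortly after the new class appears
defected_late = [[7.0, 8.0]]    # after the defect has grown

early_gap = class_separation(healthy, defected_early)
late_gap = class_separation(healthy, defected_late)
print(early_gap, late_gap)
```

In this toy setting a growing gap between the two centroids would be read as increasing severity; how the thesis maps the distance to remaining useful life is not stated in the abstract.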
