Global ETD Search

31	Klasifikátor pro sémantické vzory užívání anglických sloves / Classifier for semantic patterns of English verbs Kríž, Vincent January 2012 (has links) The goal of the diploma thesis is to design, implement and evaluate classifiers for automatic classification of semantic patterns of English verbs according to a pattern lexicon that draws on the Corpus Pattern Analysis. We use a pilot collection of 30 sample English verbs as training and test data sets. We employ standard methods of machine learning. In our experiments we use decision trees, k-nearest neighbours (kNN), support vector machines (SVM) and AdaBoost algorithms. Among other things we concentrate on feature design and selection. We experiment with both morpho-syntactic and semantic features. Our results show that the morpho-syntactic features are the most important for statistically-driven semantic disambiguation. Nevertheless, for some verbs the use of semantic features plays an important role.
32	Efficient Kernel Methods For Large Scale Classification Asharaf, S 07 1900 (has links) Classification algorithms have been widely used in many application domains. Most of these domains deal with massive collections of data and hence demand classification algorithms that scale well with the size of the data sets involved. A classification algorithm is said to be scalable if there is no significant increase in its time and space requirements (without compromising the generalization performance) when the training set size increases. Support Vector Machine (SVM) is one of the most celebrated kernel based classification methods used in Machine Learning. An SVM capable of handling large scale classification problems would be an ideal candidate in many real world applications. The training process of an SVM classifier is usually formulated as a Quadratic Programming (QP) problem. The existing solution strategies for this problem have a time and space complexity that is (at least) quadratic in the number of training points. This makes SVM training very expensive even on classification problems having a few thousand training examples. This thesis addresses the scalability of the training algorithms involved in both two class and multiclass Support Vector Machines. Efficient training schemes reducing the space and time requirements of the SVM training process are proposed as possible solutions. The classification schemes discussed in the thesis for handling large scale two class classification problems are a) two selective sampling based training schemes for scaling non-linear SVM and b) clustering based approaches for handling unbalanced data sets with Core Vector Machine. To handle large scale multiclass classification problems, the thesis proposes Multiclass Core Vector Machine (MCVM), a scalable SVM based multiclass classifier. 
In MCVM, the multiclass SVM problem is shown to be equivalent to a Minimum Enclosing Ball (MEB) problem and is then solved using a fast approximate MEB finding algorithm. Experimental studies were done with several large real world data sets such as IJCNN1 and Acoustic data sets from LIBSVM page, Extended USPS data set from CVM page and network intrusion detection data sets of DARPA, US Defense used in KDD 99 contest. From the empirical results it is observed that the proposed classification schemes achieve good generalization performance at low time and space requirements. Further, the scalability experiments done with large training data sets have demonstrated that the proposed schemes scale well. A novel soft clustering scheme called Rough Support Vector Clustering (RSVC) employing the idea of Soft Minimum Enclosing Ball Problem (SMEB) is another contribution discussed in this thesis. Experiments done with a synthetic data set and the real world IRIS data set have shown that RSVC finds meaningful soft cluster abstractions. Machine Learning Automatic Classification Kernel Method Classification Algorithms Support Vector Machine (SVM) Core Vector Machine (CVM) Rough Support Vector Clustering (RSVC) Multiclass Core Vector Machine (MCVM) Computer Science
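Illustrative note on record 32: the abstract does not say which approximate MEB algorithm the thesis uses. As a rough sketch of what "fast approximate MEB finding" can look like, the Python below uses a simple farthest-point iteration (repeatedly move the centre a shrinking step towards the farthest point). This is a swapped-in textbook technique in plain Euclidean space, not the kernel-space core-set procedure of the Core Vector Machine; the function name `approx_meb` and its parameters are hypothetical.

```python
import math


def approx_meb(points, iterations=1000):
    """Approximate the minimum enclosing ball of a list of equal-length tuples.

    Farthest-point iteration: start at an arbitrary point and, at step i,
    move the centre a fraction 1/(i+1) of the way towards the point that is
    currently farthest from it. More iterations give a tighter ball.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        # The farthest point is the one that currently violates the ball most.
        farthest = max(points, key=lambda p: math.dist(p, center))
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # The radius is whatever is needed to cover every point from this centre.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


if __name__ == "__main__":
    # Toy data (assumption): the exact ball has centre (1, 0) and radius 1.
    pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
    print(approx_meb(pts))
```

The returned ball always encloses every input point, because the radius is computed from the final centre; only its tightness depends on the number of iterations.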
33	Statistical modeling of the human sleep process via physiological recordings Fairley, Jacqueline Antoinette 09 January 2009 (has links) The main objective of this work was the development of a computer-based Expert Sleep Analysis Methodology (ESAM) to aid sleep care physicians in the diagnosis of pre-Parkinson's disease symptoms using polysomnogram data. ESAM is significant because it streamlines the analysis of the human sleep cycles and aids the physician in the identification, treatment, and prediction of sleep disorders. In this work four aspects of computer-based human sleep analysis were investigated: polysomnogram interpretation, pre-processing, sleep event classification, and abnormal sleep detection. A review of previous developments in these four areas is provided along with their relationship to the establishment of ESAM. Polysomnogram interpretation focuses on the ambiguities found in human polysomnogram analysis when using the rule based 1968 sleep staging manual edited by Rechtschaffen and Kales (R&K). ESAM is presented as an alternative to the R&K approach in human polysomnogram interpretation. The second area, pre-processing, addresses artifact processing techniques for human polysomnograms. Sleep event classification, the third area, discusses feature selection, classification, and human sleep modeling approaches. Lastly, abnormal sleep detection focuses on polysomnogram characteristics common to patients suffering from Parkinson's disease. The technical approach in this work utilized polysomnograms of control subjects and pre-Parkinsonian disease patients obtained from the Emory Clinic Sleep Disorders Center (ECSDC) as inputs into ESAM. 
The engineering tools employed during the development of ESAM included the Generalized Singular Value Decomposition (GSVD) algorithm, sequential forward and backward feature selection algorithms, Particle Swarm Optimization algorithm, k-Nearest Neighbor classification, and Gaussian Observation Hidden Markov Modeling (GOHMM). In this study polysomnogram data was preprocessed for artifact removal and compensation using band-pass filtering and the GSVD algorithm. Optimal features for characterization of polysomnogram data of control subjects and pre-Parkinsonian disease patients were obtained using the sequential forward and backward feature selection algorithms, Particle Swarm Optimization, and k-Nearest Neighbor classification. ESAM output included GOHMMs constructed for both control subjects and pre-Parkinsonian disease patients. Furthermore, performance evaluation techniques were implemented to make conclusions regarding the constructed GOHMM's reflection of the underlying nature of the human sleep cycle. Evolutionary computer algorithms Biosignal processing Quantitative-based human sleep analysis Human sleep pathology Sleep Polysomnography Parkinson's disease Computer simulation Automatic classification
34	Automatic genre classification of home pages on the web / Kennedy, Alistair. January 2004 (has links) (PDF) Thesis (B.C.S.)--Dalhousie University, Halifax. / "Submitted in partial fulfillment of the requirements for the degree of bachelor of computer science with honours at Dalhousie University, Halifax, Nova Scotia, April 2004." Includes bibliographical references (p. 33-35). Also available in PDF via the World Wide Web.
35	[en] THE CREATION OF A SEMI-AUTOMATIC CLASSIFICATION MODEL USING GEOGRAPHIC KNOWLEDGE: A CASE STUDY IN THE NORTHERN PORTION OF THE TIJUCA MASSIF - RJ / [pt] A CRIAÇÃO DE UM MODELO DE CLASSIFICAÇÃO SEMI-AUTOMÁTICA UTILIZANDO CONHECIMENTO GEOGRÁFICO: UM ESTUDO DE CASO NA PORÇÃO SETENTRIONAL DO MACIÇO DA TIJUCA - RJ RAFAEL DA SILVA NUNES 30 August 2018 (has links) [pt] Os processos de transformação da paisagem são resultantes da interação de elementos (bióticos e abióticos) que compõe a superfície da Terra. Baseia-se, a partir de uma perspectiva holística, no inter-relacionamento de uma série de ações e objetos que confluem para que a paisagem seja percebida como um momento sintético da confluência de inúmeras temporalidades. Desta maneira, as geotecnologias passam a se constituir como um importante aparato técnico-científico para a interpretação desta realidade ao possibilitar novas e diferentes formas do ser humano interpretar a paisagem. Um dos produtos gerados a partir desta interpretação é a classificação de uso e cobertura do solo e que se configura como um instrumento central para a análise das dinâmicas territoriais. Desta maneira, o objetivo do presente trabalho é elaboração de um modelo de classificação semi-automática baseada em conhecimento geográfico para o levantamento do padrão de uso e cobertura da paisagem a partir da utilização de imagens de satélite de alta resolução, tendo como recorte analítico uma área na porção setentrional no Maciço da Tijuca. O modelo baseado na análise de imagens baseadas em objetos, quando confrontados com a classificação visual, culminou em um valor acima de 80 por cento de correspondência tanto para imagens de 2010 e 2009, apresentando valores bastante elevados também na comparação classe a classe. 
A elaboração do presente modelo contribuiu diretamente para a otimização da produção dos dados elaborados contribuindo sobremaneira para a aceleração da interpretação das imagens analisadas, assim como para a minimização de erros ocasionados pela subjetividade atrelada ao próprio classificador. / [en] Landscape transformation processes result from the interaction of the elements (biotic and abiotic) that make up the Earth's surface. From a holistic perspective, this interaction is based on the inter-relationship of a series of actions and objects that converge so that the landscape is perceived as a synthetic moment of the confluence of numerous temporalities. Thus, geotechnologies come to constitute an important technical and scientific apparatus for interpreting this reality, by enabling new and different ways for humans to interpret the landscape. One of the products generated from this interpretation is the classification of land use and land cover, which is a central instrument for the analysis of territorial dynamics. The aim of this work is therefore to develop a semi-automatic classification model based on geographic knowledge to survey the pattern of land use and land cover from high-resolution satellite images, taking as its study area a section of the northern portion of the Tijuca Massif. The model, built on Object-Based Image Analysis, reached a match above 80 percent with the visual classification for both the 2010 and 2009 images, with very high values in the class-by-class comparison as well. The development of this model contributed directly to optimizing data production, greatly speeding up the interpretation of the analysed images and minimizing errors caused by the subjectivity of the classifier itself. 
[pt] IMAGEM DE SATELITE [en] SATELLITE IMAGE [pt] PAISAGEM [en] LANDSCAPE [pt] USO E COBERTURA DO SOLO [en] LAND USE AND LAND COVER [pt] GEOTECNOLOGIAS [en] GEOTECHNOLOGIES [pt] CLASSIFICACAO SEMI-AUTOMATICA [en] SEMI-AUTOMATIC CLASSIFICATION
36	Automatic classification of natural signals for environmental monitoring / Classification automatique de signaux naturels pour la surveillance environnementale Malfante, Marielle 03 October 2018 (has links) Ce manuscrit de thèse résume trois ans de travaux sur l’utilisation des méthodes d’apprentissage statistique pour l’analyse automatique de signaux naturels. L’objectif principal est de présenter des outils efficaces et opérationnels pour l’analyse de signaux environnementaux, en vue de mieux connaitre et comprendre l’environnement considéré. On se concentre en particulier sur les tâches de détection et de classification automatique d’événements naturels. Dans cette thèse, deux outils basés sur l’apprentissage supervisé (Support Vector Machine et Random Forest) sont présentés pour (i) la classification automatique d’événements, et (ii) pour la détection et classification automatique d’événements. La robustesse des approches proposées résulte de l’espace des descripteurs dans lequel sont représentés les signaux. Les enregistrements y sont en effet décrits dans plusieurs espaces: temporel, fréquentiel et quéfrentiel. Une comparaison avec des descripteurs issus de réseaux de neurones convolutionnels (Deep Learning) est également proposée, et favorise les descripteurs issus de la physique au détriment des approches basées sur l’apprentissage profond. Les outils proposés au cours de cette thèse sont testés et validés sur des enregistrements in situ de deux environnements différents : (i) milieux marins et (ii) zones volcaniques. La première application s’intéresse aux signaux acoustiques pour la surveillance des zones sous-marines côtières : les enregistrements continus sont automatiquement analysés pour détecter et classifier les différents sons de poissons. Une périodicité quotidienne est mise en évidence. 
La seconde application vise la surveillance volcanique : l’architecture proposée classifie automatiquement les événements sismiques en plusieurs catégories, associées à diverses activités du volcan. L’étude est menée sur 6 ans de données volcano-sismiques enregistrées sur le volcan Ubinas (Pérou). L’analyse automatique a en particulier permis d’identifier des erreurs de classification faites dans l’analyse manuelle originale. L’architecture pour la classification automatique d’événements volcano-sismiques a également été déployée et testée en observatoire en Indonésie pour la surveillance du volcan Mérapi. Les outils développés au cours de cette thèse sont rassemblés dans le module Architecture d’Analyse Automatique (AAA), disponible en libre accès. / This manuscript summarizes three years of work on the use of machine learning for the automatic analysis of natural signals. The main goal of this PhD is to produce efficient and operational frameworks for the analysis of environmental signals, in order to gather knowledge and better understand the considered environment. In particular, we focus on the automatic detection and classification of natural events. This thesis proposes two tools based on supervised machine learning (Support Vector Machine, Random Forest) for (i) the automatic classification of events and (ii) the automatic detection and classification of events. The success of the proposed approaches lies in the feature space used to represent the signals, which relies on a detailed description of the raw acquisitions in various domains: temporal, spectral and cepstral. A comparison with features extracted using convolutional neural networks (deep learning) is also made, and favours the physical features over deep learning methods for representing transient signals. The proposed tools are tested and validated on real world acquisitions from different environments: (i) underwater and (ii) volcanic areas. 
The first application considered in this thesis is devoted to the monitoring of coastal underwater areas using acoustic signals: continuous recordings are analysed to automatically detect and classify fish sounds. A daily periodicity in fish behaviour is revealed. The second application targets volcano monitoring: the proposed system classifies seismic events into categories, which can be associated with different phases of the internal activity of volcanoes. The study is conducted on six years of volcano-seismic data recorded on Ubinas volcano (Peru). In particular, the outcomes of the proposed automatic classification system helped to uncover misclassifications in the manual annotation of the recordings. In addition, the proposed automatic classification framework for volcano-seismic signals has been deployed and tested in Indonesia for the monitoring of Mount Merapi. The software implementation of the framework developed in this thesis has been collected in the Automatic Analysis Architecture (AAA) package and is freely available. Classification automatique Acoustique sous-Marine Volcano-Sismique Réduction de dimension Apprentissage statistique Apprentissage profond Automatic classification Underwater acoustic Volcano-Seismic Dimensionality reduction Machine learning Deep learning 004
37	Tipologia de traços linguísticos de textos do português do Brasil dos séculos XVI, XVII, XVIII e XIX: uma proposta para a classificação automática de gêneros textuais / Typology of linguistic features of Brazilian Portuguese texts from the 16th, 17th, 18th and 19th centuries: a proposal for the automatic classification of textual genres Souza, Jacqueline Aparecida de 26 February 2010 (has links) Previous issue date: 2010-02-26 / Universidade Federal de Minas Gerais / Based on the methodological postulates of Corpus Linguistics and on the genre concepts proposed by Swales (1990) and Biber (1995), this research aims to describe linguistic features that are characteristic of historical texts and correlate them with their respective genres, and to propose a typology of features that makes it possible to identify the genre of each text automatically. The research used the corpus of 16th-, 17th- and 18th-century Portuguese from the project Dicionário Histórico do Português do Brasil (Historical Dictionary of Brazilian Portuguese; program Institutos do Milênio/CNPq UNESP/Araraquara), which comprises 2,459 texts and 7.5 million words. To produce a historical description, the study started from synchronic characteristics obtained from the table of contemporary features compiled by Aires (2005). The corpus was processed with Philologic and Unitex, and an additional tool was developed to extract and quantify the features. For classification, algorithms available in Weka (Waikato Environment for Knowledge Analysis) were used, such as Naive Bayes, Bayes Net, SMO, Multilayer Perceptron, RBFNetwork, J48 and NBTree. The description was based on 62 features, which include statistics computed over the text as a whole and over words, including classes of verbs, pronouns and adverbs, as well as discourse markers, expressions and lexical units. 
It was concluded that the genres share specific linguistic characteristics but also show patterns of their own, such as the use of particular expressions and the frequency of lexical units. Despite the limitations and complications of using a historical corpus, the performance of the classifiers based on the identified features was satisfactory, with correct classification rates of 84% and 92%. / Com base nos postulados metodológicos da Linguística de Corpus e nos conceitos de gênero, propostos por Swales (1990) e Biber (1995), esta pesquisa pretende descrever traços linguísticos característicos de textos históricos, correlacionando-os a seus respectivos gêneros, e propor uma tipologia de traços de forma que seja possível identificar o gênero de cada texto automaticamente. Para execução da pesquisa foi utilizado o corpus do português dos séculos XVI, XVII e XVIII do projeto Dicionário Histórico do Português do Brasil (programa Institutos do Milênio/CNPq UNESP/Araraquara), constituído por 2.459 textos e 7.5 milhões de palavras. Para realizar uma descrição histórica, partiu-se de características sincrônicas obtidas a partir da tabela de traços contemporâneos elaborada por Aires (2005). No que tange à manipulação do corpus, utilizou-se o Philologic, o Unitex e desenvolveu-se uma ferramenta para extração e quantificação dos traços. Para fins de classificação, foram utilizados os algoritmos disponibilizados no Weka (Waikato Environment for Knowledge Analysis), tais como: Naive Bayes, Bayes Net, SMO, Multilayer Perceptron e RBFNetwork, J48, NBTree. A descrição foi realizada com base em 62 traços, os quais abarcam estatísticas baseadas no texto como um todo e em palavras, incluindo as classes de verbos, pronomes, advérbios, como também marcadores discursivos, expressões e unidades lexicais. 
Concluiu-se que os gêneros compartilham características linguísticas específicas, porém, também apresentam seus padrões próprios, como o uso de determinadas expressões e a frequência de unidades lexicais. Apesar das limitações e complicações em utilizar um corpus histórico, o desempenho dos classificadores com base nos traços levantados foi satisfatório, com a taxa de acerto 84% e 92% de classificação correta. Linguística Linguística de corpus Aprendizado de computador Corpus histórico Traços lingüísticos Gêneros textuais Classificação automática Corpus linguistics Features Textual genre Automatic classification LINGUISTICA, LETRAS E ARTES::LINGUISTICA
38	Approches non supervisées pour la recommandation de lectures et la mise en relation automatique de contenus au sein d'une bibliothèque numérique / Unsupervised approaches to recommending reads and automatically linking content within a digital library Benkoussas, Chahinez 14 December 2016 (has links) Cette thèse s’inscrit dans le domaine de la recherche d’information (RI) et la recommandation de lecture. Elle a pour objets : — La création de nouvelles approches de recherche de documents utilisant des techniques de combinaison de résultats, d’agrégation de données sociales et de reformulation de requêtes ; — La création d’une approche de recommandation utilisant des méthodes de RI et les graphes entre les documents. Deux collections de documents ont été utilisées. Une collection qui provient de l’évaluation CLEF (tâche Social Book Search - SBS) et la deuxième issue du domaine des sciences humaines et sociales (OpenEdition, principalement Revues.org). La modélisation des documents de chaque collection repose sur deux types de relations : — Dans la première collection (CLEF SBS), les documents sont reliés avec des similarités calculées par Amazon qui se basent sur plusieurs facteurs (achats des utilisateurs, commentaires, votes, produits achetés ensemble, etc.) ; — Dans la deuxième collection (OpenEdition), les documents sont reliés avec des relations de citations (à partir des références bibliographiques). Le manuscrit est structuré en deux parties. La première partie «état de l’art» regroupe une introduction générale, un état de l’art sur la RI et sur les systèmes de recommandation. La deuxième partie «contributions» regroupe un chapitre sur la détection de comptes rendus de lecture au sein de la collection OpenEdition (Revues.org), un chapitre sur les méthodes de RI utilisées sur des requêtes complexes et un dernier chapitre qui traite l’approche de recommandation proposée qui se base sur les graphes. 
/ This thesis falls within the field of information retrieval (IR) and reading recommendation. Its aims are: (i) the creation of new approaches to document retrieval and recommendation using techniques for combining results, aggregating social data and reformulating queries; (ii) the creation of a recommendation approach using IR methods and graph theory. Two collections of documents were used. The first is provided by CLEF (Social Book Search - SBS) and the second comes from the Humanities and Social Sciences platform OpenEdition.org (Revues.org). The modelling of the documents in each collection is based on two types of relations: (i) for the first collection (SBS), documents are connected by similarities calculated by Amazon, which are based on several factors (user purchases, comments, votes, products bought together, etc.); (ii) for the second collection (OpenEdition), documents are connected by citation relations extracted from bibliographical references. We show that the proposed approaches improve retrieval and recommendation performance in most cases. The manuscript is structured in two parts. The first part, "state of the art", includes a general introduction and a state of the art on information retrieval and recommender systems. The second part, "contributions", includes a chapter on the detection of book reviews in Revues.org, a chapter on the IR methods used on complex queries written in natural language, and a final chapter on the proposed graph-based recommendation approach. Recherche d’information Recommandation Modèles de recherche d’information Graphes Bibliothèque numérique Réseau de citations Classification automatique. Information retrieval Recommendation Information retrieval models Graphs Digital library Citation network Automatic classification.
39	Représentations parcimonieuses et apprentissage de dictionnaires pour la compression et la classification d'images satellites / Sparse representations and dictionary learning for the compression and the classification of satellite images Aghaei Mazaheri, Jérémy 20 July 2015 (has links) Cette thèse propose d'explorer des méthodes de représentations parcimonieuses et d'apprentissage de dictionnaires pour compresser et classifier des images satellites. Les représentations parcimonieuses consistent à approximer un signal par une combinaison linéaire de quelques colonnes, dites atomes, d'un dictionnaire, et ainsi à le représenter par seulement quelques coefficients non nuls contenus dans un vecteur parcimonieux. Afin d'améliorer la qualité des représentations et d'en augmenter la parcimonie, il est intéressant d'apprendre le dictionnaire. La première partie de la thèse présente un état de l'art consacré aux représentations parcimonieuses et aux méthodes d'apprentissage de dictionnaires. Diverses applications de ces méthodes y sont détaillées. Des standards de compression d'images sont également présentés. La deuxième partie traite de l'apprentissage de dictionnaires structurés sur plusieurs niveaux, d'une structure en arbre à une structure adaptative, et de leur application au cas de la compression d'images satellites en les intégrant dans un schéma de codage adapté. Enfin, la troisième partie est consacrée à l'utilisation des dictionnaires structurés appris pour la classification d'images satellites. Une méthode pour estimer la Fonction de Transfert de Modulation (FTM) de l'instrument dont provient une image est étudiée. Puis un algorithme de classification supervisée, utilisant des dictionnaires structurés rendus discriminants entre les classes à l'apprentissage, est présenté dans le cadre de la reconnaissance de scènes au sein d'une image. / This thesis explores sparse representation and dictionary learning methods to compress and classify satellite images. 
Sparse representations consist in approximating a signal by a linear combination of a few columns, known as atoms, from a dictionary, and thus representing it by only a few non-zero coefficients contained in a sparse vector. In order to improve the quality of the representations and to increase their sparsity, it is interesting to learn the dictionary. The first part of the thesis presents a state of the art about sparse representations and dictionary learning methods. Several applications of these methods are explored. Some image compression standards are also presented. The second part deals with the learning of dictionaries structured in several levels, from a tree structure to an adaptive structure, and their application to the compression of satellite images, by integrating them in an adapted coding scheme. Finally, the third part is about the use of learned structured dictionaries for the classification of satellite images. A method to estimate the Modulation Transfer Function (MTF) of the instrument used to capture an image is studied. A supervised classification algorithm, using structured dictionaries made discriminant between classes during the learning, is then presented in the scope of scene recognition in a picture. Traitement d'images Compression d'images Classification automatique Imagerie satellitaire Représentation parcimonieuse Apprentissage automatique Image processing Image compression Automatic classification Satellite images Sparse representation Machine learning
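Illustrative note on record 39: the abstract describes a sparse representation as approximating a signal by a linear combination of a few dictionary atoms. The abstract does not name the pursuit algorithm used in the thesis, so the sketch below uses plain matching pursuit as a stand-in, with a fixed (not learned) dictionary of unit-norm atoms. The function name `matching_pursuit` and the toy dictionary are hypothetical.

```python
def matching_pursuit(signal, atoms, n_iterations):
    """Greedy sparse approximation of `signal` over unit-norm `atoms`.

    Each iteration picks the atom most correlated with the current residual,
    adds that correlation to the atom's coefficient and subtracts the atom's
    contribution from the residual. Returns (coefficients, residual), where
    coefficients maps atom index -> weight (all other weights are zero).
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iterations):
        # Inner product of the residual with every atom.
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(products[j]))
        coeffs[best] = coeffs.get(best, 0.0) + products[best]
        residual = [r - products[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


if __name__ == "__main__":
    # Toy dictionary (assumption): the standard basis of R^3.
    dictionary = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print(matching_pursuit([3.0, 0.0, -2.0], dictionary, n_iterations=2))
```

Because an atom may be selected more than once, the number of non-zero coefficients is at most `n_iterations`. Dictionary learning, the subject of the thesis, would additionally update the atoms themselves from training data, which this sketch does not attempt.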
40	Data mining and volcanic eruption forcasting / Fouille de données et prédiction des éruptions volcaniques Boué, Anaïs 30 April 2015 (has links) Eruption forecasting methods are valuable tools for supporting decision making during volcanic crises, provided they are integrated into a global monitoring strategy and their potential and limitations are known. Many attempts at deterministic forecasting of volcanic eruptions and landslides have used the material Failure Forecast Method (FFM). This method consists of fitting an empirical power law to precursory patterns of seismicity or deformation. Until now, most studies have presented hindsight forecasts based on complete time series of precursors, and have not evaluated the method's potential for real-time forecasting with partial precursory sequences. Moreover, the limited number of published examples and the absence of any systematic application of the FFM make it difficult to draw conclusions about the method's ability to forecast volcanic eruptions. It is therefore important to gain experience by carrying out systematic forecasting attempts in various eruptive contexts. In this thesis, I present a rigorous approach to the FFM designed for real-time application to volcano-seismic precursors. I use a Bayesian approach based on FFM theory and on an automatic classification of seismic events that have different source mechanisms. The probability distributions of the data, deduced from the performance of the classifier, are used as input. 
As output, the method provides the probability distribution of the forecast time at each observation time before the eruption. The spread of the posterior probability density function of the forecast time and its stability with respect to the observation time are used as criteria to evaluate the reliability of the forecast. I show that the method developed here outperforms the classical application of the FFM for both hindsight and real-time attempts because it accurately accounts for the uncertainty in the data. The automatic classification of volcano-seismic signals allows a systematic application of this forecasting method to decades of seismic data from andesitic volcanoes including Volcan de Colima (Mexico) and Merapi volcano (Indonesia), and from the basaltic volcano Piton de la Fournaise (Reunion Island, France). The number of eruptions that are not preceded by precursors is quantified, as well as the number of seismic crises that are not followed by eruptions. I then apply the forecasting method developed in this thesis to 64 precursory sequences. I thus determine the conditions under which the FFM can be successfully applied and quantify the success rate of the method in real time and in hindsight. Only 62% of the precursory sequences analysed in this thesis were suitable for the application of the FFM, and half of all eruptions are successfully forecast in hindsight. In real time, the method successfully predicts only 36% of all eruptions considered. Nevertheless, real-time predictions are successful in 83% of the cases that fulfil the reliability criteria. Therefore, we can have good confidence in the method when the reliability criteria are met, but the deterministic real-time forecasting tool developed in this thesis is not sufficient in itself. However, it could be informative when combined with other forecasting methods and supervised by an observer. 
These results reflect the current lack of knowledge about pre-eruptive mechanisms. Eruption prediction; Seismo-volcanic signals; Automatic classification; Voice recognition; Failure Forecast Method; 550
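
To make the power-law fit described in entry 40 concrete, here is a minimal Python sketch of the classical hindsight form of the FFM. It is not the Bayesian real-time method of the thesis. It assumes the commonly used power-law exponent of 2, for which the inverse of the precursor rate decreases linearly with time and reaches zero at the failure time. That exponent, the function names and the synthetic data are illustrative assumptions that do not come from the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_failure_time(times, rates):
    """Extrapolate the time at which the inverse precursor rate reaches zero.

    Assumption: power-law exponent of 2, so 1/rate is linear in time.
    """
    inverse_rates = [1.0 / r for r in rates]
    slope, intercept = fit_line(times, inverse_rates)
    if slope >= 0:
        # No acceleration in the precursors: nothing to extrapolate.
        raise ValueError("inverse rate is not decreasing")
    return -intercept / slope


# Synthetic, noise-free precursor sequence (hypothetical values).
TRUE_FAILURE_TIME = 10.0
SCALE = 0.05
times = [float(t) for t in range(9)]  # observations at t = 0, 1, ..., 8
rates = [1.0 / (SCALE * (TRUE_FAILURE_TIME - t)) for t in times]

hindsight_estimate = forecast_failure_time(times, rates)
# Using only the first five observations mimics a forecast made before the
# sequence is complete.
partial_estimate = forecast_failure_time(times[:5], rates[:5])
print(hindsight_estimate, partial_estimate)
```

With noise-free data the full and the partial sequence give the same estimate. On real seismic counts the two generally differ, and that gap is what the thesis's reliability criteria (spread and stability of the forecast) are meant to assess.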

Search results