Global ETD Search

31	Approche algébrique et théorie des valeurs extrêmes pour la détection de ruptures : Application aux signaux biomédicaux / Algebraic approach and extreme value theory for change-point detection : Application to the biomedical signals
Debbabi, Nehla, 14 December 2015
This work develops unsupervised techniques for the online detection and localization of change-points in signals recorded in noisy environments. These techniques combine an algebraic approach with Extreme Value Theory (EVT). The algebraic approach makes change-points easy to identify by characterizing them in terms of delayed Dirac distributions and their derivatives, which are easily handled via operational calculus. This algebraic characterization, which yields an explicit expression for the change-point locations, is complemented by a probabilistic interpretation in terms of extremes: a change-point is a rare event with a relatively large associated amplitude. Within EVT, these events are modeled by a Generalized Pareto Distribution. Several hybrid multi-component models are proposed to describe both the mean behavior (noise) and the extreme behavior (change-points) of the signal after algebraic processing. Fully unsupervised algorithms are developed to fit these hybrid models, in contrast to the classical estimation methods, which are graphical, ad hoc, and manual. The change-point detection algorithms developed in this thesis are validated on generated data and then applied to real data from different phenomena in which change-points carry the information to be extracted.
Change-point detection; Extreme value theory; Biomedical signals; Estimation; Modelling; Programming
32	Statistical Inference for Change Points in High-Dimensional Offline and Online Data
Li, Lingjun, 07 April 2020
No description available.
Mathematics; Statistics; Change point analysis; Change-point detection; Spatial-temporal data; Large p small n; High-dimensional data; Average run length; Expected detection delay
33	Improving Change Point Detection Using Self-Supervised VAEs : A Study on Distance Metrics and Hyperparameters in Time Series Analysis
Workinn, Daniel, January 2023
This thesis addresses the optimization of the Variational Autoencoder-based Change Point Detection (VAE-CP) approach in time series analysis, a vital component of data-driven decision making. We evaluate the impact of various distance metrics and hyperparameters on the model’s performance through systematic exploration and robustness testing on diverse real-world datasets. The findings show that the Dynamic Time Warping (DTW) distance metric significantly enhances the quality of the extracted latent variable space and improves change point detection. The research underscores the potential of the VAE-CP approach for more effective and robust handling of complex time series data, advancing the capabilities of change point detection techniques.
Change point detection; Time series data; Segmentation; Machine learning; Data mining; Computer and Information Sciences
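As an illustration of the DTW distance named in the abstract above, here is a minimal sketch of classic dynamic time warping between two numeric sequences. It is not the thesis's VAE-CP pipeline: the function name `dtw_distance`, the absolute-difference local cost, and the sample sequences are assumptions made for this example.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative sketch only: the absolute-difference local cost and the
    unconstrained warping window are assumptions, not the thesis's settings.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical sequences: the second repeats one value, which DTW absorbs.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted_level = dtw_distance([0, 0, 0], [1, 1, 1])
```

Unlike a pointwise Euclidean distance, DTW tolerates local stretching in time, which is why the two sequences of different length above can still be compared.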
34	An empirical study of the impact of data dimensionality on the performance of change point detection algorithms / En empirisk studie av data dimensionalitetens påverkan på change point detection algoritmers prestanda
Noharet, Léo, January 2023
When a system is monitored over time, changes can be discovered in the time series of the monitored variables. Change Point Detection (CPD) aims to find the time point at which a change occurs in the monitored system. Although CPD methods date back to the 1950s, with applications in quality control, few studies have examined the impact of data dimensionality on CPD algorithms. This thesis addresses that gap by examining five different algorithms on synthetic data that incorporates changes in mean, covariance, and frequency across dimensionalities up to 100. The algorithms are also evaluated on a collection of data sets from various domains. The studied methods are then assessed and ranked by their performance on both synthetic and real data sets, to help future users select an appropriate CPD method. Finally, stock data from the 30 most traded companies on the Swedish stock market are collected to create a new CPD data set, to which the CPD algorithms are applied. The changes that the algorithms aim to detect in this data set are the changes in the policy rate set by the Swedish central bank, Riksbank. The results show that dimensionality affects the accuracy of the methods when noise is present and when the degree of mean or covariance change is small. The application of the algorithms to real-world data sets also reveals large differences in performance between the studied methods, underlining the importance of comparison studies. Overall, the kernel-based CPD method performed best across the real-world data sets used in the thesis.
Time series segmentation; Change point detection; Multivariate time series; Data dimensionality; Computer and Information Sciences
35	Likelihood Ratio Combination of Multiple Biomarkers and Change Point Detection in Functional Time Series
Du, Zhiyuan, 24 September 2024
Utilizing multiple biomarkers in medical research is crucial for the diagnostic accuracy of detecting diseases. An optimal method for combining these biomarkers is essential to maximize the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). The optimality of the likelihood ratio has been proven, but challenges persist in estimating it, primarily in the estimation of multivariate density functions. In this study, we propose a non-parametric approach for estimating multivariate density functions that uses smoothing spline density estimation to approximate the full likelihood function for both the diseased and non-diseased groups, which together compose the likelihood ratio. Simulation results demonstrate the efficiency of our method compared to other biomarker combination techniques under various settings for generated biomarker values. Additionally, we apply the proposed method to a real-world study aimed at detecting childhood autism spectrum disorder (ASD), showcasing its practical relevance and potential for future applications in medical research.
Change point detection for functional time series has attracted considerable attention from researchers. Existing methods either rely on FPCA, which may perform poorly with complex data, or use bootstrap approaches in forms that fall short in detecting diverse change functions. In our study, we propose a novel self-normalized (SN) test for functional time series, implemented via a non-overlapping block bootstrap to circumvent reliance on FPCA. The SN factor ensures both monotonic power and adaptability for detecting diverse change functions on complex data. We also demonstrate our test's robustness in detecting changes in the autocovariance operator. Simulation studies confirm the superior performance of our test across various settings, and real-world applications further illustrate its practical utility.
Doctor of Philosophy
In medical research, it is crucial to accurately detect diseases and predict patient outcomes using multiple health indicators, also known as biomarkers. Combining these biomarkers effectively can significantly improve our ability to diagnose and treat various health conditions. However, finding the best way to combine these biomarkers has been a long-standing challenge. In this study, we propose a new, easy-to-understand method for combining multiple biomarkers using advanced estimation techniques. Our method takes into account various factors and provides a more accurate way to evaluate the combined information from different biomarkers. Through simulations, we demonstrated that our method performs better than other existing methods under a variety of scenarios. Furthermore, we applied our new method to a real-world study focusing on detecting childhood autism spectrum disorder (ASD), highlighting its practical value and potential for future applications in medical research.
Detecting changes in patterns over time, especially shifts in averages, has become an important focus in data analysis. Existing methods often rely on techniques that may not perform well with more complex data or are limited in the types of changes they can detect. In this study, we introduce a new approach that improves the accuracy of detecting changes in complex data patterns. Our method is flexible and can identify changes in both the mean and variation of the data over time. Through simulations, we demonstrate that this approach is more accurate than current methods. Furthermore, we applied our method to real-world climate research data, illustrating its practical value.
Multiple biomarkers; ROC curve; Likelihood ratio; Smoothing spline density estimation; Change point detection; Functional time series; Self-normalization; Non-overlapping block bootstrap
36	Intrångsdetektering på CAN bus data : En studie för likvärdig jämförelse av metoder
Hedman, Pontus; Skepetzis, Vasilios, January 2020
Hacker attacks are known to have been carried out on modern vehicles. These attacks illustrate the need for early threat detection in this environment. Developing security systems for it is of special interest because vehicles are increasingly interconnected and can now be classified as IoT devices. Known attacks, some of them carried out remotely, allow a perpetrator to stop a vehicle in operation or to disable its brakes. This study examines the detection of attacks carried out on a real vehicle by studying CAN bus messages. Two methods, CUSUM from the field of change point detection and Random Forests from the field of machine learning, are applied to real data and then compared against each other on simulated data. A new hypothesis definition is introduced that allows the evaluation method conditional expected delay to be used for Random Forests, so that its results can be compared with the evaluation results for CUSUM. Conditional expected delay has not previously been studied for a machine learning method. Both methods are also evaluated using ROC curves. Taken together, this allows the two methods to be compared using each other's established evaluation methods. The study presents a method and a hypothesis that bridge the two fields of change point detection and machine learning, so that the two can be evaluated under jointly motivated parameter values.
CUSUM; Random Forests; CAN bus; Anomaly detection; Change point detection; Machine learning; Outlier detection; Intrusion detection system; Computer Systems; Communication Systems; Other electrical engineering and electronics; Other engineering and technology
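For readers unfamiliar with CUSUM, the change point detection method named in the abstract above, the following is a minimal one-sided sketch. It is not the thesis's implementation: the function name `cusum_alarm`, the parameters `target_mean`, `drift`, and `threshold`, and the sample readings are all hypothetical values chosen for illustration.

```python
def cusum_alarm(samples, target_mean, drift, threshold):
    """One-sided CUSUM: return the index of the first alarm, or None.

    All names and parameter values are illustrative assumptions,
    not taken from the thesis.
    """
    s = 0.0
    for i, x in enumerate(samples):
        # Accumulate deviations above the target (minus an allowance);
        # clamp at zero so in-control stretches do not build negative credit.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None


# Hypothetical readings: stable around 0, then an upward shift from index 5.
readings = [0.0, 0.1, -0.1, 0.0, 0.1, 1.0, 1.1, 0.9, 1.0, 1.2]
alarm_index = cusum_alarm(readings, target_mean=0.0, drift=0.25, threshold=2.0)
```

In this sketch the shift begins at index 5 and the alarm is raised a few samples later; that gap between the true change and the alarm is the kind of quantity a detection delay measure summarizes.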
37	Detection of the Change Point and Optimal Stopping Time by Using Control Charts on Energy Derivatives
AL, Cihan; Koroglu, Kubra, January 2011
No description available.
Change point detection; Optimal stopping time; Shiryaev; Shewhart; EWMA; CUSUM; Statistical process control (SPC); Financial Mathematics; Applied mathematics; Mathematical statistics
38	Détection de ruptures multiples dans des séries temporelles multivariées : application à l'inférence de réseaux de dépendance / Multiple change-point detection in multivariate time series : application to the inference of dependency networks
Harlé, Flore, 21 June 2016
This thesis presents a method for the offline detection of multiple change-points in multivariate time series, and uses the results to estimate the dependency relationships between the variables of the system. The originality of the model, called the Bernoulli Detector, lies in combining local statistics from a robust test, based on the ranks of the observations, with a global Bayesian framework. This nonparametric model does not require strong assumptions about the distribution of the data. It applies without modification to Gaussian data as well as to data corrupted by outliers. Control of the detection of a change-point is proven, even for small samples. To handle multivariate time series, a term is introduced to model the dependencies between changes, on the assumption that if two components of the system are connected, events affecting one are observed instantaneously on the other with high probability. The model thus adapts to the data, and the segmentation accounts both for changes shared by several signals and for isolated changes occurring in a single signal. The method is compared with other state-of-the-art solutions, notably on real data sets of household electricity consumption and genomic measurements. These experiments highlight the value of the model for detecting change-points in independent, conditionally independent, or fully connected signals. Finally, the synchronization of change-points across the time series is exploited to estimate the relationships between the variables, using the formalism of Bayesian networks. By adapting the score function of a structure learning method, it is verified that the independence model describing the system can be partly recovered from the information carried by the change-points estimated by the Bernoulli Detector.
Change-point detection; Bayesian inference; Rank statistics; Multivariate time series; Bayesian networks; Markov equivalence classes; 620
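To make the idea of a rank-based change-point statistic in the abstract above concrete, the sketch below scans a univariate series with a standardized Wilcoxon rank-sum statistic and returns the best single split. This is a swapped-in, simpler technique, not the Bernoulli Detector: it has no Bayesian layer, handles one change only, and assumes no tied values. The names `rank_scan` and `min_seg` and the sample data are hypothetical.

```python
import math


def rank_scan(series, min_seg=2):
    """Scan for one change-point using a standardized rank-sum statistic.

    Returns (t, z): the split such that series[:t] and series[t:] differ
    most in rank, and the absolute standardized statistic at that split.
    Illustrative sketch only; assumes no ties in the data.
    """
    n = len(series)
    # Rank the observations from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: series[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r

    best_t, best_z = None, 0.0
    prefix = 0  # running sum of ranks of the first t observations
    for t in range(1, n):
        prefix += ranks[t - 1]
        if t < min_seg or n - t < min_seg:
            continue
        # Mean and variance of the rank sum when there is no change.
        mean = t * (n + 1) / 2
        var = t * (n - t) * (n + 1) / 12
        z = abs(prefix - mean) / math.sqrt(var)
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z


# Hypothetical series with a level shift after the fifth observation.
data = [0.1, 0.3, 0.2, 0.4, 0.0, 5.1, 5.3, 5.0, 5.2, 5.4]
split, score = rank_scan(data)
```

Because only ranks enter the statistic, replacing the value 5.4 by an extreme outlier such as 500 leaves the result unchanged, which is the robustness property that rank-based tests offer.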
39	Détection et classification de signatures temporelles CAN pour l’aide à la maintenance de sous-systèmes d’un véhicule de transport collectif / Detection and classification of temporal CAN signatures to support maintenance of public transportation vehicle subsystems
Cheifetz, Nicolas, 09 September 2013
This thesis is mainly concerned with the fault detection step of an industrial diagnosis process. The work is motivated by the monitoring of two complex subsystems of a transit bus that affect vehicle availability and maintenance costs: the brake system and the door system. The thesis describes several tools for monitoring the operation of these two systems. We adopt a pattern recognition approach based on the analysis of data collected in service through a new telematics architecture on board the buses. The proposed methods detect a structural change in a sequentially processed data stream and take advantage of prior knowledge of the monitored systems. The detector applied to the brakes relies on the braking-related output variables of a dynamic physical model of the vehicle, which is validated experimentally in this work. Detection is then performed by multivariate control charts on multidimensional data. The detection strategy for the door system works directly on data collected by embedded sensors during opening and closing cycles, with no prior physical model. We propose a sequential hypothesis test driven by a generative model that represents the functional data. This regression model segments multidimensional curves into several regimes. The model parameters are estimated by an EM-type algorithm in a semi-supervised mode. Results obtained on simulated and real data demonstrate the effectiveness of the proposed methods for both the brakes and the doors.
Diagnosis and preventive maintenance; Change-point detection; Sequential hypothesis testing; Curves sequence; EM algorithm and mixture models; Brake system and doors of urban buses
40	Contributions à la modélisation de données spatiales et fonctionnelles : applications / Contributions to modeling spatial and functional data : applications
Ternynck, Camille, 28 November 2014
This dissertation deals with the nonparametric modeling of spatial and/or functional data, with a particular focus on the kernel method. In general, the samples considered when establishing the asymptotic properties of the proposed estimators consist of dependent variables. The specificity of the methods studied is that the estimators take the dependence structure of the data into account.
The first part studies spatially dependent real-valued variables. We propose a new kernel approach for estimating the spatial probability density and regression functions, as well as the mode. The distinctive feature of this approach is that it accounts both for the proximity between observations and for the proximity between sites. We study the asymptotic behavior of the proposed estimators and their application to simulated and real data.
The second part deals with modeling data that take values in an infinite-dimensional space, known as "functional data". First, we adapt the nonparametric regression model introduced in the first part to the setting of spatially dependent functional data, and give asymptotic as well as numerical results. Second, we study a time series regression model in which the explanatory variables are functional and the innovation process is autoregressive. We propose a procedure that takes into account the information contained in the error process. After studying the asymptotic behavior of the proposed kernel estimator, we analyze its performance on simulated and then real data.
The third part is devoted to applications. First, we present unsupervised classification results for simulated and real (multivariate) spatial data. The classification method is based on estimating the spatial mode, obtained from the spatial density estimator introduced in the first part of the thesis. We then apply this mode-based classification method, along with other unsupervised classification methods from the literature, to hydrological data of a functional nature. Finally, this classification of the hydrological data led us to apply change point detection tools to these functional data.
Nonparametric estimation; Kernel estimate; Probability density; Mode; Regression; Spatial statistics; Functional data; Time series; Unsupervised classification; Change point detection