  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Wavelet methods and statistical applications: network security and bioinformatics

Kwon, Deukwoo 01 November 2005 (has links)
Wavelet methods possess versatile properties for statistical applications. We explore the advantages of using wavelets in analyses in two different research areas. First, we develop an integrated tool for online detection of network anomalies. We consider statistical change point detection algorithms, both for local changes in the variance and for jump detection, and propose modified versions of these algorithms based on moving-window techniques. We investigate performance on simulated data and on network traffic data with several superimposed attacks. All detection methods are based on wavelet packet transformations. We also propose a Bayesian model for the analysis of high-throughput data where the outcome of interest has a natural ordering. The method provides a unified approach for identifying relevant markers and predicting class memberships. This is accomplished by building a stochastic search variable selection method into an ordinal model. We apply the methodology to the analysis of proteomic studies in prostate cancer. We explore wavelet-based techniques to remove noise from the protein mass spectra. The goal is to identify protein markers associated with prostate-specific antigen (PSA) level, an ordinal diagnostic measure currently used to stratify patients into different risk groups.
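As an illustration of the moving-window change detection idea described above, here is a minimal sketch in Python (NumPy only). It works directly in the time domain on a simulated signal with a variance jump; the dissertation's actual detectors operate on wavelet packet coefficients, and the window size and threshold here are arbitrary illustrative choices, not values from the thesis.

```python
import numpy as np

def detect_variance_change(x, win=64, ratio_threshold=4.0):
    """Scan with two adjacent moving windows and return the first index
    where the variance ratio between them exceeds the threshold."""
    for t in range(win, len(x) - win):
        v_past = np.var(x[t - win:t])   # variance of the trailing window
        v_next = np.var(x[t:t + win])   # variance of the leading window
        if max(v_past, v_next) / max(min(v_past, v_next), 1e-12) > ratio_threshold:
            return t
    return None

# Simulated traffic-like signal: unit variance, then a variance jump at t = 500
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(0.0, 3.0, 500)])
cp = detect_variance_change(x)
```

In the thesis's setting the same scan would be run on each wavelet packet subband rather than on the raw samples.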
32

Process monitoring and feedback control using multiresolution analysis and machine learning

Ganesan, Rajesh 01 June 2005 (has links)
Online process monitoring and feedback control are two widely researched aspects that can impact the performance of a myriad of process applications. Semiconductor manufacturing is one such application that, due to the ever-increasing demands placed on its quality and speed, holds tremendous potential for further research and development in the areas of monitoring and control. One of the key areas of semiconductor manufacturing that has received significant attention among researchers and practitioners in recent years is the online sensor-based monitoring and feedback control of its nanoscale wafer fabrication process. Monitoring and feedback control strategies for nanomanufacturing processes often require a combination of monitoring using nonstationary and multiscale signals and robust feedback control using complex process models. It is also essential for these strategies to possess stringent properties such as high speed of execution, low cost of operation, ease of implementation, high accuracy, and capability for online implementation. These requirements create a need for state-of-the-art sensor data processing algorithms that perform far better than those currently available, both in the literature and commercially in the form of software.

The contributions of this dissertation are threefold. It first focuses on the development of an efficient online scheme for process monitoring. The scheme combines the strengths of wavelet-based multiresolution analysis and the sequential probability ratio test to develop a very sensitive strategy for detecting changes in nonstationary signals. Secondly, the dissertation presents a novel online feedback control scheme. The control problem is cast in the framework of probabilistic dynamic decision making, and the control scheme is built on the mathematical foundations of wavelet-based multiresolution analysis, dynamic programming, and machine learning. Analysis of the convergence of the control scheme is also presented. Finally, the monitoring and control schemes are tested on a nanoscale manufacturing process (chemical mechanical planarization, CMP) used in silicon wafer fabrication. The results obtained from experimental data clearly indicate that the developed approaches outperform existing ones. The novelty of this research stems from the fact that it furthers the science of sensor-based process monitoring and control by uniting sophisticated concepts from signal processing, statistics, stochastic processes, and artificial intelligence, while remaining applicable to many real-world process applications.
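The monitoring scheme above pairs multiresolution analysis with the sequential probability ratio test (SPRT). Below is a minimal sketch of a repeated SPRT for a mean shift in a Gaussian sequence, restarting after each acceptance of the null; the error rates, noise model, and restart rule are illustrative assumptions, not the dissertation's design, which additionally applies the test across wavelet scales.

```python
import numpy as np

def sprt_mean_shift(x, mu0=0.0, mu1=1.0, sigma=1.0, alpha=1e-4, beta=1e-4):
    """Repeated SPRT: accumulate the log-likelihood ratio of 'mean = mu1'
    against 'mean = mu0'; alarm when it crosses the upper bound, restart
    from zero when it drops below the lower bound (null accepted)."""
    upper = np.log((1 - beta) / alpha)   # Wald's H1-acceptance bound
    lower = np.log(beta / (1 - alpha))   # Wald's H0-acceptance bound
    llr = 0.0
    for t, xt in enumerate(x):
        # log-likelihood ratio increment for one Gaussian observation
        llr += (mu1 - mu0) * (xt - (mu0 + mu1) / 2.0) / sigma ** 2
        if llr >= upper:
            return t        # H1 accepted: mean shift detected at index t
        if llr <= lower:
            llr = 0.0       # H0 accepted: restart so later shifts are caught
    return None

rng = np.random.default_rng(1)
signal = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(1.0, 1.0, 150)])
alarm = sprt_mean_shift(signal)
```

The alarm fires shortly after the simulated shift at index 300; the detection delay depends on the chosen error rates and the shift size.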
33

System Surveillance

Mansoor, Shaheer January 2013 (has links)
In recent years, trading activity in stock markets has increased substantially. This is mainly attributed to the development of powerful computers and intranets connecting traders to markets across the globe. Trades have to be carried out almost instantaneously, and the systems that handle them are burdened with millions of transactions a day, several thousand a minute. With increasing transactions, the time to execute a single trade increases, which degrades performance. There is a need to model the performance of these systems and provide forecasts that give advance warning of when a system is expected to be overwhelmed by transactions. This was done in this study, in cooperation with Cinnober Financial Technologies, a firm which provides trading solutions to stock markets. To ensure that the models developed weren't biased, the dataset was cleansed, i.e. operational and other transactions were removed, and only valid trade transactions remained. For this purpose, a descriptive analysis of the time series along with change point detection and LOESS regression were used. A state space model with Kalman filtering was then used to develop a time-varying coefficient model for the performance, and this model was applied to make forecasts. Wavelets were also used to produce forecasts, and in addition high-pass filters were used to identify low-performance regions. The state space model captured the overall trend in performance very well and produced reliable forecasts, which can be ascribed to the Kalman filter's ability to handle noisy data. Wavelets, on the other hand, did not produce reliable forecasts but were more effective at detecting regions of low performance.
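A hedged sketch of the state space idea: a local-level model filtered with the standard Kalman recursions. This is far simpler than the time-varying coefficient model of the study (the noise variances `q` and `r` below are made-up values), but it shows the predict/update cycle that makes the approach robust to noisy data.

```python
import numpy as np

def local_level_filter(y, q=1.0, r=1.0):
    """Kalman filter for a local-level model: level_t = level_{t-1} + w_t,
    y_t = level_t + v_t, with process noise q and observation noise r.
    Returns the filtered level, which serves as the one-step forecast."""
    level, p = y[0], 1.0              # initial state estimate and its variance
    filtered = [level]
    for obs in y[1:]:
        p = p + q                     # predict: variance grows by process noise
        k = p / (p + r)               # Kalman gain
        level = level + k * (obs - level)   # update with the new observation
        p = (1 - k) * p
        filtered.append(level)
    return np.array(filtered)

rng = np.random.default_rng(2)
trend = np.linspace(0.0, 10.0, 200)          # slowly rising "load" trend
y = trend + rng.normal(0.0, 0.5, 200)        # noisy performance measurements
f = local_level_filter(y, q=0.1, r=0.25)
```

The filtered series tracks the underlying trend while smoothing out the measurement noise, which is the property exploited for forecasting in the study.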
34

Change point detection in linear models and the bootstrap / Detekce změn v lineárních modelech a bootstrap

Čellár, Matúš January 2016 (has links)
This thesis discusses changes in the parameters of linear models and methods for their detection. It begins with a short introduction to the two basic types of change point detection procedures and to bootstrap algorithms developed specifically to deal with dependent data. In the following chapter we focus on the location model, the simplest example of a linear model with a change in parameters. On this model we illustrate long-run variance estimation and the implementation of selected bootstrap procedures. In the last chapter we show how to extend the applied methods to linear models with a change in parameters. We compare the performance of change point tests based on asymptotic and bootstrap critical values through simulation studies for both considered methods. The performance of the selected long-run variance estimator is also examined, both in situations where a change in parameters occurs and where it does not.
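The two ingredients discussed here can be sketched in a few lines: the CUSUM statistic for a change in mean of the location model, with a bootstrap critical value. For simplicity the resampling below is the naive i.i.d. bootstrap; the thesis is specifically concerned with bootstrap variants adapted to dependent data, which this sketch does not implement.

```python
import numpy as np

def cusum_statistic(x):
    """Normalized max of the absolute CUSUM process, the standard test
    statistic for a change in mean of the location model."""
    n = len(x)
    s = np.cumsum(x - x.mean())
    return np.max(np.abs(s)) / (np.std(x) * np.sqrt(n))

def bootstrap_critical_value(x, level=0.95, n_boot=500, seed=0):
    """Critical value from resampling with replacement, which mimics the
    null distribution for independent observations."""
    rng = np.random.default_rng(seed)
    stats = [cusum_statistic(rng.choice(x, size=len(x), replace=True))
             for _ in range(n_boot)]
    return np.quantile(stats, level)

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0.0, 1.0, 150), rng.normal(1.5, 1.0, 150)])
stat = cusum_statistic(x)
crit = bootstrap_critical_value(x)
cp = int(np.argmax(np.abs(np.cumsum(x - x.mean()))))  # estimated change location
```

With a genuine mean shift, the statistic comfortably exceeds the bootstrap critical value and the argmax of the CUSUM process estimates the change location.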
35

Change detection in metal oxide gas sensor signals for open sampling systems

Pashami, Sepideh January 2015 (has links)
This thesis addresses the problem of detecting changes in the activity of a distant gas source from the response of an array of metal oxide (MOX) gas sensors deployed in an Open Sampling System (OSS). Changes can occur due to gas source activity such as a sudden alteration in concentration or due to exposure to a different compound. Applications such as gas-leak detection in mines or large-scale pollution monitoring can benefit from reliable change detection algorithms, especially where it is impractical to continuously store or transfer sensor readings, or where reliable calibration is difficult to achieve. Here, it is desirable to detect a change point indicating a significant event, e.g. presence of gas or a sudden change in concentration. The main challenges are turbulent dispersion of gas and the slow response and recovery times of MOX sensors. Due to these challenges, the gas sensor response exhibits fluctuations that interfere with the changes of interest. The contributions of this thesis are centred on developing change detection methods using MOX sensor responses. First, we apply the Generalized Likelihood Ratio algorithm (GLR), a commonly used method that does not make any a priori assumption about change events. Next, we propose TREFEX, a novel change point detection algorithm, which models the response of MOX sensors as a piecewise exponential signal and considers the junctions between consecutive exponentials as change points. We also propose the rTREFEX algorithm as an extension of TREFEX. The core idea behind rTREFEX is an attempt to improve the fitted exponentials of TREFEX by minimizing the number of exponentials even further. GLR, TREFEX and rTREFEX are evaluated for various MOX sensors and gas emission profiles. A sensor selection algorithm is then introduced and the change detection algorithms are evaluated with the selected sensor subsets. 
A comparison between the three proposed algorithms shows clearly superior performance of rTREFEX both in detection performance and in estimating the change time. Further, rTREFEX is evaluated in real-world experiments where data is gathered by a mobile robot. Finally, a gas dispersion simulation was developed which integrates OpenFOAM flow simulation and a filament-based gas propagation model to simulate gas dispersion for compressible flows with a realistic turbulence model.
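A sketch of the GLR idea for the simplest case, a single change in the mean of a univariate signal: scan all candidate split points and score each by the reduction in residual sum of squares from modelling the two sides separately. The MOX-specific models of the thesis (piecewise exponentials in TREFEX) are not reproduced here.

```python
import numpy as np

def glr_change_point(x):
    """Generalized likelihood ratio scan for one change in mean: for each
    candidate split, the score is the gain in fit from using separate
    means before and after the split instead of one global mean."""
    n = len(x)
    total_ss = np.sum((x - x.mean()) ** 2)
    best_t, best_gain = None, -np.inf
    for t in range(2, n - 2):
        left, right = x[:t], x[t:]
        split_ss = (np.sum((left - left.mean()) ** 2)
                    + np.sum((right - right.mean()) ** 2))
        gain = total_ss - split_ss       # reduction in residual sum of squares
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0.0, 0.3, 200), rng.normal(1.0, 0.3, 200)])
t_hat, gain = glr_change_point(x)
```

In practice the gain is compared against a threshold calibrated for the desired false alarm rate; here we only locate the best split.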
36

Exploration of Non-Linear and Non-Stationary Approaches to Statistical Seasonal Forecasting in the Sahel

Gado Djibo, Abdouramane January 2016 (has links)
Water resources management in the Sahel region of West Africa is extremely difficult because of high inter-annual rainfall variability as well as a general reduction of water availability in the region. Observed changes in streamflow directly disturb key socioeconomic activities such as agriculture, which constitutes one of the main survival pillars of the West African population. Seasonal rainfall forecasting is considered one possible way to increase resilience to climate variability by providing information in advance about the amount of rainfall expected in each upcoming rainy season. Moreover, the availability of reliable information about streamflow magnitude a few months before a rainy season will immensely benefit water users who want to plan their activities. Since the 1990s, several studies have attempted to evaluate the predictability of Sahelian weather characteristics and to develop seasonal rainfall and streamflow forecast models to help stakeholders take better decisions. Unfortunately, two decades later, forecasting is still difficult, and forecasts have limited value for decision-making. It is believed that this low performance is due to the limits of the predictors and forecast approaches commonly used for this region. In this study, new seasonal forecasting approaches are developed and new predictors tested in an attempt to predict the seasonal rainfall over the Sirba watershed, located between Niger and Burkina Faso in West Africa. Using combined statistical methods, a pool of 84 predictors with physical links to the West African monsoon and its dynamics was selected, along with their optimal lag times. The pool was first reduced through screening based on linear correlation with satellite rainfall over West Africa. Correlation analysis and principal component analysis were then used to keep the most predictive principal components.
Linear regression was used to generate synthetic forecasts, and the model was assessed to rank the tested predictors. The three best predictors, air temperature (Pacific Tropical North), sea level pressure (Atlantic Tropical South) and relative humidity (Mediterranean East), were retained and tested as inputs for seasonal rainfall forecasting models. This thesis departs from the stationarity and linearity assumptions used in most seasonal forecasting methods:

1. Two probabilistic non-stationary methods based on change point detection were developed and tested, each using one of the three best predictors. Model M1 allows for changes in model parameters according to annual rainfall magnitude, while M2 allows for changes in model parameters over time. M1 and M2 were compared to the classical linear model with constant parameters (M3) and to the linear model with climatology (M4). The model allowing changes in the predictand-predictor relationship according to rainfall amplitude (M1), using air temperature as a predictor, was the best model for seasonal rainfall forecasting in the study area.

2. Non-linear models, including regression trees, feed-forward neural networks and non-linear principal component analysis, were implemented and tested to forecast seasonal rainfall using the same predictors. Forecast performance was compared using coefficients of determination, Nash-Sutcliffe coefficients and hit rate scores. Non-linear principal component analysis was the best non-linear model (R2: 0.46; Nash: 0.45; HIT: 60.7), while the feed-forward neural network and regression tree models performed poorly.

All the developed rainfall forecasting methods were subsequently used to forecast seasonal annual mean streamflow and maximum monthly streamflow by feeding the forecasted rainfall into a SWAT model of the Sirba watershed. The results are summarized as follows:

1. Non-stationary models: Models M1 and M2 were compared to models M3 and M4, and the results revealed that model M3 using relative humidity as a predictor at a lag time of 8 months was the best method for forecasting seasonal annual mean streamflow, whereas model M1 using air temperature as a predictor at a lag time of 4 months was the best model for predicting maximum monthly streamflow in the Sirba watershed. Moreover, the calibrated SWAT model achieved a Nash value of 0.83.

2. Non-linear models: The seasonal rainfall obtained from the non-linear principal component analysis model was disaggregated into daily rainfall using the method of fragments, and then fed into the SWAT hydrological model to produce streamflow. This forecast was fairly acceptable, with a Nash value of 0.58.

The level of risk associated with each seasonal forecast was evaluated using a simple risk measure: the probability of overtopping of the flood protection dykes in Niamey, Niger. A HEC-RAS hydrodynamic model of the Niger River around Niamey was developed for the 1980-2014 period, and a copula analysis was used to model the dependence structure of streamflows and predict the distribution of streamflow in Niamey given the predicted streamflow on the Sirba watershed. Finally, the probabilities of overtopping of the flood protection dykes were estimated for each year in the 1980-2014 period. The findings of this study can be used as a guideline to improve the performance of seasonal forecasting in the Sahel. This research clearly confirmed the possibility of rainfall and streamflow forecasting in the Sirba watershed at a seasonal time scale using potential predictors other than sea surface temperature.
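The predictor-reduction and synthetic-forecast step described above (principal components of standardized predictors fed into linear regression) can be sketched as follows, using made-up data with one shared driver in place of the thesis's climate predictors.

```python
import numpy as np

def pca_regression_forecast(X_train, y_train, X_new, n_comp=3):
    """Reduce standardized predictors to their leading principal
    components, then fit ordinary least squares on the component scores."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    Z = (X_train - mu) / sd
    _, _, vt = np.linalg.svd(Z, full_matrices=False)   # principal directions
    W = vt[:n_comp].T                                  # projection matrix
    S = Z @ W                                          # component scores
    A = np.column_stack([np.ones(len(S)), S])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    S_new = ((X_new - mu) / sd) @ W
    return np.column_stack([np.ones(len(S_new)), S_new]) @ coef

# Synthetic example: ten noisy predictors sharing one latent driver
rng = np.random.default_rng(5)
factor = rng.normal(size=80)                      # shared driver
loadings = rng.normal(size=10)
X = np.outer(factor, loadings) + 0.3 * rng.normal(size=(80, 10))
y = 1.5 * factor + 0.2 * rng.normal(size=80)      # predictand
pred = pca_regression_forecast(X[:60], y[:60], X[60:])
mse = float(np.mean((pred - y[60:]) ** 2))
```

Because the predictors share a common driver, the leading component captures the signal and the out-of-sample error is far below the variance of the predictand.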
37

Change-point detection and kernel methods / Détection de ruptures et méthodes à noyaux

Garreau, Damien 12 October 2017 (has links)
In this thesis, we focus on a method for detecting abrupt changes in a sequence of independent observations belonging to an arbitrary set on which a positive semidefinite kernel is defined. That method, kernel change-point detection, is a kernelized version of a penalized least-squares procedure. Our main contribution is to show that, for any kernel satisfying some reasonably mild hypotheses, this procedure outputs a segmentation close to the true segmentation with high probability. This result is obtained under a boundedness assumption on the kernel, for a linear penalty and for another penalty function coming from model selection. The proofs rely on a concentration result for bounded random variables in Hilbert spaces, and we prove a less powerful result under relaxed hypotheses (a finite variance assumption). In the asymptotic setting, we show that we recover the minimax rate for the change-point locations without additional hypotheses on the segment sizes. We provide empirical evidence supporting these claims. Another contribution of this thesis is a detailed presentation of the different notions of distance between segmentations. Additionally, we prove that these different notions coincide for sufficiently close segmentations. From a practical point of view, we demonstrate how the so-called dimension jump heuristic can be a reasonable choice of penalty constant when using kernel change-point detection with a linear penalty. We also show how a key quantity depending on the kernel that appears in our theoretical results influences the performance of kernel change-point detection in the case of a single change point. When the kernel is translation invariant and parametric assumptions are made, it is possible to compute this quantity in closed form. Thanks to these computations, some of them novel, we are able to study precisely the behavior of the maximal penalty constant. Finally, we study the median heuristic, a popular tool for setting the bandwidth of radial basis function kernels. For a large sample size, we show that it behaves approximately as the median of a distribution that we describe completely in the settings of the kernel two-sample test and kernel change-point detection. More precisely, we show that the median heuristic is asymptotically normal around this value.
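The median heuristic mentioned at the end can be stated in a few lines: set the RBF kernel bandwidth to the median of pairwise distances in the sample. A simple sketch:

```python
import numpy as np

def median_heuristic_bandwidth(x):
    """Median heuristic: bandwidth = median pairwise distance between
    observations (distinct pairs only)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    iu = np.triu_indices(len(x), k=1)     # indices of distinct pairs
    return float(np.median(dists[iu]))

rng = np.random.default_rng(6)
sample = rng.normal(0.0, 1.0, 400)
h = median_heuristic_bandwidth(sample)
```

For a standard normal sample, the difference of two draws is N(0, 2), so the heuristic concentrates around sqrt(2) times the median of |N(0, 1)|, roughly 0.95, which is the kind of limiting value the thesis characterizes.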
38

Détection d'anomalies et de ruptures dans les séries temporelles. Applications à la gestion de production de l'électricité / Detection of outliers and change points in time series. Applications to the management of electricity production

Allab, Nedjmeddine 21 November 2016 (has links)
Continental is the reference tool used by EDF for long-term electricity management. It is used to elaborate the operating strategy of the generation fleet, made up of power plants distributed all over Europe. The tool simulates, for each zone and each scenario, several variables such as electricity demand, the quantity generated and the associated costs. The objective of this thesis is to provide methods for analysing these production data in order to ease their exploration and synthesis. We collected a set of problems from the users of Continental, which we address using techniques for detecting outliers and change points in time series.
39

Détection de ruptures multiples – application aux signaux physiologiques. / Multiple change point detection – application to physiological signals.

Truong, Charles 29 November 2018 (has links)
This work addresses the problem of detecting multiple change points in (univariate or multivariate) physiological signals. Well-known examples of such signals include the electrocardiogram (ECG), the electroencephalogram (EEG), and inertial measurements (accelerations, angular velocities, etc.). The objective of this thesis is to provide change point detection algorithms that (i) can handle long signals, (ii) can be applied in a wide range of real-world scenarios, and (iii) can incorporate the knowledge of medical experts. In particular, a greater emphasis is placed on fully automatic procedures that can be used in daily clinical practice. To that end, robust detection methods as well as supervised calibration strategies are described, and a documented open-source Python package is released.

The first contribution of this thesis is a sub-optimal change point detection algorithm that can accommodate time complexity constraints while retaining most of the robustness of optimal procedures. This algorithm is sequential and alternates between two steps: a change point is estimated, then its contribution to the signal is projected out. In the context of mean shifts, asymptotic consistency of the estimated change points is obtained. We prove that this greedy strategy can easily be extended to other types of changes by using reproducing kernel Hilbert spaces. Thanks to this approach, physiological signals can be handled without strong assumptions about the generative model of the data. Experiments on real-world signals show that these approaches are more accurate than standard sub-optimal algorithms and faster than optimal algorithms.

The second contribution of this thesis consists of two supervised algorithms for automatic calibration. Both rely on labeled examples, which in our context are segmented signals. The first approach learns the smoothing parameter for the penalized detection of an unknown number of changes. The second procedure learns a non-parametric transformation of the representation space that improves detection performance. Both supervised procedures yield finely tuned detection algorithms that are able to replicate the segmentation strategy of an expert. Results show that the supervised algorithms outperform unsupervised ones, especially in the case of physiological signals, where the notion of change heavily depends on the physiological phenomenon of interest.

All algorithmic contributions of this thesis can be found in "ruptures", an open-source Python library available online. Thoroughly documented, "ruptures" also comes with a consistent interface for all methods.
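The penalized detection of an unknown number of changes mentioned here can be sketched with the classical optimal partitioning recursion for changes in mean (O(n^2) dynamic programming, no pruning). This is an illustrative re-implementation, not code from the "ruptures" library itself, and the penalty value below is an arbitrary choice.

```python
import numpy as np

def penalized_segmentation(x, penalty):
    """Optimal partitioning for changes in mean: dynamic programming over
    the best previous change point, paying a fixed penalty per segment."""
    n = len(x)
    c1 = np.concatenate([[0.0], np.cumsum(x)])        # prefix sums
    c2 = np.concatenate([[0.0], np.cumsum(x ** 2)])   # prefix sums of squares

    def seg_cost(a, b):
        """Residual sum of squares of x[a:b] around its mean, in O(1)."""
        s = c1[b] - c1[a]
        return c2[b] - c2[a] - s * s / (b - a)

    best = np.full(n + 1, np.inf)
    best[0] = -penalty            # cancels the penalty of the first segment
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        costs = [best[s] + seg_cost(s, t) + penalty for s in range(t)]
        last[t] = int(np.argmin(costs))
        best[t] = costs[last[t]]
    cps, t = [], n                # backtrack the optimal change points
    while t > 0:
        t = last[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)

rng = np.random.default_rng(7)
signal = np.concatenate([rng.normal(0.0, 0.5, 100),
                         rng.normal(2.0, 0.5, 100),
                         rng.normal(0.5, 0.5, 100)])
cps = penalized_segmentation(signal, penalty=5.0)
```

The penalty plays the role of the smoothing parameter that the thesis's first supervised approach learns from annotated examples: larger penalties yield fewer detected change points.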
40

Development of a Client-Side Evil Twin Attack Detection System for Public Wi-Fi Hotspots based on Design Science Approach

Horne, Liliana R. 01 January 2018 (has links)
Users and providers benefit considerably from public Wi-Fi hotspots. Users receive wireless Internet access and providers draw new prospective customers. While users can conveniently enjoy public Wi-Fi hotspot networks, they are more susceptible to a particular type of fraud and identity theft, referred to as an evil twin attack (ETA). By setting up an ETA, an attacker can intercept sensitive data such as passwords or credit card information by snooping on the communication links. Since the objective of free open (unencrypted) public Wi-Fi hotspots is to provide ease of accessibility and to entice customers, no security mechanisms are in place. The public's lack of awareness of the security threat posed by free open public Wi-Fi hotspots makes this problem even more serious. Client-side systems that help wireless users detect and protect themselves from evil twin attacks in public Wi-Fi hotspots are greatly needed. In this dissertation report, the author explored the need for client-side detection systems that allow wireless users to help protect their data from evil twin attacks while using free open public Wi-Fi. The client-side evil twin attack detection system constructed as part of this dissertation bridged the gap between the need for wireless security in free open public Wi-Fi hotspots and the limitations of existing client-side evil twin attack detection solutions. Based on design science research (DSR) literature, Hevner's seven guidelines of DSR, Peffers' design science research methodology (DSRM), Gregor's IS design theory, and Hossen & Wenyuan's (2014) study evaluation methodology, the author developed design principles, procedures and specifications to guide the construction, implementation, and evaluation of a prototype client-side evil twin attack detection artifact. The system was evaluated in a hotel public Wi-Fi environment. The goal of this research was to develop a more effective, efficient, and practical client-side detection system with which wireless users can independently detect and protect themselves from mobile evil twin attacks while using free open public Wi-Fi hotspots. The experimental results showed that the client-side system can effectively detect and protect users from mobile evil twin AP attacks in public Wi-Fi hotspots in various real-world scenarios, despite time delays caused by several factors.
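One heuristic signal that client-side detectors of this kind can use is the same SSID being advertised by several distinct BSSIDs. The sketch below is only a toy illustration of that idea, not the detection system developed in the dissertation; a single SSID spread over several BSSIDs can also be a legitimate multi-AP deployment, so a real detector must apply further checks.

```python
from collections import defaultdict

def flag_suspect_ssids(beacons):
    """Group observed beacon frames (ssid, bssid) by SSID and flag
    networks advertised by more than one BSSID, a common evil twin
    indicator that warrants further inspection."""
    by_ssid = defaultdict(set)
    for ssid, bssid in beacons:
        by_ssid[ssid].add(bssid)
    return {ssid for ssid, bssids in by_ssid.items() if len(bssids) > 1}

# Hypothetical scan results: SSIDs and the MAC addresses advertising them
observed = [
    ("HotelGuest", "aa:bb:cc:00:00:01"),
    ("HotelGuest", "de:ad:be:ef:00:01"),   # same SSID from a different radio
    ("CoffeeShop", "aa:bb:cc:00:00:02"),
]
suspects = flag_suspect_ssids(observed)
```

Here "HotelGuest" is flagged because two different radios advertise it, while "CoffeeShop" is not.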
