Global ETD Search

11	Detekce nestabilit v některých panelových datech / Detection of instabilities in some panel data Láf, Adam January 2018 This thesis deals with the detection of a change in the intercept in a panel data regression model. We are interested in testing the null hypothesis that there was no change in the intercept during the observation period, in the case of no dependence between panels and with the number of panels and the number of observations in each panel going to infinity. Based on the results for the simplified case with no additional regressors, we propose a statistical test and show its properties. We also derive a consistent estimate of the change parameter based on the least squares method. The main contribution of the thesis is the derivation of theoretical results for the proposed test when the error variances are known, and its modification for unknown variance parameters. A large simulation study is conducted to examine the results. We then present an application to real data: we use a four-factor CAPM model to detect a change in the monthly returns of US mutual funds during the observation period 2004-2011 and show a significant change during the sub-prime crisis in 2007-2008. This work expands existing results for detecting changes in the mean in panel data and offers many directions for further research.
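As an illustration of the kind of least-squares change-point estimate mentioned in the abstract above, here is a minimal Python sketch. It assumes the simplest possible setting (no regressors, one change point shared by all panels, a shift in each panel's mean) and is not the estimator derived in the thesis. The function name `ls_change_point` and the toy data are hypothetical.

```python
def ls_change_point(panels):
    """Least-squares estimate of a common change point in panel means.

    panels: list of equal-length lists; panels[i][t] is observation t of panel i.
    Returns k, meaning the mean is assumed to shift after the first k observations.
    Assumption: one shared change point, no regressors (illustrative setting only).
    """
    n_obs = len(panels[0])
    best_k, best_rss = None, float("inf")
    for k in range(1, n_obs):  # candidate split: first k observations vs the rest
        rss = 0.0
        for y in panels:
            left, right = y[:k], y[k:]
            mean_left = sum(left) / len(left)
            mean_right = sum(right) / len(right)
            # Residual sum of squares when each segment is fitted by its own mean
            rss += sum((v - mean_left) ** 2 for v in left)
            rss += sum((v - mean_right) ** 2 for v in right)
        if rss < best_rss:  # strict inequality keeps the earliest minimiser
            best_k, best_rss = k, rss
    return best_k


# Toy data (hypothetical): both panels shift after the fourth observation.
toy_panels = [
    [0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0],
    [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0],
]
print(ls_change_point(toy_panels))  # → 4
```

Pooling the residual sum of squares over panels is what lets many short, noisy panels jointly locate a change that a single panel could not pin down on its own.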
12	Comparison of change-point detection algorithms for vector time series Du, Yang January 2010 Change-point detection aims to reveal sudden changes in sequences of data. Special attention has been paid to the detection of abrupt level shifts, and applications of such techniques can be found in a great variety of fields, such as monitoring of climate change, examination of gene expressions and quality control in the manufacturing industry. In this work, we compared the performance of two methods representing frequentist and Bayesian approaches, respectively. The frequentist approach involved a preliminary search for level shifts using a tree algorithm followed by a dynamic programming algorithm for optimizing the locations and sizes of the level shifts. The Bayesian approach involved an MCMC (Markov chain Monte Carlo) implementation of a method originally proposed by Barry and Hartigan. The two approaches were implemented in R and extensive simulations were carried out to assess both their computational efficiency and ability to detect abrupt level shifts. Our study showed that the overall performance regarding the estimated location and size of change-points was comparable for the Bayesian and frequentist approaches. However, the Bayesian approach performed better when the number of change-points was small, whereas the frequentist approach became stronger when the change-point proportion increased. The latter method was also better at detecting simultaneous change-points in vector time series. Theoretically, the Bayesian approach has a lower computational complexity than the frequentist approach, but suitable settings for the combined tree and dynamic programming can greatly reduce the processing time. change-point detection time series tree dynamic programming Bayesian MCMC Computer and Information Sciences Data- och informationsvetenskap
13	Gaussian processes for state space models and change point detection Turner, Ryan Darby January 2012 This thesis details several applications of Gaussian processes (GPs) for enhanced time series modeling. We first cover different approaches for using Gaussian processes in time series problems. These are extended to the state space approach to time series in two different problems. We also combine Gaussian processes and Bayesian online change point detection (BOCPD) to increase the generality of the Gaussian process time series methods. These methodologies are evaluated on predictive performance on six real world data sets, which include three environmental data sets, one financial, one biological, and one from industrial well drilling. Gaussian processes are capable of generalizing standard linear time series models. We cover two approaches: the Gaussian process time series model (GPTS) and the autoregressive Gaussian process (ARGP). We cover a variety of methods that greatly reduce the computational and memory complexity of Gaussian process approaches, which are generally cubic in computational complexity. Two different improvements to state space based approaches are covered. First, Gaussian process inference and learning (GPIL) generalizes linear dynamical systems (LDS), on which the Kalman filter is based, to general nonlinear systems for nonparametric system identification. Second, we address pathologies in the unscented Kalman filter (UKF). We use Gaussian process optimization (GPO) to learn UKF settings that minimize the potential for sigma point collapse. We show how to embed the aforementioned Gaussian process approaches to time series into a change point framework. Old data, from an old regime, that hinders predictive performance is automatically and elegantly phased out. The computational improvements for Gaussian process time series approaches are of even greater use in the change point framework.
We also present a supervised framework for learning a change point model when change point labels are available in training.
14	Exploring Change Point Detection in Network Equipment Logs Björk, Tim January 2021 Change point detection (CPD) is the method of detecting sudden changes in time series, and it is of great importance for network traffic. Greater knowledge of the changes that occur in data logs due to updates in networking equipment allows a deeper understanding of the interactions between the updates and operational resource usage. In a data log that reflects the amount of network traffic, there are large variations in the time series for reasons such as connection count or external changes to the system. Circumventing these unwanted changes in variation and sorting out the deliberate ones is a challenge. In this thesis, we utilize data logs retrieved from a network equipment vendor to detect changes, then compare the detected changes to when firmware/signature updates were applied, configuration changes were made, etc., with the goal of achieving a deeper understanding of any interaction between firmware/signature/configuration changes and operational resource usage. Challenges in data quality and data processing are addressed through data manipulation to counteract anomalies and unwanted variation, as well as experimentation with parameters to find the most suitable settings. Results are produced through experiments that test the accuracy of the various change point detection methods and investigate various parameter settings. Through trial and error, a satisfactory configuration is achieved and used in large scale log detection experiments. The experimental results show that additional information about how changes in variation arise is required to derive the desired understanding. Change point detection log change detection time series data signal processing Computer Engineering Datorteknik
15	Privacy of Sudden Events in Cyber-Physical Systems Alisic, Rijad January 2021 Cyberattacks against critical infrastructures have been a growing problem for the past couple of years. These infrastructures are a particularly desirable target for adversaries, due to their vital importance in society. For instance, a stop in the operation of a critical infrastructure could result in a crippling effect on a nation's economy, security or public health. The reason behind this increase is that critical infrastructures have become more complex, often being integrated with a large network of various cyber components. It is through these cyber components that an adversary is able to access the system and conduct their attacks. In this thesis, we consider methods which can be used as a first line of defence against such attacks for Cyber-Physical Systems (CPS). Specifically, we start by studying how information leaks about a system's dynamics help an adversary to generate attacks that are difficult to detect. In many cases, such attacks can be detrimental to a CPS since they can drive the system to a breaking point without being detected by the operator that is tasked to secure the system. We show that an adversary can use small amounts of data procured from information leaks to generate these undetectable attacks. In particular, we provide the minimal amount of information that is needed in order to keep the attack hidden even if the operator tries to probe the system for attacks. We design defence mechanisms against such information leaks using the Hammersley-Chapman-Robbins lower bound. With it, we study how information leakage could be mitigated through corruption of the data by injection of measurement noise. Specifically, we investigate how information about structured input sequences, which we call events, can be obtained through the output of a dynamical system and how this leakage depends on the system dynamics. 
For example, it is shown that a system with fast dynamical modes tends to disclose more information about an event compared to a system with slower modes. However, a slower system leaks information over a longer time horizon, which means that an adversary who starts to collect information long after the event has occurred might still be able to estimate it. Additionally, we show how sensor placements can affect the information leak. These results are then used to aid the operator to detect privacy vulnerabilities in the design of a CPS. Based on the Hammersley-Chapman-Robbins lower bound, we provide additional defensive mechanisms that can be deployed by an operator online to minimize information leakage. For instance, we propose a method to modify the structured inputs in order to maximize the usage of the existing noise in the system. This mechanism allows us to explicitly deal with the privacy-utility trade-off, which is of interest when optimal control problems are considered. Finally, we show how the adversary's certainty of the event increases as a function of the number of samples they collect. For instance, we provide sufficient conditions for when their estimation variance starts to converge to its final value. This information can be used by an operator to estimate when possible attacks from an adversary could occur, and change the CPS before that, rendering the adversary's collected information useless. / In recent years, cyberattacks against critical infrastructures have been a growing problem. These infrastructures are particularly exposed to cyberattacks because they fulfil a function that is necessary for society to work, which makes them desirable targets for an attacker. If a critical infrastructure is stopped from fulfilling its function, the consequences can be devastating for, for example, a nation's economy, security or public health. 
The number of attacks has increased because critical infrastructures have become increasingly complex, as they are now part of large networks that include various types of cyber components. It is precisely through these cyber components that an attacker can gain access to the system and stage cyberattacks. In this thesis, we develop methods that can be used as a first line of defence against cyberattacks on cyber-physical systems (CPS). We begin by investigating how information leaks about the system dynamics can help an attacker create attacks that are hard to detect. Such attacks are often devastating for a CPS, since an attacker can force the system to a breaking point without being detected by the operator whose task is to ensure the system's continued operation. We prove that an attacker can use relatively small amounts of data to generate these hard-to-detect attacks. More specifically, we derive an expression for the minimum amount of information required for an attack to be hard to detect, even in cases where an operator adopts methods to investigate whether the system is under attack. In the thesis, we construct defence methods against information leaks using the Hammersley-Chapman-Robbins inequality. With this inequality, we can study how the information leak can be dampened by injecting noise into the data. Specifically, we investigate how much information about structured input signals to a dynamical system, which we call events, an attacker can extract from its output signals. We also look at how this amount of information depends on the system dynamics. For example, we show that a system with fast dynamics leaks more information than a slower system. For slower systems, however, the information is smeared out over a longer time interval, which means that an attacker who starts eavesdropping on a system long after the event has occurred can still estimate it. 
In addition, we show how the sensor placement in a CPS affects the information leak. These results can be used to assist an operator in analysing the privacy of a CPS. We also use the Hammersley-Chapman-Robbins inequality to develop defences against information leaks that can be used online. We propose modifications to the structured input signal so that the system's existing noise is better exploited to hide the event. If the operator has other goals that it tries to fulfil with the control, this method can be used to govern the trade-off between privacy and the operator's other goals. Finally, we show how an attacker's estimate of the event improves as a function of the amount of data they obtain. The operator can use this information to work out when the attacker may be ready to attack the system, and then change the system before that happens, so that the attacker's information is no longer useful. / QC 20210820 Privacy Security Cyber-Physical Systems Automatic Control Estimation Machine Learning Change Point Detection Control Engineering Reglerteknik
16	Nonparametric Bayesian Clustering under Structural Restrictions Hanxi Sun (11009154) 23 July 2021 Model-based clustering, with its flexibility and solid statistical foundations, is an important tool for unsupervised learning, and has numerous applications in a variety of fields. This dissertation focuses on nonparametric Bayesian approaches to model-based clustering under structural restrictions. These are additional constraints on the model that embody prior knowledge, either to regularize the model structure to encourage interpretability and parsimony or to encourage statistical sharing through underlying tree or network structure. The first part of the dissertation focuses on the most commonly used model-based clustering models, mixture models. Current approaches typically model the parameters of the mixture components as independent variables, which can lead to overfitting that produces poorly separated clusters, and can also be sensitive to model misspecification. To address this problem, we propose a novel Bayesian mixture model with the structural restriction being that the clusters repel each other. The repulsion is induced by the generalized Matérn type-III repulsive point process. We derive an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior inference, and demonstrate its utility on a number of synthetic and real-world problems. The second part of the dissertation focuses on clustering populations with a hierarchical dependency structure that can be described by a tree. A classic example of such problems, which is also the focus of our work, is the phylogenetic tree with nodes often representing biological species. The structure of this problem refers to the hierarchical structure of the populations. 
Clustering of the populations in this problem is equivalent to identifying branches in the tree where the populations at the parent and child node have significantly different distributions. We construct a nonparametric Bayesian model based on hierarchical Pitman-Yor and Poisson processes to exploit this, and develop an efficient particle MCMC algorithm to address this problem. We illustrate the efficacy of our proposed approach on both synthetic and real-world problems. Statistics Poisson point processes Thinning Parsimony Pitman-Yor processes Phylogenetic tree model Change-point detection
17	Wavelet methods and statistical applications: network security and bioinformatics Kwon, Deukwoo 01 November 2005 Wavelet methods possess versatile properties for statistical applications. We explore the advantages of using wavelets in analyses in two different research areas. First, we develop an integrated tool for online detection of network anomalies. We consider statistical change point detection algorithms, both for local changes in the variance and for jump detection, and propose modified versions of these algorithms based on moving window techniques. We investigate their performance on simulated data and on network traffic data with several superimposed attacks. All detection methods are based on wavelet packet transformations. We also propose a Bayesian model for the analysis of high-throughput data where the outcome of interest has a natural ordering. The method provides a unified approach for identifying relevant markers and predicting class memberships. This is accomplished by building a stochastic search variable selection method into an ordinal model. We apply the methodology to the analysis of proteomic studies in prostate cancer. We explore wavelet-based techniques to remove noise from the protein mass spectra. The goal is to identify protein markers associated with prostate-specific antigen (PSA) level, an ordinal diagnostic measure currently used to stratify patients into different risk groups. Bayesian ordinal probit model wavelet methods change point detection network security bioinformatics proteomics SELDI-TOF MS Bayesian variable selection biomarker
18	System Surveillance Mansoor, Shaheer January 2013 In recent years, trade activity in stock markets has increased substantially. This is mainly attributed to the development of powerful computers and intranets connecting traders to markets across the globe. The trades have to be carried out almost instantaneously, and the systems that handle trades are burdened with millions of transactions a day, several thousand a minute. With increasing transactions the time to execute a single trade increases, and this can be seen as an impact on performance. There is a need to model the performance of these systems and provide forecasts to give a heads up on when a system is expected to be overwhelmed by transactions. This was done in this study, in cooperation with Cinnober Financial Technologies, a firm which provides trading solutions to stock markets. To ensure that the models developed weren't biased, the dataset was cleansed, i.e. operational and other transactions were removed, and only valid trade transactions remained. For this purpose, a descriptive analysis of the time series along with change point detection and LOESS regression were used. A state space model with Kalman filtering was further used to develop a time-varying coefficient model for the performance, and this model was applied to make forecasts. Wavelets were also used to produce forecasts, and in addition high-pass filters were used to identify low performance regions. The state space model captured the overall trend in performance very well and produced reliable forecasts. This can be ascribed to the ability of the Kalman filter to handle noisy data well. Wavelets, on the other hand, didn't produce reliable forecasts but were more efficient in detecting regions of low performance. State Space Models Forecasting Wavelets LOESS Change Point Detection Financial Systems Trading Transactions per second Kalman Filtering
19	Detekce změn v lineárních modelech a bootstrap / Detection of changes in linear models and bootstrap Čellár, Matúš January 2016 This thesis discusses changes in the parameters of linear models and methods for detecting them. It begins with a short introduction to the two basic types of change point detection procedures and to bootstrap algorithms developed specifically to deal with dependent data. In the following chapter we focus on the location model, the simplest example of a linear model with a change in parameters. On this model we illustrate a way of estimating the long-run variance and the implementation of selected bootstrap procedures. In the last chapter we show how to extend the applied methods to linear models with a change in parameters. We compare the performance of change point tests based on asymptotic and bootstrap critical values through simulation studies for both of the methods considered. The performance of the selected long-run variance estimator is also examined, both for situations when a change in parameters occurs and when it does not.
20	Change detection in metal oxide gas sensor signals for open sampling systems Pashami, Sepideh January 2015 This thesis addresses the problem of detecting changes in the activity of a distant gas source from the response of an array of metal oxide (MOX) gas sensors deployed in an Open Sampling System (OSS). Changes can occur due to gas source activity such as a sudden alteration in concentration or due to exposure to a different compound. Applications such as gas-leak detection in mines or large-scale pollution monitoring can benefit from reliable change detection algorithms, especially where it is impractical to continuously store or transfer sensor readings, or where reliable calibration is difficult to achieve. Here, it is desirable to detect a change point indicating a significant event, e.g. presence of gas or a sudden change in concentration. The main challenges are turbulent dispersion of gas and the slow response and recovery times of MOX sensors. Due to these challenges, the gas sensor response exhibits fluctuations that interfere with the changes of interest. The contributions of this thesis are centred on developing change detection methods using MOX sensor responses. First, we apply the Generalized Likelihood Ratio algorithm (GLR), a commonly used method that does not make any a priori assumption about change events. Next, we propose TREFEX, a novel change point detection algorithm, which models the response of MOX sensors as a piecewise exponential signal and considers the junctions between consecutive exponentials as change points. We also propose the rTREFEX algorithm as an extension of TREFEX. The core idea behind rTREFEX is an attempt to improve the fitted exponentials of TREFEX by minimizing the number of exponentials even further. GLR, TREFEX and rTREFEX are evaluated for various MOX sensors and gas emission profiles. 
A sensor selection algorithm is then introduced and the change detection algorithms are evaluated with the selected sensor subsets. A comparison between the three proposed algorithms shows clearly superior performance of rTREFEX both in detection performance and in estimating the change time. Further, rTREFEX is evaluated in real-world experiments where data is gathered by a mobile robot. Finally, a gas dispersion simulation was developed which integrates OpenFOAM flow simulation and a filament-based gas propagation model to simulate gas dispersion for compressible flows with a realistic turbulence model. Computer Science Datavetenskap (datalogi)
