111 |
An Extensible Framework For Automated Network Attack Signature Generation
Kenar, Serkan 01 January 2010 (has links) (PDF)
The effectiveness of misuse-based intrusion detection systems (IDS) is seriously undermined by the advance of threats in terms of speed and scale. Today worms, trojans, viruses and other threats can spread all around the globe in less than thirty minutes. In order to detect these emerging threats, signatures must be generated automatically and distributed to intrusion detection systems rapidly. There are studies on automatically generating signatures for worms and attacks. However, these systems either rely on honeypots, which are supposed to receive only suspicious traffic, or use port-scanning outlier detectors. In this study, an open, extensible system based on a network IDS is proposed to identify suspicious traffic using anomaly detection methods, and to automatically generate attack signatures from this suspicious traffic. The generated signatures are classified and fed back into the IDS either locally or in a distributed manner. The design and a proof-of-concept implementation are described, and the developed system
is tested on both synthetic and real network data. The system is designed as a framework to test different methods and evaluate the outcomes of varying configurations easily.
The test results show that, with a properly defined attack detection algorithm, attack signatures could be generated with high accuracy and efficiency. The resulting system could be used to limit the early damage caused by fast-spreading worms and other threats.
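One classic way to distill a signature from a pool of suspicious payloads is to extract their longest common byte substring. The sketch below illustrates that idea only; it is not the thesis's actual algorithm, and the function names and the minimum-length threshold are assumptions:

```python
def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Return the longest byte substring shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # prev[j+1] = common suffix length of a[:i], b[:j+1]
    for i in range(len(a)):
        cur = [0] * (len(b) + 1)
        for j in range(len(b)):
            if a[i] == b[j]:
                cur[j + 1] = prev[j] + 1
                if cur[j + 1] > best_len:
                    best_len, best_end = cur[j + 1], j + 1
        prev = cur
    return b[best_end - best_len:best_end]


def generate_signature(payloads, min_len=4):
    """Reduce suspicious payloads to one shared substring usable as a signature."""
    sig = payloads[0]
    for p in payloads[1:]:
        sig = longest_common_substring(sig, p)
        if len(sig) < min_len:
            return None  # no invariant content long enough to be a useful signature
    return sig
```

A signature produced this way would then be classified and pushed into the IDS rule set; real systems add safeguards against matching benign traffic.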
|
112 |
Coding-Based System Primitives for Airborne Cloud Computing
Lin, Chit-Kwan January 2011 (has links)
The recent proliferation of sensors in inhospitable environments such as disaster or battle zones has not been matched by in situ data processing capabilities due to a lack of computing infrastructure in the field. We envision a solution based on small, low-altitude unmanned aerial vehicles (UAVs) that can deploy elastically-scalable computing infrastructure anywhere, at any time. This airborne compute cloud—essentially, micro-data centers hosted on UAVs—would communicate with terrestrial assets over a bandwidth-constrained wireless network with variable, unpredictable link qualities. Achieving high performance over this ground-to-air mobile radio channel thus requires making full and efficient use of every single transmission opportunity. To this end, this dissertation presents two system primitives that improve throughput and reduce network overhead by using recent distributed coding methods to exploit natural properties of the airborne environment (i.e., antenna beam diversity and anomaly sparsity). We first built and deployed a UAV wireless networking testbed and used it to characterize the ground-to-UAV wireless channel. Our flight experiments revealed that antenna beam diversity from using multiple SISO radios boosts reception range and aggregate throughput. This observation led us to develop our first primitive: ground-to-UAV bulk data transport. We designed and implemented FlowCode, a reliable link layer for uplink data transport that uses network coding to harness antenna beam diversity gains. Via flight experiments, we show that FlowCode can boost reception range and TCP throughput as much as 4.5-fold. Our second primitive permits low-overhead cloud status monitoring. We designed CloudSense, a network switch that compresses cloud status streams in-network via compressive sensing.
CloudSense is particularly useful for anomaly detection tasks requiring global relative comparisons (e.g., MapReduce straggler detection) and can achieve up to 16.3-fold compression as well as early detection of the worst anomalies. Our efforts have also shed light on the close relationship between network coding and compressive sensing. Thus, we offer FlowCode and CloudSense not only as first steps toward the airborne compute cloud, but also as exemplars of two classes of applications—approximation intolerant and tolerant—to which network coding and compressive sensing should be judiciously and selectively applied. / Engineering and Applied Sciences
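As a rough illustration of how compressive sensing can shrink a status stream while still exposing the worst anomaly, consider the following sketch. The sizes, the Gaussian measurement matrix, and the matched-filter decoding are assumptions chosen for simplicity; CloudSense's actual in-network encoding is more involved:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 200, 40                    # 200 worker status values -> 40 measurements (5x compression)

x = np.zeros(N)
x[37] = 10.0                      # one straggler with a large status deviation (sparse anomaly)

A = rng.standard_normal((M, N)) / np.sqrt(M)   # random measurement matrix
y = A @ x                         # compressed status stream forwarded by the switch

# The collector can flag the worst anomaly without full signal reconstruction:
worst = int(np.argmax(np.abs(A.T @ y)))        # matched-filter estimate of the anomaly's index
```

Because anomalies are sparse, far fewer than N measurements suffice, which is what makes early detection of the worst stragglers possible from the compressed stream alone.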
|
113 |
Kompiuterių tinklo srautų anomalijų aptikimo metodai / Detection of network traffic anomalies
Krakauskas, Vytautas 03 June 2006 (has links)
This paper describes various network monitoring technologies and anomaly detection methods. NetFlow was chosen as the data source for the anomaly detection system being developed. Anomalies are detected using a deviation value. After evaluating the quality of the developed system, new enhancements were suggested and implemented. Flow data distribution was introduced to achieve a more precise representation of NetFlow data, enabling more precise use of network monitoring information for anomaly detection. Arithmetic average calculations were replaced with the more flexible Exponentially Weighted Moving Average (EWMA) algorithm. A deviation weight was introduced to reduce false alarms. Results from experiments with real-life data showed that the proposed changes increased the precision of the NetFlow-based anomaly detection system.
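The EWMA-plus-deviation idea can be sketched as follows. This is a minimal illustration rather than the thesis's exact algorithm; the smoothing factor, the threshold multiplier `k`, and the choice to exclude alarmed samples from the baseline are assumptions:

```python
def ewma_alarms(series, alpha=0.3, k=5.0):
    """Flag samples whose deviation from an EWMA baseline exceeds k times
    the EWMA-smoothed absolute deviation."""
    mean, dev, alarms = float(series[0]), 0.0, []
    for t, x in enumerate(series[1:], start=1):
        err = abs(x - mean)
        if dev > 0 and err > k * dev:
            alarms.append(t)  # anomalous sample: do not absorb it into the baseline
        else:
            mean = alpha * x + (1 - alpha) * mean
            dev = alpha * err + (1 - alpha) * dev
    return alarms
```

On a flow-count series that hovers around a stable level, a sudden spike stands out against the smoothed deviation and raises a single alarm, while the baseline remains unpoisoned for subsequent samples.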
|
114 |
Exploring Event Log Analysis with Minimum Apriori Information
Makanju, Adetokunbo 02 April 2012 (has links)
The continued increase in the size and complexity of modern computer systems has led to a commensurate increase in the size of their logs. System logs are an invaluable resource to systems administrators during fault resolution. Fault resolution is a time-consuming and knowledge-intensive process. A lot of the time spent in fault resolution is spent sifting through large volumes of information, including event logs, to find the root cause of the problem. Therefore, the ability to analyze log files automatically and accurately will lead to significant savings in the time and cost of downtime events for any organization. The automatic analysis and search of system logs for fault symptoms, otherwise called alerts, is the primary motivation for the work carried out in this thesis. The proposed log alert detection scheme is a hybrid framework, which incorporates anomaly detection and signature generation to accomplish its goal. Unlike previous work, minimal a priori knowledge of the system being analyzed is assumed. This assumption enhances the platform portability of the framework. The anomaly detection component works in a bottom-up manner on the contents of historical system log data to detect regions of the log which contain anomalous (alert) behaviour. The identified anomalous regions are then passed to the signature generation component, which mines them for patterns. Consequently, future occurrences of the underlying alert in the anomalous log region can be detected on a production system using the discovered pattern. The combination of anomaly detection and signature generation, which is novel compared to previous work, yields a framework that is accurate while still being able to detect new and unknown alerts.
Evaluations of the framework involved testing it on log data from High Performance Cluster (HPC), distributed, and cloud systems. These systems provide a good range of the types of computer systems used in the real world today. The results indicate that the framework can generate signatures for detecting alerts which achieve, on average, a recall of approximately 83% and a false positive rate of approximately 0%.
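A toy version of pattern mining over an anomalous log region might look like the sketch below. It is deliberately naive and is not the thesis's actual miner; grouping messages by token count and wildcarding every variable position are simplifying assumptions:

```python
from collections import defaultdict

def mine_templates(lines):
    """Naive log template miner: partition messages by token count, then
    replace any position whose value varies across the group with '*'."""
    groups = defaultdict(list)
    for line in lines:
        toks = line.split()
        groups[len(toks)].append(toks)

    templates = []
    for toks_list in groups.values():
        # zip(*toks_list) yields, for each position, the tuple of values seen there
        template = [pos[0] if len(set(pos)) == 1 else "*" for pos in zip(*toks_list)]
        templates.append(" ".join(template))
    return sorted(templates)
```

Each resulting template acts as a crude signature: a future log line matching the wildcarded pattern can be flagged as a recurrence of the underlying alert.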
|
115 |
Semi-supervised learning of bitmask pairs for an anomaly-based intrusion detection system
Ardolino, Kyle R. January 2008 (has links)
Thesis (M.S.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Electrical Engineering, 2008. / Includes bibliographical references.
|
116 |
Applications of GUI usage analysis
Imsand, Eric Shaun. Hamilton, John A., January 2008 (has links) (PDF)
Thesis (Ph. D.)--Auburn University, 2008. / Abstract. Includes bibliographical references (p. 119-122).
|
117 |
Anomaly-based intrusion detection using lightweight stateless payload inspection
Nwanze, Nnamdi Chike. January 2009 (has links)
Thesis (Ph. D.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Electrical and Computer Engineering, 2009. / Includes bibliographical references.
|
118 |
On Learning from Collective Data
Xiong, Liang 01 December 2013 (has links)
In many machine learning problems and application domains, the data are naturally organized by groups. For example, a video sequence is a group of images, an image is a group of patches, a document is a group of paragraphs/words, and a community is a group of people. We call them the collective data. In this thesis, we study how and what we can learn from collective data. Usually, machine learning focuses on individual objects, each of which is described by a feature vector and studied as a point in some metric space. When approaching collective data, researchers often reduce the groups into vectors to which traditional methods can be applied. We, on the other hand, will try to develop machine learning methods that respect the collective nature of data and learn from them directly. Several different approaches were taken to address this learning problem. When a group consists of unordered discrete data points, it can naturally be characterized by its sufficient statistic – the histogram. For this case we develop efficient methods to address outliers and temporal effects in the data based on matrix and tensor factorization methods. To learn from groups that contain multi-dimensional real-valued vectors, we develop both generative methods based on hierarchical probabilistic models and discriminative methods using group kernels based on new divergence estimators. With these tools, we can accomplish various tasks such as classification, regression, clustering, anomaly detection, and dimensionality reduction on collective data. We further consider the practical side of the divergence-based algorithms. To reduce their time and space requirements, we evaluate and find methods that can effectively reduce the size of the groups with little impact on accuracy. We also propose the conditional divergence, along with an efficient estimator, to correct sampling biases that might be present in the data.
Finally, we develop methods to learn in cases where some divergences are missing, caused by either insufficient computational resources or extreme sampling biases. In addition to designing new learning methods, we will use them to help the scientific discovery process. In our collaboration with astronomers and physicists, we see that the new techniques can indeed help scientists make the best of data.
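To make the notion of a divergence-based group kernel concrete, here is a minimal sketch that fits a Gaussian to each group of scalars and turns a symmetrized KL divergence into a kernel value. The Gaussian fit and the exponential kernel form are illustrative choices; the thesis develops nonparametric divergence estimators instead:

```python
import math

def gaussian_kl(g1, g2):
    """Closed-form KL divergence between 1-D Gaussians fitted to two groups."""
    m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
    v1 = sum((x - m1) ** 2 for x in g1) / len(g1)
    v2 = sum((x - m2) ** 2 for x in g2) / len(g2)
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def group_kernel(g1, g2):
    """Kernel between two groups: exponentiated symmetrized divergence."""
    d = 0.5 * (gaussian_kl(g1, g2) + gaussian_kl(g2, g1))
    return math.exp(-d)
```

A kernel like this lets standard kernel machines (SVMs, kernel regression, one-class detectors) operate directly on groups rather than on flattened feature vectors: identical groups get kernel value 1, while well-separated groups get a value near 0.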
|
119 |
Data gathering and anomaly detection in wireless sensors networks / Collecte de données et détection d’anomalies dans les réseaux de capteurs sans fil
Moussa, Mohamed Ali 10 November 2017 (links)
The use of Wireless Sensor Networks (WSNs) is steadily increasing to cover various applications and domains. This trend is supported by the technical advancements in the sensor manufacturing process, which allow a considerable reduction in the cost and size of these components. However, there are several challenges facing the deployment and the good functioning of this type of network. Indeed, WSN applications have to deal with the limited energy, memory and processing capacities of sensor nodes, as well as the imperfection of the probed data. This dissertation addresses the problem of collecting data and detecting anomalies in WSNs. These functionalities need to be achieved while ensuring reliable data quality at the collector node, good anomaly detection accuracy, a low false alarm rate, and efficient energy consumption. Throughout this work, we provide different solutions that allow these requirements to be met. First, we propose a Compressive Sensing (CS) based solution that balances the traffic carried by nodes regardless of their distance from the sink. This solution promotes a longer lifespan of the WSN, since it balances energy consumption between sensor nodes. Our approach differs from existing CS-based solutions by taking into account the sparsity of the sensory representation in the temporal domain in addition to the spatial dimension. Moreover, we propose a new formulation to detect aberrant readings. Simulations carried out on real datasets prove the efficiency of our approach in terms of data recovery and anomaly detection compared to existing solutions. Aiming to further optimize the use of WSN resources, we propose in our second contribution a Matrix Completion (MC) based data gathering and anomaly detection solution, in which an arbitrary subset of nodes contributes to the data gathering process at each operating period. To fill in the missing values, we mainly rely on the low-rank structure of the sensory data, as well as the sparsity of readings in some transform domain. The developed algorithm also allows anomalies to be separated from the normal data structure. This solution is enhanced in our third contribution, where we propose a constrained formulation of the data gathering and anomaly detection problem. We reformulate the a priori knowledge about the target data as hard convex constraints. The parameters involved in the developed algorithm thus become easy to adjust, since they are related to physical properties of the treated data. Both MC-based approaches are tested on real datasets and demonstrate good capabilities in terms of data reconstruction quality and anomaly detection performance. Finally, we propose in the last contribution a position-based compressive data gathering scheme, in which nodes cooperate to compute and transmit only the relevant positions of their sparse sensory representation. This technique provides an efficient tool to deal with the noisy nature of the WSN environment, as well as detecting spikes in the sensory data. Furthermore, we validate the efficiency of our solution by a theoretical analysis and corroborate it with a simulation evaluation.
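The low-rank completion step can be illustrated with a simple hard-imputation loop, in which a rank-r SVD projection alternates with re-imposing the readings that were actually transmitted. This is a sketch only; the rank, the iteration count, and the plain SVD projection are assumptions standing in for the constrained solvers developed in the thesis:

```python
import numpy as np

def complete_matrix(M, mask, rank=1, n_iter=300):
    """Estimate missing entries of M (observed where mask is True) by
    alternating a rank-r SVD projection with re-imposing observed entries."""
    X = np.where(mask, M, 0.0)                     # missing readings start at zero
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # project onto rank-r matrices
        X[mask] = M[mask]                          # keep what the nodes transmitted
    return X
```

For instance, a rank-one field sampled by three sensors over three periods can be recovered from seven of its nine readings, which is the mechanism that lets only a subset of nodes transmit in each operating period.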
|
120 |
Concentration of measure, negative association, and machine learning
Root, Jonathan 07 December 2016 (links)
In this thesis we consider concentration inequalities and the concentration of measure phenomenon from a variety of angles. Sharp tail bounds on the deviation of Lipschitz functions of independent random variables about their mean are well known. We consider variations on this theme for dependent variables on the Boolean cube. In recent years, negatively associated probability distributions have been studied as potential generalizations of independent random variables. Results on this class of distributions have been sparse at best, even when restricting to the Boolean cube. We consider the class of negatively associated distributions topologically, as a subset of the general class of probability measures. Both the weak (distributional) topology and the total variation topology are considered, and the simpler notion of negative correlation is investigated. The concentration of measure phenomenon began with Milman's proof of Dvoretzky's theorem, and is therefore intimately connected to the field of high-dimensional convex geometry. Recently this field has found application in the area of compressed sensing. We consider these applications and in particular analyze the use of Gordon's min-max inequality in various compressed sensing frameworks, including the Dantzig selector and the matrix uncertainty selector. Finally, we consider the use of concentration inequalities in developing a theoretically sound anomaly detection algorithm. Our method uses a ranking procedure based on KNN graphs of the given data. We develop a max-margin learning-to-rank framework to train limited-complexity models to imitate these KNN scores. The resulting anomaly detector is shown to be asymptotically optimal in the sense that, for any false alarm rate α, its decision region converges to the α-percentile minimum-volume level set of the unknown underlying density.
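The KNN-based ranking at the heart of such a detector can be sketched in a few lines: each point is scored by the distance to its k-th nearest neighbour, so isolated points rank as most anomalous. The exact statistic used in the thesis is assumed here for illustration:

```python
import numpy as np

def knn_scores(X, k=3):
    """Anomaly score per row of X: distance to its k-th nearest neighbour
    (larger score = more anomalous)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    return np.sort(d, axis=1)[:, k]  # column 0 is each point's zero self-distance
```

Thresholding these scores at their empirical (1 − α)-quantile gives the nominal false alarm rate α, which is the quantity the concentration analysis controls.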
|