• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 329
  • 18
  • 17
  • 17
  • 15
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 484
  • 484
  • 215
  • 212
  • 160
  • 138
  • 116
  • 91
  • 81
  • 75
  • 70
  • 68
  • 61
  • 60
  • 59
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
311

Autoencoder-based anomaly detection in time series : Application to active medical devices

Gietzelt, Marie January 2024 (has links)
The aim of this thesis is to derive an unsupervised method for detecting anomalies in time series. Autoencoder-based approaches are widely used for the task of detecting anomalies where a model learns to reconstruct the pattern of the given data. The main idea is that the model will be good at reconstructing data that does not contain anomalous behavior. If the model fails to reconstruct an observation it will be marked as anomalous. In this thesis, the derived method is applied to data from active medical devices manufactured by B. Braun. The given data consist of 6,000 length-varying time series, where the average length is greater than 14,000. Hence, the given sample size is small compared to their lengths. Subsequences of the same pattern where anomalies are expected to appear can be extracted from the time series taking expert knowledge about the data into account. Considering the subsequences for the model training, the problem can betranslated into a problem with a large dataset of short time series. It is shown that a common autoencoder is able to reconstruct anomalies well and is therefore not useful to solve the task. It is demonstrated that a variational autoencoder works better as there are large differences between the given anomalous observations and their reconstructions. Furthermore, several thresholds for these differences are compared. The relative number of detected anomalies in the two given datasets are 3.12% and 5.03%.
312

Unveiling Anomaly Detection: Navigating Cultural Shifts and Model Dynamics in AIOps Implementations

Sandén, Therese January 2024 (has links)
This report examines Artificial Intelligence for IT Operations, commonly known as AIOps, delving deeper into the area of anomaly detection and also investigating the effects of the shift in working methods when a company starts using AI-driven tools. Two anomaly detection machine learning algorithms were explored, Isolation Forest(IF)and Local Outlier Factor(LOF), and compared by testing with a focuson throughput and resource efficiency, to mirror how they would operate in a real-time cloud environment. From a throughput and efficiency perspective, LOF outperforms IF when using default parameters, making it a more suitable choice for cloud environments where processing speed is critical. The higher throughput of LOF indicates that it can handle a larger volume of log data more quickly, which is essential for real-time anomaly detection in dynamic cloud settings. However,  LOF’s higher memory usage suggests that it may be less scalable in memory-constrained environments within the cloud. This could lead to increased costs due to the need for more memory resources. The tests show, however, that tuning the models’ parameters are essential to fit them to different types of data. Through a literature study, it is evident that the integration of AI and automation into routine tasks presents an opportunity for workforce development and operational improvement.Addressing cultural barriers and fostering collaboration across IT teamsare essential for successful adoption and implementation.
313

Cascaded Ensembling for Resource-Efficient Multivariate Time Series Anomaly Detection

Mapitigama Boththanthrige, Dhanushki Pavithya January 2024 (has links)
The rapid evolution of Connected and Autonomous Vehicles (CAVs) has led to a surge in research on efficient anomaly detection methods to ensure their safe and reliable operation. While state-of-the-art deep learning models offer promising results in this domain, their high computational requirements present challenges for deployment in resource-constrained environments, such as Electronic Control Units (ECU) in vehicles. In this context, we consider using the ensemble learning technique specifically the cascaded modeling approach for real-time and resource-efficient multivariate time series anomaly detection in CAVs. The study was done in collaboration with SCANIA, a transport solutions provider. The company is now undergoing a transformation of providing autonomous and sustainable solutions and this work will contribute towards that transformation. Our methodology employs unsupervised learning techniques to construct a cascade of models, comprising a coarse-grained model with lower computational complexity at level one, and a more intricate fine-grained model at level two. Furthermore, we incorporate cascaded model training to refine the complex model's ability to make decisions on uncertain and anomalous events, leveraging insights from the simpler model. Through extensive experimentation, we investigate the trade-off between model performance and computational complexity, demonstrating that our proposed cascaded model achieves greater efficiency with no performance degradation. Further, we do a comparative analysis of the impact of probabilistic versus deterministic approaches and assess the feasibility of model training at edge environments using the Federated Learning concept.
314

Αναγνώριση επιθέσεων web σε web-servers

Στυλιανού, Γεώργιος 09 July 2013 (has links)
Οι επιθέσεις στο Διαδίκτυο και ειδικά οι επιθέσεις άρνησης εξυπηρέτησης (Denial of Service, DoS) αποτελούν ένα πολύ σοβαρό πρόβλημα για την ομαλή λειτουργία του Διαδικτύου. Αυτό το είδος επιθέσεων στοχεύει στην διατάραξη της καλής λειτουργίας ενός συστήματος, καταναλώνοντας τους πόρους του ή προκαλώντας υπερφόρτωση στο δίκτυο, καθιστώντας το ανίκανο να παρέχει στους πελάτες του τις υπηρεσίες για τις οποίες προορίζεται. Η αντιμετώπιση των επιθέσεων αυτών έχει απασχολήσει πολλούς ερευνητές τα τελευταία χρόνια και έχουν προταθεί πολλές διαφορετικές μέθοδοι πρόληψης, ανίχνευσης, και απόκρισης. Στα πλαίσια της παρούσας διπλωματικής επιχειρείται αρχικά ο ορισμός και η ταξινόμηση των επιθέσεων DoS και DDoS, με ιδιαίτερη αναφορά στις επιθέσεις DoS στον Παγκόσμιο Ιστό. Στη συνέχεια αναλύονται διάφοροι τρόποι αναγνώρισης επιθέσεων, με κύριους άξονες την αναγνώριση υπογραφής και την ανίχνευση ανωμαλιών. Γίνεται εμβάθυνση στο πεδίο της ανίχνευσης ανωμαλιών και πραγματοποιείται η μελέτη ενός συστήματος που ανιχνεύει ανωμαλίες σε δεδομένα κίνησης δικτύου που περιέχουν επιθέσεις. / Attacks in the Internet, and especially Denial of Service attacks, are a very serious threat to the normal function of the Internet. This kind of attack aims to the disruption of the normal function of a system, by consuming its resources or overloading the network, making it incapable to provide services, that is designed for, to the clients. In recent years many researchers have tried to propose solutions to prevent, detect and respond effectively to attacks. In this thesis, first a definition, and then a classification of DoS and DDoS attacks is proposed, with distinctive reference to attacks in the World Wide Web. Several ways of attack detection are analyzed, with signature detection and anomaly detection being the most significant. Afterwards, the field of anomaly detection is thoroughly analyzed, and a system that detects anomalies to a dataset of network traffic that contains attacks, is examined.
315

Detection and localization of link-level network anomalies using end-to-end path monitoring / Détection et localisation des anomalies réseau au niveau des liens en utilisant de la surveillance des chemins de bout-en-bout

Salhi, Emna 13 February 2013 (has links)
L'objectif de cette thèse est de trouver des techniques de détection et de localisation des anomalies au niveau des liens qui soient à faible coût, précises et rapides. La plupart des techniques de détection et de localisation des anomalies au niveau des liens qui existent dans la littérature calculent les solutions, c-à-d l'ensemble des chemins à monitorer et les emplacements des dispositifs de monitorage, en deux étapes. La première étape sélectionne un ensemble minimal d'emplacements des dispositifs de monitorage qui permet de détecter/localiser toutes les anomalies possibles. La deuxième étape sélectionne un ensemble minimal de chemins de monitorage entre les emplacements sélectionnés de telle sorte que tous les liens du réseau soient couverts/distinguables paire par paire. Toutefois, ces techniques ignorent l'interaction entre les objectifs d'optimisation contradictoires des deux étapes, ce qui entraîne une utilisation sous-optimale des ressources du réseau et des mesures de monitorage biaisées. L'un des objectifs de cette thèse est d'évaluer et de réduire cette interaction. A cette fin, nous proposons des techniques de détection et de localisation d'anomalies au niveau des liens qui sélectionnent les emplacements des moniteurs et les chemins qui doivent être monitorés conjointement en une seule étape. Par ailleurs, nous démontrons que la condition établie pour la localisation des anomalies est suffisante mais pas nécessaire. Une condition nécessaire et suffisante qui minimise le coût de localisation considérablement est établie. Il est démontré que les deux problèmes sont NP-durs. Des algorithmes heuristiques scalables et efficaces sont alors proposés. / The aim of this thesis is to come up with cost-efficient, accurate and fast schemes for link-level network anomaly detection and localization. It has been established that for detecting all potential link-level anomalies, a set of paths that cover all links of the network must be monitored, whereas for localizing all potential link-level anomalies, a set of paths that can distinguish between all links of the network pairwise must be monitored. Either end-node of each path monitored must be equipped with a monitoring device. Most existing link-level anomaly detection and localization schemes are two-step. The first step selects a minimal set of monitor locations that can detect/localize any link-level anomaly. The second step selects a minimal set of monitoring paths between the selected monitor locations such that all links of the network are covered/distinguishable pairwise. However, such stepwise schemes do not consider the interplay between the conflicting optimization objectives of the two steps, which results in suboptimal consumption of the network resources and biased monitoring measurements. One of the objectives of this thesis is to evaluate and reduce this interplay. To this end, one-step anomaly detection and localization schemes that select monitor locations and paths that are to be monitored jointly are proposed. Furthermore, we demonstrate that the already established condition for anomaly localization is sufficient but not necessary. A necessary and sufficient condition that minimizes the localization cost drastically is established. The problems are demonstrated to be NP-Hard. Scalable and near-optimal heuristic algorithms are proposed.
316

Contribution to the interpretation of evolving communities in complex networks : Application to the study of social interactions / Contribution à l’interprétation des communautés en évolution dans des réseaux complexes : Application à l’étude des interactions sociales

Orman, Keziban 16 July 2014 (has links)
Les réseaux complexes constituent un outil pratique pour modéliser les systèmes complexes réels. Pour cette raison, ils sont devenus très populaires au cours de la dernière décennie. De nombreux outils existent pour étudier les réseaux complexes. Parmi ceux-ci, la détection de la communauté est l’un des plus importants. Une communauté est grossièrement définie comme un groupe de nœuds plus densément connectés entre eux qu’avec le reste du réseau. Dans la littérature, cette définition intuitive a été formalisée de plusieurs différentes façons, ce qui a conduit à d’innombrables méthodes et variantes permettant de les détecter. Du point de vue applicatif, le sens des communautés est aussi important que leur détection. Cependant, bien que la tâche de détection de communautés en elle-même ait attiré énormément d’attention, le problème de leur interprétation n’a pas été sérieusement abordé jusqu’à présent. Dans cette thèse, nous voyons l’interprétation des communautés comme un problème indépendant du processus de leur détection, consistant à identifier les éléments leurs caractéristiques les plus typiques. Nous le décomposons en deux sous-problèmes : 1) trouver un moyen approprié pour représenter une communauté ; et 2) sélectionner de façon objective les parties les plus caractéristiques de cette représentation. Pour résoudre ces deux sous-problèmes, nous exploitons l’information encodée dans les réseaux dynamiques attribués. Nous proposons une nouvelle représentation des communautés sous la forme de séquences temporelles de descripteurs associés à chaque nœud individuellement. Ces descripteurs peuvent être des mesures topologiques et des attributs nodaux. Nous détectons ensuite les motifs séquentiels émergents dans cet ensemble de données, afin d’identifier les ceux qui sont les plus caractéristiques de la communauté. Nous effectuons une validation de notre procédé sur des réseaux attribués dynamiques générés artificiellement. A cette occasion, nous étudions son comportement relativement à des changements structurels de la structure de communautés, à des modifications des valeurs des attributs. Nous appliquons également notre procédé à deux systèmes du monde réel : un réseau de collaborations scientifiques issu de DBLP, et un réseau d’interactions sociales et musicales tiré du service LastFM. Nos résultats montrent que les communautés détectées ne sont pas complètement homogènes. Certaines communautés sont composées de petits groupes de nœuds qui ont tendance à évoluer ensemble au cours du temps, que ce soit en termes de propriétés individuelles ou collectives. Les anomalies détectées correspondent généralement à des profils typiques : nœuds mal placés par l’outil de détection de communautés, ou nœuds différant des tendances de leur communautés sur certains points, et/ou non-synchrones avec l’évolution de leur communauté, ou encore nœuds complètement différents. / Complex Networks constitute a convenient tool to model real-world complex systems. For this reason, they have become very popular in the last decade. Many tools exist to study complex networks. Among them, community detection is one of the most important. A community is roughly defined as a group of nodes more connected internally than to the rest of the network. In the literature, this intuitive definition has been formalized in many ways, leading to countless different methods and variants to detect communities. In the large majority of cases, the result of these methods is set of node groups in which each node group corresponds to a community. From the applicative point of view, the meaning of these groups is as important as their detection. However, although the task of detecting communities in itself took a lot of attraction, the problem of interpreting them has not been properly tackled until now. In this thesis, we see the interpretation of communities as a problem independent from the community detection process, consisting in identifying the most characteristic features of communities. We break it down into two sub-problems: 1) finding an appropriate way to represent a community and 2) objectively selecting the most characteristic parts of this representation. To solve them, we take advantage of the information encoded in dynamic attributed networks. We propose a new representation of communities under the form of temporal sequences of topological measures and attribute values associated to individual nodes. We then look for emergent sequential patterns in this dataset, in order to identify the most characteristic community features. We perform a validation of our framework on artificially generated dynamic attributed networks. At this occasion, we study its behavior relatively to changes in the temporal evolution of the communities, and to the distribution and evolution of nodal features. We also apply our framework to real-world systems: a DBLP network of scientific collaborations, and a LastFM network of social and musical interactions. Our results show that the detected communities are not completely homogeneous, in the sense several node topic or interests can be identified for a given community. Some communities are composed of smaller groups of nodes which tend to evolve together as time goes by, be it in terms of individual (attributes, topological measures) or relational (community migration) features. The detected anomalies generally fit some generic profiles: nodes misplaced by the community detection tool, nodes relatively similar to their communities, but also significantly different on certain features and/or not synchronized with their community evolution, and finally nodes with completely different interests.
317

Anomaly Detection in Categorical Data with Interpretable Machine Learning : A random forest approach to classify imbalanced data

Yan, Ping January 2019 (has links)
Metadata refers to "data about data", which contains information needed to understand theprocess of data collection. In this thesis, we investigate if metadata features can be usedto detect broken data and how a tree-based interpretable machine learning algorithm canbe used for an effective classification. The goal of this thesis is two-fold. Firstly, we applya classification schema using metadata features for detecting broken data. Secondly, wegenerate the feature importance rate to understand the model’s logic and reveal the keyfactors that lead to broken data. The given task from the Swedish automotive company Veoneer is a typical problem oflearning from extremely imbalanced data set, with 97 percent of data belongs healthy dataand only 3 percent of data belongs to broken data. Furthermore, the whole data set containsonly categorical variables in nominal scales, which brings challenges to the learningalgorithm. The notion of handling imbalanced problem for continuous data is relativelywell-studied, but for categorical data, the solution is not straightforward. In this thesis, we propose a combination of tree-based supervised learning and hyperparametertuning to identify the broken data from a large data set. Our methods arecomposed of three phases: data cleaning, which is eliminating ambiguous and redundantinstances, followed by the supervised learning algorithm with random forest, lastly, weapplied a random search for hyper-parameter optimization on random forest model. Our results show empirically that tree-based ensemble method together with a randomsearch for hyper-parameter optimization have made improvement to random forest performancein terms of the area under the ROC. The model outperformed an acceptableclassification result and showed that metadata features are capable of detecting brokendata and providing an interpretable result by identifying the key features for classificationmodel.
318

Contribution à la modélisation et à la détection d'anomalies du traffic Internet à partir de mesures d'un coeur de réseau opérateur / Contribution to Internet traffic modelling and anomaly detection based on ISP backbone measurements

Grandemange, Quentin 06 April 2018 (has links)
Grâce au partenariat avec l'entreprise luxembourgeoise Post Luxembourg, nous avons pu tester différentes méthodes pour mesurer le trafic interdomaine à la bordure de leur réseau avec Internet. Le choix s'est porté sur une technologie existante : Netflow. Avec ces données nous avons pu réaliser diverses analyses afin de comprendre l'évolution du trafic en fonction de différents paramètres comme l'heure de la journée, le jour de la semaine... D'après ces analyses, plusieurs solutions ont été envisagées pour modéliser le trafic. Deux méthodes ont été proposées et testées sur des données réelles : une méthode d'analyse de séries temporelles et une méthode de machine learning reposant sur les processus gaussiens. Ces techniques ont été comparées sur différents systèmes autonomes. Les résultats sont satisfaisants pour les deux méthodes avec un avantage pour la méthode des processus gaussiens. Cette thèse propose le développement d'une solution logicielle ANODE mise en production chez Post Luxembourg et permettant l'analyse de bout en bout du trafic de cœur de réseau : mesure de données, modélisation, prédiction et détection d'anomalies / Inter-domain routing statistics are not usually publicly available but with the partnership with Post Luxembourg, we deployed a network wide measurements of Internet traffic. Those statistics show clear daily and weekly pattern and several points of interest. From all the information gathered, two modelling approach were chosen: the first one from the time series domain and the second one from the machine learning approach. Both were tested on several dataset of autonomous systems and the second one, Gaussian Process, was kept for the next steps. The proposal of this study is the development of a software solution called ANODE, which is used at Post Luxembourg, allowing the analysis of backbone traffic: measurments, modelling, forecasting and anomaly detection
319

MACHINE LEARNING FOR MECHANICAL ANALYSIS

Bengtsson, Sebastian January 2019 (has links)
It is not reliable to depend on a persons inference on dense data of high dimensionality on a daily basis. A person will grow tired or become distracted and make mistakes over time. Therefore it is desirable to study the feasibility of replacing a persons inference with that of Machine Learning in order to improve reliability. One-Class Support Vector Machines (SVM) with three different kernels (linear, Gaussian and polynomial) are implemented and tested for Anomaly Detection. Principal Component Analysis is used for dimensionality reduction and autoencoders are used with the intention to increase performance. Standard soft-margin SVMs were used for multi-class classification by utilizing the 1vsAll and 1vs1 approaches with the same kernels as for the one-class SVMs. The results for the one-class SVMs and the multi-class SVM methods are compared against each other within their respective applications but also against the performance of Back-Propagation Neural Networks of varying sizes. One-Class SVMs proved very effective in detecting anomalous samples once both Principal Component Analysis and autoencoders had been applied. Standard SVMs with Principal Component Analysis produced promising classification results. Twin SVMs were researched as an alternative to standard SVMs.
320

Space-efficient data sketching algorithms for network applications

Hua, Nan 06 July 2012 (has links)
Sketching techniques are widely adopted in network applications. Sketching algorithms “encode” data into succinct data structures that can later be accessed and “decoded” for various purposes, such as network measurement, accounting, anomaly detection and etc. Bloom filters and counter braids are two well-known representatives in this category. Those sketching algorithms usually need to strike a tradeoff between performance (how much information can be revealed and how fast) and cost (storage, transmission and computation). This dissertation is dedicated to the research and development of several sketching techniques including improved forms of stateful Bloom Filters, Statistical Counter Arrays and Error Estimating Codes. Bloom filter is a space-efficient randomized data structure for approximately representing a set in order to support membership queries. Bloom filter and its variants have found widespread use in many networking applications, where it is important to minimize the cost of storing and communicating network data. In this thesis, we propose a family of Bloom Filter variants augmented by rank-indexing method. We will show such augmentation can bring a significant reduction of space and also the number of memory accesses, especially when deletions of set elements from the Bloom Filter need to be supported. Exact active counter array is another important building block in many sketching algorithms, where storage cost of the array is of paramount concern. Previous approaches reduce the storage costs while either losing accuracy or supporting only passive measurements. In this thesis, we propose an exact statistics counter array architecture that can support active measurements (real-time read and write). It also leverages the aforementioned rank-indexing method and exploits statistical multiplexing to minimize the storage costs of the counter array. Error estimating coding (EEC) has recently been established as an important tool to estimate bit error rates in the transmission of packets over wireless links. In essence, the EEC problem is also a sketching problem, since the EEC codes can be viewed as a sketch of the packet sent, which is decoded by the receiver to estimate bit error rate. In this thesis, we will first investigate the asymptotic bound of error estimating coding by viewing the problem from two-party computation perspective and then investigate its coding/decoding efficiency using Fisher information analysis. Further, we develop several sketching techniques including Enhanced tug-of-war(EToW) sketch and the generalized EEC (gEEC)sketch family which can achieve around 70% reduction of sketch size with similar estimation accuracies. For all solutions proposed above, we will use theoretical tools such as information theory and communication complexity to investigate how far our proposed solutions are away from the theoretical optimal. We will show that the proposed techniques are asymptotically or empirically very close to the theoretical bounds.

Page generated in 0.0807 seconds