  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Prediction and Anomaly Detection Techniques for Spatial Data

Liu, Xutong 11 June 2013
With increasing public sensitivity to and concern about environmental issues, huge amounts of spatial data have been collected, from location-based social network applications to scientific data. This has encouraged the formation of large spatial data sets and generated considerable interest in identifying novel and meaningful patterns. Allowing correlated observations weakens the usual statistical assumption of independent observations and complicates spatial analysis. This research focuses on the construction of efficient and effective approaches for three main mining tasks: spatial outlier detection, robust inference for spatial data sets, and spatial prediction for large multivariate non-Gaussian data. Spatial outlier analysis, which aims at detecting abnormal objects in spatial contexts, can help extract important knowledge in many applications. Most existing approaches suffer from the well-known masking and swamping problems and still cannot satisfy certain requirements that have arisen recently. This research develops spatial outlier detection techniques in three areas: spatial numerical outlier detection, spatial categorical outlier detection, and identification of the number of spatial numerical outliers. First, this report introduces random-walk-based approaches to identify spatial numerical outliers. Bipartite and Exhaustive Combination weighted graphs are modeled based on spatial and/or non-spatial attributes, and random walk techniques are then performed on the graphs to compute the relevance among objects. The objects with lower relevance are recognized as outliers. Second, an entropy-based method is proposed to estimate the optimum number of outliers. According to entropy theory, we expect that, by incrementally removing outliers, the entropy value will decrease sharply and reach a stable state when all the outliers have been removed.
Finally, this research designs several Pair Correlation Function based methods to detect spatial categorical outliers (SCOs) for both single- and multiple-attribute data. In these methods, a Pair Correlation Ratio (PCR) is defined and estimated for each pair of categorical combinations based on their co-occurrence frequency at different spatial distances. The observations with the lower PCRs are diagnosed as potential SCOs. Spatial kriging is a widely used predictive model whose predictive accuracy can be significantly compromised if the observations are contaminated by outliers. Also, due to spatial heterogeneity, observations are often of different types. The prediction of multivariate spatial processes plays an important role when there are cross-spatial dependencies between multiple responses. In addition, given the large volume of spatial data, prediction is computationally challenging. These raise three research topics: (1) robust prediction for spatial data sets; (2) prediction of multivariate spatial observations; and (3) efficient processing for large data sets. First, the robustness of the spatial kriging model can be systematically increased by integrating heavy-tailed distributions. However, inference then becomes analytically intractable. Here, we present a novel Robust and Reduced Rank Spatial Kriging Model (R$^3$-SKM), which is resilient to the influence of outliers and allows for fast spatial inference. Second, this research introduces a flexible hierarchical Bayesian framework that permits the simultaneous modeling of mixed-type variables. Specifically, the mixed-type attributes are mapped to latent numerical random variables that are multivariate Gaussian in nature. Finally, knot-based techniques are utilized to model the predictive process as a reduced rank spatial process, which projects the process realizations of the spatial model to a lower dimensional subspace. This projection significantly reduces the computational cost. / Ph. D.
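The entropy-based stopping idea in the abstract above can be illustrated with a small sketch. This is not the thesis's method: it assumes one-dimensional, non-spatial values, uses the entropy of a fixed-width histogram, and ranks candidates by distance from the median. The names histogram_entropy and estimate_outlier_count, the bin width, and the tolerance are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from statistics import median


def histogram_entropy(values, bin_width):
    """Shannon entropy (in bits) of a fixed-width histogram of the values."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def estimate_outlier_count(values, bin_width, tol, max_remove):
    """Remove the most extreme point repeatedly until entropy stops dropping.

    Assumption: "most extreme" means farthest from the current median, and
    "stable" means the entropy drop of the next removal is below `tol`.
    """
    remaining = list(values)
    previous = histogram_entropy(remaining, bin_width)
    removed = 0
    while removed < max_remove and len(remaining) > 2:
        centre = median(remaining)
        candidate = max(remaining, key=lambda v: abs(v - centre))
        trial = list(remaining)
        trial.remove(candidate)
        current = histogram_entropy(trial, bin_width)
        if previous - current < tol:
            break  # entropy has stabilised, so stop removing points
        remaining, previous = trial, current
        removed += 1
    return removed


# Twenty values clustered near 10, plus three values far from the cluster.
sample = [9.0 + 0.1 * i for i in range(20)] + [50.0, 80.0, -40.0]
estimated = estimate_outlier_count(sample, bin_width=5.0, tol=0.05, max_remove=10)
```

Each distant value occupies its own histogram bin, so removing it lowers the entropy noticeably, while removing a clustered value barely changes it. A spatial version would score objects against their spatial neighbours rather than a global median.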
12

Tuning and Optimising Concept Drift Detection

Do, Ethan Quoc-Nam January 2021
Data drifts naturally occur in data streams due to seasonality, changes in data usage, and the data generation process. Concepts modelled via the data streams will also experience such drift. Differentiating concept drift from anomalies is important for identifying normal versus abnormal behaviour. Existing techniques achieve poor responsiveness and accuracy on this differentiation task. We take two approaches to address this problem. First, we extend an existing sliding window algorithm to include multiple windows to model recently seen data stream patterns, and define new parameters to compare the data streams. Second, we study a set of optimisers and tune a Bi-LSTM model's parameters to maximize accuracy. / Thesis / Master of Applied Science (MASc)
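One way to picture the multi-window idea above is a detector that keeps a short and a long window over the stream. The sketch below is an illustration under stated assumptions, not the algorithm from the thesis: the window sizes, the threshold, the use of a mean for the short window and a median for the long one, and the names MultiWindowDetector and classify_stream are all hypothetical.

```python
from collections import deque
from statistics import mean, median


class MultiWindowDetector:
    """Labels each stream value as warmup, normal, anomaly, or drift."""

    def __init__(self, reference, short=3, long=10, threshold=2.0):
        self.baseline = mean(reference)   # level of the reference data
        self.short = deque(maxlen=short)  # reacts quickly to spikes
        self.long = deque(maxlen=long)    # reacts only to sustained shifts
        self.threshold = threshold

    def update(self, value):
        self.short.append(value)
        self.long.append(value)
        if len(self.long) < self.long.maxlen:
            return "warmup"
        # The long-window median ignores brief spikes, so a shift here
        # is treated as a sustained change in the underlying concept.
        if abs(median(self.long) - self.baseline) > self.threshold:
            return "drift"
        # A shift seen only in the short window is treated as transient.
        if abs(mean(self.short) - self.baseline) > self.threshold:
            return "anomaly"
        return "normal"


def classify_stream(stream, reference):
    detector = MultiWindowDetector(reference)
    return [detector.update(value) for value in stream]


# A flat stream, one spike, a short recovery, then a sustained level change.
stream = [0.0] * 10 + [30.0] + [0.0] * 3 + [5.0] * 10
labels = classify_stream(stream, reference=[0.0] * 10)
```

The design choice here is that an anomaly disturbs only the short window, while drift eventually dominates the long window as well.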
13

Spatio-Temporal Anomaly Detection

Das, Mahashweta January 2009
No description available.
14

Threat Detection in Program Execution and Data Movement: Theory and Practice

Shu, Xiaokui 25 June 2016
Program attacks are among the oldest and most fundamental cyber threats. They compromise the confidentiality of data, the integrity of program logic, and the availability of services. This threat becomes even more severe when followed by other malicious activities such as data exfiltration. The integration of primitive attacks constructs comprehensive attack vectors and forms advanced persistent threats. Despite the rapid development of defense mechanisms, program attacks and data leak threats survive and evolve. Stealthy program attacks can hide in long execution paths to avoid being detected. Sensitive data transformations weaken existing leak detection mechanisms. New adversaries, e.g., semi-honest service providers, emerge and form threats. This thesis presents theoretical analysis and practical detection mechanisms against stealthy program attacks and data leaks. The thesis presents a unified framework for understanding different branches of program anomaly detection and sheds light on possible future program anomaly detection directions. The thesis investigates modern stealthy program attacks hidden in long program executions and develops a program anomaly detection approach with data mining techniques to reveal the attacks. The thesis advances network-based data leak detection mechanisms by relaxing strong requirements in existing methods. The thesis presents practical solutions to outsource data leak detection procedures to semi-honest third parties and identify noisy or transformed data leaks in network traffic. / Ph. D.
15

Discovery of Triggering Relations and Its Applications in Network Security and Android Malware Detection

Zhang, Hao 30 November 2015
An increasing variety of malware, including spyware, worms, and bots, threatens data confidentiality and system integrity on computing devices ranging from backend servers to mobile devices. To address these threats, exacerbated by dynamic network traffic patterns and growing volumes, network security has been undergoing major changes to improve accuracy and scalability in the security analysis techniques. This dissertation addresses the problem of detecting the network anomalies on a single device by inferring the traffic dependence to ensure the root-triggers. In particular, we propose a dependence model for illustrating the network traffic causality. This model depicts the triggering relation of network requests, and thus can be used to reason about the occurrences of network events and pinpoint stealthy malware activities. The triggering relationships can be inferred by means of both rule-based and learning-based approaches. The rule-based approach originates from several heuristic algorithms based on the domain knowledge. The learning-based approach discovers the triggering relationship using a pairwise comparison operation that converts the requests into event pairs with comparable attributes. Machine learning classifiers predict the triggering relationship and further reason about the legitimacy of requests by enforcing their root-triggers. We apply our dependence model on the network traffic from a single host and a mobile device. Evaluated with real-world malware samples and synthetic attacks, our findings confirm that the traffic dependence model provides a significant source of semantic and contextual information that detects zero-day malicious applications. This dissertation also studies the usability of visualizing the traffic causality for domain experts. We design and develop a tool with a visual locality property. It supports different levels of visual based querying and reasoning required for the sensemaking process on complex network data. 
The significance of this dissertation research is that it provides deep insights into the dependency of network requests and leverages structural and semantic information, allowing us to reason about network behaviors and detect stealthy anomalies. / Ph. D.
16

Building trustworthy machine learning systems in adversarial environments

Wang, Ning 26 May 2023
Modern AI systems, particularly with the rise of big data and deep learning in the last decade, have greatly improved our daily life and at the same time created a long list of controversies. AI systems are often subject to malicious and stealthy subversion that jeopardizes their efficacy. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly boost the accuracy of machine learning models, they also create opportunities for adversaries to tamper with models or extract sensitive data. Malicious data providers can compromise machine learning systems by supplying false data and intermediate computation results. Even a well-trained model can be deceived into misbehaving by an adversary who provides carefully designed inputs. Furthermore, curious parties can derive sensitive information about the training data by interacting with a machine-learning model. These adversarial scenarios, known as poisoning attacks, adversarial example attacks, and inference attacks, have demonstrated that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust. To address these problems, we proposed the following solutions: (1) FLARE, which detects and mitigates stealthy poisoning attacks by leveraging latent space representations; (2) MANDA, which detects adversarial examples by utilizing evaluations from diverse sources, i.e., model-based prediction and data-based evaluation; (3) FeCo, which enhances the robustness of machine learning-based network intrusion detection systems by introducing a novel representation learning method; and (4) DP-FedMeta, which preserves data privacy and improves the privacy-accuracy trade-off in machine learning systems through a novel adaptive clipping mechanism.
/ Doctor of Philosophy / Over the past few decades, machine learning (ML) has become increasingly popular for enhancing efficiency and effectiveness in data analytics and decision-making. Notable applications include intelligent transportation, smart healthcare, natural language generation, intrusion detection, etc. While machine learning methods are often employed for beneficial purposes, they can also be exploited with malicious intent. Well-trained language models have demonstrated generalizability deficiencies and intrinsic biases; generative ML models used for creating art have been repurposed by fraudsters to produce deepfakes; and facial recognition models trained on big data have been found to leak sensitive information about data owners. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly improve the accuracy of ML models, they also enable adversaries to corrupt models and infer sensitive data. This leads to various adversarial attacks, such as model poisoning during training, adversarially crafted data in testing, and data inference. It is evident that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust. This research focuses on building trustworthy machine-learning systems in adversarial environments from a data perspective. It encompasses two themes: securing ML systems against security or privacy vulnerabilities (security of AI) and using ML as a tool to develop novel security solutions (AI for security). For the first theme, we studied adversarial attack detection in both the training and testing phases and proposed FLARE and MANDA to secure machine learning systems in the two phases, respectively. Additionally, we proposed a privacy-preserving learning system, dpfed, to defend against privacy inference attacks.
We achieved a good trade-off between accuracy and privacy by proposing an adaptive data clipping and perturbing method. In the second theme, the research is focused on enhancing the robustness of intrusion detection systems through data representation learning.
17

Anomaly detection techniques for unsupervised machine learning

Iivari, Albin January 2022
Anomalies in data can be of great importance as they often indicate faulty behaviour. Locating them can thus assist in finding the source of the issue. Isolation Forest, an unsupervised machine learning model used to detect anomalies, is evaluated against two other commonly used models. The data set consisted of log files from a company named Trimma. The log files contained information about the different events that were executed. Different types of events could differ in execution time. The models were then used to find logs where some event took longer than usual to execute. The feature created for the models was the percentage difference from the median of each job type. The comparison, made on various data set sizes using one feature, showed that Isolation Forest did not perform the best among the models with regard to execution time. Isolation Forest classified data points similarly to the other models. However, the smallest classified anomaly differed slightly from the other models. This discrepancy was only seen in the smaller anomalies; the larger deviations were consistently classified as anomalies by all models.
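The feature described in the abstract above, the percentage difference from the median of each job type, can be sketched as follows. The log layout of (job_type, seconds) pairs and the names pct_diff_from_median and flag_slow are assumptions for illustration. The fixed cutoff used to flag slow events is a plain threshold standing in for the models the thesis evaluates; it is not Isolation Forest.

```python
from collections import defaultdict
from statistics import median


def pct_diff_from_median(logs):
    """Percentage difference of each duration from its job type's median.

    Assumes `logs` is a list of (job_type, seconds) pairs and that every
    job type has a non-zero median duration.
    """
    durations = defaultdict(list)
    for job_type, seconds in logs:
        durations[job_type].append(seconds)
    medians = {job: median(values) for job, values in durations.items()}
    return [100.0 * (seconds - medians[job]) / medians[job]
            for job, seconds in logs]


def flag_slow(logs, cutoff_pct):
    """Indices of log entries slower than the median by more than the cutoff."""
    features = pct_diff_from_median(logs)
    return [i for i, value in enumerate(features) if value > cutoff_pct]


logs = [
    ("import", 10.0), ("import", 10.0), ("import", 12.0), ("import", 40.0),
    ("export", 2.0), ("export", 2.0), ("export", 2.0),
]
slow = flag_slow(logs, cutoff_pct=100.0)
```

Normalising by the per-type median puts fast and slow job types on the same scale, which is what allows a single feature to be shared across all events.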
18

Botnet detection techniques: review, future trends, and issues

Karim, A., Bin Salleh, R., Shiraz, M., Shah, S.A.A., Awan, Irfan U., Anuar, N.B. January 2014
In recent years, the Internet has enabled access to widespread remote services in the distributed computing environment; however, integrity of data transmission in the distributed computing platform is hindered by a number of security issues. For instance, the botnet phenomenon is a prominent threat to Internet security, including the threat of malicious code. The botnet phenomenon supports a wide range of criminal activities, including distributed denial of service (DDoS) attacks, click fraud, phishing, malware distribution, spam emails, and building machines for illegitimate exchange of information/materials. Therefore, it is imperative to design and develop a robust mechanism for improving the botnet detection, analysis, and removal process. Botnet detection techniques have already been reviewed in different ways; however, such studies are limited in scope and lack discussion of the latest botnet detection techniques. This paper presents a comprehensive review of the latest state-of-the-art techniques for botnet detection and identifies the trends of previous and current research. It provides a thematic taxonomy for the classification of botnet detection techniques and highlights the implications and critical aspects by qualitatively analyzing such techniques. Related to our comprehensive review, we highlight future directions for improving the schemes that broadly span the entire botnet detection research field and identify the persistent and prominent research challenges that remain open. / University of Malaya, Malaysia (No. FP034-2012A)
19

AUTOMATED HEALTH OPERATIONS FOR THE SAPPHIRE SPACECRAFT

Swartwout, Michael A., Kitts, Christopher A. 10 1900
International Telemetering Conference Proceedings / October 27-30, 1997 / Riviera Hotel and Convention Center, Las Vegas, Nevada / Stanford’s Space Systems Development Laboratory is developing methods for automated spacecraft health operations. Such operations greatly reduce the need for ground-space communication links and full-time operators. However, new questions emerge about how to supply operators with the spacecraft information that is no longer available. One solution is to introduce a low-bandwidth health beacon and to develop new approaches in on-board summarization of health data for telemetering. This paper reviews the development of beacon operations and data summary, describes the implementation of beacon-based health management on board SAPPHIRE, and explains the mission operations response to health emergencies. Additional information is provided on the role of SSDL’s academic partners in developing a worldwide network of beacon receiving stations.
20

Exploitation of signal information for mobile speed estimation and anomaly detection

Afgani, Mostafa Z. January 2011
Although the primary purpose of the signal received by a mobile handset or smartphone is to enable wireless communication, the information extracted can be reused to provide a number of additional services. Two such services discussed in this thesis are mobile speed estimation and signal anomaly detection. The proposed algorithms exploit the propagation environment specific information that is already imprinted on the received signal and therefore do not incur any additional signalling overhead. Speed estimation is useful for providing navigation and location based services in areas where global navigation satellite systems (GNSS) based devices are unusable, while the proposed anomaly detection algorithms can be used to locate signal faults and aid spectrum sensing in cognitive radio systems. The speed estimation algorithms described within this thesis require a receiver with at least two antenna elements and a wideband radio frequency (RF) signal source. The channel transfer functions observed at the antenna elements are compared to yield an estimate of the device speed. The basic algorithm is a one-dimensional and unidirectional two-antenna solution. The speed of the mobile receiver is estimated from knowledge of the fixed inter-antenna distance and the time it takes for the trailing antenna to sense similar channel conditions previously observed at the leading antenna. A by-product of the algorithm is an environment specific spatial correlation function which may be combined with theoretical models of spatial correlation to extend and improve the accuracy of the algorithm. Results obtained via computer simulations are provided. The anomaly detection algorithms proposed in this thesis highlight unusual signal features while ignoring events that are nominal. When the test signal possesses a periodic frame structure, Kullback-Leibler divergence (KLD) analysis is employed to statistically compare successive signal frames.
A method of automatically extracting the required frame period information from the signal is also provided. When the signal under test lacks a periodic frame structure, information content analysis of signal events can be used instead. Clean training data is required by this algorithm to initialise the reference event probabilities. In addition to the results obtained from extensive computer simulations, an architecture for field-programmable gate array (FPGA) based hardware implementations of the KLD based algorithm is provided. Results showing the performance of the algorithms against real test signals captured over the air are also presented. Both sets of algorithms are simple, effective and have low computational complexity – implying that real-time implementations on platforms with limited processing power and energy are feasible. This is an important quality since location based services are expected to be an integral part of next generation cognitive radio handsets.
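The KLD comparison of successive frames mentioned in the abstract above can be sketched in a few lines. This is a simplified illustration, not the thesis's algorithm: it assumes the signal is already quantised into discrete symbols, that the frame period is known rather than extracted automatically, and that add-one smoothing is acceptable to avoid zero probabilities. The names frame_distribution, kld, and anomalous_frames and the threshold value are hypothetical.

```python
import math
from collections import Counter


def frame_distribution(frame, alphabet, smoothing=1.0):
    """Smoothed empirical symbol distribution of one frame."""
    counts = Counter(frame)
    total = len(frame) + smoothing * len(alphabet)
    return {s: (counts[s] + smoothing) / total for s in alphabet}


def kld(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[s] * math.log2(p[s] / q[s]) for s in p)


def anomalous_frames(signal, period, threshold):
    """Indices of frames whose distribution diverges from the previous frame."""
    alphabet = sorted(set(signal))
    frames = [signal[i:i + period]
              for i in range(0, len(signal) - period + 1, period)]
    dists = [frame_distribution(frame, alphabet) for frame in frames]
    return [i for i in range(1, len(dists))
            if kld(dists[i], dists[i - 1]) > threshold]


# Six frames of period 4; the fourth frame (index 3) breaks the pattern.
signal = [0, 1, 0, 1] * 3 + [1, 1, 1, 1] + [0, 1, 0, 1] * 2
flagged = anomalous_frames(signal, period=4, threshold=0.1)
```

Because each frame is compared with its predecessor, both the unusual frame and the first normal frame after it are flagged; comparing against a running reference distribution instead would flag only the unusual frame.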
