Global ETD Search

21	Building trustworthy machine learning systems in adversarial environments Wang, Ning 26 May 2023 (has links) Modern AI systems, particularly with the rise of big data and deep learning in the last decade, have greatly improved our daily life and at the same time created a long list of controversies. AI systems are often subject to malicious and stealthy subversion that jeopardizes their efficacy. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly boost the accuracy of machine learning models, they also create opportunities for adversaries to tamper with models or extract sensitive data. Malicious data providers can compromise machine learning systems by supplying false data and intermediate computation results. Even a well-trained model can be deceived to misbehave by an adversary who provides carefully designed inputs. Furthermore, curious parties can derive sensitive information of the training data by interacting with a machine-learning model. These adversarial scenarios, known as poisoning attack, adversarial example attack, and inference attack, have demonstrated that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust. To address these problems, we proposed the following solutions: (1) FLARE, which detects and mitigates stealthy poisoning attacks by leveraging latent space representations; (2) MANDA, which detects adversarial examples by utilizing evaluations from diverse sources, i.e, model-based prediction and data-based evaluation; (3) FeCo which enhances the robustness of machine learning-based network intrusion detection systems by introducing a novel representation learning method; and (4) DP-FedMeta, which preserves data privacy and improves the privacy-accuracy trade-off in machine learning systems through a novel adaptive clipping mechanism. / Doctor of Philosophy / Over the past few decades, machine learning (ML) has become increasingly popular for enhancing efficiency and effectiveness in data analytics and decision-making. Notable applications include intelligent transportation, smart healthcare, natural language generation, intrusion detection, etc. While machine learning methods are often employed for beneficial purposes, they can also be exploited for malicious intents. Well-trained language models have demonstrated generalizability deficiencies and intrinsic biases; generative ML models used for creating art have been repurposed by fraudsters to produce deepfakes; and facial recognition models trained on big data have been found to leak sensitive information about data owners. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly improve the accuracy of ML models, they also enable adversaries to corrupt models and infer sensitive data. This leads to various adversarial attacks, such as model poisoning during training, adversarially crafted data in testing, and data inference. It is evident that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust. This research focuses on building trustworthy machine-learning systems in adversarial environments from a data perspective. It encompasses two themes: securing ML systems against security or privacy vulnerabilities (security of AI) and using ML as a tool to develop novel security solutions (AI for security). For the first theme, we studied adversarial attack detection in both the training and testing phases and proposed FLARE and MANDA to secure matching learning systems in the two phases, respectively. Additionally, we proposed a privacy-preserving learning system, dpfed, to defend against privacy inference attacks. We achieved a good trade-off between accuracy and privacy by proposing an adaptive data clipping and perturbing method. In the second theme, the research is focused on enhancing the robustness of intrusion detection systems through data representation learning. adversarial machine learning anomaly detection differential privacy
22	Anomaly Detection for Smart Infrastructure: An Unsupervised Approach for Time Series Comparison Gandra, Harshitha 25 January 2022 (has links) Time series anomaly detection can prove to be a very useful tool to inspect and maintain the health and quality of an infrastructure system. While tackling such a problem, the main concern lies in the imbalanced nature of the dataset. In order to mitigate this problem, this thesis proposes two unsupervised anomaly detection frameworks. The first one is an architecture which leverages the concept of matrix profile which essentially refers to a data structure containing the euclidean scores of the subsequences of two time series that is obtained through a similarity join.It is an architecture comprising of a data fusion technique coupled with using matrix profile analysis under the constraints of varied sampling rate for different time series. To this end, we have proposed a framework, through which a time series that is being evaluated for anomalies is quantitatively compared with a benchmark (anomaly-free) time series using the proposed asynchronous time series comparison that was inspired by matrix profile approach for anomaly detection on time series . In order to evaluate the efficacy of this framework, it was tested on a case study comprising of a Class I Rail road dataset. The data collection system integrated into this railway system collects data through different data acquisition channels which represent different transducers. This framework was applied to all the channels and the best performing channels were identified. The average Recall and Precision achieved on the single channel evaluation through this framework was 93.5% and 55% respectively with an error threshold of 0.04 miles or 211 feet. A limitation that was noticed in this framework was that there were some false positive predictions. In order to overcome this problem, a second framework has been proposed which incorporates the idea of extracting signature patterns in a time series also known as motifs which can be leveraged to identify anomalous patterns. This second framework proposed is a motif based framework which operates under the same constraints of a varied sampling rate. Here, a feature extraction method and a clustering method was used in the training process of a One Class Support Vector Machine (OCSVM) coupled with a Kernel Density Estimation (KDE) technique. The average Recall and Precision achieved on the same case study through this frame work was 74% and 57%. In comparison to the first, the second framework does not perform as well. There will be future efforts focused on improving this classification-based anomaly detection method / Master of Science / Time series anomaly detection refers to the identification of any outliers or deviations present in a time series data. This technique could prove to be useful to mitigate any unplanned events by facilitating early maintenance. The first method proposed involves comparing an anomaly-free dataset with the time series of interest. The difference between these two time series are noted and the point with the highest difference will be considered to be an anomaly. The performance of this model was evaluated on a Rail road dataset and the cumuluative average Recall (how useful the predictions are) and average Precison (how accurate the predictions are) 93.5% and 55% respectively with an acceptable error range of 0.04 miles or 211 feet. The second method proposed involves extracting all segments in the anomaly-free dataset and grouping them according to their similarity. Here, a OCSVM is used to train these individual groups. OCSVM is a machine learning algorithm which learns to classify a data as either anomalous or normal. It is then coupled with the KDE which creates a distribution across all the anomalies and identifies the anomaly as one with a high distribution of predictions.The performance of this model was evaluated on a Rail road dataset and the cumulative average Recall and cumulative average Precision 74% and 57% respectively with an acceptable error range of 0.04 miles or 211 feet. Anomaly Detection Asynchronous Unsupervised Matrix Profile OCSVM
23	Semi-supervised and Self-evolving Learning Algorithms with Application to Anomaly Detection in Cloud Computing Pannu, Husanbir Singh 12 1900 (has links) Semi-supervised learning (SSL) is the most practical approach for classification among machine learning algorithms. It is similar to the humans way of learning and thus has great applications in text/image classification, bioinformatics, artificial intelligence, robotics etc. Labeled data is hard to obtain in real life experiments and may need human experts with experimental equipments to mark the labels, which can be slow and expensive. But unlabeled data is easily available in terms of web pages, data logs, images, audio, video les and DNA/RNA sequences. SSL uses large unlabeled and few labeled data to build better classifying functions which acquires higher accuracy and needs lesser human efforts. Thus it is of great empirical and theoretical interest. We contribute two SSL algorithms (i) adaptive anomaly detection (AAD) (ii) hybrid anomaly detection (HAD), which are self evolving and very efficient to detect anomalies in a large scale and complex data distributions. Our algorithms are capable of modifying an existing classier by both retiring old data and adding new data. This characteristic enables the proposed algorithms to handle massive and streaming datasets where other existing algorithms fail and run out of memory. As an application to semi-supervised anomaly detection and for experimental illustration, we have implemented a prototype of the AAD and HAD systems and conducted experiments in an on-campus cloud computing environment. Experimental results show that the detection accuracy of both algorithms improves as they evolves and can achieve 92.1% detection sensitivity and 83.8% detection specificity, which makes it well suitable for anomaly detection in large and streaming datasets. We compared our algorithms with two popular SSL methods (i) subspace regularization (ii) ensemble of Bayesian sub-models and decision tree classifiers. Our contributed algorithms are easy to implement, significantly better in terms of space, time complexity and accuracy than these two methods for semi-supervised anomaly detection mechanism. Machine learning anomaly detection cloud computing
24	Threat Detection in Program Execution and Data Movement: Theory and Practice Shu, Xiaokui 25 June 2016 (has links) Program attacks are one of the oldest and fundamental cyber threats. They compromise the confidentiality of data, the integrity of program logic, and the availability of services. This threat becomes even severer when followed by other malicious activities such as data exfiltration. The integration of primitive attacks constructs comprehensive attack vectors and forms advanced persistent threats. Along with the rapid development of defense mechanisms, program attacks and data leak threats survive and evolve. Stealthy program attacks can hide in long execution paths to avoid being detected. Sensitive data transformations weaken existing leak detection mechanisms. New adversaries, e.g., semi-honest service provider, emerge and form threats. This thesis presents theoretical analysis and practical detection mechanisms against stealthy program attacks and data leaks. The thesis presents a unified framework for understanding different branches of program anomaly detection and sheds light on possible future program anomaly detection directions. The thesis investigates modern stealthy program attacks hidden in long program executions and develops a program anomaly detection approach with data mining techniques to reveal the attacks. The thesis advances network-based data leak detection mechanisms by relaxing strong requirements in existing methods. The thesis presents practical solutions to outsource data leak detection procedures to semi-honest third parties and identify noisy or transformed data leaks in network traffic. / Ph. D. Cybersecurity Program Anomaly Detection Data Leak Detection
25	Discovery of Triggering Relations and Its Applications in Network Security and Android Malware Detection Zhang, Hao 30 November 2015 (has links) An increasing variety of malware, including spyware, worms, and bots, threatens data confidentiality and system integrity on computing devices ranging from backend servers to mobile devices. To address these threats, exacerbated by dynamic network traffic patterns and growing volumes, network security has been undergoing major changes to improve accuracy and scalability in the security analysis techniques. This dissertation addresses the problem of detecting the network anomalies on a single device by inferring the traffic dependence to ensure the root-triggers. In particular, we propose a dependence model for illustrating the network traffic causality. This model depicts the triggering relation of network requests, and thus can be used to reason about the occurrences of network events and pinpoint stealthy malware activities. The triggering relationships can be inferred by means of both rule-based and learning-based approaches. The rule-based approach originates from several heuristic algorithms based on the domain knowledge. The learning-based approach discovers the triggering relationship using a pairwise comparison operation that converts the requests into event pairs with comparable attributes. Machine learning classifiers predict the triggering relationship and further reason about the legitimacy of requests by enforcing their root-triggers. We apply our dependence model on the network traffic from a single host and a mobile device. Evaluated with real-world malware samples and synthetic attacks, our findings confirm that the traffic dependence model provides a significant source of semantic and contextual information that detects zero-day malicious applications. This dissertation also studies the usability of visualizing the traffic causality for domain experts. We design and develop a tool with a visual locality property. It supports different levels of visual based querying and reasoning required for the sensemaking process on complex network data. The significance of this dissertation research is in that it provides deep insights on the dependency of network requests, and leverages structural and semantic information, allowing us to reason about network behaviors and detect stealthy anomalies. / Ph. D. Network Security Stealthy Malware Anomaly Detection
26	Application of a Layered Hidden Markov Model in the Detection of Network Attacks Taub, Lawrence 01 January 2013 (has links) Network-based attacks against computer systems are a common and increasing problem. Attackers continue to increase the sophistication and complexity of their attacks with the goal of removing sensitive data or disrupting operations. Attack detection technology works very well for the detection of known attacks using a signature-based intrusion detection system. However, attackers can utilize attacks that are undetectable to those signature-based systems whether they are truly new attacks or modified versions of known attacks. Anomaly-based intrusion detection systems approach the problem of attack detection by detecting when traffic differs from a learned baseline. In the case of this research, the focus was on a relatively new area known as payload anomaly detection. In payload anomaly detection, the system focuses exclusively on the payload of packets and learns the normal contents of those payloads. When a payload's contents differ from the norm, an anomaly is detected and may be a potential attack. A risk with anomaly-based detection mechanisms is they suffer from high false positive rates which reduce their effectiveness. This research built upon previous research in payload anomaly detection by combining multiple techniques of detection in a layered approach. The layers of the system included a high-level navigation layer, a request payload analysis layer, and a request-response analysis layer. The system was tested using the test data provided by some earlier payload anomaly detection systems as well as new data sets. The results of the experiments showed that by combining these layers of detection into a single system, there were higher detection rates and lower false positive rates. anomaly detection hidden markov model intrusion detection network security payload anomaly detection security Computer Sciences
27	Anomaly detection techniques for unsupervised machine learning Iivari, Albin January 2022 (has links) Anomalies in data can be of great importance as they often indicate faulty behaviour. Locating these can thus assist in finding the source of the issue. Isolation Forest, an unsupervised machine learning model used to detect anomalies, is evaluated against two other commonly used models. The data set used were log files from a company named Trimma. The log files contained information about different events that executed. Different types of event could differ in execution time. The models were then used to find logs where some event took longer than usual to execute. The feature created for the models was a percentual difference from the median of each job type. The comparison made on various data set sizes, using one feature, showed that Isolation Forest did not perform the best with regard to execution time among the models. Isolation Forest classified similar data points compared to the other models. However, the smallest classified anomaly differed a bit from the other models. This discrepancy was only seen in the smaller anomalies, the larger deviations were consistently classified as anomalies by all models. Machine learning unsupervised anomaly detection anomaly detection in logfiles isolation forest cblof Computer Sciences Datavetenskap (datalogi)
28	Anomaly Detection in Snus Manufacturing : A machine learning approach for quality assurance / Avvikelseidentifiering inom snustillverkning : En maskininlärningsttillämpning för kvalitetskontroll Duberg, Melker January 2023 (has links) The art of anomaly detection is a relevant topic for most producing companies since it allows for real-time quality assurance in production. However, previous research is lacking on the applicability of anomaly detection methods on non-synthetic image datasets. Using a dataset provided by Swedish Match consisting of 943 images of snus cans without lids, we offer an extension to a recent anomaly detection benchmark study by assessing how 29 anomaly detection algorithms perform on our non-synthetic dataset. The results showed that fully supervised methods performed the best, and that labelled data significantly improved model performance. Although the achieved results were not satisfactory in terms of AUCROC and AUCPR, there were clear indications that performance can be improved by increasing the amount of training data. The best-performing model was Logistic Regression. / Avvikelsedetektering är ett relevant ämne för de flesta aktörerna inom tillverkningsindustrin eftersom det möjliggör kvalitetssäkring i realtid i produktionskedjor. I tidigare forskning har det saknats studier gjorda med verklighetstrogna, icke-syntetiska dataset. Med hjälp av ett dataset tillhandahållet av Swedish Match bestående av 943 bilder på öppna snusdosor tillför vi en vetenskaplig påbyggnad till en nyligen publicerad jämförelsestudie inom avvikelsedetektering. Detta genom att träna och utvärdera 29 avvikelsedetekteringsmodeller på vårt icke-syntetiska dataset. Resultaten visade att fully supervised-modellerna presterade bäst, och att klassificerad träningsdata ökar prestandan. Trots att modellerna generellt uppnådde låg AUCPR och AUCROC finns det tydliga indikationer på att detta är uppnåbart genom att utöka träningsdatamängden. Den bäst presterande modellen var Logistic Regression. Anomaly Detection Machine Learning Anomaly Detection Machine Learning Computer and Information Sciences Data- och informationsvetenskap
29	Botnet detection techniques: review, future trends, and issues Karim, A., Bin Salleh, R., Shiraz, M., Shah, S.A.A., Awan, Irfan U., Anuar, N.B. January 2014 (has links) No / In recent years, the Internet has enabled access to widespread remote services in the distributed computing environment; however, integrity of data transmission in the distributed computing platform is hindered by a number of security issues. For instance, the botnet phenomenon is a prominent threat to Internet security, including the threat of malicious codes. The botnet phenomenon supports a wide range of criminal activities, including distributed denial of service (DDoS) attacks, click fraud, phishing, malware distribution, spam emails, and building machines for illegitimate exchange of information/materials. Therefore, it is imperative to design and develop a robust mechanism for improving the botnet detection, analysis, and removal process. Currently, botnet detection techniques have been reviewed in different ways; however, such studies are limited in scope and lack discussions on the latest botnet detection techniques. This paper presents a comprehensive review of the latest state-of-the-art techniques for botnet detection and figures out the trends of previous and current research. It provides a thematic taxonomy for the classification of botnet detection techniques and highlights the implications and critical aspects by qualitatively analyzing such techniques. Related to our comprehensive review, we highlight future directions for improving the schemes that broadly span the entire botnet detection research field and identify the persistent and prominent research challenges that remain open. / University of Malaya, Malaysia (No. FP034-2012A) Botnet detection ; Anomaly detection ; Network security ; Attack ; Defense ; Taxonomy ; Remote-control behavior ; Anomaly detection ; Command ; Dark
30	AUTOMATED HEALTH OPERATIONS FOR THE SAPPHIRE SPACECRAFT Swartwout, Michael A., Kitts, Christopher A. 10 1900 (has links) International Telemetering Conference Proceedings / October 27-30, 1997 / Riviera Hotel and Convention Center, Las Vegas, Nevada / Stanford’s Space Systems Development Laboratory is developing methods for automated spacecraft health operations. Such operations greatly reduce the need for ground-space communication links and full-time operators. However, new questions emerge about how to supply operators with the spacecraft information that is no longer available. One solution is to introduce a low-bandwidth health beacon and to develop new approaches in on-board summarization of health data for telemetering. This paper reviews the development of beacon operations and data summary, describes the implementation of beacon-based health management on board SAPPHIRE, and explains the mission operations response to health emergencies. Additional information is provided on the role of SSDL’s academic partners in developing a worldwide network of beacon receiving stations. Health management Beacon Spacecraft operations Data summary Anomaly detection Automation

Search results