1. Robust Large Margin Approaches for Machine Learning in Adversarial Settings. Torkamani, MohamadAli. 21 November 2016.
Machine learning algorithms are designed to learn from data and to use data to perform predictions and analyses. Many agencies now use machine learning algorithms to provide services and to perform tasks that used to be done by humans, including making high-stakes decisions. Reaching the right decision depends strongly on the correctness of the input data, which gives criminals a tempting incentive to deceive machine learning algorithms by manipulating the data fed to them. Yet traditional machine learning algorithms are not designed to be safe when confronting unexpected inputs.
In this dissertation, we address the problem of adversarial machine learning; i.e., our goal is to build safe machine learning algorithms that are robust in the presence of noisy or adversarially manipulated data.
Many complex questions that a machine learning system must answer have complex answers. Such outputs can have internal structure, with exponentially many possible values. Adversarial machine learning becomes more challenging when the output we want to predict has a complex structure itself. A significant focus of this dissertation is therefore adversarial machine learning for predicting structured outputs.
First, we develop a new algorithm that reliably performs collective classification: it jointly assigns labels to the nodes of graph-structured data and is robust to malicious changes that an adversary can make to the properties of different nodes. The learning method is highly efficient and is formulated as a convex quadratic program. Empirical evaluations confirm that this technique not only secures the prediction algorithm in the presence of an adversary, but also generalizes better to future inputs even when no adversary is present.
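To make the large-margin robustness idea concrete, a minimal sketch follows, assuming a binary per-node classifier and an adversary that can perturb each feature vector within an L-infinity ball of radius eps; for the hinge loss this worst case has a closed form that shrinks the margin by eps times the L1 norm of the weights. The cvxpy formulation, variable names, and synthetic data are illustrative and are not the dissertation's exact collective-classification program.

```python
# Sketch of robust large-margin learning under an L-infinity-bounded
# feature adversary: the worst-case hinge loss equals the clean hinge
# loss with the margin reduced by eps * ||w||_1.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, lam = 200, 20, 0.1, 1e-2
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

w = cp.Variable(d)
b = cp.Variable()
margins = cp.multiply(y, X @ w + b)                      # clean margins y_i (w.x_i + b)
robust_hinge = cp.pos(1 - margins + eps * cp.norm1(w))   # worst-case hinge under the attack
objective = cp.Minimize(cp.sum(robust_hinge) / n + lam * cp.sum_squares(w))
cp.Problem(objective).solve()
print("robust weights (first 5):", w.value[:5])
```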
While our robust collective classification method is efficient, it does not apply to generic structured prediction problems. Next, we investigate parameter learning for robust structured prediction models. This method constructs regularization functions based on the adversary's limitations in altering the feature space of the structured prediction algorithm. The proposed regularization techniques secure the algorithm against adversarial data changes with little additional computational cost. We prove that robustness to adversarial manipulation of data is equivalent to a form of regularization for large-margin structured prediction, and vice versa, which confirms previous results for simpler problems.
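The robustness-regularization equivalence can be illustrated numerically in a simplified, non-structured setting: for a single linear margin term and an L-infinity-bounded perturbation, the worst-case loss equals the nominal loss plus eps times the L1 norm of the weights. The sketch below is a stand-in for the structured large-margin analysis, not its full derivation.

```python
# Check that max over ||delta||_inf <= eps of (1 - y * w.(x + delta))
# equals (1 - y * w.x) + eps * ||w||_1, by brute force over the corners
# of the L-infinity ball (the max of a linear function is at a corner).
import numpy as np

rng = np.random.default_rng(1)
d, eps = 6, 0.3
w = rng.normal(size=d)
x = rng.normal(size=d)
y = 1.0

corners = np.array(np.meshgrid(*([[-eps, eps]] * d))).reshape(d, -1).T
worst = max(1 - y * w @ (x + delta) for delta in corners)
closed_form = 1 - y * w @ x + eps * np.abs(w).sum()
print(worst, closed_form)   # the two values agree up to floating point
```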
In practice, an ordinary adversary often lacks either the computational power to design the optimal attack or sufficient information about the learner's model to do so. It therefore tends to apply many random changes to the input in the hope of making a breakthrough. This implies that minimizing the expected loss under adversarial noise yields robustness against such mediocre adversaries. Dropout training resembles this noise-injection scenario. Dropout was initially proposed as a regularization technique for neural networks; the procedure is simple: at each training iteration, randomly selected features are set to zero. We derive a regularization method for large-margin parameter learning based on dropout. Our method computes the expected loss function under all possible dropout patterns, which results in a simple objective function that is efficient to optimize. We extend dropout regularization to non-linear kernels in several directions: we define the concept of dropout for the input space, the feature space, and individual input dimensions, and we introduce methods for approximate marginalization over the feature space, even when it is infinite-dimensional.
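As a small illustration of dropout marginalization for a linear score (a simplified stand-in for the large-margin case described above), the mean and variance of the score under feature dropout have closed forms, and that variance term is what turns expected-loss minimization into a data-dependent regularizer. The check below compares the closed forms against a Monte Carlo estimate; the dimensions and keep probability are arbitrary.

```python
# Moments of the dropped-out score w.(x * m / p) with m_i ~ Bernoulli(p):
# mean equals the clean score, variance is (1 - p)/p * sum((w_i x_i)^2).
import numpy as np

rng = np.random.default_rng(2)
d, p = 50, 0.8
w = rng.normal(size=d)
x = rng.normal(size=d)

mean_closed = w @ x
var_closed = (1 - p) / p * np.sum((w * x) ** 2)

masks = rng.binomial(1, p, size=(200000, d)) / p   # inverted-dropout scaling
scores = (masks * (w * x)).sum(axis=1)
print(mean_closed, scores.mean())   # means agree
print(var_closed, scores.var())     # variances agree
```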
Empirical evaluations show that our techniques consistently outperform the baselines on different datasets.
2. Distributionally robust unsupervised domain adaptation and its applications in 2D and 3D image analysis. Wang, Yibin. 08 August 2023.
Obtaining ground-truth label information from real-world data, along with uncertainty quantification, can be challenging or even infeasible. In the absence of labeled data for a certain task, unsupervised domain adaptation (UDA) techniques have shown great success by learning transferable knowledge from labeled source-domain data and adapting it to unlabeled target-domain data, yet uncertainties remain a major concern under domain shifts. Distributionally robust learning (DRL) is emerging as a high-potential technique for building reliable learning systems that are robust to distribution shifts. In this research, a distributionally robust unsupervised domain adaptation (DRUDA) method is proposed to enhance the model's generalization ability under input-space perturbations. The DRL-based UDA learning scheme is formulated as a min-max optimization problem that optimizes worst-case perturbations of the training source data. Our Wasserstein distributionally robust framework can reduce the shifts in the joint distributions across domains. The proposed DRUDA method has been tested on various benchmark datasets. In addition, a gradient mapping-guided explainable network (GMGENet) is proposed to analyze 3D medical images for extracapsular extension (ECE) identification. DRUDA-enhanced GMGENet is evaluated, and experimental results demonstrate that DRUDA successfully improves transfer performance on target domains for the 3D image analysis task. This research enhances the understanding of distributionally robust optimization in domain adaptation and is expected to advance current unsupervised machine learning techniques.
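A minimal sketch of the inner worst-case step commonly used in Wasserstein distributionally robust training is shown below: each source batch is perturbed by a few gradient-ascent steps on the loss minus a transport-cost penalty, and the model is then updated on the perturbed batch. The model, penalty weight gamma, and step sizes are placeholders, not the DRUDA method's actual architecture or hyperparameters.

```python
# Lagrangian relaxation of the Wasserstein-ball inner maximization:
# approximately solve max_z [loss(model(z), y) - gamma * ||z - x||^2],
# then train the model on the worst-case batch.
import torch
import torch.nn.functional as F

def worst_case_perturb(model, x, y, gamma=1.0, steps=5, lr=0.1):
    z = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(z), y) - gamma * ((z - x) ** 2).sum()
        grad, = torch.autograd.grad(loss, z)
        z = (z + lr * grad).detach().requires_grad_(True)   # gradient ascent on z
    return z.detach()

def dr_training_step(model, optimizer, x_src, y_src):
    z = worst_case_perturb(model, x_src, y_src)
    optimizer.zero_grad()
    F.cross_entropy(model(z), y_src).backward()             # outer minimization
    optimizer.step()
```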
3. Trojan Attacks and Defenses on Deep Neural Networks. Yingqi Liu. 13 October 2022.
With the rapid spread of machine learning techniques, sharing and adopting public deep neural networks has become very popular. Because deep neural networks are not intuitive for humans to understand, malicious behaviors can be injected into them undetected. We call this a trojan attack, or backdoor attack, on neural networks. Trojaned models operate normally when regular inputs are provided, but misclassify to a specific output label when the input is stamped with a special pattern called a trojan trigger. Deploying trojaned models can cause severe consequences, including endangering human lives in applications such as autonomous driving. Trojan attacks on deep neural networks introduce two challenges. From the attacker's perspective, since the training data or training process is usually not accessible, the attacker needs a way to carry out the trojan attack without access to training data. From the user's perspective, the user needs to quickly scan publicly available deep neural networks and detect trojaned models.
We address these challenges in this dissertation. For trojan attacks without access to training data, we propose to invert the neural network to generate a general trojan trigger and then retrain the model with reverse-engineered training data to inject malicious behaviors into the model. The malicious behaviors are activated only by inputs stamped with the trojan trigger. To scan for and detect trojaned models, we develop a technique that analyzes inner neuron behaviors by determining how output activations change when different levels of stimulation are introduced to a neuron. A trojan trigger is then reverse-engineered through an optimization procedure using the stimulation analysis results, confirming that a neuron is truly compromised. Furthermore, for complex trojan attacks, we propose a complex-trigger detection method that leverages a symmetric feature differencing technique to distinguish the features of injected complex triggers from natural features. For trojan attacks on NLP models, we propose a backdoor scanning technique that transforms a subject model into an equivalent but differentiable form, inverts a distribution of words denoting their likelihood of being in the trigger, and applies a word discriminativity analysis to determine whether the subject model is particularly discriminative for the presence of likely trigger words.
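A minimal sketch of trigger reverse-engineering by optimization, in the spirit of the scanning approach described above: a small mask/pattern pair is optimized so that stamping it on clean inputs drives the model toward a target label, with an L1 penalty keeping the mask small. The shapes, target label, and hyperparameters are illustrative placeholders rather than the dissertation's exact procedure.

```python
# Optimize a candidate trigger (mask + pattern) so that stamped clean
# inputs are classified as the target label; a small recovered mask is
# evidence that the model may be trojaned for that label.
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, clean_x, target_label, steps=200, lr=0.05, lam=1e-3):
    # clean_x: (N, C, H, W) batch of clean inputs
    mask = torch.zeros(1, 1, *clean_x.shape[2:], requires_grad=True)   # where to stamp
    pattern = torch.zeros(1, *clean_x.shape[1:], requires_grad=True)   # what to stamp
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    target = torch.full((clean_x.shape[0],), target_label, dtype=torch.long)
    for _ in range(steps):
        m = torch.sigmoid(mask)                        # keep the mask in [0, 1]
        stamped = (1 - m) * clean_x + m * pattern      # apply candidate trigger
        loss = F.cross_entropy(model(stamped), target) + lam * m.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), pattern.detach()
```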