1. Analytic Treatment of Deep Neural Networks Under Additive Gaussian Noise
Alfadly, Modar (12 April 2018)
Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit behaviours that are not yet well understood. One puzzling behaviour is their reaction to various noise attacks: small adversarial perturbations can cause a severe degradation in DNN performance. To treat this rigorously, we derive exact analytic expressions for the first and second moments (mean and variance) of a small piecewise linear (PL) network with a single rectified linear unit (ReLU) layer subject to general Gaussian input. We show experimentally that these expressions are tight under simple linearizations of deeper PL-DNNs, including popular architectures from the literature (e.g. LeNet and AlexNet). Extensive experiments on image classification show that the expressions can be used to study the output mean of the logits for each class, the inter-class confusion, and the pixel-level spatial noise sensitivity of the network. Moreover, we show how they can be used to systematically construct targeted and non-targeted adversarial attacks. We then propose a special estimator DNN, named mixture of linearizations (MoL), and derive the analytic expressions for its output mean and variance as well. We employ these expressions to train the model to be particularly robust against Gaussian attacks, without the need for data augmentation. Upon training this network on a loss consolidated with the derived output probabilistic moments, the network is not only robust under very-high-variance Gaussian attacks but is also as robust as networks trained with 20-fold data augmentation.
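The exact expressions derived in the thesis cover a full PL network under general (possibly correlated) Gaussian input and are not reproduced in this abstract. As a minimal single-unit illustration, the classical rectified-Gaussian identities give the first two moments of ReLU(X) for X ~ N(mu, sigma^2) in closed form; the sketch below (plain NumPy/SciPy, with hypothetical naming) checks them against Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

def relu_gaussian_moments(mu, sigma):
    """Closed-form mean and variance of ReLU(X) for X ~ N(mu, sigma^2).

    Standard rectified-Gaussian identities:
      E[ReLU(X)]   = mu * Phi(mu/sigma) + sigma * phi(mu/sigma)
      E[ReLU(X)^2] = (mu^2 + sigma^2) * Phi(mu/sigma) + mu * sigma * phi(mu/sigma)
    """
    a = mu / sigma
    mean = mu * norm.cdf(a) + sigma * norm.pdf(a)
    second_moment = (mu**2 + sigma**2) * norm.cdf(a) + mu * sigma * norm.pdf(a)
    return mean, second_moment - mean**2

# Monte Carlo sanity check on a single ReLU unit.
rng = np.random.default_rng(0)
mu, sigma = 0.3, 1.5
samples = np.maximum(rng.normal(mu, sigma, 1_000_000), 0.0)
mean, var = relu_gaussian_moments(mu, sigma)
print(f"analytic:  mean={mean:.4f}  var={var:.4f}")
print(f"empirical: mean={samples.mean():.4f}  var={samples.var():.4f}")
```

Propagating these per-unit moments through the surrounding affine layers (the mean maps linearly; the variance additionally needs the cross-covariance terms that the thesis derives) is what turns the scalar identities into network-level expressions.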
2. Understanding a Block of Layers in Deep Neural Networks: Optimization, Probabilistic and Tropical Geometric Perspectives
Bibi, Adel
This dissertation theoretically studies a block of layers that is common to almost all deep learning models: the composition of an affine layer, followed by a nonlinear activation, followed by another affine layer. We study this block from three perspectives.

(i) An Optimization Perspective. Is it possible that the output of the forward pass through this block is an optimal solution to a certain convex optimization problem? We show an equivalence between the forward pass through this block and a single iteration of deterministic and stochastic algorithms solving a tensor-formulated convex optimization problem. As a consequence, we derive, for the first time, a formula for computing the singular values of convolutional layers that bypasses the prohibitive construction of the underlying linear operator. Thereafter, we show that several deep networks can have this block replaced with the corresponding optimization algorithm predicted by our theory, resulting in networks with improved generalization performance.
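The formula itself is not stated in the abstract. For intuition only, the sketch below implements the published FFT-based characterization of the singular values of circular convolutions (Sedghi, Gupta and Long, ICLR 2019), which removes the same bottleneck; it is an assumption on my part that this matches the dissertation's result, and all names in the code are hypothetical. The brute-force check materializes the operator only on a tiny example.

```python
import numpy as np

def circular_conv_singular_values(kernel, n):
    """Singular values of the linear map of a 2D circular convolution.

    kernel has shape (c_out, c_in, k, k); n is the input's spatial size.
    FFT-based characterization (Sedghi, Gupta & Long, ICLR 2019): take the
    2D DFT of each zero-padded kernel slice, then collect, for every spatial
    frequency, the singular values of the c_out x c_in transfer matrix.
    """
    c_out, c_in, k, _ = kernel.shape
    padded = np.zeros((c_out, c_in, n, n), dtype=complex)
    padded[:, :, :k, :k] = kernel
    transfer = np.fft.fft2(padded)              # DFT over the two spatial axes
    per_freq = transfer.transpose(2, 3, 0, 1)   # (n, n, c_out, c_in) batch
    return np.sort(np.linalg.svd(per_freq, compute_uv=False).ravel())[::-1]

# Sanity check against the explicitly materialized operator on a tiny case.
rng = np.random.default_rng(1)
c_out, c_in, k, n = 2, 3, 3, 6
K = rng.normal(size=(c_out, c_in, k, k))

def apply_conv(x):
    """Circular convolution of x (shape (c_in, n, n)) with kernel K."""
    out = np.zeros((c_out, n, n))
    for o in range(c_out):
        for i in range(c_in):
            for a in range(k):
                for b in range(k):
                    out[o] += K[o, i, a, b] * np.roll(np.roll(x[i], a, 0), b, 1)
    return out

# Build the (c_out*n*n) x (c_in*n*n) matrix column by column.
M = np.zeros((c_out * n * n, c_in * n * n))
for j in range(c_in * n * n):
    e = np.zeros(c_in * n * n)
    e[j] = 1.0
    M[:, j] = apply_conv(e.reshape(c_in, n, n)).ravel()

print(np.allclose(circular_conv_singular_values(K, n),
                  np.linalg.svd(M, compute_uv=False)))  # expect True
```

The FFT route costs a 2D transform per kernel slice plus n^2 small SVDs, versus a full SVD of a (c_out n^2) x (c_in n^2) matrix for the naive construction.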
(ii) A Probabilistic Perspective. Is it possible to analytically characterize the output of a deep network when its input is subjected to Gaussian noise? To this end, we derive analytic formulas for the first and second moments of this block under Gaussian input noise. We demonstrate that the derived expressions can be used to efficiently analyze the output of an arbitrary deep network, and to construct Gaussian adversarial attacks, without any need for prohibitive data augmentation procedures.

(iii) A Tropical Geometry Perspective. Is it possible to characterize the decision boundaries of this block as a geometric structure representing the solution set of a certain class of polynomials (tropical polynomials)? If so, can this geometric representation of the decision boundaries be used for novel reformulations of classical computer vision and machine learning tasks on arbitrary deep networks? We show that the decision boundaries of this block are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes. We use this geometric characterization to shed light on new perspectives for network pruning.
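As a toy illustration of the geometric object involved: a zonotope is a Minkowski sum of line segments, and for a two-logit block f(x) = B @ relu(A @ x + c) the generators of the two zonotopes can be formed (up to bias terms) from the rows of A scaled by positive and negative parts of the output weights. The sketch below is a simplified reading under that assumption; the matrices are hypothetical toys and the construction may differ in detail from the dissertation's.

```python
import numpy as np
from itertools import product
from scipy.spatial import ConvexHull

def zonotope_vertices_2d(generators):
    """Vertices of Z = { sum_i l_i * g_i : 0 <= l_i <= 1 } for 2D generators.

    Brute force for illustration: every vertex of a zonotope is attained at
    a 0/1 choice of the coefficients, so enumerate the 2^m corner candidates
    and keep their convex hull (fine for a handful of generators).
    """
    m = len(generators)
    corners = np.array([np.asarray(c) @ generators
                        for c in product([0.0, 1.0], repeat=m)])
    return corners[ConvexHull(corners).vertices]

# Hypothetical toy block: two logits f(x) = B @ relu(A @ x + c), 2D inputs.
A = np.array([[1.0, 0.0],     # 4 hidden units; rows are affine directions
              [0.0, 1.0],
              [1.0, 1.0],
              [-1.0, 1.0]])
B = np.array([[1.0, 2.0, -1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])

# Simplified generator construction (biases ignored): one zonotope per side
# of the decision boundary, scaling rows of A by B1+ + B2- and B1- + B2+.
pos, neg = np.maximum(B, 0), np.maximum(-B, 0)
gen_one = (pos[0] + neg[1])[:, None] * A
gen_two = (neg[0] + pos[1])[:, None] * A
print("zonotope 1 vertices:\n", zonotope_vertices_2d(gen_one))
print("zonotope 2 vertices:\n", zonotope_vertices_2d(gen_two))
```

It is the convex hull of these two polytopes that the abstract refers to; reasoning about its shape, rather than about the network weights directly, is what enables the pruning perspective.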