Studying Perturbations on the Input of Two-Layer Neural Networks with ReLU Activation
Alsubaihi, Salman
Neural networks have been shown to be very susceptible to small, imperceptible
perturbations on their input. In this thesis, we study perturbations on two-layer
piecewise linear networks. Such studies are essential for training neural networks
that are robust to noisy input. One type of perturbation we consider is ℓ∞ norm
bounded perturbations. Training Deep Neural Networks (DNNs) that are robust
to norm bounded perturbations, or adversarial attacks, remains an elusive problem.
While verification-based methods are generally too expensive to robustly train large
networks, it was demonstrated in [1] that bounded input intervals can be inexpensively
propagated per layer through large networks. This interval bound propagation (IBP)
approach led to high robustness and was the first to be employed on large networks.
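
As a point of reference, IBP propagates an axis-aligned box through the network one layer at a time: an affine layer maps the box's center and radius exactly, and the monotone ReLU is applied to the interval endpoints elementwise. The NumPy sketch below illustrates this baseline from [1] only, not the probabilistic bounds proposed in this work; layer sizes and the eps-ball are illustrative.

```python
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Elementwise bounds on W @ x + b when x lies in the box [lower, upper]."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius
    return out_center - out_radius, out_center + out_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so it can be applied directly to the interval endpoints."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Illustrative example: bounds on a hidden layer for all inputs within an
# eps-ball (in the infinity norm) around a nominal input x0.
rng = np.random.default_rng(0)
W, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
x0, eps = rng.standard_normal(10), 0.1
lb, ub = ibp_relu(*ibp_affine(x0 - eps, x0 + eps, W, b))
```
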
However, due to the very loose nature of the IBP bounds, particularly for large
networks, the required training procedure is complex and involved. In this work, we
closely examine the bounds of a block of layers composed of an affine layer followed
by a ReLU nonlinearity followed by another affine layer. In doing so, we propose
probabilistic bounds, i.e., bounds that hold with overwhelming probability, which are
provably tighter than IBP bounds in expectation. We then extend this result to deeper networks
through blockwise propagation and show that we can achieve bounds that are orders of
magnitude tighter than IBP. With such tight bounds, we demonstrate that a simple,
standard training procedure can achieve the best robustness-accuracy tradeoff
across several architectures on both MNIST and CIFAR-10.
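
To make the blockwise structure concrete, the sketch below chains bounds through a stack of affine-ReLU-affine blocks, feeding each block's output interval into the next. Plain interval arithmetic stands in for the per-block computation here; the probabilistic bounds proposed in this work would replace that per-block step, not the chaining. Block shapes and weights are illustrative.

```python
import numpy as np

def affine_bounds(lower, upper, W, b):
    """Interval image of W @ x + b over the box [lower, upper]."""
    center, radius = (upper + lower) / 2.0, (upper - lower) / 2.0
    out_center, out_radius = W @ center + b, np.abs(W) @ radius
    return out_center - out_radius, out_center + out_radius

def block_bounds(lower, upper, W1, b1, W2, b2):
    """Bounds for one affine -> ReLU -> affine block (interval-arithmetic stand-in)."""
    l, u = affine_bounds(lower, upper, W1, b1)
    l, u = np.maximum(l, 0.0), np.maximum(u, 0.0)  # monotone ReLU
    return affine_bounds(l, u, W2, b2)

def propagate_blockwise(lower, upper, blocks):
    """Chain per-block bounds through a deeper network, block by block."""
    for W1, b1, W2, b2 in blocks:
        lower, upper = block_bounds(lower, upper, W1, b1, W2, b2)
    return lower, upper

# Illustrative deep network made of three blocks mapping 10 -> 20 -> 10 features.
rng = np.random.default_rng(1)
blocks = [
    (rng.standard_normal((20, 10)), rng.standard_normal(20),
     rng.standard_normal((10, 20)), rng.standard_normal(10))
    for _ in range(3)
]
x0, eps = rng.standard_normal(10), 0.05
lb, ub = propagate_blockwise(x0 - eps, x0 + eps, blocks)
```
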
We also consider Gaussian perturbations, where we build on previous work that derives
the first and second output moments of a two-layer piecewise linear network [2]. In this
work, we derive an exact expression for the second moment by dropping the zero-mean
assumption made in [2].
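
The exact second-moment expression itself is not reproduced in this abstract; as a sanity-check style sketch, the code below only estimates by Monte Carlo the first and second output moments of a two-layer piecewise linear network under Gaussian input with a non-zero mean, i.e., the quantities that the closed-form expression characterizes. All shapes, weights, and input distribution parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-layer piecewise linear network: g(x) = B @ relu(A @ x + c1) + c2
d_in, d_hidden, d_out = 10, 32, 4
A, c1 = rng.standard_normal((d_hidden, d_in)), rng.standard_normal(d_hidden)
B, c2 = rng.standard_normal((d_out, d_hidden)), rng.standard_normal(d_out)

def g(X):
    """Apply the network to a batch of inputs X with shape (n, d_in)."""
    return np.maximum(X @ A.T + c1, 0.0) @ B.T + c2

# Gaussian input with a non-zero mean (the setting in which the zero-mean
# assumption of [2] is dropped); mu and sigma are illustrative.
mu, sigma, n = rng.standard_normal(d_in), 0.5, 200_000
X = mu + sigma * rng.standard_normal((n, d_in))

Y = g(X)
first_moment = Y.mean(axis=0)          # Monte Carlo estimate of E[g(x)]
second_moment = (Y ** 2).mean(axis=0)  # Monte Carlo estimate of E[g(x)^2], per output
```
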