
Regularization, Uncertainty Estimation and Out of Distribution Detection in Convolutional Neural Networks

Classification is an important task in machine learning, and when classifiers are trained on images, a variety of problems can surface during inference. 1) The recent trend of using convolutional neural networks (CNNs) for various machine learning tasks has borne many successes, and CNNs are surprisingly expressive learners due to their large number of parameters and many stacked layers. This increased model complexity also increases the risk of overfitting to the training data. Enlarging the training set by synthetic or artificial means (data augmentation) helps CNNs learn better by reducing over-fitting, producing a regularization effect that improves generalization of the learned model. 2) CNNs have proven to be very good classifiers and generally localize objects well; however, the loss functions typically used to train classification CNNs neither penalize the inability to localize an object nor take into account an object's relative size in the given image when producing confidence measures. 3) CNNs always output in the space of the learned classes with high confidence when predicting the class of a given image, regardless of what the image contains. For example, an ImageNet-1K-trained CNN cannot tell that a given image contains none of the objects it was trained on, whether it is shown an image of a dinosaur (not an ImageNet category) or an image with the main object cut out of it (context only). We approach these three problems using bounding box information and by learning to produce high-entropy predictions on out-of-distribution classes.

To address the first problem, we propose a novel regularization method called CopyPaste. The idea behind our approach is that images from the same class share similar context and can be 'mixed' together without affecting the labels. We use bounding box annotations that are available for a subset of ImageNet images. We consistently outperform the standard baseline and also explore combining our approach with other recent regularization methods. We show consistent performance gains on the PASCAL VOC07, MS-COCO and ImageNet datasets.
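As a rough illustration of this mixing idea, one could paste the annotated bounding-box region of one image onto another image of the same class while keeping the label unchanged. The function below is a minimal sketch under that assumption; the name `copypaste` and the exact mixing rule are illustrative, not the dissertation's implementation.

```python
import numpy as np

def copypaste(img_a, img_b, bbox_b):
    """Paste the bounding-box region of img_b onto img_a.

    Both images are assumed to belong to the same class, so the label
    of the mixed image stays that of img_a.  Images are H x W x C
    arrays of the same shape; bbox_b is (x1, y1, x2, y2) in pixels.
    """
    x1, y1, x2, y2 = bbox_b
    mixed = img_a.copy()                      # leave img_a untouched
    mixed[y1:y2, x1:x2, :] = img_b[y1:y2, x1:x2, :]
    return mixed
```

In this sketch the mixed image inherits the context of the first image and the object crop of the second, which is the intuition behind mixing same-class images without relabeling.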

For the second problem, we employ objectness measures to learn meaningful CNN predictions. Objectness is a measure of the likelihood that an object from any class is present in a given image. We present a novel approach to object localization that combines the ideas of objectness and label smoothing during training. Unlike previous methods, we compute a smoothing factor that adapts to the object's relative size within an image.
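A minimal sketch of such an adaptive soft target is shown below, assuming the smoothing factor grows as the object's relative size shrinks and the smoothed mass is spread uniformly over all classes. The exact formula here is an assumption for illustration, not the dissertation's.

```python
import numpy as np

def adaptive_smooth_target(true_class, num_classes, bbox_area, image_area):
    """Build a soft target whose smoothing grows as the object shrinks.

    objectness = relative object size in (0, 1]; the smoothing factor
    alpha = 1 - objectness is spread uniformly over all classes, and
    the remaining mass goes to the true class.  Illustrative only.
    """
    objectness = bbox_area / image_area       # fraction of image covered
    alpha = 1.0 - objectness                  # adaptive smoothing factor
    target = np.full(num_classes, alpha / num_classes)
    target[true_class] += 1.0 - alpha
    return target
```

Under this sketch, an object filling the whole image yields a one-hot target, while a tiny object yields a nearly uniform target, so confidence is tied to how much of the image the object occupies.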

We present extensive results on ImageNet and OpenImages to demonstrate that CNNs trained using adaptive label smoothing are much less likely to be overconfident in their predictions than CNNs trained using hard targets. We train CNNs using objectness computed from the bounding box annotations available for the ImageNet and OpenImages datasets. We perform extensive experiments aimed at improving a classification CNN's ability to learn localizable features, and show improvements in object detection, calibration, and classification performance on standard datasets. We also show qualitative results using class activation maps to illustrate the improvements.

Lastly, we extend the second approach to train CNNs on out-of-distribution and context-only images using a uniform probability distribution over the set of target classes for such images. This is a novel way to use uniform smooth labels, as it allows the model to learn better confidence bounds. We sample 1000 classes (mutually exclusive with the 1000 classes in ImageNet-1K) from the larger ImageNet dataset, which comprises about 22K classes. We compare our approach with standard baselines and provide entropy and confidence plots for in-distribution and out-of-distribution validation sets.

/ Doctor of Philosophy /

Categorization is an important task in everyday life. Humans can classify objects in pictures effortlessly. Machines can also be trained to classify objects in images. With the tremendous growth in the area of artificial intelligence, machines have surpassed human performance on some tasks. However, plenty of challenges remain for artificial neural networks. Convolutional Neural Networks (CNNs) are a type of artificial neural network. 1) Sometimes, CNNs simply memorize the samples provided during training and fail to work well with images that are slightly different from the training samples. 2) CNNs have proven to be very good classifiers and generally localize objects well; however, the objective functions typically used to train classification CNNs neither penalize the inability to localize an object nor take into account an object's relative size in the given image. 3) CNNs always produce an output in the space of the learned classes with high confidence when predicting the class of a given image, regardless of what the image contains. For example, an ImageNet-1K (a popular dataset) trained CNN cannot tell that a given image contains none of the objects it was trained on, whether it is shown an image of a dinosaur (not an ImageNet category) or an image with the main object cut out of it (background only).

We approach these three problems using object position information and by learning to produce low-confidence predictions on out-of-distribution classes.

To address the first problem, we propose a novel regularization method called CopyPaste. The idea behind our approach is that images from the same class share similar context and can be 'mixed' together without affecting the labels. We use bounding box annotations that are available for a subset of ImageNet images. We consistently outperform the standard baseline and also explore combining our approach with other recent regularization methods. We show consistent performance gains on the PASCAL VOC07, MS-COCO and ImageNet datasets.

For the second problem, we employ objectness measures to learn meaningful CNN predictions. Objectness is a measure of the likelihood that an object from any class is present in a given image. We present a novel approach to object localization that combines the ideas of objectness and label smoothing during training. Unlike previous methods, we compute a smoothing factor that adapts to the object's relative size within an image.

We present extensive results on ImageNet and OpenImages to demonstrate that CNNs trained using adaptive label smoothing are much less likely to be overconfident in their predictions than CNNs trained using hard targets. We train CNNs using objectness computed from the bounding box annotations available for the ImageNet and OpenImages datasets. We perform extensive experiments aimed at improving a classification CNN's ability to learn localizable features, and show improvements in object detection, calibration, and classification performance on standard datasets. We also show qualitative results to illustrate the improvements.

Lastly, we extend the second approach to train CNNs on out-of-distribution and context-only images using a uniform probability distribution over the set of target classes for such images. This is a novel way to use uniform smooth labels, as it allows the model to learn better confidence bounds. We sample 1000 classes (mutually exclusive with the 1000 classes in ImageNet-1K) from the larger ImageNet dataset, which comprises about 22K classes. We compare our approach with standard baselines on in-distribution and out-of-distribution validation sets.
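As a rough illustration of the uniform-label idea described above, training against a uniform target under cross-entropy pushes the model toward maximum-entropy (lowest-confidence) predictions on such images. The sketch below is our own, with illustrative names; it is not the dissertation's implementation.

```python
import numpy as np

def ood_target(num_classes):
    """Uniform target distribution for an out-of-distribution image."""
    return np.full(num_classes, 1.0 / num_classes)

def cross_entropy(logits, target):
    """Cross-entropy between a soft target and softmax(logits)."""
    z = logits - logits.max()                 # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum())
    return -(target * log_probs).sum()
```

With a uniform target over K classes, the loss is minimized when the model's predictive distribution is itself uniform, i.e. at entropy log(K), which is what "low-confidence predictions on out-of-distribution images" amounts to.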

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/99953
Date: 11 September 2020
Creators: Krothapalli, Ujwal K.
Contributors: Electrical and Computer Engineering, Abbott, A. Lynn, Acar, Pinar, Jones, Creed F. III, Zhu, Yunhui, Zeng, Haibo
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
