Optimizing Deep Neural Networks Performance: Efficient Techniques For Training and Inference
Sharma, Ankit 01 January 2023
Recent advances in computer vision are mainly due to the success of large deep neural networks. Current state-of-the-art models have high computational costs during inference and a large memory footprint, so deploying them on edge devices remains a serious concern. Furthermore, training these over-parameterized networks is computationally expensive and time-consuming. There is therefore a demand for techniques that reduce training costs and make it possible to deploy neural networks on mobile and embedded devices. This dissertation presents approaches such as designing a lightweight network architecture and increasing network resource utilization. These solutions improve the efficiency of large networks during training and inference.
We first propose an efficient micro-architecture (slim modules) to construct a lightweight Slim-CNN for predicting face attributes. Slim modules use depthwise separable convolutions with pointwise convolutions, making them computationally efficient for embedded applications. Next, we investigate the problem of obtaining a compact pruned model from an untrained original network in a single-stage process. We introduce our RAPID framework, which distills knowledge from a teacher model to a pruned student model in an online setting. Next, we analyze the phenomenon of inactive channels in a trained neural network. We take a deep dive into the gradient updates of these channels and discover that they receive no weight updates after a few early epochs. We therefore present our channel regeneration technique, which reinitializes the batch normalization gamma values of all inactive channels. The gradient updates of these channels improve after the regeneration step, increasing their contribution to network performance.
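The channel regeneration idea described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `regenerate_inactive_channels`, the magnitude test used to flag a channel as inactive, the `threshold` cutoff, and the reinitialization value of 1.0 are all assumptions made for illustration. The abstract identifies inactive channels through their gradient updates and does not state a cutoff or a reinitialization scheme.

```python
from typing import List


def regenerate_inactive_channels(
    gammas: List[float],
    threshold: float = 1e-3,  # assumed cutoff, not taken from the dissertation
    init_value: float = 1.0,  # assumed reinit value (a common BN scale default)
) -> List[int]:
    """Reset near-zero batch-norm scale factors in place.

    Returns the indices of the channels that were reinitialized.
    """
    regenerated = []
    for i, g in enumerate(gammas):
        # Assumption: a channel counts as inactive when |gamma| is tiny.
        if abs(g) < threshold:
            gammas[i] = init_value
            regenerated.append(i)
    return regenerated


# Hypothetical gamma values for a five-channel batch normalization layer.
gammas = [0.85, 0.0002, -0.4, 0.0, 1.2]
reset = regenerate_inactive_channels(gammas)
print(reset)   # [1, 3]
print(gammas)  # [0.85, 1.0, -0.4, 1.0, 1.2]
```

In a real training loop the same step would presumably be applied to each batch normalization layer's scale parameter at a chosen epoch, after which training continues as usual.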
Finally, we introduce a method to improve computational efficiency in pre-trained vision transformers by reducing redundancy in visual data. Our method selects image windows or regions with high objectness measures, as these regions may contain an object of any class. Across all works in this dissertation, we extensively evaluate our proposed methods and demonstrate that our techniques improve the computational efficiency of deep neural networks during training and inference.
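As a rough illustration of selecting image windows by objectness, the sketch below scores non-overlapping windows on a grid of per-patch objectness values and keeps the highest-scoring ones. The function name `select_windows`, the use of the mean score per window, the non-overlapping tiling, and the example scores are assumptions for illustration only. The abstract does not specify how objectness is computed or how windows are ranked.

```python
from typing import List, Tuple


def select_windows(
    objectness: List[List[float]],  # assumed per-patch objectness scores
    window: int,                    # window side length, in patches
    keep: int,                      # number of windows to retain
) -> List[Tuple[int, int]]:
    """Return (row, col) grid positions of the `keep` windows with the
    highest mean objectness, using non-overlapping windows."""
    rows, cols = len(objectness), len(objectness[0])
    scored = []
    for r in range(0, rows - window + 1, window):
        for c in range(0, cols - window + 1, window):
            total = sum(
                objectness[r + i][c + j]
                for i in range(window)
                for j in range(window)
            )
            mean_score = total / (window * window)
            scored.append((mean_score, (r // window, c // window)))
    # Highest mean objectness first; keep only the top `keep` windows.
    scored.sort(key=lambda item: item[0], reverse=True)
    return sorted(pos for _, pos in scored[:keep])


# Hypothetical 4x4 grid of patch scores, split into four 2x2 windows.
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.1],
    [0.1, 0.0, 0.2, 0.1],
    [0.0, 0.1, 0.6, 0.9],
]
print(select_windows(scores, window=2, keep=2))  # [(0, 0), (1, 1)]
```

Only the patches inside the retained windows would then be passed to the transformer, which is where the reduction in computation would come from under these assumptions.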