1

Convolutional Neural Networks on FPGA and GPU on the Edge: A Comparison

Pettersson, Linus January 2020 (has links)
When asked to implement a neural network application, the decision concerning which hardware platform to use may not always be easily made. This thesis studies various relevant platforms with regard to performance, power efficiency and usability, with the purpose of providing a basis for such decisions. The hardware platforms studied were a GPU, an FPGA and a CPU. The project implements Convolutional Neural Networks (CNNs) on the different hardware platforms using several tools and frameworks. The final implementation uses BNN-PYNQ for the FPGA and CPU, which provides ready-to-run code and overlays for quantized CNNs and fully connected neural networks. These networks are then reproduced in TensorFlow and optimized to FP32, FP16 and INT8 precision using TensorRT for use on the GPU. The results indicate that the FPGA outperforms the GPU by a factor of 100 on the CNNs, and by a factor of 1000 on the fully connected networks, with regard to inference speed. For power efficiency, the FPGA again outperforms the GPU. The thesis concludes that for a neural network application, an FPGA is preferred if performance is the priority. However, the GPU proved to have greater ease of use due to the many tools and frameworks available. If easy implementation and high design flexibility are the priority, a GPU is recommended instead.
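As an editorial illustration of the GPU path described above, the following is a minimal sketch of converting a TensorFlow SavedModel with TensorRT at reduced precision. The model paths and the choice of FP16 are assumptions; the thesis' exact conversion scripts are not shown in the abstract.

```python
# Hypothetical sketch: convert a TensorFlow SavedModel to run with TensorRT
# at FP16 precision (FP32 and INT8 are selected the same way; INT8 also
# needs a calibration step not shown here). Paths are illustrative.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="cnn_savedmodel",   # assumed export of the Keras CNN
    conversion_params=params,
)
converter.convert()                # rewrites supported subgraphs as TensorRT ops
converter.save("cnn_trt_fp16")     # serialized model for GPU inference
```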
2

Performance analysis: CNN model on smartphones versus on cloud : With focus on accuracy and execution time

Stegmayr, Klas, Johansson, Edwin January 2023 (has links)
In the modern digital landscape, mobile devices serve as crucial data generators. Their usage spans from simple communication to various applications such as user behavior analysis and intelligent applications. However, privacy concerns associated with data collection are persistent. Deep learning technologies, specifically Convolutional Neural Networks, have been increasingly integrated into mobile applications as a promising solution. In this study, we evaluated the performance of a CNN implemented on iOS smartphones using the CIFAR-10 data set, comparing the model's accuracy and execution time before and after conversion for on-device deployment. The overarching objective was not to design the most accurate model but to investigate the feasibility of deploying machine learning models on-device while retaining their accuracy. The results revealed that both on-cloud and on-device models yielded high accuracy (93.3% and 93.25%, respectively). However, a significant difference was observed in the total execution time, with the on-device model requiring a considerably longer duration (45.64 seconds) than the cloud-based model (4.55 seconds). This study provides insights into the performance of deep learning models on iOS smartphones, aiding in understanding their practical applications and limitations.
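The abstract does not name the conversion toolchain; a common route for deploying a Keras CNN to iOS is Apple's coremltools, sketched below under that assumption. File names are illustrative.

```python
# Hypothetical sketch: convert a trained Keras CNN to Core ML for on-device
# iOS inference. The thesis may have used a different toolchain.
import coremltools as ct
import tensorflow as tf

keras_model = tf.keras.models.load_model("cifar10_cnn.h5")  # assumed artifact
mlmodel = ct.convert(keras_model)        # produces a Core ML model object
mlmodel.save("Cifar10CNN.mlmodel")       # bundle this file into the iOS app
```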
3

A COMPARATIVE STUDY OF FFN AND CNN WITHIN IMAGE RECOGNITION : The effects of training and accuracy of different artificial neural network designs

Knutsson, Magnus, Lindahl, Linus January 2019 (has links)
Image recognition and classification is becoming more important as the need to process large amounts of images becomes more common. The aim of this thesis is to compare two types of artificial neural networks, the Feed-Forward Network (FFN) and the Convolutional Neural Network (CNN), to see how they compare when performing the task of image recognition. Six models of each type of neural network were created, differing in width, depth and the activation function used for learning. This enabled the experiment to also examine whether these parameters had any effect on the rate at which a network learns, and how the network design affected the validation accuracy of the models. The models were implemented using the Keras API, and trained and tested on the CIFAR-10 dataset. The results showed that, within the scope of this experiment, the CNN models were always preferable, as they achieved a statistically higher validation accuracy than their FFN counterparts.
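A minimal sketch of the two model families compared, built with Keras on CIFAR-10 as in the thesis; the widths, depths and activation shown are illustrative stand-ins for the twelve configurations actually tested.

```python
# Illustrative Keras sketch of the two architectures compared on CIFAR-10.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_ffn(activation="relu"):
    # Feed-forward network: treats each image as a flat vector of pixels.
    return models.Sequential([
        layers.Flatten(input_shape=(32, 32, 3)),
        layers.Dense(512, activation=activation),
        layers.Dense(256, activation=activation),
        layers.Dense(10, activation="softmax"),
    ])

def build_cnn(activation="relu"):
    # Convolutional network: learns local spatial filters before classifying.
    return models.Sequential([
        layers.Conv2D(32, 3, activation=activation, input_shape=(32, 32, 3)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
for build in (build_ffn, build_cnn):
    model = build()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train / 255.0, y_train, epochs=5, validation_split=0.1,
              verbose=0)
    print(build.__name__, "test acc:",
          model.evaluate(x_test / 255.0, y_test, verbose=0)[1])
```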
4

Classification of road side material using convolutional neural network and a proposed implementation of the network through Zedboard Zynq 7000 FPGA

Rahman, Tanvir 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, Convolutional Neural Networks (CNNs) have become the state-of-the-art method for object detection and classification in the field of machine learning and artificial intelligence. In contrast to a fully connected network, each neuron of a convolutional layer of a CNN is connected to fewer selected neurons from the previous layers, and the kernels of a CNN share the same weights and biases across the same input layer dimension. These features allow CNN architectures to have fewer parameters, which in turn reduces calculation complexity and allows the network to be implemented in low-power hardware. The accuracy of a CNN depends mostly on the number of images used to train the network, which requires a hundred thousand to a million images. Therefore, a reduced training alternative called transfer learning is used, which takes advantage of features from a pre-trained network and applies them to the new problem of interest. This research has successfully developed a new CNN based on the pre-trained CIFAR-10 network and has used transfer learning on a new problem to classify road edges. Two network sizes were tested: 32 and 16 neuron inputs, with 239 labeled Google Street View images on a single CPU. The training gives 52.8% and 35.2% accuracy, respectively, for 250 test images. In the second part of the research, a High Level Synthesis (HLS) hardware model of the network with 16 neuron inputs is created for the Zynq 7000 FPGA. The resulting circuit has 34% average FPGA utilization and 2.47 W power consumption. Recommendations to improve the classification accuracy with a deeper network, and ways to fit the improved network on the FPGA, are mentioned at the end of the work.
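A hedged sketch of the transfer-learning step described: the convolutional features of a network pre-trained on CIFAR-10 are frozen, and only a new classification head is trained on the road-edge images. The saved-model file, the position of the feature layer, and the number of road-side classes are assumptions.

```python
# Hedged transfer-learning sketch; names and the class count are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.models.load_model("cifar10_pretrained.h5")   # hypothetical file
features = models.Model(base.input, base.layers[-2].output)  # drop the old head
features.trainable = False                                   # keep learned filters

num_road_classes = 3                                         # assumed label count
model = models.Sequential([
    features,
    layers.Dense(num_road_classes, activation="softmax"),    # new head only
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(street_view_images, labels, epochs=20)  # the 239 labeled images
```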
5

Increasing CNN representational power using absolute cosine value regularization

Singleton, William S. 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / The Convolutional Neural Network (CNN) is a mathematical model designed to distill input information into a more useful representation. This distillation process removes information over time through a series of dimensionality reductions, which ultimately grant the model the ability to resist noise and generalize effectively. However, CNNs often contain elements that are ineffective at contributing towards useful representations. This thesis aims to provide a remedy for this problem by introducing Absolute Cosine Value Regularization (ACVR). This is a regularization technique hypothesized to increase the representational power of CNNs by using a Gradient Descent Orthogonalization algorithm to force the vectors that constitute their filters at any given convolutional layer to occupy unique positions in their respective spaces. This method should, in theory, lead to a more effective balance between information loss and representational power, ultimately increasing network performance. The following thesis proposes and examines the mathematics and intuition behind ACVR, and goes on to propose Dynamic-ACVR (D-ACVR). This thesis also proposes and examines the effects of ACVR on the filters of a low-dimensional CNN, as well as the effects of ACVR and D-ACVR on traditional convolutional filters in VGG-19. Finally, this thesis proposes and examines regularization of the pointwise filters in MobileNetV1.
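A minimal sketch, not taken from the thesis, of an ACVR-style penalty as the abstract describes it: flatten each filter of a convolutional layer to a vector, normalize, and penalize the absolute pairwise cosine values so that gradient descent pushes the filters toward orthogonality. The weighting coefficient is an assumption, and the thesis' exact formulation may differ.

```python
# Sketch of an ACVR-style penalty; the coefficient is an assumption.
import tensorflow as tf

def acvr_penalty(kernel, coeff=1e-3):
    # kernel shape: (h, w, in_channels, n_filters) -> one row per filter
    n_filters = kernel.shape[-1]
    vecs = tf.transpose(tf.reshape(kernel, (-1, n_filters)))
    vecs = tf.math.l2_normalize(vecs, axis=1)      # unit-length filter vectors
    cos = tf.matmul(vecs, vecs, transpose_b=True)  # pairwise cosine values
    off_diag = tf.abs(cos) - tf.eye(n_filters)     # remove the self-similarity 1s
    return coeff * tf.reduce_sum(off_diag)         # sum of |cos| over filter pairs

# Attached as a kernel regularizer, gradient descent now also pushes the
# layer's filters toward mutually orthogonal directions:
layer = tf.keras.layers.Conv2D(64, 3, kernel_regularizer=acvr_penalty)
```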
6

Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment

Akash Gaikwad (5931047) 17 January 2019 (has links)
In recent years, deep learning models have become popular in real-time embedded applications, but hardware deployment involves many complexities because of limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolutional Neural Network (CNN) through various compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the most promising techniques for solving these problems.

This thesis proposes methods to prune the convolutional neural network (SqueezeNet) without introducing network sparsity in the pruned model. Three methods are proposed to prune the CNN to decrease its model size without a significant drop in accuracy:

1. Pruning based on the Taylor expansion of the change in the cost function, Delta C.
2. Pruning based on the L2 normalization of activation maps.
3. Pruning based on a combination of methods 1 and 2.

The proposed methods use various ranking methods to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. Transfer learning is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model by 72% without a significant drop in accuracy (the optimal pruning efficiency result). Results also show that pruning based on a combination of the Taylor expansion of the cost function and the L2 normalization of activation maps achieves better pruning efficiency than either individual pruning criterion, and that most of the pruned kernels are from mid- and high-level layers. The pruned model is deployed on BlueBox 2.0 using RTMaps software, and its performance was evaluated.
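As a hedged illustration of the first criterion, the sketch below ranks the filters of one layer by a Taylor-style score, |activation × gradient| averaged over the activation map. The probe-model construction and layer name are assumptions, and the thesis' exact ranking may differ.

```python
# Sketch of a Taylor-style filter ranking; model/layer names are assumptions.
import tensorflow as tf

def taylor_rank(model, layer_name, images, labels):
    # Probe model exposing both the chosen layer's activations and the output.
    layer_out = model.get_layer(layer_name).output
    probe = tf.keras.Model(model.input, [layer_out, model.output])
    with tf.GradientTape() as tape:
        acts, preds = probe(images)
        loss = tf.keras.losses.sparse_categorical_crossentropy(labels, preds)
    grads = tape.gradient(loss, acts)
    # acts/grads: (batch, h, w, filters); average activation*gradient per filter.
    # Low score -> removing the filter barely changes the cost -> prune it first.
    return tf.abs(tf.reduce_mean(acts * grads, axis=(0, 1, 2)))
```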
7

Increasing CNN Representational Power Using Absolute Cosine Value Regularization

William Steven Singleton (8740647) 21 April 2020 (has links)
The Convolutional Neural Network (CNN) is a mathematical model designed to distill input information into a more useful representation. This distillation process removes information over time through a series of dimensionality reductions, which ultimately grant the model the ability to resist noise and generalize effectively. However, CNNs often contain elements that are ineffective at contributing towards useful representations. This thesis aims to provide a remedy for this problem by introducing Absolute Cosine Value Regularization (ACVR). This is a regularization technique hypothesized to increase the representational power of CNNs by using a Gradient Descent Orthogonalization algorithm to force the vectors that constitute their filters at any given convolutional layer to occupy unique positions in R^n. This method should, in theory, lead to a more effective balance between information loss and representational power, ultimately increasing network performance. The following thesis proposes and examines the mathematics and intuition behind ACVR, and goes on to propose Dynamic-ACVR (D-ACVR). This thesis also proposes and examines the effects of ACVR on the filters of a low-dimensional CNN, as well as the effects of ACVR and D-ACVR on traditional convolutional filters in VGG-19. Finally, this thesis proposes and examines regularization of the pointwise filters in MobileNetV1.
8

Hyperparameters' relationship to the test accuracy of a convolutional neural network

Lundh, Felix, Barta, Oscar January 2021 (has links)
Machine learning for image classification is a hot topic, and it is increasing in popularity. The aim of this study is therefore to provide a better understanding of convolutional neural network hyperparameters by comparing the test accuracy of convolutional neural network models with different hyperparameter configurations. The focus of this study is to see whether the hyperparameter values used influence the learning process. For the experiments, convolutional neural network models were developed in the programming language Python using the Keras library. The dataset used for this study is CIFAR-10, which includes 60,000 colour images in 10 categories ranging from man-made objects to different animal species. Grid search is used to instantiate models with varying learning rate and momentum, width, and depth values. Learning rate is only tested in combination with momentum, and width is only tested in combination with depth. Activation functions, convolutional layers and batch size are tested individually. Grid search is compared against Bayesian optimization to see which technique finds the more optimal learning rate and momentum values. The results illustrate that the impact different hyperparameters have on the overall test accuracy varies. Learning rate and momentum affect the test accuracy greatly, and suboptimal values for learning rate and momentum can decrease the test accuracy severely. Activation function, width and depth, convolutional layers and batch size have a lesser impact on test accuracy. Regarding Bayesian optimization compared to grid search, the results show that Bayesian optimization will not necessarily find more optimal hyperparameter values.
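A condensed sketch of the grid-search procedure described: one model is trained per (learning rate, momentum) pair and its test accuracy recorded. The value grids and the small build_model helper are illustrative, not the thesis' actual configurations.

```python
# Condensed grid-search sketch; grids and the toy model are illustrative.
import itertools
import tensorflow as tf
from tensorflow.keras import layers

def build_model():
    return tf.keras.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
results = {}
for lr, momentum in itertools.product([0.1, 0.01, 0.001], [0.0, 0.5, 0.9]):
    model = build_model()
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=lr, momentum=momentum),
        loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train / 255.0, y_train, epochs=5, verbose=0)
    results[(lr, momentum)] = model.evaluate(x_test / 255.0, y_test,
                                             verbose=0)[1]

print("best (lr, momentum):", max(results, key=results.get))
```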
9

Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment

Gaikwad, Akash S. 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, deep learning models have become popular in real-time embedded applications, but hardware deployment involves many complexities because of limited resources such as memory, computational power, and energy. Recent research in the field of deep learning focuses on reducing the model size of the Convolutional Neural Network (CNN) through various compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the most promising techniques for solving these problems. This thesis proposes methods to prune the convolutional neural network (SqueezeNet) without introducing network sparsity in the pruned model. Three methods are proposed to prune the CNN to decrease its model size without a significant drop in accuracy:

1. Pruning based on the Taylor expansion of the change in the cost function, Delta C.
2. Pruning based on the L2 normalization of activation maps.
3. Pruning based on a combination of methods 1 and 2.

The proposed methods use various ranking methods to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. Transfer learning is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model by 72% without a significant drop in accuracy (the optimal pruning efficiency result). Results also show that pruning based on a combination of the Taylor expansion of the cost function and the L2 normalization of activation maps achieves better pruning efficiency than either individual pruning criterion, and that most of the pruned kernels are from mid- and high-level layers. The pruned model is deployed on BlueBox 2.0 using RTMaps software, and its performance was evaluated.
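Complementing the Taylor-criterion sketch after entry 6, this is a minimal sketch of the second criterion: rank each filter by the L2 norm of its activation map, averaged over a batch of training images. The probe model and layer name are assumptions.

```python
# Sketch of the L2-norm ranking criterion; probe construction is assumed.
import tensorflow as tf

def l2_activation_rank(model, layer_name, images):
    probe = tf.keras.Model(model.input, model.get_layer(layer_name).output)
    acts = probe(images)                  # (batch, h, w, filters)
    norms = tf.norm(acts, axis=(1, 2))    # L2 norm of each activation map
    # Low average norm -> the filter responds weakly -> prune it first.
    return tf.reduce_mean(norms, axis=0)
```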
10

EfficientNext: EfficientNet for Embedded Systems

Deokar, Abhishek 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Convolutional Neural Networks have come a long way since AlexNet. Each year, the limits of the state of the art are pushed to new levels. EfficientNet pushed the performance metrics to a new high, and EfficientNetV2 even more so. Even so, architectures for mobile applications can benefit from improved accuracy and a reduced model footprint. The classic Inverted Residual block has been the foundation upon which most mobile networks seek to improve, and the EfficientNet architecture is built using the same Inverted Residual block. In this thesis, we experiment with Harmonious Bottlenecks in place of the Inverted Residuals and observe a reduction in the number of parameters and an improvement in accuracy. The designed network is then deployed on the NXP i.MX 8M Mini board for image classification. / 2023-10-11
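For reference, a minimal Keras sketch of the classic Inverted Residual block that the thesis replaces: a pointwise expansion, a depthwise convolution, and a pointwise projection with a skip connection when shapes match. The expansion factor of 6 follows MobileNetV2's convention; the thesis' Harmonious Bottleneck is not sketched here.

```python
# Reference sketch of the Inverted Residual block (MobileNetV2 convention).
import tensorflow as tf
from tensorflow.keras import layers

def inverted_residual(x, filters, expansion=6):
    in_channels = x.shape[-1]
    h = layers.Conv2D(in_channels * expansion, 1, padding="same")(x)  # expand
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(6.0)(h)
    h = layers.DepthwiseConv2D(3, padding="same")(h)                  # depthwise
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(6.0)(h)
    h = layers.Conv2D(filters, 1, padding="same")(h)                  # project
    h = layers.BatchNormalization()(h)
    if in_channels == filters:
        h = layers.Add()([x, h])   # identity skip when channel counts match
    return h

inputs = tf.keras.Input(shape=(224, 224, 16))
outputs = inverted_residual(inputs, 16)   # usable inside a functional model
```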
