• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • 1
  • 1
  • Tagged with
  • 3
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Moving Sound Sources Direction of Arrival Classification Using Different Deep Learning Schemes

Rusrus, Jana 19 April 2023 (has links)
Sound source localization is an important task for several applications and the use of deep learning for this task has recently become a popular research topic. While the majority of the previous work has focused on static sound sources, in this work we evaluate the performance of a deep learning classification system for localization of high-speed moving sound sources. In particular, we systematically evaluate the effect of a wide range of parameters at three levels including: data generation (e.g., acoustic conditions), feature extraction (e.g., STFT parameters), and model training (e.g., neural network architectures). We evaluate the performance of multiple metrics in terms of precision, recall, F-score and confusion matrix in a multi-class multi-label classification framework. We used four different deep learning models: feedforward neural networks, recurrent neural network, gated recurrent networks and temporal Convolutional neural network. We showed that (1) the presence of some reverberation in the training dataset can help in achieving better detection for the direction of arrival of acoustic sources, (2) window size does not affect the performance of static sources but highly affects the performance of moving sources, (3) sequence length has a significant effect on the performance of recurrent neural network architectures, (4) temporal convolutional neural networks can outperform both recurrent and feedforward networks for moving sound sources, (5) training and testing on white noise is easier for the network than training on speech data, and (6) increasing the number of elements in the microphone array improves the performance of the direction of arrival estimation.
2

Self-supervised učení v aplikacích počítačového vidění / Self-supervised learning in computer vision applications

Vančo, Timotej January 2021 (has links)
The aim of the diploma thesis is to make research of the self-supervised learning in computer vision applications, then to choose a suitable test task with an extensive data set, apply self-supervised methods and evaluate. The theoretical part of the work is focused on the description of methods in computer vision, a detailed description of neural and convolution networks and an extensive explanation and division of self-supervised methods. Conclusion of the theoretical part is devoted to practical applications of the Self-supervised methods in practice. The practical part of the diploma thesis deals with the description of the creation of code for working with datasets and the application of the SSL methods Rotation, SimCLR, MoCo and BYOL in the role of classification and semantic segmentation. Each application of the method is explained in detail and evaluated for various parameters on the large STL10 dataset. Subsequently, the success of the methods is evaluated for different datasets and the limiting conditions in the classification task are named. The practical part concludes with the application of SSL methods for pre-training the encoder in the application of semantic segmentation with the Cityscapes dataset.
3

Classification of road side material using convolutional neural network and a proposed implementation of the network through Zedboard Zynq 7000 FPGA

Rahman, Tanvir 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, Convolutional Neural Networks (CNNs) have become the state-of- the-art method for object detection and classi cation in the eld of machine learning and arti cial intelligence. In contrast to a fully connected network, each neuron of a convolutional layer of a CNN is connected to fewer selected neurons from the previous layers and kernels of a CNN share same weights and biases across the same input layer dimension. These features allow CNN architectures to have fewer parameters which in turn reduces calculation complexity and allows the network to be implemented in low power hardware. The accuracy of a CNN depends mostly on the number of images used to train the network, which requires a hundred thousand to a million images. Therefore, a reduced training alternative called transfer learning is used, which takes advantage of features from a pre-trained network and applies these features to the new problem of interest. This research has successfully developed a new CNN based on the pre-trained CIFAR-10 network and has used transfer learning on a new problem to classify road edges. Two network sizes were tested: 32 and 16 Neuron inputs with 239 labeled Google street view images on a single CPU. The result of the training gives 52.8% and 35.2% accuracy respectively for 250 test images. In the second part of the research, High Level Synthesis (HLS) hardware model of the network with 16 Neuron inputs is created for the Zynq 7000 FPGA. The resulting circuit has 34% average FPGA utilization and 2.47 Watt power consumption. Recommendations to improve the classi cation accuracy with deeper network and ways to t the improved network on the FPGA are also mentioned at the end of the work.

Page generated in 0.043 seconds