1 |
Moving Sound Sources Direction of Arrival Classification Using Different Deep Learning Schemes
Rusrus, Jana 19 April 2023
Sound source localization is an important task for several applications, and the use of deep learning for this task has recently become a popular research topic. While the majority of previous work has focused on static sound sources, in this work we evaluate the performance of a deep learning classification system for localizing high-speed moving sound sources. In particular, we systematically evaluate the effect of a wide range of parameters at three levels: data generation (e.g., acoustic conditions), feature extraction (e.g., STFT parameters), and model training (e.g., neural network architectures). We measure performance in terms of precision, recall, F-score, and confusion matrix in a multi-class, multi-label classification framework. We used four deep learning models: feedforward neural networks, recurrent neural networks, gated recurrent networks, and temporal convolutional neural networks. We showed that (1) the presence of some reverberation in the training dataset can help in achieving better detection of the direction of arrival of acoustic sources, (2) window size does not affect performance for static sources but strongly affects performance for moving sources, (3) sequence length has a significant effect on the performance of recurrent neural network architectures, (4) temporal convolutional neural networks can outperform both recurrent and feedforward networks for moving sound sources, (5) training and testing on white noise is easier for the network than training on speech data, and (6) increasing the number of elements in the microphone array improves the performance of direction of arrival estimation.
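As an illustration of the evaluation metrics named in this abstract, the following is a minimal sketch of micro-averaged precision, recall, and F-score for multi-label predictions. The data layout (one row per frame, one 0/1 column per direction-of-arrival class), the function name, and the choice of micro-averaging are assumptions made for this example; the abstract does not state how the thesis aggregates its metrics.

```python
def multilabel_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F-score for multi-label data.

    y_true, y_pred: lists of equal-length 0/1 lists. The layout (one row per
    frame, one column per direction class) is a hypothetical choice.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1  # active direction correctly detected
            elif p == 1 and t == 0:
                fp += 1  # direction predicted but not active
            elif p == 0 and t == 1:
                fn += 1  # active direction missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return precision, recall, f_score


# Three frames, four candidate directions; the first frame has two active sources.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
print(multilabel_prf(y_true, y_pred))
```

In this toy input there are three correct detections, one false alarm, and one miss, so all three scores come out equal.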
|
2 |
Self-supervised učení v aplikacích počítačového vidění / Self-supervised learning in computer vision applications
Vančo, Timotej January 2021
The aim of this diploma thesis is to survey self-supervised learning in computer vision applications, choose a suitable test task with an extensive dataset, apply self-supervised methods, and evaluate them. The theoretical part describes methods in computer vision, gives a detailed description of neural and convolutional networks, and provides an extensive explanation and categorization of self-supervised methods. It concludes with practical applications of self-supervised methods. The practical part describes the creation of code for working with datasets and the application of the SSL methods Rotation, SimCLR, MoCo and BYOL to classification and semantic segmentation. Each application of a method is explained in detail and evaluated for various parameters on the large STL10 dataset. Subsequently, the success of the methods is evaluated on different datasets, and the limiting conditions in the classification task are identified. The practical part concludes with the application of SSL methods to pre-training the encoder for semantic segmentation on the Cityscapes dataset.
|
3 |
Classification of road side material using convolutional neural network and a proposed implementation of the network through Zedboard Zynq 7000 FPGA
Rahman, Tanvir 12 1900
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, Convolutional Neural Networks (CNNs) have become the state-of-the-art method for object detection and classification in the field of machine learning and artificial intelligence. In contrast to a fully connected network, each neuron of a convolutional layer of a CNN is connected to fewer selected neurons from the previous layers, and the kernels of a CNN share the same weights and biases across the same input layer dimension. These features allow CNN architectures to have fewer parameters, which in turn reduces calculation complexity and allows the network to be implemented in low-power hardware. The accuracy of a CNN depends mostly on the number of images used to train the network, which requires a hundred thousand to a million images. Therefore, a reduced training alternative called transfer learning is used, which takes advantage of features from a pre-trained network and applies these features to the new problem of interest. This research has successfully developed a new CNN based on the pre-trained CIFAR-10 network and has used transfer learning on a new problem to classify road edges. Two network sizes were tested: 32 and 16 Neuron inputs with 239 labeled Google Street View images on a single CPU. The result of the training gives 52.8% and 35.2% accuracy respectively for 250 test images. In the second part of the research, a High Level Synthesis (HLS) hardware model of the network with 16 Neuron inputs is created for the Zynq 7000 FPGA. The resulting circuit has 34% average FPGA utilization and 2.47 Watt power consumption. Recommendations to improve the classification accuracy with a deeper network and ways to fit the improved network on the FPGA are also mentioned at the end of the work.
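The parameter saving from weight sharing described in this abstract can be made concrete with a small count. The sketch below compares a convolutional layer with a fully connected layer that produces the same number of outputs. The layer sizes (a 32x32 RGB input, 5x5 kernels, 32 output channels, output kept at 32x32) and the function names are illustrative assumptions, not the architecture used in the thesis.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    # One kernel plus one bias per output channel, shared across all positions.
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def dense_params(n_inputs, n_outputs):
    # Every output has its own weight for every input, plus one bias.
    return (n_inputs + 1) * n_outputs


# Hypothetical layer: 32x32 RGB input, 5x5 kernels, 32 output channels.
conv = conv_params(5, 5, 3, 32)
# Fully connected layer with the same input and output sizes (output 32x32x32).
dense = dense_params(32 * 32 * 3, 32 * 32 * 32)
print(conv, dense)
```

For these assumed sizes the convolutional layer needs 2,432 parameters against roughly 100 million for the fully connected one, which is the kind of reduction that makes low-power hardware targets plausible.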
|