
Compare Accuracy of Alternative Methods for Sound Classification on Environmental Sounds of Similar Characteristics

Over the last decade, artificial neural networks have become a vital tool in image recognition, signal processing and speech recognition. Because these networks are highly flexible, they suit a wide variety of data. This flexibility is highly sought after in the field of environmental sound classification. This thesis investigates whether audio from three types of water usage can be distinguished and classified. The usage types investigated are handwashing, showering and WC flushing. The data originally consisted of sound recordings in WAV format. The recordings were converted into spectrograms, which are visual representations of audio signals. Two neural networks are applied to this image classification problem, namely a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN). Further, the spectrograms are subjected to image preprocessing using a Sobel filter, a Canny edge detector and a Gabor filter, and to data augmentation by applying different brightness and zoom alterations. The results showed that the CNN outperformed the MLP. Neither the image preprocessing techniques, nor augmentation, nor a combination of the two improved model performance. An important finding was that shaping the convolutional and pooling filters of the CNN as rectangles, and alternating between horizontally and vertically oriented filters on the input spectrogram, gave superior results. This design seemed to capture more of the information in the spectrograms, since spectrograms mainly contain information in a horizontal or vertical direction. This model achieved 91.14% accuracy. The results from this model architecture contribute further to the environmental sound classification community.

Master's thesis approved 20 June 2022.
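The point about rectangular filters can be illustrated with a toy example. The sketch below is not the thesis's architecture: the 1x3 and 3x1 kernel sizes, the 5x5 toy spectrograms and the helper names `correlate2d_valid` and `peak` are all assumptions made for illustration, since this record does not state the actual filter dimensions. It only shows why a wide kernel responds more strongly to a sustained tone (a horizontal line in a spectrogram) and a tall kernel to a short broadband click (a vertical line).

```python
def correlate2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation on nested lists.

    This is the sliding-window sum a convolutional layer computes
    for a single filter, without padding, stride or bias.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def peak(feature_map):
    """Largest response anywhere in the feature map."""
    return max(max(row) for row in feature_map)


# Toy spectrograms: rows are frequency bins, columns are time frames.
SIZE = 5
# Sustained tone: energy in one frequency bin across all frames.
tone = [[1 if r == 2 else 0 for c in range(SIZE)] for r in range(SIZE)]
# Broadband click: energy in all frequency bins during one frame.
click = [[1 if c == 2 else 0 for c in range(SIZE)] for r in range(SIZE)]

# Hypothetical rectangular kernels (sizes are illustrative assumptions).
WIDE = [[1, 1, 1]]        # 1x3, extends along the time axis
TALL = [[1], [1], [1]]    # 3x1, extends along the frequency axis

if __name__ == "__main__":
    print("wide kernel on tone: ", peak(correlate2d_valid(tone, WIDE)))
    print("wide kernel on click:", peak(correlate2d_valid(click, WIDE)))
    print("tall kernel on tone: ", peak(correlate2d_valid(tone, TALL)))
    print("tall kernel on click:", peak(correlate2d_valid(click, TALL)))
```

In this toy setting each kernel reaches its peak response of 3 only on the structure aligned with its long side and a response of 1 on the other, which is one plausible reading of why alternating orientations could help on spectrograms.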

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-208203
Date: January 2022
Creators: Rudberg, Olov
Publisher: Stockholms universitet, Statistiska institutionen
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
