Global ETD Search

1	Techniques for Soundscape Retrieval and Synthesis January 2013 abstract: The study of acoustic ecology is concerned with how life interacts with its environment as mediated through sound. As such, a central focus is the soundscape: the acoustic environment as perceived by a listener. This dissertation examines the application of several computational tools from digital signal processing, multimedia information retrieval, and computer music synthesis to the analysis of the soundscape. These tools include a) an open source software library, Sirens, which can be used to segment long environmental field recordings into individual sonic events and to compare these events in terms of acoustic content, b) a graph-based retrieval system that uses these measures of acoustic similarity, together with measures of semantic similarity derived from the lexical database WordNet, to perform both text-based retrieval and automatic annotation of environmental sounds, and c) new techniques for the dynamic, realtime parametric morphing of multiple field recordings, informed by the geographic paths along which they were recorded. / Dissertation/Thesis / Ph.D. Computer Science 2013 Computer science Music Acoustics acoustic ecology environmental sound retrieval soundscape synthesis
2	Classifying Environmental Sounds with Image Networks Boddapati, Venkatesh January 2017 Context. Environmental Sound Recognition, unlike Speech Recognition, is an area still in its developing stages with respect to the use of Deep Learning methods. Sound can be converted into images by extracting spectrograms and similar representations. Object Recognition from images using deep Convolutional Neural Networks is an actively developing area that holds high promise. The same technique has been studied and applied to image representations of sound. Objectives. This study investigates the best achievable accuracy on a sound classification task using existing deep Convolutional Neural Networks by comparing data pre-processing parameters. In addition, a novel method of combining different features into a single image is proposed and its effect tested. Lastly, the performance of an existing network that fuses Convolutional and Recurrent Neural architectures is tested on the selected datasets. Methods. Experiments were conducted to analyze the effects of data pre-processing parameters on the best achievable accuracy with two CNNs. An experiment was also conducted to determine whether the proposed method of feature combination is beneficial. Finally, an experiment to test the performance of a combined network was conducted. Results. GoogLeNet had the highest classification accuracy: 73% on the 50-class dataset and 90-93% on the 10-class datasets. The sampling rate and frame length values that contributed to these high scores are 16 kHz with 40 ms and 8 kHz with 50 ms, respectively. The proposed combination of features does not improve the classification accuracy. The fused CRNN network could not achieve high accuracy on the selected datasets. Conclusions. 
It is concluded that deep networks designed for object recognition can be successfully used to classify environmental sounds, and the pre-processing parameter values that achieve the best accuracy are determined. The novel method of feature combination does not significantly improve the accuracy when compared to spectrograms alone. The fused network, which learns the spatial and temporal features from spectral images, performs poorly in the classification task when compared to the convolutional network alone. Machine Learning Environmental Sound Classification Image Classification. Computer Sciences Datavetenskap (datalogi)
3	Jiná místa / Other places Jáchim, Jan January 2013 A walk through a fictional dreamed space. The audiovisual installation combines surround sound and parallel text commentary. Using the imagination of sound illusion, projected text and the absence of image, it aims to motivate the spectator's dreaming.
4	Compare Accuracy of Alternative Methods for Sound Classification on Environmental Sounds of Similar Characteristics Rudberg, Olov January 2022 Artificial neural networks have in the last decade been a vital tool in image recognition, signal processing and speech recognition. Because these networks are highly flexible, they suit a wide variety of data. This flexibility is much sought after within the field of environmental sound classification. This thesis investigates whether audio from three types of water usage can be distinguished and classified. The usage types investigated are handwashing, showering and WC-flushing. The data originally consisted of sound recordings in WAV format. The recordings were converted into spectrograms, which are visual representations of audio signals. Two neural networks are applied to this image classification problem, namely a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN). Further, the spectrograms are subjected both to image preprocessing using a Sobel filter, a Canny edge detector and a Gabor filter, and to data augmentation by applying different brightness and zooming alterations. The results showed that the CNN outperformed the MLP. Neither the image preprocessing techniques, nor augmentation, nor a combination of them improved model performance. An important finding was that constructing the convolutional and pooling filters of the CNN in rectangular shapes and using every other filter type horizontally and vertically on the input spectrogram gave superior results. It seemed to capture more information from the spectrograms, since spectrograms mainly contain information in a horizontal or vertical direction. This model achieved 91.14% accuracy. The result stemming from this model architecture further contributes to the environmental sound classification community. 
/ Master's thesis approved 20 June 2022. Machine Learning Algorithms Neural Networks Computer Vision Image Recognition Environmental Sound Classification Data Augmentation Probability Theory and Statistics Sannolikhetsteori och statistik Mathematics Matematik Computer Sciences Datavetenskap (datalogi)
5	Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition Johnson, Eric Martin 27 September 2022 No description available. Acoustics Audiology Behavioral Sciences Artificial Intelligence Computer Engineering Health Sciences Communication speech perception time-frequency masking noise reduction hearing impairment environmental sound identification environmental sound recognition masking speech recognition speech intelligibility speech in noise speech enhancement deep learning attention attentive recurrent network deep neural network divided attention acoustics
6	Sensing the environment : development of monitoring aids for persons with profound deafness or deafblindness Ranjbar, Parivash January 2009 Earlier studies of persons with deafness (D) and/or deafblindness (DB) have primarily focused on mobility and communication problems. The purpose of the present study was to develop technology for monitoring aids that improve the ability of persons with D and/or DB to detect, identify, and perceive the direction of events that produce sounds in their surroundings. The purpose was achieved stepwise in four studies. In Study I, the focus was on hearing aids for persons with residual low frequency hearing. In Studies II-IV, the focus was on vibratory aids for persons with total D. In Study I, six signal processing algorithms (calculation methods) based on two principles, transposition and modulation, were developed and evaluated regarding auditory identification of environmental sounds. Twenty persons with normal hearing listened to 45 environmental sounds processed with the six different algorithms and identified them in three experiments. In Exp. 1, the sounds were unknown and the subjects had to identify them freely. In Exp. 2 and 3, the sounds were known and the subjects had to identify them by choosing one of 45 sounds. The transposing algorithms showed better results (median value in Exp. 3, 64%-69%) than the modulating algorithms (40%-52%), and they were good candidates for implementation in a hearing aid for persons with residual low frequency hearing. In Study II, eight algorithms were developed based on three principles (transposition, modulation, and filtration), in addition to No Processing as a reference, and evaluated for vibratory identification of environmental sounds. The transposing algorithms and the modulating algorithms were also adapted to the vibratory thresholds of the skin. 
Nineteen persons with profound D tested the algorithms using a stationary, wideband vibrator and identified the sounds by choosing one of 10 alternatives randomly selected from the list of 45 sounds. One transposing algorithm and two modulating algorithms showed better (p<0.05) scores than did the No Processing method. Two transposing and three modulating algorithms showed better (p<0.05) scores than did the filtering algorithm. Adaptation to the vibratory thresholds of the skin did not improve the vibratory identification results. In Study III, the two transposing algorithms and the three modulating algorithms with the best identification scores in Study II, plus their adapted alternative, were evaluated in a laboratory study. Five persons from Study II with profound D tested the algorithms using a portable narrowband vibrator and identified the sounds by choosing one of 45 sounds in three experiments (Exp. 1, 2, and 3). In Exp. 1, the sounds were pre-processed and directly fed to the vibrator. In Exp. 2 and 3, the sounds were presented in an acoustic test room, without or with background noise (SNR=+5 dB), and processed in real time. Five of the algorithms had acceptable results (27%-41%) in the three experiments and constitute candidates for a miniaturized vibratory aid (VA). The algorithms had the same rank order in both tests in the acoustic room (Exp. 2 and 3), and the noise did not worsen the identification results. In Study IV, the portable vibrotactile monitoring aid (with a stationary processor) for detection, identification and directional perception of environmental sounds was evaluated in a field study. The same five persons with profound D as in Study III tested the aid using a randomly chosen algorithm, drawn from the five with the best results in Study III, in a home and in a traffic environment. 
The persons identified 12 events at home and five events in a traffic environment, first when they were inexperienced (the events were unknown) and later when they were experienced (the events were known). The VA consistently improved detection, identification and directional perception of environmental sounds for all five persons. It is concluded that the selected algorithms improve the ability to detect and identify sound-emitting events. In the future, the algorithms will be implemented in a low frequency hearing aid for persons with low frequency residual hearing, or in a fully portable vibratory monitoring aid for persons with profound D or DB, to improve their ability to sense the environment. Auditive identification Deaf Deafblind Directional perception Environmental sound Filtering Frequency transposing Hearing aid Hearing impairment Modulating Monitoring Tactile perception Vibratory aid Vibratory identification Annan elektroteknik och elektronik Engineering and Technology Teknik och teknologier
