Global ETD Search

1	A computer model of auditory stream segregation Beauvois, Michael W. January 1991 A simple computer model is described that takes a novel approach to the problem of accounting for perceptual coherence among successive pure tones of changing frequency by using simple physiological principles that operate at a peripheral, rather than a central, level. The model is able to reproduce a number of streaming phenomena found in the literature using the same parameter values. These are: (1) the build-up of streaming over time; (2) the temporal coherence and fission boundaries of human listeners; (3) the ambiguous region; and (4) the trill threshold. In addition, the principle of excitation integration used in the model can be used to account for auditory grouping on the basis of the Gestalt perceptual principles of closure, proximity, continuity, and good continuation, as well as the pulsation threshold. The examples of Gestalt auditory grouping accounted for by the excitation integration principle indicate that the predictive power of the model would be considerably enhanced by the addition of a cross-channel grouping mechanism that worked on the basis of common onsets and offsets, as more complex stimuli could then be processed by the model. 534 Sound source separation
2	An investigation of the combustive sound source McNeese, Andrew Reed 23 December 2010 This thesis describes the development and testing of the Combustive Sound Source (CSS), which is a broadband underwater sound source. The CSS is being developed as a clean, safe, and cost-effective replacement for underwater explosive charges, which pose an inherent danger to marine life and to the researchers using them. The basic operation of the CSS is as follows. A combustible mixture of gas is held below the surface of the water in a combustion chamber and ignited with an electric spark. A combustion wave propagates through the mixture and converts the fuel and oxidizer into a bubble of combustion products, which expands due to an increase in temperature and then ultimately collapses to a smaller volume than before ignition, producing a high-intensity, low-frequency acoustic signal. The thesis begins by discussing the background, history, and purpose of developing the CSS. It continues by describing the current apparatus and the essential components and convenient features added to the latest mechanical design. The general operation is discussed along with a description of an experiment conducted to determine the acoustic output and robustness of the current CSS. The results of this experiment are presented in terms of the effect of volume, ignition depth, oxidizing gas, combustion chamber size, and repeatability of acoustic signatures. Apparatus robustness is discussed in order to suggest improvements for future CSS designs. Underwater acoustics Combustive Sound Source CSS Underwater sound source
3	Study and development of a sound source position detection system Κρεμαστιώτης, Ηρακλής, Ρίνη, Χαρίκλεια 30 April 2014 In this Diploma Thesis a position detection system for a sound source on a two-dimensional surface was studied, designed and implemented. The sound source was a loudspeaker placed on this surface, through which audio frequencies were emitted. All the required hardware was designed and constructed; it consists of three microphone boards connected to a main board that handles communication with a computer. Signal processing is carried out by purpose-built software, which displays information on the motion of the sound source in real time. This system can be used to transform a surface (such as a traditional chalkboard) into a low-cost interactive board that can electronically record the data while it is being written. 620.21 Sound source Detection
4	Moving Sound Sources Direction of Arrival Classification Using Different Deep Learning Schemes Rusrus, Jana 19 April 2023 Sound source localization is an important task for several applications and the use of deep learning for this task has recently become a popular research topic. While the majority of the previous work has focused on static sound sources, in this work we evaluate the performance of a deep learning classification system for localization of high-speed moving sound sources. In particular, we systematically evaluate the effect of a wide range of parameters at three levels: data generation (e.g., acoustic conditions), feature extraction (e.g., STFT parameters), and model training (e.g., neural network architectures). We evaluate performance using multiple metrics (precision, recall, F-score and confusion matrix) in a multi-class multi-label classification framework. We used four different deep learning models: feedforward neural networks, recurrent neural networks, gated recurrent networks and temporal convolutional neural networks. We showed that (1) the presence of some reverberation in the training dataset can help in achieving better detection for the direction of arrival of acoustic sources, (2) window size does not affect the performance for static sources but highly affects the performance for moving sources, (3) sequence length has a significant effect on the performance of recurrent neural network architectures, (4) temporal convolutional neural networks can outperform both recurrent and feedforward networks for moving sound sources, (5) training and testing on white noise is easier for the network than training on speech data, and (6) increasing the number of elements in the microphone array improves the performance of the direction of arrival estimation. Direction of arrival Deep learning Sound source localization
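The multi-label evaluation mentioned in entry 4 can be illustrated with a small sketch. It assumes micro-averaging over all direction-of-arrival classes, which the abstract does not specify; the function name `micro_prf` and the toy labels are hypothetical.

```python
def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F-score for multi-label data.

    y_true and y_pred are lists of equal-length 0/1 lists, one row per frame
    and one column per direction-of-arrival class (an assumed layout).
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p and t:
                tp += 1          # class predicted and actually active
            elif p and not t:
                fp += 1          # class predicted but not active
            elif t and not p:
                fn += 1          # active class that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score


# Toy example: 3 frames, 4 direction classes (made-up labels).
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
precision, recall, f_score = micro_prf(y_true, y_pred)
print(precision, recall, f_score)
```

Micro-averaging pools the counts over all classes before dividing; a per-class (macro) average would be an equally plausible reading of the abstract.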
5	Research of noise and vibration analysis for structures involving transfer path and sound source Hilmi Bin Hela Ladin 22 March 2016 This thesis aims to establish new approaches to experimental and analysis techniques so that the vibration and noise of structures can be reduced more reliably, efficiently, and practically than before. To that end, it integrates statistical energy analysis with transfer path analysis, proposes a method in which the inputs of the two can be used interchangeably, and builds experimental and analytical methods that support the reduction of structural vibration and noise. / In this thesis, we have established new theoretical approaches as well as some basic practical applications in the development of noise and vibration analysis for structures involving transfer path and sound source from airborne noise and structure-borne noise. These new approaches were derived from the existing experimental and analysis techniques for noise and vibration in structures, and will improve their efficiency and reliability for noise and vibration reduction in industrial machinery as well as other machines. / Doctor of Philosophy in Engineering / Doshisha University Noise and Vibration Transfer Path Analysis Sound Source
6	Sound source localization system Κεττένης, Χρίστος 16 June 2011 In this diploma thesis, a sound source positioning system was studied, designed and implemented. The sound could be produced, for example, by a user's fingers tapping on a table surface as if typing on a keyboard, or by writing with a pencil on paper or with a piece of chalk on a blackboard. The aim of this application is to transform the surface of a desk or a blackboard into a "cheap" but effective electronic input device, in other words an electronic tablet. Finally, this thesis focuses on the problems that reduce the accuracy of the position estimate, and on the design and construction of the hardware and software that support the system. 620.21 Sound source Sound source monitoring
7	Harmonic Sound Source Separation in Monaural Music Signals Goel, Priyank January 2013 Sound source separation refers to separating sound signals according to their sources from a given observed sound. It is more efficient to code, and much easier to analyze and manipulate, sounds from individual sources separately than in a mixture. This thesis deals with the problem of source separation in monaural recordings of harmonic musical instruments. A good amount of literature is surveyed and presented, since sound source separation has been attempted by many researchers over many decades through various approaches. A prediction-driven approach is first presented, inspired by the old-plus-new heuristic used by humans for Auditory Scene Analysis. In this approach, the signals from different sources are predicted using a general model, and these predictions are then reconciled with the observed sound to obtain the separated signal. This approach failed for real-world sound recordings in which the spectrum of the source signals changes very dynamically. Considering the dynamic nature of the spectra, an approach that uses the covariance matrix of the amplitudes of harmonics is proposed. The overlapping and non-overlapping harmonics of the notes are first identified using knowledge of the pitch of the notes. The notes are matched on the basis of their covariance profiles. The second-order properties of the overlapping harmonics of a note are estimated using the covariance matrix of a matching note. The full harmonic is then reconstructed using these second-order characteristics. The technique has performed well on sound samples taken from the RWC Musical Instrument database. Sound Source Separation Harmonic Musical Instruments Harmonic Sound Source Separation Monaural Music Signals Sinusoidal Modeling Monaural Sound Source Separation Auditory Scene Analysis Monaural Musical Recordings Communication Engineering
8	Informed algorithms for sound source separation in enclosed reverberant environments Khan, Muhammad Salman January 2013 While humans can separate a sound of interest amidst a cacophony of contending sounds in an echoic environment, machine-based methods lag behind in solving this task. This thesis thus aims at improving the performance of audio separation algorithms when they are informed, i.e., have access to source location information. These locations are assumed to be known a priori in this work, for example by video processing. Initially, a multi-microphone array based method combined with binary time-frequency masking is proposed. A robust least squares frequency invariant data independent beamformer designed with the location information is utilized to estimate the sources. To further enhance the estimated sources, binary time-frequency masking based post-processing is used, but cepstral domain smoothing is required to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone based method which is inspired by human auditory processing and generates soft time-frequency masks is described. In this approach interaural level difference, interaural phase difference and mixing vectors are probabilistically modeled in the time-frequency domain and the model parameters are learned through the expectation-maximization (EM) algorithm. A direction vector is estimated for each source, using the location information, which is used as the mean parameter of the mixing vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then integrated into the probabilistic model framework that encodes the spatial characteristics of the enclosure and further improves the separation performance in challenging scenarios, i.e., when sources are in close proximity and when the level of reverberation is high.
Finally, new dereverberation based pre-processing is proposed based on the cascade of three dereverberation stages where each enhances the two-microphone reverberant mixture. The dereverberation stages are based on amplitude spectral subtraction, where the late reverberation is estimated and suppressed. The combination of such dereverberation based pre-processing and use of soft mask separation yields the best separation performance. All methods are evaluated with real and synthetic mixtures formed, for example, from speech signals from the TIMIT database and measured room impulse responses. 621.382
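The binary time-frequency masking step named in entry 8 can be sketched as follows. This is only an illustration under stated assumptions: each source estimate is a magnitude spectrogram stored as nested lists (frames by frequency bins), each bin is assigned to whichever estimate is largest there, and the cepstral smoothing the abstract mentions is left out. The names `binary_masks` and `apply_mask` are hypothetical.

```python
def binary_masks(estimates):
    """Build one 0/1 mask per source from magnitude spectrograms.

    estimates: list of spectrograms, one per source; each spectrogram is a
    list of frames and each frame a list of bin magnitudes (assumed layout).
    A bin is set to 1 in the mask of the source with the largest magnitude.
    """
    n_src = len(estimates)
    masks = [[[0] * len(frame) for frame in estimates[0]] for _ in range(n_src)]
    for t, frame in enumerate(estimates[0]):
        for f in range(len(frame)):
            winner = max(range(n_src), key=lambda s: estimates[s][t][f])
            masks[winner][t][f] = 1
    return masks


def apply_mask(spectrogram, mask):
    """Keep only the bins that the mask assigns to this source."""
    return [[v * m for v, m in zip(frame, mask_frame)]
            for frame, mask_frame in zip(spectrogram, mask)]


# Toy example: two source estimates, 2 frames x 2 bins (made-up values).
est_a = [[0.9, 0.1], [0.2, 0.8]]
est_b = [[0.3, 0.5], [0.6, 0.1]]
mask_a, mask_b = binary_masks([est_a, est_b])
enhanced_a = apply_mask(est_a, mask_a)
print(mask_a, mask_b, enhanced_a)
```

Hard 0/1 decisions like these are what produce the musical noise the abstract refers to; the soft masks described later in the same entry would replace the winner-takes-all rule with per-bin probabilities.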
9	Multichannel audio processing for speaker localization, separation and enhancement Martí Guerola, Amparo 29 October 2013 This thesis is related to the field of acoustic signal processing and its applications to emerging communication environments. Acoustic signal processing is a very wide research area covering the design of signal processing algorithms involving one or several acoustic signals to perform a given task, such as locating the sound source that originated the acquired signals, improving their signal-to-noise ratio, separating signals of interest from a set of interfering sources or recognizing the type of source and the content of the message. Among the above tasks, Sound Source Localization (SSL) and Automatic Speech Recognition (ASR) have been specifically addressed in this thesis. In fact, the localization of sound sources in a room has received a lot of attention in the last decades. Most real-world microphone array applications require the localization of one or more active sound sources in adverse environments (low signal-to-noise ratio and high reverberation). Some of these applications are teleconferencing systems, video-gaming, autonomous robots, remote surveillance, hands-free speech acquisition, etc. Indeed, performing robust sound source localization under high noise and reverberation is a very challenging task. One of the most well-known algorithms for source localization in noisy and reverberant environments is the Steered Response Power - Phase Transform (SRP-PHAT) algorithm, which constitutes the baseline framework for the contributions proposed in this thesis. Another challenge in the design of SSL algorithms is to achieve real-time performance and high localization accuracy with a reasonable number of microphones and limited computational resources.
Although the SRP-PHAT algorithm has been shown to be an effective localization algorithm for real-world environments, its practical implementation is usually based on a costly fine grid-search procedure, making the computational cost of the method a real issue. In this context, several modifications and optimizations have been proposed to improve its performance and applicability. An effective strategy that extends the conventional SRP-PHAT functional is presented in this thesis. This approach performs a full exploration of the sampled space rather than computing the SRP at discrete spatial positions, increasing its robustness and allowing for a coarser spatial grid that reduces the computational cost required in a practical implementation with a small hardware cost (reduced number of microphones). This strategy makes it possible to implement real-time applications based on location information, such as automatic camera steering or the detection of speech/non-speech fragments in advanced videoconferencing systems. As stated before, besides the contributions related to SSL, this thesis is also related to the field of ASR. This technology allows a computer or electronic device to identify the words spoken by a person so that the message can be stored or processed in a useful way. ASR is used on a day-to-day basis in a number of applications and services such as natural human-machine interfaces, dictation systems, electronic translators and automatic information desks. However, there are still some challenges to be solved. A major problem in ASR is to recognize people speaking in a room by using distant microphones. In distant-speech recognition, the microphone receives not only the direct path signal but also delayed replicas as a result of multi-path propagation. Moreover, there are many situations in teleconferencing meetings when multiple speakers talk simultaneously.
In this context, when multiple speaker signals are present, Sound Source Separation (SSS) methods can be successfully employed to improve ASR performance in multi-source scenarios. This is the motivation behind the training method for multiple-talker situations proposed in this thesis. This training, which is based on a robust transformed model constructed from separated speech in diverse acoustic environments, makes use of an SSS method as a speech enhancement stage that suppresses the unwanted interferences. The combination of source separation and this specific training has been explored and evaluated under different acoustical conditions, leading to improvements of up to 35% in ASR performance. / Martí Guerola, A. (2013). Multichannel audio processing for speaker localization, separation and enhancement [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/33101 Sound source localization Sound source separation SRP-PHAT Microphone array Speaker detection Automatic speech recognition Signal theory and communications
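The conventional SRP-PHAT grid search that entry 9 takes as its baseline can be sketched as follows. This is a toy illustration, not the thesis's method: it assumes a 2-D room, free-field propagation, delays rounded to whole samples, circular signals, and a naive DFT so that only the standard library is needed. All names (`gcc_phat`, `delay_samples`, `srp_phat`) and the geometry are hypothetical.

```python
import cmath
import math
import random
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def gcc_phat(a, b):
    """Circular cross-correlation of a and b with PHAT weighting.

    Index m of the result is the lag (mod n) by which a trails b.
    """
    cross = [x * y.conjugate() for x, y in zip(dft(a), dft(b))]
    white = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    return [v.real for v in dft(white, inverse=True)]


def delay_samples(point, mic, fs):
    """Propagation delay from point to mic, rounded to whole samples."""
    return round(math.dist(point, mic) / SPEED_OF_SOUND * fs)


def srp_phat(signals, mics, grid, fs):
    """Return the grid point with the largest steered response power."""
    n = len(signals[0])
    pairs = list(combinations(range(len(mics)), 2))
    gcc = {(i, j): gcc_phat(signals[i], signals[j]) for i, j in pairs}
    best_point, best_score = None, -math.inf
    for p in grid:
        d = [delay_samples(p, m, fs) for m in mics]
        # Sum each pair's GCC-PHAT value at the lag this point would cause.
        score = sum(gcc[(i, j)][(d[i] - d[j]) % n] for i, j in pairs)
        if score > best_score:
            best_point, best_score = p, score
    return best_point, best_score


# Simulated scene: three microphones, one noise source on the search grid.
random.seed(0)
fs, n = 8000, 128
mics = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
source = (0.6, 0.3)
noise = [random.gauss(0, 1) for _ in range(n)]
signals = []
for m in mics:
    d = delay_samples(source, m, fs)
    signals.append([noise[(t - d) % n] for t in range(n)])
grid = [(x / 10, y / 10) for x in range(11) for y in range(11)]
est, score = srp_phat(signals, mics, grid, fs)
print(est, score)
```

The cost of the loop grows with the number of grid points times the number of microphone pairs, which is the expense the abstract attributes to fine grid searches. Because each point is scored only at its own rounded lags, a coarse grid can miss the peak; the thesis's extension addresses this by exploring the sampled space more fully, which this sketch does not attempt.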
10	Kinect together with SRP-PHAT as a sound source localization solution Seewald, Lucas Adams 28 March 2014 Previous issue date: 2014-01-31 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / PROSUP - Programa de Suporte à Pós-Gradução de Instituições de Ensino Particulares
/ This document presents an evaluation of Kinect together with SRP-PHAT as a Sound Source Localization solution. A functional prototype able to communicate with the device and perform SRP-PHAT was implemented in order to test the solution’s accuracy. The fundamentals of Sound Source Localization and its mathematical principles are reviewed, focusing specifically on SRP-PHAT. Moving on to the Kinect device, some considerations are made about its components and limitations. Related work that relies on the Kinect’s source localization capabilities is presented, followed by SRP-PHAT precision test results obtained by different authors. Two sets of experiments were conducted, one focused on the properties of the source signal and the other on measuring the quality of the proposed solution. The experiments cover two-dimensional and three-dimensional localization, with a second Kinect required for the latter. Implementation aspects of the software responsible for controlling both Kinects and executing the localization algorithm are described along with details of the experimental procedure. The results show that the proposed solution can accurately point in the direction of the source. SRP-PHAT Kinect Sound source localization