Global ETD Search

1	Informed algorithms for sound source separation in enclosed reverberant environments Khan, Muhammad Salman January 2013 (has links) While humans can separate a sound of interest amidst a cacophony of contending sounds in an echoic environment, machine-based methods lag behind in solving this task. This thesis thus aims at improving performance of audio separation algorithms when they are informed i.e. have access to source location information. These locations are assumed to be known a priori in this work, for example by video processing. Initially, a multi-microphone array based method combined with binary time-frequency masking is proposed. A robust least squares frequency invariant data independent beamformer designed with the location information is utilized to estimate the sources. To further enhance the estimated sources, binary time-frequency masking based post-processing is used but cepstral domain smoothing is required to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone based method which is inspired by human auditory processing and generates soft time-frequency masks is described. In this approach interaural level difference, interaural phase difference and mixing vectors are probabilistically modeled in the time-frequency domain and the model parameters are learned through the expectation-maximization (EM) algorithm. A direction vector is estimated for each source, using the location information, which is used as the mean parameter of the mixing vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then integrated into the probabilistic model framework that encodes the spatial characteristics of the enclosure and further improves the separation performance in challenging scenarios i.e. when sources are in close proximity and when the level of reverberation is high. 
Finally, a new dereverberation-based pre-processing scheme is proposed, built from a cascade of three dereverberation stages, each of which enhances the two-microphone reverberant mixture. The dereverberation stages are based on amplitude spectral subtraction, in which the late reverberation is estimated and suppressed. Combining this dereverberation-based pre-processing with soft-mask separation yields the best separation performance. All methods are evaluated with real and synthetic mixtures formed, for example, from speech signals from the TIMIT database and measured room impulse responses. 621.382
2	Time-Frequency Masking Performance for Improved Intelligibility with Microphone Arrays Morgan, Joshua P. 01 January 2017 (has links) Time-Frequency (TF) masking is an audio processing technique useful for isolating an audio source from interfering sources. TF masking has been applied and studied in monaural and binaural applications, but has only recently been applied to distributed microphone arrays. This work focuses on evaluating the TF masking technique's ability to isolate human speech and improve speech intelligibility in an immersive "cocktail party" environment. In particular, an upper-bound on TF masking performance is established and compared to the traditional delay-sum and general sidelobe canceler (GSC) beamformers. Additionally, the novel technique of combining the GSC with TF masking is investigated and its performance evaluated. This work presents a resource-efficient method for studying the performance of these isolation techniques and evaluates their performance using both virtually simulated data and data recorded in a real-life acoustical environment. Further, methods are presented to analyze speech intelligibility post-processing, and automated objective intelligibility measurements are applied alongside informal subjective assessments to evaluate the performance of these processing techniques. Finally, the causes for subjective/objective intelligibility measurement disagreements are discussed, and it was shown that TF masking did enhance intelligibility beyond delay-sum beamforming and that the utilization of adaptive beamforming can be beneficial. Distributed Microphones Cocktail Party Time-Frequency Masking Beamforming Adaptive Beamforming Intelligibili Signal Processing
3	Implementation and Evaluation of Single Filter Frequency Masking Narrow-Band High-Speed Recursive Digital Filters / Implementering och utvärdering av smalbandiga rekursiva digitala frekvensmaskningsfilter för hög hastighet med identiska subfilter Mohsén, Mikael January 2003 (has links) In this thesis, two versions of a single filter frequency masking narrow-band high-speed recursive digital filter structure, proposed in [1], have been implemented and evaluated with respect to maximal clock frequency, maximal sample frequency and power consumption. The structures were compared to a conventional filter structure, which was also implemented. The aim was to see whether the proposed structure had benefits when implemented and synthesized, not only in theory. Standard cells from the AMS csx 0.35 µm CMOS technology were used for the synthesis. Electronics digital filters frequency masking high-speed low-power voltage scaling carry-save Elektronik Electronics Elektronik
4	Auditory Front-Ends for Noise-Robust Automatic Speech Recognition Yeh, Ja-Zang 25 August 2010 (has links) The human auditory perception system is much more noise-robust than any state-of-the-art automatic speech recognition (ASR) system. The noise robustness of speech features is therefore expected to improve when the feature extraction procedure is modeled on human hearing. In this thesis, we investigate modifying the commonly used feature extraction process for ASR systems. A novel frequency masking curve, based on modeling the basilar membrane as a cascade of damped simple harmonic oscillators, replaces the critical-band masking curve in computing the masking threshold. We mathematically analyze the coupled motion of the oscillator system (the basilar membrane) when it is driven by short-time stationary (speech) signals. From this analysis we derive the relation between the amplitudes of neighboring oscillators and accordingly insert a masking module in the front-end signal processing stage to modify the speech spectrum. We evaluate the proposed method on the Aurora 2.0 noisy-digit speech database. When combined with the commonly used cepstral mean subtraction (CMS) post-processing, the proposed auditory front-end module achieves a significant improvement: the correlational masking effect curve combined with CMS achieves a relative improvement of 25.9% over the baseline. Applying the method iteratively raises the relative improvement from 25.9% to 30.3%. frequency masking front end processing feature extraction noise-robust speech recognition
6	Deep learning methods for reverberant and noisy speech enhancement Zhao, Yan 15 September 2020 (has links) No description available. Computer Science Engineering Deep neural networks Supervised learning Attention Speech enhancement Speech denoising Speech dereverberation Time-frequency masking Speech intelligibility Speech quality Computational auditory scene analysis
7	Supervised Speech Separation Using Deep Neural Networks Wang, Yuxuan 21 May 2015 (has links) No description available. Computer Science Engineering Speech separation time-frequency masking computational auditory scene analysis acoustic features deep neural networks training targets generalization speech intelligibility speech quality
8	Variable acoustics in multi-functional stadiums / Variabel akustik i multi-funktionsarenor Vernersson, Felix January 2022 (has links) This paper covers the background theory, methods and results of the master's thesis project titled "Variable acoustics in multi-functional stadiums". The purpose of the project was to investigate whether variable room acoustics could be applied to large multi-functional stadiums to improve their ability to adapt the soundscape to different types of events. The two event types analyzed were electrically amplified concerts and ice hockey matches. The paper first reviews relevant acoustical and psychoacoustical parameters and concepts and gives a few examples of existing multi-functional stadiums, including their acoustical strengths and weaknesses for the two types of events. The report concludes that reflections are of the utmost importance for both types of events, especially early-arriving reflections of large magnitude. At concerts these should be suppressed, while at hockey matches the early reflections should be amplified and increased in number to give the crowd better feedback from the stadium, making it easier for supporters to create a loud and intense atmosphere. Galon fabric, aluminum and plexiglass were tested in the MWL laboratory to assess their reflective capabilities, since the idea was to use these materials as reflectors during hockey matches. The results showed close to full reflection across the entire spectrum for aluminum and plexiglass, while the galon fabric reflected the higher frequencies well but let the lower frequencies pass through the material. The effect of the reflectors on the soundscape was simulated with the simulation software ODEON in a fictional stadium built in the modelling software SketchUp.
The results were promising: the reflectors produced a large increase in early reflections, in both quantity and quality. For concerts, the reflectors were removed and sound-absorbing roller curtains, which are easy to mount and remove, were added along the stadium walls. This configuration also performed well in the simulations, with the early reflections now markedly suppressed rather than amplified. Variable acoustics Reflections Gain Reverberation Clarity Frequency masking Perception of reflected sounds Variabel akustik Reflektioner Rumsförstärkning Efterklang Klarhet Frekvensmaskering Uppfattning av reflekterade ljud Vehicle Engineering Farkostteknik
9	Nedourčená slepá separace zvukových signálů / Underdetermined Blind Audio Signal Separation Čermák, Jan January 2008 (has links) Several signals are often mixed together in an unknown environment, and they must first be extracted from the mixture in order to be interpreted correctly. In the signal processing community this problem is called blind source separation. This dissertation deals with multi-channel separation of audio signals in a real environment when the source signals outnumber the sensors. The first part of the thesis gives an introduction to blind source separation, and the current state of separation methods is then analyzed. Based on this knowledge, separation systems implementing a fuzzy time-frequency mask are introduced. However, these methods still introduce nonlinear changes in the signal spectra, which can result in musical noise. To reduce musical noise, novel methods combining binary time-frequency masking and beamforming are introduced. The new separation system performs linear spatial filtering even when the source signals outnumber the sensors. Finally, the separation systems are evaluated by objective and subjective tests in the last part of the thesis.
10	Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition Johnson, Eric Martin 27 September 2022 (has links) No description available. Acoustics Audiology Behavioral Sciences Artificial Intelligence Computer Engineering Health Sciences Communication speech perception time-frequency masking noise reduction hearing impairment environmental sound identification environmental sound recognition masking speech recognition speech intelligibility speech in noise speech enhancement deep learning attention attentive recurrent network deep neural network divided attention acoustics
