Global ETD Search

31	Redução adaptativa de eco e de ruído para terminais viva-voz. / Speech enhancement and acoustic echo cancellation for hands-free sets. André Horácio Camargo Carezia 09 August 2002 There is currently strong interest in developing hands-free devices that give the participants in a remote conversation a good level of naturalness and intelligibility. The goal of this work is to present a solution to two well-known problems that arise when designing a hands-free device for use in automobiles: (1) the acoustic echo caused by coupling between the device's microphone and loudspeaker, and (2) the background noise generated, for example, by wind, tires and the vehicle's engine. The proposed solution uses adaptive filters and modifications to the speech signal spectrum to minimize both problems. Theoretical issues are treated briefly, though without omitting any important detail. A practical and efficient implementation of the algorithms on a modern digital signal processor is one of the highlights of the work.
adaptive algorithms; acoustic echo cancellation; noise suppression for speech signals; adaptive filters; speech enhancement
32	Improved speech communication in a car / Förbättrad komunikation i bil Nygren, Mårten January 2003 In modern cars, a lot of effort is put into reducing the background noise level. Despite these efforts, it is often difficult for people in the rear seat(s) to hear the people in the front seat. This is partly due to the background noise, but also to the geometry and acoustic properties of the passenger compartment. The aim of this thesis was to implement a speech enhancement system that increases audibility between the driver and the rear passenger(s). The system should not affect the directivity of the speech or increase the background noise level. A speech enhancement system was implemented on a DSP in a test car. A microphone was placed in front of the driver to pick up his or her speech. The microphone signal was bandpass filtered to remove the main part of the background noise and to avoid aliasing. The signal was then delayed before being sent to the rear loudspeaker, so that the driver's direct speech reached the rear passenger before the sound from the rear loudspeakers. This delay was enough to give the right directivity, i.e. to make the speech sound as if it came from the driver rather than from the rear loudspeakers. Other methods of reducing background noise and preserving directivity were evaluated in the thesis but not implemented in the test car. The evaluations showed that audibility increased while the background noise level did not increase noticeably. The work was performed at A2 Acoustics AB in Linköping during spring 2003.
Automobile; Speech enhancement; Microphone array; Communication; Automatic control
33	Single-Channel Multiple Regression for In-Car Speech Enhancement ITAKURA, Fumitada; TAKEDA, Kazuya; ITOU, Katsunobu; LI, Weifeng 01 March 2006 No description available.
K-means clustering; environmental adaptation; pairwise preference test; mean opinion score; multi-layer perceptron; speech recognition; speech enhancement
34	Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement ITAKURA, Fumitada; TAKEDA, Kazuya; DAT, Tran Huy 01 March 2006 No description available.
log-spectral magnitude; power spectral magnitude; MAP; MMSE; fourth-order moment; gamma modeling; speech recognition; speech enhancement
35	Improved speech communication in a car / Förbättrad komunikation i bil Nygren, Mårten January 2003 In modern cars, a lot of effort is put into reducing the background noise level. Despite these efforts, it is often difficult for people in the rear seat(s) to hear the people in the front seat. This is partly due to the background noise, but also to the geometry and acoustic properties of the passenger compartment. The aim of this thesis was to implement a speech enhancement system that increases audibility between the driver and the rear passenger(s). The system should not affect the directivity of the speech or increase the background noise level. A speech enhancement system was implemented on a DSP in a test car. A microphone was placed in front of the driver to pick up his or her speech. The microphone signal was bandpass filtered to remove the main part of the background noise and to avoid aliasing. The signal was then delayed before being sent to the rear loudspeaker, so that the driver's direct speech reached the rear passenger before the sound from the rear loudspeakers. This delay was enough to give the right directivity, i.e. to make the speech sound as if it came from the driver rather than from the rear loudspeakers. Other methods of reducing background noise and preserving directivity were evaluated in the thesis but not implemented in the test car. The evaluations showed that audibility increased while the background noise level did not increase noticeably. The work was performed at A2 Acoustics AB in Linköping during spring 2003.
Automobile; Speech enhancement; Microphone array; Communication; Automatic control
36	Speech Enhancement Utilizing Phase Continuity Between Consecutive Analysis Windows Mehmetcik, Erdal 01 September 2011 It is commonly accepted that noise induced on the DFT phase spectrum has a negligible effect on speech intelligibility for short analysis windows, as early intelligibility studies pointed out and recent studies confirm. For this reason, classical speech enhancement algorithms modify only the DFT magnitude spectrum and leave the DFT phase spectrum unchanged. However, recent studies also indicate that these classical algorithms cannot improve the intelligibility scores of noise-degraded speech. In other words, classical enhancement methods cannot increase the information contained in a noise-degraded signal; they can only improve the ease of listening, i.e. quality. Improving the speech quality achieved by the classical methods therefore requires studying the effect of the DFT phase on quality, so that both DFT magnitude and DFT phase can be exploited. In this work, the contribution of the DFT phase to speech quality is investigated through simulations using an objective quality assessment criterion. These simulations show that the phase spectrum has a significant effect on speech quality for short analysis windows, and that the phase values of low-frequency components contribute most to the quality improvement. Motivated by these results, a new enhancement method is proposed that modifies the phase of certain low-frequency components as well as the magnitude spectrum. The proposed algorithm is implemented in MATLAB. The results indicate that the proposed system improves on the classical methods in terms of speech quality.
37	Probabilistic space maps for speech with applications Kalgaonkar, Kaustubh 22 August 2011 The objective of the proposed research is to develop a probabilistic model of speech production that exploits the multiplicity of mappings between vocal tract area functions (VTAF) and speech spectra. Two thrusts are developed. The first investigates a latent variable model that captures the uncertainty in estimating the VTAF from speech data and uses this uncertainty to generate a many-to-one mapping between observations of the VTAF and speech spectra. The second uses the probabilistic model of speech production to improve the performance of traditional speech algorithms such as enhancement and acoustic model adaptation. In this thesis, we propose to model the process of speech production with a probability map, treating it as a probabilistic process with a many-to-one mapping between VTAF and speech spectra. The thesis not only outlines a statistical framework for generating and training these probabilistic models from speech, but also demonstrates their power and flexibility in applications such as enhancing speech from both perceptual and recognition perspectives.
Automatic bandwidth expansion; Probabilistic space maps; Statistical models; Acoustic model adaptation; Speech enhancement; Speech perception; Speech processing systems
38	Nie-destruktiewe klankonttrekking, restourasie en spraakverheffing van Edison-fonograafsilinders Van der Westhuizen, Ewald 12 1900 Thesis (MScEng)--University of Stellenbosch, 2003. Two non-destructive methods of extracting audio from Edison phonograph cylinders were investigated. A recording device that positions the cylinders with high accuracy was designed and manufactured. A microscopic image method was investigated first. Surface images of the cylinder were obtained using a web camera, and an audio signal was extracted from the width modulation visible in the images. The results were unsatisfactory: echoes caused by intergroove modulation were perceptible and the audio lacked resolution. The true modulation of the audio is embedded not in the width but in the depth of the groove. The second audio extraction method used a modified laser pick-up from a compact disc player to measure the depth of the groove. Three laser recording methods were investigated. The first, forward recording, measured the depth modulation in the recording direction of the groove. The second, backward recording, was identical to forward recording except that the mechanical system moved in reverse. Four recordings from different positions across the groove width were combined to create an audio signal, and this combination showed a substantial improvement in the signal-to-noise ratio. This led to the third method, transverse recording, which measured the whole depth profile of the groove; the groove profile was then processed into an audio signal. A manual audio restoration program was written to replace visible sections of distorted data with better interpolations. Two speech enhancement methods were investigated: Short-Time Spectral Attenuation (STSA), the most commonly used speech enhancement method for digital audio restoration, and a method based on linear predictive coefficient (LPC) estimation of short-time frames. The two methods were evaluated by means of listening tests. The LPC enhancement method was preferred because it enhanced the intelligibility of the speech.
Phonocylinders; Sound recordings; Sound -- Recording and reproducing; Speech enhancement; Audio restoration; Audio extraction; Dissertations -- Electronic engineering; Theses -- Electronic engineering
39	Odstraňování šumu pomocí neuronových sítí s cyklickou konzistencí / Speech Enhancement with Cycle-Consistent Neural Networks Karlík, Pavol January 2020 Deep neural networks are commonly used for noise removal. The training process of a denoising network can be extended with a second neural network whose goal is to insert noise into a clean speech recording. Together, the two networks can be used to reconstruct the original clean and noisy recordings. This thesis examines the effectiveness of this technique, called cycle consistency. Cycle consistency improves the robustness of the denoising network without modifying the network itself, because it exposes the network to a more diverse set of noisy data. However, the technique requires training data consisting of pairs of input and reference recordings, and such data are not always available. To train models on unpaired data, we use generative neural networks with cycle consistency. In this thesis we carried out a large number of experiments with models trained on paired and unpaired data. Our results show that using cycle consistency significantly improves model performance.
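40	Robust Audio Scene Analysis for Rescue Robots / レスキューロボットのための頑健な音環境理解 Bando, Yoshiaki 26 March 2018 Kyoto University / 0048 / New-system course doctorate / Doctor of Informatics / Kō No. 21209 / Jōhaku No. 662 / 新制||情||114 (Main Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Tatsuya Kawahara, Professor Hisashi Kashima, Professor Toshiyuki Tanaka, Senior Lecturer Kazuyoshi Yoshii / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DFAM
Audio signal processing; Multi-modal signal processing; Rescue robotics; Speech enhancement; Posture estimation; Hose-shaped rescue robot; 007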
