  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Improved speech communication in a car / Förbättrad komunikation i bil

Nygren, Mårten January 2003
In modern cars a lot of effort is put into reducing the background noise level. Despite these efforts it is often difficult for persons in the rear seat(s) to hear the persons in the front seat. This is partly due to the background noise, but also to the geometry and acoustic properties of the passenger compartment. The aim of this thesis was to implement a speech enhancement system to increase the audibility between the driver and the rear passenger(s). The speech enhancement system should not affect the directivity of the speech or increase the background noise level. A speech enhancement system has been implemented on a DSP in a test car. A microphone was placed in front of the driver to collect his/her speech. The microphone signal was bandpass filtered to remove the main part of the background noise and to avoid aliasing. The signal was delayed before it was sent out through the rear loudspeakers. The delay made the speech from the driver reach the rear passenger before the sound from the rear loudspeakers. This delay was enough to get the right directivity of the sound, i.e. to make the speech sound as if it came from the driver instead of the rear loudspeakers. In the thesis other methods to reduce background noise and get directivity of the sound were evaluated, but not implemented in the test car. The evaluations of the system showed that the audibility was increased. At the same time the background noise level was not noticeably increased. The work was performed at A2 Acoustics AB in Linköping, during spring 2003.
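The delay step described in the abstract above can be illustrated with a minimal sketch. It is not the thesis's DSP implementation: the function names, the path lengths, the sample rate, and the speed-of-sound constant are all assumptions chosen for illustration.

```python
import math

# Assumed constant: approximate speed of sound in air at room temperature.
SPEED_OF_SOUND_M_S = 343.0


def min_delay_samples(direct_path_m, speaker_path_m, sample_rate_hz, margin_s=0.0):
    """Smallest whole-sample delay so the loudspeaker sound arrives after the direct speech.

    direct_path_m:  hypothetical distance from the driver's mouth to the listener.
    speaker_path_m: hypothetical distance from the rear loudspeaker to the listener.
    margin_s:       optional extra lead time for the direct sound.
    """
    direct_s = direct_path_m / SPEED_OF_SOUND_M_S
    speaker_s = speaker_path_m / SPEED_OF_SOUND_M_S
    needed_s = max(0.0, direct_s - speaker_s) + margin_s
    return math.ceil(needed_s * sample_rate_hz)


def delay_signal(samples, n):
    """Delay a block of samples by n samples, zero-padding the start and keeping the length."""
    if n <= 0:
        return list(samples)
    return ([0.0] * n + list(samples))[:len(samples)]


if __name__ == "__main__":
    n = min_delay_samples(direct_path_m=1.5, speaker_path_m=0.5, sample_rate_hz=8000)
    print(n)
    print(delay_signal([1, 2, 3, 4], 2))
```

A real system would apply the delay to a continuous stream with a ring buffer rather than block by block, and would choose the margin by listening tests.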
32

Single-Channel Multiple Regression for In-Car Speech Enhancement

ITAKURA, Fumitada, TAKEDA, Kazuya, ITOU, Katsunobu, LI, Weifeng 01 March 2006
No description available.
33

Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement

ITAKURA, Fumitada, TAKEDA, Kazuya, DAT, Tran Huy 01 March 2006
No description available.
35

Speech Enhancement Utilizing Phase Continuity Between Consecutive Analysis Windows

Mehmetcik, Erdal 01 September 2011
It is commonly accepted that noise induced on the DFT phase spectrum has a negligible effect on speech intelligibility for short analysis windows, as the early intelligibility studies pointed out. This is confirmed by recent intelligibility studies as well. For this reason, classical speech enhancement algorithms do not modify the DFT phase spectrum and only make changes to the DFT magnitude spectrum. However, recent studies also indicate that these classical speech enhancement algorithms are not capable of improving the intelligibility scores of noise-degraded speech signals. In other words, the information contained in a noise-degraded signal cannot be increased by classical enhancement methods. Instead, the ease of listening, i.e. quality, can be improved. Hence additional effort can be made to increase the amount of quality improvement using both DFT magnitude and DFT phase. Therefore, if the performance of the classical methods is to be improved in terms of speech quality, the effect of DFT phase on speech quality needs to be studied. In this work, the contribution of DFT phase to speech quality is investigated through simulations using an objective quality assessment criterion. It is concluded from these simulations that the phase spectrum has a significant effect on speech quality for short analysis windows. Furthermore, phase values of low-frequency components are found to have the largest contribution to this quality improvement. Motivated by these results, a new enhancement method is proposed which modifies the phase of certain low-frequency components as well as the magnitude spectrum. The proposed algorithm is implemented in the MATLAB environment. The results indicate that the proposed system improves the performance of the classical methods in terms of speech quality.
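The "classical" scheme the abstract above refers to, changing only the DFT magnitude and reusing the noisy phase, can be sketched as follows. This is a generic magnitude spectral subtraction on one frame, swapped in for illustration; it is not the method proposed in the thesis, and the function names and the per-bin noise estimate `noise_mag` are assumptions.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def magnitude_only_enhance(frame, noise_mag):
    """Subtract an assumed noise magnitude per bin and keep the noisy phase unchanged."""
    enhanced = []
    for bin_value, noise in zip(dft(frame), noise_mag):
        magnitude = max(abs(bin_value) - noise, 0.0)  # floor at zero
        phase = cmath.phase(bin_value)                # phase is left untouched
        enhanced.append(cmath.rect(magnitude, phase))
    # For a real frame and a symmetric noise estimate the result is real up to rounding.
    return [value.real for value in idft(enhanced)]


if __name__ == "__main__":
    frame = [0.0, 1.0, 0.0, -1.0]
    print(magnitude_only_enhance(frame, [0.0] * 4))
```

A phase-aware method of the kind the abstract proposes would replace the line that copies `phase` with a modified value for selected low-frequency bins; how that value is chosen is specific to the thesis and is not reproduced here.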
36

Probabilistic space maps for speech with applications

Kalgaonkar, Kaustubh 22 August 2011
The objective of the proposed research is to develop a probabilistic model of speech production that exploits the multiplicity of mapping between the vocal tract area functions (VTAF) and speech spectra. Two thrusts are developed. In the first, a latent variable model that captures uncertainty in estimating the VTAF from speech data is investigated. The latent variable model uses this uncertainty to generate many-to-one mapping between observations of the VTAF and speech spectra. The second uses the probabilistic model of speech production to improve the performance of traditional speech algorithms, such as enhancement, acoustic model adaptation, etc. In this thesis, we propose to model the process of speech production with a probability map. This proposed model treats speech production as a probabilistic process with many-to-one mapping between VTAF and speech spectra. The thesis not only outlines a statistical framework to generate and train these probabilistic models from speech, but also demonstrates its power and flexibility with such applications as enhancing speech from both perceptual and recognition perspectives.
37

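Nie-destruktiewe klankonttrekking, restourasie en spraakverheffing van Edison-fonograafsilinders / Non-destructive audio extraction, restoration and speech enhancement of Edison phonograph cylinders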

Van der Westhuizen, Ewald 12 1900
Thesis (MScEng)--University of Stellenbosch, 2003. / ENGLISH ABSTRACT: Two non-destructive methods of extracting audio from Edison phonographic cylinders were investigated. A recording device with high-accuracy positioning was designed and manufactured. A microscopic image method was investigated first. Surface images of the cylinder were obtained using a web camera. An audio signal was then extracted from the width modulation. Results were not pleasing, as echoes caused by intergroove modulation were perceptible. The audio also lacked resolution. The true modulation of the audio is not embedded in the width, but in the depth of the groove. The second audio extraction method involved using a laser pick-up from a compact disc player to measure the depth of the groove. Three laser recording methods were investigated. The first was forward recording, which measured the depth modulation in the recording direction of the groove. The second method, backward recording, was identical to forward recording with the mechanical system moving in reverse. Four recordings from different positions in the groove were combined to create an audio signal. This combination of recordings showed a substantial improvement in the signal-to-noise ratio. A third recording method, transverse recording, which measured the whole depth profile of the groove, was also investigated. The groove profile was then processed into an audio signal. A manual audio restoration program was written to replace visible sections of distorted data with better interpolations. Two speech enhancement methods were investigated, the first being the most commonly used speech enhancement method for digital audio restoration, Short-Time Spectral Attenuation (STSA). The second method is based on linear predictive coefficient (LPC) estimation of short-time frames. The two methods were evaluated by means of listening tests. The LPC enhancement method was preferred because it enhanced the intelligibility of the speech.
/ AFRIKAANS SUMMARY (translated): Two non-destructive methods of extracting audio from Edison phonograph cylinders were investigated. A recording device that positions the cylinders with very high accuracy was designed and manufactured. A microscopic image method was investigated as the first audio extraction method. Microscopic images of the cylinder surface were taken with a web camera. Audio was extracted from the width modulation visible in the images. Results were not satisfactory because of intergroove modulation echoes and a lack of resolution. The true modulation of the audio is engraved not in the width of the groove but in its depth. The second audio extraction method uses an adapted laser sensor from a CD player to measure the depth modulation of the groove. Three laser recording methods were investigated. The first is forward recording, which measures the depth modulation in the recording direction of the groove. A second recording method, backward recording, is identical to forward recording except that the mechanical system moves in reverse. Four recordings from different positions across the groove width were combined to form an audio signal. The combination of four recordings shows a significant improvement in the signal-to-noise ratio. This led to the third recording method, transverse scanning, which measures the whole profile of the groove. The groove profile is then processed into an audio signal. A manual audio restoration program was written to replace visible distortion in the audio signal with better interpolations. Two speech enhancement methods were investigated. Short-Time Spectral Attenuation (STSA) is the most commonly used method for audio restoration. A second speech enhancement method, which uses linear predictive coefficient (LPC) estimation of short-time frames, was also applied. The two methods were compared by means of listening tests. The LPC method was preferred because it better preserved the intelligibility of the speech.
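The abstract above reports that combining four recordings improved the signal-to-noise ratio, but it does not say how they were combined. The sketch below assumes the simplest option, a sample-wise average of time-aligned recordings; the function name and the toy data are hypothetical.

```python
def combine_recordings(recordings):
    """Sample-wise mean of several time-aligned recordings of the same groove."""
    if not recordings:
        raise ValueError("need at least one recording")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("recordings must be aligned and equally long")
    count = len(recordings)
    return [sum(r[i] for r in recordings) / count for i in range(length)]


if __name__ == "__main__":
    # Toy data: the same two-sample signal with four different error patterns.
    signal = [1.0, -1.0]
    errors = [[0.2, -0.2], [-0.2, 0.2], [0.1, 0.1], [-0.1, -0.1]]
    takes = [[s + e for s, e in zip(signal, err)] for err in errors]
    print(combine_recordings(takes))
```

For zero-mean, uncorrelated noise of equal power in each recording, averaging N recordings reduces the noise power by a factor of N, which is about 6 dB for N = 4. Real groove noise is unlikely to be perfectly uncorrelated, so the actual gain would be smaller.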
38

Odstraňování šumu pomocí neuronových sítí s cyklickou konzistencí / Speech Enhancement with Cycle-Consistent Neural Networks

Karlík, Pavol January 2020
Deep neural networks are commonly used for noise removal. The training process of a neural network can be extended by using a second neural network whose goal is to insert noise into a clean speech recording. These two networks can be used together to reconstruct the original clean and noisy recordings. This thesis examines the effectiveness of this technique, called cycle consistency. Cycle consistency improves the robustness of a neural network without modifying the network in any way, because it exposes the denoising network to a more diverse set of noisy data. However, this technique requires training data consisting of pairs of input and reference recordings. Such data are not always available. To train models with unpaired data, we use generative neural networks with cycle consistency. In this thesis we carried out a large number of experiments with models trained on paired and unpaired data. Our results show that using cycle consistency significantly improves model performance.
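The cycle-consistency idea in the abstract above can be written as a loss over two mappings: a denoiser and a noise-adding network. The sketch below uses plain Python callables in place of neural networks and an L1 distance; the function names, the choice of L1, and the equal weighting of the two terms are assumptions, not details taken from the thesis.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(denoise, add_noise, clean, noisy):
    """Penalise failure to reconstruct each input after a round trip through both mappings.

    denoise:   callable standing in for the noise-removal network.
    add_noise: callable standing in for the noise-insertion network.
    """
    noisy_round_trip = add_noise(denoise(noisy))  # noisy -> clean estimate -> noisy
    clean_round_trip = denoise(add_noise(clean))  # clean -> noisy estimate -> clean
    return l1_distance(noisy_round_trip, noisy) + l1_distance(clean_round_trip, clean)


if __name__ == "__main__":
    # Toy stand-ins: "noise" is a constant offset of 0.5.
    def add_offset(x):
        return [v + 0.5 for v in x]

    def remove_offset(x):
        return [v - 0.5 for v in x]

    clean = [1.0, 2.0, 3.0]
    noisy = add_offset(clean)
    print(cycle_consistency_loss(remove_offset, add_offset, clean, noisy))
```

In training, this term would be added to the networks' other objectives and minimised by gradient descent in a deep-learning framework. Note that neither term compares a clean recording with its own noisy version, so the loss itself can be evaluated on unpaired data.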
39

Robust Audio Scene Analysis for Rescue Robots / レスキューロボットのための頑健な音環境理解

Bando, Yoshiaki 26 March 2018
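Kyoto University / 0048 / New-system doctorate by coursework / Doctor (Informatics) / Kō No. 21209 / Jōhaku No. 662 / 新制||情||114 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Tatsuya Kawahara, Professor Hisashi Kashima, Professor Toshiyuki Tanaka, Lecturer Kazuyoshi Yoshii / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM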
40

Transfer learning approaches for feature denoising and low-resource speech recognition

Bagchi, Deblin 10 September 2020
No description available.
