  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Speech accent identification and speech recognition enhancement by speaker accent adaptation

Tanabian, Mohammad M. January 2005 (has links)
Thesis (M.Sc.), Carleton University, 2005. Includes bibliographical references (p. 150-155). Also available in electronic format on the Internet.
72

An Approach to Automatic and Human Speech Recognition Using Ear-Recorded Speech

Johnston, Samuel John Charles January 2017 (has links)
Speech in a noisy background presents a challenge for the recognition of that speech, both by human listeners and by computers tasked with understanding human speech (automatic speech recognition; ASR). Years of research have produced many solutions, though none has completely solved the problem. Current solutions generally require some form of noise estimation in order to remove the noise from the signal. The limitation is that noise can be highly unpredictable and highly variable, both in form and in loudness. The present report proposes a method of recording a speech signal in a noisy environment that largely prevents noise from reaching the recording microphone. This method uses the human skull as a noise-attenuation device by placing the microphone in the ear canal. For further noise dampening, a pair of noise-reduction earmuffs is worn over the speakers' ears. A corpus of speech was recorded with a microphone in the ear canal while speech was simultaneously recorded at the mouth; noise was emitted from a loudspeaker in the background. Following the data collection, the speech recorded at the ear was analyzed. A substantial noise-reduction benefit was found over mouth-recorded speech, but the ear-recorded speech was missing much high-frequency information. With minor processing, mid-range frequencies were amplified, increasing the intelligibility of the speech. A human perception task was conducted using both the ear-recorded and mouth-recorded speech. Participants in this experiment were significantly more likely to understand ear-recorded speech than the noisy, mouth-recorded speech, though they found mouth-recorded speech with no noise the easiest to understand. The recordings were also used with an ASR system. Because the ear-recorded speech lacks much high-frequency information, the system did not recognize it readily; however, when an acoustic model was trained on low-pass filtered speech, performance improved. These experiments demonstrated that humans, and likely an ASR system given additional training, can recognize ear-recorded speech more easily than speech in noise. Further speech processing and training may improve the signal's intelligibility for both human and automatic speech recognition.
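To make the mid-range amplification step described in the abstract concrete, here is a minimal Python/SciPy sketch that boosts a 1-3 kHz band of an ear-canal recording before listening or ASR. The band edges, gain, filter order, and file names are illustrative assumptions, not parameters taken from the thesis.

```python
# Hypothetical sketch of the mid-frequency boost: ear-canal recordings lack
# high-frequency energy, so a mid-range band is amplified and mixed back in.
# Band edges, gain, and file names are illustrative assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt

def boost_midrange(signal, fs, low_hz=1000.0, high_hz=3000.0, gain_db=12.0):
    """Add an amplified copy of the mid-frequency band back into the signal."""
    sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    midband = sosfilt(sos, signal)
    gain = 10.0 ** (gain_db / 20.0)
    boosted = signal + gain * midband
    return boosted / np.max(np.abs(boosted))  # normalize to avoid clipping

fs, ear_speech = wavfile.read("ear_recorded_utterance.wav")  # assumed mono 16-bit file
enhanced = boost_midrange(ear_speech.astype(np.float64), fs)
wavfile.write("ear_recorded_boosted.wav", fs, (enhanced * 32767).astype(np.int16))
```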
73

Low-Resource Automatic Speech Recognition Domain Adaptation: A Case-Study in Aviation Maintenance

Nadine Amr Mahmoud Amin (16648563) 02 August 2023 (has links)
With timeliness and efficiency being critical in the aviation maintenance industry, the need has been growing for smart technological solutions that help in optimizing and streamlining the different underlying tasks. One such task is the technical documentation of the performed maintenance operations. Instead of paper-based documentation, voice tools that transcribe spoken logbook entries allow technicians to document their work right away in a hands-free and time-efficient manner. However, an accurate automatic speech recognition (ASR) model requires large training corpora, which are lacking in the domain of aviation maintenance. In addition, ASR models that are trained on huge corpora of standard English perform poorly in such a technical domain with non-standard terminology. Hence, this thesis investigates the extent to which fine-tuning an ASR model, pre-trained on standard English corpora, on limited in-domain data improves its recognition performance in the technical domain of aviation maintenance. The thesis presents a case study on one such pre-trained ASR model, wav2vec 2.0. Results show that fine-tuning the model on a limited anonymized dataset of maintenance logbook entries brings about a significant reduction in its error rates when tested not only on an anonymized in-domain dataset but also on a non-anonymized one. This suggests that any available aviation maintenance logbooks, even if anonymized for privacy, can be used to fine-tune general-purpose ASR models and enhance their in-domain performance. Lastly, an analysis of the influence of voice characteristics on model performance stresses the need for balanced datasets representative of the population of aviation maintenance technicians.
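As a rough sketch of the kind of fine-tuning setup the abstract describes, the following Python code adapts a pre-trained wav2vec 2.0 CTC model to in-domain utterances with the Hugging Face transformers library. The checkpoint name, learning rate, and dummy logbook examples are assumptions; the thesis's actual data pipeline, vocabulary handling, and evaluation are not reproduced here.

```python
# Minimal fine-tuning sketch: adapt a pre-trained wav2vec 2.0 CTC model to a
# small in-domain set. The dummy (waveform, transcript) pairs stand in for
# anonymized maintenance logbook recordings and are not real data.
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
model.freeze_feature_encoder()  # keep the convolutional front end fixed
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Placeholder in-domain examples: (16 kHz waveform, uppercase transcript).
train_pairs = [(np.random.randn(16000).astype(np.float32), "REPLACED HYDRAULIC PUMP")]

model.train()
for waveform, transcript in train_pairs:
    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    labels = processor.tokenizer(transcript, return_tensors="pt").input_ids
    loss = model(inputs.input_values, labels=labels).loss  # CTC loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```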
74

The application of classical information retrieval techniques to spoken documents

James, David Anthony January 1995 (has links)
No description available.
75

Stochastic models for speech understanding

O'Shea, Philip James January 1994 (has links)
No description available.
76

Neural networks for speech and speaker recognition

Elvira, Jose M. January 1994 (has links)
No description available.
77

A correlogram approach to speaker identification based on a human auditory model

Ertas, Figen January 1997 (has links)
No description available.
78

High speed auditory analysis

Gransden, I. R. January 1995 (has links)
No description available.
79

Speaker modelling for voice conversion

Ho, Ching-Hsiang January 2001 (has links)
No description available.
80

Multi-resolution analysis based acoustic features for speech recognition

January 1999 (has links)
Chan Chun Ping. Thesis (M.Phil.), Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 134-137). Text in English; abstracts in English and Chinese.

Contents:
Chapter 1: Introduction (p.1): 1.1 Automatic Speech Recognition; 1.2 Review of Speech Recognition Techniques; 1.3 Review of Signal Representation; 1.4 Review of Wavelet Transform; 1.5 Objective of Thesis; 1.6 Thesis Outline
Chapter 2: Baseline Speech Recognition System (p.17): 2.1 Introduction; 2.2 Feature Extraction; 2.3 Hidden Markov Model for Speech Recognition (2.3.1 The Principle of Using HMM in Speech Recognition; 2.3.2 Elements of an HMM; 2.3.3 Parameters Estimation and Recognition Algorithm; 2.3.4 Summary of HMM based Speech Recognition); 2.4 TIMIT Continuous Speech Corpus; 2.5 Baseline Speech Recognition Experiments; 2.6 Summary
Chapter 3: Multi-Resolution Based Acoustic Features (p.42): 3.1 Introduction; 3.2 Discrete Wavelet Transform; 3.3 Periodic Discrete Wavelet Transform; 3.4 Multi-Resolution Analysis on STFT Spectrum; 3.5 Principal Component Analysis (3.5.1 Related Work; 3.5.2 Theoretical Background of PCA; 3.5.3 Examples of Basis Vectors Found by PCA); 3.6 Experiments for Multi-Resolution Based Feature (3.6.1 Experiments with Clean Speech; 3.6.2 Experiments with Noisy Speech); 3.7 Summary
Chapter 4: Wavelet Packet Based Acoustic Features (p.72): 4.1 Introduction; 4.2 Wavelet Packet Filter-Bank; 4.3 Dimensionality Reduction; 4.4 Filter-Bank Parameters (4.4.1 Mel-Scale Wavelet Packet Filter-Bank; 4.4.2 Effect of Down-Sampling; 4.4.3 Mel-Scale Wavelet Packet Tree; 4.4.4 Wavelet Filters); 4.5 Experiments Using Wavelet Packet Based Acoustic Features; 4.6 Broad Phonetic Class Analysis; 4.7 Discussion; 4.8 Summary
Chapter 5: De-Noising by Wavelet Transform (p.101): 5.1 Introduction; 5.2 De-Noising Capability of Wavelet Transform; 5.3 Wavelet Transform Based Wiener Filtering (5.3.1 Sub-Band Position for Wiener Filtering; 5.3.2 Estimation of Short-Time Speech and Noise Power); 5.4 De-Noising Embedded in Wavelet Packet Filter-Bank; 5.5 Experiments Using Wavelet Built-in De-Noising Properties; 5.6 Discussion (5.6.1 Broad Phonetic Class Analysis; 5.6.2 Distortion Measure); 5.7 Summary
Chapter 6: Conclusions and Future Work (p.138): 6.1 Conclusions; 6.2 Future Work
Appendix 1: Jacobi's Method (p.143); Appendix 2: Broad Phonetic Class (p.148)
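For readers unfamiliar with the wavelet-packet front end named in the contents above, the following Python sketch (using PyWavelets) computes log sub-band energies from a uniform wavelet packet decomposition of short speech frames. The wavelet choice, decomposition depth, and frame settings are assumptions for illustration; the thesis itself uses a mel-scaled wavelet packet tree and further dimensionality reduction (e.g. PCA), which are not shown.

```python
# Illustrative sketch (not the thesis's exact front end): frame a signal and
# compute log energies of wavelet-packet sub-bands as acoustic features.
# Wavelet ('db4'), depth, and frame size are assumptions; the thesis uses a
# mel-scaled wavelet packet tree rather than this uniform decomposition.
import numpy as np
import pywt

def wavelet_packet_features(frame, wavelet="db4", level=3):
    """Return log energies of the 2**level sub-bands of one speech frame."""
    wp = pywt.WaveletPacket(data=frame, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order="freq")      # sub-bands from low to high
    energies = np.array([np.sum(node.data ** 2) for node in nodes])
    return np.log(energies + 1e-10)

# Example: 25 ms frames with a 10 ms shift at 16 kHz (dummy stand-in signal).
fs, frame_len, hop = 16000, 400, 160
signal = np.random.randn(fs)
frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, hop)]
features = np.stack([wavelet_packet_features(f) for f in frames])
print(features.shape)                              # (num_frames, 8)
```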
