41 |
Long-term autocorrelation coefficients in automatic speech recognition
Odom, Warren Edward, 1952- January 1976
No description available.
|
42 |
A projection-based measure for automatic speech recognition in noise
Carlson, Beth A. 12 1900
No description available.
|
43 |
Distortion compensation in speech signals using a blind iterative algorithm based on memoryless symmetrical nonlinearities
Hubert-Brierre, Florent Maxime 08 1900
No description available.
|
44 |
Towards pose invariant visual speech processing
Pass, A. R. January 2013
No description available.
|
45 |
An acoustic-phonetic approach in automatic Arabic speech recognition
Al-Zabibi, Marwan January 1990
In a large vocabulary speech recognition system the broad phonetic classification technique is used instead of detailed phonetic analysis to overcome the variability in the acoustic realisation of utterances. The broad phonetic description of a word is used as a means of lexical access, where the lexicon is structured into sets of words sharing the same broad phonetic labelling. This approach has been applied to a large vocabulary isolated word Arabic speech recognition system. Statistical studies have been carried out on 10,000 Arabic words (converted to phonemic form) involving different combinations of broad phonetic classes. Some particular features of the Arabic language have been exploited. The results show that vowels represent about 43% of the total number of phonemes. They also show that about 38% of the words can uniquely be represented at this level by using eight broad phonetic classes. When introducing detailed vowel identification the percentage of uniquely specified words rises to 83%. These results suggest that a fully detailed phonetic analysis of the speech signal is perhaps unnecessary. In the adopted word recognition model, the consonants are classified into four broad phonetic classes, while the vowels are described by their phonemic form. A set of 100 words uttered by several speakers has been used to test the performance of the implemented approach. In the implemented recognition model, three procedures have been developed, namely voiced-unvoiced-silence segmentation, vowel detection and identification, and automatic spectral transition detection between phonemes within a word. The accuracy of both the V-UV-S and vowel recognition procedures is almost perfect. A broad phonetic segmentation procedure has been implemented, which exploits information from the above mentioned three procedures. Simple phonological constraints have been used to improve the accuracy of the segmentation process. 
The resultant sequence of labels is used for lexical access to retrieve the word or a small set of words sharing the same broad phonetic labelling. When there is more than one word candidate, a verification procedure is used to choose the most likely one.
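The lexical-access idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the four consonant class names, the phoneme-to-class mapping, the three-vowel set, and the toy phoneme strings are all assumptions chosen for illustration. Consonants are reduced to a broad class label, vowels keep their phonemic identity, and the lexicon is indexed by the resulting label sequence.

```python
from collections import defaultdict

# Hypothetical consonant-to-class mapping (four broad classes, illustrative only).
BROAD_CLASS = {
    "b": "STOP", "t": "STOP", "d": "STOP", "k": "STOP",
    "f": "FRIC", "s": "FRIC", "z": "FRIC", "h": "FRIC",
    "m": "NASAL", "n": "NASAL",
    "l": "LIQ", "r": "LIQ",
}
# Hypothetical vowel inventory; vowels are kept in phonemic form.
VOWELS = {"a", "i", "u"}


def broad_label(phonemes):
    """Map a phoneme sequence to its broad phonetic label sequence."""
    return tuple(p if p in VOWELS else BROAD_CLASS[p] for p in phonemes)


def build_lexicon(words):
    """Group words by their broad phonetic labelling."""
    index = defaultdict(list)
    for word, phonemes in words.items():
        index[broad_label(phonemes)].append(word)
    return index


def lexical_access(index, labels):
    """Return every word that shares the given label sequence."""
    return index.get(tuple(labels), [])


# Toy phoneme strings, not data from the thesis.
TOY_WORDS = {
    "kataba": list("kataba"),
    "tabaka": list("tabaka"),
    "salima": list("salima"),
}

index = build_lexicon(TOY_WORDS)
candidates = lexical_access(index, broad_label(list("kataba")))
print(candidates)
```

In this toy lexicon the two stop-only strings share one labelling, so lookup returns a candidate set of two, which is the situation where a separate verification step would be needed to pick one word.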
|
46 |
A domain based approach to natural language modelling
Donnelly, Paul Gerard January 1998
No description available.
|
47 |
Robustness in ASR: an experimental study of the interrelationship between discriminant feature-space transformation, speaker normalization and environment compensation
Keyvani, Alireza. January 2007
This thesis addresses the general problem of maintaining robust automatic speech recognition (ASR) performance under diverse speaker populations, channel conditions, and acoustic environments. To this end, the thesis analyzes the interactions between environment compensation techniques, frequency warping based speaker normalization, and discriminant feature-space transformation (DFT). These interactions were quantified by performing experiments on the connected digit utterances comprising the Aurora 2 database, using continuous density hidden Markov models (HMMs) representing individual digits. / Firstly, given that the performance of speaker normalization techniques degrades in the presence of noise, it is shown that reducing the effects of noise through environmental compensation, prior to speaker normalization, leads to substantial improvements in ASR performance. The speaker normalization techniques considered here were vocal tract length normalization (VTLN) and the augmented state-space acoustic decoder (MATE). Secondly, given that discriminant feature-space transformations (DFT) are known to increase class separation, it is shown that performing speaker normalization using VTLN in a discriminant feature-space leads to improvements in the performance of this technique. Classes, in our experiments, corresponded to HMM states. Thirdly, an effort was made to achieve higher class discrimination by normalizing the speech data used to estimate the discriminant feature-space transform. Normalization, in our experiments, corresponded to reducing the variability within each class through the use of environment compensation and speaker normalization. Significant ASR performance improvements were obtained when normalization was performed using environment compensation, while our results were inconclusive for the case where normalization consisted of speaker normalization. Finally, a simple modification of MATE, aimed at increasing its noise robustness, is presented. This modification consisted of using, during recognition, knowledge of the distribution of warping factors selected by MATE during training.
|
48 |
Automatic speech recognition for closed captioning of television :
Ahmer, Ingrid Unknown Date
This thesis addresses the application of automatic speech recognition to the task of offline closed-captioning of television programs, and describes the collection of corpora to support such research and an exploration of issues to be addressed. The use of automatic speech recognition (ASR) for transcription of broadcast speech and as an aid to captioning is reviewed. As background to the task, the methodology for large vocabulary continuous speech recognition (LVCSR) is presented, with particular attention given to the issues of large vocabulary language modelling and consideration of the acoustic complexity arising in broadcast material. / Thesis (MEng(Telecommunications))--University of South Australia, 2002.
|