Return to search

Development of tests and preprocessing algorithms for evaluation and improvement of speech recognition units

This study considered the evaluation of commercially available isolated word, speaker dependent, speech recognition units, and preprocessing techniques that may be used for improving their performance. The problem was considered in three separate stages.
A series of tests were designed to exercise an isolated word, speaker dependent, speech recognition unit. These tests provided a sound basis for determining a given unit's strengths and weaknesses. This knowledge permits a more informed decision on the best recognition device for a given price range. As well, this knowledge may be used in the design of a robust vocabulary, and creation of guidelines for best performance. The test vocabularies were based on the forty English phonemes identified by Rabiner and Schafer [28] and the test variations were representative of common variations which may be expected in normal use.
A digital archive system was implemented for storing the voice input of test subjects. This facility provided a data base for an investigation of preprocessing techniques. As well, it permits the testing of different speech recognition units with the same voice input, providing a platform for device comparison.
Several speech preprocessing and performance improvement techniques were then investigated. Specifically, two types of time normalization, the enhancement of low energy phonemes and a change in training technique were investigated. These techniques permit a more accurate analysis of the failure mechanism of the speech recognition unit. They may also provide the basis for a speech preprocessor design which could be placed in front of a commercial speech recognition unit.
A commercially available speech recognition unit, the NEC SR100, was used as a measure of the effectiveness of the tests and of the improvements. Results of the study indicated that the designed tests and the preprocessing & performance improvement techniques investigated were useful in identifying the speech recognition unit's weaknesses. Also, depending on the economics of implementation, it was found that preprocessing may provide a cost effective solution to some of the recognition unit's shortcomings. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate

Identiferoai:union.ndltd.org:UBC/oai:circle.library.ubc.ca:2429/26750
Date January 1986
CreatorsWasmeier, Hans
PublisherUniversity of British Columbia
Source SetsUniversity of British Columbia
LanguageEnglish
Detected LanguageEnglish
TypeText, Thesis/Dissertation
RightsFor non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Page generated in 0.0016 seconds