121 |
Hardware implementation of an automatic speaker recognition system using artificial neural networks
Moonasar, Viresh January 2002
Submitted in fulfillment of the academic requirements for the degree of Master of Technology in Electrical Engineering in the Department of Electronic Engineering, Faculty of Engineering, ML Sultan Technikon of Durban in South Africa, March 2002. / The use of speaker recognition technology in interactive voice response and electronic commerce systems has been limited. This is due to the lack of research attention and published results when compared to all the other areas of speech recognition technologies / M
|
122 |
A cost, complexity and performance comparison of two automatic language identification architectures
Combrinck, Hendrik Petrus 21 December 2006
This dissertation investigates the cost-complexity-performance relationship between two automatic language identification systems. The first is a state-of-the-art architecture, trained on about three hours of phonetically hand-labelled telephone speech obtained from the recognised OGI-TS corpus. The second system, introduced by ourselves, is a simpler design with a smaller, less complex parameter space. It is a vector quantisation-based approach which bears some resemblance to a system suggested by Sugiyama. Though trained on the same data, it has no need for any labels and is therefore less costly. A number of experiments are performed to find quasi-optimal parameters for the two systems. In further experiments the systems are evaluated and compared on a set of ten two-language tasks, spanning five languages. The more complex system is shown to have a substantial performance advantage over the simpler design - 81% versus 65% on 40 seconds of speech. However, both results are well under the reported state-of-the-art performance of 94%, which suggests that our systems can benefit from additional attention to implementation detail and optimisation of various parameters. Given the above, our suggested architecture may potentially provide an adequate solution where the high development cost associated with state-of-the-art technology and the necessary training corpora is prohibitive. / Dissertation (M Eng (Computer Engineering))--University of Pretoria, 2006. / Electrical, Electronic and Computer Engineering / unrestricted
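The label-free, vector quantisation-based idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes one codebook per language, trained with plain k-means on unlabelled feature vectors, and it picks the language whose codebook gives the lowest average quantisation distortion. This is a generic illustration only, not the dissertation's actual system: the function names, the k-means initialisation and the squared Euclidean distance are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vector):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(codebook[k], vector))


def train_codebook(vectors, size, iterations=10):
    """Plain k-means (hypothetical helper); no phonetic labels are needed.

    Initial codewords are evenly spaced samples of the training data.
    """
    step = max(1, len(vectors) // size)
    codebook = [list(v) for v in vectors[::step][:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for k, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                dim = len(members[0])
                codebook[k] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


def average_distortion(codebook, vectors):
    """Mean squared distance from each vector to its nearest codeword."""
    total = sum(squared_distance(codebook[nearest(codebook, v)], v)
                for v in vectors)
    return total / len(vectors)


def identify_language(codebooks, vectors):
    """Return the language whose codebook quantises the utterance best."""
    return min(codebooks,
               key=lambda lang: average_distortion(codebooks[lang], vectors))
```

In use, `codebooks` would be a dictionary mapping each language name to a codebook trained on that language's feature vectors, and `identify_language(codebooks, utterance_vectors)` would return the best-matching name. The feature extraction step (for example, per-frame spectral features) is outside the scope of this sketch.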
|
123 |
Investigation of time-domain measurements for analysis and machine recognition of speech
Ito, Mabo Robert January 1971
At present in speech analysis and mechanical speech recognition work, spectral measurements are the conventional form of signal representation and acoustical descriptions of speech sounds are usually given in terms of this form of representation. In this thesis, certain time-domain measurements are investigated as an alternative form of signal representation and as a basis for acoustical characterization of speech sounds. The primary measurements studied are the short-time averages of the zero-crossing rate of the acoustic waveform and the distribution patterns of the time intervals between zero-crossings. These measurements are found to be easy to implement with digital techniques and are implemented through digital computer simulation. Other advantages of these measurements include effectiveness in handling the large intensity range of speech sounds and ability to track rapid transient phenomena such as the release of unvoiced stops.
Computer software for an interactive graphics facility was developed for acquisition, presentation, manipulation and analysis of the acoustic speech data. One of the pattern analysis programs, for the display of time-interval distribution data, yielded a visual presentation which could be compared to frequency spectrograms. Theoretical expressions are developed to relate the time-domain and spectral representation for some phone types and these relationships are compared with experimental results. The above theoretical expressions show that important spectral characterization features are accounted for. These findings, combined with empirical observation of the utility of the time-domain signal representation in phonetic characterization, indicate that this form of representation is a useful alternative to the spectral representation.
The speech materials employed were selected to study temporal structures and contextual variations of acoustic properties and to provide quantitative data useful for word recognition applications. The vowels, fricatives and stops were the main phoneme classes studied. Quantitative data on the acoustic properties of the selected phonemes is presented and discussed in terms of i) our own spectral data, ii) other data reported in the literature and iii) simple production models. The time-domain signal representation was found to provide an effective means of analyzing and characterizing the acoustically complex stops and voiced fricatives. For the vowels and unvoiced fricatives, which are well suited to spectral analysis, the time domain measurements were found to yield very simple and direct characterization features. Some limited phonemic decomposition and machine recognition work is described which demonstrates the design of useful characterization features and provides a basis for further work. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
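The two primary measurements named in this abstract, the short-time average zero-crossing rate and the distribution of time intervals between zero-crossings, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the framing parameters (`frame_len`, `hop`) and the histogram binning are hypothetical choices made for the example.

```python
def zero_crossing_indices(signal):
    """Indices i where the sign changes between signal[i - 1] and signal[i]."""
    crossings = []
    for i in range(1, len(signal)):
        # Zero is treated as non-negative so each sign change counts once.
        if (signal[i - 1] >= 0) != (signal[i] >= 0):
            crossings.append(i)
    return crossings


def short_time_zcr(signal, frame_len, hop):
    """Zero crossings per sample pair, computed separately for each frame."""
    rates = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rates.append(len(zero_crossing_indices(frame)) / (frame_len - 1))
    return rates


def interval_histogram(signal, num_bins, max_interval):
    """Histogram of the spacing, in samples, between successive crossings.

    Intervals of max_interval samples or more fall into the last bin.
    """
    idx = zero_crossing_indices(signal)
    counts = [0] * num_bins
    for a, b in zip(idx, idx[1:]):
        interval = b - a
        bin_index = min(num_bins - 1, (interval * num_bins) // max_interval)
        counts[bin_index] += 1
    return counts
```

For a pure tone, every interval between crossings equals half the period, so the histogram collapses into a single bin; for speech, the spread of the histogram over successive frames gives the kind of time-interval distribution pattern the abstract compares to a spectrogram.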
|
124 |
Development of tests and preprocessing algorithms for evaluation and improvement of speech recognition units
Wasmeier, Hans January 1986
This study considered the evaluation of commercially available isolated word, speaker dependent, speech recognition units, and preprocessing techniques that may be used for improving their performance. The problem was considered in three separate stages.
A series of tests were designed to exercise an isolated word, speaker dependent, speech recognition unit. These tests provided a sound basis for determining a given unit's strengths and weaknesses. This knowledge permits a more informed decision on the best recognition device for a given price range. As well, this knowledge may be used in the design of a robust vocabulary, and creation of guidelines for best performance. The test vocabularies were based on the forty English phonemes identified by Rabiner and Schafer [28] and the test variations were representative of common variations which may be expected in normal use.
A digital archive system was implemented for storing the voice input of test subjects. This facility provided a data base for an investigation of preprocessing techniques. As well, it permits the testing of different speech recognition units with the same voice input, providing a platform for device comparison.
Several speech preprocessing and performance improvement techniques were then investigated. Specifically, two types of time normalization, the enhancement of low energy phonemes and a change in training technique were investigated. These techniques permit a more accurate analysis of the failure mechanism of the speech recognition unit. They may also provide the basis for a speech preprocessor design which could be placed in front of a commercial speech recognition unit.
A commercially available speech recognition unit, the NEC SR100, was used as a measure of the effectiveness of the tests and of the improvements. Results of the study indicated that the designed tests and the preprocessing & performance improvement techniques investigated were useful in identifying the speech recognition unit's weaknesses. Also, depending on the economics of implementation, it was found that preprocessing may provide a cost effective solution to some of the recognition unit's shortcomings. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
|
125 |
Speaker-independent access to a large lexicon
Mathan, Luc Stefan January 1987
No description available.
|
126 |
Perceptual postfiltering for low bit rate speech coders
Chen, Wei, 1976- January 2007
No description available.
|
127 |
An investigation of digital vocoders.
Trottier, Lorne Ira. January 1973
No description available.
|
128 |
Speaker normalizing transforms in speech recognition by computer
Sejnoha, Vladimir. January 1982
No description available.
|
129 |
Speaker recognition using digit utterances
Scrimgeour, J. Michael. January 1984
No description available.
|
130 |
Experiments on automatic phonetic segmentation and transcription of speech
Lennig, Matthew. January 1983
No description available.
|