A subset of speech recognition is the use of speech recognition techniques for voice authentication. Voice authentication is an alternative security application to the other biometric security measures such as the use of fingerprints or iris scans. Voice authentication has advantages over the other biometric measures in that it can be utilized remotely, via a device like a telephone. However, voice authentication has disadvantages in that the authentication system typically requires a large memory and processing time than do fingerprint or iris scanning systems. Also, voice authentication research has yet to provide an authentication system as reliable as the other biometric measures. Most voice recognition systems use Hidden Markov Models (HMMs) as their basic probabilistic framework. Also, most voice recognition systems use a frame based approach to analyze the voice features. An example of research which has been shown to provide more accurate results is the use of a segment based model. The HMMs impose a requirement that each frame has conditional independence from the next. However, at a fixed frame rate, typically 10 ms., the adjacent feature vectors might span the same phonetic segment and often exhibit smooth dynamics and are highly correlated. The relationship between features of different phonetic segments is much weaker. Therefore, the segment based approach makes fewer conditional independence assumptions which are also violated to a lesser degree than for the frame based approach. Thus, the HMMs using segmental based approaches are more accurate. The speech polynomials (feature vectors) used in the segmental model have been shown to be Chebychev polynomials. Use of the properties of these polynomials has made it possible to reduce the computation time for speech recognition systems. Also, representing the spoken word waveform as a Chebychev polynomial allows for the recognition system to easily extract useful and repeatable features from the waveform allowing for a more accurate identification of the speaker. This thesis describes the segmental approach to speech recognition and addresses in detail the use of Chebychev polynomials in the representation of spoken words, specifically in the area of speaker recognition. .
Identifer | oai:union.ndltd.org:ucf.edu/oai:stars.library.ucf.edu:etd-1398 |
Date | 01 January 2005 |
Creators | Strange, John |
Publisher | STARS |
Source Sets | University of Central Florida |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Electronic Theses and Dissertations |
Page generated in 0.0018 seconds