People with motor function disorders that cause dysarthric speech have difficulty using state-of-the-art automatic speech recognition (ASR) systems. Because these systems are built on models of non-dysarthric speech, they perform poorly for individuals with dysarthria, and a solution is needed to compensate for this shortcoming. This thesis examines the possibility of quantizing vowels of dysarthric and non-dysarthric speech into codewords, independent of inter-speaker variability, in a way that can be implemented on machines with limited processing capability. I show that it is possible to model all vowels and vowel-like sounds that a North American speaker can produce if the frequencies of the first and second formants are used to encode these sounds. The proposed solution is compatible with the neural networks and hidden Markov models used to build acoustic models in conventional ASR systems. A secondary finding of this study is that the set of the ten most common vowels in North American English can feasibly be reduced to eight vowels. / Graduate / 2021-05-11
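The encoding idea described above can be illustrated with a minimal sketch: map a measured (F1, F2) formant pair to the nearest entry of a small vowel codebook. The codebook below is purely illustrative (centroid values approximate published averages for North American English male speakers), not the thesis's actual codebook or vowel set.

```python
# Hypothetical sketch: vector-quantize a vowel's (F1, F2) formant pair
# into a codeword via nearest-centroid lookup in a small codebook.
# Centroid values (Hz) are illustrative approximations, not the
# thesis's codebook; labels are ASCII stand-ins for vowel symbols.
CODEBOOK = {
    "i":  (270, 2290),   # as in "beet"
    "I":  (390, 1990),   # as in "bit"
    "E":  (530, 1840),   # as in "bet"
    "ae": (660, 1720),   # as in "bat"
    "A":  (730, 1090),   # as in "father"
    "O":  (570, 840),    # as in "bought"
    "U":  (440, 1020),   # as in "book"
    "u":  (300, 870),    # as in "boot"
}

def quantize(f1: float, f2: float) -> str:
    """Return the codeword whose centroid is nearest (squared Euclidean)
    to the measured (f1, f2) pair."""
    return min(
        CODEBOOK,
        key=lambda v: (CODEBOOK[v][0] - f1) ** 2 + (CODEBOOK[v][1] - f2) ** 2,
    )

print(quantize(280, 2250))  # a high front vowel maps to "i"
```

Because only a nearest-neighbour search over a handful of 2-D points is needed per frame, this kind of lookup is cheap enough for the limited-processing-capability machines the abstract mentions.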
Identifier | oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/11771 |
Date | 25 May 2020 |
Creators | Albalkhi, Rahaf |
Contributors | Sima, Mihai, Livingston, Nigel Jonathan |
Source Sets | University of Victoria |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Rights | Available to the World Wide Web |