  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
231

A feasibility study on the use of a voice recognition system for training delivery

Gibson, Marcia 25 August 2008 (has links)
This feasibility study examined the possibility of using an independent voice recognition system as the input device for a training delivery requirement. The intent was to determine whether the voice recognition system could be incorporated into a training delivery system designed to teach students how to use the Communications Electronics Operating Instructions manual, a tool used for communicating over the radio network during military operations. The study showed how the voice recognition system worked within an integrated voice-based delivery system for the purpose of delivering instruction. A further significance of the study was that the voice system was an independent (speaker-independent) speech recognition system. At the time the study was conducted, no reasonably priced speech recognition system existed that interfaced with both graphics and authoring software and allowed any student to speak to the system without first training it to recognize the individual student's voice. This feature increased the usefulness and flexibility of the system. The methodology for this feasibility study was a development and evaluation model, which required a market analysis, development of the voice system and instructional courseware, testing of the system with a sample population from the Armor School at Ft. Knox, Kentucky, and making the required alterations. The data collection approach was multifaceted. Each subject completed a student profile survey, a pretest, a posttest, and an opinion survey on how well the instruction met expectations. Data were also collected on how often the recognition system recognized, did not recognize, or misrecognized the voice of each subject. The information gathered was analyzed to determine how well the voice recognition system performed in a training delivery application.
The findings of this feasibility study indicated that an effective voice-based training delivery system could be developed by integrating an IBM-clone personal computer with a graphics board and supporting software, a signal processing board and supporting software for audio input and output, and instructional authoring software. Training was delivered successfully: all students completed the course, 85% performed better on the posttest than on the pretest, and the mean gain scores more than satisfied the expected criterion for the training course. The misrecognition factor was 12%. An important finding of this study is that the misrecognition factor affected neither the students' opinion of how well the voice system operated nor the students' learning gain. / Ed. D.
232

Evaluating the Effects of Automatic Speech Recognition Word Accuracy

Doe, Hope L. 10 August 1998 (has links)
Automatic Speech Recognition (ASR) research has been focused primarily on large-scale systems and industry, while other areas that require attention are often overlooked by researchers. For this reason, this research examined automatic speech recognition at the consumer level. Many individual consumers purchase and use speech recognition software for a different purpose than the military or commercial industries, such as telecommunications; consumers who purchase the software for personal use will mainly use ASR to dictate correspondence and documents. Two ASR dictation software packages were used to conduct the study. The research examined the relationships between (1) speech recognition software training and word accuracy, (2) user error-correction time and word accuracy, and (3) correspondence type and word accuracy. The correspondences evaluated resembled personal, business, and technical correspondence. Word accuracy was assessed after initial system training, after five minutes of error-correction time, and after ten minutes of error-correction time. Results indicated that the word recognition accuracy achieved does affect user satisfaction, and that word accuracy improved with increased error-correction time. Additionally, Personal Correspondence achieved the highest mean word accuracy rate for both systems, and Dragon Systems achieved the highest mean word accuracy recognition for the correspondences explored in this research. Results were discussed in terms of subjective and objective measures and the advantages and disadvantages of speech input, and design recommendations were provided. / Master of Science
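Word accuracy in studies like this one is conventionally derived from a word-level edit-distance alignment between the reference text and the recognizer's hypothesis. A minimal sketch of that computation (the function name is illustrative, not taken from the thesis):

```python
# Word accuracy via Levenshtein (edit-distance) alignment of word sequences.
# accuracy = 1 - (substitutions + deletions + insertions) / reference length
def word_accuracy(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    errors = dp[len(ref)][len(hyp)]
    return 1.0 - errors / len(ref)
```

On this measure, one substituted word in a four-word reference yields 75% word accuracy, which is how per-correspondence accuracy figures such as those above are typically aggregated.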
233

The Effectiveness of Oral Expression through the use of Continuous Speech Recognition Technology in Supporting the Written Composition of Postsecondary Students with Learning Disabilities

Snider, Richard Conrad 17 April 2002 (has links)
A large number of individuals who are identified as having learning disabilities have deficits in written expression. Existing theory and research indicate that for those individuals oral expression not only precedes, but also exceeds their written expression capabilities. As a result, dictation has been investigated as an accommodation for these individuals. Research in this area indicates that dictation does tend to increase quality, length, and rate of production of written expression. This mode, however, has a number of shortcomings, including difficulties caused by social skills deficits and a loss of independence. Additionally, for universities providing this accommodation, the annual cost of providing a transcription service is high. Speech recognition has the potential to overcome these shortcomings, but presently little research has been conducted to investigate the advantages and disadvantages of this mode of writing. The purpose of this study was to examine the compensatory effectiveness of oral expression through the use of continuous speech recognition technology on the written composition performance of postsecondary students with learning disabilities. This writing mode was compared to a popular accommodation involving oral expression, using a human transcriber to create a verbatim transcription, and to a common visual-motor method of writing, using a keyboard without assistance. Analysis of the data revealed that students with learning disabilities in the area of written expression wrote significantly higher quality essays at a faster rate using the transcription and speech recognition modes of writing than they did using the keyboarding method of writing. There was no significant difference in the length of essays across the three treatment groups. 
This study suggests that current continuous speech recognition technology can offer postsecondary students with learning disabilities a method to write that is superior to keyboarding as indicated by measures of quality and rate of production. Since the speech recognition technology does not have the limitations of the transcription process (i.e., loss of independence and high cost), it may be the best alternative for postsecondary students with learning disabilities in the area of written expression to maximize their oral language strengths to more efficiently produce better quality writing. / Ph. D.
234

The Effectiveness of Speech Recognition as a User Interface for Computer-Based Training

Creech, Wayne E. (Wayne Everette) 08 1900 (has links)
Some researchers argue that natural language is probably one of the most promising interfaces in the long term for simplicity of learning. If this is true, then it follows that speech recognition would be ideal as the interface for computer-based training (CBT). While many speech recognition applications are used as a computer interface, these are usually confined to controlling the computer or causing the computer to control other devices. The user interface has been the focus of a strong effort to improve the quality of communication between man and machine, and is proposed to be a dominant factor in determining user productivity, performance, and satisfaction. However, other researchers note that fully natural interfaces with computers are still a long way from the state of the art. The focus of this study was to determine whether speech recognition technology is an effective interface for an academic lesson presented via CBT. How does one determine if learning has been affected, and how is this measured? Previous research has attempted to quantify a learning effect when using a variety of interfaces. This dissertation summarizes previous studies using other interfaces and those using speech recognition, and applies a framework used to measure learning effectiveness in some of those studies to quantify learning when speech recognition is the sole interface. The focus of the study was on cognitive processing, which affects short-term memory and, in turn, original learning (OL). The methods and procedures applied in an experimental study are presented.
235

Automatic speech recognition system for people with speech disorders

Ramaboka, Manthiba Elizabeth January 2018 (has links)
Thesis (M.Sc. (Computer Science)) --University of Limpopo, 2018 / The conversion of speech to text is essential for communication between speech-impaired and visually impaired people. The focus of this study was to develop and evaluate a baseline ASR system, trained on normal speech, for correcting disordered speech. Normal and disordered speech data were sourced from the Lwazi project and UCLASS, respectively. The normal speech data were used to train the ASR system; disordered speech was used to evaluate its performance. Features were extracted using the Mel-frequency cepstral coefficients (MFCCs) method in the processing stage, and cepstral mean and variance normalization (CMVN) was applied to normalise the features. A third-order language model was trained using the SRI Language Modelling (SRILM) toolkit. A recognition accuracy of 65.58% was obtained. A refinement approach was then applied to the recognised utterance to remove repetitions from stuttered speech. The approach showed that 86% of repeated words in stuttered speech could be removed to yield an improved hypothesized text output. Further refinement of the ASR post-processing module is likely to achieve near-100% correction of stuttered speech. Keywords: Automatic speech recognition (ASR), speech disorder, stuttering
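The post-processing idea described above, removing stutter-style repetitions from the recognized text, can be sketched as collapsing consecutive duplicate words in the hypothesis. This is a minimal illustration of the concept, not the thesis's actual refinement module:

```python
# Collapse consecutive duplicate words (case-insensitive) in a recognized
# utterance, a simple stand-in for stutter-repetition removal.
def remove_repetitions(hypothesis: str) -> str:
    cleaned = []
    for word in hypothesis.split():
        # Keep a word only if it differs from the previously kept word
        if not cleaned or word.lower() != cleaned[-1].lower():
            cleaned.append(word)
    return " ".join(cleaned)
```

For example, a recognized stutter such as "I I want want to go" would be refined to "I want to go". A real system would also need to handle part-word repetitions and multi-word repeats, which a single-word comparison like this does not catch.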
236

Incorporation of syntax and semantics to improve the performance of an automatic speech recognizer

Rapholo, Moyahabo Isaiah January 2012 (has links)
Thesis (M.Sc. (Computer Science)) -- University of Limpopo, 2012 / Automatic Speech Recognition (ASR) is a technology that allows a computer to identify spoken words and translate them into text. Speech recognition systems have started to be used in many application areas, such as healthcare, automobile, e-commerce, and the military. The use of these speech recognition systems is usually limited by their poor performance. In this research we look at improving the performance of a baseline ASR system by incorporating syntactic structures in grammar into an existing Northern Sotho ASR based on hidden Markov models (HMMs). The syntactic structures are applied to the vocabulary used within the healthcare application domain. The Backus-Naur Form (BNF) and the Extended Backus-Naur Form (EBNF) were used to specify the grammar. The experimental results show an overall improvement over the baseline ASR system and hence give a basis for following this approach.
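The idea of constraining recognition with a BNF-style grammar can be illustrated with a toy example. The productions below are hypothetical English healthcare phrases, standing in for the thesis's actual Northern Sotho grammar; only hypotheses derivable from the grammar are accepted:

```python
import itertools

# A toy BNF-style grammar: nonterminals map to lists of productions,
# each production being a sequence of nonterminals and terminal words.
# These phrases are illustrative placeholders, not from the thesis.
GRAMMAR = {
    "<request>": [["<action>", "<object>"]],
    "<action>": [["book"], ["cancel"]],
    "<object>": [["an", "appointment"], ["a", "prescription"]],
}

def expand(symbol):
    """Yield every terminal word sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:
        yield [symbol]  # terminal word
        return
    for production in GRAMMAR[symbol]:
        parts = [list(expand(s)) for s in production]
        for combo in itertools.product(*parts):
            yield [w for seq in combo for w in seq]

def in_grammar(utterance: str) -> bool:
    """Accept an ASR hypothesis only if the grammar can derive it."""
    words = utterance.lower().split()
    return any(words == seq for seq in expand("<request>"))
```

In a real decoder the grammar prunes the search space directly rather than filtering finished hypotheses, but the acceptance criterion is the same: "book an appointment" derives from `<request>`, while "appointment book" does not.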
237

Confidence Measures for Speech/Speaker Recognition and Applications on Turkish LVCSR

Mengusoglu, Erhan 24 May 2004 (has links)
Confidence measures for the results of speech/speaker recognition make the systems more useful in real-time applications. Confidence measures provide a test statistic for accepting or rejecting the recognition hypothesis of the speech/speaker recognition system. Speech/speaker recognition systems are usually based on statistical modeling techniques, and in this thesis we defined confidence measures for the statistical modeling techniques used in such systems. For speech recognition we tested the available confidence measures and a newly defined confidence measure based on acoustic prior information under two error-causing conditions: out-of-vocabulary words and the presence of additive noise. We showed that the newly defined confidence measure performs better in both tests. A review of speech recognition and speaker recognition techniques and some related statistical methods is given throughout the thesis. We also defined a new interpretation technique for confidence measures, based on a Fisher transformation of the likelihood ratios obtained in speaker verification. The transformation provided a linearly interpretable confidence level which can be used directly in real-time applications such as dialog management. We also tested the confidence measures for speaker verification systems and evaluated their efficiency for adaptation of speaker models, showing that using confidence measures to select adaptation data improves the accuracy of the speaker model adaptation process. Another contribution of this thesis is the preparation of a phonetically rich continuous speech database for the Turkish language. The database is used for developing an HMM/MLP hybrid speech recognition system for Turkish.
Experiments on the test sets of the database showed that the speech recognition system has good accuracy for long speech sequences, while performance is lower for short words, as is the case for current speech recognition systems in other languages. A new language modeling technique for Turkish, which can also be used for other agglutinative languages, is introduced in this thesis. Performance evaluations showed that the newly defined language modeling technique outperforms classical n-gram language modeling.
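One plausible reading of the Fisher-transformation idea above is to map the verification likelihood ratio into a bounded score and apply the Fisher transform (arctanh) to obtain an approximately linear confidence scale. The mapping used here is an illustrative assumption, not the exact formulation from the thesis:

```python
import math

# Hedged sketch: map a positive likelihood ratio LR into (-1, 1) via
# s = (LR - 1) / (LR + 1), then apply the Fisher transformation
# z = arctanh(s). LR = 1 (equally likely target/impostor) maps to 0;
# LR > 1 gives positive confidence, LR < 1 negative. The bounding map
# is an assumption for illustration only.
def fisher_confidence(likelihood_ratio: float) -> float:
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    s = (likelihood_ratio - 1.0) / (likelihood_ratio + 1.0)
    return math.atanh(s)
```

A convenient property of this construction is symmetry: a ratio of 2 and a ratio of 1/2 give confidences of equal magnitude and opposite sign, which supports the "linearly interpretable" use in dialog management described above.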
238

Speech coding and transmission for improved automatic recognition in communication networks

Zhong, Xin 01 December 2003 (has links)
Acknowledgements section and vita redacted from this digital version.
239

Using observation uncertainty for robust speech recognition

Arrowood, Jon A. 01 December 2003 (has links)
No description available.
240

A detection-based pattern recognition framework and its applications

Ma, Chengyuan 06 April 2010 (has links)
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation. Inspired by studies in modern cognitive psychology and by real-world pattern recognition systems, a detection-based pattern recognition framework is proposed as an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined, and high-level context is incorporated as additional information at certain stages. A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which decomposes a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Information fusion strategies are employed to integrate the evidence from a lower level to form the evidence at a higher level, and this fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection; in such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high-level context information as an additional information source, which can be combined with bottom-up processing at any stage. This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation.
Several novel detection algorithms and evidence fusion methods were proposed, and their effectiveness was demonstrated in an automatic speech recognition system and a broadcast news video segmentation system. We believe such a detection-based framework can be employed in more applications in the future.
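The bottom-up fusion procedure described above, combining lower-level detector evidence into higher-level evidence until reaching the top, can be sketched as repeated weighted averaging. The function names and the use of a plain weighted mean are illustrative assumptions; the dissertation's actual fusion methods are more sophisticated:

```python
# Weighted average of detector scores at one level of the hierarchy.
def fuse_level(scores, weights):
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total

# Fuse a hierarchy bottom-up: each level's fused evidence is fed into the
# next level as an extra input, until a single top-level value remains.
def fuse_hierarchy(levels):
    evidence = None
    for scores, weights in levels:
        inputs = list(scores) + ([evidence] if evidence is not None else [])
        w = list(weights) + ([1.0] if evidence is not None else [])
        evidence = fuse_level(inputs, w)
    return evidence
```

With two equally weighted detectors scoring 0.8 and 0.6 at the bottom level, the fused evidence is 0.7, which then combines with any detectors at the next level up; swapping in a different `fuse_level` (e.g., a trained classifier) leaves the hierarchy untouched, mirroring the component-replaceability advantage noted above.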
