321 |
Analysis, synthesis, and recognition of stressed speech / Cummings, Kathleen E. 12 1900 (has links)
No description available.
|
322 |
A Gaussian mixture modeling approach to text-independent speaker identification / Reynolds, Douglas A. 08 1900 (has links)
No description available.
|
323 |
Seshat: A sync system for Audiobooks and eBooks / Dervisevic, Adnan; Oskarsson, Tobias January 2014 (has links)
In this degree project we present a way to construct a synchronization system that creates a timings file, which the system uses to know how to sync the eBook and audiobook, using speech recognition and estimation algorithms. The system then uses this file to let the user select a sentence and have the audiobook start reading from that sentence, or vice versa. The system can create these files with a mean offset from a manually timed file that is within our expectations for the system. We use estimation algorithms to fill in the blanks where the speech recognition falls short: recognition accuracy is typically between 40% and 60%, sometimes dipping lower, so there are blanks to fill in. Using the basic algebraic principle for calculating velocity, we extrapolate the speed of a reader, taking the duration of the audiobook as the time and the number of characters written as the distance. For increased accuracy we derive this value on a per-chapter basis. Using this method we are able to create accurate files, which the user can use to freely sync any location in the book. Our system is designed to work for any book in the world whose audiobook does not cut off between sentences in the audio files. We manually created timings files for four different books with widely varying publishing dates, author styles, reader styles and genders, to create as wide and representative a testing pool as possible for the project.
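The per-chapter velocity extrapolation the abstract describes can be sketched as follows (a minimal illustration; the function names, variable names, and sample numbers are our own, not taken from the thesis):

```python
def chars_per_second(chapter_text: str, audio_duration_s: float) -> float:
    # Reading speed as a velocity: characters (distance) per second (time),
    # derived per chapter for increased accuracy.
    return len(chapter_text) / audio_duration_s

def estimate_offset_s(char_offset: int, rate: float) -> float:
    # Extrapolate: a sentence beginning at character `char_offset` is
    # expected at char_offset / rate seconds into the chapter's audio.
    return char_offset / rate

chapter = "Once upon a time... " * 200          # stand-in chapter text
rate = chars_per_second(chapter, audio_duration_s=600.0)
t = estimate_offset_s(len(chapter) // 2, rate)  # midpoint of the chapter
```

In a full system, estimates like `t` would only fill the gaps between the anchor points that speech recognition does confirm.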
|
324 |
Combining games and speech recognition in a multilingual educational environment / M. Booth. Booth, Martin January 2014 (has links)
Playing has been part of people's lives since the beginning of time. However, play does not take place in silence (isolated from speech and sound). The games people play allow them to interact and to learn through experiences. Speech often forms an integral part of playing games. Video games also allow players to interact with a virtual world and learn through those experiences. Speech input has previously been explored as a way of interacting with a game, as talking is a natural way of communicating. By talking to a game, the experiences created during gameplay become more valuable, which in turn facilitates effective learning.

In order to enable a game to "hear", some issues need to be considered. A game that will serve as a platform for speech input has to be developed. If the game will contain learning elements, expert knowledge regarding the learning content needs to be obtained. The game needs to communicate with a speech recognition system, which will recognise players' speech inputs. To understand the role of speech recognition in a game, players need to be tested while playing the game. The players' experiences and opinions can then be fed back into the development of speech recognition in educational games.

This process was followed with six Financial Management students on the NWU Vaal Triangle campus. The students played FinMan, a game which teaches the fundamental concepts of the "Time value of money" principle. They played the game with the keyboard and mouse, as well as via speech commands. The students shared their experiences through a focus group discussion and by completing a questionnaire. Quantitative data was collected to back the students' experiences. The results show that, although recognition accuracies and response times are important issues, speech recognition can play an essential part in educational games. By freeing learners to focus on the game content, speech recognition can make games more accessible and engaging, and consequently lead to more effective learning experiences. / MSc (Computer Science), North-West University, Vaal Triangle Campus, 2014
|
325 |
Spoken letter recognition with neural networks / Reynolds, James H. January 1991 (has links)
Neural networks have recently been applied to real-world speech recognition problems with a great deal of success. This thesis develops a strategy for optimising a neural network known as the Radial Basis Function (RBF) classifier on a large spoken letter recognition problem designed by British Telecom Research Laboratories. The strategy developed can be viewed as a compromise between a fully adaptive approach involving prohibitively large amounts of computation, and a heuristic approach resulting in poor generalisation. A value for the optimal number of kernel functions is suggested, and methods for determining the positions of the centres and the values of the width parameters are provided. During the evolution of the optimisation strategy it was demonstrated that spatial organisation of the centres does not adversely affect the ability of the classifier to generalise.

An RBF employing the optimisation strategy achieved a lower error rate than a multilayer perceptron and two traditional static pattern classifiers on the same problem. The error rate of the RBF was very close to the theoretical minimum error rate obtainable with an optimal Bayes classifier. In addition to error rate, the performance of the classifiers was assessed in terms of the computational requirements of training and classification, illustrating the significant trade-off between computational investment in training and level of generalisation achieved. The error rate of the RBF was compared with that of a well-established method of dynamic classification to examine whether non-linear time normalisation of word patterns was advantageous to generalisation. It was demonstrated that the dynamic classifier was better suited to small-scale speech recognition problems, and the RBF to speaker-independent speech recognition problems. The dynamic classifier was then combined with a neural network algorithm, greatly reducing its computational requirement without significantly increasing its error rate.
This system was then extended into a novel system for visual feedback therapy in which speech is visualised as a moving trajectory on a computer screen.
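The RBF classifier the abstract describes (fixed Gaussian kernel centres and widths, with a linear output layer) can be sketched minimally as below. This is a generic illustration under our own assumptions, not the thesis's optimisation strategy: the centres, widths, and the toy two-class data are invented for the example.

```python
import numpy as np

def rbf_design(X, centres, width):
    # Gaussian kernel activations: phi[i, j] = exp(-||x_i - c_j||^2 / (2 * width^2))
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def train_output_weights(X, Y, centres, width):
    # With centres and widths fixed, the output layer is linear in the
    # kernel activations, so the weights come from a least-squares fit.
    Phi = rbf_design(X, centres, width)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return W

# Toy two-class problem: clusters around (0, 0) and (5, 5),
# one kernel centre per cluster, one-hot class targets.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(5.0, 0.5, (20, 2))])
Y = np.repeat(np.eye(2), 20, axis=0)
centres = np.array([[0.0, 0.0], [5.0, 5.0]])
W = train_output_weights(X, Y, centres, width=1.0)
predicted = (rbf_design(X, centres, 1.0) @ W).argmax(axis=1)
```

The questions the thesis addresses (how many kernels, where to place the centres, how wide to make them) are exactly the choices hard-coded here.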
|
326 |
Computer aided pronunciation system (CAPS) / Ananthakrishnan, Kollengode Subramanian. Unknown Date (has links)
Thesis (MEng (Telecommunications by Research))--University of South Australia, 2003.
|
327 |
A statistical approach to formant tracking / Gayvert, Robert T. January 1988 (has links)
Thesis (M.S.)--Rochester Institute of Technology, 1989. / Includes bibliographical references (leaves 20-21).
|
328 |
A prototype of an online speech-enabled information access tool using Java speech application programming interface / Narayanaswami, Anand. January 2001 (has links)
Thesis (M.S.)--Ohio University, June, 2001. / Title from PDF t.p.
|
329 |
Automatic formant labeling in continuous speech / Richards, Elizabeth A. January 1989 (has links)
Thesis (M.S.)--Rochester Institute of Technology, 1989. / Includes bibliographical references (leaves 81-85).
|
330 |
Vowel recognition in continuous speech / Stam, Darrell C. January 1900 (has links)
Thesis (M.S.)--Rochester Institute of Technology, 1989. / Includes bibliographical references (leaves 73-75).
|