Strategic Selection of Training Data for Domain-Specific Speech Recognition

Speech recognition is now a key topic in computer science with the proliferation of voice-activated assistants and voice-enabled devices. Many companies offer a speech recognition service that developers can use to enable smart devices and services. These speech-to-text systems, however, have significant room for improvement, especially on domain-specific speech. IBM's Watson speech-to-text service attempts to support domain-specific uses by allowing users to upload their own training data to build custom models that augment Watson's general model. This requires choosing a strategy for selecting the training data. This thesis experiments with different training choices for custom language models that augment Watson's speech-to-text service. The results show that using recent utterances is the best choice of training data in our use case of Digital Democracy. We are able to improve speech recognition accuracy by 2.3% over the control with no custom model. However, choosing training utterances most specific to the use case is better when large enough volumes of such training data are available.

Speech Recognition

Transcription

Identifier	oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-3255
Date	01 June 2018
Creators	Girerd, Daniel
Publisher	DigitalCommons@CalPoly
Source Sets	California Polytechnic State University
Detected Language	English
Type	text
Format	application/pdf
Source	Master's Theses
