Global ETD Search

Return to search

Large Vocabulary Continuous Speech Recogniton For Turkish Using Htk

This study aims to build a new language model that can be used in a Turkish large vocabulary continuous speech recognition system. Turkish is a very productive language in terms of word forms because of its agglutinative nature. For such languages like Turkish, the vocabulary size is far from being acceptable. From only one simple stem, thousands of new word forms can be generated using inflectional or derivational suffixes. In this thesis, words are parsed into their stems and endings. One ending includes the suffixes attached to the associated root. Then the search network based on bigrams is constructed. Bigrams are obtained either using stem and endings, or using only stems. The language model proposed is based on bigrams obtained using only stems. All work is done in HTK (Hidden Markov Model Toolkit) environment, except parsing and network transforming. Besides of offering a new language model for Turkish, this study involves a comprehensive work about speech recognition inspecting into concepts in the state of the art speech recognition systems. To acquire good command of these concepts and processes in speech recognition isolated word, connected word and continuous speech recognition tasks are performed. The experimental results associated with these tasks are also given.

http://etd.lib.metu.edu.tr/upload/1205491/index.pdf

Identifer	oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/1205491/index.pdf
Date	01 January 2003
Creators	Comez, Murat Ali
Contributors	Ciloglu, Tolga
Publisher	METU
Source Sets	Middle East Technical Univ.
Language	English
Detected Language	English
Type	M.S. Thesis
Format	text/pdf
Rights	To liberate the content for public access

Page generated in 0.0024 seconds

Large Vocabulary Continuous Speech Recogniton For Turkish Using Htk

Description

Links & Downloads

Tags

Additional Fields