Global ETD Search

A Study On Language Modeling For Turkish Large Vocabulary Continuous Speech Recognition

This study focuses on large-vocabulary continuous speech recognition for Turkish. Continuous speech recognition is difficult to perform accurately for Turkish because the language is agglutinative, which degrades the performance of the classical language models used in the field. In this thesis, acoustic models with different parameters are first constructed and tested. Three types of n-gram language models are then built: class-based models, stem-based models, and stem-end-based models. Two-pass recognition is performed with the Hidden Markov Model Toolkit (HTK), testing the system first with the bigram models and then with the trigram models. The study finds that trigram models over stems and endings give better results, since their coverage of the vocabulary is better.
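
The coverage argument in the abstract can be illustrated with a small sketch. This is not the method used in the thesis: the stem list, the example words, and the helper names (`split_stem_ending`, `to_units`, `coverage`) are hypothetical, and longest-prefix matching is only a stand-in for a real morphological analysis. It shows how splitting words into stems and endings lets unseen word forms be covered by units that were seen in training.

```python
# Toy illustration only: the stem list and the words are hypothetical,
# and longest-prefix matching stands in for a real morphological parser.
STEMS = {"ev", "kitap", "göz"}


def split_stem_ending(word, stems=STEMS):
    """Split a word into (stem, ending) using the longest known stem prefix."""
    for cut in range(len(word), 0, -1):
        if word[:cut] in stems:
            return word[:cut], word[cut:]
    return word, ""  # unknown stem: keep the word whole


def to_units(words):
    """Map words to a stem/ending unit sequence; endings are marked with '+'."""
    units = []
    for w in words:
        stem, ending = split_stem_ending(w)
        units.append(stem)
        if ending:
            units.append("+" + ending)
    return units


def coverage(train_tokens, test_tokens):
    """Fraction of test tokens that also occur among the training tokens."""
    vocab = set(train_tokens)
    return sum(t in vocab for t in test_tokens) / len(test_tokens)


train = ["evler", "kitapta", "gözde"]
test = ["evde", "kitaplar", "gözler"]

word_cov = coverage(train, test)                      # whole-word units
unit_cov = coverage(to_units(train), to_units(test))  # stem + ending units
print(f"word coverage: {word_cov:.2f}, stem/ending coverage: {unit_cov:.2f}")
```

In this toy data none of the test word forms occur in the training list, while most of their stems and endings do, so the stem/ending vocabulary covers more of the test tokens than the whole-word vocabulary.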

http://etd.lib.metu.edu.tr/upload/2/12606612/index.pdf

Identifier	oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/2/12606612/index.pdf
Date	01 June 2005
Creators	Bayer, Ali Orkan
Contributors	Turhan Yondem, Meltem
Publisher	METU
Source Sets	Middle East Technical Univ.
Language	English
Detected Language	English
Type	M.S. Thesis
Format	text/pdf
Rights	To liberate the content for METU campus
