
Markov Language Model Applied to Taiwanese Tone Sandhi and Phonetic Annotations / 馬可夫語言模型應用di台語變調gah注音

Master's thesis / National Tsing Hua University / Institute of Statistics / 93 / Taiwanese is rich in tone sandhi. The problem has two parts: when tone sandhi applies, and how the tone changes. For multi-syllabic words, major rules exist for both parts. In a complete sentence or a phrase consisting of multiple words, however, the word-level tone sandhi rules may not apply at the last syllable of each word. The traditional approach to this problem is syntactic analysis; this paper studies it with a statistical approach. Using as a corpus the seven volumes of Buddhist sutras published and phonetically annotated in Taiwanese by a senior nun, we model the phonetic transcription with a syllable-based Markov language model and focus specifically on the tone sandhi problem. A unigram model achieves 80% accuracy and a bigram model 84%. Both results are computed using seven-fold cross-validation.
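
The abstract does not give the model's details, so the following is only a minimal sketch of how a syllable-based unigram and bigram predictor could be scored with k-fold cross-validation. Everything in it is an assumption made for illustration, not the thesis's method: the data layout (each sentence is a list of `(syllable, label)` pairs), the labels `"sandhi"` and `"base"`, the choice of the previous syllable as the bigram context, the back-off to the unigram counts, and the function names `train`, `predict`, `accuracy`, and `cross_validate`.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train(sentences):
    """Count labels per syllable (unigram) and per (previous syllable, syllable) pair (bigram)."""
    uni = defaultdict(Counter)
    bi = defaultdict(Counter)
    for sent in sentences:
        prev = START
        for syl, label in sent:
            uni[syl][label] += 1
            bi[(prev, syl)][label] += 1
            prev = syl
    return uni, bi


def predict(uni, bi, prev, syl, use_bigram, default="sandhi"):
    """Return the most frequent label for the context, backing off bigram -> unigram -> default."""
    if use_bigram and (prev, syl) in bi:
        return bi[(prev, syl)].most_common(1)[0][0]
    if syl in uni:
        return uni[syl].most_common(1)[0][0]
    return default  # unseen syllable: assumed fallback label


def accuracy(train_sents, test_sents, use_bigram):
    """Fraction of test syllables whose label is predicted correctly."""
    uni, bi = train(train_sents)
    correct = total = 0
    for sent in test_sents:
        prev = START
        for syl, label in sent:
            correct += predict(uni, bi, prev, syl, use_bigram) == label
            total += 1
            prev = syl
    return correct / total if total else 0.0


def cross_validate(folds, use_bigram):
    """Hold out each fold once, train on the rest, and average the accuracies."""
    scores = []
    for i, held_out in enumerate(folds):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(training, held_out, use_bigram))
    return sum(scores) / len(scores)
```

With seven folds, for example one per volume of the corpus (again an assumption), `cross_validate(folds, use_bigram=False)` and `cross_validate(folds, use_bigram=True)` would give the two averaged accuracies to compare.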

Identifier oai:union.ndltd.org:TW/093NTHU5337003
Date January 2005
Creators 洪俊詠
Contributors 江永進
Source Sets National Digital Library of Theses and Dissertations in Taiwan
Language zh-TW
Detected Language English
Type 學位論文 (degree thesis) ; thesis
Format 34
