Master's / National Tsing Hua University / Institute of Statistics / 93 / Taiwanese is rich in tone sandhi. Tone sandhi poses a two-part problem: when to apply it, and how to apply it. For multi-syllabic words, major rules exist for both parts of the problem. For a complete sentence or a phrase consisting of multiple words, however, the word-level tone sandhi rules may not apply at the last syllable of each word. The traditional approach to this problem is syntactic analysis; this paper studies the tone sandhi problem with a statistical approach. Using as corpora seven volumes of Buddhist sutras, published and phonetically annotated in Taiwanese by a senior nun, we model the phonetic transcription with a syllable-based Markov language model and focus specifically on the tone sandhi problem. A unigram model gives 80% correct and a bigram model 84%. Both results are computed using seven-fold cross-validation.
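The setup described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual implementation: the data layout (each sentence as a list of `(syllable, sandhi_flag)` pairs), the choice of the previous syllable as the bigram context, the back-off to the unigram counts, and the function names `train`, `predict`, and `cross_validate` are all assumptions made for illustration. Holding out one volume per fold is one plausible reading of "seven-fold" for a seven-volume corpus.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train(sentences):
    """Count sandhi flags per syllable (unigram) and per
    (previous syllable, current syllable) pair (bigram)."""
    uni = defaultdict(Counter)
    bi = defaultdict(Counter)
    for sent in sentences:
        prev = START
        for syl, flag in sent:
            uni[syl][flag] += 1
            bi[(prev, syl)][flag] += 1
            prev = syl
    return uni, bi


def predict(model, syllables, use_bigram, default=True):
    """Predict a sandhi flag for each syllable by majority vote.

    With use_bigram=True the bigram counts are tried first, backing off
    to unigram counts and then to `default` for unseen syllables.
    """
    uni, bi = model
    flags = []
    prev = START
    for syl in syllables:
        if use_bigram and (prev, syl) in bi:
            flag = bi[(prev, syl)].most_common(1)[0][0]
        elif syl in uni:
            flag = uni[syl].most_common(1)[0][0]
        else:
            flag = default
        flags.append(flag)
        prev = syl
    return flags


def cross_validate(volumes, use_bigram):
    """Hold out each volume in turn, train on the rest, and return the
    pooled per-syllable accuracy."""
    correct = total = 0
    for i, held_out in enumerate(volumes):
        training = [s for j, v in enumerate(volumes) if j != i for s in v]
        model = train(training)
        for sent in held_out:
            preds = predict(model, [syl for syl, _ in sent], use_bigram)
            correct += sum(p == flag for p, (_, flag) in zip(preds, sent))
            total += len(sent)
    return correct / total
```

Calling `cross_validate(volumes, use_bigram=False)` and `cross_validate(volumes, use_bigram=True)` on a list of seven annotated volumes would give accuracy figures comparable in kind to the unigram and bigram results reported above, though not necessarily the same numbers.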
Identifier | oai:union.ndltd.org:TW/093NTHU5337003 |
Date | January 2005 |
Creators | 洪俊詠 |
Contributors | 江永進 |
Source Sets | National Digital Library of Theses and Dissertations in Taiwan |
Language | zh-TW |
Detected Language | English |
Type | degree thesis ; thesis |
Format | 34 |