Grapheme-to-phoneme conversion (G2P) is the task of converting a word, represented by a sequence of graphemes, to its pronunciation, represented by a sequence of phonemes. The G2P task plays a crucial role in speech synthesis systems, and is an important part of other applications, including spelling correction and speech-to-speech machine translation. G2P conversion is a complex task, for which a number of diverse solutions have been proposed. In general, the problem is challenging because the source string does not unambiguously specify the target representation. In addition, the training data include only example word
pairs without the structural information of subword alignments.
In this thesis, I introduce several novel approaches for G2P conversion. My contributions can be categorized into (1) new alignment models and (2) new output generation models. With respect to alignment models, I present techniques including many-to-many alignment, phonetic-based alignment, alignment by integer linear programing and alignment-by-aggregation. Many-to-many alignment is designed to replace the one-to-one
alignment that has been used almost exclusively in the past. The new many-to-many alignments are more precise and accurate in expressing grapheme-phoneme relationships. The other proposed alignment approaches attempt to advance the training method beyond the use of Expectation-Maximization (EM). With respect to generation models, I first describe a framework for integrating many-to-many alignments and language models for grapheme classification. I then propose joint processing for G2P using online discriminative training. I integrate a generative joint n-gram model into the discriminative framework. Finally, I apply the proposed G2P systems to name transliteration generation and mining tasks. Experiments show that the proposed system achieves state-of-the-art performance in both the G2P and name transliteration tasks.
Identifer | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:AEU.10048/1647 |
Date | 06 1900 |
Creators | Jiampojamarn, Sittichai |
Contributors | Grzegorz Kondrak (Computing Science), Randy Goebel (Computing Science), Dale Schuurmans (Computing Science), Harald Baayen (Linguistics), Anoop Sarkar (School of Computing Science, Simon Fraser University) |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 934012 bytes, application/pdf |
Relation | Sittichai Jiampojamarn, Kenneth Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young Kim, and Grzegorz Kondrak "Transliteration generation and mining with limited training resources. In Proceeding of the Named Entities Workshop (NEWS) 2010, Sweden, July 2010, pp. 39-47, Sittichai Jiampojamarn and Grzegorz Kondrak "Letter-Phoneme Alignment: An Exploration" In Proceeding of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), July 2010, pp.780-788, Sittichai Jiampojamarn, Colin Cherry and Grzegorz Kondrak "Integrating Joint n-gram Features into a Discriminative Training Framework" In Proceeding of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2010), June 2010, pp.697-700., Sittichai Jiampojamarn and Grzegorz Kondrak "Online Discriminative Training for Grapheme-to-Phoneme Conversion" In Proceeding of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH), Brighton, U.K., September 2009, pp.1303-1306., Sittichai Jiampojamarn, Aditya Bhargava, Qing Dou, Kenneth Dwyer and Grzegorz Kondrak "DIRECTL: a Language-Independent Approach to Transliteration". In Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), Singapore, August 2009, pp.28-31., Qing Dou, Shane Bergsma, Sittichai Jiampojamarn and Grzegorz Kondrak "A Ranking Approach to Stress Prediction for Letter-to-Phoneme Conversion". Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, August 2009, pp.118-126., Sittichai Jiampojamarn, Colin Cherry and Grzegorz Kondrak. "Joint Processing and Discriminative Training for Letter-to-Phoneme Conversion". In Proceeding of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-08: HLT), Columbus, OH, June 2008, pp.905-913., Sittichai Jiampojamarn, Grzegorz Kondrak and Tarek Sherif. "Applying Many-to-Many Alignments and Hidden Markov Models to Letter-to-Phoneme Conversion". Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, NY, April 2007, pp.372-379. |
Page generated in 0.0025 seconds