
Use of tone information in Cantonese LVCSR based on generalized character posterior probability decoding. / CUHK electronic theses & dissertations collection

Automatic recognition of Cantonese tones has long been regarded as a difficult task. Cantonese has one of the most complicated tone systems among all languages in the world. This thesis presents a novel approach to modeling Cantonese tones. We propose the use of supra-tone models. Each supra-tone unit covers a number of syllables in succession. The supra-tone model characterizes not only the tone contours of individual syllables but also the transitions among them. By including multiple tone contours in one modeling unit, the relative heights of the tones are captured explicitly. This is especially important for the discrimination among the level tones of Cantonese. / The decoding in conventional LVCSR systems aims at finding the sentence hypothesis, i.e. the string of words, which has the maximum a posteriori (MAP) probability in comparison with other hypotheses. However, in most applications, the recognition performance is measured in terms of word error rate (or word accuracy). In Chinese languages, given that "word" is a rather ambiguous concept, speech recognition performance is usually measured in terms of the character error rate. In this thesis, we develop a decoding algorithm that can minimize the character error rate. The algorithm is applied to a reduced search space, e.g. a word graph or the N-best sentence list, which results from the 1st pass of search, and the generalized character posterior probability (GCPP) is maximized. (Abstract shortened by UMI.) / This thesis addresses two major problems of the existing large vocabulary continuous speech recognition (LVCSR) technology: (1) inadequate exploitation of alternative linguistic and acoustic information; and (2) the mismatch between the decoding (recognition) criterion and the performance evaluation. The study is focused on Cantonese, one of the major Chinese dialects, which is also monosyllabic and tonal. Tone is somewhat indispensable for lexical access and disambiguation of homonyms in Cantonese. However, integrating tone information into Cantonese LVCSR requires effective tone recognition as well as a seamless integration algorithm.
Qian Yao. / "July 2005." / Adviser: Tan Lee. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 4009. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 100-110). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_343658
Date: January 2005
Contributors: Qian, Yao., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, theses
Format: electronic resource, microform, microfiche, 1 online resource (xviii, 110 p. : ill.)
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
