Global ETD Search

Return to search

Use of tone information in Cantonese LVCSR based on generalized character posterior probability decoding. / CUHK electronic theses & dissertations collection

Automatic recognition of Cantonese tones has long been regarded as a difficult task. Cantonese has one of the most complicated tone systems among all languages in the world. This thesis presents a novel approach of modeling Cantonese tones. We propose the use of supra-tone models. Each supra-tone unit covers a number of syllables in succession. The supra-tone model characterizes not only the tone contours of individual syllables but also the transitions among them. By including multiple tone contours in one modeling unit, the relative heights of the tones are captured explicitly. This is especially important for the discrimination among the level tones of Cantonese. / The decoding in conventional LVCSR systems aims at finding the sentence hypothesis, i.e. the string of words, which has the maximum a posterior (MAP) probability in comparison with other hypotheses. However, in most applications, the recognition performance is measured in terms of word error rate (or word accuracy). In Chinese languages, given that "word" is a rather ambiguous concept, speech recognition performance is usually measured in terms of the character error rate. In this thesis, we develop a decoding algorithm that can minimize the character error rate. The algorithm is applied to a reduced search space, e.g. a word graph or the N-best sentence list, which results from the 1st pass of search, and the generalized character posterior probability (GCPP) is maximized. (Abstract shortened by UMI.) / This thesis addresses two major problems of the existing large vocabulary continuous speech recognition (LVCSR) technology: (1) inadequate exploitation of alternative linguistic and acoustic information; and (2) the mismatch between the decoding (recognition) criterion and the performance evaluation. The study is focused on Cantonese, one of the major Chinese dialects, which is also monosyllabic and tonal. Tone is somewhat indispensable for lexical access and disambiguation of homonyms in Cantonese. However, tone information into Cantonese LVCSR requires effective tone recognition as well as a seamless integration algorithm. / Qian Yao. / "July 2005." / Adviser: Tan Lee. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 4009. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 100-110). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.

Automatic speech recognition

Cantonese dialects--Data processing

Cantonese dialects--Tone

Identifer	oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_343658
Date	January 2005
Contributors	Qian, Yao., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Source Sets	The Chinese University of Hong Kong
Language	English, Chinese
Detected Language	English
Type	Text, theses
Format	electronic resource, microform, microfiche, 1 online resource (xviii, 110 p. : ill.)
Rights	Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Page generated in 0.0019 seconds

Use of tone information in Cantonese LVCSR based on generalized character posterior probability decoding. / CUHK electronic theses & dissertations collection

Description

Links & Downloads

Tags

Additional Fields