
A Hidden Markov Model-Based Approach for Emotional Speech Synthesis

In this thesis, we describe two approaches to automatically synthesizing the emotional speech of a target speaker, starting from a hidden Markov model (HMM) of the speaker's neutral speech.
In the interpolation-based method, the basic idea is to interpolate between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the selection of the emotional model and the computation of the interpolation weight are based on a model-distance measure, for which we propose a monophone-based Mahalanobis distance (MBMD).
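The selection and interpolation steps can be illustrated with a minimal sketch. The abstract does not give the exact form of the MBMD or of the weight computation, so everything here is an assumption made for illustration: each model is reduced to one diagonal-covariance Gaussian per monophone, the distance is averaged over shared monophones using the first model's variances, the candidate closest to the neutral model is chosen, and the interpolation weight is passed in by hand. The names `mbmd`, `select_candidate`, and `interpolate` are hypothetical.

```python
import math

# Assumed model layout: {monophone: (mean_vector, diagonal_variance)}.

def mahalanobis(mean_a, mean_b, var):
    """Mahalanobis distance between two means under a diagonal covariance."""
    return math.sqrt(sum((a - b) ** 2 / v for a, b, v in zip(mean_a, mean_b, var)))

def mbmd(model_a, model_b):
    """Average per-monophone distance over shared monophones (assumed form)."""
    shared = sorted(set(model_a) & set(model_b))
    if not shared:
        raise ValueError("models share no monophones")
    total = 0.0
    for phone in shared:
        mean_a, var_a = model_a[phone]
        mean_b, _ = model_b[phone]
        total += mahalanobis(mean_a, mean_b, var_a)
    return total / len(shared)

def select_candidate(neutral, pool):
    """Return the name of the pool model closest to the neutral model."""
    return min(pool, key=lambda name: mbmd(neutral, pool[name]))

def interpolate(neutral, emotional, weight):
    """Linearly interpolate means and variances; weight=0 keeps the neutral model."""
    result = {}
    for phone in set(neutral) & set(emotional):
        (mean_n, var_n), (mean_e, var_e) = neutral[phone], emotional[phone]
        result[phone] = (
            [(1 - weight) * a + weight * b for a, b in zip(mean_n, mean_e)],
            [(1 - weight) * a + weight * b for a, b in zip(var_n, var_e)],
        )
    return result

# Toy two-dimensional data, invented purely for illustration.
neutral = {"a": ([0.0, 0.0], [1.0, 1.0]), "i": ([1.0, 1.0], [1.0, 4.0])}
pool = {
    "angry_spk1": {"a": ([3.0, 4.0], [1.0, 1.0]), "i": ([1.0, 3.0], [1.0, 1.0])},
    "angry_spk2": {"a": ([1.0, 0.0], [1.0, 1.0]), "i": ([1.0, 1.0], [1.0, 1.0])},
}

best = select_candidate(neutral, pool)
blended = interpolate(neutral, pool[best], weight=0.5)
```

In this toy setup `angry_spk2` is selected because its monophone means lie closer to the neutral model. A full system would interpolate complete HMM state distributions and derive the weight from the distance itself, which this sketch leaves open.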
In the parallel model combination (PMC) based method, the basic idea is to model the mismatch between the neutral model and the emotional model. We train a linear regression model to describe this mismatch and then combine it with the neutral model of the target speaker.
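The mismatch-modeling idea can be sketched as follows. The abstract does not specify the regression's inputs, parameter tying, or how the combination is carried out, so this example assumes the simplest case: an independent affine map per feature dimension, fitted by ordinary least squares on paired neutral and emotional mean vectors, then applied to a target speaker's neutral mean. The function names and data are hypothetical.

```python
def fit_affine(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature dimension."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("neutral values are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_mismatch(neutral_means, emotional_means):
    """Fit one affine map per dimension from paired neutral/emotional means."""
    dims = len(neutral_means[0])
    return [
        fit_affine([m[d] for m in neutral_means], [m[d] for m in emotional_means])
        for d in range(dims)
    ]

def apply_mismatch(params, neutral_mean):
    """Shift a neutral mean vector toward the emotional style."""
    return [slope * x + intercept for (slope, intercept), x in zip(params, neutral_mean)]

# Toy paired training means, invented purely for illustration.
train_neutral = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
train_emotional = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]

params = fit_mismatch(train_neutral, train_emotional)
target_emotional_mean = apply_mismatch(params, [4.0, 4.0])
```

Fitting each dimension separately keeps the sketch small; a full-matrix regression would also capture cross-dimension effects at the cost of needing more paired training data.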
We evaluate both approaches on synthesized speech expressing anger, happiness, and sadness, using several subjective tests. The experimental results show that the implemented system can synthesize emotionally expressive speech in the voice of the target speaker.

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0830110-111455
Date: 30 August 2010
Creators: Yang, Chih-Yung
Contributors: Chia-Ping Chen, Hsin-Min Wang, Jui-Feng Yeh, Chung-Hsien Wu
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0830110-111455
Rights: not_available, Copyright information available at source archive
