Phone-based speech synthesis using neural network with articulatory control.

by Lo Wai Kit.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 151-160).

Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Applications of Speech Synthesis --- p.2
Chapter 1.1.1 --- Human Machine Interface --- p.2
Chapter 1.1.2 --- Speech Aids --- p.3
Chapter 1.1.3 --- Text-To-Speech (TTS) system --- p.4
Chapter 1.1.4 --- Speech Dialogue System --- p.4
Chapter 1.2 --- Current Status in Speech Synthesis --- p.6
Chapter 1.2.1 --- Concatenation Based --- p.6
Chapter 1.2.2 --- Parametric Based --- p.7
Chapter 1.2.3 --- Articulatory Based --- p.7
Chapter 1.2.4 --- Application of Neural Network in Speech Synthesis --- p.8
Chapter 1.3 --- The Proposed Neural Network Speech Synthesis --- p.9
Chapter 1.3.1 --- Motivation --- p.9
Chapter 1.3.2 --- Objectives --- p.9
Chapter 1.4 --- Thesis outline --- p.11
Chapter 2 --- Linguistic Basics for Speech Synthesis --- p.12
Chapter 2.1 --- Relations between Linguistic and Speech Synthesis --- p.12
Chapter 2.2 --- Basic Phonology and Phonetics --- p.14
Chapter 2.2.1 --- Phonology --- p.14
Chapter 2.2.2 --- Phonetics --- p.15
Chapter 2.2.3 --- Prosody --- p.16
Chapter 2.3 --- Transcription Systems --- p.17
Chapter 2.3.1 --- The Employed Transcription System --- p.18
Chapter 2.4 --- Cantonese Phonology --- p.20
Chapter 2.4.1 --- Some Properties of Cantonese --- p.20
Chapter 2.4.2 --- Initial --- p.21
Chapter 2.4.3 --- Final --- p.23
Chapter 2.4.4 --- Lexical Tone --- p.25
Chapter 2.4.5 --- Variations --- p.26
Chapter 2.5 --- The Vowel Quadrilaterals --- p.29
Chapter 3 --- Speech Synthesis Technology --- p.32
Chapter 3.1 --- The Human Speech Production --- p.32
Chapter 3.2 --- Important Issues in Speech Synthesis System --- p.34
Chapter 3.2.1 --- Controllability --- p.34
Chapter 3.2.2 --- Naturalness --- p.34
Chapter 3.2.3 --- Complexity --- p.35
Chapter 3.2.4 --- Information Storage --- p.35
Chapter 3.3 --- Units for Synthesis --- p.37
Chapter 3.4 --- Type of Synthesizer --- p.40
Chapter 3.4.1 --- Copy Concatenation --- p.40
Chapter 3.4.2 --- Vocoder --- p.41
Chapter 3.4.3 --- Articulatory Synthesis --- p.44
Chapter 4 --- Neural Network Speech Synthesis with Articulatory Control --- p.47
Chapter 4.1 --- Neural Network Approximation --- p.48
Chapter 4.1.1 --- The Approximation Problem --- p.48
Chapter 4.1.2 --- Network Approach for Approximation --- p.49
Chapter 4.2 --- Artificial Neural Network for Phone-based Speech Synthesis --- p.53
Chapter 4.2.1 --- Network Approximation for Speech Signal Synthesis --- p.53
Chapter 4.2.2 --- Feed forward Backpropagation Neural Network --- p.56
Chapter 4.2.3 --- Radial Basis Function Network --- p.58
Chapter 4.2.4 --- Parallel Operating Synthesizer Networks --- p.59
Chapter 4.3 --- Template Storage and Control for the Synthesizer Network --- p.61
Chapter 4.3.1 --- Implicit Template Storage --- p.61
Chapter 4.3.2 --- Articulatory Control Parameters --- p.61
Chapter 4.4 --- Summary --- p.65
Chapter 5 --- Prototype Implementation of the Synthesizer Network --- p.66
Chapter 5.1 --- Implementation of the Synthesizer Network --- p.66
Chapter 5.1.1 --- Network Architectures --- p.68
Chapter 5.1.2 --- Spectral Templates for Training --- p.74
Chapter 5.1.3 --- System requirement --- p.76
Chapter 5.2 --- Subjective Listening Test --- p.79
Chapter 5.2.1 --- Sample Selection --- p.79
Chapter 5.2.2 --- Test Procedure --- p.81
Chapter 5.2.3 --- Result --- p.83
Chapter 5.2.4 --- Analysis --- p.86
Chapter 5.3 --- Summary --- p.88
Chapter 6 --- Simplified Articulatory Control for the Synthesizer Network --- p.89
Chapter 6.1 --- Coarticulatory Effect in Speech Production --- p.90
Chapter 6.1.1 --- Acoustic Effect --- p.90
Chapter 6.1.2 --- Prosodic Effect --- p.91
Chapter 6.2 --- Control in various Synthesis Techniques --- p.92
Chapter 6.2.1 --- Copy Concatenation --- p.92
Chapter 6.2.2 --- Formant Synthesis --- p.93
Chapter 6.2.3 --- Articulatory synthesis --- p.93
Chapter 6.3 --- Articulatory Control Model based on Vowel Quad --- p.94
Chapter 6.3.1 --- Modeling of Variations with the Articulatory Control Model --- p.95
Chapter 6.4 --- Voice Correspondence --- p.97
Chapter 6.4.1 --- For Nasal Sounds - Inter-Network Correspondence --- p.98
Chapter 6.4.2 --- In Flat-Tongue Space - Intra-Network Correspondence --- p.101
Chapter 6.5 --- Summary --- p.108
Chapter 7 --- Pause Duration Properties in Cantonese Phrases --- p.109
Chapter 7.1 --- The Prosodic Feature - Inter-Syllable Pause --- p.110
Chapter 7.2 --- Experiment for Measuring Inter-Syllable Pause of Cantonese Phrases --- p.111
Chapter 7.2.1 --- Speech Material Selection --- p.111
Chapter 7.2.2 --- Experimental Procedure --- p.112
Chapter 7.2.3 --- Result --- p.114
Chapter 7.3 --- Characteristics of Inter-Syllable Pause in Cantonese Phrases --- p.117
Chapter 7.3.1 --- Pause Duration Characteristics for Initials after Pause --- p.117
Chapter 7.3.2 --- Pause Duration Characteristic for Finals before Pause --- p.119
Chapter 7.3.3 --- General Observations --- p.119
Chapter 7.3.4 --- Other Observations --- p.121
Chapter 7.4 --- Application of Pause-duration Statistics to the Synthesis System --- p.124
Chapter 7.5 --- Summary --- p.126
Chapter 8 --- Conclusion and Further Work --- p.127
Chapter 8.1 --- Conclusion --- p.127
Chapter 8.2 --- Further Extension Work --- p.130
Chapter 8.2.1 --- Regularization Network Optimized on ISD --- p.130
Chapter 8.2.2 --- Incorporation of Non-Articulatory Parameters to Control Space --- p.130
Chapter 8.2.3 --- Experiment on Other Prosodic Features --- p.131
Chapter 8.2.4 --- Application of Voice Correspondence to Cantonese Coda Discrimination --- p.131
Chapter A --- Cantonese Initials and Finals --- p.132
Chapter A.1 --- Tables of All Cantonese Initials and Finals --- p.132
Chapter B --- Using Distortion Measure as Error Function in Neural Network --- p.135
Chapter B.1 --- Formulation of Itakura-Saito Distortion Measure for Neural Network Error Function --- p.135
Chapter B.2 --- Formulation of a Modified Itakura-Saito Distortion (MISD) Measure for Neural Network Error Function --- p.137
Chapter C --- Orthogonal Least Square Algorithm for RBFNet Training --- p.138
Chapter C.1 --- Orthogonal Least Squares Learning Algorithm for Radial Basis Function Network Training --- p.138
Chapter D --- Phrase Lists --- p.140
Chapter D.1 --- Two-Syllable Phrase List for the Pause Duration Experiment --- p.140
Chapter D.1.1 --- 兩字詞 (Two-Character Words) --- p.140
Chapter D.2 --- Three/Four-Syllable Phrase List for the Pause Duration Experiment --- p.144
Chapter D.2.1 --- 片語 (Phrases) --- p.144

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_321513
Date: January 1996
Contributors: Lo, Wai Kit., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Publisher: Chinese University of Hong Kong
Source Sets: The Chinese University of Hong Kong
Language: English
Detected Language: English
Type: Text, bibliography
Format: print, xv, 160 leaves : ill. ; 30 cm.
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
