
Prosodic features of imperatives in Xhosa : implications for a text-to-speech system

Thesis (MA)--University of Stellenbosch, 2000.

ENGLISH ABSTRACT: This study focuses on the prosodic features of imperatives and the role of prosody in the
development of a text-to-speech (TTS) system for Xhosa, an African tone language. The
perception of prosody is manifested in suprasegmental features such as fundamental
frequency (pitch), intensity (loudness) and duration (length).
Very little experimental research has been done on the prosodic features of any
grammatical structures (moods and tenses) in Xhosa, so it has not yet been
determined how and to what degree the different prosodic features are combined and
utilized in the production and perception of Xhosa speech. One such grammatical
structure, for which no explicit descriptive phonetic information exists, is the imperative
mood, which expresses commands.
This study showed how the relationship between duration, pitch and loudness, as
manifested in the production and perception of Xhosa imperatives, could be determined
through acoustic analyses and perceptual experiments. An experimental phonetic approach
proved to be essential for the acquisition of substantial and reliable prosodic information.
An extensive acoustic analysis was conducted to acquire prosodic information on the
production of imperatives by Xhosa mother-tongue speakers. Subsequently, various
statistical parameters were calculated from the raw acoustic data (i) to establish patterns of
significance and (ii) to represent the large amount of numeric data generated in a compact
manner.
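As a rough illustration of the second aim (compact representation), the sketch below reduces a set of raw measurements to a count, mean, standard deviation and coefficient of variation per prosodic feature. The abstract does not say which statistical parameters the thesis used, so the choice of statistics, the field names and all numeric values here are assumptions made for illustration only.

```python
from statistics import mean, stdev

# Hypothetical per-token measurements. The field names and the numbers are
# invented for illustration; they are not data from the thesis.
measurements = [
    {"duration_ms": 180.0, "f0_hz": 130.0, "intensity_db": 66.0},
    {"duration_ms": 200.0, "f0_hz": 140.0, "intensity_db": 68.0},
    {"duration_ms": 220.0, "f0_hz": 150.0, "intensity_db": 70.0},
    {"duration_ms": 240.0, "f0_hz": 160.0, "intensity_db": 72.0},
]


def summarize(rows, key):
    """Reduce one feature to a few descriptive statistics."""
    values = [row[key] for row in rows]
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {"n": len(values), "mean": m, "sd": sd, "cv": sd / m}


# One compact summary per prosodic feature.
summary = {key: summarize(measurements, key) for key in measurements[0]}

for feature, stats in summary.items():
    print(feature, stats)
```

A table of this kind only describes the data. Establishing patterns of significance, the first aim, would require an inferential test chosen to suit the experimental design.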
A perceptual experiment was conducted to investigate the perception of imperatives. The
prosodic parameters that were extracted from the acoustic analysis were applied to
synthesize imperatives in different contexts. A novel approach to Xhosa speech synthesis
was adopted. Verbs spoken in a monotone were recorded by one speaker, and the pitch and duration
of these words were then manipulated with the TD-PSOLA technique.
Combining the results of the acoustic analysis and the perceptual experiment made it
possible to present a prosodic model for the generation of perceptually acceptable
imperatives in a practical Xhosa TTS system.
Prosody generation in a natural language processing (NLP) module and its place within the
larger framework of text-to-speech synthesis were discussed. It was shown that existing
architectures for TTS synthesis would not be appropriate for Xhosa without some
adaptation. Hence, a unique architecture was suggested and its possible application
subsequently illustrated. Of particular importance was the development of an alternative
algorithm for grapheme-to-phoneme conversion.
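The abstract does not describe the alternative grapheme-to-phoneme algorithm, so the following is not a reconstruction of it. It is a minimal sketch of one generic approach, greedy longest-match lookup, in which multi-letter graphemes such as "xh" are tried before single letters. The table `G2P` and the function `to_phonemes` are hypothetical names, and the table is a small illustrative subset of Xhosa orthography that ignores tone and many other correspondences.

```python
# Illustrative subset of Xhosa spelling-to-sound correspondences (IPA).
# This table is an assumption made for the example and is far from complete.
G2P = {
    "ch": "ǀʰ", "qh": "ǃʰ", "xh": "ǁʰ",  # aspirated clicks
    "c": "ǀ", "q": "ǃ", "x": "ǁ",         # plain clicks
    "hl": "ɬ", "dl": "ɮ",                 # lateral fricatives
    "ny": "ɲ", "sh": "ʃ",
    "ph": "pʰ", "th": "tʰ", "kh": "kʰ",   # aspirated stops
}
MAX_LEN = max(len(grapheme) for grapheme in G2P)


def to_phonemes(word):
    """Convert a word to a phoneme list by greedy longest-match lookup."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate grapheme first, then shorter ones.
        for size in range(min(MAX_LEN, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in G2P:
                phonemes.append(G2P[chunk])
                i += size
                break
        else:
            # No table entry: pass the letter through unchanged.
            phonemes.append(word[i])
            i += 1
    return phonemes


print(to_phonemes("Xhosa"))  # ['ǁʰ', 'o', 's', 'a']
```

Longest-match ordering matters because "xh" would otherwise be read as the plain click "x" followed by "h".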
Keywords: prosody, speech synthesis, speech perception, acoustic analysis, Xhosa

AFRIKAANS ABSTRACT (in English translation): This study focuses on the prosodic features of imperatives and the role of prosody in
the development of a text-to-speech system for Xhosa, an African tone language. The
perception of prosody is manifested in suprasegmental features such as
fundamental frequency (pitch), intensity (loudness) and duration (length).
Little experimental research exists on the prosodic features of
any grammatical structures (mood and tense) in Xhosa. How and to what extent the
different prosodic features are combined and used in the production and
perception of Xhosa speech is not yet clear. One grammatical structure for which
no explicit descriptive phonetic information exists is the imperative
mood, which expresses commands.
This study shows how the relationship between duration, pitch and loudness, as
manifested in the production and perception of Xhosa imperatives, could be determined through
acoustic analyses and perceptual experiments. An experimental phonetic
approach proved essential for obtaining meaningful and reliable
prosodic information.
An extensive acoustic analysis was carried out to obtain prosodic data on the production
of imperatives by Xhosa mother-tongue speakers. Various
statistical analyses were then performed on the raw acoustic data (i) to determine patterns of
significance and (ii) to represent the large amount of numeric data generated more
compactly.
A perceptual experiment was carried out to investigate the perception of imperatives.
The prosodic parameters obtained from the acoustic analysis were applied in
the synthesis of commands in different contexts. A new approach to Xhosa speech
synthesis was followed. Verbs spoken in a monotone were recorded from one speaker, and the
pitch and duration of these words were manipulated with the TD-PSOLA technique.
A combination of acoustic and perceptual results was used to develop a prosodic
model for the synthesis of perceptually acceptable imperatives in a practical
Xhosa text-to-speech synthesizer.
Prosody generation in a natural language processing module and its place within the
framework of text-to-speech synthesis were discussed. It was shown that existing architectures
for text-to-speech systems would not be applicable to Xhosa without some
adaptation. A unique architecture was therefore suggested and its possible application
illustrated. The development of an alternative algorithm for letter-to-sound conversion
was of particular importance.
Keywords: speech synthesis, speech perception, acoustic analysis, Xhosa

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/51891
Date: 03 1900
Creators: Swart, Philippa H.
Contributors: Roux, J. C., Botha, E. C., Stellenbosch University. Faculty of Arts and Social Sciences. Dept. of African Languages.
Publisher: Stellenbosch : Stellenbosch University
Source Sets: South African National ETD Portal
Language: en_ZA
Detected Language: Unknown
Type: Thesis
Format: 167 p. : ill.
Rights: Stellenbosch University
