11 |
Spoken language identification with prosodic features. / CUHK electronic theses & dissertations collection / Digital dissertation consortium / January 2011
The PAM-based prosodic LID system is compared with other prosodic LID systems on a pairwise language identification task. The advantages of comprehensive modeling of prosodic features are clearly demonstrated. Analysis reveals the confusion patterns among target languages, as well as the feature-language relationship. The PAM-based prosodic LID system is combined with a state-of-the-art phonotactic system by score-level fusion. Complementary effects are demonstrated between the two different feature types in the LID problem. An additional score calibration step, which further improves LID system performance, is also introduced. / There are no conventional ways to model prosody. We use a large prosodic feature set which covers fundamental frequency (F0), duration and intensity. It also considers various extraction and normalization methods for each type of feature. In terms of modeling, the vector space modeling approach is adopted. We introduce a framework called the prosodic attribute model (PAM) to model the acoustic correlates of prosodic events in a flexible manner. Feature selection and preliminary LID tests are carried out to derive a preferred term-document matrix construction for modeling. / This thesis focuses on the use of prosodic features for automatic spoken language identification (LID). LID is the problem of automatically determining the language of spoken utterances. After three decades of research, the performance of state-of-the-art LID systems appears to be saturating. To meet tight accuracy requirements, prosodic features are proposed as an alternative source of complementary information for LID. / Ng, Wai Man. / Adviser: Tan Lee. / Source: Dissertation Abstracts International, Volume: 73-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (leaves 112-125). / Electronic reproduction. 
Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
|
12 |
A prosodic theory of prominence and rhythm / Mellander, Evan W. January 2002
No description available.
|
13 |
Focus accent, word length and position as cues to L1 and L2 word recognition / Sennema, Anke; van de Vijver, Ruben; Carroll, Susanne E.; Zimmer-Stahl, Anne. January 2005
The present study examines native and nonnative perceptual
processing of semantic information conveyed by prosodic
prominence. Five groups of German learners of English each listened
to one of five experimental conditions. Three conditions differed in the
place of focus accent in the sentence, and two conditions used spliced
stimuli. The experimental condition was presented first in the learners’
L1 (German) and then in a similar set in the L2 (English). The effect
of the accent condition and of the length and position of the target in
the sentence was evaluated in a probe recognition task. In both the L1
and L2 tasks there was no significant effect in any of the five focus
conditions. Target position and target word length had an effect in the
L1 task. Word length did not affect accuracy rates in the L2 task. For
probe recognition in the L2, word length and the position of the target
interacted with the focus condition.
|
14 |
The recognition of the prosodic focus position in German-learning infants from 4 to 14 months / Schmitz, Michaela; Höhle, Barbara; Müller, Anja; Weissenborn, Jürgen. January 2006
The aim of the present study was to elucidate, with 4-, 6-, 8-, and 14-month-old German-learning children, when and how they may acquire the regularities which underlie Focus-to-Stress Alignment (FSA) in the target language, that is, how prosody is associated with specific communicative functions. Our findings suggest that 14-month-olds have already found out that German allows for variable focus positions, after having gone through a development which goes from a predominantly prosodically driven processing of the input to a processing where prosody interacts more and more with the growing lexical and syntactic knowledge of the child.
|
15 |
Emotional Text-to-Speech System of Baseball Broadcast / Huang, Yi-chin. 10 September 2008
In this study, we implement an emotional text-to-speech system for the limited domain of on-line play-by-play baseball game summaries. The Chinese Professional Baseball League (CPBL) is our target domain. Our goal is for the synthesized output speech to be fluent and to carry appropriate emotion. The system first parses the input text and keeps the on-court information, e.g., the number of runners and which bases are occupied, the number of outs, the score of each team, and the batter's performance in the game. The system then adds additional sentences to the input text.
Next, the system outputs neutral synthesized speech from the text with the additional sentences inserted, and subsequently converts it to emotional speech. Our approach to speech conversion is to simulate a baseball broadcaster. Specifically, our system learns and uses the prosody of a broadcaster. To learn the prosody, we record two baseball games and analyze the prosodic features of emotional utterances.
These observations are used to generate prosodic rules for emotional conversion. A subjective evaluation is used to study the subjects' preferences regarding the insertion of additional sentences and the emotion conversion in the system.
|
16 |
Pitch-accent of standard-Japanese / 賴玉華, Lai, Yuk-wah, Esther. January 1983
published_or_final_version / Language Studies / Master / Master of Philosophy
|
17 |
Prosodic domains in optimality theory / Rodier, Dominique. January 1998
Cross-linguistically, the notion 'minimal word' has proved fruitful grounds for explanatory accounts of requirements imposed on morphological and phonological constituents. Word minimality requires that a lexical word includes the main-stressed foot of the language. As a result, subminimal words are augmented to a bimoraic foot through diverse strategies like vowel lengthening, syllable addition, etc. Even languages with numerous monomoraic lexical words may impose a minimality requirement on derived words that would otherwise be smaller than a well-formed foot. In addition, the minimal word has been argued to play a central role in characterizing a prosodic base within some morpho-prosodic constituent for the application of processes such as reduplication and infixation. / The goal of this thesis is to offer an explanation as to why and in which contexts grammars may prefer a prosodic constituent which may not be reducible to a bimoraic foot. I provide explanatory accounts for a number of cases where the prosodic structure of morphological or phonological constituents cannot be defined as coextensive with the main stressed foot of the language. To this end, I propose to add to the theory of Prosodic Structure (Chen 1987; Selkirk 1984, 1986, 1989, 1995; Selkirk and Shen 1990) within an optimality-theoretic framework by providing evidence for a new level within the Prosodic Hierarchy, that of the Prosodic Stem (PrStem). / An important aspect of the model of prosodic structure proposed here is a notion of headship which follows directly from the Prosodic Hierarchy itself and from the metrical grouping of prosodic constituents. A theory of prosodic heads is developed which assumes that structural constraints can impose well-formedness requirements on the prosodic shape and the distribution of heads within morphological and phonological constituents.
|
18 |
A prosodic theory of prominence and rhythm / Mellander, Evan W. January 2002
Building on earlier work, notably Kager (1993, 1995) and framed in Optimality Theory (Prince & Smolensky 1993), this thesis presents a theory of foot structure in which the asymmetric maximal expansions of iambic and trochaic feet (cf. the Iambic/Trochaic Law: ITL, e.g. Hayes 1995) are accounted for by a single constraint, HEAD GOVERNMENT (Mellander 2001c, 2002b). The present analysis devotes special attention to a class of quantitative processes in trochaic systems which generate uneven (HL) trochaic feet. In contrast to previous analyses (e.g. Hayes 1995), such processes are shown to be of phonological rather than phonetic nature in certain languages, and the ramifications of this conclusion are explored with regard to a variety of issues in prosodic theory. / The evidence for the phonological status of (HL)-creating processes comes from published data on Mohawk, Selayarese, Gidabal, and Oromo, as well as original field data from Central Slovak. Following Piggott (1998, 2001) and Mellander (2001a, c, 2002b), these processes are seen to follow from HEAD PROMINENCE, a constraint which requires greater relative intrinsic prominence in the head of a prosodic constituent. Since HEAD PROMINENCE is sensitive to intrinsic prominence, its effects are shown to hold irrespective of derived prominence resulting from the application of stress rules. HEAD PROMINENCE is also shown to play a central role in accounting for diphthongal quantity-prominence relations, where cross-linguistic patterns of long vowel diphthongization in bimoraic syllables mirror those of (HL)-creating processes in disyllabic feet. / In contrast to previous work on HEAD GOVERNMENT (Mellander 2001c, 2002b), the absence of languages which require violations of this constraint implies that it is universally undominated, contra the standard Optimality Theoretic assumption of universal constraint violability.
This view is also supported by the analysis of ternary stress systems, where the absence of unattested quaternary and quinternary systems relies crucially on the inviolability of HEAD GOVERNMENT. / A final aspect of this thesis is the development of a preliminary model to explain asymmetries in structure and markedness between iambic and trochaic systems, including distributional asymmetries, Iambic Lengthening, and the ITL. Based on work by Van de Vijver (1998), this approach abandons traditional symmetric notions of iambicity and trochaicity in favour of an asymmetric pair of constraints: PEAK-FIRST and *EDGEMOST. Iambic/trochaic asymmetries consequently emerge as artefacts of constraint interaction and require no additional theoretical machinery.
|
19 |
Pitch detection using the short-term phase spectrum / Cesbron, Frédérique Chantal. 08 1900
No description available.
|
20 |
The use of prosody in speech recognition systems / De Backer, Philippe Paul. 08 1900
No description available.
|