21

Talking Heads - Models and Applications for Multimodal Speech Synthesis

Beskow, Jonas January 2003
This thesis presents work in the area of computer-animated talking heads. A system for multimodal speech synthesis has been developed, capable of generating audiovisual speech animations from arbitrary text, using parametrically controlled 3D models of the face and head. A speech-specific direct parameterisation of the movement of the visible articulators (lips, tongue and jaw) is suggested, along with a flexible scheme for parameterising facial surface deformations based on well-defined articulatory targets.

To improve the realism and validity of facial and intra-oral speech movements, measurements from real speakers have been incorporated from several types of static and dynamic data sources. These include ultrasound measurements of tongue surface shape, dynamic optical motion tracking of face points in 3D, as well as electromagnetic articulography (EMA) providing dynamic tongue movement data in 2D. Ultrasound data are used to estimate target configurations for a complex tongue model for a number of sustained articulations. Simultaneous optical and electromagnetic measurements are performed and the data are used to resynthesise facial and intra-oral articulation in the model. A robust resynthesis procedure, capable of animating facial geometries that differ in shape from the measured subject, is described.

To drive articulation from symbolic (phonetic) input, for example in the context of a text-to-speech system, both rule-based and data-driven articulatory control models have been developed. The rule-based model effectively handles forward and backward coarticulation by target under-specification, while the data-driven model uses ANNs to estimate articulatory parameter trajectories, trained on trajectories resynthesised from optical measurements. The articulatory control models are evaluated and compared against other data-driven models trained on the same data. Experiments with ANNs for driving the articulation of a talking head directly from acoustic speech input are also reported.

A flexible strategy for generation of non-verbal facial gestures is presented. It is based on a gesture library organised by communicative function, where each function has multiple alternative realisations. The gestures can be used to signal e.g. turn-taking, back-channelling and prominence when the talking head is employed as output channel in a spoken dialogue system. A device-independent XML-based formalism for non-verbal and verbal output in multimodal dialogue systems is proposed, and it is described how the output specification is interpreted in the context of a talking head and converted into facial animation using the gesture library.

Through a series of audiovisual perceptual experiments with noise-degraded audio, it is demonstrated that the animated talking head provides significantly increased intelligibility over the audio-only case, in some cases not significantly below that provided by a natural face.

Finally, several projects and applications are presented where the described talking head technology has been successfully employed. Four different multimodal spoken dialogue systems are outlined, and the role of the talking heads in each of the systems is discussed. A telecommunication application where the talking head functions as an aid for hearing-impaired users is also described, as well as a speech training application where talking heads and language technology are used with the purpose of improving speech production in profoundly deaf children.
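The rule-based control model's treatment of coarticulation by target under-specification lends itself to a short illustration. The sketch below is not Beskow's actual rule system, only a minimal Python rendering of the idea with an invented parameter and phone sequence: phones that leave an articulatory parameter unspecified inherit it by interpolation from the nearest specified neighbours, which produces forward and backward coarticulation at once.

```python
import numpy as np

def fill_underspecified(targets):
    """Interpolate an articulatory parameter over a phone sequence.

    `targets` holds one entry per phone: a float where the phone
    specifies the parameter (e.g. lip rounding in [0, 1]), or None
    where it leaves the parameter free. Free slots are filled by
    linear interpolation between the nearest specified neighbours,
    so unspecified phones inherit their context in both directions.
    """
    vals = np.array([np.nan if t is None else t for t in targets], dtype=float)
    idx = np.arange(len(vals))
    known = ~np.isnan(vals)
    if not known.any():
        raise ValueError("at least one phone must specify the parameter")
    # np.interp holds edge values constant outside the known range
    return np.interp(idx, idx[known], vals[known])

# The vowels specify lip rounding; the stops /s t/ leave it free and
# pick up rounding from the upcoming rounded vowel, e.g. [i s t u]:
print(fill_underspecified([0.0, None, None, 1.0]))
# -> [0.0, 0.333..., 0.667..., 1.0]
```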
22

Anticipatory Coarticulation and Stability of Speech in Typically Fluent Speakers and People Who Stutter Across the Lifespan: An Ultrasound Study

Belmont, Alissa Joy 01 January 2015
This study uses ultrasound to image the articulation of word-onset velar stop consonants. By examining tongue body placement, variation in velar closure across vowel contexts provides a measure of anticipatory coarticulation, while variation among productions within the same vowel context provides a measure of token-to-token variability. Articulate Assistant Advanced 2.0 software was used to semi-automatically generate midsagittal tongue contours at the initial point of maximum velar closure and to fit each contour to a curved spline. Patterns of lingual coarticulation and measures of speech motor stability, based on curve-to-curve distance (Zharkova, Hewlett, & Hardcastle, 2011), are investigated to compare the speech of typically fluent speakers with the speech of people who stutter. Anticipatory coarticulation can be interpreted as a quantitative index of the maturity of the speech motor system and its planning abilities. Token-to-token variability, examined across multiple velar productions within the same vowel context, describes the accuracy of control, or stability, of velar closure gestures. Measures for both speaking groups are examined across the lifespan at stages during speech development, maturation, and aging. Results indicate an overall age effect, interpreted as refinement, with increased speech stability and progressively more segmental (less coarticulated) productions across the lifespan. A tendency toward decreased stability and more coarticulated speech was found for younger people who stutter, but this difference was small and absent among older adults. Outcomes of this study suggest that the articulatory maturation trajectories of people who stutter may be delayed, but overall maturation of the speech mechanism is evident by older adulthood for both typically fluent speakers and those who stutter. Applications to intervention are discussed in closing.
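For concreteness, the curve-to-curve distance of Zharkova, Hewlett, & Hardcastle (2011) is, in one common formulation, a symmetrised mean nearest-neighbour distance between two contour point sets. The NumPy sketch below uses invented contours; the exact point-matching procedure in the thesis may differ.

```python
import numpy as np

def curve_to_curve_distance(a, b):
    """Mean nearest-neighbour distance between two tongue contours.

    `a` and `b` are (n, 2) arrays of (x, y) points along each
    midsagittal spline. For every point on one curve we take the
    distance to the closest point on the other curve, then average
    over both directions so the measure is symmetric.
    """
    # pairwise Euclidean distances, shape (len(a), len(b))
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return (d.min(axis=1).mean() + d.min(axis=0).mean()) / 2.0

# Two synthetic contours sampled at 5 points (coordinates in mm)
c1 = np.array([[0, 0], [10, 6], [20, 10], [30, 6], [40, 0]], dtype=float)
c2 = c1 + [0.0, 1.5]  # same shape, raised 1.5 mm
print(curve_to_curve_distance(c1, c2))  # approximately 1.5
```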
23

Gestural overlap across word boundaries: evidence from English and Mandarin speakers

Luo, Shan 26 January 2016
This research examines how competing factors determine the articulation of English stop-stop sequences across word boundaries in both native (L1) and nonnative (L2) speech. The two general questions that drive this research are 1) how is consonantal coordination implemented across English words? and 2) is this implementation different in L1 versus L2 speech? A group of 15 native English (NE) speakers and a group of 25 native Mandarin (NM) speakers who use English as a second language (ESL) participated in this study. The stimuli employed in this research were designed along four major parameters: 1) place of articulation, 2) lexical frequency, 3) stress, and 4) speech rate. The release percentages and closure duration ratios produced by English and Mandarin speakers were measured. The results showed that place of articulation had different effects on English and Mandarin speakers in their English stop-stop coarticulation, especially in heterorganic clusters. Specifically, a place order effect (POE; i.e., more releases and more overlap in front-back clusters than in back-front clusters) was only partially supported in native speech and not shown at all in nonnative speech. The results also confirmed a gradient lexical frequency effect, finding a significant correlation between self-rated frequency and overlap. A group difference was observed in the interaction between the effects of place of articulation and categorical frequency (real words vs. nonwords). In addition, the results showed, unexpectedly, a stronger stress effect for the NM group than for the NE group. Further analyses showed that increased speech rate did not systematically induce increased temporal overlap, because speakers from both groups varied in their behavior, having either more or less overlap at the fast speech rate than at the slow rate. Lastly, the analyses found no correlation between closure duration ratio and perceived accent in L2 speech. This finding was not predicted, given that timing features have long been considered critical to foreign accent perception.
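Closure duration ratios reduce to interval arithmetic once each stop closure has been segmented. The function below is one plausible operationalisation of closure overlap, assuming annotated (onset, offset) times; it is not necessarily the measure used in this study, and the times are invented.

```python
def closure_overlap_ratio(c1, c2):
    """One plausible closure-overlap measure for a C1#C2 stop sequence.

    `c1` and `c2` are (onset, offset) times in seconds for the two
    stop closures, taken from an annotated recording. The ratio is
    the temporally shared portion of the two closures relative to
    the C1 closure duration: 0.0 means no overlap, and larger values
    mean the C2 closure gesture began earlier relative to C1.
    """
    on1, off1 = c1
    on2, off2 = c2
    overlap = max(0.0, min(off1, off2) - max(on1, on2))
    return overlap / (off1 - on1)

# /k#t/ sequence: C2 closure begins 30 ms before the C1 closure ends
print(closure_overlap_ratio((0.100, 0.180), (0.150, 0.240)))  # 0.375
```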
24

Perception and Production of French Oral Vowels in Pre-Service Czech Teachers of French as a Foreign Language (FFL)

Maurová Paillereau, Nikola 12 January 2015
This acoustic-perceptual study concerns the limits of perception and production of the French oral vowels [i, e, ɛ, a, u, o, ɔ, y, ø, œ], in isolation and in various consonantal contexts, by ten Czech speakers, pre-service teachers of French as a Foreign Language (FFL). The results show that (1) phonetic mastery of the vowels depends on their spellings and on the consonantal context; (2) the close vowels [i, y, u] and the vowel [a] are generally mastered with authenticity; and (3) the ability to perceive the contrasts between the mid vowels e/ɛ, ø/œ and o/ɔ, and to produce them, is limited. These results are only partially consistent with the predictions of the Speech Learning Model (SLM) of Flege (1995), which is based on the notion of phonetic similarity between the mother tongue (MT) and the foreign language (FL).
26

Anticipatory vowel-to-consonant coarticulation of Swedish voiceless fricatives [s], [ɕ] and [ɧ]

Thörn, Lisa January 2022
In this thesis, anticipatory vowel-to-consonant coarticulation of the Swedish fricatives [s], [ɕ], and [ɧ] was examined in isolated words and connected speech. Ten women and ten men participated in the study, performing a production task in which fricative-initial real words were read both in isolation and in sentences. The first spectral moment (M1), also known as the center of gravity, was measured at the midpoint and at the end of the word-initial fricatives. M1 was found to differ greatly between the two measurement points for [s] and [ɕ], with mean decreases of 2852 Hz and 985 Hz, respectively. This was not the case for [ɧ], for which the M1 trajectories decreased and increased to almost the same extent. When vowel context was taken into account, some correlation between vowel quality and M1 was found, in that M1 at the midpoint of [s] and [ɕ] mirrored the height of F2 of the following vowel. M1 measurements were similar in isolated words and sentences, and no convincing differences were observed between the two in this investigation.
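The first spectral moment is a standard measure: the power-weighted mean frequency of the spectrum. A minimal NumPy sketch follows; the window length, window function, and power weighting are generic analysis choices, not settings taken from the thesis.

```python
import numpy as np

def first_spectral_moment(samples, fs):
    """First spectral moment M1 (center of gravity), in Hz.

    M1 = sum(f * P(f)) / sum(P(f)), computed from the FFT power
    spectrum of a Hamming-windowed slice of the fricative.
    """
    windowed = samples * np.hamming(len(samples))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / fs)
    return np.sum(freqs * power) / np.sum(power)

# A pure tone is a crude stand-in for frication, but shows the measure:
fs = 44100
t = np.arange(int(0.030 * fs)) / fs         # 30 ms analysis slice
frication = np.sin(2 * np.pi * 7000 * t)    # energy concentrated at 7 kHz
print(first_spectral_moment(frication, fs)) # approximately 7000
```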
27

Acoustic space and coarticulatory patterns: the vowels of Tripoli Libyan Arabic in pharyngealised context

Salam, Fathi 30 November 2012
This research addresses a phonetic question situated within three fields: Arabic phonetics, dialectology, and sociophonetics. Our approach, tools, and analyses are phonetic in essence. We analysed the frequencies of the first three formants [F1, F2, F3] of the short cardinal vowels /i, u, a/ of Tripoli Libyan Arabic (TLA), alternating pharyngealised /tˁ, sˁ, dˁ/ and non-pharyngealised /t, s, d/ consonantal contexts in order to assess the impact of context on the formants' centre frequencies. The results also allowed us to compare TLA with other modern colloquial Arabic varieties and to draw fundamental social distinctions, such as gender. Our research question links the realisation of the acoustic vowel space in TLA with the consonantal pharyngealisation contrast, the coarticulatory patterns that result from it, and the locus-equation tool used to reveal them, all within a dimension of social stratification by gender. Three working hypotheses were put forward: the first on the variation of the vowel space and its motivations, the second on the relevance of locus equations, and the third on gender-related distinctions. Our results, based on the analysis of a corpus of trisyllabic words [C1V1-C2V2-C3V3], where C was either a pharyngealised consonant /sˁ, tˁ, dˁ/ or a non-pharyngealised consonant /s, t, d/ and V was one of the three short cardinal vowels /i, u, a/, read by ten speakers (6 men and 5 women), validate all three hypotheses: the formant values of the vowels, the acoustic space, and the distance between the first two formants vary as a function of three factors: 1) consonantal context (pharyngealised vs. non-pharyngealised); 2) prosodic position (stressed vs. unstressed); and 3) social distinction (male vs. female). The work met the objectives set for it at the outset: 1) phonetically, to give an overview of the TLA vowel system and its variation under pharyngealisation; 2) dialectologically, to address questions of Arabic dialect typology and the classification of Libyan Arabic as an Eastern versus a Maghrebi dialect; and 3) sociophonetically, to verify the deep social distinction between women's and men's speech.
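A locus equation is a linear regression of F2 at vowel onset on F2 at vowel midpoint, fitted across vowel contexts for a fixed consonant; the slope indexes the degree of consonant-vowel coarticulation. A sketch with invented values, not data from this study:

```python
import numpy as np

# Locus equation: regress F2 at vowel onset on F2 at vowel midpoint
# across vowel contexts for one consonant. A slope near 1 means the
# vowel dominates the transition onset (strong coarticulation); a
# slope near 0 means a fixed consonantal locus. Values are invented.
f2_mid = np.array([2200.0, 1700.0, 1250.0, 900.0])    # Hz, per vowel context
f2_onset = np.array([1900.0, 1600.0, 1300.0, 1050.0]) # Hz, at the CV boundary

slope, intercept = np.polyfit(f2_mid, f2_onset, 1)
print(f"F2_onset = {slope:.2f} * F2_mid + {intercept:.0f} Hz")
# A pharyngealised context would be expected to flatten the slope
# (a stronger consonantal constraint on the transition onset).
```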
28

A phonetic investigation of vowel variation in Lekwungen

Nolan, Tess 04 May 2017
This thesis presents the first acoustic analysis of Lekwungen (also known as Songhees or Songish; Central Salish). It studied the acoustic correlates of stress on vowels and the coarticulatory effects of consonants on vowel quality. The goals of the thesis were to provide useful and usable materials and information for Lekwungen language revitalisation efforts and to provide an acoustic study of Lekwungen vowels that expands knowledge of Salishan languages and linguistics. Duration, mean pitch, and mean amplitude were measured on vowels in various stress environments. Findings showed a three-way contrast between vowels in terms of duration but only a two-way contrast in terms of pitch and amplitude. F1, F2, and F3 were measured at vowel onset (5%), midpoint (50%), and offset (95%), as well as a mean (5%-95%), in CVC sequences for four vowels: /i/, /e/, /a/, and /ə/. Of the five places of articulation of consonants in Lekwungen (alveolar, palatal, labio-velar, uvular, glottal), uvular and glottal had the most persistent effects on F1, F2, and F3 of all vowels. Of the vowels, unstressed /ə/ was the most persistently affected by all consonants. Several effects on perception were also preliminarily documented, but future work is needed to see how persistence in acoustic effects correlates with perception. This thesis provides information and practical tips to help learners and teachers in writing and perceiving Lekwungen and to support learners acquiring Lekwungen pronunciation, as part of language revitalisation efforts. It also contributes to the growing body of acoustic phonetic work on Salishan languages, especially on vowels.
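Sampling F1, F2, and F3 at 5%, 50%, and 95% of a vowel's duration is straightforward to reproduce with Praat or its Python wrapper parselmouth. Below is a sketch under generic analysis settings; the file name, vowel boundaries, and formant settings are hypothetical, not those used in the thesis.

```python
import parselmouth  # Python interface to Praat

def formants_at_proportions(wav_path, v_start, v_end, props=(0.05, 0.50, 0.95)):
    """Sample F1-F3 at given proportions of a vowel's duration.

    `v_start`/`v_end` are the vowel's boundaries in seconds, taken
    from a manual segmentation. Returns {proportion: (F1, F2, F3)}
    in Hz. The formant settings (5 formants up to 5500 Hz) are
    generic defaults.
    """
    sound = parselmouth.Sound(wav_path)
    formant = sound.to_formant_burg(max_number_of_formants=5.0,
                                    maximum_formant=5500.0)
    out = {}
    for p in props:
        t = v_start + p * (v_end - v_start)
        out[p] = tuple(formant.get_value_at_time(n, t) for n in (1, 2, 3))
    return out

# e.g. a vowel segmented at 0.312-0.428 s in a hypothetical recording:
# print(formants_at_proportions("lekwungen_token.wav", 0.312, 0.428))
```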
29

Modelling coarticulation in French Sign Language for the automatic broadcasting of information in railway stations using a virtual signer

Segouat, Jérémie 15 December 2010
Our research is set in the context of broadcasting information in French Sign Language (LSF) via a virtual signer, by combining prerecorded utterance segments. The study proposes a coarticulation model for this broadcasting system. Coarticulation is still very little studied in the field of sign languages: drawing on several domains (spoken languages, gesture), we propose a definition of what coarticulation is in sign language and present a methodology for analysing the phenomenon, focusing on hand configurations and gaze direction. We detail the various aspects of corpus creation and annotation, and of the analysis of these annotations. Quantitative and qualitative statistical computations allow us to propose a coarticulation model based on relaxations and tensions of hand configurations. We propose and implement a methodology for evaluating the model. Finally, we outline perspectives on potential uses of the model for research in image processing and in the animation of 3D characters signing in French Sign Language.
30

Indicators of allophony and phonemehood

Boruta, Luc 26 September 2012
Although we distinguish only a finite, restricted number of sound categories (the phonemes of a given language), the sounds of the messages we receive are never identical. Given the ubiquity of allophonic processes across languages and the fact that every language has its own phonemic inventory, what kinds of cues could infants, for example English-learning infants, exploit to discover that [sɪŋkɪŋ] and [θɪŋkɪŋ] (sinking vs. thinking) cannot denote the same action? The work presented in this thesis extends the line of research initiated by Peperkamp et al. (2006) on defining phone-to-phone dissimilarity measures that indicate which phones are realisations of the same phoneme. We show that solving the task proposed by Peperkamp et al. does not fully answer the problem of phoneme acquisition, mainly because empirical and formal limitations follow from its phone-to-phone formulation. We reformulate the problem as an unsupervised machine-learning clustering problem relying on a multidimensional scaling of the data. The results of various supervised and unsupervised learning experiments consistently indicate that good indicators of allophony are not necessarily good indicators of phonemehood. Overall, the computational results presented in this work suggest that allophony and phonemehood can be discovered from acoustic, temporal, distributional, or lexical information only if, on average, phonemes exhibit little variability.
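The reformulation as unsupervised clustering over a multidimensional scaling of phone-to-phone dissimilarities can be sketched in a few lines of scikit-learn. The dissimilarity values and settings below are invented for illustration and are not Boruta's measures.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

# Start from a symmetric phone-to-phone dissimilarity matrix (here,
# 4 phones that should group into 2 phonemes), embed it with
# multidimensional scaling, then cluster the embedded phones. This
# illustrates the pipeline, not the thesis's exact measures.
dissim = np.array([
    [0.0, 0.2, 0.9, 1.0],
    [0.2, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.3],
    [1.0, 0.9, 0.3, 0.0],
])

embedding = MDS(n_components=2, dissimilarity="precomputed",
                random_state=0).fit_transform(dissim)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
print(labels)  # e.g. [0 0 1 1]: phones 1-2 and 3-4 grouped as two phonemes
```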
