Global ETD Search

1	The interaction of prosodic phrasing, verb bias, and plausibility during spoken sentence comprehension Blodgett, Allison Ruth 17 June 2004 (has links) No description available. prosody parsing verb bias plausibility prosodic phrasing ambiguity resolution
2	Quels liens entre accentuation et niveaux de constituance en français ? : une analyse perceptive et acoustique / The relationship between accentuation and levels of constituency in French : a perceptual and acoustical investigation Garnier, Laury 21 February 2018 (has links) L’accent en français est considéré comme un accent post-lexical marquant le niveau du groupe de mots plutôt que le niveau du mot lui-même. L’Accent Final (AF) primaire, cooccurrent à la frontière prosodique, s’effacerait perceptivement en frontière prosodique majeure. L’Accent Initial (AI), dit secondaire et optionnel, serait un accent rythmique apparaissant sur les longs constituants. Dans ce contexte, seuls deux niveaux de constituance sont communément admis en français : le Syntagme Intonatif (IP) et le Syntagme Accentuel (AP). L’existence d’un Syntagme Intermédiaire (ip) est en revanche controversée. Enfin, la prise en compte du Mot Prosodique (PW) (i.e. mot lexical) comme unité de planification, ou de réalisation des règles accentuelles, en structure de surface ne semble pas envisagé. L’objectif de cette étude est d’explorer l’organisation du phrasé prosodique en français. Dans ce cadre, nous proposons une étude perceptive, via un corpus de parole contrôlée manipulant des structures syntaxiques ambiguës, où 80 participants ont effectué 3 tâches de perception : proéminence, frontière et groupement. Les événements prosodiques perçus ont ensuite été mis en relation avec leurs réalités acoustiques. Les résultats montrent que les auditeurs sont capables de percevoir des niveaux de granularité de frontières plus fins que ce que les descriptions traditionnelles du français prédisent. Par ailleurs, les mots lexicaux sont systématiquement réalisés par un marquage bipolaire (AI+AF) de même force métrique. AI joue également un rôle plus structurel que rythmique, en marquant la structure prosodique de manière plus privilégiée qu’AF. 
Enfin, AF ne s’efface pas perceptivement en frontière prosodique majeure, et garde au contraire une trace métrique au niveau du mot lexical, qui ne varie pas strictement en fonction du niveau de constituance. / In French, accentuation is said to be post-lexical, marking the phrase rather than the word. That is, the primary final accent (FA) is considered to be perceptively weakened when co-occurring with a major prosodic boundary, while the Initial Accent (IA), regarded as a secondary and optional accent, is thought to hold merely a rhythmic function in balancing longer constituents. Consequently, only two levels of prosodic constituency are accounted for in French: the Intonational Phrase (IP), and the Accentual Phrase (AP). The existence of a third level, the Intermediate Phrase (ip), while advanced by some authors, remains controversial. Moreover, the Prosodic Word (PW) (i.e. lexical word) as a phonological unit, or as the domain of accentual rules, is disregarded altogether. The aim of our study is to investigate the organization of prosodic phrasing in French. We propose a perception study on a corpus in which syntactically ambiguous structures were manipulated, and asked 80 participants to perform 3 distinct perception tasks: a prominence, boundary and grouping task. The perceived prosodic events were then related to their acoustic realization. Taken together, our results indicate that listeners are able to distinguish finer-grained grouping levels than those predicted in traditional French descriptions. Moreover, lexical words are systematically realized by an accentual bipolarization (IA+FA), with each accent carrying the same metrical weight. The function of IA is shown to be more one of structuration than rhythmic balancing, with IA even marking structure more readily than FA. 
Finally, our results indicate that FA is not perceptively weakened when co-occurring with major prosodic boundaries, but instead remains a metrical mark at the level of the lexical word, in a manner independent from the level of constituency. Phrasé prosodique Perception Acoustique Proéminences Frontières Groupements Prosodic Phrasing Perception Acoustics Prominence Boundary Grouping
3	Focus and Tone Hartmann, Katharina January 2007 (has links) Tone is a distinctive feature of the lexemes in tone languages. The information-structural category focus is usually marked by syntactic and morphological means in these languages, but sometimes also by intonation strategies. In intonation languages, focus is marked by pitch movements, which are also perceived as tone. The present article discusses prosodic focus marking in these two language types. Tone (language) intonation (language) focus pitch accent prosodic phrasing Language, Linguistics
4	HMM-based Vietnamese Text-To-Speech : Prosodic Phrasing Modeling, Corpus Design, System Design, and Evaluation / Text-To-Speech à base de HMM (Hidden Markov Model) pour le vietnamien : modélisation de la segmentation prosodique, la conception du corpus, la conception du système, et l’évaluation perceptive Nguyen, Thi Thu Trang 24 September 2015 (has links) L’objectif de cette thèse est de concevoir et de construire un système Text-To-Speech (TTS) haute qualité à base de HMM (Hidden Markov Model) pour le vietnamien, une langue tonale. Le système est appelé VTED (Vietnamese TExt-to-speech Development system). Au vu de la grande importance des tons lexicaux, un “tonophone” – un allophone dans un contexte tonal – a été proposé comme nouvelle unité de la parole dans notre système de TTS. Un nouveau corpus d’entraînement, VDTS (Vietnamese Di-Tonophone Speech corpus), a été conçu à partir d’un grand texte brut pour une couverture de 100% de di-phones tonalisés (di-tonophones) en utilisant l’algorithme glouton. Un total d’environ 4000 phrases ont été enregistrées et pré-traitées comme corpus d’apprentissage de VTED. Dans la synthèse de la parole sur la base de HMM, bien que la durée de pause puisse être modélisée comme un phonème, l’apparition de pauses ne peut pas être prédite par HMM. Les niveaux de phrasé ne peuvent pas être complètement modélisés avec des caractéristiques de base. Cette recherche vise à obtenir un découpage automatique en groupes intonatifs au moyen des seuls indices de durée. Des blocs syntaxiques constitués de phrases syntaxiques avec un nombre borné de syllabes (n), ont été proposés pour prévoir allongement final (n = 6) et pause apparente (n = 10). Des améliorations pour allongement final ont été effectuées par des stratégies de regroupement des blocs syntaxiques simples. 
La qualité du modèle prédictif J48-arbre-décision pour l’apparence de pause à l’aide de blocs syntaxiques, combinée avec lien syntaxique et POS (Part-Of-Speech), atteint un F-score de 81,4 % (Précision = 87,6 %, Recall = 75,9 %), beaucoup mieux que le modèle avec seulement POS (F-score=43,6%) ou un lien syntaxique (F-score=52,6%). L’architecture du système a été proposée sur la base de l’architecture HTS avec une extension d’une partie traitement du langage naturel pour le Vietnamien. L’apparence de pause a été prédite par le modèle proposé. Les caractéristiques contextuelles incluent les caractéristiques d’identité de “tonophones”, les caractéristiques de localisation, les caractéristiques liées à la tonalité, et les caractéristiques prosodiques (POS, allongement final, niveaux de rupture). Mary TTS a été choisi comme plateforme pour la mise en oeuvre de VTED. Dans le test MOS (Mean Opinion Score), le premier VTED, appris avec les anciens corpus et des fonctions de base, était plutôt bon, 0,81 (sur une échelle MOS 5 points) plus élevé que le précédent système – HoaSung (lequel utilise la sélection de l’unité non-uniforme avec le même corpus) ; mais toujours 1,2-1,5 point de moins que le discours naturel. La qualité finale de VTED, avec le nouveau corpus et le modèle de phrasé prosodique, progresse d’environ 1,04 par rapport au premier VTED, et son écart avec le langage naturel a été nettement réduit. Dans le test d’intelligibilité, le VTED final a reçu un bon taux élevé de 95,4%, seulement 2,6% de moins que le discours naturel, et 18% plus élevé que le premier. Le taux d’erreur du premier VTED dans le test d’intelligibilité générale avec le carré latin était d’environ 6-12% plus élevé que le langage naturel selon des niveaux de syllabe, de ton ou par phonème. Le résultat final ne s’écarte de la parole naturelle que de 0,4-1,4%. 
/ The thesis objective is to design and build a high quality Hidden Markov Model (HMM-)based Text-To-Speech (TTS) system for Vietnamese – a tonal language. The system is called VTED (Vietnamese TExt-to-speech Development system). In view of the great importance of lexical tones, a “tonophone” – an allophone in tonal context – was proposed as a new speech unit in our TTS system. A new training corpus, VDTS (Vietnamese Di-Tonophone Speech corpus), was designed for 100% coverage of di-phones in tonal contexts (i.e. di-tonophones) using the greedy algorithm from a huge raw text. A total of about 4,000 sentences of VDTS were recorded and pre-processed as a training corpus of VTED. In the HMM-based speech synthesis, although pause duration can be modeled as a phoneme, the appearance of pauses cannot be predicted by HMMs. Lower phrasing levels above words may not be completely modeled with basic features. This research aimed at automatic prosodic phrasing for Vietnamese TTS using durational clues alone as it appeared too difficult to disentangle intonation from lexical tones. Syntactic blocks, i.e. syntactic phrases with a bounded number of syllables (n), were proposed for predicting final lengthening (n = 6) and pause appearance (n = 10). Improvements for final lengthening were done by some strategies of grouping single syntactic blocks. The quality of the predictive J48-decision-tree model for pause appearance using syntactic blocks combined with syntactic link and POS (Part-Of-Speech) features reached an F-score of 81.4% (Precision=87.6%, Recall=75.9%), much better than that of the model with only POS (F-score=43.6%) or syntactic link (F-score=52.6%) alone. The architecture of the system was proposed on the basis of the core architecture of HTS with an extension of a Natural Language Processing part for Vietnamese. Pause appearance was predicted by the proposed model. 
The contextual feature set included phone identity features, locational features, tone-related features, and prosodic features (i.e. POS, final lengthening, break levels). Mary TTS was chosen as a platform for implementing VTED. In the MOS (Mean Opinion Score) test, the first VTED, trained with the old corpus and basic features, was rather good, 0.81 (on a 5-point MOS scale) higher than the previous system – HoaSung (using the non-uniform unit selection with the same training corpus); but still 1.2-1.5 points lower than the natural speech. The quality of the final VTED, trained with the new corpus and prosodic phrasing model, progressed by about 1.04 compared to the first VTED, and its gap with the natural speech was much lessened. In the tone intelligibility test, the final VTED received a high correct rate of 95.4%, only 2.6% lower than the natural speech, and 18% higher than the initial one. The error rate of the first VTED in the intelligibility test with the Latin square design was about 6-12% higher than the natural speech depending on syllable, tone or phone levels. The final one diverged by only about 0.4-1.4% from the natural speech. Text-to-speech Vietnamien Langue tonale Modélisation de phrasé prosodique Text-to-speech Vietnamese Tonal language Prosodic phrasing modeling
5	AMPER-Argentina: pretonemas en oraciones interrogativas absolutas Gurlekian, Jorge, Toledo, Guillermo 25 September 2017 (has links) Este trabajo es parte del Proyecto AMPER (Atlas Multimedia de la Prosodia del Espacio Románico). El área dialectal de estudio es el español de Buenos Aires. En el artículo se analizan las oraciones interrogativas absolutas SVO: un SN (núcleos sintácticos paroxítonos, proparoxítonos, oxítonos), un SV (núcleo paroxítono), un SPrep (núcleos paroxítonos, proparoxítonos, oxítonos). También se examinan los pretonemas según el modelo de entonación métrico y autosegmental (AM), y se observa la influencia de la frase fonológica (φ) en la representación fonológica de los acentos tonales. Los resultados de los pretonemas indican diferencias y no un único fraseo prosódico que caracterice a esta modalidad. Los primeros picos (P1) de la primera φ no muestran tonos más altos si se los compara con los P1 de oraciones declarativas. Se descarta un tono de frontera H% inicial. Estos hallazgos confirman otro estudio previo: la información sobre la modalidad interrogativa absoluta se encuentra fuera del pretonema, en el tonema final. / The present work belongs to project AMPER (Multimedia Atlas of Prosody of the Romanic Space). The dialectal area of study is the Spanish from Buenos Aires. This work analyses absolute interrogative sentences of the SVO-type: a NP (oxytone, paroxytone and proparoxytone heads), a VP (paroxytone head), a Prep. phrase (oxytone, paroxytone and proparoxytone heads). In addition, pretonemes are examined according to the intonation Autosegmental-metrical (AM) framework and the phonological phrase (φ) influence is observed on the phonological representation of pitch accents. The pretoneme results indicate differences and not only one prosodic phrasing which may characterize this modality. The first peaks (P1) which belong to the first φ do not show higher tones if compared to the P1 of declarative sentences. 
An initial boundary tone H% is ruled out. These findings confirm a previous study: information regarding the absolute interrogative modality lies outside the pretoneme, in the final toneme. Linguistics And Literature Amper-Argentina Absolute Interrogative Sentence Prosodic Phrasing Pretonem Lingüística y Literatura Amper-Argentina Oración Interrogativa Absoluta Fraseo Prosódico Pretonema
6	Caractérisation phonétique et phonologique du syntagme intermédiaire en français : de la production à la perception Michelas, Amandine 04 July 2011 (has links) Le travail présenté ici est sous-tendu par deux observations majeures. Premièrement, la plupart des modèles proposés pour le français s’accordent sur l’existence de deux niveaux de structure prosodique: le syntagme accentuel et le syntagme intonatif. Deuxièmement, bien que l’existence d’un niveau additionnel de structure situé entre ces deux niveaux ait été proposé pour le français, les propriétés phonétiques et phonologiques de ce constituant n’ont pas clairement été définies. Dans cette thèse nous avons fourni des preuves de l’existence du syntagme intermédiaire (ip) à la fois en production et en perception de la parole. Grâce à cinq expérimentations menées dans le cadre de la phonologie de laboratoire, nous avons caractérisé les propriétés phonético-phonologiques de ce constituant et attesté de son rôle dans le traitement perceptif du langage. Les résultats obtenus en production montrent que l’ip est le domaine de l’abaissement des accents mélodiques en français. Sa frontière droite est marquée par un allongement pré-frontière ainsi qu’un accent de syntagme responsable du retour à la ligne de référence du registre. Les analyses menées en perception ont montré que les frontières droites du syntagme accentuel et du syntagme intermédiaire sont utilisées très tôt dans le processus de traitement syntaxique. Les indices phonétiques et phonologiques présents à ces frontières permettent aux auditeurs du français de construire des attentes sur la structure syntaxique des énoncés perçus. Une analyse séparée des différents types d’indices acoustiques a également montré qu’en l’absence de marquage tonal, les indices de durée semblent suffisants dans le but de marquer la frontière de syntagme accentuel. 
Un marquage conjoint de la frontière droite d’ip par les indices mélodiques et l’allongement pré-frontière semble au contraire nécessaire pour que les auditeurs du français perçoivent et utilisent cette frontière dans le traitement du langage. / The work described here is grounded in two major observations. Firstly, most intonation models of French agree on the existence of two levels of prosodic phrasing: the accentual phrase and the intonation phrase. Secondly, although the existence of an additional level of structure ranked between these two levels has been proposed for French, the phonetic and phonological properties of this intermediate phrase (ip) have not been clearly defined. In this thesis we provide evidence for the existence of an intermediate level of phrasing in French through both speech production and perception studies. Results of five experiments conducted within the framework of laboratory phonology revealed specific ip phonetic and phonological properties and tested its role in the perceptual processing of language. The production studies showed that the ip is the domain of downstep in French and that its right boundary is marked by a phrase accent responsible for a return to the register reference line. Analyses conducted in perception showed that the accentual phrase and intermediate phrase right boundaries are used early in syntactic processing. Phonetic and phonological cues at these boundaries allow French listeners to build expectations about the syntactic structure of spoken utterances. A separate analysis of different types of acoustic cues showed that without tonal marking, pre-boundary lengthening seems to be sufficient to mark the accentual phrase boundary. Joint marking through melodic and lengthening cues appears to be necessary to perceive and make use of the ip boundary in language processing. 
Découpage prosodique Syntagme accentuel Syntagme intermédiaire Intonation Accès au lexique Traitement syntaxique Phonologie de laboratoire Français Prosodic phrasing Accentual phrase Intermediate phrase Intonation Lexical access Syntactic processing Laboratory phonology French
7	Stress shift in English rhythm rule environments : effects of prosodic boundary strength and stress clash types Azzabou-Kacem, Soundess January 2018 (has links) It is well known that the early assignment of prominence in sequences like THIRteen MEN vs. thirTEEN (defined as the Rhythm Rule, or post-lexical stress shift) is an optional phenomenon. This dissertation examines some of the factors that encourage the application of stress shift in English and how it is phonetically realised. The aim is to answer two sets of questions related to why and how stress shift occurs in English: 1a) Does prosodic boundary strength influence stress shift? 1b) Does the adjacency of prominences above the level of the segmental string encourage stress shift? 2) How is stress shift realized? a) Is stress shift only a perceptual phenomenon? and b) Which syllables, if any, change acoustically when stress shift is perceived? To answer these questions, four experiments were designed. The first three experiments test whether the strength of the prosodic boundaries before and after the target word (e.g., canteen) influences stress shift. The effect of the strength of the left-edge prosodic boundary was investigated by comparing perceived stress patterns of the target (e.g., canteen) as produced in isolation where it is preceded by an utterance- and phrase-initial prosodic boundary (the Isolated condition) with its rendition when embedded in a frame sentence (e.g., Say canteen again) where the left prosodic boundary before canteen is weaker (the Embedded condition). Results show a very clear tendency towards late phrasal prominence on the final accentable syllable (e.g., -teen in canteen) in the Embedded condition while in the Isolated condition this pattern appeared in less than half of the targets, showing that the stronger left boundary increased the incidence of stress shift. 
Two more experiments manipulated the strength of the boundary to the right of the target (#) respectively by changing the syntactic parse of the critical phrase (e.g. canteen cook) in sequences like (1) and by manipulating constituent length as in (2). Results showed that the syntactic manipulation significantly affected the strength of the prosodic boundary between the clashing words which was stronger in (1b) relative to (1a), and affected the incidence of stress shift, which was higher in (1a) relative to (1b). The length manipulation also affected the rate of stress shift, which was significantly higher in the phrase with the shorter word, e.g., soups (2a) relative to the phrase with the longer word, e.g., supervisors (2b). (1) Example from the Syntax Experiment a. Who is the canteen (#) cook these days? (Pre-modifier + Noun) b. How do the canteen (#) cook these days? (NP + VP) (2) Example from the Length Experiment a. It should include the canteen (#) soups again. (Shorter constituent) b. It should include the canteen (#) supervisors again. (Longer constituent) Whilst we knew from the literature that the grouping of the clashing words within one Intonational Phrase (IP) encourages stress shift, results from the Syntax and Length experiments indicate that this (i.e., the phrasing of the clashing words within the same IP) is not a sufficient condition for the occurrence of stress shift, and that fine-grained degrees of boundary strength below the Intonational Phrase can drive changes in prominence pattern. The fact that higher rates of stress shift (and associated significant acoustic changes) were driven by manipulations of constituent length, for sequences with the same syntactic structure, provides support for the idea that prosodic (rather than syntactic) boundaries directly influence stress shift. 
The fourth experiment tests the definition of stress clash in English in cases like fourteen candles where the two main lexical prominences are strictly adjacent along the time dimension, in fourteen canoes where the prominences are not adjacent in time, but adjacent at the higher levels of the metrical hierarchy, and in fourteen canteens where the main lexical prominences are not adjacent, and do not clash. This experiment highlighted and resolved an unacknowledged disagreement about the clash status of sequences with one weak intervening syllable (e.g., fourTEEN caNOES). The fourTEEN caNOES type was shown to behave like metrically clashing sequences (e.g., fourteen CANdles) in attracting stress shift, and differently from the non-metrically-clashing sequences (e.g., fourteen CANTEENS) in discouraging it. These results provide empirical support for the Standard Metrical Theory (e.g. Selkirk, 1984; Nespor & Vogel, 1989) claim that 1) stress clash matters in triggering stress shift and that 2) stress clash in English is defined at the higher prosodic levels and not restricted to the level of the segmental string as indirectly assumed in a growing body of research (e.g., Vogel, Bunnel & Hoskins, 1995; Tomlinson, Liu & Fox Tree, 2014). Along with the establishment of prosodic boundary strength as one of the predictors influencing stress shift, another important contribution of the thesis is providing empirical evidence that the English Rhythm Rule is not solely a perceptual phenomenon and that it is associated with acoustic correlates. The main correlates of perceived stress shift consistently appearing across experiments are the decrease in the duration of the main lexical prominence of the target (e.g., -teen in canteen) and the increase of fundamental frequency and Sound Pressure Level peaks on the initial syllable (e.g., can- in canteen), when followed by a main clashing phrasal prominence. 
The acoustic analysis shows that the first accentable syllable also contributes to the perception of stress shift. This latter result does not lend support to the deletion formulation of the Rhythm Rule (Gussenhoven, 1991) which stipulates that the impressions of stress shift are solely associated with changes of prominence in the last accentable syllable of the target (e.g. -teen in canteen). Along with the determination of the acoustic correlates of perceived stress shift in English, the present research 1) indicates that fine-grained gradations of prosodic boundary strength can influence stress shift, 2) shows that while stress clash can increase the incidence of stress shift, stress shift can take place even in environments completely free of stress clash, and 3) provides evidence that stress clash should not be construed simply as the concatenation of two main lexical prominences along the time dimension.
