  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
351

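Contrastes entre estratégias de falantes bilíngues na produção de um diálogo e um monólogo em inglês / Contrasts between bilingual speakers' strategies in producing a dialogue and a monologue in English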

Silva, Amaury Flávio 21 May 2009
This dissertation investigates the strategies used by a group of bilingual speakers in producing a dialogue and a monologue in English, and analyses a listening activity from a course book. The theoretical background draws on the theories and models of coarticulation in the volume edited by Hardcastle and Hewlett (2002), Coarticulation: Theory, Data and Techniques; on Articulatory Phonology, developed by Browman and Goldstein (1986; 1989; 1990a; b; 1992); and on the findings of Cho (2002) on the effects of prosody on articulation. The dialogue and the monologue were recorded by a group of late bilingual speakers of Portuguese and English, all male and aged 18 to 48. The analyses were carried out with the free software PRAAT, version 4.5.18, developed by Paul Boersma and David Weenink of the Institute of Phonetic Sciences, University of Amsterdam. The results indicated coarticulatory phenomena such as hiding in contexts like "let me see"; blending in "almost daily"; the flap in "get out"; and vowels between consonants in contexts like "much better", among others. As for the course book, the investigation revealed that the consonants which, according to its answer key, were not pronounced were in fact pronounced; they could be detected through spectrographic analysis of each segmentation.
352

A segmentação não-convencional na escrita dos alunos do ensino fundamental II: dos erros aos acertos pela reescrita de texto / Unconventional segmentation in the writing of Elementary School II students: from errors to correct usage through text rewriting

Garcia, Vera Lucia de Souza 05 February 2016
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The difficulty students have in grasping the conventions of the standard written norm has been widely discussed in educational settings. Problems with the conventions of written language are numerous and can be found in the written production of students at every level of schooling. This signals a lack of command of conventional writing and makes text production at school an uncomfortable activity. Observing this led us to set the goal of understanding which linguistic criteria, and which criteria of the subject/language relation, could explain the presence of these events in the writing of 6th- to 9th-grade EF-II students taking part in the Projeto Jornal Escolar at a public school in the northwest region of Paraná. We identified, described and categorised the phenomena regarded as departures from written-language conventions, of phonological and orthographic origin, that do not conform to the standard written norm, and found a higher incidence of hypo- and hypersegmentation of words in the students' texts. We thus posed the research question: what factor (or factors) gives rise to this incidence of hypo- and hypersegmentation in the writing of EF-II students?
To answer the question and to understand the nature of unconventional segmentation, including its phonetic-phonological and morphological character, we drew on Mattoso Câmara (1985) and Bisol (2001); on Abaurre, Fiad and Mayrink-Sabinson (1997); and on Oliveira (2009) and Cagliari (2009). From the perspective of the heterogeneity of language and of the relation speech/orality/writing/literacy, we drew on Corrêa (2004), Capristano (2004), Chacon (2004; 2006) and Cunha (2004; 2010).
We found that the unconventional segmentation in the writing of our research subjects arises both from prosody, specifically from prosodic constituents, and from the heterogeneous constitution of written language, as points of individuation of the writer through circulation in the genesis of the language and in the image the writer holds of the institutionalised written code. In light of these findings, and based on Abaurre, Salek Fiad, Mayrink-Sabinson and Geraldi (1995), Leal and Brandão (2006), Salek-Fiad (2009) and Ruiz (2013) on the writing and rewriting of texts, we propose a sequence of writing and rewriting activities that focuses on the students' difficulties, as a strategy for reworking unconventional segmentation (hypo- and hypersegmentation).
353

Prosodisk förmåga hos svenska grundskolebarn med cochleaimplantat / Prosodic skills in Swedish primary school children with cochlear implants

Fandén, Anna, McTaggart, Julia, Hellstadius, Åsa January 2008
Prosody can be characterized as the rhythm and the melody of speech. Prosodic features convey the emotions, thoughts and geographic origins of each individual. Spoken language without prosody would be monotonous, without variations in loudness and rate. Children with cochlear implants perceive speech differently from children with normal hearing. Consequently, the speech produced by a child with cochlear implants may sound different. The purpose of this study was to examine prosodic skills in Swedish children with cochlear implants, to compare them with the prosodic skills of Swedish children with normal hearing, and to characterize the differences between the two groups. Eight children with cochlear implants and eight controls matched for age, sex and regional accent were included in the study. The children’s production and perception of prosody were tested. The results show that there are differences in prosodic skills between the children with cochlear implants and their matched controls at word, phrase and discourse levels. The differences were significant in production but not in perception. Observed differences in the speech of the children with cochlear implants included omission of unstressed syllables and function words, difficulties producing the contrast between tonal word accents, and prolonged maintenance of phonological processes. The study contributes to the knowledge about prosodic and linguistic skills in Swedish children with cochlear implants.
355

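Le chunking perceptif de la parole : sur la nature du groupement temporel et son effet sur la mémoire immédiate / The perceptual chunking of speech: on the nature of temporal grouping and its effect on immediate memory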

Gilbert, Annie 03 1900
In many behaviors that involve the learning and production of sequences, temporal groups emerge spontaneously, created by delays or by the lengthening of elements. This chunking has been observed in both humans and animals and is taken to reflect a general process of perceptual chunking that conforms to the capacity limits of short-term memory. Yet no research has determined how perceptual chunking applies to speech. We provide a literature review that brings out critical problems which have hampered research on this question. Consideration of these problems motivates a principled demonstration of how perceptual chunking applies to speech and of the effect of this process on immediate memory (or "working memory"). These two themes are presented separately in two papers in journal-article format.
Paper 1: The perceptual chunking of speech: a demonstration using ERPs. To observe perceptual chunking in real time, we used event-related potentials (ERPs) and referred to the Closure Positive Shift (CPS), a neural component known to capture, among other things, listeners' responses to marks of prosodic groups. The speech stimuli were utterances and sequences of nonsense syllables containing intonation phrases marked by pitch, and temporal groups marked by lengthening that could coincide, or not, with the intonation-phrase marks. Analyses of the CPS show that, across conditions, it is elicited specifically by the lengthening that marks the end of temporal groups. These lengthening marks, which appear universally in spoken language, create the same type of chunking as that which emerges in sequence learning by humans and animals. This finding supports the view that listeners chunk speech into temporal groups and that this perceptual chunking operates similarly for verbal and non-verbal behaviors. Moreover, the results call into question reports that relate the CPS to intonation phrasing without considering the effects of temporal marks.
Paper 2: Perceptual chunking and its effect on memory in speech processing: ERP and behavioral evidence. We examined how the perceptual chunking of utterances into temporal groups of differing size influences immediate memory of heard speech. To weigh these effects, we used behavioral measures and ERPs, especially the N400 component, which served to evaluate the quality of the memory trace for target lexemes heard in the temporal groups. Variations in the relative amplitude of the N400 showed a better memory trace for lexemes presented in groups of 3 syllables than for those in groups of 4 syllables. Response times, along with the P300 component, revealed effects of the position of the chunk in the utterance. These are the first studies to demonstrate the perceptual chunking of speech in real time and its effects on immediate memory of heard elements. Taken together, the results suggest that a general process of perceptual chunking enhances the buffering of sequential information and the processing of speech on a chunk-by-chunk basis.
356

Svenska studenters uppfattningar av tonerna i kinesiska tvåstaviga ord / Swedish students' perception of the tones in Chinese disyllabic words

Hu, Guohua January 2015
Students whose native language is not a tone language usually have difficulty learning the Chinese tones when they begin their studies. On the one hand, course books at this level tend to use only monosyllabic words, and it has been regarded as good pedagogy for the teacher to trace the tone contours by hand. On the other hand, research on Chinese has long concentrated on the fundamental frequency (F0) values found in monosyllabic words. Cross-linguistic research has overlooked many factors, among them the influence of consonants on F0 (of which native speakers are not always aware).
There is no consensus on how to explain tone confusion patterns. Earlier theories of Second Language Acquisition (SLA), such as the Perceptual Assimilation Model (PAM) and the Speech Learning Model (SLM), have proved inadequate for the study of tone perception. More recently, PAM-Suprasegment has tried to explain how the learner's native language is assumed to be assimilated to the Chinese tone system, but the model does not address word prosody. Since the vocabulary of modern Chinese consists mostly of disyllabic words, research needs to look for other criteria, such as duration and stress, to explain how learners come to hear the Chinese tones, for example what happens when two tones are combined in one word.
The purpose of this study is to find out how native Swedish speakers learning Chinese as a second/foreign language perceive the tones in Chinese disyllabic words. The experiment is not based on specially constructed test words. The results show that the tones are affected first and foremost by the initial consonant and then by the surrounding tones. They further show that the Swedish system of accent I and II may in turn give rise to tone confusions, since Chinese disyllabic words partly exhibit the same patterns. The results illustrate that tone identification is a dynamic and complex process. Continued research on the tones is needed, but it cannot stop there: the interaction between sounds and word prosody needs to be better understood in order to achieve good command of prosody in language teaching.
357

A formative study of rhythm and pattern: semiotic potential of multimodal experiences for early years readers

Peters, J. Beryl 08 September 2011
Literacy education defined as the reading and writing of print text is undergoing a paradigmatic shift towards a pedagogy of multiliteracies (Cole & Pullen, 2010). At the same time, demands for rapid, efficient, and accurate reading skills escalate (Katzir et al., 2006) in a global society with increasingly instant and complex literacy requirements. Musical rhythm plays a role in multiliteracy and print literacy learning. Rhythm is essential for music making and reading, and may facilitate print literacy for all children, including those who struggle with traditional print-based teaching and learning. The purpose of this research was to investigate the potential for the semiotic resource of rhythm to engage early years children in print and non-print literacy learning. A twelve week mixed methods quasi-experimental study was conducted to examine the effects of a multimodal Orff-based learning design on elements of reading and rhythm for grades one to three children in four schools. Students (N = 169) from nine classrooms were non-randomly assigned to one of two groups. The researcher instructed both groups two to three times a week totaling twenty-five sessions in each homeroom classroom. The experimental groups participated in Orff-based learning experiences that focused on elements of rhythm and prosodic oral reading fluency. The control group listened to and sang song-storybooks. Beat performance and oral reading rate assessments were administered as pre- and post-tests to each group. Struggling readers in the experimental group significantly improved on measures of oral reading rate compared to struggling readers in the control group using matched pairs t-procedures and analyses of variance. Associations between beat performance and oral reading rate were explored using bivariate and multivariate regression and correlation analysis. A strong positive correlation was found between measures of beat competency and measures of oral reading rate. 
Qualitative methods using grounded theory, semiotic data analysis, multimodal analysis, action research, and design research methods placed within a bricolage framework (Kincheloe & Berry, 2004) and examined through the lens of complexity thinking (Davis & Sumara, 2006) added multiperspectival meaning-making of data. Findings pointed to the value of multimodal music and rhythm experiences for engaged, deep, meaningful print and non-print learning for diverse individual and classroom collective learners in both control and experimental classrooms. Beat competency was important to both print and music literacy learning in experimental classrooms. Beat experiences were compelling, equitable, and appeared to organize music, oral language, and print literacy into meaningful and accessible patterns and structures. Similar findings may be occasioned through an ontology of multimodal richness, a complex epistemology, embodied ways of knowing and communicating, and systemic shared beliefs and values.
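The two statistics named in the abstract above, a matched-pairs t statistic and a correlation coefficient, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `paired_t` and `pearson_r` and the sample scores are assumptions made for the example and are not taken from the thesis.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(a, b):
    """Matched-pairs t statistic: mean pairwise difference over its standard error."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [vb - va for va, vb in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Made-up illustrative scores, NOT data from the study.
beat_scores = [3, 5, 6, 8, 9]           # hypothetical beat-performance scores
reading_rates = [40, 52, 55, 70, 74]    # hypothetical oral reading rates
control = [40, 52, 55, 70, 74]          # hypothetical matched control readers
experimental = [44, 55, 61, 72, 80]     # hypothetical matched experimental readers

print("r =", round(pearson_r(beat_scores, reading_rates), 3))
print("t =", round(paired_t(control, experimental), 3))
```

The t statistic would still need to be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value; that step is left out here.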
359

Prosody modelling using machine learning techniques for neutral and emotional speech synthesis / Μοντελοποίηση προσωδίας με χρήση τεχνικών μηχανικής μάθησης στα πλαίσια ουδέτερης και συναισθηματικής συνθετικής ομιλίας

Λαζαρίδης, Αλέξανδρος 11 August 2011
In this doctoral dissertation three proposed approaches were evaluated using two databases of different languages, one American-English and one Greek. The proposed approaches were compared to the state-of-the-art models in the phone duration modelling task. The SVR model outperformed all the other individual models evaluated in this dissertation. Their ability to outperform all the other models is mainly based on their advantage of coping in a better way with high-dimensionality feature spaces in respect to the other models used in phone duration modelling, which makes them appropriate even for the case when the amount of the training data would be small respectively to the number of the feature set used. The proposed fusion scheme, taking advantage of the observation that different prediction algorithms perform better in different conditions, when implemented with SVR (SVR-fusion), contributed to the improvement of the phone duration prediction accuracy over that of the best individual model (SVR). Furthermore the SVR-fusion model managed to reduce the outliers in respect to the best individual model (SVR). Moreover, the proposed two-stage scheme using individual phone duration models as feature constructors in the first stage and feature vector extension (FVE) in the second stage, implemented with SVR (SVR-FVE), improved the prediction accuracy over the best individual predictor (SVR), and the SVR-fusion scheme and moreover managed to reduce the outliers in respect to the other two proposed schemes (SVR and SVR-fusion). The SVR two-stage scheme confirms in this way their advantage over all the other algorithms of coping well with high-dimensionality feature sets. The improved accuracy of phone duration modelling contributes to a better control of the prosody, and thus quality of synthetic speech. 
Furthermore, the first proposed method (SVR) was also evaluated on the phone duration modelling task in emotional speech, where it outperformed all the state-of-the-art models in every emotional category. Finally, perceptual tests were performed to evaluate the impact of the proposed phone duration models on synthetic speech. The perceptual tests for both databases confirmed the results of the objective tests, showing that the proposed models improve the naturalness of synthesized speech. / This doctoral dissertation addresses problems in the field of speech technology, with the aim of modelling prosody using machine learning techniques in the context of neutral and emotional synthetic speech. Three novel prosody modelling methods were studied and were evaluated, with objective tests and with subjective speech quality tests, for their contribution to improving the quality of synthetic speech. The first technique, for phone duration modelling, is based on modelling with Support Vector Regression (SVR). This method had not previously been used for phone duration prediction. It was compared with, and outperformed, all the state-of-the-art methods in phone duration modelling. The second technique is based on phone duration modelling with a fusion model of multiple predictions. Specifically, the phone duration predictions from a set of independent phone duration prediction models are used as input to a machine learning model, which combines the outputs of the independent prediction models and achieves more accurate phone duration modelling, while also reducing the large errors (outliers), that is, the errors that lie far from the mean error. 
The third technique is a two-stage phone duration modelling method with construction of new features and extension of the feature vector. Specifically, in the first stage, a set of independent phone duration prediction models, used as generators of new features, enriches the feature vector. In the second stage, the enriched vector is used to train a phone duration prediction model that achieves higher performance than all the previous methods and reduces the large errors. In addition, the first method was applied to emotional speech. The proposed SVR model achieves the highest performance compared with all the state-of-the-art models. Finally, subjective speech quality tests were carried out to assess the contribution of the three proposed methods to improving the quality of synthetic speech. These tests confirmed the value of the proposed methods and their contribution to improving the quality of synthetic speech.
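
The data flow of the fusion and two-stage (FVE) schemes described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the toy feature vectors, the durations in milliseconds and the two hand-written first-stage predictors are all hypothetical, and a 1-nearest-neighbour regressor stands in for SVR as the second-stage learner purely to keep the sketch free of third-party dependencies.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Model = Callable[[Sequence[float]], float]


def fit_knn(X: List[Vector], y: List[float], k: int = 1) -> Model:
    """Stand-in second-stage learner (k-nearest-neighbour regression).

    The abstract uses SVR at this point; k-NN is substituted here only
    so that the sketch runs with the standard library alone.
    """
    def predict(x: Sequence[float]) -> float:
        # Rank training rows by squared Euclidean distance to x.
        ranked = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        nearest = ranked[:k]
        return sum(y[i] for i in nearest) / len(nearest)

    return predict


def fusion_inputs(x: Sequence[float], base_models: List[Model]) -> Vector:
    """Fusion scheme: the combiner sees only the base models' predictions."""
    return [m(x) for m in base_models]


def extended_inputs(x: Sequence[float], base_models: List[Model]) -> Vector:
    """FVE scheme: original features plus the base models' predictions."""
    return list(x) + [m(x) for m in base_models]


def fit_second_stage(X, y, base_models, build_inputs, fit=fit_knn) -> Model:
    """Train the second-stage model on inputs produced by `build_inputs`."""
    Z = [build_inputs(x, base_models) for x in X]
    second = fit(Z, y)
    return lambda x: second(build_inputs(x, base_models))


# Hypothetical toy data: two numeric features per phone, durations in ms.
X_train = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]]
y_train = [60.0, 75.0, 90.0, 120.0]

# Hypothetical first-stage predictors (hand-written here, not trained).
base_models = [
    lambda x: 50.0 + 30.0 * x[0],
    lambda x: 70.0 + 15.0 * x[1],
]

fusion_model = fit_second_stage(X_train, y_train, base_models, fusion_inputs)
fve_model = fit_second_stage(X_train, y_train, base_models, extended_inputs)
```

The only difference between the two schemes in this sketch is the input builder: `fusion_inputs` discards the original features, while `extended_inputs` keeps them and appends the first-stage predictions. In a realistic setting the first-stage models would themselves be trained, typically on data held out from the second stage.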
360

Estudo longitudinal de hipossegmentações em textos do Ensino Fundamental II / Longitudinal study of hyposegmentations in texts of Junior High School

Fiel, Roberta Pereira 22 March 2018
Submitted by Roberta Pereira Fiel (roh_fiel@hotmail.com) on 2018-05-02T12:47:59Z No. of bitstreams: 1 FIEL_RP_2018_Estudo longitudinal de hipossegmentações no Ensino Fundamental II.pdf: 4063414 bytes, checksum: 7f6aacb0e843d3a08ad7154b8bdb3dcb (MD5) / Approved for entry into archive by Elza Mitiko Sato (elzasato@ibilce.unesp.br) on 2018-05-02T23:42:08Z (GMT) No. of bitstreams: 1 fiel_rp_me_sjrp_int.pdf: 4063414 bytes, checksum: 7f6aacb0e843d3a08ad7154b8bdb3dcb (MD5) / Made available in DSpace on 2018-05-02T23:42:08Z (GMT). No. of bitstreams: 1 fiel_rp_me_sjrp_int.pdf: 4063414 bytes, checksum: 7f6aacb0e843d3a08ad7154b8bdb3dcb (MD5) Previous issue date: 2018-03-22 / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / This dissertation deals with the longitudinal characterization of the writing of EF II (Ensino Fundamental II) students with respect to the so-called hyposegmentations of written words, such as “puraqui” (“por aqui”), in which a graphic boundary is unconventionally absent. Our objectives are: (i) to identify, through quantitative analysis, whether or not there is a correlation between the number of hyposegmentations and time of schooling; and (ii) to describe the hyposegmentations qualitatively, with regard to prosodic aspects of spoken utterances and graphic aspects related to information from the orthographic convention itself. To achieve these objectives, we draw, on the one hand, on the theoretical apparatus of prosodic phonology, in its relation-based model, which posits seven prosodic constituents that structure the utterances of the world's languages; and, on the other hand, on an approach to writing as heterogeneously constituted. From the results of the quantitative analysis, we highlight that there is a correlation between an increase in years of schooling and a decrease in the occurrence of hyposegmentation. 
Regarding the quantitative results for the structures involved, we highlight: (i) the junction of a clitic and a prosodic word characterizes the largest set of data; (ii) the junction of two clitics is the second most recurrent structure, with the hyposegmentation “oque” predominating; (iii) the junction of two prosodic words, the third most recurrent in the material analysed, arises from the mobilization of several linguistic characteristics, such as the hyposegmentation of periphrastic structures that exemplify linguistic change in progress; (iv) the junction of a prosodic word and a clitic is the least recurrent structure, with most of the data arising from the combination of words with the absence of the hyphen, which led to the formation of possible prosodic words; (v) junctions involving more than one prosodic word and/or clitic occurred in only three data items, which span structures such as the intonational phrase and the phonological utterance. Regarding the qualitative results, based on linguistic-textual analysis, the cases in which there is fluctuation between conventional and unconventional forms: (i) are distinguished from one another by their prosodic, grammatical and linguistic-textual configuration; (ii) are more explicit signs of the students' insertion in oral/spoken and literate/written practices; (iii) are marks of the complex process involving the Other as a representative instance of language (and of writing in particular), writing in the complexity of its functioning (heterogeneously constituted), and the student as a writing subject. The main contribution of this dissertation lies in: (i) carrying out a quantitative and qualitative analysis of hyposegmentations in EF II; and (ii) showing the complexity underlying the relations between prosody and writing through the unconventional segmentation of words. 
/ This work deals with the longitudinal characterization of the writing of students from Junior High School (EF II in Brazil) with respect to hyposegmentations of written words, in which a graphic boundary is unconventionally absent (e.g. "puraqui" for "por aqui" in Portuguese, "around here" in English). Our objectives are: (i) to identify, through quantitative analysis, whether there is a correlation between the number of hyposegmentations and time of schooling; and (ii) to describe the hyposegmentations qualitatively, with regard to prosodic aspects of spoken utterances and graphic aspects of the orthographic convention. To reach these objectives, we draw, on the one hand, on the theoretical apparatus of prosodic phonology, in its relation-based model, which posits seven prosodic constituents that structure the utterances of the world's languages; and, on the other hand, on an approach to writing as heterogeneously constituted. From the results obtained in the quantitative analysis, we highlight that there is a correlation between an increase in years of schooling and a decrease in the occurrence of hyposegmentations. Regarding the quantitative results for the structures involved, we highlight: (i) the junction of a clitic and a prosodic word characterizes the largest set of data; (ii) the junction of two clitics is the second most recurrent structure; (iii) the junction of two prosodic words, the third most recurrent in the material analyzed, derives from the mobilization of several linguistic characteristics, such as the hyposegmentation of periphrastic structures that are examples of linguistic change in progress; (iv) the junction of a prosodic word and a clitic is the least recurrent structure, with most of the data resulting from the combination of words with the absence of the hyphen, which led to the formation of possible prosodic words; (v) junctions involving more than one prosodic word and/or clitic occurred in only three data items, covering structures such as the intonational phrase and the phonological utterance. 
Regarding the qualitative results, from a linguistic-textual analysis, we highlight that the cases in which there is fluctuation between conventional and unconventional forms are: (i) distinguished from one another by their prosodic, grammatical and linguistic-textual configuration; (ii) more explicit indications of the students' insertion into oral/spoken and literate/written practices; (iii) marks of the complex process involving the Other as a representative instance of language (and of writing in particular), writing in the complexity of its functioning (heterogeneously constituted), and the student as a writing subject. The main contribution of this work is: (i) to carry out a quantitative and qualitative analysis of hyposegmentations in the writing of Junior High School (EF II) students in Brazil; and (ii) to show the complexity that underlies the relations between prosody and writing through the unconventional segmentation of words. / FAPESP (Process No. 2015/26763-6)
